Bedrock

A fully managed platform to build GenAI apps via a single API across many model providers (Anthropic, Meta, Mistral, Cohere, Amazon Nova, etc.), with first-class features for guardrails, RAG (Knowledge Bases), agents, fine-tuning/custom models, and enterprise security (VPC endpoints, KMS). (Amazon Web Services, Inc.)
Author

Benedict Thekkel

Model lineup & modalities (high level)

  • Amazon Nova family (Micro/Lite/Pro + Canvas/Reel + Premier/Sonic variants): text + multimodal (image/video/speech) options, region coverage varies. (AWS Documentation)
  • 3rd-party FMs: Anthropic Claude, Meta Llama, Mistral, Cohere, AI21 (Jamba), Stability, TwelveLabs, Writer (availability varies by region). (Amazon Web Services, Inc.)

Tip: The Models doc lists model IDs, regions, and streaming support—bookmark it. (AWS Documentation)


The core building blocks

Capability | What it solves | When to use
Converse API | One messages API across models; supports tool use/function calling and streaming. | Unify your client/server code paths. (AWS Documentation)
Guardrails | Safety, PII redaction, topic filters; works with Agents and KBs. | Enterprise policy enforcement and factuality checks. (AWS Documentation)
Knowledge Bases (RAG) | Managed ingestion, chunking, embeddings, vector index; prompt augmentation out of the box. | Ship RAG fast; swap indexes later if needed. (AWS Documentation)
Agents (incl. AgentCore) | Orchestrate tools, KBs, multi-step tasks; AgentCore adds gateway, memory, browser, runtime. | Complex workflows, autonomous task execution. (AWS Documentation)
Custom Models | Fine-tuning, continued pre-training, and distillation on managed jobs. | Domain specialization without hosting your own stack. (AWS Documentation)

Pricing (mental model)

  • On-demand: pay per input/output token (or media unit). Best to start. (Amazon Web Services, Inc.)
  • Provisioned Throughput (PTT): dedicated capacity, discounted for steady high volume. (Amazon Web Services, Inc.)
  • Batch inference: discounted for large offline jobs; results are written to S3.
  • Customization costs: extra for fine-tuning/hosting artifacts—budget separately. (DEV Community)

Rule of thumb: start on on-demand → measure → move heavy/steady tenants to PTT.
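To make "measure" concrete, here is a minimal back-of-envelope cost model for the on-demand side. The per-1K-token prices are placeholders, not real Bedrock rates; look up the current per-model pricing page before trusting any number it prints.

```python
# Back-of-envelope on-demand cost model. The default prices below are
# PLACEHOLDERS for illustration -- substitute the current per-model rates
# from the Bedrock pricing page.

def monthly_on_demand_cost(req_per_day, in_tokens, out_tokens,
                           price_in_per_1k=0.003, price_out_per_1k=0.015):
    """Estimate monthly on-demand spend for one workload."""
    per_request = (in_tokens / 1000 * price_in_per_1k
                   + out_tokens / 1000 * price_out_per_1k)
    return req_per_day * per_request * 30

# Example: 50k requests/day, ~800 input and ~300 output tokens each.
cost = monthly_on_demand_cost(50_000, 800, 300)
print(f"~${cost:,.0f}/month on-demand")  # compare against a PTT commitment
```

If that figure lands near (or above) what a Provisioned Throughput commitment costs for the same QPS, that is the signal to switch.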


Security, privacy & networking (the important bits)

  • Private networking: use VPC endpoints/PrivateLink for Bedrock + S3; keep traffic off the public internet. (AWS Documentation)
  • Encryption: use KMS for prompts, logs, and Knowledge Base resources (AWS-owned key by default; bring CMK if needed). (AWS Documentation)
  • Data usage: AWS shared responsibility model applies; you control content/config. (AWS Documentation)
  • Guardrails: plug in at inference time (Converse/stream) or within Agents/KBs. (AWS Documentation)
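A minimal sketch of wiring the private-networking bullet into client code: build `boto3.client` kwargs that pin traffic to a VPC interface endpoint. The VPCE DNS name below is a placeholder; with private DNS enabled on the endpoint you can omit `endpoint_url` entirely and the default service URL already resolves privately.

```python
# Sketch: keep Bedrock runtime traffic on PrivateLink. The VPCE DNS name
# is a PLACEHOLDER -- use the DNS name of your own interface endpoint.
def private_client_kwargs(region, vpce_dns=None):
    kwargs = {"service_name": "bedrock-runtime", "region_name": region}
    if vpce_dns:  # explicit endpoint URL; optional when private DNS is enabled
        kwargs["endpoint_url"] = f"https://{vpce_dns}"
    return kwargs

kwargs = private_client_kwargs(
    "ap-southeast-2",
    "vpce-0123456789abcdef0.bedrock-runtime.ap-southeast-2.vpce.amazonaws.com")
# brt = boto3.client(**kwargs)   # requires: import boto3
```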

Quick starts

1) Call a model (Python, Converse API)

import boto3
brt = boto3.client("bedrock-runtime", region_name="ap-southeast-2")

resp = brt.converse(
  modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
  messages=[{"role":"user","content":[{"text":"Summarise RAG in 3 bullets."}]}],
  inferenceConfig={"maxTokens": 512, "temperature": 0.2}
)
print(resp["output"]["message"]["content"][0]["text"])

Converse works similarly across supported models; add tool schemas for function-calling when you need it. (AWS Documentation)
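A hedged sketch of what such a tool schema looks like in the Converse `toolConfig` parameter. The `get_weather` tool, its description, and its fields are invented for illustration; only the `toolSpec`/`inputSchema` envelope is the Converse shape.

```python
# Sketch: declaring a tool for Converse function-calling.
# The "get_weather" tool and its schema are invented for illustration.
def build_tool_config():
    return {
        "tools": [{
            "toolSpec": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "inputSchema": {"json": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                }},
            }
        }]
    }

# Pass it alongside messages:
# resp = brt.converse(modelId=..., messages=..., toolConfig=build_tool_config())
# If the model decides to call the tool, resp["stopReason"] == "tool_use" and
# the assistant message contains a "toolUse" content block with the arguments.
```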

2) Spin up a Knowledge Base (RAG)

  • Point it at S3, pick an embedding model and vector store (Bedrock-managed options).
  • Bedrock handles ingestion/chunking/embedding; you query via KB APIs or let Agents use it. (AWS Documentation)
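Querying the KB through the RetrieveAndGenerate API can be sketched as below. The KB ID and model ARN are placeholders; substitute your own, and the `bedrock-agent-runtime` call stays commented so the snippet runs without AWS access.

```python
# Sketch: query a Knowledge Base with RetrieveAndGenerate.
# kb_id and model_arn are PLACEHOLDERS -- substitute your own resources.
def build_rag_request(question, kb_id, model_arn):
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

# agent_rt = boto3.client("bedrock-agent-runtime", region_name="ap-southeast-2")
# resp = agent_rt.retrieve_and_generate(**build_rag_request(
#     "What is our refund policy?", "KBID12345",
#     "arn:aws:bedrock:ap-southeast-2::foundation-model/"
#     "anthropic.claude-3-haiku-20240307-v1:0"))
# print(resp["output"]["text"])   # answer grounded in retrieved chunks
```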

3) Add Guardrails (policy + PII + contextual grounding)

  • Create a guardrail policy; attach it in Converse or to your Agent/KB.
  • Use grounding checks to reduce hallucinations against your RAG context. (AWS Documentation)
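Attaching a guardrail in Converse is a matter of adding a `guardrailConfig` to the request. A minimal sketch, assuming a guardrail already exists (the `gr-abc123` ID is a placeholder):

```python
# Sketch: attach an existing guardrail at inference time via Converse.
# The guardrail ID is a PLACEHOLDER created beforehand in console/API.
def with_guardrail(params, guardrail_id, version="DRAFT"):
    """Return a copy of a Converse request dict with a guardrailConfig added."""
    params = dict(params)  # shallow copy; leave the caller's dict untouched
    params["guardrailConfig"] = {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": version,
        "trace": "enabled",   # surfaces which policy fired -- useful in dev
    }
    return params

base = {"modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
        "messages": [{"role": "user", "content": [{"text": "hi"}]}]}
req = with_guardrail(base, "gr-abc123")   # placeholder guardrail ID
# resp = brt.converse(**req)  # blocked/masked content returns per your policy
```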

Architecture patterns (pick your lane)

A. Pure managed: Client → API (private) → Bedrock (Converse) + Guardrails + Knowledge Base. Good for: fastest path, least ops.

B. Agentic workflows: User → Agent (AgentCore) → Tools (APIs/Lambda), KB, Code Interpreter, Browser → Model. Good for: multi-step tasks, enterprise workflows. (TechRadar)

C. Hybrid RAG: Bedrock for inference + OpenSearch/Aurora pgvector for vectors; swap in/out as needs evolve. (KB is still a great default.) (AWS Documentation)


Tuning & throughput tips

  • Prefer streaming for UX; it’s supported on most chat models. (AWS Documentation)
  • Keep temperature low for factual tasks; raise only for ideation.
  • Cache retrievals and dedupe media in multi-modal workloads.
  • For stable high QPS, move to Provisioned Throughput and right-size capacity. (Amazon Web Services, Inc.)
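For the streaming tip, the Converse streaming variant delivers text incrementally as `contentBlockDelta` events. A small helper that assembles them (shown against a canned event list so it runs without AWS access; the real call is commented):

```python
# Sketch: collect text deltas from a Converse streaming response.
# converse_stream returns an event stream; text arrives in
# "contentBlockDelta" events under delta.text.
def collect_stream_text(events):
    parts = []
    for event in events:
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "text" in delta:
            parts.append(delta["text"])
    return "".join(parts)

# Against the real API (client creation as in the quick start):
# resp = brt.converse_stream(modelId=..., messages=..., inferenceConfig=...)
# for event in resp["stream"]: ...  # print deltas as they arrive for UX

# Works the same on a canned event list:
fake = [{"contentBlockDelta": {"delta": {"text": "Hel"}}},
        {"contentBlockDelta": {"delta": {"text": "lo"}}},
        {"messageStop": {"stopReason": "end_turn"}}]
print(collect_stream_text(fake))  # Hello
```

In a real UI you would print each delta as it arrives rather than joining at the end; the joined form is just easiest to show here.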

Regions & availability

Model availability (and streaming support) is per model, per region; check the table before you pick IDs for prod (e.g., some Nova variants are in ap-southeast-2 today, others only via cross-region inference). (AWS Documentation)


Common pitfalls (and fixes)

  • “Model not found” in region → switch to a supported region or use the cross-region listing. (AWS Documentation)
  • Hallucinations in RAG → enable contextual grounding guardrails and tighten retrieval metadata filters. (Amazon Web Services, Inc.)
  • Data egress surprises → use VPC endpoints; keep S3 + Bedrock private. (AWS Documentation)
  • Latency spikes at scale → adopt PTT; split tenants by workload. (Amazon Web Services, Inc.)

When Bedrock vs roll-your-own?

  • Choose Bedrock when you want model choice, managed safety/RAG/agents, and private networking without owning infra.
  • Roll your own (EKS + vLLM/TGI) when you need custom runtimes, super-tight $/token, or niche models not offered in Bedrock. (You can still keep Bedrock for managed bits like KB/Guardrails.)


Code

import os

import boto3

# Load environment variables from .env if python-dotenv is installed (optional)
try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass

region = os.getenv("AWS_BEDROCK_REGION_NAME", "us-east-1")
aws_access_key_id = os.getenv("AWS_BEDROCK_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_BEDROCK_SECRET_ACCESS_KEY")

# Initialize the Bedrock control-plane client
bedrock_client = boto3.client(
    "bedrock",
    region_name=region,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)

# List available foundation models
resp = bedrock_client.list_foundation_models()

print("\n🧠 Available AWS Bedrock Models:\n")
for model in resp.get("modelSummaries", []):
    name = model.get("modelName")
    provider = model.get("providerName")
    status = model.get("modelLifecycle", {}).get("status")
    print(f"- {name} ({provider}) — {status}")

Sample output (duplicate names correspond to distinct model IDs/variants):

🧠 Available AWS Bedrock Models:

- Stable Image Remove Background (Stability AI) — ACTIVE
- Stable Image Style Guide (Stability AI) — ACTIVE
- Stable Image Control Sketch (Stability AI) — ACTIVE
- Claude Sonnet 4 (Anthropic) — ACTIVE
- Stable Image Erase Object (Stability AI) — ACTIVE
- Stable Image Control Structure (Stability AI) — ACTIVE
- Stable Image Search and Recolor (Stability AI) — ACTIVE
- gpt-oss-120b (OpenAI) — ACTIVE
- Pegasus v1.2 (TwelveLabs) — ACTIVE
- Stable Image Style Transfer (Stability AI) — ACTIVE
- Embed v4 (Cohere) — ACTIVE
- Claude Sonnet 4.5 (Anthropic) — ACTIVE
- Marengo Embed v2.7 (TwelveLabs) — ACTIVE
- Stable Image Search and Replace (Stability AI) — ACTIVE
- Qwen3-Coder-30B-A3B-Instruct (Qwen) — ACTIVE
- Qwen3 32B (dense) (Qwen) — ACTIVE
- Stable Image Inpaint (Stability AI) — ACTIVE
- gpt-oss-20b (OpenAI) — ACTIVE
- Claude Opus 4.1 (Anthropic) — ACTIVE
- Titan Text Large (Amazon) — ACTIVE
- Titan Image Generator G1 (Amazon) — ACTIVE
- Titan Image Generator G1 (Amazon) — ACTIVE
- Titan Image Generator G1 v2 (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Pro (Amazon) — ACTIVE
- Nova Pro (Amazon) — ACTIVE
- Nova Pro (Amazon) — ACTIVE
- Nova Lite (Amazon) — ACTIVE
- Nova Lite (Amazon) — ACTIVE
- Nova Lite (Amazon) — ACTIVE
- Nova Canvas (Amazon) — ACTIVE
- Nova Reel (Amazon) — ACTIVE
- Nova Reel (Amazon) — ACTIVE
- Nova Micro (Amazon) — ACTIVE
- Nova Micro (Amazon) — ACTIVE
- Nova Micro (Amazon) — ACTIVE
- Nova Sonic (Amazon) — ACTIVE
- Titan Text Embeddings v2 (Amazon) — ACTIVE
- Titan Text G1 - Lite (Amazon) — ACTIVE
- Titan Text G1 - Lite (Amazon) — ACTIVE
- Titan Text G1 - Express (Amazon) — ACTIVE
- Titan Text G1 - Express (Amazon) — ACTIVE
- Titan Embeddings G1 - Text (Amazon) — ACTIVE
- Titan Embeddings G1 - Text (Amazon) — ACTIVE
- Titan Text Embeddings V2 (Amazon) — ACTIVE
- Titan Text Embeddings V2 (Amazon) — ACTIVE
- Titan Multimodal Embeddings G1 (Amazon) — ACTIVE
- Titan Multimodal Embeddings G1 (Amazon) — ACTIVE
- SDXL 1.0 (Stability AI) — LEGACY
- SDXL 1.0 (Stability AI) — LEGACY
- Jamba 1.5 Large (AI21 Labs) — ACTIVE
- Jamba 1.5 Mini (AI21 Labs) — ACTIVE
- Claude Instant (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude 3 Sonnet (Anthropic) — LEGACY
- Claude 3 Sonnet (Anthropic) — LEGACY
- Claude 3 Sonnet (Anthropic) — LEGACY
- Claude 3 Haiku (Anthropic) — ACTIVE
- Claude 3 Haiku (Anthropic) — ACTIVE
- Claude 3 Haiku (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3.5 Sonnet (Anthropic) — ACTIVE
- Claude 3.5 Sonnet v2 (Anthropic) — ACTIVE
- Claude 3.7 Sonnet (Anthropic) — ACTIVE
- Claude 3.5 Haiku (Anthropic) — ACTIVE
- Claude Opus 4 (Anthropic) — ACTIVE
- Command R (Cohere) — ACTIVE
- Command R+ (Cohere) — ACTIVE
- Embed English (Cohere) — ACTIVE
- Embed English (Cohere) — ACTIVE
- Embed Multilingual (Cohere) — ACTIVE
- Embed Multilingual (Cohere) — ACTIVE
- Rerank 3.5 (Cohere) — ACTIVE
- DeepSeek-R1 (DeepSeek) — ACTIVE
- Llama 3 8B Instruct (Meta) — ACTIVE
- Llama 3 70B Instruct (Meta) — ACTIVE
- Llama 3.1 8B Instruct (Meta) — ACTIVE
- Llama 3.1 70B Instruct (Meta) — ACTIVE
- Llama 3.2 11B Instruct (Meta) — ACTIVE
- Llama 3.2 90B Instruct (Meta) — ACTIVE
- Llama 3.2 1B Instruct (Meta) — ACTIVE
- Llama 3.2 3B Instruct (Meta) — ACTIVE
- Llama 3.3 70B Instruct (Meta) — ACTIVE
- Llama 4 Scout 17B Instruct (Meta) — ACTIVE
- Llama 4 Maverick 17B Instruct (Meta) — ACTIVE
- Mistral 7B Instruct (Mistral AI) — ACTIVE
- Mixtral 8x7B Instruct (Mistral AI) — ACTIVE
- Mistral Large (24.02) (Mistral AI) — ACTIVE
- Mistral Small (24.02) (Mistral AI) — ACTIVE
- Pixtral Large (25.02) (Mistral AI) — ACTIVE