Bedrock

Amazon Bedrock is a fully managed platform for building generative AI apps through a single API across many model providers (Anthropic, Meta, Mistral, Cohere, Amazon Nova, etc.), with first-class features for guardrails, RAG (Knowledge Bases), agents, fine-tuning/custom models, and enterprise security (VPC endpoints, KMS). (Amazon Web Services, Inc.)
Author

Benedict Thekkel

Model lineup & modalities (high level)

  • Amazon Nova family (Micro/Lite/Pro + Canvas/Reel + Premier/Sonic variants): text + multimodal (image/video/speech) options, region coverage varies. (AWS Documentation)
  • 3rd-party FMs: Anthropic Claude, Meta Llama, Mistral, Cohere, AI21 (Jamba), Stability, TwelveLabs, Writer (availability varies by region). (Amazon Web Services, Inc.)

Tip: The Models doc lists model IDs, regions, and streaming support—bookmark it. (AWS Documentation)


The core building blocks

| Capability | What it solves | When to use |
|---|---|---|
| Converse API | One messages API across models; supports tools/function-calling and streaming. | Unify your client/server code paths. (AWS Documentation) |
| Guardrails | Safety, PII redaction, topic filters; works with Agents and KBs. | Enterprise policy enforcement and factuality checks. (AWS Documentation) |
| Knowledge Bases (RAG) | Managed ingestion, chunking, embeddings, vector index; prompt augmentation out of the box. | Ship RAG fast; swap indexes later if needed. (AWS Documentation) |
| Agents (incl. AgentCore) | Orchestrate tools, KBs, multi-step tasks; new AgentCore adds gateway, memory, browser, runtime. | Complex workflows, autonomous task execution. (AWS Documentation) |
| Custom Models | Fine-tuning/continued pre-training/distillation on managed jobs. | Domain specialization without hosting your own stack. (AWS Documentation) |

Pricing (mental model)

  • On-demand: pay per input/output token (or media unit). Best to start. (Amazon Web Services, Inc.)
  • Provisioned Throughput (PTT): dedicated capacity, discounted for steady high volume. (Amazon Web Services, Inc.)
  • Batch mode: cheaper for large offline jobs.
  • Customization costs: extra for fine-tuning/hosting artifacts—budget separately. (DEV Community)

Rule of thumb: start on on-demand → measure → move heavy/steady tenants to PTT.
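
The rule of thumb above comes down to simple arithmetic. Here is a minimal sketch, assuming placeholder prices: the per-1K-token rates and the hourly provisioned rate are invented for illustration and are not AWS list prices, and both function names are hypothetical.

```python
def on_demand_monthly_cost(requests_per_day, in_tokens, out_tokens,
                           price_in_per_1k, price_out_per_1k, days=30):
    """Estimated monthly on-demand spend: tokens in and out, priced per 1K."""
    per_request = (in_tokens / 1000) * price_in_per_1k \
        + (out_tokens / 1000) * price_out_per_1k
    return requests_per_day * days * per_request


def provisioned_monthly_cost(hourly_rate, model_units=1, days=30):
    """Estimated monthly provisioned spend: charged per model unit per hour."""
    return hourly_rate * model_units * 24 * days


# Placeholder workload and prices (assumptions, not real quotes).
on_demand = on_demand_monthly_cost(
    requests_per_day=50_000, in_tokens=1_500, out_tokens=300,
    price_in_per_1k=0.003, price_out_per_1k=0.015,
)
provisioned = provisioned_monthly_cost(hourly_rate=40.0)
print(f"on-demand ~ ${on_demand:,.0f}/month, provisioned ~ ${provisioned:,.0f}/month")
```

Plug in measured traffic and the current prices for your model and region; provisioned capacity only wins once steady volume pushes the on-demand figure above the fixed hourly cost.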


Security, privacy & networking (the important bits)

  • Private networking: use VPC endpoints/PrivateLink for Bedrock + S3; keep traffic off the public internet. (AWS Documentation)
  • Encryption: use KMS for prompts, logs, and Knowledge Base resources (AWS-owned key by default; bring CMK if needed). (AWS Documentation)
  • Data usage: AWS shared responsibility model applies; you control content/config. (AWS Documentation)
  • Guardrails: plug in at inference time (Converse/stream) or within Agents/KBs. (AWS Documentation)
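
For the private-networking bullet, an interface endpoint for the Bedrock runtime can be created through the EC2 API. This is a rough sketch only: the helper name is hypothetical, the VPC, subnet, and security group IDs are placeholders, and the service name follows the usual `com.amazonaws.<region>.<service>` pattern.

```python
def bedrock_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for EC2 create_vpc_endpoint (Bedrock runtime)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.bedrock-runtime",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets the default Bedrock hostname resolve inside the VPC.
        "PrivateDnsEnabled": True,
    }


# Placeholder IDs for illustration only.
request = bedrock_endpoint_request(
    "ap-southeast-2", "vpc-0123456789abcdef0",
    ["subnet-aaa", "subnet-bbb"], ["sg-ccc"],
)
print(request["ServiceName"])
```

With credentials configured, the dictionary can be passed as `boto3.client("ec2", region_name=...).create_vpc_endpoint(**request)`.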

Quick starts

1) Call a model (Python, Converse API)

import boto3
brt = boto3.client("bedrock-runtime", region_name="ap-southeast-2")

resp = brt.converse(
  # In ap-southeast-2 this model is called through an inference profile ID (note the "apac." prefix)
  modelId="apac.anthropic.claude-3-5-sonnet-20240620-v1:0",
  messages=[{"role":"user","content":[{"text":"Summarise RAG in 3 bullets."}]}],
  inferenceConfig={"maxTokens": 512, "temperature": 0.2}
)
print(resp["output"]["message"]["content"][0]["text"])

Converse works similarly across supported models; add tool schemas for function-calling when you need it. (AWS Documentation)
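
As a sketch of what "add tool schemas" might look like, the snippet below builds a `toolConfig` for Converse and pulls tool requests out of a response. The `get_weather` tool and both helper names are hypothetical; only the `toolSpec` / `toolUse` shapes come from the Converse API.

```python
def weather_tool_config():
    """Converse toolConfig describing one hypothetical tool, get_weather."""
    return {
        "tools": [{
            "toolSpec": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "inputSchema": {"json": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                }},
            }
        }]
    }


def extract_tool_calls(response):
    """Return the toolUse blocks from a Converse response (empty list if none)."""
    content = response.get("output", {}).get("message", {}).get("content", [])
    return [block["toolUse"] for block in content if "toolUse" in block]
```

Pass the config as `brt.converse(..., toolConfig=weather_tool_config())`. When the response's `stopReason` is `"tool_use"`, run the requested tool yourself and send the result back in a follow-up user message as a `toolResult` block.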

2) Spin up a Knowledge Base (RAG)

  • Point it at S3, pick an embedding model and vector store (Bedrock-managed options).
  • Bedrock handles ingestion/chunking/embedding; you query via KB APIs or let Agents use it. (AWS Documentation)
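
Querying a Knowledge Base through the KB APIs could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the knowledge base ID and model ARN are values you supply, and `client` is expected to be a `bedrock-agent-runtime` client.

```python
def kb_query_request(question, knowledge_base_id, model_arn):
    """Build keyword arguments for retrieve_and_generate against one KB."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(client, question, knowledge_base_id, model_arn):
    """Retrieve context from the KB, generate an answer, return its text."""
    response = client.retrieve_and_generate(
        **kb_query_request(question, knowledge_base_id, model_arn)
    )
    return response["output"]["text"]
```

Create the client with `boto3.client("bedrock-agent-runtime", region_name=...)`; the response also carries a `citations` list if you want to show sources.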

3) Add Guardrails (policy + PII + contextual grounding)

  • Create a guardrail policy; attach it in Converse or to your Agent/KB.
  • Use grounding checks to reduce hallucinations against your RAG context. (AWS Documentation)
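
Attaching a guardrail in Converse is a matter of passing `guardrailConfig`. A minimal sketch, assuming a guardrail already exists: the helper name is hypothetical, and the guardrail ID and version are placeholders for your own values.

```python
def guarded_converse(client, model_id, prompt, guardrail_id, guardrail_version="DRAFT"):
    """Call Converse with a guardrail attached; return (text, was_blocked)."""
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        guardrailConfig={
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
            "trace": "enabled",  # include guardrail trace details in the response
        },
    )
    # Converse reports a guardrail block through the stop reason.
    blocked = response.get("stopReason") == "guardrail_intervened"
    text = response["output"]["message"]["content"][0]["text"]
    return text, blocked
```

`client` is a `bedrock-runtime` client. When `was_blocked` is true, the returned text is the guardrail's configured blocked message rather than model output.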

Architecture patterns (pick your lane)

A. Pure managed: Client → API (private) → Bedrock (Converse) + Guardrails + Knowledge Base. Good for: fastest path, least ops.

B. Agentic workflows: User → Agent (AgentCore) → Tools (APIs/Lambda), KB, Code-Interpreter, Browser → Model. Good for: multi-step tasks, enterprise workflows. (TechRadar)

C. Hybrid RAG: Bedrock for inference + OpenSearch/Aurora pgvector for vectors; swap in/out as needs evolve. (KB still a great default.) (AWS Documentation)


Tuning & throughput tips

  • Prefer streaming for UX; it’s supported on most chat models. (AWS Documentation)
  • Keep temperature low for factual tasks; raise only for ideation.
  • Cache retrievals and dedupe media in multi-modal workloads.
  • For stable high QPS, move to Provisioned Throughput and right-size capacity. (Amazon Web Services, Inc.)
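
For the streaming tip, the Converse streaming call returns an event stream in which text arrives in `contentBlockDelta` events. A small sketch (the helper name `stream_text` is hypothetical) that turns those events into text fragments:

```python
def stream_text(events):
    """Yield text fragments from a ConverseStream event iterator."""
    for event in events:
        # Only contentBlockDelta events carry generated text; other events
        # (messageStart, messageStop, metadata, ...) are skipped.
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "text" in delta:
            yield delta["text"]
```

With a `bedrock-runtime` client `brt`, call `resp = brt.converse_stream(modelId=..., messages=...)` and then print each fragment from `stream_text(resp["stream"])` as it arrives.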

Regions & availability

Model availability (and streaming) is per-model, per-region—check the table before you pick IDs for prod (e.g., some Nova variants in ap-southeast-2 today, others via cross-region). (AWS Documentation)


Common pitfalls (and fixes)

  • “Model not found” in region → switch to a supported region or use the cross-region listing. (AWS Documentation)
  • Hallucinations in RAG → enable contextual grounding guardrails and tighten retrieval metadata filters. (Amazon Web Services, Inc.)
  • Data egress surprises → use VPC endpoints; keep S3 + Bedrock private. (AWS Documentation)
  • Latency spikes at scale → adopt PTT; split tenants by workload. (Amazon Web Services, Inc.)

When Bedrock vs roll-your-own?

  • Choose Bedrock when you want model choice, managed safety/RAG/agents, and private networking without owning infra.
  • Roll your own (EKS + vLLM/TGI) when you need custom runtimes, super-tight $/token, or niche models not offered in Bedrock. (You can still keep Bedrock for managed bits like KB/Guardrails.)


Code

Available Models

import boto3
import os
import json
from dotenv import load_dotenv
from rich.console import Console
from rich.table import Table
from rich import box

# Load environment variables
load_dotenv()

region = os.getenv("AWS_BEDROCK_REGION_NAME", "us-east-1")
aws_access_key_id = os.getenv("AWS_BEDROCK_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_BEDROCK_SECRET_ACCESS_KEY")

# Initialize clients
bedrock_client = boto3.client(
    "bedrock",
    region_name=region,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)

console = Console()
table = Table(
    title="🧠 AWS Bedrock Models You Can Invoke",
    header_style="bold cyan",
    box=box.ROUNDED,
)

table.add_column("✅ Model Name", style="bold white", no_wrap=True)
table.add_column("Provider", style="magenta")
table.add_column("Status", style="green")
table.add_column("Model ID (for invoke_model)", style="yellow")

accessible_models = []

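# Fetch models (this is the regional catalog; ACTIVE is a lifecycle status,
# not proof that your account has been granted access to the model)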
resp = bedrock_client.list_foundation_models()

for model in resp.get("modelSummaries", []):
    name = model.get("modelName")
    provider = model.get("providerName")
    status = model.get("modelLifecycle", {}).get("status")
    model_id = model.get("modelId")

    if status and status.upper() == "ACTIVE":
        status_icon = "[green]ACTIVE"
        accessible_models.append(model_id)
    else:
        status_icon = f"[red]{status or 'Unknown'}"

    table.add_row(name or "—", provider or "—", status_icon, model_id or "—")

console.print(table)

console.print(f"\n[bold green]Total ACTIVE models:[/bold green] {len(accessible_models)}\n")

if accessible_models:
    console.print("[bold cyan]You can use any of these model IDs like this:[/bold cyan]")
    example_model = accessible_models[0]
    console.print(
        f"""
[bold white]Example usage:[/bold white]

response = bedrock_runtime.invoke_model(
    modelId="{example_model}",
    body=json.dumps(body),
)
"""
    )
else:
    console.print("[red]No active models found. Enable model access in the AWS Bedrock console.[/red]")
                                   🧠 AWS Bedrock Models You Can Invoke                                    
╭─────────────────────────────────┬──────────────┬────────┬───────────────────────────────────────────────╮
│ ✅ Model Name                    Provider      Status  Model ID (for invoke_model)                   │
├─────────────────────────────────┼──────────────┼────────┼───────────────────────────────────────────────┤
│ Stable Image Remove Background   Stability AI  ACTIVE  stability.stable-image-remove-background-v1:0 │
│ Stable Image Style Guide         Stability AI  ACTIVE  stability.stable-image-style-guide-v1:0       │
│ Stable Image Control Sketch      Stability AI  ACTIVE  stability.stable-image-control-sketch-v1:0    │
│ Claude Sonnet 4                  Anthropic     ACTIVE  anthropic.claude-sonnet-4-20250514-v1:0       │
│ Stable Image Erase Object        Stability AI  ACTIVE  stability.stable-image-erase-object-v1:0      │
│ Stable Image Control Structure   Stability AI  ACTIVE  stability.stable-image-control-structure-v1:0 │
│ Stable Image Search and Recolor  Stability AI  ACTIVE  stability.stable-image-search-recolor-v1:0    │
│ gpt-oss-120b                     OpenAI        ACTIVE  openai.gpt-oss-120b-1:0                       │
│ Pegasus v1.2                     TwelveLabs    ACTIVE  twelvelabs.pegasus-1-2-v1:0                   │
│ Stable Image Style Transfer      Stability AI  ACTIVE  stability.stable-style-transfer-v1:0          │
│ Embed v4                         Cohere        ACTIVE  cohere.embed-v4:0                             │
│ Claude Sonnet 4.5                Anthropic     ACTIVE  anthropic.claude-sonnet-4-5-20250929-v1:0     │
│ Marengo Embed v2.7               TwelveLabs    ACTIVE  twelvelabs.marengo-embed-2-7-v1:0             │
│ Stable Image Search and Replace  Stability AI  ACTIVE  stability.stable-image-search-replace-v1:0    │
│ Qwen3-Coder-30B-A3B-Instruct     Qwen          ACTIVE  qwen.qwen3-coder-30b-a3b-v1:0                 │
│ Qwen3 32B (dense)                Qwen          ACTIVE  qwen.qwen3-32b-v1:0                           │
│ Stable Image Inpaint             Stability AI  ACTIVE  stability.stable-image-inpaint-v1:0           │
│ gpt-oss-20b                      OpenAI        ACTIVE  openai.gpt-oss-20b-1:0                        │
│ Claude Opus 4.1                  Anthropic     ACTIVE  anthropic.claude-opus-4-1-20250805-v1:0       │
│ Nova Pro                         Amazon        ACTIVE  amazon.nova-pro-v1:0                          │
│ Titan Text Large                 Amazon        ACTIVE  amazon.titan-tg1-large                        │
│ Titan Image Generator G1         Amazon        ACTIVE  amazon.titan-image-generator-v1:0             │
│ Titan Image Generator G1         Amazon        ACTIVE  amazon.titan-image-generator-v1               │
│ Titan Image Generator G1 v2      Amazon        ACTIVE  amazon.titan-image-generator-v2:0             │
│ Nova Premier                     Amazon        ACTIVE  amazon.nova-premier-v1:0:8k                   │
│ Nova Premier                     Amazon        ACTIVE  amazon.nova-premier-v1:0:20k                  │
│ Nova Premier                     Amazon        ACTIVE  amazon.nova-premier-v1:0:1000k                │
│ Nova Premier                     Amazon        ACTIVE  amazon.nova-premier-v1:0:mm                   │
│ Nova Premier                     Amazon        ACTIVE  amazon.nova-premier-v1:0                      │
│ Nova Pro                         Amazon        ACTIVE  amazon.nova-pro-v1:0:24k                      │
│ Nova Pro                         Amazon        ACTIVE  amazon.nova-pro-v1:0:300k                     │
│ Nova Lite                        Amazon        ACTIVE  amazon.nova-lite-v1:0:24k                     │
│ Nova Lite                        Amazon        ACTIVE  amazon.nova-lite-v1:0:300k                    │
│ Nova Lite                        Amazon        ACTIVE  amazon.nova-lite-v1:0                         │
│ Nova Canvas                      Amazon        ACTIVE  amazon.nova-canvas-v1:0                       │
│ Nova Reel                        Amazon        ACTIVE  amazon.nova-reel-v1:0                         │
│ Nova Reel                        Amazon        ACTIVE  amazon.nova-reel-v1:1                         │
│ Nova Micro                       Amazon        ACTIVE  amazon.nova-micro-v1:0:24k                    │
│ Nova Micro                       Amazon        ACTIVE  amazon.nova-micro-v1:0:128k                   │
│ Nova Micro                       Amazon        ACTIVE  amazon.nova-micro-v1:0                        │
│ Nova Sonic                       Amazon        ACTIVE  amazon.nova-sonic-v1:0                        │
│ Titan Text Embeddings v2         Amazon        ACTIVE  amazon.titan-embed-g1-text-02                 │
│ Titan Text G1 - Lite             Amazon        ACTIVE  amazon.titan-text-lite-v1:0:4k                │
│ Titan Text G1 - Lite             Amazon        ACTIVE  amazon.titan-text-lite-v1                     │
│ Titan Text G1 - Express          Amazon        ACTIVE  amazon.titan-text-express-v1:0:8k             │
│ Titan Text G1 - Express          Amazon        ACTIVE  amazon.titan-text-express-v1                  │
│ Titan Embeddings G1 - Text       Amazon        ACTIVE  amazon.titan-embed-text-v1:2:8k               │
│ Titan Embeddings G1 - Text       Amazon        ACTIVE  amazon.titan-embed-text-v1                    │
│ Titan Text Embeddings V2         Amazon        ACTIVE  amazon.titan-embed-text-v2:0:8k               │
│ Titan Text Embeddings V2         Amazon        ACTIVE  amazon.titan-embed-text-v2:0                  │
│ Titan Multimodal Embeddings G1   Amazon        ACTIVE  amazon.titan-embed-image-v1:0                 │
│ Titan Multimodal Embeddings G1   Amazon        ACTIVE  amazon.titan-embed-image-v1                   │
│ SDXL 1.0                         Stability AI  LEGACY  stability.stable-diffusion-xl-v1:0            │
│ SDXL 1.0                         Stability AI  LEGACY  stability.stable-diffusion-xl-v1              │
│ Jamba 1.5 Large                  AI21 Labs     ACTIVE  ai21.jamba-1-5-large-v1:0                     │
│ Jamba 1.5 Mini                   AI21 Labs     ACTIVE  ai21.jamba-1-5-mini-v1:0                      │
│ Claude Instant                   Anthropic     LEGACY  anthropic.claude-instant-v1:2:100k            │
│ Claude                           Anthropic     LEGACY  anthropic.claude-v2:0:18k                     │
│ Claude                           Anthropic     LEGACY  anthropic.claude-v2:0:100k                    │
│ Claude                           Anthropic     LEGACY  anthropic.claude-v2:1:18k                     │
│ Claude                           Anthropic     LEGACY  anthropic.claude-v2:1:200k                    │
│ Claude 3 Sonnet                  Anthropic     LEGACY  anthropic.claude-3-sonnet-20240229-v1:0:28k   │
│ Claude 3 Sonnet                  Anthropic     LEGACY  anthropic.claude-3-sonnet-20240229-v1:0:200k  │
│ Claude 3 Sonnet                  Anthropic     LEGACY  anthropic.claude-3-sonnet-20240229-v1:0       │
│ Claude 3 Haiku                   Anthropic     ACTIVE  anthropic.claude-3-haiku-20240307-v1:0:48k    │
│ Claude 3 Haiku                   Anthropic     ACTIVE  anthropic.claude-3-haiku-20240307-v1:0:200k   │
│ Claude 3 Haiku                   Anthropic     ACTIVE  anthropic.claude-3-haiku-20240307-v1:0        │
│ Claude 3 Opus                    Anthropic     ACTIVE  anthropic.claude-3-opus-20240229-v1:0:12k     │
│ Claude 3 Opus                    Anthropic     ACTIVE  anthropic.claude-3-opus-20240229-v1:0:28k     │
│ Claude 3 Opus                    Anthropic     ACTIVE  anthropic.claude-3-opus-20240229-v1:0:200k    │
│ Claude 3 Opus                    Anthropic     ACTIVE  anthropic.claude-3-opus-20240229-v1:0         │
│ Claude 3.5 Sonnet                Anthropic     ACTIVE  anthropic.claude-3-5-sonnet-20240620-v1:0     │
│ Claude 3.5 Sonnet v2             Anthropic     ACTIVE  anthropic.claude-3-5-sonnet-20241022-v2:0     │
│ Claude 3.7 Sonnet                Anthropic     ACTIVE  anthropic.claude-3-7-sonnet-20250219-v1:0     │
│ Claude 3.5 Haiku                 Anthropic     ACTIVE  anthropic.claude-3-5-haiku-20241022-v1:0      │
│ Claude Opus 4                    Anthropic     ACTIVE  anthropic.claude-opus-4-20250514-v1:0         │
│ Command R                        Cohere        ACTIVE  cohere.command-r-v1:0                         │
│ Command R+                       Cohere        ACTIVE  cohere.command-r-plus-v1:0                    │
│ Embed English                    Cohere        ACTIVE  cohere.embed-english-v3:0:512                 │
│ Embed English                    Cohere        ACTIVE  cohere.embed-english-v3                       │
│ Embed Multilingual               Cohere        ACTIVE  cohere.embed-multilingual-v3:0:512            │
│ Embed Multilingual               Cohere        ACTIVE  cohere.embed-multilingual-v3                  │
│ Rerank 3.5                       Cohere        ACTIVE  cohere.rerank-v3-5:0                          │
│ DeepSeek-R1                      DeepSeek      ACTIVE  deepseek.r1-v1:0                              │
│ Llama 3 8B Instruct              Meta          ACTIVE  meta.llama3-8b-instruct-v1:0                  │
│ Llama 3 70B Instruct             Meta          ACTIVE  meta.llama3-70b-instruct-v1:0                 │
│ Llama 3.1 8B Instruct            Meta          ACTIVE  meta.llama3-1-8b-instruct-v1:0                │
│ Llama 3.1 70B Instruct           Meta          ACTIVE  meta.llama3-1-70b-instruct-v1:0               │
│ Llama 3.2 11B Instruct           Meta          ACTIVE  meta.llama3-2-11b-instruct-v1:0               │
│ Llama 3.2 90B Instruct           Meta          ACTIVE  meta.llama3-2-90b-instruct-v1:0               │
│ Llama 3.2 1B Instruct            Meta          ACTIVE  meta.llama3-2-1b-instruct-v1:0                │
│ Llama 3.2 3B Instruct            Meta          ACTIVE  meta.llama3-2-3b-instruct-v1:0                │
│ Llama 3.3 70B Instruct           Meta          ACTIVE  meta.llama3-3-70b-instruct-v1:0               │
│ Llama 4 Scout 17B Instruct       Meta          ACTIVE  meta.llama4-scout-17b-instruct-v1:0           │
│ Llama 4 Maverick 17B Instruct    Meta          ACTIVE  meta.llama4-maverick-17b-instruct-v1:0        │
│ Mistral 7B Instruct              Mistral AI    ACTIVE  mistral.mistral-7b-instruct-v0:2              │
│ Mixtral 8x7B Instruct            Mistral AI    ACTIVE  mistral.mixtral-8x7b-instruct-v0:1            │
│ Mistral Large (24.02)            Mistral AI    ACTIVE  mistral.mistral-large-2402-v1:0               │
│ Mistral Small (24.02)            Mistral AI    ACTIVE  mistral.mistral-small-2402-v1:0               │
│ Pixtral Large (25.02)            Mistral AI    ACTIVE  mistral.pixtral-large-2502-v1:0               │
╰─────────────────────────────────┴──────────────┴────────┴───────────────────────────────────────────────╯
Total ACTIVE models: 90

You can use any of these model IDs like this:
Example usage:

response = bedrock_runtime.invoke_model(
    modelId="stability.stable-image-remove-background-v1:0",
    body=json.dumps(body),
)

Invoke Model

model = bedrock_client.get_foundation_model(modelIdentifier="anthropic.claude-3-5-sonnet-20240620-v1:0")
print(json.dumps(model, indent=2))
{
  "ResponseMetadata": {
    "RequestId": "dccb652d-2cc7-4ffb-99fc-3a743cfe8c39",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Mon, 13 Oct 2025 06:05:36 GMT",
      "content-type": "application/json",
      "content-length": "711",
      "connection": "keep-alive",
      "x-amzn-requestid": "dccb652d-2cc7-4ffb-99fc-3a743cfe8c39"
    },
    "RetryAttempts": 0
  },
  "modelDetails": {
    "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
    "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
    "modelName": "Claude 3.5 Sonnet",
    "providerName": "Anthropic",
    "inputModalities": [
      "TEXT",
      "IMAGE"
    ],
    "outputModalities": [
      "TEXT"
    ],
    "responseStreamingSupported": true,
    "customizationsSupported": [],
    "inferenceTypesSupported": [
      "ON_DEMAND",
      "INFERENCE_PROFILE"
    ],
    "modelLifecycle": {
      "status": "ACTIVE"
    }
  }
}
region = os.getenv("AWS_BEDROCK_REGION_NAME", "us-east-1")
region
'ap-southeast-2'
import boto3, os, json
from dotenv import load_dotenv

load_dotenv()

region = os.getenv("AWS_BEDROCK_REGION_NAME", "us-east-1")
aws_access_key_id = os.getenv("AWS_BEDROCK_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_BEDROCK_SECRET_ACCESS_KEY")

bedrock_runtime = boto3.client(
    "bedrock-runtime",
    region_name=region,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)
prompt = """
You are a knowledgeable assistant.
Explain in simple terms how AWS Bedrock works and
how it differs from SageMaker.
"""

body = {
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": prompt}]}
    ],
    "max_tokens": 400,
    "temperature": 0.7,
}

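# Note: this direct call is rejected (see the ValidationException below);
# the model has to be invoked through an inference profile instead.
response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",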
    body=json.dumps(body),
)


result = json.loads(response["body"].read())
print(result["output"]["message"]["content"][0]["text"])
---------------------------------------------------------------------------
ValidationException                       Traceback (most recent call last)
Cell In[4], line 15
      1 prompt = """
      2 You are a knowledgeable assistant.
      3 Explain in simple terms how AWS Bedrock works and
      4 how it differs from SageMaker.
      5 """
      7 body = {
      8     "messages": [
      9         {"role": "user", "content": [{"type": "text", "text": prompt}]}
   (...)     12     "temperature": 0.7,
     13 }
---> 15 response = bedrock_runtime.invoke_model(
     16     modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
     17     body=json.dumps(body),
     18 )
     21 result = json.loads(response["body"].read())
     22 print(result["output"]["message"]["content"][0]["text"])

File ~/Documents/Knowledge/.venv/lib/python3.12/site-packages/botocore/client.py:602, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
    598     raise TypeError(
    599         f"{py_operation_name}() only accepts keyword arguments."
    600     )
    601 # The "self" in this scope is referring to the BaseClient.
--> 602 return self._make_api_call(operation_name, kwargs)

File ~/Documents/Knowledge/.venv/lib/python3.12/site-packages/botocore/context.py:123, in with_current_context.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    121 if hook:
    122     hook()
--> 123 return func(*args, **kwargs)

File ~/Documents/Knowledge/.venv/lib/python3.12/site-packages/botocore/client.py:1078, in BaseClient._make_api_call(self, operation_name, api_params)
   1074     error_code = request_context.get(
   1075         'error_code_override'
   1076     ) or error_info.get("Code")
   1077     error_class = self.exceptions.from_code(error_code)
-> 1078     raise error_class(parsed_response, operation_name)
   1079 else:
   1080     return parsed_response

ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.
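
The error asks for an inference profile ID or ARN. One way to find candidates is to match the bare model ID against the profile list. This is a sketch that assumes system-defined profile IDs are a geography prefix plus the model ID (as in the `apac.`, `au.`, and `global.` ARNs listed further down); the helper name is hypothetical.

```python
def find_profile_ids(profile_summaries, model_id):
    """Return inference profile IDs that end with '.<model_id>'."""
    suffix = "." + model_id
    return [
        p["inferenceProfileId"]
        for p in profile_summaries
        if p.get("inferenceProfileId", "").endswith(suffix)
    ]


# Sample data shaped like list_inference_profiles()["inferenceProfileSummaries"].
sample = [
    {"inferenceProfileId": "apac.anthropic.claude-3-5-sonnet-20240620-v1:0"},
    {"inferenceProfileId": "apac.amazon.nova-pro-v1:0"},
]
print(find_profile_ids(sample, "anthropic.claude-3-5-sonnet-20240620-v1:0"))
```

Any ID it returns can be passed as `modelId` in place of the bare model ID.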

Inference

import boto3, os, json
from dotenv import load_dotenv

load_dotenv()

region = os.getenv("AWS_BEDROCK_REGION_NAME", "us-east-1")
aws_access_key_id = os.getenv("AWS_BEDROCK_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_BEDROCK_SECRET_ACCESS_KEY")

# --- Control-plane client (metadata)
bedrock_client = boto3.client(
    "bedrock",
    region_name=region,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)

profiles = bedrock_client.list_inference_profiles()

print("\n🧠 Available Inference Profiles:\n")
for p in profiles.get("inferenceProfileSummaries", []):
    name = p.get("inferenceProfileName")
    arn = p.get("inferenceProfileArn")
    model_id = p.get("modelId") or p.get("modelArn") or "—"
    region_hint = arn.split(":")[3] if arn else "?"
    print(f"- {name}")
    print(f"  ARN: {arn}")
    print(f"  Model: {model_id}")
    print(f"  Region: {region_hint}")
    print()

🧠 Available Inference Profiles:

- APAC Anthropic Claude 3 Sonnet
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.anthropic.claude-3-sonnet-20240229-v1:0
  Model: —
  Region: ap-southeast-2

- APAC Anthropic Claude 3.5 Sonnet
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.anthropic.claude-3-5-sonnet-20240620-v1:0
  Model: —
  Region: ap-southeast-2

- APAC Anthropic Claude 3 Haiku
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.anthropic.claude-3-haiku-20240307-v1:0
  Model: —
  Region: ap-southeast-2

- APAC Anthropic Claude 3.5 Sonnet v2
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0
  Model: —
  Region: ap-southeast-2

- APAC Anthropic Claude 3.7 Sonnet
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.anthropic.claude-3-7-sonnet-20250219-v1:0
  Model: —
  Region: ap-southeast-2

- APAC Nova Micro
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.amazon.nova-micro-v1:0
  Model: —
  Region: ap-southeast-2

- APAC Nova Lite
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.amazon.nova-lite-v1:0
  Model: —
  Region: ap-southeast-2

- APAC Nova Pro
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.amazon.nova-pro-v1:0
  Model: —
  Region: ap-southeast-2

- APAC Claude Sonnet 4
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/apac.anthropic.claude-sonnet-4-20250514-v1:0
  Model: —
  Region: ap-southeast-2

- Global Embed v4
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/global.cohere.embed-v4:0
  Model: —
  Region: ap-southeast-2

- Global Claude Sonnet 4.5
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/global.anthropic.claude-sonnet-4-5-20250929-v1:0
  Model: —
  Region: ap-southeast-2

- AU AU Anthropic Claude Sonnet 4.5
  ARN: arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/au.anthropic.claude-sonnet-4-5-20250929-v1:0
  Model: —
  Region: ap-southeast-2
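
Every Model line above is a placeholder dash because, as far as I can tell, each profile summary keeps its underlying models in a `models` list (one `modelArn` per entry) rather than in top-level `modelId` / `modelArn` keys. A small sketch of reading that list instead; the helper name is hypothetical and the sample ARNs are illustrative.

```python
def profile_model_arns(summary):
    """Return the model ARNs that an inference profile can route to."""
    return [m["modelArn"] for m in summary.get("models", []) if "modelArn" in m]


# Illustrative summary shaped like one list_inference_profiles entry.
example = {
    "inferenceProfileName": "APAC Anthropic Claude 3 Haiku",
    "models": [
        {"modelArn": "arn:aws:bedrock:ap-southeast-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"},
        {"modelArn": "arn:aws:bedrock:ap-northeast-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"},
    ],
}
for arn in profile_model_arns(example):
    print(arn)
```

A cross-region profile usually lists one ARN per region it can route to, which also shows where requests may be served.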
# 🧠 Bedrock Claude Inference (APAC, Anthropic)
import boto3, os, json
from dotenv import load_dotenv
from rich.console import Console
from rich.panel import Panel
from rich.table import Table

console = Console()
load_dotenv()

# --- Config ---
region = "ap-southeast-2"  # Sydney
aws_access_key_id = os.getenv("AWS_BEDROCK_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_BEDROCK_SECRET_ACCESS_KEY")

# --- Bedrock Clients ---
bedrock_client = boto3.client(
    "bedrock",
    region_name=region,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)
bedrock_runtime = boto3.client(
    "bedrock-runtime",
    region_name=region,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)

# --- Pick an Anthropic inference profile (reverse name sort; a rough heuristic, not a true "latest" ordering) ---
profiles = bedrock_client.list_inference_profiles()["inferenceProfileSummaries"]
anthropic_profiles = [p for p in profiles if "anthropic" in p["inferenceProfileArn"]]
anthropic_profiles.sort(key=lambda p: p["inferenceProfileName"], reverse=True)

if not anthropic_profiles:
    raise RuntimeError("No Anthropic inference profiles found in your account.")

profile = anthropic_profiles[0]
model_arn = profile["inferenceProfileArn"]
model_name = profile["inferenceProfileName"]

# --- Display available profiles ---
table = Table(title="Available Anthropic Profiles (Sorted)")
table.add_column("Name")
table.add_column("ARN", overflow="fold")
table.add_column("Region")

for p in anthropic_profiles:
    table.add_row(p["inferenceProfileName"], p["inferenceProfileArn"], p["inferenceProfileArn"].split(":")[3])
console.print(table)
console.print(Panel.fit(f"Using latest profile:\n[b]{model_name}[/b]\n{model_arn}", title="Model Selected"))
                                       Available Anthropic Profiles (Sorted)                                       
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Name                                 ARN                                                       Region         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ Global Claude Sonnet 4.5            │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/global.anthropic.claude-sonnet-4-5-20250929-v1:0   │                │
│ AU AU Anthropic Claude Sonnet 4.5   │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/au.anthropic.claude-sonnet-4-5-20250929-v1:0       │                │
│ APAC Claude Sonnet 4                │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/apac.anthropic.claude-sonnet-4-20250514-v1:0       │                │
│ APAC Anthropic Claude 3.7 Sonnet    │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/apac.anthropic.claude-3-7-sonnet-20250219-v1:0     │                │
│ APAC Anthropic Claude 3.5 Sonnet v2 │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0     │                │
│ APAC Anthropic Claude 3.5 Sonnet    │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/apac.anthropic.claude-3-5-sonnet-20240620-v1:0     │                │
│ APAC Anthropic Claude 3 Sonnet      │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/apac.anthropic.claude-3-sonnet-20240229-v1:0       │                │
│ APAC Anthropic Claude 3 Haiku       │ arn:aws:bedrock:ap-southeast-2:762233760445:inference-pr │ ap-southeast-2 │
│                                     │ ofile/apac.anthropic.claude-3-haiku-20240307-v1:0        │                │
└─────────────────────────────────────┴──────────────────────────────────────────────────────────┴────────────────┘
╭──────────────────────────────────────────────── Model Selected ────────────────────────────────────────────────╮
│ Using latest profile:                                                                                          │
│ Global Claude Sonnet 4.5                                                                                       │
│ arn:aws:bedrock:ap-southeast-2:762233760445:inference-profile/global.anthropic.claude-sonnet-4-5-20250929-v1:0 │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
# --- Run prompt ---
prompt = """
You are a knowledgeable assistant.
Explain in simple terms how AWS Bedrock works and how it differs from SageMaker.
"""

body = {
    "anthropic_version": "bedrock-2023-05-31",  # ✅ required for Anthropic models
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": prompt}]}
    ],
    "max_tokens": 400,
    "temperature": 0.7,
}


console.print("[cyan]Running inference...[/cyan]")
response = bedrock_runtime.invoke_model(
    modelId=model_arn,
    body=json.dumps(body),
)

# Parse the JSON body returned by invoke_model
result = json.loads(response["body"].read())

# Anthropic Messages format: a top-level "content" list of typed blocks
if "content" in result:
    text_parts = [c["text"] for c in result["content"] if c["type"] == "text"]
    answer = "\n".join(text_parts)
# Converse-style shape ("output" -> "message"), kept as a defensive fallback
elif "output" in result:
    answer = result["output"]["message"]["content"][0]["text"]
# Unknown shape: show the raw JSON
else:
    answer = json.dumps(result, indent=2)

console.print(Panel.fit(answer.strip(), title="Claude Response", border_style="green"))
Running inference...
╭──────────────────────────────────────────────── Claude Response ────────────────────────────────────────────────╮
 # AWS Bedrock vs SageMaker: Simple Explanation                                                                  
                                                                                                                 
 ## AWS Bedrock                                                                                                  
                                                                                                                 
 **What it is:** A fully managed service that gives you easy access to pre-built AI models from leading          
 companies through a simple API.                                                                                 
                                                                                                                 
 **How it works:**                                                                                               
 - Choose from ready-made foundation models (like Claude, Llama, Stable Diffusion, etc.)                         
 - Call the model through an API - no setup required                                                             
 - Customize models with your own data using simple techniques                                                   
 - Pay only for what you use                                                                                     
                                                                                                                 
 **Think of it like:** Renting a fully furnished apartment - everything is ready to use, just move in and start  
 living.                                                                                                         
                                                                                                                 
 ## AWS SageMaker                                                                                                
                                                                                                                 
 **What it is:** A comprehensive machine learning platform for building, training, and deploying your own custom 
 ML models.                                                                                                      
                                                                                                                 
 **How it works:**                                                                                               
 - Build models from scratch or use existing frameworks                                                          
 - Prepare and process your data                                                                                 
 - Train models using your own algorithms                                                                        
 - Deploy and manage the infrastructure                                                                          
 - Fine-tune everything to your specific needs                                                                   
                                                                                                                 
 **Think of it like:** Building your own house - you have complete control but need to handle construction,      
 materials, and maintenance.                                                                                     
                                                                                                                 
 ## Key Differences                                                                                              
                                                                                                                 
 | Aspect | Bedrock | SageMaker |                                                                                
 |--------|---------|-----------|                                                                                
 | **Complexity** | Simple, low-code | More technical, requires ML expertise |                                   
 | **Use Case** | Using existing AI models | Building custom ML solutions |                                      
 | **Setup Time** | Minutes | Hours to days |                                                                    
 | **Best For** | Quick AI integration, chatbots, content generation | Custom models, specific business problems 
 |                                                                                                               
                                                                                                                 
 **Bottom line:** Use Bedrock for quick access to powerful AI models. Use SageMaker when you need custom ML      
 solutions tailored to unique requirements.                                                                      
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
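The same request can also go through the Converse API listed in the building-blocks table, which removes the Anthropic-specific `anthropic_version` field and the per-provider response parsing. The following is a minimal sketch, not the author's code: `ask_via_converse` is a hypothetical helper name, and it assumes `client` is a boto3 `bedrock-runtime` client like the `bedrock_runtime` object used above.

```python
from typing import Any


def ask_via_converse(
    client: Any,
    model_id: str,
    prompt: str,
    max_tokens: int = 400,
    temperature: float = 0.7,
) -> str:
    """Send a single user turn through the Converse API and return the reply text.

    `client` is assumed to be a boto3 "bedrock-runtime" client (hypothetical
    helper; any object with a compatible `converse` method works).
    """
    response = client.converse(
        modelId=model_id,  # model ID or inference profile ARN
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": max_tokens, "temperature": temperature},
    )
    # Converse returns the same shape for every provider:
    # output -> message -> content (a list of blocks, some of which carry text)
    blocks = response["output"]["message"]["content"]
    return "\n".join(block["text"] for block in blocks if "text" in block)
```

With the objects from the cell above, the call would be `answer = ask_via_converse(bedrock_runtime, model_arn, prompt)`. Because the client is passed in as an argument, the helper can be exercised with a stub object without touching AWS.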