A fully managed platform to build GenAI apps via a single API across many model providers (Anthropic, Meta, Mistral, Cohere, Amazon Nova, etc.), with first-class features for guardrails, RAG (Knowledge Bases), agents, fine-tuning/custom models, and enterprise security (VPC endpoints, KMS). ([Amazon Web Services, Inc.][1])
Author
Benedict Thekkel
Model lineup & modalities (high level)
Amazon Nova family (Micro/Lite/Pro + Canvas/Reel + Premier/Sonic variants): text + multimodal (image/video/speech) options, region coverage varies. (AWS Documentation)
3rd-party FMs: Anthropic Claude, Meta Llama, Mistral, Cohere, AI21 (Jamba), Stability, TwelveLabs, Writer (availability varies by region). (Amazon Web Services, Inc.)
Tip: The Models doc lists model IDs, regions, and streaming support—bookmark it. (AWS Documentation)
The core building blocks
Capability
What it solves
When to use
Converse API
One messages API across models; supports tools/function-calling & streaming.
Create a guardrail policy; attach it in Converse or to your Agent/KB.
Use grounding checks to reduce hallucinations against your RAG context. (AWS Documentation)
Architecture patterns (pick your lane)
A. Pure managed Client → API (private) → Bedrock (Converse) → Guardrails → Knowledge Base Good for: fastest path, least ops.
B. Agentic workflows User → Agent (AgentCore) → Tools (APIs/Lambda), KB, Code-Interpreter, Browser → Model Good for: multi-step tasks, enterprise workflows. (TechRadar)
C. Hybrid RAG Bedrock for inference + OpenSearch/Aurora pgvector for vectors; swap in/out as needs evolve. (KB still a great default.) (AWS Documentation)
Tuning & throughput tips
Prefer streaming for UX; it’s supported on most chat models. (AWS Documentation)
Keep temperature low for factual tasks; raise only for ideation.
Cache retrievals and dedupe media in multi-modal workloads.
For stable high QPS, move to Provisioned Throughput and right-size capacity. (Amazon Web Services, Inc.)
Regions & availability
Model availability (and streaming) is per-model, per-region—check the table before you pick IDs for prod (e.g., some Nova variants in ap-southeast-2 today, others via cross-region). (AWS Documentation)
Common pitfalls (and fixes)
“Model not found” in region → switch to a supported region or use the cross-region listing. (AWS Documentation)
Hallucinations in RAG → enable contextual grounding guardrails and tighten retrieval metadata filters. (Amazon Web Services, Inc.)
Data egress surprises → use VPC endpoints; keep S3 + Bedrock private. (AWS Documentation)
Choose Bedrock when you want model choice, managed safety/RAG/agents, and private networking without owning infra.
Roll your own (EKS + vLLM/TGI) when you need custom runtimes, super-tight $/token, or niche models not offered in Bedrock. (You can still keep Bedrock for managed bits like KB/Guardrails.)
If you tell me your target region (likely ap-southeast-2), model(s), and latency/QPS/budget, I’ll sketch a minimal IaC plan (Terraform/CDK) plus a load test harness you can run today to size on-demand vs provisioned throughput.
Code
import boto3import osfrom dotenv import load_dotenv# Load environment variables from .env (optional)load_dotenv()region = os.getenv("AWS_BEDROCK_REGION_NAME", "us-east-1")aws_access_key_id = os.getenv("AWS_BEDROCK_ACCESS_KEY_ID")aws_secret_access_key = os.getenv("AWS_BEDROCK_SECRET_ACCESS_KEY")# Initialize the Bedrock control-plane clientbedrock_client = boto3.client("bedrock", region_name=region, aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key,)# List available foundation modelsresp = bedrock_client.list_foundation_models()print("\n🧠 Available AWS Bedrock Models:\n")for model in resp.get("modelSummaries", []): name = model.get("modelName") provider = model.get("providerName") status = model.get("modelLifecycle", {}).get("status")print(f"- {name} ({provider}) — {status}")
🧠 Available AWS Bedrock Models:
- Stable Image Remove Background (Stability AI) — ACTIVE
- Stable Image Style Guide (Stability AI) — ACTIVE
- Stable Image Control Sketch (Stability AI) — ACTIVE
- Claude Sonnet 4 (Anthropic) — ACTIVE
- Stable Image Erase Object (Stability AI) — ACTIVE
- Stable Image Control Structure (Stability AI) — ACTIVE
- Stable Image Search and Recolor (Stability AI) — ACTIVE
- gpt-oss-120b (OpenAI) — ACTIVE
- Pegasus v1.2 (TwelveLabs) — ACTIVE
- Stable Image Style Transfer (Stability AI) — ACTIVE
- Embed v4 (Cohere) — ACTIVE
- Claude Sonnet 4.5 (Anthropic) — ACTIVE
- Marengo Embed v2.7 (TwelveLabs) — ACTIVE
- Stable Image Search and Replace (Stability AI) — ACTIVE
- Qwen3-Coder-30B-A3B-Instruct (Qwen) — ACTIVE
- Qwen3 32B (dense) (Qwen) — ACTIVE
- Stable Image Inpaint (Stability AI) — ACTIVE
- gpt-oss-20b (OpenAI) — ACTIVE
- Claude Opus 4.1 (Anthropic) — ACTIVE
- Titan Text Large (Amazon) — ACTIVE
- Titan Image Generator G1 (Amazon) — ACTIVE
- Titan Image Generator G1 (Amazon) — ACTIVE
- Titan Image Generator G1 v2 (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Premier (Amazon) — ACTIVE
- Nova Pro (Amazon) — ACTIVE
- Nova Pro (Amazon) — ACTIVE
- Nova Pro (Amazon) — ACTIVE
- Nova Lite (Amazon) — ACTIVE
- Nova Lite (Amazon) — ACTIVE
- Nova Lite (Amazon) — ACTIVE
- Nova Canvas (Amazon) — ACTIVE
- Nova Reel (Amazon) — ACTIVE
- Nova Reel (Amazon) — ACTIVE
- Nova Micro (Amazon) — ACTIVE
- Nova Micro (Amazon) — ACTIVE
- Nova Micro (Amazon) — ACTIVE
- Nova Sonic (Amazon) — ACTIVE
- Titan Text Embeddings v2 (Amazon) — ACTIVE
- Titan Text G1 - Lite (Amazon) — ACTIVE
- Titan Text G1 - Lite (Amazon) — ACTIVE
- Titan Text G1 - Express (Amazon) — ACTIVE
- Titan Text G1 - Express (Amazon) — ACTIVE
- Titan Embeddings G1 - Text (Amazon) — ACTIVE
- Titan Embeddings G1 - Text (Amazon) — ACTIVE
- Titan Text Embeddings V2 (Amazon) — ACTIVE
- Titan Text Embeddings V2 (Amazon) — ACTIVE
- Titan Multimodal Embeddings G1 (Amazon) — ACTIVE
- Titan Multimodal Embeddings G1 (Amazon) — ACTIVE
- SDXL 1.0 (Stability AI) — LEGACY
- SDXL 1.0 (Stability AI) — LEGACY
- Jamba 1.5 Large (AI21 Labs) — ACTIVE
- Jamba 1.5 Mini (AI21 Labs) — ACTIVE
- Claude Instant (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude (Anthropic) — LEGACY
- Claude 3 Sonnet (Anthropic) — LEGACY
- Claude 3 Sonnet (Anthropic) — LEGACY
- Claude 3 Sonnet (Anthropic) — LEGACY
- Claude 3 Haiku (Anthropic) — ACTIVE
- Claude 3 Haiku (Anthropic) — ACTIVE
- Claude 3 Haiku (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3 Opus (Anthropic) — ACTIVE
- Claude 3.5 Sonnet (Anthropic) — ACTIVE
- Claude 3.5 Sonnet v2 (Anthropic) — ACTIVE
- Claude 3.7 Sonnet (Anthropic) — ACTIVE
- Claude 3.5 Haiku (Anthropic) — ACTIVE
- Claude Opus 4 (Anthropic) — ACTIVE
- Command R (Cohere) — ACTIVE
- Command R+ (Cohere) — ACTIVE
- Embed English (Cohere) — ACTIVE
- Embed English (Cohere) — ACTIVE
- Embed Multilingual (Cohere) — ACTIVE
- Embed Multilingual (Cohere) — ACTIVE
- Rerank 3.5 (Cohere) — ACTIVE
- DeepSeek-R1 (DeepSeek) — ACTIVE
- Llama 3 8B Instruct (Meta) — ACTIVE
- Llama 3 70B Instruct (Meta) — ACTIVE
- Llama 3.1 8B Instruct (Meta) — ACTIVE
- Llama 3.1 70B Instruct (Meta) — ACTIVE
- Llama 3.2 11B Instruct (Meta) — ACTIVE
- Llama 3.2 90B Instruct (Meta) — ACTIVE
- Llama 3.2 1B Instruct (Meta) — ACTIVE
- Llama 3.2 3B Instruct (Meta) — ACTIVE
- Llama 3.3 70B Instruct (Meta) — ACTIVE
- Llama 4 Scout 17B Instruct (Meta) — ACTIVE
- Llama 4 Maverick 17B Instruct (Meta) — ACTIVE
- Mistral 7B Instruct (Mistral AI) — ACTIVE
- Mixtral 8x7B Instruct (Mistral AI) — ACTIVE
- Mistral Large (24.02) (Mistral AI) — ACTIVE
- Mistral Small (24.02) (Mistral AI) — ACTIVE
- Pixtral Large (25.02) (Mistral AI) — ACTIVE