S3
✅ What is S3?
S3 (Simple Storage Service) is Amazon’s object storage service.
You store objects (files) inside buckets (containers).
Key Attributes
- Scalable: Store unlimited data.
- Durable: 99.999999999% (11 nines) durability.
- Highly available: Redundant across multiple AZs.
- Secure: Fine-grained access control with IAM, bucket policies.
- Cost-effective: Pay-as-you-go pricing.
✅ Core Concepts
Term | Meaning |
---|---|
Bucket | A container for objects. The bucket name must be globally unique across AWS. |
Object | Files (data) and their metadata stored in S3. |
Key | Unique identifier for an object inside a bucket (like a full file path). |
Region | Buckets live in AWS regions. Data residency matters! |
Storage Class | Defines durability, availability, and cost. (Standard, Infrequent Access, Glacier, etc.) |
Versioning | Store multiple versions of the same object. |
Access Control | IAM Policies, Bucket Policies, ACLs, Block Public Access, S3 Access Points. |
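To make the "key as a full file path" idea concrete, here is a minimal sketch of how a flat list of keys can be presented as folders by grouping on a delimiter. It only imitates the idea locally; `list_level` and the sample keys are hypothetical names for illustration and do not call S3.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split keys under `prefix` into direct objects and folder-like prefixes."""
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The key continues past a delimiter, so it sits in a "subfolder".
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


# Hypothetical keys: S3 stores them flat, the "/" is just part of the name.
keys = ["logs/2024/app.log", "logs/2025/app.log", "logs/readme.txt", "index.html"]
print(list_level(keys))
print(list_level(keys, "logs/"))
```

There are no real directories here: a "folder" is just a shared key prefix.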
✅ How S3 Works
- Create a bucket.
- Upload objects into the bucket.
- Control access (IAM, policies).
- Retrieve objects via HTTP/HTTPS REST API or SDKs.
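As a rough illustration of the last step, the sketch below builds the HTTPS address of an object in the virtual-hosted style. The bucket, Region, and key are made-up values, and `object_url` is a hypothetical helper; a plain GET on such a URL only succeeds for publicly readable objects, otherwise the request has to be signed (for example by an SDK or a presigned URL).

```python
from urllib.parse import quote


def object_url(bucket, region, key):
    """Build a virtual-hosted-style HTTPS URL for an object."""
    # Keep "/" unescaped so the key still reads like a path.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


print(object_url("my-bucket", "eu-west-1", "reports/2024 summary.pdf"))
```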
✅ Common Use Cases
- Hosting static websites (HTML, CSS, JS)
- Backup & restore
- Disaster recovery
- Data lakes (for analytics with Athena, Redshift)
- Media storage (images, videos)
- Software distribution (app binaries, updates)
- Serverless apps (Lambda, API Gateway integration)
- CDN origin for CloudFront
✅ Security in S3
- IAM Policies: Fine-grained user and role permissions.
- Bucket Policies: Public/private access control at bucket level.
- Block Public Access (BPA): Prevent unintended public exposure.
- S3 Access Points: Simplify access management in large-scale environments.
- Encryption:
  - Server-Side Encryption (SSE):
    - SSE-S3 (AWS-managed keys)
    - SSE-KMS (keys managed in AWS KMS, including customer-managed keys)
    - SSE-C (customer-provided keys)
  - Client-Side Encryption: You encrypt before upload.
- MFA Delete: Require MFA to permanently delete object versions or to change the bucket's versioning state.
- CloudTrail Logs: Audit access and actions.
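One way a bucket policy can tighten access is to reject any request that does not use TLS. The sketch below generates such a policy document; the bucket name is a placeholder and `deny_insecure_transport_policy` is a hypothetical helper, so treat it as a starting point to review rather than a finished policy.

```python
import json


def deny_insecure_transport_policy(bucket):
    """Return a bucket policy that denies every request made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy = deny_insecure_transport_policy("my-bucket")
print(json.dumps(policy, indent=2))
```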
✅ Storage Classes & Costs
Storage Class | Use Case | Durability / Availability | Retrieval Cost | Approx. Storage Price (per TB-month, varies by Region) |
---|---|---|---|---|
Standard | General purpose | 11 9’s / 99.99% | No retrieval fees | $23/TB |
Intelligent-Tiering | Unknown access patterns | 11 9’s / 99.9-99.99% | Minimal, auto-tiering | |
Standard-IA | Infrequent access | 11 9’s / 99.9% | Retrieval fees apply | |
One Zone-IA | Cheap infrequent access (1 AZ only) | 11 9’s / 99.5% | Retrieval fees apply | |
Glacier Instant Retrieval | Archival, quick retrieval | 11 9’s / 99.9% | Low retrieval costs | $4/TB |
Glacier Flexible Retrieval (formerly Glacier) | Long-term archive (retrieval in minutes to hours) | 11 9’s / 99.99% | Retrieval fees depend on speed tier | |
Glacier Deep Archive | Compliance archive (retrieval in 12+ hours) | 11 9’s / 99.99% | Very cheap storage, slowest retrieval | |
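To see what the price gap means in practice, here is a small back-of-the-envelope sketch. It uses only the two approximate per-TB figures listed in the table, treats them as monthly storage prices, and ignores request, retrieval, and transfer charges; the dictionary keys and `monthly_storage_cost` are illustrative names.

```python
# Approximate storage prices in USD per TB-month, taken from the table above.
PRICE_PER_TB_MONTH = {
    "STANDARD": 23.0,
    "GLACIER_IR": 4.0,
}


def monthly_storage_cost(tb_stored, storage_class):
    """Estimate the monthly storage bill for a given amount of data."""
    return tb_stored * PRICE_PER_TB_MONTH[storage_class]


for storage_class in PRICE_PER_TB_MONTH:
    print(storage_class, monthly_storage_cost(10, storage_class))
```

For real estimates, check current pricing for your Region, since retrieval fees can outweigh the storage savings for frequently read data.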
✅ Performance
- Scale: Unlimited objects and petabytes of data.
- Performance tips:
  - Prefix parallelization: S3 supports at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix, so spreading keys across prefixes raises total throughput.
  - Multipart Uploads: Upload large files in chunks (recommended for files over 100MB).
  - Transfer Acceleration: Speeds uploads over long distances (uses CloudFront edge locations).
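The multipart tip can be sketched as a simple planning step: split a file into byte ranges that could each be uploaded as one part. This example only computes the ranges and uploads nothing. It assumes the commonly documented limits of a 5 MiB minimum part size (except the last part) and at most 10,000 parts; `plan_parts` and the 8 MiB default are illustrative choices.

```python
MIN_PART_SIZE = 5 * 1024**2  # 5 MiB, minimum for every part except the last
MAX_PARTS = 10_000           # maximum number of parts in one upload


def plan_parts(total_size, part_size=8 * 1024**2):
    """Return (part_number, offset, length) tuples covering total_size bytes."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    part_count = -(-total_size // part_size)  # ceiling division
    if part_count > MAX_PARTS:
        raise ValueError("too many parts, choose a larger part_size")
    parts = []
    for number in range(1, part_count + 1):
        offset = (number - 1) * part_size
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
    return parts


for part in plan_parts(20 * 1024**2):
    print(part)
```

Because the parts are independent, they can be uploaded in parallel and retried individually.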
✅ Hosting Static Websites
- Create a public bucket.
- Enable Static Website Hosting in bucket settings.
- Upload index.html and error.html.
- Optional: Use CloudFront for SSL/TLS and a custom domain.
Example:
http://your-bucket-name.s3-website-region.amazonaws.com
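The hosting settings boil down to a small configuration document naming the index and error pages. The sketch below builds that structure and the endpoint in the pattern shown above. Both helper names are hypothetical, and the exact endpoint form (dash or dot before the Region) differs between Regions, so confirm it in the bucket's properties.

```python
def website_configuration(index="index.html", error="error.html"):
    """Build the website settings: which page to serve by default and on errors."""
    return {
        "IndexDocument": {"Suffix": index},
        "ErrorDocument": {"Key": error},
    }


def website_endpoint(bucket, region):
    """Build the website endpoint using the dash-style pattern shown above."""
    # Website endpoints are plain HTTP, which is why CloudFront is used for TLS.
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"


print(website_configuration())
print(website_endpoint("your-bucket-name", "us-east-1"))
```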
✅ Event Notifications
Trigger events to:
- Lambda
- SQS
- SNS
Examples:
- Trigger Lambda on upload to process images.
- Send a message to SQS when a file is deleted.
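For the Lambda case, a handler typically starts by pulling the bucket and key out of each event record. The sketch below shows that step only, with a trimmed, made-up event for illustration. Object keys arrive URL-encoded in the notification, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Collect (bucket, key) pairs for newly created objects in an S3 event."""
    uploaded = []
    for record in event.get("Records", []):
        # Skip anything that is not an upload (for example, deletions).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    return uploaded


# Trimmed sample event with hypothetical names; real events carry more fields.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-bucket"},
                "object": {"key": "photos/summer+trip.jpg"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "my-bucket"},
                "object": {"key": "photos/old.jpg"},
            },
        },
    ]
}
print(handler(sample_event))
```

A real image-processing function would go on to download each object and write the result under a different prefix, so that its own output does not trigger it again.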
✅ Cross-Origin Resource Sharing (CORS)
Allow or restrict which domains can access your bucket resources (needed for frontends accessing S3 via JS):
[
  {
    "AllowedOrigins": ["*"],
    "AllowedMethods": ["GET", "POST", "PUT"],
    "AllowedHeaders": ["*"]
  }
]
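The rule above allows every origin, which is convenient for testing but broad for production. As a simplified illustration of how such rules are evaluated, the sketch below checks an origin and method against a rule list. It is not S3's actual matching logic: `cors_allows` is a hypothetical helper, it ignores headers, and `https://app.example.com` is a placeholder origin.

```python
from fnmatch import fnmatchcase


def cors_allows(rules, origin, method):
    """Return True if any rule permits this origin and HTTP method."""
    for rule in rules:
        origin_ok = any(
            fnmatchcase(origin, pattern) for pattern in rule["AllowedOrigins"]
        )
        if origin_ok and method in rule["AllowedMethods"]:
            return True
    return False


# The permissive rule from the example above.
open_rules = [
    {
        "AllowedOrigins": ["*"],
        "AllowedMethods": ["GET", "POST", "PUT"],
        "AllowedHeaders": ["*"],
    }
]

# A tighter, hypothetical alternative: one known frontend, read-only.
tight_rules = [
    {
        "AllowedOrigins": ["https://app.example.com"],
        "AllowedMethods": ["GET"],
        "AllowedHeaders": ["*"],
    }
]

print(cors_allows(open_rules, "https://anything.example", "PUT"))
print(cors_allows(tight_rules, "https://anything.example", "GET"))
```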
✅ Cross-Region Replication (CRR)
Automatically replicate objects to a bucket in another AWS Region for:
- Compliance
- Disaster recovery
- Latency improvements
✅ Logging and Monitoring
- Server Access Logging: Detailed records of the requests made to a bucket.
- CloudTrail: Tracks bucket and object activity for compliance/auditing.
- CloudWatch Metrics:
  - Number of objects
  - Bytes stored
  - 4xx/5xx errors
- AWS Config Rules: Ensure compliance (e.g., no public buckets).
✅ CLI & SDK Example Commands
CLI Upload
aws s3 cp myfile.txt s3://my-bucket/
CLI Sync (--delete also removes objects from the bucket that no longer exist locally)
aws s3 sync ./local-folder s3://my-bucket/ --delete
Presigned URL (temporary access to private file)
aws s3 presign s3://my-bucket/myfile.txt --expires-in 3600
✅ Backup & Restore
- Use versioning and MFA Delete for strong backup policies.
- Object Lock: WORM (Write Once, Read Many) for compliance (financial/legal).
✅ Data Lifecycle Policies
Automate:
- Transition to IA or Glacier
- Delete expired versions
- Clean up incomplete multipart uploads
Example:
{
  "Rules": [
    {
      "ID": "TransitionToGlacier",
      "Prefix": "",
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}
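The example covers only the transition. A rule that also handles the other two automation goals (expiring old versions and cleaning up incomplete multipart uploads) might look like the sketch below, generated from Python so the numbers are easy to tweak. The day counts and rule ID are arbitrary illustrations, `lifecycle_rule` is a hypothetical helper, and the rule uses the `Filter` form of the prefix setting.

```python
import json


def lifecycle_rule(rule_id, prefix="", glacier_after_days=30,
                   noncurrent_expire_days=90, abort_multipart_days=7):
    """Build one lifecycle rule covering transition, old versions, and cleanup."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},  # an empty prefix applies to every object
        "Status": "Enabled",
        "Transitions": [
            {"Days": glacier_after_days, "StorageClass": "GLACIER"}
        ],
        # Remove versions that have been noncurrent for this many days.
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_expire_days},
        # Discard parts of uploads that were started but never completed.
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_multipart_days},
    }


config = {"Rules": [lifecycle_rule("ArchiveAndCleanUp")]}
print(json.dumps(config, indent=2))
```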
✅ Security Best Practices
- Block Public Access unless explicitly needed.
- Encrypt data at rest (SSE-KMS recommended).
- Grant IAM permissions according to the principle of least privilege.
- Enable CloudTrail logging.
- Use VPC Endpoints to avoid public internet traffic.
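The first practice can be checked mechanically. Block Public Access consists of four independent switches, and the sketch below reports which of them are not enabled in a given configuration. The configuration here is a hand-written dictionary for illustration, and `missing_bpa_settings` is a hypothetical helper; in practice the values would come from the bucket or account settings.

```python
# The four Block Public Access switches.
BPA_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_bpa_settings(config):
    """Return the names of BPA switches that are absent or turned off."""
    return [name for name in BPA_SETTINGS if not config.get(name, False)]


# Hypothetical configuration with one switch left off.
example_config = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": False,
    "RestrictPublicBuckets": True,
}
print(missing_bpa_settings(example_config))
```

An empty result means all four switches are on, which is the safe default for buckets that do not need public access.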
✅ Pricing Overview
- Storage costs (GB/month)
- Request costs (PUT, GET, etc.)
- Data transfer costs (out to the internet or cross-region)
- Replication & Lifecycle transition costs
💡 Use the AWS Pricing Calculator to estimate your costs:
https://calculator.aws.amazon.com
🚀 Next Steps
- Secure your buckets → IAM + Bucket Policy + BPA
- Versioning & Object Lock → Data Protection
- Lifecycle policies → Cost Optimization
- Integrate with other AWS services (Lambda, CloudFront, Athena)
- Audit and monitor → CloudTrail, CloudWatch
TL;DR
Concept | Quick Note |
---|---|
Durability | 11 9’s → almost indestructible |
Access | IAM + Policies + ACLs + BPA |
Storage Classes | Optimize cost by choosing the right one |
Encryption | SSE-S3, SSE-KMS, SSE-C (and TLS in transit) |
Static Site Hosting | Cheap, easy, scalable |
Cross-region replication | Multi-region redundancy |