Docker, Kubernetes & AWS
Containers, orchestration, and cloud architecture — from fundamentals to production patterns with trade-offs and bottlenecks.
Docker — Containers & Images
Container vs VM — Differences & Trade-offs
Must Know
| | Virtual Machine (VM) | Container |
|---|---|---|
| Isolation | Full OS isolation (hypervisor) | Process isolation (shared OS kernel) |
| Boot time | Minutes | Milliseconds |
| Size | GBs (full OS image) | MBs (only app + libs) |
| Security | Strong (separate kernel) | Weaker (shared kernel, namespaces) |
| Overhead | High (hypervisor + full OS) | Near-zero (host kernel) |
| Portability | Slower (large images) | Fast (OCI images, Docker Hub) |
| Use case | Multi-OS, legacy apps, full isolation | Microservices, CI/CD, cloud-native |
VM Stack: Container Stack:
┌─────────────────────┐ ┌─────────────────────┐
│ Application │ │ App A │ App B │ App C │
├─────────────────────┤ ├───────┴───────┴───────┤
│ Guest OS │ │ Container Runtime │
├─────────────────────┤ │ (Docker / containerd)│
│ Hypervisor │ ├───────────────────────┤
├─────────────────────┤ │ Host OS / Kernel │
│ Host Hardware │ ├───────────────────────┤
└─────────────────────┘ │ Host Hardware │
└───────────────────────┘
Docker uses: Linux namespaces (PID, net, mnt, uts, ipc)
cgroups (CPU, memory, I/O limits)
Union filesystem (OverlayFS — layer-based)
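A minimal sketch of those cgroup limits using the Docker SDK for Python (docker-py, pip install docker); the image and limit values here are arbitrary examples:
import docker

client = docker.from_env()

# Run a container with explicit cgroup limits (memory, CPU share, PID count)
container = client.containers.run(
    "alpine:3.19", ["sleep", "60"],
    detach=True,
    mem_limit="256m",        # cgroup memory limit; exceeding it gets the process OOM-killed
    nano_cpus=500_000_000,   # 0.5 CPU (cgroup CPU quota)
    pids_limit=100,          # cap process count inside the PID namespace
)
print(container.attrs["HostConfig"]["Memory"])  # 268435456 bytes
container.stop()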
Docker Images — Layers, Union FS, Build Cache
Docker
Union Filesystem (OverlayFS)
Image = stack of read-only layers. Container adds a writable layer on top. Layers are shared across images (same base layer stored once). Changes are copy-on-write to the writable layer.
Image Layers (bottom to top):
┌─────────────────────────┐ ← Writable container layer
├─────────────────────────┤ ← COPY app/ . (your code)
├─────────────────────────┤ ← RUN pip install -r requirements.txt
├─────────────────────────┤ ← COPY requirements.txt .
├─────────────────────────┤ ← RUN apt-get install python3
└─────────────────────────┘ ← FROM ubuntu:22.04 (base)
Cache Invalidation: if layer N changes, all layers N+1, N+2... rebuild!
→ Put rarely-changing things early (base, dependencies)
→ Put frequently-changing things late (your application code)
# Optimized Production Dockerfile (multi-stage)
# Stage 1: Build
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline -q # cache deps layer separately
COPY src ./src
RUN mvn package -DskipTests -q
# Stage 2: Runtime (minimal image: JRE not JDK, alpine not ubuntu)
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# Non-root user for security
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
COPY --from=builder /app/target/app.jar ./app.jar
EXPOSE 8080
# Use exec form (not shell form) for proper signal handling
ENTRYPOINT ["java", "-XX:+UseContainerSupport", "-jar", "app.jar"]
Bottlenecks & Anti-patterns:
Running as root in container (security risk)
Large base images (ubuntu:latest = 77MB vs alpine = 7MB)
Storing secrets in image layers (they persist even if deleted later)
COPY . . first → any file change rebuilds all layers
RUN apt-get update && apt-get install... in separate layers (stale cache)
Docker Networking — Bridge, Host, Overlay
Networking
| Network Mode | How | Use Case | Trade-off |
|---|---|---|---|
| bridge (default) | Private virtual network on host; containers communicate by name within network | Single host multi-container apps | Must publish ports (-p 8080:80) to host |
| host | Container uses host's network namespace directly | Performance-critical, low latency | No isolation, port conflicts |
| none | No network | Fully isolated batch jobs | No connectivity |
| overlay | Multi-host virtual network (Docker Swarm/K8s) | Cross-host container communication | Encapsulation overhead (VXLAN) |
| macvlan | Container gets its own MAC/IP on physical network | Legacy apps needing real network presence | Requires promiscuous mode on NIC |
Container DNS
On user-defined bridge networks, Docker runs a built-in DNS server. Containers resolve each other by container name, e.g. the web service can curl http://db:5432 — Docker resolves "db" to the database container's IP automatically.
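A small sketch of that name resolution with the Docker SDK for Python; the network and container names are illustrative:
import docker

client = docker.from_env()
client.networks.create("app-net", driver="bridge")   # user-defined bridge => embedded DNS
client.containers.run("postgres:16-alpine", name="db", network="app-net",
                      environment={"POSTGRES_PASSWORD": "secret"}, detach=True)
# "db" resolves to the postgres container's IP via Docker's built-in DNS
output = client.containers.run("alpine:3.19", ["nslookup", "db"],
                               network="app-net", remove=True)
print(output.decode())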
Docker Compose — Multi-Container Apps
Docker · Dev
# docker-compose.yml — production-grade example
version: '3.9'
services:
api:
build: ./api
image: myapp/api:latest
ports:
- "8080:8080"
environment:
- DB_HOST=postgres
- REDIS_URL=redis://cache:6379
- KAFKA_BROKERS=kafka:9092
depends_on:
postgres: { condition: service_healthy } # wait until healthy
cache: { condition: service_started }
restart: unless-stopped
deploy:
resources:
limits: { cpus: '0.5', memory: 512M }
reservations: { memory: 256M }
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 10s
timeout: 5s
retries: 3
postgres:
image: postgres:16-alpine
volumes:
- pg_data:/var/lib/postgresql/data # named volume: persists data
environment:
POSTGRES_DB: mydb
POSTGRES_USER: user
POSTGRES_PASSWORD_FILE: /run/secrets/pg_password # use secrets not env
healthcheck:
test: ["CMD-SHELL", "pg_isready -U user -d mydb"]
interval: 5s
retries: 5
cache:
image: redis:7-alpine
command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
kafka:
image: confluentinc/cp-kafka:7.5.0
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
volumes:
pg_data: # named volume managed by Docker (survives container restart)
# docker-compose up -d -- start in background
# docker-compose logs -f api -- tail service logs
# docker compose up -d --scale api=3  -- run 3 instances of api
Docker Volumes — Bind Mounts, Named Volumes, tmpfs
Storage
| Type | Storage Location | Lifecycle | Use Case |
|---|---|---|---|
| Named Volume | Docker-managed (/var/lib/docker/volumes) | Persists until explicitly deleted | DB data, app data, production persistence |
| Bind Mount | Specific host path (e.g. ./src:/app/src) | Exists as long as host path exists | Dev mode: hot reload, config injection |
| tmpfs | Host RAM only | Container lifetime only | Sensitive data (secrets), high-speed temp |
# Named volume
docker run -v myapp_data:/var/lib/mysql mysql:8
# Bind mount (dev hot reload)
docker run -v $(pwd)/src:/app/src -v $(pwd)/node_modules:/app/node_modules node:20
# tmpfs (never written to disk)
docker run --tmpfs /tmp:size=100m,mode=1777 myapp
# Inspect volumes
docker volume ls
docker volume inspect myapp_data
# Backup volume
docker run --rm -v myapp_data:/source -v $(pwd):/backup alpine \
tar czf /backup/backup.tar.gz -C /source .
Kubernetes — Container Orchestration
Kubernetes Architecture — Control Plane & Nodes
Must Know
Kubernetes Cluster:
┌──────────────────────────────────────────────────────────────┐
│ Control Plane │
│ ├── kube-apiserver → REST API; all components talk here │
│ ├── etcd → Distributed KV store (cluster state)│
│ ├── kube-scheduler → Assigns pods to nodes │
│ ├── kube-controller-manager → Reconciliation loops │
│ └── cloud-controller → Cloud provider integration │
├──────────────────────────────────────────────────────────────┤
│ Worker Node 1 Worker Node 2 │
│ ├── kubelet (agent) ├── kubelet │
│ ├── kube-proxy (iptables) ├── kube-proxy │
│ ├── containerd ├── containerd │
│ ├── Pod A: [Container1] ├── Pod C: [Container1] │
│ └── Pod B: [Container1] └── Pod D: [Container1, C2] │
└──────────────────────────────────────────────────────────────┘
| Component | Role |
|---|---|
| etcd | Source of truth for all cluster state. Must be HA (3+ nodes). Loss = cluster state loss. |
| kube-apiserver | All requests go through here. Authentication, admission control, validation. |
| kube-scheduler | Watches for unscheduled pods. Selects best node based on resources, affinity, taints. |
| controller-manager | Runs control loops: ReplicaSet controller, Deployment controller, Node controller, etc. |
| kubelet | Agent on each node. Ensures containers in Pods are running and healthy. |
| kube-proxy | Maintains network rules (iptables/ipvs) for Service load balancing. |
Core Kubernetes Objects
Must Know
| Object | What it does |
|---|---|
| Pod | Smallest deployable unit. 1+ containers sharing network/storage. Ephemeral — don't manage directly. |
| Deployment | Manages ReplicaSets; rolling updates, rollback. For stateless apps. |
| StatefulSet | Like Deployment but for stateful apps (DBs). Stable pod names, ordered startup, persistent volumes per pod. |
| DaemonSet | One pod per node. For cluster-wide agents: log collector (Fluentd), metrics (Prometheus Node Exporter). |
| Service | Stable virtual IP + DNS for a set of pods. Types: ClusterIP, NodePort, LoadBalancer, ExternalName. |
| Ingress | HTTP/HTTPS routing rules → Services. Host-based + path-based routing. Needs Ingress Controller (Nginx, Traefik). |
| ConfigMap | Non-sensitive config as key-value pairs, injected as env vars or files. |
| Secret | Base64-encoded sensitive data (not encrypted by default! Use sealed-secrets or Vault for real security). |
| HPA | Horizontal Pod Autoscaler: scales pods based on CPU/memory/custom metrics. |
| PVC/PV | PersistentVolumeClaim: requests storage. PersistentVolume: actual storage resource. |
# Production Deployment manifest
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-service
labels: { app: api }
spec:
replicas: 3
selector: { matchLabels: { app: api } }
strategy:
type: RollingUpdate
rollingUpdate: { maxSurge: 1, maxUnavailable: 0 } # zero-downtime
template:
metadata: { labels: { app: api } }
spec:
containers:
- name: api
image: myapp/api:v1.2.3 # pin to digest in prod: @sha256:...
ports: [{ containerPort: 8080 }]
resources:
requests: { cpu: "100m", memory: "256Mi" } # scheduler uses this
limits: { cpu: "500m", memory: "512Mi" } # OOMKilled if exceeded
readinessProbe: # traffic only when ready
httpGet: { path: /ready, port: 8080 }
initialDelaySeconds: 5
periodSeconds: 5
livenessProbe: # restart if unhealthy
httpGet: { path: /health, port: 8080 }
initialDelaySeconds: 15
env:
- name: DB_PASSWORD
valueFrom: { secretKeyRef: { name: db-secret, key: password } }
---
apiVersion: v1
kind: Service
metadata: { name: api-service }
spec:
selector: { app: api }
ports: [{ port: 80, targetPort: 8080 }]
type: ClusterIP
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: { name: api-hpa }
spec:
scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: api-service }
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource: { name: cpu, target: { type: Utilization, averageUtilization: 70 } }
Service Types, Ingress & Service Mesh
Networking
| Service Type | Accessible From | Use Case |
|---|---|---|
| ClusterIP (default) | Within cluster only | Internal microservice communication |
| NodePort | NodeIP:Port from outside | Dev/test; not for production |
| LoadBalancer | Cloud LB public IP | Expose single service externally; costs money per service |
| Ingress + IngressController | HTTP/HTTPS routing via single LB | Route multiple services via path/host — cost-effective |
Ingress Routing:
Internet → ALB/NLB → Ingress Controller (Nginx pod)
api.example.com/v1/users → api-service:80
api.example.com/v1/orders → order-service:80
app.example.com/ → frontend-service:80
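A sketch of the same host/path routing created with the official Kubernetes Python client (pip install kubernetes); the service names match the diagram above, and an nginx ingress class is assumed:
from kubernetes import client, config

config.load_kube_config()

def path(p, svc):
    # Prefix path routed to a Service backend on port 80
    return client.V1HTTPIngressPath(
        path=p, path_type="Prefix",
        backend=client.V1IngressBackend(
            service=client.V1IngressServiceBackend(
                name=svc, port=client.V1ServiceBackendPort(number=80))))

ingress = client.V1Ingress(
    metadata=client.V1ObjectMeta(name="api-ingress"),
    spec=client.V1IngressSpec(
        ingress_class_name="nginx",
        rules=[client.V1IngressRule(
            host="api.example.com",
            http=client.V1HTTPIngressRuleValue(
                paths=[path("/v1/users", "api-service"),
                       path("/v1/orders", "order-service")]))]))

client.NetworkingV1Api().create_namespaced_ingress(namespace="default", body=ingress)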
Service Mesh (Istio/Linkerd):
Sidecar proxy (Envoy) injected into each pod
→ mTLS between all services
→ Circuit breaking, retries, timeout
→ Distributed tracing
→ Traffic shifting (canary: 10% to v2, 90% to v1)
Bottleneck: Service mesh adds ~2ms latency per hop (sidecar proxy). Worth it for security and observability at scale, but overhead for small clusters.
AWS Core Services
EC2 — Elastic Compute Cloud
AWS · Compute
| Purchase Option | Discount | Best For |
|---|---|---|
| On-Demand | 0% | Unpredictable, short-term workloads |
| Reserved (1–3yr) | Up to 72% | Steady-state baseline workloads |
| Spot Instances | Up to 90% | Fault-tolerant batch jobs, can be interrupted with 2-min notice |
| Savings Plans | Up to 66% | Flexible (compute family agnostic) |
| Dedicated Hosts | Variable | Compliance, licensing (SQL Server, Oracle) |
Instance Families
- m — General Purpose: m7g, m6i (balanced CPU/memory)
- c — Compute Optimized: c7g (CPU-intensive, ML inference)
- r — Memory Optimized: r7i (in-memory DB, Redis, Kafka)
- i — Storage Optimized: i4i (NVMe SSD, Cassandra, HDFS)
- g — GPU: g5 (ML training, graphics)
- t — Burstable: t3 (dev/test, variable workloads)
EC2 Bottlenecks & Best Practices
- Network: use enhanced networking (ENA), placement groups for HPC
- Storage I/O: use io2 Block Express for >64K IOPS (RDBMS)
- CPU: c instances for compute-bound; Graviton (ARM) = 20% cheaper, 40% better perf/watt
- Right-sizing: use Compute Optimizer + CloudWatch metrics
- Spot: use Spot with ASG + diversified instance types (fallback)
Auto Scaling Group (ASG)
- Min/desired/max capacity; scales on CloudWatch metrics (CPU, custom)
- Target Tracking: "keep CPU at 60%" — simplest, recommended (see the sketch below)
- Step Scaling: different actions at different thresholds
- Predictive Scaling: ML-based forecast, scales in advance
- Always pair with ELB — unhealthy instances replaced automatically
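A boto3 sketch of the target-tracking policy mentioned above ("keep CPU at 60%"); the ASG name is a placeholder:
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: the ASG adds/removes instances to hold average CPU near 60%
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",              # placeholder ASG name
    PolicyName="cpu-target-60",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,
    },
)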
S3 — Simple Storage Service
AWS · Storage
| Storage Class | Use Case | Retrieval | Cost |
|---|---|---|---|
| S3 Standard | Frequently accessed (app assets, active data) | Immediate | $$$ |
| S3 Standard-IA | Infrequent access, must be available quickly | Immediate | $$ |
| S3 One Zone-IA | Recreatable data (thumbnails) | Immediate | $ |
| S3 Intelligent-Tiering | Unknown access patterns — auto-moves between tiers | Immediate | $$+monitoring fee |
| S3 Glacier Instant | Archive, rare access but instant needed | Milliseconds | $ |
| S3 Glacier Flexible | Archive, 12-hour retrieval OK | 1–12 hours | ¢ |
| S3 Glacier Deep Archive | Long-term (7-10 year retention) | Up to 48 hours | ¢¢ |
S3 Key Features & Best Practices
- Durability: 99.999999999% (11 9s) — stores across 3 AZs minimum
- Versioning: Enable on important buckets — protects from delete/overwrite
- Lifecycle rules: Auto-transition to cheaper tier or delete after N days
- Presigned URLs: Temporary access to private objects (e.g., 15-min download link; see the sketch below)
- S3 Transfer Acceleration: Uses CloudFront edge for faster uploads globally
- Multipart upload: Required for >5GB; recommended for >100MB
- Block Public Access: Enable at account level — prevents accidental public exposure
- S3 Event Notifications: Trigger Lambda/SQS/SNS on put/delete (data processing pipeline)
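A boto3 sketch of two of the features above (presigned URL, lifecycle rule); bucket and key names are placeholders:
import boto3

s3 = boto3.client("s3")

# Presigned URL: 15-minute download link for a private object
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "reports/2024-q1.pdf"},
    ExpiresIn=900,
)

# Lifecycle rule: move logs/ to Glacier after 90 days, delete after ~7 years
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",
    LifecycleConfiguration={"Rules": [{
        "ID": "archive-logs",
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},
        "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
        "Expiration": {"Days": 2555},
    }]},
)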
S3 Bottleneck: S3 has 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix. Distribute objects across multiple prefixes (don't use date prefix for high-throughput writes).
RDS & Aurora — Managed Databases
AWS · Database
| | RDS (Standard) | Aurora |
|---|---|---|
| Engines | MySQL, Postgres, MariaDB, Oracle, SQL Server | MySQL-compatible, Postgres-compatible |
| Storage | EBS volumes, manual scaling | Shared distributed storage, auto-scales to 128TB |
| Read Replicas | Up to 15 (engine-dependent), eventual consistency | Up to 15 Aurora Replicas, <10ms lag |
| Failover | 1–2 minutes (Multi-AZ standby) | ~30 seconds (Aurora Replica promoted) |
| Performance | Baseline | 5x MySQL, 3x Postgres performance |
| Cost | Lower for simple workloads | Higher base, better at scale |
Multi-AZ vs Read Replicas
- Multi-AZ: Synchronous standby in another AZ. Automatic failover. NOT for read scaling — standby is idle.
- Read Replicas: Async replication. Use for read scaling, analytics. Can be in different region.
- Aurora Serverless v2: Auto-scales compute capacity in fine-grained increments (0.5 ACU units). Good for variable workloads.
RDS Bottlenecks:
Connection pool exhaustion — use RDS Proxy (connection pooling, IAM auth)
Long-running queries blocking writes — use query timeout, read replica for analytics
Storage I/O limit — use io1/io2 with provisioned IOPS for write-heavy workloads
ElastiCache — Redis & Memcached
AWS · Cache
| | Redis | Memcached |
|---|---|---|
| Data types | Rich (String, Hash, List, Set, ZSet, Geo, Streams) | Simple key-value only |
| Persistence | RDB + AOF options | No persistence |
| Replication | Primary + replicas, cluster mode | No replication |
| Multi-threading | Single-threaded (I/O multithreading in 6.0+) | Multi-threaded |
| Choose when | Session, leaderboards, pub/sub, complex data, persistence | Simple cache, multi-CPU, horizontal scale |
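To make the Redis-side use cases in the table concrete, a short redis-py sketch against an assumed ElastiCache endpoint (hostname is a placeholder):
import redis

# Assumed ElastiCache Redis primary endpoint (placeholder hostname)
r = redis.Redis(host="my-cache.abc123.ng.0001.use1.cache.amazonaws.com",
                port=6379, decode_responses=True)

# Session store with a TTL
r.setex("session:u123", 3600, '{"user_id": "u123", "role": "admin"}')

# Leaderboard backed by a sorted set
r.zincrby("leaderboard:daily", 50, "player42")
top10 = r.zrevrange("leaderboard:daily", 0, 9, withscores=True)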
ElastiCache Redis — Cluster Mode
- Cluster disabled: Single primary + up to 5 replicas. All data on one shard. Simple.
- Cluster enabled: Sharded across up to 500 node groups. Scales horizontally. Cross-shard commands limited.
- Global Datastore: Cross-region replication for multi-region low-latency reads.
AWS Networking
VPC — Virtual Private Cloud
AWS · Must Know
VPC Architecture (Production):
VPC: 10.0.0.0/16 (65,536 IPs)
│
├── Public Subnet AZ-a: 10.0.0.0/24 → Internet Gateway → public internet
│ └── NAT Gateway (for private → internet outbound)
│ └── ALB / NLB (load balancers)
│ └── Bastion host (SSH jump server)
│
├── Private Subnet AZ-a: 10.0.10.0/24 → NO direct internet access
│ └── EC2 App Servers, ECS tasks
│ └── RDS (never in public subnet!)
│
├── Public Subnet AZ-b: 10.0.1.0/24 (for HA — multi-AZ)
└── Private Subnet AZ-b: 10.0.11.0/24
Route Tables:
Public: 0.0.0.0/0 → Internet Gateway
Private: 0.0.0.0/0 → NAT Gateway (outbound only)
| Component | Purpose |
|---|---|
| Internet Gateway (IGW) | Allows public subnet instances to communicate with internet |
| NAT Gateway | Allows private subnet instances to initiate outbound internet (not inbound) |
| Security Group | Stateful virtual firewall at instance level (allow rules only) |
| NACL | Stateless firewall at subnet level (allow + deny rules, order matters) |
| VPC Peering | Private network connection between two VPCs (no transitive routing) |
| Transit Gateway | Hub-and-spoke: connect many VPCs + on-prem (replaces VPC peering mesh) |
| VPC Endpoints | Private connection to AWS services (S3, DynamoDB) without NAT/IGW |
| PrivateLink | Expose services privately to other VPCs |
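For example, a security group allow rule added with boto3 (group ID is a placeholder); because security groups are stateful, the return traffic needs no extra rule:
import boto3

ec2 = boto3.client("ec2")

# Allow inbound HTTPS from anywhere; response traffic is implicitly allowed (stateful)
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",   # placeholder
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTPS from internet"}],
    }],
)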
Load Balancers — ALB, NLB, CLB
AWS · Networking
| | ALB (Application) | NLB (Network) | CLB (Classic, deprecated) |
|---|---|---|---|
| Layer | L7 (HTTP/HTTPS/WebSocket/gRPC) | L4 (TCP/UDP/TLS) | L4+L7 |
| Routing | Path, host, header, query, IP | IP:Port only | Basic |
| Latency | Adds a few ms (L7 processing) | Sub-ms (ultra low, L4) | Medium |
| TLS offload | Yes (ACM certs) | Yes (passthrough optional) | Yes |
| WebSocket | Yes | Yes (TCP passthrough) | Limited |
| Static IP | No (DNS name only) | Yes (per-AZ Elastic IP) | No |
| Best for | Web apps, microservices, HTTP APIs | TCP/UDP, gaming, IoT, VoIP, Kubernetes service LoadBalancer | Legacy |
ALB Target Groups
Register targets as EC2 instances, IP addresses, Lambda functions, or another ALB. Health checks are configured per target group. Weighted target groups enable blue/green and canary deployments (see the sketch below).
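A boto3 sketch of such a weighted forward action for a canary split; the listener and target group ARNs are placeholders:
import boto3

elbv2 = boto3.client("elbv2")

# Canary: send 10% of traffic to the v2 target group, 90% to v1
elbv2.modify_listener(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-alb/xxx/yyy",
    DefaultActions=[{
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/api-v1/zzz", "Weight": 90},
                {"TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/api-v2/www", "Weight": 10},
            ]
        },
    }],
)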
CloudFront — CDN & Edge Caching
AWS · CDN
User Request Flow:
User → DNS → CloudFront Edge (450+ PoPs globally)
→ Cache HIT: serve from edge (<5ms)
→ Cache MISS: forward to Origin (S3 / ALB / custom HTTP)
→ Cache response at edge
→ Serve user
Origins: S3, ALB, EC2, API Gateway, on-prem HTTP endpoint
CloudFront Key Features
- Signed URLs/Cookies: Time-limited access to private content (streaming, downloads)
- Lambda@Edge / CloudFront Functions: Run code at edge (auth, redirects, header manipulation)
- Origin Shield: Additional caching layer to protect origin from traffic spikes
- Field-Level Encryption: Encrypt specific POST fields at edge (credit cards)
- WAF integration: Block malicious requests at edge before they reach your origin
- Invalidation: /images/* to clear specific cached paths (first 1,000 paths/month free, then $0.005 per path)
Cache Invalidation: CloudFront TTL-based. Use versioned file names (style.v1.2.3.css) instead of frequent invalidations for static assets.
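When an invalidation is needed anyway, a boto3 sketch (distribution ID is a placeholder):
import time
import boto3

cloudfront = boto3.client("cloudfront")

# Invalidate a path pattern at all edge locations
cloudfront.create_invalidation(
    DistributionId="E1ABCDEFGHIJKL",            # placeholder
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/images/*"]},
        "CallerReference": str(time.time()),    # must be unique per request
    },
)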
Route 53 — DNS & Traffic Policies
AWS · DNS
| Routing Policy | Use Case |
|---|---|
| Simple | Single resource, no health checks |
| Failover | Active-passive: route to secondary if primary unhealthy |
| Weighted | Traffic splitting (10% to v2, 90% to v1) — canary releases |
| Latency-based | Route to region with lowest latency from user |
| Geolocation | Route based on user's geographic location (compliance, language) |
| Geoproximity | Shift traffic between regions using bias (traffic routing, migration) |
| Multivalue Answer | Return multiple IPs, client-side LB with health checks |
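A boto3 sketch of the weighted policy from the table, splitting traffic 90/10 for a canary; the hosted zone ID and IPs are placeholders:
import boto3

route53 = boto3.client("route53")

def weighted(identifier, weight, ip):
    return {"Action": "UPSERT", "ResourceRecordSet": {
        "Name": "api.example.com.", "Type": "A",
        "SetIdentifier": identifier, "Weight": weight,
        "TTL": 60, "ResourceRecords": [{"Value": ip}]}}

# 90% of DNS answers point at v1, 10% at v2
route53.change_resource_record_sets(
    HostedZoneId="Z0EXAMPLE12345",
    ChangeBatch={"Changes": [weighted("v1", 90, "203.0.113.10"),
                             weighted("v2", 10, "203.0.113.20")]},
)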
Route 53 Health Checks
Monitor endpoint health from multiple AWS regions. Integrates with failover routing and can trigger CloudWatch alarms. Also supports monitoring CloudWatch metrics (so you can fail over based on error rate, not just HTTP 200).
AWS Data & Analytics
DynamoDB — NoSQL at Scale
AWS · Popular
Data Model
- Table: Collection of items (no fixed schema except primary key)
- Partition Key: Determines partition (must have high cardinality). Hash-based.
- Sort Key: Optional. Enables range queries within a partition.
- GSI (Global Secondary Index): Query on non-key attributes. Separate partition+sort key.
- LSI (Local Secondary Index): Alternate sort key for same partition key.
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('users')
# PutItem
table.put_item(Item={
'user_id': 'u123', # Partition key
'email': 'user@example.com',
'created_at': 1716825600,
'profile': {'name': 'Alice', 'age': 30}
})
# GetItem (single item by PK)
response = table.get_item(Key={'user_id': 'u123'})
user = response['Item']
# Query (items in a partition)
response = table.query(
KeyConditionExpression='user_id = :uid AND created_at BETWEEN :start AND :end',
ExpressionAttributeValues={':uid': 'u123', ':start': 1700000000, ':end': 1716825600}
)
# Conditional write (atomic)
table.update_item(
Key={'user_id': 'u123'},
UpdateExpression='SET #st = :active',
ConditionExpression='attribute_exists(user_id)',
ExpressionAttributeNames={'#st': 'status'},
ExpressionAttributeValues={':active': 'active'}
)
Hot Partition: If all writes hit one partition key (e.g., same date prefix), DynamoDB throttles. Distribute with high-cardinality keys or add random suffix (write sharding).
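A tiny Python sketch of that write-sharding idea (shard count and key format are arbitrary):
import random

N_SHARDS = 10

def sharded_partition_key(base_key: str) -> str:
    # e.g. "2024-05-27" becomes "2024-05-27#0" .. "2024-05-27#9"
    return f"{base_key}#{random.randint(0, N_SHARDS - 1)}"

# Writers spread load across N partitions; readers query all N suffixes and merge results.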
DynamoDB Capacity: Provisioned (predictable, auto-scaling) or On-Demand (variable, more expensive). Use On-Demand for new tables or unpredictable traffic.
AWS Data Pipeline — Kinesis, Glue, Athena, Redshift
AWS · Analytics
Modern Data Pipeline on AWS:
Sources → Kinesis Data Streams (real-time, like Kafka)
→ Kinesis Firehose (managed delivery to S3/Redshift/ES)
↓
S3 Data Lake (Parquet/ORC format, partitioned by date/source)
↓
┌───────────────────────────────────────────┐
│ AWS Glue (ETL, data catalog, crawlers) │
│ Amazon Athena (serverless SQL on S3 data) │
│ Amazon Redshift (data warehouse, OLAP) │
└───────────────────────────────────────────┘
↓
QuickSight (BI / dashboards)
| Service | Purpose | Best For |
|---|---|---|
| Kinesis Data Streams | Real-time stream ingestion (like Kafka but AWS-managed) | Real-time analytics, alerting; 24h default retention (extendable up to 365 days) |
| Kinesis Firehose | Fully managed delivery to S3/Redshift/OpenSearch | Simple streaming ETL, no coding required |
| AWS Glue | Serverless ETL, data catalog | Batch ETL, schema discovery, data catalog for Athena |
| Amazon Athena | Interactive SQL on S3 (Trino/Presto under the hood) | Ad-hoc queries on the data lake, pay per TB scanned |
| Amazon Redshift | Fully managed columnar data warehouse | Complex analytics, BI, large joins |
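A boto3 sketch of the ingest and query ends of this pipeline; stream, database, table, and bucket names are placeholders:
import json
import boto3

kinesis = boto3.client("kinesis")
athena = boto3.client("athena")

# Ingest: put a record onto a stream
kinesis.put_record(
    StreamName="clickstream",
    Data=json.dumps({"user_id": "u123", "event": "page_view"}).encode(),
    PartitionKey="u123",   # determines the shard
)

# Query the S3 data lake with Athena
athena.start_query_execution(
    QueryString="SELECT event, count(*) FROM clicks WHERE dt = '2024-05-27' GROUP BY event",
    QueryExecutionContext={"Database": "datalake"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)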
Serverless & Messaging
AWS Lambda — Serverless Functions
AWS · Popular
How Lambda Works
- Upload code (zip or container image up to 10GB)
- Set memory (128MB–10GB), timeout (max 15 min), concurrency limit
- AWS provisions execution environment (firecracker microVM) on invocation
- Pay per invocation + GB-seconds of execution
- Scale to 1000+ concurrent executions by default
import json, boto3, os
def handler(event, context):
# event: the triggering event payload (API GW request, S3 event, SQS message)
# context: runtime info (function name, memory limit, remaining time)
# Trigger sources: API Gateway, S3, SQS, SNS, DynamoDB Streams, EventBridge,
# Kinesis, ALB, Cognito, Step Functions, CloudWatch Events
# Example: S3 trigger
for record in event.get('Records', []):
bucket = record['s3']['bucket']['name']
key = record['s3']['object']['key']
# Process file...
# Warm vs Cold start
# Cold start: ~100ms (Python) to ~1s (Java) for first invocation after idle
# Warm: subsequent invocations reuse container (~1ms overhead)
return {
'statusCode': 200,
'headers': {'Content-Type': 'application/json'},
'body': json.dumps({'message': 'Success'})
}
# Best practices:
# - Initialize clients outside handler (reused across warm invocations)
db_client = boto3.client('rds-data') # module-level initialization!
Cold Start Bottleneck: Java Lambda cold starts ~1s. Mitigations: Provisioned Concurrency (keep N instances warm, costs money), Lambda SnapStart (Java — snapshot JVM state), use Python/Node for latency-sensitive.
Lambda Concurrency
- Unreserved: Shared pool (default 1000 per region)
- Reserved Concurrency: Guarantee N for a function; caps it to prevent throttling others
- Provisioned Concurrency: Pre-warm N instances → zero cold starts (costs $ always; see the sketch below)
- SQS trigger: Lambda scales to match SQS queue depth (up to concurrency limit)
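A boto3 sketch of the reserved and provisioned concurrency settings above; the function name and alias are placeholders:
import boto3

lambda_client = boto3.client("lambda")

# Reserved concurrency: cap this function at 100 concurrent executions
lambda_client.put_function_concurrency(
    FunctionName="order-processor",
    ReservedConcurrentExecutions=100,
)

# Provisioned concurrency: keep 10 warm environments for a published alias
lambda_client.put_provisioned_concurrency_config(
    FunctionName="order-processor",
    Qualifier="live",                      # alias or version (required)
    ProvisionedConcurrentExecutions=10,
)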
SQS, SNS & EventBridge
AWS · Messaging
| Service | Model | Best For |
|---|---|---|
| SQS Standard | Queue (pull-based), at-least-once, best-effort order | Decoupling, work queues, task distribution |
| SQS FIFO | Queue, exactly-once, strict order, 300 msg/s (3000 with batching) | Financial, inventory where order matters |
| SNS | Pub/Sub fan-out (push to SQS, Lambda, HTTP, email, SMS) | Fan-out to multiple consumers simultaneously |
| EventBridge | Event bus with routing rules (match JSON patterns) | Event-driven architectures, SaaS integrations, scheduled rules |
Fan-out Pattern (SNS + SQS):
S3 Upload Event → SNS Topic
├── SQS Queue A → Lambda (thumbnail generation)
├── SQS Queue B → Lambda (virus scanning)
└── SQS Queue C → Lambda (metadata extraction)
Benefits: each consumer has own queue (independent rate/failure handling)
SNS delivers to all, SQS buffers for each consumer
SQS Key Properties
- Visibility Timeout: Message hidden from other consumers while being processed (default 30s)
- Message Retention: 1 min to 14 days (default 4 days)
- DLQ: After maxReceiveCount failures → moves to Dead Letter Queue
- Long Polling: Wait up to 20s for a message (reduces empty responses + cost; see the sketch below)
- Max message size: 256KB (use S3 for large payloads, store reference in message)
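A boto3 sketch of a long-polling consumer that uses the properties above; the queue URL and handler are placeholders:
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"  # placeholder

def process(body):
    print("processing", body)   # stand-in for real work

while True:
    # Long polling: wait up to 20s instead of hammering the API with empty receives
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,
        VisibilityTimeout=60,    # hide each message while it is being processed
    )
    for msg in resp.get("Messages", []):
        process(msg["Body"])
        # Delete only after successful processing; otherwise the message reappears
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])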
ECS vs EKS — Container Orchestration
AWS · Containers
| | ECS (Elastic Container Service) | EKS (Elastic Kubernetes Service) |
|---|---|---|
| Control Plane | AWS proprietary | Kubernetes |
| Learning curve | Lower — AWS-native concepts | Higher — K8s expertise needed |
| Flexibility | AWS ecosystem only | Standard K8s (portable) |
| Cost | Free control plane + EC2/Fargate | $0.10/hr control plane + EC2/Fargate |
| Launch types | EC2 (manage servers) or Fargate (serverless) | EC2 or Fargate (for K8s) |
| Choose when | AWS-native team, simpler use case, cost-sensitive | Multi-cloud, existing K8s investment, more control |
Fargate — Serverless Containers
Run containers without managing EC2 instances. AWS provisions compute per task; you pay for the vCPU + memory allocated. Great for: variable workloads, CI/CD build runners, low-ops teams. Downside: slower startup than EC2, no GPU support, higher per-unit cost.
IAM — Identity & Access Management
AWS · Security Critical
IAM Concepts
- Users: Long-term identity (humans, CI/CD). Avoid; use Roles when possible.
- Groups: Collection of users with shared permissions.
- Roles: Temporary credentials assumed by AWS services, apps, or federated users. No stored credentials.
- Policies: JSON documents defining Allow/Deny on Actions/Resources.
- Conditions: Restrict by IP, MFA, time, tags.
// Least-privilege policy: Lambda can only list and read the uploads/ prefix of one bucket
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-bucket",
      "Condition": {
        "StringLike": { "s3:prefix": ["uploads/*"] }
      }
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/uploads/*"
    }
  ]
}
IAM Security Best Practices:
Principle of least privilege — only what's needed
Use roles (not users) for EC2/Lambda/ECS
Enable MFA on root and admin users
Rotate access keys; better: use OIDC/SSO
Use SCPs (Service Control Policies) in AWS Organizations
Never put AWS credentials in code, Docker images, or Git repos
AWS Architecture Patterns
Typical 3-Tier Web Architecture on AWS
AWS · Architecture
Internet
↓
Route 53 (DNS, health check, failover)
↓
CloudFront (CDN — static assets, edge caching)
↓
WAF (Web Application Firewall — OWASP rules, rate limiting)
↓
ALB (Application Load Balancer — HTTP routing, SSL termination)
↓
Auto Scaling Group (EC2 / ECS / EKS — app servers)
├── AZ-a: 2+ instances └── AZ-b: 2+ instances
↓
┌─────────────────────────────────────────────┐
│ RDS Aurora (Multi-AZ) ← Primary writes │
│ RDS Read Replicas ← Read-heavy queries │
│ ElastiCache Redis ← Session/cache layer │
│ S3 ← Blob storage │
└─────────────────────────────────────────────┘
↓
CloudWatch (metrics, logs, alarms)
X-Ray (distributed tracing)
Event-Driven Serverless Architecture
AWS · Serverless
Client → API Gateway → Lambda (auth + route)
↓
┌────────────────────────────────────┐
│ EventBridge (event bus) │
│ Rule: order.created → SQS │
│ Rule: payment.failed → SNS → email │
└────────────────────────────────────┘
↓
SQS → Lambda (order processor)
Lambda (inventory updater)
Lambda (notification sender)
↓
DynamoDB (orders table)
S3 (invoices, receipts)
Aurora Serverless (reports)
Serverless Trade-offs
- No infrastructure management; auto-scale to zero
- Pay-per-use (great for low/unpredictable traffic)
- Cold starts add latency (<1s but noticeable)
- 15-minute max execution time (not for long jobs)
- Stateless — use DynamoDB/ElastiCache for state
- Vendor lock-in; local dev is harder
- Difficult to debug distributed failures (use X-Ray)
AWS Well-Architected Framework — 6 Pillars
AWS · Interview
| Pillar | Key Principle | AWS Services |
|---|---|---|
| Security | Implement security at every layer; least privilege; encrypt in transit + at rest | IAM, KMS, WAF, Shield, CloudTrail, GuardDuty |
| Reliability | Recover from failures; test recovery; horizontal scale | Multi-AZ, ASG, Route53, CloudWatch, Backup |
| Performance Efficiency | Use right resource type; experiment; serverless | CloudFront, ElastiCache, Lambda, Graviton |
| Cost Optimization | Pay only for what you use; rightsize; spot instances | Cost Explorer, Reserved Instances, Savings Plans, Spot |
| Operational Excellence | Operations as code; make frequent small changes; learn from failure | CloudFormation/CDK, CodePipeline, CloudWatch, Systems Manager |
| Sustainability | Reduce environmental impact; use efficient resources | Graviton, Fargate spot, S3 Intelligent-Tiering |
Infrastructure as Code — CloudFormation & CDK
AWS · DevOps
# CloudFormation — Infrastructure as YAML
AWSTemplateFormatVersion: '2010-09-09'
Description: API with Lambda and DynamoDB
Parameters:
Environment: { Type: String, Default: prod }
Resources:
OrdersTable:
Type: AWS::DynamoDB::Table
Properties:
TableName: !Sub orders-${Environment}
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- { AttributeName: order_id, AttributeType: S }
KeySchema:
- { AttributeName: order_id, KeyType: HASH }
ApiFunction:
Type: AWS::Lambda::Function
Properties:
FunctionName: !Sub api-handler-${Environment}
Runtime: python3.12
Handler: index.handler
Code:
  ZipFile: |
    def handler(event, context):
        return {'statusCode': 200, 'body': 'OK'}
Role: !GetAtt LambdaRole.Arn
Environment:
Variables:
TABLE_NAME: !Ref OrdersTable
LambdaRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Statement: [{Effect: Allow, Principal: {Service: lambda.amazonaws.com}, Action: sts:AssumeRole}]
ManagedPolicyArns:
- arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
Outputs:
TableName: { Value: !Ref OrdersTable }
# AWS CDK — Infrastructure as Python code
from aws_cdk import (App, Stack, Duration, RemovalPolicy,
aws_lambda as _lambda,
aws_dynamodb as dynamodb,
aws_apigateway as apigw,
aws_iam as iam)
from constructs import Construct
class ApiStack(Stack):
def __init__(self, scope: Construct, id: str, **kwargs):
super().__init__(scope, id, **kwargs)
# DynamoDB Table
table = dynamodb.Table(self, "OrdersTable",
partition_key=dynamodb.Attribute(
name="order_id", type=dynamodb.AttributeType.STRING),
billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
removal_policy=RemovalPolicy.DESTROY # for dev
)
# Lambda Function
handler = _lambda.Function(self, "ApiHandler",
runtime=_lambda.Runtime.PYTHON_3_12,
code=_lambda.Code.from_asset("lambda"),
handler="index.handler",
timeout=Duration.seconds(30),
environment={"TABLE_NAME": table.table_name}
)
# Grant Lambda read/write access to DynamoDB
table.grant_read_write_data(handler)
# API Gateway
api = apigw.RestApi(self, "OrdersApi")
orders = api.root.add_resource("orders")
orders.add_method("POST", apigw.LambdaIntegration(handler))
app = App()
ApiStack(app, "ApiStack", env={"account": "123456789", "region": "us-east-1"})
app.synth()
CDK vs CloudFormation: CDK generates CloudFormation under the hood. CDK has full programming language power (loops, conditions, abstractions). CloudFormation is more explicit. Both are infrastructure-as-code — version controlled, reproducible, reviewable.
Observability on AWS — CloudWatch, X-Ray, OpenTelemetry
AWS · Monitoring
| Tool | What | Use For |
|---|---|---|
| CloudWatch Metrics | EC2/RDS/Lambda/custom metrics + dashboards | Alerts, autoscaling triggers, SLO monitoring |
| CloudWatch Logs | Log aggregation from all AWS services + custom | Centralized log search, Insights queries |
| CloudWatch Alarms | Threshold-based alerts → SNS/SQS/ASG actions | PagerDuty integration, auto-recovery |
| AWS X-Ray | Distributed tracing (service map, latency analysis) | Identify bottlenecks across Lambda/API GW/DynamoDB |
| CloudTrail | API activity audit log (who called what when) | Security auditing, compliance, incident investigation |
| AWS Config | Track config changes to AWS resources | Compliance, drift detection |
| Amazon Managed Grafana | Grafana dashboards connected to CloudWatch/Prometheus | Rich dashboards, open-source ecosystem |
CloudWatch Logs Insights Query Examples
# Error volume over time in Lambda logs (5-minute bins)
fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) by bin(5m)

# Lambda cold start analysis (average init duration per hour)
fields @type, @duration, @billedDuration, @initDuration
| filter @type = "REPORT"
| stats avg(@initDuration) by bin(1h)