Batch Rendering Farm
A product catalog image renderer that generates SVG variants from JSON specifications using AWS Batch array jobs. Step Functions orchestrates the workflow: Lambda validates input, Batch runs one child job per product variant against shared EFS storage, and SNS and SQS deliver completion notifications. The scenario exercises Simfra's batch compute, shared filesystem, and workflow orchestration capabilities.
Services
| Service | Role |
|---|---|
| AWS Batch | Managed compute environment, job queue, and job definition for array jobs |
| ECS | Task execution backing Batch jobs with Go worker containers |
| ECR | Container image repository for the rendering worker |
| EFS | Shared scratch filesystem mounted by all Batch tasks |
| Lambda | Python 3.12 functions for input validation and result publishing |
| Step Functions | Orchestrates validate, render (Batch), publish, and notify steps |
| S3 | Input specs, rendered output, and pipeline artifacts, all encrypted with SSE-KMS |
| SNS | Completion and failure notification topics |
| SQS | Notification queue with DLQ for failed messages |
| EventBridge | Captures Batch job state-change events |
| CloudWatch Logs | Batch job and Lambda execution logs |
| KMS | Six customer-managed keys for per-service encryption |
| CodeCommit | Source repository |
| CodeBuild | Packages Lambda functions and builds worker image |
| CodeDeploy | Lambda deployment with AllAtOnce traffic shifting |
| CodePipeline | Orchestrates the CI/CD flow |
Architecture
S3 (product JSON specs)
    |
    v
Step Functions Workflow
    |
    ├── Lambda: validate-input (check spec exists and is well-formed)
    |
    ├── Batch: array job (one child per variant)
    |     └── Go worker reads spec from S3, generates SVG, writes to S3
    |     └── All children share EFS scratch mount
    |
    ├── Lambda: publish-results (creates manifest.json)
    |
    └── SNS: completion notification --> SQS queue
                                             |
                                      On failure: DLQ
Each Batch array job spawns one child task per product variant. All tasks mount the same EFS filesystem for intermediate scratch data. The Step Functions workflow handles both success and failure paths: on Batch failure, it publishes to a separate failure SNS topic. EventBridge independently captures Batch job state transitions for observability.
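The workflow described above could be expressed in Amazon States Language roughly as follows. This is a minimal sketch, not the scenario's actual definition: the state names, the `variantCount` field, the job name, and the terminal `Fail` state are all assumptions made for illustration. The resource ARNs (`lambda:invoke`, `batch:submitJob.sync`, `sns:publish`) are the standard Step Functions service integrations.

```python
import json


def build_state_machine(job_queue_arn, job_definition_arn,
                        validate_fn_arn, publish_fn_arn,
                        success_topic_arn, failure_topic_arn):
    """Return an Amazon States Language definition as a plain dict.

    All state names and the `variantCount` input field are hypothetical.
    """
    return {
        "StartAt": "ValidateInput",
        "States": {
            "ValidateInput": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": validate_fn_arn, "Payload.$": "$"},
                # Keep only the function's return value as the state output.
                "OutputPath": "$.Payload",
                "Next": "RenderVariants",
            },
            "RenderVariants": {
                "Type": "Task",
                # .sync makes the workflow wait for the Batch job to finish.
                "Resource": "arn:aws:states:::batch:submitJob.sync",
                "Parameters": {
                    "JobName": "render-variants",  # hypothetical job name
                    "JobQueue": job_queue_arn,
                    "JobDefinition": job_definition_arn,
                    # One child per variant; assumes the validator returns
                    # a count that is a legal array size.
                    "ArrayProperties": {"Size.$": "$.variantCount"},
                },
                "ResultPath": "$.batch",
                "Catch": [{
                    "ErrorEquals": ["States.ALL"],
                    "ResultPath": "$.error",
                    "Next": "NotifyFailure",
                }],
                "Next": "PublishResults",
            },
            "PublishResults": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": publish_fn_arn, "Payload.$": "$"},
                "OutputPath": "$.Payload",
                "Next": "NotifySuccess",
            },
            "NotifySuccess": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sns:publish",
                "Parameters": {
                    "TopicArn": success_topic_arn,
                    "Message.$": "States.JsonToString($)",
                },
                "End": True,
            },
            "NotifyFailure": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sns:publish",
                "Parameters": {
                    "TopicArn": failure_topic_arn,
                    "Message.$": "States.JsonToString($.error)",
                },
                "Next": "RenderFailed",
            },
            "RenderFailed": {"Type": "Fail", "Error": "BatchRenderFailed"},
        },
    }
```

Calling `json.dumps(build_state_machine(...))` yields a string suitable for the `definition` argument when creating a state machine. The `Catch` block on the render state is what routes Batch failures to the separate failure topic instead of the completion topic.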
What This Validates
- AWS Batch compute environment, job queue, and job definition lifecycle
- Batch array jobs with per-child task distribution and parallel execution
- EFS filesystem mounted in ECS-backed Batch tasks for shared scratch storage
- Step Functions orchestrating Lambda, Batch, and SNS service integrations
- ECR container image storage for Batch worker images
- SNS/SQS notification delivery with KMS encryption
- Dead-letter queue for failed notification processing
- EventBridge capturing Batch job state-change events
- S3 as input/output store with KMS encryption
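Per-child task distribution in an array job usually hinges on the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable, which AWS Batch sets in each child container. The sketch below shows that selection logic in Python for brevity; the scenario's worker is written in Go, and the helper names, the variant list shape, and the output key layout here are assumptions rather than the worker's actual code.

```python
import os


def variant_for_child(variants, env=None):
    """Pick the variant this array child should render.

    `variants` is a hypothetical ordered list taken from the product spec.
    AWS Batch sets AWS_BATCH_JOB_ARRAY_INDEX to the child's zero-based index.
    """
    env = os.environ if env is None else env
    # Default to 0 so the same code also works for a non-array job.
    index = int(env.get("AWS_BATCH_JOB_ARRAY_INDEX", "0"))
    if not 0 <= index < len(variants):
        raise IndexError(f"array index {index} has no matching variant")
    return variants[index]


def output_key(product_id, variant_name, prefix="rendered"):
    """Build a hypothetical S3 key for one rendered SVG."""
    return f"{prefix}/{product_id}/{variant_name}.svg"
```

Because every child derives its work item from its own index, children need no coordination with each other, and the shared EFS mount is only required for intermediate scratch data.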
Test Coverage
Tests cover CI/CD pipeline execution and smoke checks of compute environment and job queue state. Integration tests run the Step Functions workflow end to end: submit a job, verify Batch execution, check the S3 output and manifest, and validate SNS/SQS notification delivery. Failure-path tests exercise DLQ routing, and performance tests run 5 concurrent Step Functions executions and 10 concurrent standalone Batch jobs.
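Validating the SNS/SQS notification in a test typically means unwrapping the SNS envelope that arrives as the SQS message body. The helper below is a sketch of that step, assuming raw message delivery is disabled on the subscription and that the workflow publishes a JSON message; the function name and the inner payload fields used in the example are hypothetical.

```python
import json


def unwrap_sns_from_sqs(sqs_body):
    """Extract the topic ARN and decoded message from an SQS message body.

    Assumes the standard SNS envelope (raw message delivery disabled) and
    a JSON-encoded inner message.
    """
    envelope = json.loads(sqs_body)
    if envelope.get("Type") != "Notification":
        raise ValueError("SQS body is not an SNS notification envelope")
    return {
        "topic_arn": envelope["TopicArn"],
        "message": json.loads(envelope["Message"]),
    }
```

A test would pass in the `Body` of a message received from the notification queue, then assert on the topic ARN to tell completion notifications apart from failure notifications.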