Why Does DynamoDB Cost $330K a Year for a 5TB Feature Store?
Last week an engineer at a Series B startup sent me their AWS bill. DynamoDB alone: $27,000/month. For a feature store serving ML models.
“Is this normal?”
Yes, and it’s completely avoidable.
If you’re running read-heavy workloads on DynamoDB at scale, you’re probably paying several times more than you need to: roughly 4-23x in the comparison below.
A Real Workload
Here’s a piece of production infrastructure that runs at thousands of companies: an ML feature store serving models.
Specs:
- 5TB of feature data (embeddings, signals, model inputs)
- 10K reads/sec per region (every inference hits the store)
- 100 writes/sec per region (feature updates)
- 3 regions (us-east-1, eu-west-1, ap-southeast-1)
- ~99% reads, ~1% writes (classic read-heavy pattern)
Nothing exotic. Standard production setup. Let’s see what it costs.
DynamoDB Costs
Here’s the cost breakdown:
| Cost Component | Calculation | Monthly Cost |
|---|---|---|
| Storage | 5 TB × 3 regions × $0.25/GB-month | $3,840 |
| Reads | 10,000 reads/sec × 86,400 × 30 × 3 regions × $0.25/million | $19,425 |
| Writes | 100 writes/sec × 86,400 × 30 × 3 regions × $1.25/million | $972 |
| Global Table Replication | 100 writes/sec × 2 × 86,400 × 30 × 3 regions × $1.875/million | $2,916 |
| Data Transfer | ~518 GB/month × ~$0.05/GB | ~$150 |
| Monthly Total | | $27,303 |
| Annual Total | | $327,636 |
The primary cost driver is read throughput, which accounts for just over 70% of the monthly bill. That is a direct consequence of DynamoDB’s pricing model, which charges for every read request. The often-overlooked Global Table replication cost adds another $2,916/month: for every write in one region, DynamoDB charges replicated write request units (rWRUs) at $1.875/million to sync the data to each of the other regions.
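To make the arithmetic easy to check or adapt, here is a minimal sketch that recomputes the table in Python. The function name, parameter names, and the flat $150 data-transfer figure are assumptions for illustration; the prices and workload numbers are the ones quoted above. Computed exactly, the read line comes out to about $19,440, slightly above the table’s figure, so expect a small difference in the total.

```python
# Workload and on-demand prices as quoted in the table above.
SECONDS_PER_MONTH = 86_400 * 30
REGIONS = 3


def dynamodb_monthly_cost(
    storage_gb: float = 5 * 1024,
    reads_per_sec: float = 10_000,   # per region
    writes_per_sec: float = 100,     # per region
    data_transfer_usd: float = 150.0,  # assumed flat estimate from the table
) -> dict:
    """Return the monthly cost per line item in USD."""
    reads_millions = reads_per_sec * SECONDS_PER_MONTH * REGIONS / 1e6
    writes_millions = writes_per_sec * SECONDS_PER_MONTH * REGIONS / 1e6
    # Each write is replicated to every other region.
    replicated_millions = writes_millions * (REGIONS - 1)
    return {
        "storage": storage_gb * REGIONS * 0.25,
        "reads": reads_millions * 0.25,
        "writes": writes_millions * 1.25,
        "replication": replicated_millions * 1.875,
        "data_transfer": data_transfer_usd,
    }


costs = dynamodb_monthly_cost()
total = sum(costs.values())
for item, usd in costs.items():
    print(f"{item:>14}: ${usd:>10,.0f}  ({usd / total:.0%} of total)")
print(f"{'total':>14}: ${total:>10,.0f}")
```

Changing `reads_per_sec` is the quickest way to see how sensitive the bill is to read volume compared with storage or writes.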
Additional costs not included above: Point-in-time recovery ($0.20/GB-month), DynamoDB Streams, provisioned capacity waste, and engineering overhead for capacity planning and auto-scaling tuning.
DynamoDB delivers strong consistency, single-digit millisecond P99 latency globally, and multi-region replication—excellent for workloads that need these features. But for read-heavy ML feature stores where eventual consistency is acceptable and data changes infrequently, you’re paying a premium for capabilities you don’t need.
The Object Storage Alternative
The key insight: most data gets read way more often than it changes, and most workloads tolerate eventual consistency just fine.
Object storage like S3 costs just $0.023/GB-month, and there is no per-read capacity charge: you pay for API requests and data transfer out, and a cache in front keeps most reads from ever reaching the bucket. The architecture is straightforward:
- Writes go to object storage (S3, R2, whatever). Durable, replicated, dirt cheap.
- Reads hit cache first (in-memory or local SSD). Sub-millisecond latency, comparable to or better than DynamoDB’s single-digit milliseconds.
- Cache misses hit object storage (10-20ms). With 95-98% cache hit rates, this affects <5% of requests.
- Async cross-region replication. Eventually consistent—data propagates in seconds, not milliseconds.
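The read and write paths above can be sketched as a small read-through cache. This is an illustrative sketch only, not BoulderKV’s implementation: `InMemoryObjectStore` is a hypothetical stand-in for S3 or R2 so the example runs without cloud credentials, `ReadThroughCache` is an assumed name, and cross-region replication is left out.

```python
import time
from collections import OrderedDict


class InMemoryObjectStore:
    """Hypothetical stand-in for S3/R2: a dict with optional simulated latency."""

    def __init__(self, latency_s: float = 0.0):
        self._objects = {}
        self._latency_s = latency_s

    def put(self, key: str, value: bytes) -> None:
        self._objects[key] = value

    def get(self, key: str):
        time.sleep(self._latency_s)  # models the 10-20 ms object storage round trip
        return self._objects.get(key)


class ReadThroughCache:
    """LRU cache in front of an object store; the store is the source of truth."""

    def __init__(self, store, capacity: int):
        self._store = store
        self._capacity = capacity
        self._cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def put(self, key: str, value: bytes) -> None:
        self._store.put(key, value)   # durable write goes to object storage
        self._cache.pop(key, None)    # drop any stale cached copy

    def get(self, key: str):
        if key in self._cache:
            self.hits += 1
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        self.misses += 1
        value = self._store.get(key)      # cache miss: fall back to the store
        if value is not None:
            self._cache[key] = value
            if len(self._cache) > self._capacity:
                self._cache.popitem(last=False)  # evict least recently used
        return value

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


store = InMemoryObjectStore()
features = ReadThroughCache(store, capacity=2)
features.put("user:1", b"embedding-1")
first = features.get("user:1")    # miss: fetched from the store, then cached
second = features.get("user:1")   # hit: served from memory
```

Invalidating on write rather than updating the cache in place keeps the sketch simple; a real deployment would also need a way to expire entries that were changed by writers in other regions.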
Same 5TB feature store, same 3 regions, same throughput. Here’s how three different caching strategies compare to DynamoDB:
| Component | DynamoDB | S3 + In-Memory Cache (95% cache hit rate) | S3 + EBS Backed Disk (full dataset on disk) | S3 + Local Disk (NVMe SSD instances) |
|---|---|---|---|---|
| Storage | $3,840 | $345 | $1,545 | $345 |
| Reads | $19,425 | — | — | — |
| Writes | $972 | — | — | — |
| Compute/Cache | — | $540 | $540 | $6,624 |
| Global Table Replication | $2,916 | — | — | — |
| Data Transfer | $150 | $280 | $47 | $47 |
| Monthly Total | $27,303 | $1,165 | $2,132 | $7,016 |
| Annual Total | $327,636 | $13,980 | $25,584 | $84,192 |
| Savings vs DynamoDB | — | 96% | 92% | 74% |
Note: Costs for the S3-based options (the BoulderKV architecture) are calculated using on-demand instance pricing. With reserved instances, compute costs drop 30-50%.
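As a quick sanity check on the totals and savings percentages, the following sketch sums the line items from the comparison table. The dictionary keys and the `summarize` helper are names made up for this example; the dollar figures are copied from the table.

```python
DYNAMODB_MONTHLY = 27_303  # DynamoDB monthly total from the table above

# Monthly line items (USD) copied from the comparison table.
alternatives = {
    "S3 + in-memory cache": {"storage": 345, "compute": 540, "transfer": 280},
    "S3 + EBS-backed disk": {"storage": 1_545, "compute": 540, "transfer": 47},
    "S3 + local NVMe disk": {"storage": 345, "compute": 6_624, "transfer": 47},
}


def summarize(items: dict) -> tuple:
    """Return (monthly total, annual total, fractional savings vs DynamoDB)."""
    monthly = sum(items.values())
    savings = 1 - monthly / DYNAMODB_MONTHLY
    return monthly, monthly * 12, savings


summary = {name: summarize(items) for name, items in alternatives.items()}
for name, (monthly, annual, savings) in summary.items():
    print(f"{name:<22} ${monthly:>6,}/mo  ${annual:>7,}/yr  {savings:.0%} cheaper")
```

Swapping in reserved-instance compute prices only requires changing the `compute` entries.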
When DynamoDB Is Actually the Right Choice
Despite the high cost for read-heavy workloads, DynamoDB is a powerful and valuable tool for specific use cases:
✓ Strong Consistency Requirements: For applications that require strongly consistent reads, such as financial ledgers or inventory systems, DynamoDB is an excellent choice.
✓ Write-Heavy Workloads: If the workload is write-heavy (>30% writes), the cost benefits of the object storage architecture diminish.
✓ Small Datasets with High Throughput: For small datasets (<100GB) with millions of requests per second, DynamoDB is a proven and effective solution.
✓ Operational Simplicity: For teams that prioritize a fully managed, zero-ops solution, the premium for DynamoDB may be justified.
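The checklist above can be written down as a rough screening function. This is a heuristic sketch, not a sizing tool: the `Workload` fields and function name are invented for illustration, and the 1,000,000 requests/sec cutoff is an assumed reading of “millions of requests per second”.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    write_fraction: float           # share of requests that are writes, 0..1
    dataset_gb: float
    requests_per_sec: float
    needs_strong_consistency: bool
    wants_zero_ops: bool


def reasons_to_stay_on_dynamodb(w: Workload) -> list:
    """Return which of the four criteria above apply to this workload."""
    reasons = []
    if w.needs_strong_consistency:
        reasons.append("strong consistency")
    if w.write_fraction > 0.30:
        reasons.append("write-heavy")
    # Assumed threshold: treat "millions of requests per second" as >= 1M.
    if w.dataset_gb < 100 and w.requests_per_sec >= 1_000_000:
        reasons.append("small dataset, high throughput")
    if w.wants_zero_ops:
        reasons.append("operational simplicity")
    return reasons


# The feature store from this post: ~1% writes, 5 TB, ~30K requests/sec in total.
feature_store = Workload(0.01, 5 * 1024, 30_300, False, False)
ledger = Workload(0.40, 50, 5_000, True, True)

print(reasons_to_stay_on_dynamodb(feature_store))
print(reasons_to_stay_on_dynamodb(ledger))
```

An empty list suggests the object storage architecture is worth evaluating; any non-empty list is a reason to look harder before migrating.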
Conclusion
For ML engineers and architects running read-heavy feature stores, the choice of database has major cost implications. DynamoDB is a powerful, highly performant database, but its per-request pricing makes read-intensive workloads expensive. An architecture built on object storage plus a caching layer can provide comparable read performance at 74-96% lower cost, which makes it a better fit for many ML use cases.
BoulderKV is a global, read-optimized key-value store built on object storage for ML feature stores, inference caches, and other read-heavy workloads. Building in public—early access launches January 2025.
Questions or feedback? Email hello@boulderkv.com