Why Does DynamoDB Cost $330K for a 5TB Feature Store?

December 28, 2025

Last week an engineer at a Series B startup sent me their AWS bill. DynamoDB alone: $27,000/month. For a feature store serving ML models.

“Is this normal?”

Yes, and it’s completely avoidable.

If you’re running read-heavy workloads on DynamoDB at scale, you’re probably paying 10-50x more than you need to.

A Real Workload

Here’s a setup running in production at thousands of companies: an ML feature store serving online model inference.

Specs:

  • 5TB of feature data (embeddings, signals, model inputs)
  • 10K reads/sec (every inference hits the store)
  • 100 writes/sec (feature updates)
  • 3 regions (us-east-1, eu-west-1, ap-southeast-1)
  • 99% reads, 1% writes (classic read-heavy pattern)

Nothing exotic. Standard production setup. Let’s see what it costs.

DynamoDB Costs

Here’s the cost breakdown:

| Cost Component | Calculation | Monthly Cost |
| --- | --- | --- |
| Storage | 5 TB × 3 regions × $0.25/GB-month | $3,840 |
| Reads | 10,000 reads/sec × 86,400 × 30 × 3 regions × $0.25/million | $19,425 |
| Writes | 100 writes/sec × 86,400 × 30 × 3 regions × $1.25/million | $972 |
| Global Table Replication | 100 writes/sec × 2 × 86,400 × 30 × 3 regions × $1.875/million | $2,916 |
| Data Transfer | ~518 GB/month × ~$0.05/GB | ~$150 |
| Monthly Total | | $27,303 |
| Annual Total | | $327,636 |

The primary cost driver is read throughput, which accounts for over 70% of the monthly bill. That’s a direct consequence of DynamoDB’s pricing model: every read request is billed. The often-overlooked Global Table replication line adds another $2,916/month, because for every write in one region, DynamoDB charges replicated write request units (rWRUs) at $1.875/million to sync the data to the other regions.

Additional costs not included above: Point-in-time recovery ($0.20/GB-month), DynamoDB Streams, provisioned capacity waste, and engineering overhead for capacity planning and auto-scaling tuning.
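
If you want to sanity-check the math against your own workload, here’s a minimal sketch of the same estimate in Python. It assumes DynamoDB on-demand list prices in us-east-1 and a 30-day month, and it excludes data transfer and the extras above, so it lands close to (but not exactly on) the table’s figures due to rounding.

```python
# Back-of-the-envelope DynamoDB on-demand estimate for the workload above.
# Assumes us-east-1 list prices and a 30-day month; excludes data transfer,
# PITR, and Streams. Small differences from the table come from rounding.

SECONDS_PER_MONTH = 86_400 * 30
REGIONS = 3

STORAGE_GB = 5 * 1024                  # 5 TB of feature data
READS_PER_SEC = 10_000
WRITES_PER_SEC = 100

PRICE_STORAGE_GB_MONTH = 0.25          # $ per GB-month
PRICE_RRU = 0.25 / 1_000_000           # $ per read request unit
PRICE_WRU = 1.25 / 1_000_000           # $ per write request unit
PRICE_RWRU = 1.875 / 1_000_000         # $ per replicated write request unit

storage = STORAGE_GB * REGIONS * PRICE_STORAGE_GB_MONTH
reads = READS_PER_SEC * SECONDS_PER_MONTH * REGIONS * PRICE_RRU
writes = WRITES_PER_SEC * SECONDS_PER_MONTH * REGIONS * PRICE_WRU
# Every write gets replicated to the 2 other regions.
replication = WRITES_PER_SEC * 2 * SECONDS_PER_MONTH * REGIONS * PRICE_RWRU

for name, cost in [("storage", storage), ("reads", reads),
                   ("writes", writes), ("replication", replication)]:
    print(f"{name:<12} ${cost:,.0f}/month")
print(f"{'total':<12} ${storage + reads + writes + replication:,.0f}/month")
```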

DynamoDB delivers strong consistency, single-digit millisecond P99 latency globally, and multi-region replication—excellent for workloads that need these features. But for read-heavy ML feature stores where eventual consistency is acceptable and data changes infrequently, you’re paying a premium for capabilities you don’t need.

The Object Storage Alternative

The key insight: most data gets read way more often than it changes, and most workloads tolerate eventual consistency just fine.

Object storage like S3 costs just $0.023/GB-month, and there’s no per-item read pricing like DynamoDB’s: reads are billed as cheap API requests plus data transfer out, and a cache in front absorbs most of them anyway. The architecture is straightforward (a minimal read-path sketch follows the list):

  1. Writes go to object storage (S3, R2, whatever). Durable, replicated, dirt cheap.
  2. Reads hit cache first (in-memory or local SSD). Sub-millisecond latency—as fast as DynamoDB.
  3. Cache misses hit object storage (10-20ms). With 95-98% cache hit rates, this affects <5% of requests.
  4. Async cross-region replication. Eventually consistent—data propagates in seconds, not milliseconds.
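
Here’s a minimal sketch of that read path in Python, assuming boto3 and a hypothetical bucket name and key layout (both made up for illustration); the in-memory dict stands in for whatever cache tier you actually run (local SSD, memcached, etc.).

```python
# Read-through cache over S3: serve hot keys from memory, fall back to S3
# on a miss, and populate the cache on the way back. Sketch only; the
# bucket name and key layout are hypothetical.
from collections import OrderedDict

import boto3

s3 = boto3.client("s3")
BUCKET = "my-feature-store"       # hypothetical bucket
CACHE_MAX_ITEMS = 1_000_000       # bound memory with simple LRU eviction

_cache = OrderedDict()            # key -> serialized feature blob

def get_features(entity_id: str) -> bytes:
    key = f"features/{entity_id}"
    if key in _cache:             # cache hit: sub-millisecond
        _cache.move_to_end(key)
        return _cache[key]
    # Cache miss: one GET to object storage, roughly 10-20 ms.
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    value = obj["Body"].read()
    _cache[key] = value
    if len(_cache) > CACHE_MAX_ITEMS:
        _cache.popitem(last=False)   # evict least recently used
    return value
```

Writes go the other way: PUT the new object to S3 and let TTLs or an invalidation message refresh the caches, which is exactly where the eventual consistency in step 4 comes from.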

Same 5TB feature store, same 3 regions, same throughput. Here’s how three different caching strategies compare to DynamoDB:

| Component | DynamoDB | S3 + In-Memory Cache (95% cache hit rate) | S3 + EBS-Backed Disk (full dataset on disk) | S3 + Local Disk (NVMe SSD instances) |
| --- | --- | --- | --- | --- |
| Storage | $3,840 | $345 | $1,545 | $345 |
| Compute/Cache | — | $540 | $540 | $6,624 |
| Read/Write Requests (from the table above) | $20,397 | — | — | — |
| Global Table Replication | $2,916 | — | — | — |
| Data Transfer | $150 | $280 | $47 | $47 |
| Monthly Total | $27,303 | $1,165 | $2,132 | $7,016 |
| Annual Total | $327,636 | $13,980 | $25,584 | $84,192 |
| Savings vs DynamoDB | — | 96% | 92% | 74% |

Note: costs for the S3-based alternatives assume on-demand instance pricing. With reserved instances, compute costs drop 30-50%.
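
The 95% hit rate is doing a lot of work in that first column, so it’s worth checking how sensitive the blended read latency is to it. A quick back-of-the-envelope in Python, using assumed latencies of roughly 0.5 ms for a cache hit and 15 ms for an S3 miss (assumptions, not measurements):

```python
# Blended read latency vs. cache hit rate. The per-path latencies are
# assumptions (~0.5 ms cache hit, ~15 ms S3 miss); plug in your own numbers.
HIT_MS, MISS_MS = 0.5, 15.0

for hit_rate in (0.90, 0.95, 0.98, 0.995):
    avg_ms = hit_rate * HIT_MS + (1 - hit_rate) * MISS_MS
    print(f"hit rate {hit_rate:.1%}: "
          f"avg read ~{avg_ms:.2f} ms, "
          f"{1 - hit_rate:.1%} of reads touch S3")
```

At a 95% hit rate the average read stays around a millisecond and only 1 in 20 reads ever generates an S3 request.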

When DynamoDB Is Actually the Right Choice

Despite the high cost for read-heavy workloads, DynamoDB is a powerful and valuable tool for specific use cases:

Strong Consistency Requirements: For applications that require strongly consistent reads, such as financial ledgers or inventory systems, DynamoDB is an excellent choice.

Write-Heavy Workloads: If the workload is write-heavy (>30% writes), the cost benefits of the object storage architecture diminish.

Small Datasets with High Throughput: For small datasets (<100GB) with millions of requests per second, DynamoDB is a proven and effective solution.

Operational Simplicity: For teams that prioritize a fully managed, zero-ops solution, the premium for DynamoDB may be justified.

Conclusion

If you’re designing or running a read-heavy feature store, the database’s pricing model deserves as much scrutiny as its latency numbers. DynamoDB is a powerful, highly performant database, but paying for every read makes it exorbitant for read-intensive workloads. An architecture built on object storage plus a caching layer delivers comparable performance at 74-96% lower cost, which makes it the better fit for many ML use cases.

BoulderKV is a global, read-optimized key-value store built on object storage for ML feature stores, inference caches, and other read-heavy workloads. Building in public—early access launches January 2025.

Join early access


Questions or feedback? Email hello@boulderkv.com