When DynamoDB Gets Expensive for Feature Stores

December 30, 2025

A data engineer messaged me last month: “We’re spending $20K/month on DynamoDB for our feature store. We’re doing thousands of reads per second, but something feels off about the bill.”

I asked about their architecture. Turns out, 92% of their bill was from loading data, not serving reads.

This is shockingly common.

The Feature Store Pattern

Here’s the standard ML feature store architecture at most companies:

  1. Daily Spark job computes features from raw data
  2. Write snapshot to S3 (parquet files, checkpoints)
  3. Load snapshot into DynamoDB for serving
  4. ML models read features during inference

The workflow makes sense. S3 is cheap for batch writes. DynamoDB is fast for point reads. But step 3—loading 1TB into DynamoDB every day—is where the bill explodes.

A Real-World Scenario

Specs:

  • 1TB feature store (user embeddings, behavioral signals, model inputs)
  • Daily full refresh (Spark job computes new features every 24 hours)
  • 2,000 reads/sec (model inference hitting the store)
  • Average item size: 1KB

Nothing unusual. This is a typical mid-size feature store.

Where The Money Goes

Here’s the DynamoDB cost breakdown (on-demand pricing):

| Cost Component | Calculation | Monthly Cost | % of Total |
|---|---|---|---|
| Daily Load (Writes) | (1TB ÷ 1KB items) × 30 days × $0.625/million WRUs | $18,750 | 95% |
| Reads | 2,000/sec × 86,400 sec/day × 30 days × $0.125/million RRUs | $648 | 3% |
| Storage | 1TB × $0.25/GB-month | $256 | 1% |
| Monthly Total | | $19,654 | |
| Annual Total | | $235,848 | |

95% of your bill is loading data you already have in S3.

You’re paying DynamoDB’s write premium to move data from one place (S3) to another (DynamoDB)—every single day. The actual serving cost? Almost a rounding error.
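The arithmetic is easy to reproduce. Here's a quick back-of-the-envelope script using the on-demand prices quoted above (a sketch for this scenario's assumptions: 1KB items, 1TB dataset, 2,000 reads/sec):

```python
# Reproduce the cost breakdown above (DynamoDB on-demand pricing, us-east-1 rates as quoted).
ITEMS_PER_LOAD = 1_000_000_000      # 1TB / 1KB items
WRU_PRICE      = 0.625 / 1e6        # $ per write request unit (covers a 1KB write)
RRU_PRICE      = 0.125 / 1e6        # $ per read request unit
STORAGE_PRICE  = 0.25               # $ per GB-month

writes  = ITEMS_PER_LOAD * 30 * WRU_PRICE     # 30 daily full loads
reads   = 2_000 * 86_400 * 30 * RRU_PRICE     # 2,000 reads/sec, all month
storage = 1_024 * STORAGE_PRICE               # 1TB resident

total = writes + reads + storage
print(f"writes  ${writes:>9,.0f}  ({writes / total:.0%})")
print(f"reads   ${reads:>9,.0f}  ({reads / total:.0%})")
print(f"storage ${storage:>9,.0f}  ({storage / total:.0%})")
print(f"total   ${total:>9,.0f}  (${total * 12:,.0f}/year)")
```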

Why This Happens

There are two ways to load data into DynamoDB from S3:

Option 1: BatchWriteItem API (what most teams use)

DynamoDB charges $0.625 per million write request units. Each WRU handles 1KB. To load 1TB:

  • 1TB = ~1 billion KB
  • 1 billion WRUs × $0.625/million = $625 per load
  • 30 loads/month = $18,750
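For reference, the load job most teams run looks roughly like this. A minimal boto3 sketch, assuming the parquet snapshot has already been pulled from S3 and the items use DynamoDB-compatible types; the table name and path are hypothetical:

```python
import boto3
import pyarrow.parquet as pq

TABLE_NAME = "feature-store"                  # hypothetical table name
SNAPSHOT_PATH = "features_snapshot.parquet"   # hypothetical local copy of the S3 snapshot

table = boto3.resource("dynamodb").Table(TABLE_NAME)

# batch_writer() buffers puts into 25-item BatchWriteItem calls, but DynamoDB still
# bills ~1 write request unit per 1KB item -- batching saves API calls, not WRUs.
with table.batch_writer() as writer:
    for batch in pq.ParquetFile(SNAPSHOT_PATH).iter_batches(batch_size=1_000):
        for item in batch.to_pylist():
            # Note: DynamoDB rejects Python floats; real loaders convert them to Decimal.
            writer.put_item(Item=item)
```

Every one of those ~1 billion put_item calls consumes a WRU, which is where the $625 per load comes from.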

Option 2: S3 Import feature

AWS offers a managed S3 import feature at $0.15/GB—much cheaper at first glance:

  • 1TB = 1,024 GB; 1,024 GB × $0.15/GB ≈ $154 per load
  • 30 loads/month = $4,608

The catch? S3 Import creates a new table each time. For daily refreshes, you’d need to create a new table, switch traffic, and delete the old one—every single day. Most teams stick with BatchWriteItem to avoid this operational complexity.
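For context, an S3 import call looks roughly like this with boto3. It's a sketch with hypothetical bucket, prefix, and table names; note that the import reads DynamoDB JSON, ION, or CSV (not parquet directly), and that table-creation parameters are part of the request because the import always creates a new table:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# ImportTable always creates a brand-new table -- hence the daily table swap.
response = dynamodb.import_table(
    S3BucketSource={
        "S3Bucket": "my-feature-snapshots",      # hypothetical bucket
        "S3KeyPrefix": "exports/2025-12-30/",    # snapshot converted to DynamoDB JSON
    },
    InputFormat="DYNAMODB_JSON",
    TableCreationParameters={
        "TableName": "features-2025-12-30",      # a new table per refresh
        "AttributeDefinitions": [{"AttributeName": "user_id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "user_id", "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
    },
)
print(response["ImportTableDescription"]["ImportStatus"])
```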

Either way, you’re paying thousands per month just to move data you already have in S3.

The Alternative: Local Disk + Object Storage

What if you skip step 3 entirely?

  1. Daily Spark job computes features from raw data
  2. Write snapshot to S3 (parquet files, checkpoints)
  3. Sync the snapshot to local disk with BoulderKV (instead of loading it into DynamoDB)
  4. ML models read features during inference

BoulderKV keeps S3 as the source of truth and syncs data to local disk for serving. When your Spark job writes a new snapshot, BoulderKV pulls it to local SSD automatically. No expensive per-item writes. No DynamoDB load step. Just a fast file sync.
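Under the hood, that sync step is plain object-storage-to-disk file transfer. Here's a generic sketch of the pattern with boto3 (not BoulderKV's actual API; bucket, prefix, and paths are hypothetical):

```python
import boto3
from pathlib import Path

s3 = boto3.client("s3")
LOCAL_DIR = Path("/mnt/features")                 # local SSD serving directory (hypothetical)
BUCKET, PREFIX = "my-feature-snapshots", "features/latest/"

# Pull every object in the latest snapshot down to local disk.
# Cost: S3 GET requests (intra-region transfer to EC2 is free), not per-item WRUs.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        dest = LOCAL_DIR / obj["Key"]
        dest.parent.mkdir(parents=True, exist_ok=True)
        s3.download_file(BUCKET, obj["Key"], str(dest))
```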

Cost Comparison

| Component | DynamoDB (BatchWriteItem) | DynamoDB (S3 Import) | BoulderKV (Local Disk) |
|---|---|---|---|
| Storage | $256 | $256 | $187 (S3 + EBS) |
| Daily Sync/Load | $18,750 | $4,608 | $0 |
| Compute | — | — | $140 |
| Read Serving | $648 | $648 | — |
| Monthly Total | $19,654 | $5,512 | $327 |
| Annual Total | $235,848 | $66,144 | $3,924 |
| Savings vs BatchWriteItem | — | 72% | 98% |

BoulderKV syncs your 1TB dataset from S3 to local disk for pennies (S3 GET requests), not thousands (DynamoDB writes). Even compared to the S3 Import option, you’re saving $5,000/month—and you don’t have to manage daily table swaps.
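To put "pennies" in rough numbers, assuming the 1TB snapshot is split into on the order of 10,000 parquet objects (the object count is an assumption, not a measured figure):

```python
# Rough cost of one S3-to-local-disk sync at $0.0004 per 1,000 GET requests.
GET_PRICE = 0.0004 / 1_000        # $ per GET request (us-east-1)
objects   = 10_000                # assumed object count for a 1TB snapshot

per_sync = objects * GET_PRICE    # intra-region S3 -> EC2 transfer is free
print(f"${per_sync:.4f} per sync, ${per_sync * 30:.2f}/month")   # ~$0.004 / ~$0.12
```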

But What About Latency?

Fair question. DynamoDB delivers single-digit millisecond P99. Can local disk match that?

Yes—local SSD delivers comparable performance:

  • Local disk reads: <5ms P99. On par with DynamoDB.
  • No cache misses: The entire dataset is on disk. No slow fallback to S3.
  • Predictable latency: No cold starts or cache warming needed.

For a feature store refreshed daily, you sync once and serve all day from local storage. No cold starts, no cache warming, no misses.

Multi-Region Makes It Worse

The costs above are for a single region. Most production feature stores run in multiple regions for latency and availability. With DynamoDB Global Tables, costs multiply per region:

| Regions | DynamoDB (BatchWriteItem) | DynamoDB (S3 Import) | BoulderKV |
|---|---|---|---|
| 1 region | $19,654 | $5,512 | $327 |
| 3 regions | $58,962 | $16,536 | $981 |
DynamoDB replicates every write to every region—so your daily load cost triples. With BoulderKV, each region syncs independently from S3. The data transfer is cheap, and there’s no write amplification.

For a deeper dive into multi-region costs, see our DynamoDB cost analysis for a 5TB feature store.

Conclusion

For batch-refreshed feature stores, the daily load into DynamoDB is often the majority of your bill. You’re paying DynamoDB’s write premium for a bulk copy operation—not for the fast reads that justified choosing DynamoDB in the first place.

The alternative is simple: keep your features in S3, sync to local disk for serving, and skip the expensive DynamoDB load entirely.

BoulderKV is built for exactly this pattern—read-heavy workloads where the data already lives in object storage. Sync to local disk for fast reads. No daily loads. No write amplification. Just DynamoDB-level latency at a fraction of the cost.

Join early access →


Think our math is off? We’d love to hear from you—email hello@boulderkv.com