Backblaze B2 — Goodput Calculator
Unlock data. Train anywhere. Save everywhere.
GenAI Neocloud
1 · Where does your training data live today?
Which storage tier? (auto-fills Hot rate; pick Custom to enter a contracted rate)
2 · Your workload
5 PB · 64 GPUs · hourly ckpts · $23/TB·mo hot
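As a quick check on the summary line above, the monthly hot-storage bill is capacity times rate. This is a minimal sketch, assuming decimal units (1 PB = 1,000 TB); the helper name `monthly_hot_cost` is hypothetical and not part of the calculator.

```python
def monthly_hot_cost(dataset_pb: float, rate_per_tb_month: float) -> float:
    """Monthly hot-storage cost in dollars, assuming 1 PB = 1,000 TB (decimal)."""
    return dataset_pb * 1000 * rate_per_tb_month


# 5 PB at $23 per TB per month, the default workload shown above
print(monthly_hot_cost(5, 23))
```

Under that assumption, 5 PB at $23/TB·mo comes to $115,000 per month before any savings are applied.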
MLPerf anchors (default = 70B): ···
⚙ Advanced assumptions — click to expose every modeling knob (concurrency, utilization, I/O wait inputs, displacement %, per-GPU bandwidth). Defaults are conservative.
modeled — measure actual training-time fraction
mode default — measure yours via nvidia-smi dmon / NCCL
modeled outcome — pilot to validate
These defaults are conservative. If your actuals differ, measure them first, then enter them here; the math updates live and the picker re-snaps to the tier with the best Year-1 value.
B2 throughput tier
Standard fits most sub-3 PB workloads. Larger active-training datasets (5+ PB) benefit from a higher tier to keep GPUs fed — Section 4 shows the per-tier math after the run.
3 · Watch the math build, validated by real B2 operations
Click to start. Hero tiles fill in as each stage finishes.
skip the demo · just populate the math
Projected savings · fills in as the analysis runs
GPU $ reclaimed / mo
pending Stages 1+2
Hot-storage $ avoided / mo
pending Stages 3+4+5
Migration egress covered (one-time) ★
Backblaze-covered switching incentive
Year-1 customer value · incl. migration coverage
recurring × 12 + Backblaze-covered migration egress
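The Year-1 formula on the tile above can be sketched in a few lines. This assumes that "recurring" is the sum of the two monthly tiles (GPU $ reclaimed and hot-storage $ avoided); the function name and the sample inputs are hypothetical placeholders, not outputs of the calculator.

```python
def year1_value(gpu_reclaimed_mo: float,
                hot_avoided_mo: float,
                migration_egress_once: float) -> float:
    """Year-1 value = recurring monthly savings x 12 + one-time covered egress."""
    # Assumption: the recurring term is the sum of the two monthly tiles.
    recurring = gpu_reclaimed_mo + hot_avoided_mo
    return recurring * 12 + migration_egress_once


# Placeholder inputs for illustration only
print(year1_value(1_000.0, 2_000.0, 5_000.0))
```

The one-time migration term is added once rather than multiplied by 12, which is why the Year-1 figure is not simply twelve times the monthly total.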
Live narration · math + real B2 operations
Click "Calculate my savings" above. The math builds line by line. Each tile fills in as the relevant stage finishes.
Pipeline stages
B2 console ↗
1
Data Lake
Pack 50k samples → 1 shard
✓ What B2 does here
  • Stores raw, processed, and versioned datasets in durable object storage.
  • Lowers the cost of keeping large training datasets and historical versions.
  • Gives training jobs a reliable place to read large shards, manifests, and dataset files.
  • Keeps data portable across GPU clouds, regions, and compute environments.
✗ What B2 doesn't do here
  • Replace fast local storage for active training — training reads come from a pre-staged NVMe copy; B2 is the durable source, not the working set.
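The "pack many samples into one shard" step can be illustrated as follows. This is a minimal sketch that assumes an uncompressed tar archive as the shard container; the page does not state which shard format the demo uses, and `pack_shard` and `list_shard` are hypothetical helper names.

```python
import io
import tarfile


def pack_shard(samples: dict[str, bytes]) -> bytes:
    """Pack named samples into one uncompressed tar archive held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in samples.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_shard(shard: bytes) -> list[str]:
    """Return the member names stored in a shard, in archive order."""
    with tarfile.open(fileobj=io.BytesIO(shard), mode="r") as tar:
        return tar.getnames()


# 100 tiny synthetic samples stand in for the 50k samples in the demo
samples = {f"sample-{i:05d}.bin": bytes([i % 256]) * 16 for i in range(100)}
shard = pack_shard(samples)
print(len(list_shard(shard)), "samples in one shard of", len(shard), "bytes")
```

Packing many small samples into one large object keeps the request count low; the resulting shard is what would be uploaded as a single object.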
2
Training Prep
Parallel range reads → NVMe
✓ What B2 does here
  • Provides the storage source for reading or pre-staging training data to local NVMe/cache.
  • Supports parallel reads when shard size, network path, and client concurrency are tuned.
  • Keeps durable data separate from short-lived GPU nodes and training clusters.
✗ What B2 doesn't do here
  • Speed up CPU-bound work like tokenization, decompression, or data augmentation.
  • Guarantee low I/O wait without testing your real workload.
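Parallel range reads can be sketched as splitting an object into byte ranges and fetching them concurrently. The example below is an illustration under stated assumptions: `read_range` is a hypothetical callable standing in for whatever client issues the request (over HTTP this would typically be a GET with a `Range: bytes=start-end` header), and an in-memory blob replaces the remote object so the sketch runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def split_ranges(size: int, part_size: int) -> list[tuple[int, int]]:
    """Inclusive (start, end) byte ranges that together cover [0, size)."""
    return [(s, min(s + part_size, size) - 1) for s in range(0, size, part_size)]


def parallel_read(read_range: Callable[[int, int], bytes],
                  size: int, part_size: int, workers: int = 8) -> bytes:
    """Fetch all ranges concurrently and reassemble them in order."""
    ranges = split_ranges(size, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the parts join correctly
        parts = pool.map(lambda r: read_range(*r), ranges)
        return b"".join(parts)


# In-memory stand-in for a remote shard (10,240 bytes)
blob = bytes(range(256)) * 40


def fake_read(start: int, end: int) -> bytes:
    return blob[start:end + 1]


data = parallel_read(fake_read, len(blob), part_size=1000)
print(data == blob)
```

Part size and worker count are the tuning knobs the bullets above refer to; the right values depend on the network path and should be measured rather than assumed.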
3
Checkpoints
Train + mid-run ckpt → B2
✓ What B2 does here
  • Stores checkpoints off-cluster so recovery state survives node, disk, or cluster failure.
  • Gives training runs a durable recovery target outside the GPU environment.
  • Makes it affordable to keep more recovery points and milestone checkpoints.
  • Supports checkpoint retention policies instead of forcing teams to keep only the latest copy.
✗ What B2 doesn't do here
  • Make training itself faster or remove checkpoint serialization/GPU sync time.
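A checkpoint retention policy of the kind mentioned above might keep the most recent few checkpoints plus periodic milestones. This is one possible policy, sketched with hypothetical names and parameters (`keep_latest`, `milestone_every`); it is not the policy the calculator or B2 applies.

```python
def checkpoints_to_keep(steps: list[int], keep_latest: int,
                        milestone_every: int) -> list[int]:
    """Keep the newest `keep_latest` steps plus every step divisible by `milestone_every`."""
    ordered = sorted(steps)
    latest = set(ordered[-keep_latest:]) if keep_latest > 0 else set()
    milestones = {s for s in ordered
                  if milestone_every > 0 and s % milestone_every == 0}
    return sorted(latest | milestones)


# Checkpoints written every 1,000 steps, from step 1,000 through 10,000
steps = list(range(1000, 11000, 1000))
print(checkpoints_to_keep(steps, keep_latest=3, milestone_every=5000))
# → [5000, 8000, 9000, 10000]
```

Everything not in the returned list is a candidate for deletion or for a lifecycle rule, which is how a policy like this keeps recovery points available without storing every checkpoint.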
4
Model Registry
Versioned bundle → B2
✓ What B2 does here
  • Stores versioned model artifacts: weights, tokenizer, config, eval files, and manifests.
  • Provides durable artifact paths for deployment, rollback, and distribution.
  • Makes it practical to keep previous model versions available for rollback.
✗ What B2 doesn't do here
  • Replace MLflow, W&B, Comet, approval workflows, or model evaluation.
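A versioned bundle usually travels with a manifest that lists each artifact and its checksum. The sketch below shows one way to build such a manifest; the JSON layout, the field names, and the `build_manifest` helper are assumptions for illustration, not a format defined by B2 or by this page.

```python
import hashlib
import json


def build_manifest(version: str, files: dict[str, bytes]) -> str:
    """Return a JSON manifest with a SHA-256 digest and byte size per file."""
    entries = {
        name: {"sha256": hashlib.sha256(data).hexdigest(), "bytes": len(data)}
        for name, data in sorted(files.items())
    }
    return json.dumps({"version": version, "files": entries},
                      indent=2, sort_keys=True)


# Tiny stand-ins for weights, tokenizer, and config
bundle = {
    "weights.bin": b"\x00" * 32,
    "tokenizer.json": b"{}",
    "config.json": b'{"layers": 2}',
}
print(build_manifest("v1", bundle))
```

Storing the manifest next to the artifacts lets a deployment or rollback verify that what it downloaded is the version it asked for.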
5
Inference
Cold-start: B2 → generate
✓ What B2 does here
  • Acts as the source of truth for model files used by inference fleets.
  • Lets new or replacement nodes download and cache model artifacts before serving traffic.
  • Helps keep multiple active, previous, and rollback model versions available on demand.
  • Supports KV cache workflows by storing reusable prompt/context assets or cache warm-up artifacts that serving nodes can load locally, when the inference stack supports it.
✗ What B2 doesn't do here
  • Serve live inference requests or reduce per-token latency after the model is loaded.
  • Act as the live KV cache — per-request KV state belongs in GPU/CPU memory or the serving runtime's local cache.
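The cold-start pattern described above (download once, verify, then serve from a local cache) can be sketched as follows. `ensure_cached` and the `fetch` callable are hypothetical; `fetch` stands in for whatever client downloads the object, and a fake implementation is used here so the example runs without a network.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cached(name: str, sha256: str, cache_dir: Path,
                  fetch: Callable[[str], bytes]) -> Path:
    """Return a verified local copy of `name`, downloading only on a cache miss."""
    path = cache_dir / name
    if path.exists() and hashlib.sha256(path.read_bytes()).hexdigest() == sha256:
        return path  # cache hit: no download needed
    data = fetch(name)
    if hashlib.sha256(data).hexdigest() != sha256:
        raise ValueError(f"checksum mismatch for {name}")
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path


calls: list[str] = []


def fake_fetch(name: str) -> bytes:
    """Stand-in for a real download; records each call."""
    calls.append(name)
    return b"weights"


digest = hashlib.sha256(b"weights").hexdigest()
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "models"
    ensure_cached("model.bin", digest, cache, fake_fetch)  # miss: downloads
    ensure_cached("model.bin", digest, cache, fake_fetch)  # hit: served locally
print(calls)
```

Only the first call reaches the download path, which is the behavior a new or replacement serving node needs before it starts taking traffic.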
Click any stage above to see what B2 helps with — and what it doesn't. Real B2 operations execute against your bucket; click "B2 console ↗" at top to verify objects landed.
Real B2 cost (bytes moved this session) $0.00
Stored 0 B
Egressed 0 B
PUTs 0
GETs 0