Data Modeling and Storage Deep Dive
Overview
Storage design should start from query patterns, write frequency, and consistency requirements.
Core Concepts
- Model entities around access paths, not ER-diagram purity.
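- Indexes speed up reads but add work to every write, because each index must be updated along with the row.
- Partitioning strategy matters at scale: the partition key determines how evenly load spreads and which queries can be served from a single partition.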
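To make the index point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the index name `idx_orders_user` are hypothetical names chosen for illustration. It asks SQLite for the query plan of the same lookup before and after adding an index on the access path.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each plan row is the human-readable detail text.
    return " | ".join(row[3] for row in rows)


# Hypothetical schema: orders are mostly read by user_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total_cents) VALUES (?, ?)",
    [(i % 100, i * 10) for i in range(1000)],
)

lookup = "SELECT id, total_cents FROM orders WHERE user_id = ?"

# Without an index on user_id, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

# Index the access path; every later write to orders must also maintain it.
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, so treat the printed text as diagnostic output rather than a stable interface.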
Internal Architecture
- Operational DB for transactional workloads.
- Search/index store for low-latency discovery.
- Blob/object storage for large immutable media.
Data and Request Flow
- Writes persist to source of truth first.
- Derived views/search indexes update asynchronously, so they may briefly lag the source of truth.
- Read path selects best store for latency requirements.
Scalability and Reliability
- Hot partition detection and mitigation.
- Backfill/migration strategies for schema evolution.
- Backups and restore rehearsals.
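Hot partition detection can be sketched as a comparison of per-partition request counts against the mean. The helper names (`partition_for`, `hot_partitions`, `salted_key`), the threshold factor of 2.0, and the sample traffic below are all assumptions made for illustration; a production system would read these counts from its metrics pipeline.

```python
import hashlib
from collections import Counter
from typing import Sequence


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (not Python's per-process hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def hot_partitions(keys: Sequence[str], num_partitions: int, factor: float = 2.0) -> list:
    """Return partitions whose request count exceeds `factor` times the mean."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    mean = len(keys) / num_partitions
    return sorted(p for p, c in counts.items() if c > factor * mean)


def salted_key(key: str, salt_buckets: int, request_id: int) -> str:
    """Spread writes for one logical key over several physical keys."""
    return f"{key}#{request_id % salt_buckets}"


# Hypothetical traffic: one key receives 90% of requests.
traffic = ["celebrity"] * 900 + [f"user-{i}" for i in range(100)]
hot = hot_partitions(traffic, num_partitions=8)
print("hot partitions:", hot)

# Mitigation sketch: salting turns one hot key into 16 physical keys.
physical_keys = {salted_key("celebrity", 16, i) for i in range(900)}
print("physical keys after salting:", len(physical_keys))
```

Salting spreads write load, but a read of the logical key then has to fan out across every salt bucket and merge the results, so it suits write-heavy keys better than read-heavy ones.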
Code Examples
Write: API -> Primary DB -> Change Event -> Indexer -> Search Index
Read: API -> Search Index (list) -> DB (details)
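The two flows above can be sketched with in-memory stand-ins. Everything here is hypothetical: `Store`, `write`, `run_indexer`, and `search` are illustrative names, the dictionaries stand in for a real database and search engine, and the deque stands in for a change-event stream.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Store:
    primary: dict = field(default_factory=dict)       # source of truth: id -> record
    events: deque = field(default_factory=deque)      # change events awaiting indexing
    search_index: dict = field(default_factory=dict)  # derived view: word -> set of ids


def write(store: Store, item_id: int, title: str) -> None:
    """Write path: persist to the primary store first, then emit a change event."""
    store.primary[item_id] = {"id": item_id, "title": title}
    store.events.append(item_id)


def run_indexer(store: Store) -> int:
    """Drain change events into the search index; runs asynchronously in practice."""
    processed = 0
    while store.events:
        item_id = store.events.popleft()
        record = store.primary.get(item_id)
        if record is None:
            continue  # record was removed before the indexer caught up
        # Simplification: terms from an older title are not removed on update.
        for word in record["title"].lower().split():
            store.search_index.setdefault(word, set()).add(item_id)
        processed += 1
    return processed


def search(store: Store, word: str) -> list:
    """Read path: the index lists matching ids, the primary store supplies details."""
    ids = sorted(store.search_index.get(word.lower(), set()))
    return [store.primary[i] for i in ids if i in store.primary]


store = Store()
write(store, 1, "Red bicycle")
stale = search(store, "bicycle")   # indexer has not run yet, so nothing is found
run_indexer(store)
fresh = search(store, "bicycle")   # now served via the index plus the primary store
print(stale, fresh)
```

The gap between `stale` and `fresh` is the eventual-consistency window that asynchronous index updates introduce.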
Common Interview Questions
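- Q: SQL vs NoSQL: what decides? A: Start from constraints, then weigh tradeoffs. State the SLOs and capacity assumptions, then decide on access patterns, the need for multi-row transactions and ad hoc joins, schema flexibility, and expected write volume. Relational stores suit transactional data with varied queries; key-value or document stores suit known access paths that must scale horizontally.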
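- Q: How do you handle schema migration safely? A: Use an expand-and-contract sequence: add the new schema alongside the old in a backward-compatible way, write to both, backfill existing rows in small batches, switch reads, then remove the old schema. Keep each step independently deployable and reversible, with clear rollback triggers.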
- Q: How do you avoid hot keys? A: Choose high-cardinality partition keys that spread load evenly, detect skew with per-partition metrics, and mitigate with key salting, caching for read-hot keys, and rate limits. Name the tradeoff: salted keys make reads fan out across buckets.
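For the schema migration question, a batched backfill is one common building block. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `users` table: it adds a nullable `full_name` column (a backward-compatible change) and then fills it in small batches so that no single transaction touches the whole table. The table, column names, and batch size are assumptions for illustration.

```python
import sqlite3


def backfill_full_name(conn: sqlite3.Connection, batch_size: int = 100) -> int:
    """Populate full_name in small batches; returns the number of rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET full_name = first_name || ' ' || last_name "
            "WHERE id IN (SELECT id FROM users WHERE full_name IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # short transactions keep lock time low
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


conn = sqlite3.connect(":memory:")
# first_name and last_name are NOT NULL, so the concatenation is never NULL
# and the backfill loop is guaranteed to terminate.
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, last_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
    [(f"First{i}", f"Last{i}") for i in range(250)],
)

# Expand step: a nullable column is ignored by code that does not know about it.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

migrated = backfill_full_name(conn, batch_size=100)
print("rows backfilled:", migrated)
```

Because the loop only selects rows where `full_name IS NULL`, it can be stopped and restarted without redoing finished work. Removing the old columns would be a separate, later step once no reader depends on them.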
Production Considerations
- Enforce retention and PII handling policies.
- Instrument slow queries and index hit ratios.
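Slow-query instrumentation can start as a timing wrapper around each database call. This is a minimal sketch: `timed_query`, `slow_query_log`, and the query label are hypothetical names, and a real service would emit to its metrics or logging system instead of a list.

```python
import time
from contextlib import contextmanager

# Collected (label, elapsed_ms) pairs for queries at or above the threshold.
slow_query_log = []


@contextmanager
def timed_query(label: str, threshold_ms: float):
    """Record the wrapped block if it takes at least `threshold_ms` milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms >= threshold_ms:
            slow_query_log.append((label, elapsed_ms))


# A threshold of 0 ms records every call, which is convenient for a demo.
with timed_query("list_orders_by_user", threshold_ms=0.0):
    sum(range(1000))  # stand-in for a database call

print(slow_query_log)
```

Labeling by query shape rather than by literal SQL text keeps cardinality low and avoids logging parameter values, which matters when queries carry PII.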
Tradeoffs
- Strong consistency vs lower-latency reads.
- Denormalization: faster reads vs more complex updates.
Senior-Level Insights
- Data lifecycle strategy is as important as schema design.