Repository Pattern and Data Sources Deep Dive¶
Overview¶
Repository architecture abstracts data origin and centralizes consistency policy. In production Android apps, repositories are where cache, sync, and fallback logic should live instead of leaking into ViewModels.
Core Concepts¶
- repository exposes stable domain-facing API
- local and remote sources remain implementation details
- single source of truth (SSOT) enforced at repository boundary
- mapping and error normalization happen before upstream exposure
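The last point, mapping and error normalization before upstream exposure, can be sketched as follows. This is a minimal illustration rather than a prescribed design: ArticleDto, DataError, DataResult, and normalized are hypothetical names, and the exception-to-error mapping is an assumption about what a given app would want.

```kotlin
import java.io.IOException

// Hypothetical wire-format type; stands in for whatever the API client returns.
data class ArticleDto(val id: String?, val title: String?)

// Domain model: no nullable wire fields, no transport types.
data class Article(val id: String, val title: String)

// Normalized failures that upstream layers can reason about.
sealed interface DataError {
    object Network : DataError
    data class Malformed(val reason: String) : DataError
    data class Unknown(val cause: Throwable) : DataError
}

sealed interface DataResult<out T> {
    data class Success<T>(val value: T) : DataResult<T>
    data class Failure(val error: DataError) : DataResult<Nothing>
}

// Mapping validates the DTO instead of passing nulls upstream.
fun ArticleDto.toDomain(): Article {
    val safeId = id ?: throw IllegalArgumentException("missing id")
    return Article(id = safeId, title = title.orEmpty())
}

// Runs a source call and converts thrown exceptions into DataError values.
fun <T> normalized(block: () -> T): DataResult<T> =
    try {
        DataResult.Success(block())
    } catch (e: IOException) {
        DataResult.Failure(DataError.Network)
    } catch (e: IllegalArgumentException) {
        DataResult.Failure(DataError.Malformed(e.message ?: "invalid payload"))
    } catch (e: Exception) {
        DataResult.Failure(DataError.Unknown(e))
    }
```

With this shape, consumers branch on DataError cases and never see an IOException or a DTO. In suspending code the catch-all branch would also need to rethrow CancellationException.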
Layer Responsibilities¶
- Presentation/ViewModel:
    - consume repository through use cases/contracts
    - avoid source-specific logic
- Domain:
    - define repository interfaces and business semantics
- Data:
    - implement source orchestration
    - handle persistence/network/retry policy
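The split above can be sketched with plain classes. All names here (GetPublishedArticles, InMemoryArticleRepository, ArticleListPresenter) are hypothetical, and the example is synchronous to stay small; a real app would more likely use Flow and suspend functions.

```kotlin
// Domain layer: contract and business semantics only.
data class Article(val id: String, val title: String, val published: Boolean)

interface ArticleRepository {
    fun articles(): List<Article>
}

// Domain use case: a business rule, independent of where the data comes from.
class GetPublishedArticles(private val repository: ArticleRepository) {
    operator fun invoke(): List<Article> =
        repository.articles().filter { it.published }
}

// Data layer: the only place that knows about concrete sources.
class InMemoryArticleRepository(private val rows: List<Article>) : ArticleRepository {
    override fun articles(): List<Article> = rows
}

// Presentation layer: depends on the use case, not on a source.
class ArticleListPresenter(private val getPublished: GetPublishedArticles) {
    fun titles(): List<String> = getPublished().map { it.title }
}
```

Because the presenter only sees the use case, swapping InMemoryArticleRepository for a database-backed implementation requires no presentation changes.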
Data Flow¶
- Consumer requests data from repository.
- Repository reads canonical local source.
- Freshness rules decide whether remote fetch is needed.
- Remote result is validated/mapped and persisted.
- Updated canonical source emits new data upstream.
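The five steps above can be traced in a small in-memory sketch. LocalStore and SketchRepository are hypothetical stand-ins for a database table and a repository, the TTL rule is one possible freshness policy, and callbacks replace Flow so the example needs only the standard library.

```kotlin
// Hypothetical in-memory stand-in for the canonical local source.
class LocalStore {
    var rows: List<String> = emptyList()
        private set
    var savedAtMillis: Long? = null
        private set
    private val listeners = mutableListOf<(List<String>) -> Unit>()

    fun observe(listener: (List<String>) -> Unit) {
        listeners += listener
        listener(rows) // emit the current canonical state immediately
    }

    fun replaceAll(newRows: List<String>, nowMillis: Long) {
        rows = newRows
        savedAtMillis = nowMillis
        listeners.forEach { it(rows) } // emit the updated canonical state
    }
}

class SketchRepository(
    private val local: LocalStore,
    private val fetchRemote: () -> List<String>,
    private val ttlMillis: Long,
) {
    // Steps 1-2: consumers always read from the local source.
    fun observe(listener: (List<String>) -> Unit) = local.observe(listener)

    fun refreshIfStale(nowMillis: Long) {
        // Step 3: the freshness rule decides whether a remote fetch is needed.
        val savedAt = local.savedAtMillis
        val stale = savedAt == null || nowMillis - savedAt >= ttlMillis
        if (!stale) return
        // Step 4: validate the remote result, then persist it.
        val validated = fetchRemote().filter { it.isNotBlank() }
        // Step 5: persisting triggers the emission to observers.
        local.replaceAll(validated, nowMillis)
    }
}
```

Note that the remote result is never handed to the consumer directly; it reaches observers only by way of the local store.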
Internal Architecture¶
Typical repository internals:
- remote source adapter (API)
- local source adapter (DB/cache)
- mapper layer (DTO <-> entity <-> UI model)
- policy engine (TTL, backoff, conflict handling)
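As one illustration of the policy engine, the sketch below shows capped exponential backoff as a standalone policy object. RetryPolicy and retrying are hypothetical names with illustrative defaults; the blocking sleep is injected so the policy can be tested without waiting, and a coroutine-based version would use delay instead and rethrow CancellationException.

```kotlin
// Hypothetical policy object; names and defaults are illustrative.
data class RetryPolicy(
    val maxAttempts: Int = 4,
    val baseDelayMillis: Long = 500,
    val maxDelayMillis: Long = 5_000,
) {
    // Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped.
    fun delayFor(attempt: Int): Long {
        require(attempt >= 1) { "attempt is 1-based" }
        val shift = (attempt - 1).coerceAtMost(30)
        return (baseDelayMillis shl shift).coerceAtMost(maxDelayMillis)
    }
}

// Runs `block` up to maxAttempts times, sleeping between failed attempts.
fun <T> retrying(
    policy: RetryPolicy,
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    block: () -> T,
): T {
    var lastError: Exception? = null
    for (attempt in 1..policy.maxAttempts) {
        try {
            return block()
        } catch (e: Exception) {
            lastError = e
            // No sleep after the final attempt; the error is rethrown below.
            if (attempt < policy.maxAttempts) sleep(policy.delayFor(attempt))
        }
    }
    throw lastError ?: IllegalStateException("maxAttempts must be at least 1")
}
```

Keeping the policy in one object is what prevents each feature from growing its own slightly different retry loop.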
Important anti-patterns:
- leaking Retrofit/Room types to domain/UI
- duplicating fetch policy across features
- bypassing repository for "quick fixes"
Code Examples¶
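import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.map

// Domain-facing contract: consumers observe state and request refreshes.
interface ArticleRepository {
    fun observeArticles(): Flow<List<Article>>
    suspend fun refreshArticles(force: Boolean = false)
}

// Article, ArticleApi, ArticleDao, and the toEntity()/toDomain() mappers are
// app-defined; Clock is any type exposing now(), e.g. kotlinx.datetime.Clock.
class ArticleRepositoryImpl(
    private val api: ArticleApi,
    private val dao: ArticleDao,
    private val clock: Clock
) : ArticleRepository {

    // The local database is the single source of truth. Entities are mapped
    // to domain models here so persistence types do not leak upstream.
    override fun observeArticles(): Flow<List<Article>> =
        dao.observeAll().map { entities -> entities.map { it.toDomain() } }

    // Fetch only when forced or stale; observers are updated through the DAO.
    override suspend fun refreshArticles(force: Boolean) {
        if (force || dao.isStale(clock.now())) {
            val remote = api.getArticles()
            dao.replaceAll(remote.map { it.toEntity() })
        }
    }
}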
Common Interview Questions¶
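- Q: Should repositories return Flow, suspend, or both? A: Usually both. Start from delivery semantics: expose a Flow for state that consumers observe over time and a suspend function for one-shot work such as a refresh. On the consuming side, use StateFlow for durable state, SharedFlow or Channel for transient events, and lifecycle-aware collection to prevent duplicate work.
- Q: Where should mapping happen? A: In the data layer, before anything is exposed upstream. Define boundaries and ownership first: DTOs and database entities stay behind the repository, and the domain sees only its own models. Finish with the testability and change-resilience tradeoffs: mappers are easy to unit test, and a changed API field touches one mapper instead of every screen.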
- Q: How do you enforce SSOT in multi-feature apps? A: Describe data policy explicitly: freshness and invalidation rules, canonical local source, deterministic merge logic, and duplicate prevention with stable keys.
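- Q: Repository vs use case: where do rules belong? A: Define boundaries and ownership first. Rules that express business semantics belong in use cases in the domain layer; data policy such as freshness, retry, and source orchestration belongs in the repository. Finish with the testability and change-resilience tradeoffs of that split.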
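For the SSOT question, "deterministic merge logic" and "duplicate prevention with stable keys" can be made concrete with a small sketch. The updatedAtMillis field and the "newer wins, remote wins ties" rule are assumptions chosen for illustration; a real app would pick its own conflict rule.

```kotlin
data class Article(val id: String, val title: String, val updatedAtMillis: Long)

// Merges remote rows into local rows keyed by the stable id.
// The row with the newer updatedAtMillis wins; on a tie the remote row wins.
// Keying by id also collapses duplicate rows in either input.
fun merge(local: List<Article>, remote: List<Article>): List<Article> {
    val byId = LinkedHashMap<String, Article>()
    for (row in local) byId[row.id] = row
    for (row in remote) {
        val existing = byId[row.id]
        if (existing == null || row.updatedAtMillis >= existing.updatedAtMillis) {
            byId[row.id] = row
        }
    }
    // Sort by id so the output order does not depend on input order.
    return byId.values.sortedBy { it.id }
}
```

Applying the same remote payload twice yields the same result, which is what makes retried syncs safe.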
Production Considerations¶
- document freshness and retry policies explicitly
- include tracing around source decisions (local vs remote)
- protect write paths with idempotency/retry semantics
- keep repository APIs stable to minimize cross-team churn
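The idempotency point can be illustrated with a write request that carries a client-generated key. CreateComment and CommentStore are hypothetical; the store stands in for whichever side deduplicates (a server endpoint or a local outbox), and the sketch assumes the client reuses the same key on every retry of one user action.

```kotlin
// Hypothetical write request; `idempotencyKey` is generated once per user
// action and reused on every retry of that action.
data class CreateComment(val idempotencyKey: String, val text: String)

// Stand-in for a server or local outbox that deduplicates by key.
class CommentStore {
    private val applied = LinkedHashMap<String, String>()

    // Returns true if the write was applied, false if it was a replay.
    fun submit(request: CreateComment): Boolean {
        if (request.idempotencyKey in applied) return false
        applied[request.idempotencyKey] = request.text
        return true
    }

    fun comments(): List<String> = applied.values.toList()
}
```

A retried request with the same key is a no-op, while a second, distinct action with identical text still goes through because it carries a new key.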
Scalability Tradeoffs¶
- Pros:
- consistency and reuse of data policy
- lower UI-layer complexity
- Cons:
- repository can become a god object without boundaries
- policy complexity grows with product scope
Senior-Level Insights¶
Senior-level answers should discuss repository ownership and governance. At scale, success depends on keeping repository contracts stable while letting data-source implementations evolve safely.