Offline-First and Sync Deep Dive

Overview

Offline-first architecture treats local data as the primary source for reads, which keeps the app responsive and usable without connectivity. The network becomes synchronization infrastructure, not a mandatory dependency for reads.

Core Concepts

  • local-first reads for responsiveness and availability
  • queued writes with retry/backoff
  • explicit sync lifecycle and observability
  • deterministic conflict resolution policy

Layer Responsibilities

  • Presentation:
      • render local-backed state immediately
      • show sync status/errors transparently
  • Domain/use cases:
      • apply business invariants before enqueueing writes
  • Data/sync engine:
      • persist write queue
      • schedule pull/push sync
      • reconcile conflicts and update canonical store
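The domain-layer responsibility above can be sketched roughly as follows. This is a minimal illustration, not a prescribed design: `RenameNote`, `EnqueueResult`, `InMemoryWriteQueue`, `RenameNoteUseCase`, and the 120-character limit are all hypothetical, and the queue is an in-memory stand-in for a persisted one.

```kotlin
// Hypothetical names throughout; none of these types come from a real library.
data class RenameNote(val noteId: String, val newTitle: String)

// Outcome of the domain check, so the UI can surface a rejection immediately.
sealed interface EnqueueResult {
    data class Queued(val opId: String) : EnqueueResult
    data class Rejected(val reason: String) : EnqueueResult
}

// Stand-in for the data layer's persisted write queue (in-memory here).
class InMemoryWriteQueue {
    private val ops = mutableListOf<Pair<String, RenameNote>>()

    fun add(cmd: RenameNote): String {
        val opId = "op-${ops.size + 1}"
        ops += opId to cmd
        return opId
    }

    fun size(): Int = ops.size
}

// Domain use case: checks business invariants, then hands off to the data layer.
class RenameNoteUseCase(private val queue: InMemoryWriteQueue) {
    fun execute(cmd: RenameNote): EnqueueResult {
        val title = cmd.newTitle.trim()
        if (title.isEmpty()) return EnqueueResult.Rejected("Title must not be blank")
        if (title.length > 120) return EnqueueResult.Rejected("Title exceeds 120 characters")
        return EnqueueResult.Queued(queue.add(cmd.copy(newTitle = title)))
    }
}
```

Rejecting invalid input before it reaches the queue means the sync engine only ever retries operations that were valid when they were created.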

Data Flow

  1. User action writes to local store (or queue + local mutation).
  2. UI reflects updated local state optimistically.
  3. Sync worker pushes pending operations.
  4. Server response is reconciled and persisted.
  5. Canonical local state emits final representation.
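The five steps above might look like this in a deliberately simplified, in-memory form. Everything here is an assumption made for illustration: `Note`, `PendingWrite`, `RemoteApi`, and `OfflineFirstNotes` are hypothetical, functions are non-suspending for brevity, and a real implementation would persist both the store and the queue.

```kotlin
// Hypothetical, in-memory stand-ins for the local DB, the queue, and the backend.
data class Note(val id: String, val title: String, val synced: Boolean)
data class PendingWrite(val opId: String, val noteId: String, val title: String)

// Fake backend contract: returns the canonical title the server stored.
fun interface RemoteApi {
    fun push(write: PendingWrite): String
}

class OfflineFirstNotes(private val remote: RemoteApi) {
    private val store = linkedMapOf<String, Note>()   // local store the UI reads from
    private val queue = ArrayDeque<PendingWrite>()    // pending operations, oldest first
    private var nextOp = 1

    // Steps 1-2: mutate local state and enqueue; readers see the change right away.
    fun rename(noteId: String, title: String) {
        store[noteId] = Note(noteId, title, synced = false)
        queue.addLast(PendingWrite("op-${nextOp++}", noteId, title))
    }

    fun read(noteId: String): Note? = store[noteId]

    fun pendingCount(): Int = queue.size

    // Steps 3-5: push each pending op in order, then store the server's canonical answer.
    fun runSyncCycle() {
        while (queue.isNotEmpty()) {
            val write = queue.first()
            val canonicalTitle = remote.push(write)   // if this throws, the op stays queued
            store[write.noteId] = Note(write.noteId, canonicalTitle, synced = true)
            queue.removeFirst()
        }
    }
}
```

The operation is removed from the queue only after the server's response has been written locally, so a failure partway through leaves the write pending rather than lost.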

Internal Architecture

Typical internal components:

  • local DB as the single source of truth (SSOT)
  • operation queue with metadata (attempt count, timestamps)
  • sync orchestrator (WorkManager + backoff)
  • merge/conflict policy module
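As a rough illustration of how the queue metadata and backoff could work together, the sketch below computes a capped exponential delay and stops retrying after a fixed number of attempts. `QueuedOp`, `RetryDecision`, `RetryPolicy`, and all default values are hypothetical. In a real Android app a scheduler such as WorkManager would typically own the timing; this plain-Kotlin version only shows the arithmetic.

```kotlin
// Hypothetical queue metadata, mirroring "attempt count, timestamps".
data class QueuedOp(val id: String, val attempt: Int, val nextAttemptAtMs: Long)

sealed interface RetryDecision {
    data class RetryAt(val op: QueuedOp) : RetryDecision
    data class GiveUp(val op: QueuedOp) : RetryDecision
}

class RetryPolicy(
    private val baseDelayMs: Long = 1_000,
    private val maxDelayMs: Long = 60_000,
    private val maxAttempts: Int = 5
) {
    // Exponential backoff without jitter: base * 2^attempt, capped at maxDelayMs.
    fun delayForAttempt(attempt: Int): Long {
        val shift = attempt.coerceIn(0, 30)   // bound the shift so the delay stays finite
        return minOf(baseDelayMs shl shift, maxDelayMs)
    }

    // Called after a failed push: schedule another attempt or stop retrying.
    fun onFailure(op: QueuedOp, nowMs: Long): RetryDecision {
        val attempts = op.attempt + 1
        val updated = op.copy(attempt = attempts)
        return if (attempts >= maxAttempts) {
            RetryDecision.GiveUp(updated)
        } else {
            RetryDecision.RetryAt(updated.copy(nextAttemptAtMs = nowMs + delayForAttempt(attempts)))
        }
    }
}
```

With the defaults above, the first failure schedules a retry 2 seconds later, and the fifth failure returns `GiveUp` so the operation can be surfaced to the user instead of looping forever. Production systems usually add random jitter to the delay, which is omitted here.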

Conflict handling examples:

  • last-write-wins for low-risk fields
  • version-based merge for collaborative entities
  • user-assisted merge for critical records
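A minimal sketch of two of these policies follows. The record shape (`Versioned`), the `MergeResult` type, and both function names are hypothetical. The version check shown here only detects a conflict and escalates it to the user; an actual field-level merge for collaborative entities would be more involved.

```kotlin
// Hypothetical record shape: a version counter plus a modification timestamp.
data class Versioned<T>(val value: T, val version: Long, val updatedAtMs: Long)

sealed interface MergeResult<out T> {
    data class Resolved<T>(val winner: Versioned<T>) : MergeResult<T>
    data class NeedsUser<T>(val local: Versioned<T>, val remote: Versioned<T>) : MergeResult<T>
}

// Last-write-wins: the newer timestamp wins; ties go to remote so every device agrees.
fun <T> lastWriteWins(local: Versioned<T>, remote: Versioned<T>): Versioned<T> =
    if (local.updatedAtMs > remote.updatedAtMs) local else remote

// Version check: the local edit applies only if it was based on the current remote
// version; otherwise both sides are handed back for user-assisted resolution.
fun <T> versionBased(local: Versioned<T>, baseVersion: Long, remote: Versioned<T>): MergeResult<T> =
    if (remote.version == baseVersion) {
        MergeResult.Resolved(local.copy(version = baseVersion + 1))
    } else {
        MergeResult.NeedsUser(local, remote)
    }
```

Last-write-wins depends on device clocks being roughly in agreement, which is one reason it suits low-risk fields only. The tie-break rule keeps the outcome deterministic when timestamps are equal.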

Code Examples

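// One queued write, kept until the server acknowledges it.
data class PendingOp(
    val id: String,        // unique per operation
    val entityId: String,  // entity the operation targets
    val type: String,      // operation kind
    val payload: String,   // serialized operation body
    val attempt: Int       // number of delivery attempts so far
)

// Entry point for handing writes to the sync engine.
interface SyncCoordinator {
    // Add the operation to the write queue.
    suspend fun enqueueWrite(op: PendingOp)

    // Run one sync pass: push pending operations and reconcile the results.
    suspend fun runSyncCycle()
}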

Common Interview Questions

  • Q: How do you guarantee eventual consistency? A: Persist the write queue so pending operations survive restarts, retry with backoff until the server acknowledges each one, make retries safe with idempotency keys, and apply a deterministic conflict resolution policy so that every replica converges on the same state.
  • Q: When is optimistic UI unsafe? A: When the server can reject the write for reasons the client cannot check locally, or when the action is hard to undo (for example, a payment). In those cases, show a pending state and confirm only after the server acknowledges the write.
  • Q: How do you avoid infinite retry loops? A: Track the attempt count on each queued operation, back off between attempts, and cap the number of retries. Distinguish transient failures (timeouts, server unavailable) from permanent ones (validation rejected), and surface permanent failures to the user instead of retrying them.
  • Q: Where should conflict resolution policy live? A: In the data/sync engine, as a dedicated merge/conflict policy module. Business invariants stay in the domain layer and are applied before writes are enqueued; the presentation layer only surfaces sync status and any conflicts that need user input.

Production Considerations

  • instrument queue depth, sync latency, and conflict rate
  • use idempotency keys for retried write operations
  • provide backpressure when backend/system health degrades
  • ensure secure local storage for sensitive offline data
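To illustrate why retried writes need idempotency keys, the sketch below uses a fake in-memory backend that applies each key at most once. `WriteRequest` and `FakeLedgerServer` are hypothetical names, and the ledger scenario is only an example; a real server would store seen keys durably and usually expire them.

```kotlin
// Hypothetical request shape: the client reuses the same key on every retry.
data class WriteRequest(val idempotencyKey: String, val accountId: String, val amountCents: Long)

// Hypothetical fake backend showing why a retried write needs a stable key.
class FakeLedgerServer {
    private val balances = mutableMapOf<String, Long>()
    private val seen = mutableMapOf<String, Long>()   // key -> balance returned the first time

    // Applies the write once per key; a replay returns the stored result unchanged.
    fun apply(req: WriteRequest): Long =
        seen.getOrPut(req.idempotencyKey) {
            val updated = (balances[req.accountId] ?: 0L) + req.amountCents
            balances[req.accountId] = updated
            updated
        }

    fun balance(accountId: String): Long = balances[accountId] ?: 0L
}
```

If the client sends the same request twice because the first response was lost, the balance changes once and both calls return the same result. Without the key, the retry would be applied as a second, separate write.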

Scalability Tradeoffs

  • Pros:
      • resilient UX under poor connectivity
      • smoother user interaction latency
  • Cons:
      • higher complexity in sync and reconciliation logic
      • larger operational/debugging surface area

Senior-Level Insights

Staff-level answers connect architecture with operations: what metrics triggered incidents, how policies evolved, and how teams balanced consistency guarantees vs product latency goals.