Parallelism and Scheduling Deep Dive

Overview

Parallelism is a capacity control decision, not just a performance trick. In coroutine systems, limiting parallelism often improves stability and p99 latency.

Core Concepts

  • logical concurrency vs actual parallel execution
  • limitedParallelism caps how many coroutines execute at the same time on a dispatcher view
  • fairness and throughput tradeoffs
  • contention-aware scheduling

Internal Implementation

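limitedParallelism(n) returns a view of the parent dispatcher that lets at most n coroutines execute at the same time. The view owns no threads: work still runs on the parent dispatcher's threads, and tasks beyond the limit wait in the view's queue until a slot frees up.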

This provides a lightweight control boundary without introducing a new executor.
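
The cap can be observed directly. The following is a minimal sketch, assuming a kotlinx.coroutines version that provides limitedParallelism (1.6 or later); observedPeak is a hypothetical helper written for this illustration, and Thread.sleep stands in for blocking or CPU-bound work.

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking

// Hypothetical helper: runs `tasks` jobs on a view limited to `limit` and
// returns the highest number of jobs seen executing at the same time.
suspend fun observedPeak(limit: Int, tasks: Int): Int = coroutineScope {
    val view = Dispatchers.Default.limitedParallelism(limit)
    val active = AtomicInteger(0)
    val peak = AtomicInteger(0)
    (1..tasks).map {
        async(view) {
            val now = active.incrementAndGet()
            peak.updateAndGet { seen -> maxOf(seen, now) }
            Thread.sleep(20) // stand-in for blocking or CPU-bound work
            active.decrementAndGet()
        }
    }.awaitAll()
    peak.get()
}

fun main(): Unit = runBlocking {
    // The peak should never exceed the limit, however many tasks are queued.
    println("peak concurrency = ${observedPeak(limit = 2, tasks = 16)}")
}
```

All sixteen tasks are submitted at once, yet the view admits only two at a time; the rest wait in the view's queue rather than occupying additional Dispatchers.Default threads.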

Threading Model

Limiting parallelism helps protect scarce resources:

  • DB connection pools
  • remote APIs with rate limits
  • CPU-heavy transforms

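It does not eliminate thread contention; it reduces pressure on shared pools. The limit applies to coroutines that are executing, so a coroutine that suspends gives up its slot. For blocking calls the limit therefore bounds in-flight operations, but for suspending calls it does not.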
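
For suspending (non-blocking) calls against a rate-limited API, a Semaphore from kotlinx.coroutines.sync is a common way to bound in-flight requests, because a permit stays held across suspension. The sketch below is illustrative only: fetchRemote and fetchAll are hypothetical names, delay stands in for a real network call, and the counters exist purely to make the bound visible.

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// Instrumentation for the demo only.
val inFlight = AtomicInteger(0)
val maxInFlight = AtomicInteger(0)

// Hypothetical suspending client call; delay stands in for network I/O.
suspend fun fetchRemote(id: Int): String {
    val now = inFlight.incrementAndGet()
    maxInFlight.updateAndGet { seen -> maxOf(seen, now) }
    delay(20)
    inFlight.decrementAndGet()
    return "result-$id"
}

// Starts one coroutine per id, but lets at most `maxConcurrent` of them
// be inside fetchRemote at any moment.
suspend fun fetchAll(ids: List<Int>, maxConcurrent: Int): List<String> = coroutineScope {
    val permits = Semaphore(maxConcurrent)
    ids.map { id ->
        async { permits.withPermit { fetchRemote(id) } }
    }.awaitAll()
}

fun main(): Unit = runBlocking {
    val results = fetchAll((1..20).toList(), maxConcurrent = 3)
    println("${results.size} results, max in flight = ${maxInFlight.get()}")
}
```

The two tools can be combined: a limitedParallelism view for the blocking or CPU-bound part of the work, and a semaphore around the suspending call that touches the scarce dependency.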

Coroutine / Flow Behavior

In Flow-heavy systems, parallel map/merge operations can start work faster than downstream consumers or dependencies can absorb it. Bounding the number of concurrently collected inner flows keeps pipelines predictable under load.
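
One way to bound a parallel map in a Flow pipeline is the concurrency parameter of flatMapMerge. The sketch below assumes that approach; enrich and enrichBounded are hypothetical names, and delay stands in for real per-element work. Depending on the library version, flatMapMerge may need an opt-in, so both annotations are listed.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.FlowPreview
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.asFlow
import kotlinx.coroutines.flow.flatMapMerge
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical per-element work; delay stands in for an I/O call.
suspend fun enrich(id: Int): String {
    delay(10)
    return "item-$id"
}

// Processes elements concurrently, but collects at most `concurrency`
// inner flows at a time. Output order is not guaranteed.
@OptIn(ExperimentalCoroutinesApi::class, FlowPreview::class)
fun Flow<Int>.enrichBounded(concurrency: Int): Flow<String> =
    flatMapMerge(concurrency) { id -> flow { emit(enrich(id)) } }

fun main(): Unit = runBlocking {
    val out = (1..10).asFlow().enrichBounded(concurrency = 3).toList()
    println(out.sorted())
}
```

Because results arrive in completion order, callers that need the original order have to carry an index through the pipeline or sort afterwards.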

Code Examples

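import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope

// At most 4 transforms execute at once; threads are borrowed from Dispatchers.Default.
val limited = Dispatchers.Default.limitedParallelism(4)

// Item and transform are assumed to be defined elsewhere in the application.
suspend fun process(items: List<Item>) = coroutineScope {
    items.map { item ->
        async(limited) { transform(item) }
    }.awaitAll()
}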

Common Interview Questions

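  • Q: When should you use limitedParallelism? A: When one kind of work needs a cap on concurrently executing coroutines without a dedicated thread pool: blocking calls against a limited resource, CPU-heavy transforms that should not occupy every Dispatchers.Default thread, or, with a limit of 1, serialized access to shared state. Choose the parent dispatcher by workload type first, then cap it.
  • Q: Is it equivalent to a fixed thread pool? A: No. It is a view over the parent dispatcher and owns no threads. A coroutine may resume on any of the parent's threads, so the limit bounds concurrent execution but does not pin work to particular threads. A fixed pool allocates its own threads and has to be shut down.
  • Q: How do you pick the limit value? A: Start from the capacity of the resource being protected (connection pool size, the remote API's rate limit, or core count for CPU-bound work), choose a conservative value at or below it, and adjust based on contention and latency measurements under realistic load.
  • Q: What metrics validate the chosen bound? A: Tail latency (p95/p99), queue depth or time spent waiting for a slot, throughput, and the utilization and error rate of the protected dependency. A good bound keeps tail latency stable as load rises without leaving the dependency idle.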

Production Considerations

  • start conservative and tune with metrics
  • separate limits per bottlenecked dependency
  • revisit limits as workload profile changes
  • avoid one global limit for unrelated workloads
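
A per-dependency setup might look like the following sketch. The object name, the budget values (8 and 2), and the loadRow and render functions are all hypothetical, chosen only to show separate views for unrelated workloads; real values should come from measurement.

```kotlin
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical budgets: one view per bottlenecked dependency, so a burst of
// rendering work cannot consume the slots reserved for database calls.
object ConcurrencyBudgets {
    val database: CoroutineDispatcher = Dispatchers.IO.limitedParallelism(8)
    val rendering: CoroutineDispatcher = Dispatchers.Default.limitedParallelism(2)
}

// Blocking call, so the view's limit bounds in-flight queries.
suspend fun loadRow(id: Int): String = withContext(ConcurrencyBudgets.database) {
    Thread.sleep(5) // stand-in for a blocking database call
    "row-$id"
}

// CPU-bound step on its own, smaller budget.
suspend fun render(row: String): String = withContext(ConcurrencyBudgets.rendering) {
    row.uppercase()
}

fun main(): Unit = runBlocking {
    println(render(loadRow(7)))
}
```

Keeping the budgets in one place makes them easy to revisit when the workload profile changes.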

Performance Insights

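High parallelism can improve average throughput but hurt tail latency once tasks contend for the same threads, locks, or connections. Bounded scheduling often stabilizes responsiveness and keeps a saturated dependency from being overwhelmed by queued work.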

Senior-Level Insights

Senior- and staff-level discussions should include concurrency budgets and SLO alignment: parallelism choices should be tied to reliability targets and measurements, not guesswork.