Parallelism and Scheduling Deep Dive
Overview
Parallelism is a capacity control decision, not just a performance trick. In coroutine systems, limiting parallelism often improves stability and p99 latency.
Core Concepts
- logical concurrency vs actual parallel execution
- limitedParallelism constrains concurrent tasks
- fairness and throughput tradeoffs
- contention-aware scheduling
Internal Implementation
limitedParallelism creates a constrained dispatcher view that throttles active
work while still using the parent dispatcher's infrastructure.
This provides a lightweight control boundary without introducing a new executor.
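The cap described above can be observed directly. The sketch below is illustrative only: the function name maxObservedParallelism and the use of Thread.sleep as a stand-in for CPU-bound work are assumptions, not part of the original material. It launches more tasks than the limit allows and records the highest number seen running at once.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: runs `tasks` blocking jobs on a view of Dispatchers.Default
// capped at `limit` and returns the highest number observed running at the same time.
@OptIn(ExperimentalCoroutinesApi::class) // limitedParallelism is marked experimental in some releases
fun maxObservedParallelism(limit: Int, tasks: Int): Int = runBlocking {
    // The view reuses the threads of Dispatchers.Default; no new pool is created.
    val view = Dispatchers.Default.limitedParallelism(limit)
    val running = AtomicInteger(0)
    val peak = AtomicInteger(0)
    coroutineScope {
        repeat(tasks) {
            launch(view) {
                val now = running.incrementAndGet()
                peak.updateAndGet { maxOf(it, now) }
                Thread.sleep(10) // blocking work stands in for CPU-bound code
                running.decrementAndGet()
            }
        }
    }
    // coroutineScope has waited for every task, so the peak is final here.
    peak.get()
}
```

Calling maxObservedParallelism(limit = 2, tasks = 20) should return a value no greater than 2, even though 20 tasks were submitted.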
Threading Model
Limiting parallelism helps protect scarce resources:
- DB connection pools
- remote APIs with rate limits
- CPU-heavy transforms
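It does not eliminate thread contention; it reduces pressure on shared pools. The cap applies to coroutines that are actively running on a thread: a coroutine that suspends frees its slot, so the limit protects these resources mainly when the work blocks the thread or is CPU-bound.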
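One subtlety when protecting a dependency: a limited dispatcher view bounds work that is running on a thread, not calls that are suspended and still in flight. The following sketch contrasts the two cases under stated assumptions. FakeRemote, peakWithLimitedView, and peakWithSemaphore are hypothetical names for illustration, and the Semaphore from kotlinx.coroutines.sync is shown as one possible way to cap in-flight suspending calls, not as the only one.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a suspending remote call; tracks calls in flight.
class FakeRemote {
    private val inFlight = AtomicInteger(0)
    val peak = AtomicInteger(0)

    suspend fun call() {
        val now = inFlight.incrementAndGet()
        peak.updateAndGet { maxOf(it, now) }
        delay(50) // suspends: the dispatcher slot is released while waiting
        inFlight.decrementAndGet()
    }
}

// Peak in-flight calls when only the dispatcher view is limited.
@OptIn(ExperimentalCoroutinesApi::class)
fun peakWithLimitedView(calls: Int): Int = runBlocking {
    val remote = FakeRemote()
    val view = Dispatchers.Default.limitedParallelism(1)
    coroutineScope { repeat(calls) { launch(view) { remote.call() } } }
    remote.peak.get()
}

// Peak in-flight calls when a Semaphore guards the call itself.
fun peakWithSemaphore(calls: Int, permits: Int): Int = runBlocking {
    val remote = FakeRemote()
    val gate = Semaphore(permits)
    coroutineScope { repeat(calls) { launch { gate.withPermit { remote.call() } } } }
    remote.peak.get()
}
```

With a view limited to 1, several calls are typically in flight at once because each one suspends in delay and hands its slot to the next. With the semaphore, the peak never exceeds the number of permits.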
Coroutine / Flow Behavior
In Flow-heavy systems, parallel map/merge operations can overwhelm downstream consumers. Applying bounded concurrency keeps pipelines predictable under load.
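As a minimal sketch of bounded concurrency in a Flow pipeline, the example below uses the concurrency parameter of flatMapMerge to cap how many inner flows are collected at once. The enrich and enrichAll functions are hypothetical stand-ins for a slow per-item step, and the result order is not guaranteed to match the input order.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Hypothetical per-item step standing in for a slow downstream call.
suspend fun enrich(id: Int): String {
    delay(10)
    return "item-$id"
}

// Processes ids with at most `concurrency` enrich() calls in flight at once.
// flatMapMerge requires an opt-in; the annotation name differs between releases.
@OptIn(ExperimentalCoroutinesApi::class, FlowPreview::class)
suspend fun enrichAll(ids: List<Int>, concurrency: Int): List<String> =
    ids.asFlow()
        .flatMapMerge(concurrency) { id -> flow { emit(enrich(id)) } }
        .toList() // results arrive in completion order, not input order
```

Raising the concurrency argument trades downstream pressure for throughput, so it is a natural place to apply a measured budget rather than the library default.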
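Code Examples
    import kotlinx.coroutines.*

    // A view of Dispatchers.Default that runs at most 4 tasks in parallel.
    val limited = Dispatchers.Default.limitedParallelism(4)

    // Item and transform are assumed to be defined elsewhere.
    suspend fun process(items: List<Item>) = coroutineScope {
        items.map { item ->
            async(limited) { transform(item) }
        }.awaitAll()
    }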
Common Interview Questions
- Q: When should you use limitedParallelism? A: When a shared resource needs a concurrency budget, such as a DB connection pool, a rate-limited API, or CPU-heavy transforms. Lead with correctness, then throughput: choose the dispatcher by workload type, keep critical sections small, cap parallelism, and monitor tail latency and queue depth.
- Q: Is it equivalent to a fixed thread pool? A: No. It is a constrained view over the parent dispatcher that throttles active work without introducing a new executor, so it does not own threads the way a fixed pool does.
- Q: How do you pick the limit value? A: Answer with correctness first and throughput second: start conservative, base the bound on the capacity of the bottlenecked dependency, and tune it with contention and latency measurements.
- Q: What metrics validate the chosen bound? A: Tail latency (for example p99) and queue depth, together with contention measurements, compared before and after the change.
Production Considerations
- start conservative and tune with metrics
- separate limits per bottlenecked dependency
- revisit limits as workload profile changes
- avoid one global limit for unrelated workloads
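The points about separate limits can be sketched as one limited view per bottlenecked dependency. Everything named here (Budgets, loadRow, checksum, loadAndChecksum, and the values 8 and 2) is hypothetical and chosen only for illustration; real limits should come from measurement.

```kotlin
import kotlinx.coroutines.*

// Hypothetical budgets: one view per dependency instead of one global limit.
@OptIn(ExperimentalCoroutinesApi::class)
object Budgets {
    val database = Dispatchers.IO.limitedParallelism(8)           // blocking DB calls
    val cpuTransforms = Dispatchers.Default.limitedParallelism(2) // CPU-heavy work
}

// Stand-in for a blocking database read.
suspend fun loadRow(id: Int): String = withContext(Budgets.database) {
    Thread.sleep(5) // simulates a blocking driver call
    "row-$id"
}

// Stand-in for a CPU-heavy transform.
suspend fun checksum(row: String): Int = withContext(Budgets.cpuTransforms) {
    row.sumOf { it.code }
}

// Each stage is throttled by its own budget, so a slow database
// cannot consume the slots reserved for CPU work, and vice versa.
suspend fun loadAndChecksum(ids: List<Int>): List<Int> = coroutineScope {
    ids.map { id -> async { checksum(loadRow(id)) } }.awaitAll()
}
```

Keeping the budgets in one place also makes it easier to revisit them as the workload profile changes.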
Performance Insights
High parallelism can improve average throughput but hurt tail latency. Bounded scheduling often stabilizes responsiveness and reduces queue collapse.
Senior-Level Insights
Staff-level discussions should include concurrency budgets and SLO alignment: parallelism choices should be tied to reliability targets, not guesswork.