
Cloudflare Identifies Query Planning Bottleneck in ClickHouse



DevTools infoq.com
Summary

Cloudflare traced a billing pipeline slowdown to lock contention in ClickHouse's query planning stage, where 45% of CPU time was spent in the filterPartsByPartition function waiting on a single mutex. The team patched ClickHouse by replacing an exclusive lock with a shared lock, removing per-query copies of the parts list, and improving part filtering, cutting query durations by 50% and decoupling latency from part count growth. The root cause emerged after migrating to a per-tenant partitioning scheme that increased data parts without changing query access patterns.
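The first two changes described above, a shared lock in place of an exclusive one and no per-query copy of the parts list, can be illustrated with a small model. This is only a sketch in Python, not ClickHouse's C++ code: `SharedLock`, `PartsCatalog`, `Part`, and `parts_for` are hypothetical names invented for the illustration, and the simple filter here stands in for whatever `filterPartsByPartition` actually does.

```python
import threading
from contextlib import contextmanager
from typing import NamedTuple


class Part(NamedTuple):
    """Hypothetical stand-in for a data part: a name plus its partition key."""
    name: str
    partition: str


class SharedLock:
    """Minimal readers-writer lock (reader-preferring, so writers can starve).

    Many readers may hold it at once; a writer needs exclusive access.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    @contextmanager
    def shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


class PartsCatalog:
    """Assumed model of a table's parts list, read by many concurrent queries."""

    def __init__(self):
        self._lock = SharedLock()
        self._parts = ()  # immutable tuple, replaced wholesale on every write

    def add_part(self, part):
        # Writers are rare: build a new tuple and swap it in under the exclusive lock.
        with self._lock.exclusive():
            self._parts = self._parts + (part,)

    def parts_for(self, partition):
        # Query planning only reads, so a shared lock is enough, and the
        # immutable tuple is taken by reference instead of being copied.
        with self._lock.shared():
            snapshot = self._parts
        # Filtering runs outside the lock on the snapshot.
        return [p for p in snapshot if p.partition == partition]
```

In this model, concurrent calls to `parts_for` never block one another, and the lock is held only long enough to grab a reference. The filter is still a linear scan, so it does not model the third change (improved part filtering), which the summary does not describe in detail.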

Author

Renato Losio
