
Why Data Locality is Important
TL;DR
Data locality depends on your sorting key. Sequential access is 5-10x faster than random; with cache and compression, you get 40x faster queries. The right sorting key enables data skipping and improves compression.
What is Data Locality?
Data locality refers to how data is physically organized on disk. In ClickHouse, this is controlled by your sorting key (the ORDER BY clause of a MergeTree table), which determines the order in which rows are stored within each data part.
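To make the effect of physical ordering concrete, here is a minimal Python sketch of data skipping. It is a simplified model, not ClickHouse's actual implementation: it assumes rows are grouped into granules of 8192 rows (ClickHouse's default `index_granularity`) and that a reader can skip any granule whose minimum and maximum key exclude the value being searched. The data set, the `granules_to_read` helper, and the user ID `42` are all hypothetical.

```python
import random

GRANULE_SIZE = 8192  # rows per granule (ClickHouse's default index_granularity)

def granules_to_read(keys, target, granule_size=GRANULE_SIZE):
    """Count granules whose [min, max] key range could contain `target`.

    Simplified model: a granule is skipped only when its key range
    excludes the target value.
    """
    count = 0
    for start in range(0, len(keys), granule_size):
        block = keys[start:start + granule_size]
        if min(block) <= target <= max(block):
            count += 1
    return count

random.seed(0)

# Hypothetical table: 1,000 user IDs with 1,000 rows each.
keys = [user_id for user_id in range(1000) for _ in range(1000)]

sorted_keys = sorted(keys)   # rows stored in sorting-key order
shuffled_keys = keys[:]      # rows stored in arbitrary order
random.shuffle(shuffled_keys)

total_granules = -(-len(keys) // GRANULE_SIZE)  # ceiling division
sorted_hits = granules_to_read(sorted_keys, 42)
shuffled_hits = granules_to_read(shuffled_keys, 42)

print(f"total granules:     {total_granules}")
print(f"read when sorted:   {sorted_hits}")
print(f"read when shuffled: {shuffled_hits}")
```

When the rows are stored in key order, all rows for one user sit in a single contiguous range, so almost every granule can be skipped. When the same rows are scattered, nearly every granule might contain a match and has to be read.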
Sequential access is an order of magnitude faster than random access. Combined with caching and compression, that advantage grows further and translates into much faster queries.
Your sorting key controls data locality. Sequential access is 5-10x faster than random; with cache and compression, you get 40x faster queries.
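The link between ordering and compression can be checked with a short Python sketch. This is an illustration under stated assumptions rather than a benchmark: `zlib` stands in for the codecs ClickHouse uses (such as LZ4 or ZSTD), the column is a hypothetical set of 200,000 random integers, and `compressed_size` is a helper introduced only for this example.

```python
import random
import struct
import zlib

random.seed(0)

# Hypothetical low-cardinality column: 200,000 values drawn from 1,000 distinct IDs.
values = [random.randrange(1000) for _ in range(200_000)]

def compressed_size(column):
    """Pack the column as little-endian 32-bit integers and return its zlib-compressed size in bytes."""
    raw = struct.pack(f"<{len(column)}I", *column)
    return len(zlib.compress(raw))

unsorted_size = compressed_size(values)
sorted_size = compressed_size(sorted(values))

print(f"compressed size, arbitrary order: {unsorted_size} bytes")
print(f"compressed size, sorted order:    {sorted_size} bytes")
```

Sorting places equal values next to each other, which gives the compressor long repeated runs to work with. Smaller compressed data means fewer bytes to read from disk for the same query.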