Why Data Locality is Important

TL;DR

Data locality depends on your sorting key. Sequential access is 5-10x faster than random; with cache and compression, you get 40x faster queries. The right sorting key enables data skipping and improves compression.

What is Data Locality?

Data locality refers to how data is physically organized on disk. In ClickHouse, this is controlled entirely by your sorting key (ORDER BY clause).
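To see why the physical order matters for data skipping, here is a minimal sketch in Python. It is a simplified model, not ClickHouse's actual index: it assumes rows are stored in blocks ("granules") of 8192 rows, keeps only the minimum and maximum key per granule, and counts how many granules a lookup for a single key would have to read. The `granules_to_read` helper and the user and event counts are hypothetical, chosen only for illustration.

```python
import random

GRANULE_ROWS = 8192  # assumed block size; matches ClickHouse's default index_granularity


def granules_to_read(keys, target, granule_rows=GRANULE_ROWS):
    """Count granules whose [min, max] key range could contain `target`."""
    hits = 0
    for start in range(0, len(keys), granule_rows):
        block = keys[start:start + granule_rows]
        if min(block) <= target <= max(block):
            hits += 1
    return hits


random.seed(42)

# Hypothetical data: 500 users with 1,000 events each, stored in arrival order.
keys = [user_id for user_id in range(500) for _ in range(1000)]
random.shuffle(keys)

total_granules = -(-len(keys) // GRANULE_ROWS)  # ceiling division
unsorted_hits = granules_to_read(keys, target=250)
sorted_hits = granules_to_read(sorted(keys), target=250)

print(f"total granules:            {total_granules}")
print(f"read when stored unsorted: {unsorted_hits}")
print(f"read when sorted by key:   {sorted_hits}")
```

When the rows are sorted by the lookup key, all events for one user sit next to each other, so almost every granule can be skipped. In arrival order, nearly every granule contains a mix of users and has to be read.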

Sequential access is roughly an order of magnitude faster than random access. Combined with caching and compression, the advantage grows further, to around 40x faster queries.
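The compression side of this can be illustrated with a small Python sketch. It is an assumption-laden stand-in rather than a benchmark: it uses `zlib` from the standard library instead of ClickHouse's own codecs, and a hypothetical low-cardinality column of 100 distinct values. It compresses the same values twice, once in shuffled order and once sorted.

```python
import random
import zlib

random.seed(7)

# Hypothetical low-cardinality column: 100 distinct values, 2,000 rows each.
column = [value for value in range(100) for _ in range(2000)]
random.shuffle(column)

# Each value fits in one byte, so bytes() gives a compact raw representation.
shuffled_size = len(zlib.compress(bytes(column)))
sorted_size = len(zlib.compress(bytes(sorted(column))))
ratio = shuffled_size / sorted_size

print(f"raw size:            {len(column)} bytes")
print(f"compressed shuffled: {shuffled_size} bytes")
print(f"compressed sorted:   {sorted_size} bytes")
print(f"sorted is {ratio:.0f}x smaller")
```

Sorting places equal values next to each other, which gives the compressor long runs to work with. Less data on disk also means fewer bytes to read per query, which is part of why compression and sequential access reinforce each other.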


The Sorting Key Experiment

Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.
