确认projection生效: https://clickhouse.com/docs/knowledgebase/projection_example

ReadFromPreparedSource (for_column2) 可以看出是走了projection

clickhouse-23.12 :) explain indexes = 1 SELECT count() from table1_projections where column2 > 999 and column1 > 1;
┌─explain─────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│   Aggregating                               │
│     Filter                                  │
│       ReadFromPreparedSource (for_column2)  │
└─────────────────────────────────────────────┘

Using projections to speedup queries in Clickhouse

This is important as this means we have to add all columns to projection that are used in the query. This is different from row-based databases indexes, which referrence entire rows. But columnar database doesn’t even know what row is, that’s why projection != index. Projection is just some part of the original table data stored in a different form, which is better suitable for certain queries.

需要在投影中把所有用到的列都加进去,这和行数据库有很大不同。列数据库并不知道什么是行,因此投影不等于索引。投影只是原始表的数据以一种不同的方式存储了。

增加投影会影响数据库写入性能,因此增加投影需要慎重考虑,应该仅对调用频繁的查询构建投影。

测试数据库:

create table test (a UInt32, b UInt32, c UInt32, d UInt32)
Engine = MergeTree() order by (a, b)

insert into test select number, number + rand() % 100, number - rand() % 100, rand() / 1000
from numbers(1, 100000)


alter table test add projection c_d (select c, d order by c,d)
alter table test materialize projection c_d


# used projection
explain indexes=1 SELECT count() from test where c < 10
explain indexes=1 SELECT count() from test where c = 10

# no projection, why?
explain indexes=1 SELECT count() from test where c > 10