Speed up set index by adding comparison with min-max values #64134

Open
nickitat opened this issue May 20, 2024 · 1 comment
Labels
easy task Good for first contributors performance

nickitat commented May 20, 2024

BTW, I once noticed that a set index is 4 times slower than a minmax index on the same data.

set max_threads=1;

create table ind_minmax(A Int64, B Int64, index x1 B type minmax granularity 1) 
Engine MergeTree order by A as select number, cityHash64(number)%111111=1 from numbers(1e8);
optimize table ind_minmax final;

create table ind_set(A Int64, B Int64, index x1 B type set(2) granularity 1) 
Engine MergeTree order by A as select number, cityHash64(number)%111111=1 from numbers(1e8);
optimize table ind_set final;

select count() from ind_minmax where B = 1;
Elapsed: 0.015 sec. Processed 7.26 million rows

select count() from ind_set where B = 1;
Elapsed: 0.060 sec. Processed 7.26 million rows

drop table ind_set;
drop table ind_minmax;

0.015 sec vs 0.060 sec

This is because we check the predicate against all set elements. It is a known sub-optimality (the set index lacks min-max values).
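
To make the idea concrete, here is a rough sketch of the proposed short-circuit, written in Python rather than in ClickHouse's own C++. It is not the actual index code: `SetGranule`, `build`, and `may_match_equals` are hypothetical names, and it only covers an equality predicate on integers. It assumes each granule's set also keeps the min and max of its elements, so that most granules can be rejected with two comparisons before any per-element work.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class SetGranule:
    """Hypothetical per-granule index entry: distinct values plus their bounds."""
    values: Tuple[int, ...]
    min_value: int
    max_value: int

    @classmethod
    def build(cls, column_values: Iterable[int]) -> "SetGranule":
        # Assumption: the granule is non-empty, so min/max are well defined.
        distinct = tuple(sorted(set(column_values)))
        return cls(distinct, distinct[0], distinct[-1])


def may_match_equals(granule: SetGranule, needle: int) -> bool:
    """Return False only if no row in the granule can equal `needle`."""
    # Cheap range check first: two comparisons, independent of set size.
    if needle < granule.min_value or needle > granule.max_value:
        return False
    # Otherwise fall back to checking the predicate against every element,
    # which is the only step performed today according to the comment above.
    return any(value == needle for value in granule.values)
```

With the data from the repro, most granules contain only `B = 0`, so a lookup for `B = 1` would be rejected by the range check alone in this sketch.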

Originally posted by @nickitat in #64098 (comment)

nickitat added the performance and easy task (Good for first contributors) labels on May 20, 2024
@AntiTopQuark

Hi @nickitat, I am interested in this problem. Could you assign the issue to me?
