Improve intersect_by_rank performance #7744
I will go over this tomorrow. @joseph-isaacs I looked at #7098 and #7393, which both optimised slightly different cases of this function, and tried to combine the two.
Merging this PR will degrade performance by 23.81%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | take_search[(0.005, 0.05)] | 132 µs | 168.3 µs | -21.6% |
| ❌ | Simulation | take_search[(0.005, 0.1)] | 247.5 µs | 320.3 µs | -22.72% |
| ❌ | Simulation | take_search[(0.005, 0.5)] | 1.2 ms | 1.5 ms | -23.67% |
| ❌ | Simulation | take_search[(0.005, 1.0)] | 2.3 ms | 3.1 ms | -23.81% |
| ❌ | Simulation | take_search[(0.01, 0.05)] | 143.1 µs | 179.4 µs | -20.27% |
| ❌ | Simulation | take_search[(0.01, 0.1)] | 268.4 µs | 341.2 µs | -21.33% |
| ❌ | Simulation | take_search[(0.01, 0.5)] | 1.3 ms | 1.6 ms | -22.21% |
| ❌ | Simulation | take_search[(0.01, 1.0)] | 2.5 ms | 3.3 ms | -22.34% |
| ❌ | Simulation | take_search[(0.1, 0.05)] | 212.7 µs | 249.1 µs | -14.6% |
| ❌ | Simulation | take_search[(0.1, 0.1)] | 385.9 µs | 458.6 µs | -15.87% |
| ❌ | Simulation | take_search[(0.1, 0.5)] | 1.8 ms | 2.2 ms | -16.93% |
| ❌ | Simulation | take_search[(0.1, 1.0)] | 3.5 ms | 4.3 ms | -17.09% |
| ❌ | Simulation | take_search_chunked[(0.005, 0.05)] | 162.2 µs | 193.2 µs | -16.01% |
| ❌ | Simulation | take_search_chunked[(0.005, 0.1)] | 307.3 µs | 369.2 µs | -16.76% |
| ❌ | Simulation | take_search_chunked[(0.005, 0.5)] | 1.5 ms | 1.8 ms | -17.39% |
| ❌ | Simulation | take_search_chunked[(0.005, 1.0)] | 2.9 ms | 3.5 ms | -17.48% |
| ❌ | Simulation | take_search_chunked[(0.01, 0.05)] | 175.3 µs | 206.2 µs | -14.99% |
| ❌ | Simulation | take_search_chunked[(0.01, 0.1)] | 332.1 µs | 393.9 µs | -15.7% |
| ❌ | Simulation | take_search_chunked[(0.01, 0.5)] | 1.6 ms | 1.9 ms | -16.3% |
| ❌ | Simulation | take_search_chunked[(0.01, 1.0)] | 3.2 ms | 3.8 ms | -16.38% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Comparing rk/intersect-by-rank (8578962) with develop (823991f)
3083876 to ea3104a
    indices.iter().for_each(|&idx| buf.set(idx));
    buf
    pub fn from_indices(len: usize, indices: impl IntoIterator<Item = usize>) -> BitBufferMut {
        let mut buffer = BufferMut::<u64>::zeroed(len.div_ceil(64));
Add a TODO for a sized-iterator variant; then you can switch strategies based on the ratio of len to the number of indices.
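One way the suggestion above could look is sketched below. This is not the code in this PR: `Vec<u64>` stands in for `BufferMut<u64>`, the names `from_indices_sized` and `from_indices_dense` and the `DENSE_RATIO` cutoff are hypothetical, and the least-significant-bit-first layout is an assumption. The idea is only that an `ExactSizeIterator` exposes the number of indices up front, so the constructor can pick a strategy from the ratio of `len` to that count.

```rust
/// Hypothetical cutoff: treat the input as dense when at least 1 in 8 bits is set.
const DENSE_RATIO: usize = 8;

/// Sparse path: write each index straight into a zeroed word buffer.
/// `Vec<u64>` is a stand-in for `BufferMut<u64>`; bits are assumed LSB-first.
fn from_indices(len: usize, indices: impl IntoIterator<Item = usize>) -> Vec<u64> {
    let mut words = vec![0u64; len.div_ceil(64)];
    for idx in indices {
        assert!(idx < len, "index {} out of bounds for length {}", idx, len);
        words[idx / 64] |= 1u64 << (idx % 64);
    }
    words
}

/// Dense path: accumulate bits for the current word in a local and store it
/// only when the word changes. Correct for any index order, but it only saves
/// memory writes when the indices arrive mostly sorted.
fn from_indices_dense(len: usize, indices: impl IntoIterator<Item = usize>) -> Vec<u64> {
    let mut words = vec![0u64; len.div_ceil(64)];
    let mut current_word = 0usize;
    let mut acc = 0u64;
    for idx in indices {
        assert!(idx < len, "index {} out of bounds for length {}", idx, len);
        let word = idx / 64;
        if word != current_word {
            // Flush the bits gathered so far before moving to another word.
            words[current_word] |= acc;
            current_word = word;
            acc = 0;
        }
        acc |= 1u64 << (idx % 64);
    }
    if acc != 0 {
        words[current_word] |= acc;
    }
    words
}

/// Sized variant: the iterator reports how many indices it holds, so the
/// strategy can be chosen from the ratio of `len` to that count.
fn from_indices_sized<I>(len: usize, indices: I) -> Vec<u64>
where
    I: IntoIterator<Item = usize>,
    I::IntoIter: ExactSizeIterator,
{
    let iter = indices.into_iter();
    if iter.len().saturating_mul(DENSE_RATIO) >= len {
        from_indices_dense(len, iter)
    } else {
        from_indices(len, iter)
    }
}
```

Both paths produce the same words, so the dispatch only affects speed. Any real cutoff would need to come from benchmarks rather than the placeholder ratio used here.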
Add benchmark numbers to the description.
Signed-off-by: Robert Kruszewski <github@robertk.io>
6121376 to 4e90444
Intersect by rank is an operation for applying a filter on top of another filter.
This PR improves performance for:
Dense on Dense masks: 10-18x
Sparse on Dense masks: 8x
Sparse on Sparse masks: 2x
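To make the operation in the description concrete, here is a naive reference over plain `bool` slices. It is a sketch of the assumed semantics, not the optimised implementation in this PR: the function name `intersect_by_rank_naive` is hypothetical, and it assumes the inner mask has exactly one entry per set bit of the outer mask, with the i-th entry deciding whether the i-th set bit survives.

```rust
/// Naive reference for the assumed semantics of intersect-by-rank.
///
/// `outer` is a filter over the original rows. `inner` is a filter over the
/// rows that `outer` kept, so it has one entry per `true` in `outer`.
/// The result has the length of `outer` and marks the rows that pass both.
fn intersect_by_rank_naive(outer: &[bool], inner: &[bool]) -> Vec<bool> {
    let set_bits = outer.iter().filter(|&&bit| bit).count();
    assert_eq!(
        set_bits,
        inner.len(),
        "inner must have one entry per set bit in outer"
    );

    // `rank` counts how many set bits of `outer` have been seen so far,
    // which is the position to look up in `inner`.
    let mut rank = 0;
    outer
        .iter()
        .map(|&bit| {
            if bit {
                let keep = inner[rank];
                rank += 1;
                keep
            } else {
                false
            }
        })
        .collect()
}
```

For example, with `outer = [true, false, false, true, true, true, false, true]` and `inner = [false, false, true, false, true]`, the set bits of `outer` sit at positions 0, 3, 4, 5 and 7, and `inner` keeps the third and fifth of them, so only positions 4 and 7 remain set. A bit-by-bit loop like this is the kind of baseline that the dense and sparse fast paths listed above would be measured against.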
Signed-off-by: Robert Kruszewski <github@robertk.io>