AlexGuteniev
Contributor

There is a count optimization for vector<bool> that uses popcnt on the integer elements of the vector<bool> internal representation, originally added in #1131. There's an open PR, #5640, to enhance that optimization further.
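For readers unfamiliar with the technique, here is a minimal sketch of counting bits word by word instead of element by element. It is not the STL's implementation: the 32-bit word type, the bit layout, and the names word_t, popcount_word, and count_packed are assumptions made for illustration only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed packed layout: bit i lives in words[i / 32], at bit position i % 32.
using word_t = std::uint32_t;
constexpr std::size_t bits_per_word = 32;

// Population count of one word. std::bitset::count is used so the sketch
// also builds before C++20; std::popcount from <bit> is the C++20 equivalent.
inline std::size_t popcount_word(word_t w) {
    return std::bitset<bits_per_word>(w).count();
}

// Counts how many of the first bit_count bits are equal to value.
std::size_t count_packed(const std::vector<word_t>& words, std::size_t bit_count, bool value) {
    assert(bit_count <= words.size() * bits_per_word);
    const std::size_t full_words = bit_count / bits_per_word;
    const std::size_t tail_bits = bit_count % bits_per_word;

    std::size_t ones = 0;
    for (std::size_t i = 0; i < full_words; ++i) {
        ones += popcount_word(words[i]);
    }
    if (tail_bits != 0) {
        // Mask off the unused high bits of the last, partially filled word.
        const word_t mask = (word_t{1} << tail_bits) - 1;
        ones += popcount_word(words[full_words] & mask);
    }
    // Counting false is the complement of counting true.
    return value ? ones : bit_count - ones;
}
```

Counting false reuses the same popcount pass and subtracts from the total, so the two values to count take slightly different paths at the end.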

This PR adds a benchmark to measure the results.

I started by copying vector_bool_copy.cpp to mimic the existing style, then kept only the _aligned case, since misalignment doesn't have a significant impact here (unlike for copying). I still kept the same name (as just count matches the STL algorithm name), and added DoNotOptimize where necessary. The value to count alternates between iterations to exercise both branches without adding extra benchmarks.
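The shape of such a measurement can be sketched without the benchmark library. This is not the benchmark added by this PR: it uses std::chrono instead of Google Benchmark, a volatile sink as a stand-in for DoNotOptimize, and the names make_random_bits, case_result, and time_count, as well as the seed, are hypothetical.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: a fixed seed gives every case the same bit sequence.
std::vector<bool> make_random_bits(std::size_t size) {
    std::mt19937 gen(12345); // arbitrary seed, chosen for illustration
    std::bernoulli_distribution dist(0.5);
    std::vector<bool> bits(size);
    for (std::size_t i = 0; i < size; ++i) {
        bits[i] = dist(gen);
    }
    return bits;
}

struct case_result {
    double ns_per_call;    // average wall-clock time of one std::count call
    std::ptrdiff_t trues;  // last result of counting true
    std::ptrdiff_t falses; // last result of counting false
};

// Expects iterations >= 2 so that both values are counted at least once.
case_result time_count(std::size_t size, std::size_t iterations) {
    const std::vector<bool> bits = make_random_bits(size);
    volatile std::ptrdiff_t sink = 0; // stand-in for benchmark::DoNotOptimize
    case_result result{0.0, 0, 0};
    bool value = false;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        const std::ptrdiff_t n = std::count(bits.begin(), bits.end(), value);
        sink = n; // keep the compiler from discarding the call
        (value ? result.trues : result.falses) = n;
        value = !value; // alternate so both branches are exercised
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    result.ns_per_call = elapsed.count() / static_cast<double>(iterations);
    return result;
}
```

Calling time_count(4096, 1000000) would roughly correspond to a count_aligned/4096 case, though a hand-rolled timer like this is noisier than a real benchmark harness.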


The results for #5640 are mixed for me.

On P cores of i5-1235U I see no improvement:

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| count_aligned/64 | 17.0 ns | 17.0 ns | 1.00 |
| count_aligned/4096 | 61.5 ns | 59.6 ns | 1.03 |
| count_aligned/65536 | 718 ns | 747 ns | 0.96 |

On E cores I see an improvement, and a sizable one for such a small change:

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| count_aligned/64 | 21.3 ns | 21.6 ns | 0.99 |
| count_aligned/4096 | 114 ns | 90.6 ns | 1.26 |
| count_aligned/65536 | 1505 ns | 1092 ns | 1.38 |

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner August 19, 2025 08:15
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Aug 19, 2025
@StephanTLavavej StephanTLavavej added performance Must go faster test Related to test code labels Aug 20, 2025
@StephanTLavavej StephanTLavavej self-assigned this Aug 20, 2025
@StephanTLavavej
Member

5950X results:

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| count_aligned/64 | 18.8 ns | 18.8 ns | 1.00 |
| count_aligned/4096 | 77.7 ns | 62.8 ns | 1.24 |
| count_aligned/65536 | 903 ns | 786 ns | 1.15 |

@StephanTLavavej
Member

Thanks! 😻 I pushed minor stylistic changes.

@StephanTLavavej StephanTLavavej removed their assignment Aug 22, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Aug 22, 2025
* Predictable sequence in each benchmark case regardless of other cases
* Potentially minor perf improvement by avoiding magic static
@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Aug 25, 2025
@StephanTLavavej
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit af4f9c8 into microsoft:main Aug 25, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Aug 25, 2025
@StephanTLavavej
Member

Thanks for measuring this performance! ⏱️ 🚀 😻

@AlexGuteniev AlexGuteniev deleted the vector-bool-count-benchmark branch August 26, 2025 02:23