sterrettm2
Contributor

This patch adds support for OpenMP-parallelized kv-sort and adds a new GitHub CI run to test this logic.

Below are benchmarks, run with OpenMP limited to 8 threads. Note that for the smaller sizes (5k and 128) no parallelization occurs; they are included only to show that there is no significant regression for small arrays.

10m
Benchmark                                                                    Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------
[simdkvsort/random_10m vs. simdkvsort/random_10m]/uint64_t                -0.6970         -0.7395     226436121      68612307     226422795      58991174
[simdkvsort/random_10m vs. simdkvsort/random_10m]/int64_t                 -0.6971         -0.7428     227326975      68863301     227315014      58473350
[simdkvsort/random_10m vs. simdkvsort/random_10m]/double                  -0.6772         -0.7269     210894855      68077527     210880444      57583225
[simdkvsort/random_10m vs. simdkvsort/random_10m]/uint32_t                -0.7136         -0.7136     108392724      31043472     108379357      31040053
[simdkvsort/random_10m vs. simdkvsort/random_10m]/int32_t                 -0.7153         -0.7153     108738497      30954989     108722801      30950017
[simdkvsort/random_10m vs. simdkvsort/random_10m]/float                   -0.7140         -0.7141     118042940      33756084     118033628      33751245
OVERALL_GEOMEAN                                                           -0.7027         -0.7256             0             0             0             0
1m
Benchmark                                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------
[simdkvsort/random_1m vs. simdkvsort/random_1m]/uint64_t                -0.7640         -0.7640      17189294       4056562      17187512       4056186
[simdkvsort/random_1m vs. simdkvsort/random_1m]/int64_t                 -0.7660         -0.7661      17267621       4039818      17267319       4039588
[simdkvsort/random_1m vs. simdkvsort/random_1m]/double                  -0.7504         -0.7503      15645983       3905547      15644824       3905746
[simdkvsort/random_1m vs. simdkvsort/random_1m]/uint32_t                -0.7423         -0.7423       8079605       2082162       8078552       2081844
[simdkvsort/random_1m vs. simdkvsort/random_1m]/int32_t                 -0.7436         -0.7436       8122354       2082877       8121708       2082503
[simdkvsort/random_1m vs. simdkvsort/random_1m]/float                   -0.7564         -0.7564       9062363       2207975       9061397       2207558
OVERALL_GEOMEAN                                                         -0.7539         -0.7540             0             0             0             0
100k
Benchmark                                                                      Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------
[simdkvsort/random_100k vs. simdkvsort/random_100k]/uint64_t                -0.6382         -0.6382       1333651        482529       1333715        482505
[simdkvsort/random_100k vs. simdkvsort/random_100k]/int64_t                 -0.6425         -0.6426       1344152        480477       1344167        480425
[simdkvsort/random_100k vs. simdkvsort/random_100k]/double                  -0.5965         -0.5965       1206089        486644       1206208        486682
[simdkvsort/random_100k vs. simdkvsort/random_100k]/uint32_t                -0.6396         -0.6397        660024        237853        659967        237789
[simdkvsort/random_100k vs. simdkvsort/random_100k]/int32_t                 -0.6416         -0.6417        663403        237771        663399        237707
[simdkvsort/random_100k vs. simdkvsort/random_100k]/float                   -0.6384         -0.6385        756412        273522        756418        273460
OVERALL_GEOMEAN                                                             -0.6332         -0.6332             0             0             0             0
5k
Benchmark                                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------
[simdkvsort/random_5k vs. simdkvsort/random_5k]/uint64_t                +0.0038         +0.0038         49500         49686         49506         49695
[simdkvsort/random_5k vs. simdkvsort/random_5k]/int64_t                 -0.0052         -0.0052         49855         49594         49859         49598
[simdkvsort/random_5k vs. simdkvsort/random_5k]/double                  +0.0076         +0.0077         40759         41070         40770         41083
[simdkvsort/random_5k vs. simdkvsort/random_5k]/uint32_t                +0.0137         +0.0139         21853         22152         21859         22163
[simdkvsort/random_5k vs. simdkvsort/random_5k]/int32_t                 +0.0133         +0.0135         21870         22161         21875         22171
[simdkvsort/random_5k vs. simdkvsort/random_5k]/float                   +0.0081         +0.0083         26792         27009         26797         27019
OVERALL_GEOMEAN                                                         +0.0069         +0.0070             0             0             0             0
128
Benchmark                                                                    Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------
[simdkvsort/random_128 vs. simdkvsort/random_128]/uint64_t                +0.0078         +0.0083          1437          1448          1439          1451
[simdkvsort/random_128 vs. simdkvsort/random_128]/int64_t                 +0.0039         +0.0038          1438          1443          1441          1446
[simdkvsort/random_128 vs. simdkvsort/random_128]/double                  +0.0058         +0.0062          1297          1304          1300          1308
[simdkvsort/random_128 vs. simdkvsort/random_128]/uint32_t                +0.0074         +0.0078          1053          1061          1056          1064
[simdkvsort/random_128 vs. simdkvsort/random_128]/int32_t                 +0.0087         +0.0086          1053          1063          1055          1065
[simdkvsort/random_128 vs. simdkvsort/random_128]/float                   +0.0035         +0.0025          1135          1139          1139          1142
OVERALL_GEOMEAN                                                           +0.0062         +0.0062             0             0             0             0
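
The Time and CPU columns above are relative changes, not absolute timings. The layout matches the comparison output of Google Benchmark's compare.py tool, where each entry is (new - old) / old, so -0.6970 means the new build takes roughly 30% of the old time. Below is a minimal sketch of that arithmetic; the helper names `relative_change` and `speedup` are hypothetical and are not part of the patch.

```cpp
#include <cassert>

// Hypothetical helper: relative change as shown in the Time and CPU
// columns, computed as (new - old) / old. Negative values mean faster.
double relative_change(double old_value, double new_value)
{
    assert(old_value > 0.0); // timings are positive
    return (new_value - old_value) / old_value;
}

// Hypothetical helper: the same comparison expressed as a speedup
// factor, computed as old / new. Values above 1.0 mean faster.
double speedup(double old_value, double new_value)
{
    assert(new_value > 0.0);
    return old_value / new_value;
}
```

Applied to the random_10m uint64_t row (Time Old 226436121, Time New 68612307), `relative_change` gives about -0.6970 and `speedup` gives about 3.3x with 8 threads.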

With OpenMP disabled at compile time, there is no significant change:

OpenMP disabled:
Benchmark                                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------
[simdkvsort/random_1m vs. simdkvsort/random_1m]/uint64_t                +0.0047         +0.0046      17028177      17107557      17028398      17106417
[simdkvsort/random_1m vs. simdkvsort/random_1m]/int64_t                 -0.0026         -0.0025      17231220      17186187      17229022      17185635
[simdkvsort/random_1m vs. simdkvsort/random_1m]/double                  +0.0047         +0.0047      15536187      15608728      15535014      15607474
[simdkvsort/random_1m vs. simdkvsort/random_1m]/uint32_t                +0.0090         +0.0091       7990315       8062617       7989616       8062360
[simdkvsort/random_1m vs. simdkvsort/random_1m]/int32_t                 +0.0024         +0.0024       8022190       8041050       8021417       8040324
[simdkvsort/random_1m vs. simdkvsort/random_1m]/float                   -0.0012         -0.0012       9011472       9000477       9010707       9000246
OVERALL_GEOMEAN                                                         +0.0028         +0.0028             0             0             0             0

Benchmark                                                                    Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------
[simdkvsort/random_128 vs. simdkvsort/random_128]/uint64_t                +0.0018         +0.0023          1431          1434          1433          1437
[simdkvsort/random_128 vs. simdkvsort/random_128]/int64_t                 +0.0014         +0.0020          1434          1436          1436          1439
[simdkvsort/random_128 vs. simdkvsort/random_128]/double                  -0.0016         -0.0024          1296          1293          1299          1296
[simdkvsort/random_128 vs. simdkvsort/random_128]/uint32_t                +0.0020         +0.0021          1056          1058          1058          1060
[simdkvsort/random_128 vs. simdkvsort/random_128]/int32_t                 +0.0021         +0.0017          1056          1058          1058          1060
[simdkvsort/random_128 vs. simdkvsort/random_128]/float                   +0.0020         +0.0008          1134          1136          1137          1138
OVERALL_GEOMEAN                                                           +0.0013         +0.0011             0             0             0             0

Member

r-devulap left a comment

LGTM. Thanks!
