Skip to content

Conversation

shyla226
Copy link

This pull request introduces improvement to Euclidean similarity function in PQVectors.diversityFunctionFor. From flamegraph, it is observed that considerable amount of time is spent in jdk/incubator/vector/FloatVector.reducelanesTemplate. This is mainly because FloatVector.reducelanes() is expensive and it is being called inside a for loop (via VectorUtil.squareL2Distance). Modification in this pull request moves call to reduceLanes() outside the for loop.

Change proposed here was tested with the benchmark, PQDistanceCalculationBenchmark.diversityCalculation
With this benchmark, ~18% reduction in time was observed when M=64 and ~22% when M=192.

Code modifications:

  1. Added a new function pqDiversityEuclidean in VectorUtilSupport and its corresponding implementations
  2. Removed for loop in PQVectors.diversityFunctionFor and moved it into pqDiversityEuclidean
  3. Moved FloatVector.reducelanes() outside the for loop

Test setup:
Jvector version : main branch (as of 2025-08-28)
JDK version : openjdk version "24.0.2" 2025-07-15
Platform : INTEL(R) XEON(R) PLATINUM 8592+
Benchmark : PQDistanceCalculationBenchmark.diversityCalculation

@MarkWolters
Copy link
Contributor

This PR does successfully remove the calls to reduceLanes from the inside of the loop iterating over the subspaces the Euclidean case. I do have a concern about applying this pattern to one specific case but leaving the handling of other cases as is, leaving the code with 2 different implementations for the same problem. Some questions:

  1. Why is the optimization only implemented for the Euclidean case? In PanamaVectorUtilSupport, where the call to reduceLanes is made from the squareDistance methods which are called in a loop for the subspace count, likewise reduceLanes is called from the dotProduct methods. Why only fix one?

  2. Why is the optimization only applied to diversityFunctionFor(…) and not scoreFunctionFor(…) which uses the same pattern of calling squareL2Distance(…) inside of a loop for each subspace?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants