# Add text rerank pipeline #2436
Changes from 20 commits
**New file: C++ text rerank sample**

```cpp
// Copyright (C) 2025 Intel Corporation
// SPDX-License-Identifier: Apache-2.0

#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

#include "openvino/genai/rag/text_rerank_pipeline.hpp"

int main(int argc, char* argv[]) try {
    if (argc < 4) {
        throw std::runtime_error(std::string{"Usage: "} + argv[0] +
                                 " <MODEL_DIR> '<QUERY>' '<TEXT 1>' ['<TEXT 2>' ...]");
    }

    auto documents = std::vector<std::string>(argv + 3, argv + argc);
    std::string models_path = argv[1];
    std::string query = argv[2];

    std::string device = "CPU";  // GPU can be used as well

    ov::genai::TextRerankPipeline::Config config;
    config.top_n = 3;

    ov::genai::TextRerankPipeline pipeline(models_path, device, config);

    std::vector<std::pair<size_t, float>> rerank_result = pipeline.rerank(query, documents);

    // Print the reranked documents with their scores.
    std::cout << std::fixed << std::setprecision(4);
    std::cout << "Reranked documents:\n";
    for (const auto& [index, score] : rerank_result) {
        std::cout << "Document " << index << " (score: " << score << "): " << documents[index] << '\n';
    }
    std::cout << std::defaultfloat;
} catch (const std::exception& error) {
    try {
        std::cerr << error.what() << '\n';
    } catch (const std::ios_base::failure&) {
    }
    return EXIT_FAILURE;
} catch (...) {
    try {
        std::cerr << "Non-exception object thrown\n";
    } catch (const std::ios_base::failure&) {
    }
    return EXIT_FAILURE;
}
```
**Updated: sample README**

# Retrieval Augmented Generation Sample

This example showcases inference of Text Embedding and Text Rerank Models. The application has limited configuration options to encourage the reader to explore and modify the source code. For example, change the device for inference to GPU. The sample features `openvino_genai.TextEmbeddingPipeline` and `openvino_genai.TextRerankPipeline` and uses text as an input source.

## Download and Convert the Model and Tokenizers

…

Install [deployment-requirements.txt](../../deployment-requirements.txt) via `pip install -r ../../deployment-requirements.txt` and then run a sample:

### 1. Text Embedding Sample (`text_embeddings.py`)
- **Description:**
  Demonstrates inference of text embedding models using OpenVINO GenAI. Converts input text into vector embeddings for downstream tasks such as retrieval or semantic search.
- **Run Command:**
  ```sh
  python text_embeddings.py <MODEL_DIR> "Document 1" "Document 2"
  ```
  Refer to the [Supported Models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#text-embeddings-models) for more details.

### 2. Text Rerank Sample (`text_rerank.py`)
- **Description:**
  Demonstrates inference of text rerank models using OpenVINO GenAI. Reranks a list of candidate documents based on their relevance to a query using a cross-encoder or reranker model.
- **Run Command:**
  ```sh
  python text_rerank.py <MODEL_DIR> "<QUERY>" "<TEXT 1>" ["<TEXT 2>" ...]
  ```

# Text Embedding Pipeline Usage

```python
import openvino_genai

pipeline = openvino_genai.TextEmbeddingPipeline(model_dir, "CPU")

embeddings = pipeline.embed_documents(["document1", "document2"])
```

# Text Rerank Pipeline Usage

```python
import openvino_genai

pipeline = openvino_genai.TextRerankPipeline(model_dir, "CPU")

rerank_result = pipeline.rerank(query, documents)
```

> **Review discussion:** "I think we can consider combining …" (truncated). "There is a ContextualCompressionRetriever in langchain which wraps retriever and reranker. Base method is …" (truncated).
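`rerank` returns `(index, score)` pairs that point back into the original document list. A minimal sketch of consuming such a result, using a mocked result list in place of an actual `pipeline.rerank(query, documents)` call (the documents and scores here are illustrative, not real model output):

```python
# Mocked rerank output: (document index, relevance score) pairs,
# already sorted by score, best first. A real result would come from
# pipeline.rerank(query, documents).
documents = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Bananas are yellow.",
]
rerank_result = [(0, 0.98), (1, 0.87), (2, 0.03)]

# Reorder the original documents by relevance, keeping scores alongside.
reranked = [(documents[index], score) for index, score in rerank_result]
print(reranked[0])  # ('Paris is the capital of France.', 0.98)
```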
**New file: `text_rerank.py`**

```python
#!/usr/bin/env python3
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import argparse

import openvino_genai


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("model_dir")
    parser.add_argument("query")
    parser.add_argument("texts", nargs="+")
    args = parser.parse_args()

    device = "CPU"  # GPU can be used as well

    config = openvino_genai.TextRerankPipeline.Config()
    config.top_n = 3

    pipeline = openvino_genai.TextRerankPipeline(args.model_dir, device, config)

    rerank_result = pipeline.rerank(args.query, args.texts)

    print("Reranked documents:")
    for index, score in rerank_result:
        print(f"Document {index} (score: {score:.4f}): {args.texts[index]}")


if __name__ == "__main__":
    main()
```
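The sample's command-line interface can be exercised without OpenVINO installed by feeding `parse_args` an explicit argument list. A small sketch (argument names mirror the script above; the model path and texts are placeholders):

```python
import argparse

# Same positional arguments as text_rerank.py: model dir, query, then
# one or more candidate texts.
parser = argparse.ArgumentParser()
parser.add_argument("model_dir")
parser.add_argument("query")
parser.add_argument("texts", nargs="+")

# Simulates: python text_rerank.py ./model "what is OpenVINO?" "doc A" "doc B"
args = parser.parse_args(["./model", "what is OpenVINO?", "doc A", "doc B"])
print(args.texts)  # ['doc A', 'doc B']
```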
**New file: `TextRerankPipeline` public header**

```cpp
// Copyright (C) 2025 Intel Corporation
// SPDX-License-Identifier: Apache-2.0

#pragma once

#include <filesystem>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

#include "openvino/genai/tokenizer.hpp"

namespace ov {
namespace genai {

class OPENVINO_GENAI_EXPORTS TextRerankPipeline {
public:
    struct OPENVINO_GENAI_EXPORTS Config {
        /**
         * @brief Number of documents to return, sorted by score.
         */
        size_t top_n = 3;

        /**
         * @brief Constructs text rerank pipeline configuration.
         */
        Config() = default;

        /**
         * @brief Constructs text rerank pipeline configuration from a property map.
         *
         * @param properties configuration options
         *
         * @code
         * const ov::AnyMap properties{{"top_n", 3}};
         * ov::genai::TextRerankPipeline::Config config(properties);
         *
         * ov::genai::TextRerankPipeline::Config config({{"top_n", 3}});
         * @endcode
         */
        explicit Config(const ov::AnyMap& properties);
    };

    /**
     * @brief Constructs a pipeline from xml/bin files, tokenizer and configuration in the same dir.
     *
     * @param models_path Path to the directory containing model xml/bin files and tokenizer
     * @param device Device to run the model on
     * @param config Pipeline configuration
     * @param properties Optional plugin properties to pass to ov::Core::compile_model()
     */
    TextRerankPipeline(const std::filesystem::path& models_path,
                       const std::string& device,
                       const Config& config,
                       const ov::AnyMap& properties = {});

    /**
     * @brief Constructs a pipeline from xml/bin files, tokenizer and configuration in the same dir.
     *
     * @param models_path Path to the directory containing model xml/bin files and tokenizer
     * @param device Device to run the model on
     * @param properties Optional plugin and/or config properties
     */
    TextRerankPipeline(const std::filesystem::path& models_path,
                       const std::string& device,
                       const ov::AnyMap& properties = {});

    /**
     * @brief Constructs a pipeline from xml/bin files, tokenizer and configuration in the same dir.
     *
     * @param models_path Path to the directory containing model xml/bin files and tokenizer
     * @param device Device to run the model on
     * @param properties Plugin and/or config properties
     */
    template <typename... Properties,
              typename std::enable_if<ov::util::StringAny<Properties...>::value, bool>::type = true>
    TextRerankPipeline(const std::filesystem::path& models_path, const std::string& device, Properties&&... properties)
        : TextRerankPipeline(models_path, device, ov::AnyMap{std::forward<Properties>(properties)...}) {}

    /**
     * @brief Reranks a vector of texts based on the query.
     */
    std::vector<std::pair<size_t, float>> rerank(const std::string& query, const std::vector<std::string>& texts);

    /**
     * @brief Asynchronously reranks a vector of texts based on the query. Only one method of the async family
     * can be active at a time.
     */
    void start_rerank_async(const std::string& query, const std::vector<std::string>& texts);

    /**
     * @brief Waits for the reranked texts started by start_rerank_async.
     */
    std::vector<std::pair<size_t, float>> wait_rerank();

    ~TextRerankPipeline();

private:
    class TextRerankPipelineImpl;
    std::unique_ptr<TextRerankPipelineImpl> m_impl;
};

/**
 * @brief Number of documents to return after reranking, sorted by score.
 */
static constexpr ov::Property<size_t> top_n{"top_n"};

}  // namespace genai
}  // namespace ov
```

> **Review discussion (on `Config`):** A reviewer asked to add … (truncated); the author preferred doing it "in a separate PR after we confirm approach is working". Another reviewer noted that the upper bound for the input shape can be set if … (truncated); the author replied: "Ok, will add max_length. Could you please specify model names which are expected to fail?" A further comment: "Are you sure we need …" (truncated).
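The `top_n` option declared in the header keeps only the n best-scoring documents. An illustrative sketch of that selection logic in plain Python (not the library's actual implementation, which scores documents with a reranker model first):

```python
def select_top_n(scores, top_n=3):
    """Given per-document relevance scores, return (index, score) pairs
    for the top_n highest-scoring documents, best first."""
    indexed = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return indexed[:top_n]

print(select_top_n([0.1, 0.9, 0.4, 0.7], top_n=2))  # [(1, 0.9), (3, 0.7)]
```

With fewer documents than `top_n`, all of them are returned, still sorted by score.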
> **Review discussion (on the sample inputs):**
> *Reviewer:* I think it is more convenient to specify document files, which is closer to a practical use case.
> *Author:* Let's keep the sample simple for the moment. We could implement some file/network loaders and text splitters, but I think that should be a dedicated sample where we can showcase the full RAG flow.