2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -91,7 +91,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Remove support for 2-digit years in java datetime parser (#5596)
- Remove DocMapper trait (#5508)
- Remove support for AWS Lambda (#5884)

- Remove search stream endpoint (#5886)

# [0.8.1]

3 changes: 1 addition & 2 deletions docs/configuration/node-config.md
@@ -198,11 +198,10 @@ This section contains the configuration options for a Searcher.
| --- | --- | --- |
| `aggregation_memory_limit` | Controls the maximum amount of memory that can be used for aggregations before aborting. This limit is per searcher node. A node may run concurrent queries, which share the limit. The first query that will hit the limit will be aborted and frees its memory. It is used to prevent excessive memory usage during the aggregation phase, which can lead to performance degradation or crashes. | `500M`|
| `aggregation_bucket_limit` | Determines the maximum number of buckets returned to the client. | `65000` |
| `fast_field_cache_capacity` | Fast field in memory cache capacity on a Searcher. If your filter by dates, run aggregations, range queries, or if you use the search stream API, or even for tracing, it might worth increasing this parameter. The [metrics](../reference/metrics.md) starting by `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value. | `1G` |
| `fast_field_cache_capacity` | Fast field in memory cache capacity on a Searcher. If your filter by dates, run aggregations, range queries, or even for tracing, it might worth increasing this parameter. The [metrics](../reference/metrics.md) starting by `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value. | `1G` |
| `split_footer_cache_capacity` | Split footer in memory cache (it is essentially the hotcache) capacity on a Searcher.| `500M` |
| `partial_request_cache_capacity` | Partial request in memory cache capacity on a Searcher. Cache intermediate state for a request, possibly making subsequent requests faster. It can be disabled by setting the size to `0`. | `64M` |
| `max_num_concurrent_split_searches` | Maximum number of concurrent split search requests running on a Searcher. | `100` |
| `max_num_concurrent_split_streams` | Maximum number of concurrent split stream requests running on a Searcher. | `100` |
| `split_cache` | Searcher split cache configuration options defined in the section below. Cache disabled if unspecified. | |
| `request_timeout_secs` | The time before a search request is cancelled. This should match the timeout of the stack calling into quickwit if there is one set. | `30` |

254 changes: 0 additions & 254 deletions docs/guides/add-full-text-search-to-your-olap-db.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/overview/introduction.md
@@ -27,7 +27,6 @@ Check out our guides to see how you can use Quickwit:

- [Log management](../log-management/overview.md)
- [Distributed Tracing](../distributed-tracing/overview.md)
- Adding full-text search capabilities to [OLAP databases such as ClickHouse](../guides/add-full-text-search-to-your-olap-db).


## Key features
55 changes: 0 additions & 55 deletions docs/reference/rest-api.md
@@ -113,61 +113,6 @@ GET api/v1/stackoverflow*/search
}
```

### Search stream in an index

```
GET api/v1/<index id>/search/stream?query=searchterm&fast_field=my_id
```

Streams field values from ALL documents matching a search query in the target index `<index id>`, in a specified output format among the following:

- [CSV](https://datatracker.ietf.org/doc/html/rfc4180)
- [ClickHouse RowBinary](https://clickhouse.tech/docs/en/interfaces/formats/#rowbinary). If `partition_by_field` is set, Quickwit returns chunks of data for each partition field value. Each chunk starts with 16 bytes being partition value and content length and then the `fast_field` values in `RowBinary` format.

`fast_field` and `partition_by_field` must be fast fields of type `i64` or `u64`.

This endpoint is available as long as you have at least one node running a searcher service in the cluster.



:::note

The endpoint will return 10 million values if 10 million documents match the query. This is expected, this endpoint is made to support queries matching millions of documents and return field values in a reasonable response time.

:::

#### Path variable

| Variable | Description |
| ------------- | ------------- |
| `index id` | The index id |

#### Get parameters

| Variable | Type | Description | Default value |
|---------------------|------------|----------------------------------------------------------------------------------------------------------|----------------------------------------------------|
| `query` | `String` | Query text. See the [query language doc](query-language.md) | _required_ |
| `fast_field` | `String` | Name of a field to retrieve from documents. This field must be a fast field of type `i64` or `u64`. | _required_ |
| `search_field` | `[String]` | Fields to search on. Comma-separated list, e.g. "field1,field2" | index_config.search_settings.default_search_fields |
| `start_timestamp` | `i64` | If set, restrict search to documents with a `timestamp >= start_timestamp`. The value must be in seconds. | |
| `end_timestamp` | `i64` | If set, restrict search to documents with a `timestamp < end_timestamp`. The value must be in seconds. | |
| `partition_by_field` | `String` | If set, the endpoint returns chunks of data for each partition field value. This field must be a fast field of type `i64` or `u64`. | |
| `output_format` | `String` | Response output format. `csv` or `clickHouseRowBinary` | `csv` |

:::info
The `start_timestamp` and `end_timestamp` should be specified in seconds regardless of the timestamp field precision.
:::

#### Response

The response is an HTTP stream. Depending on the client's capability, it is an HTTP1.1 [chunked transfer encoded stream](https://en.wikipedia.org/wiki/Chunked_transfer_encoding) or an HTTP2 stream.

It returns a list of all the field values from documents matching the query. The field must be marked as "fast" in the index config for this to work.
The formatting is based on the specified output format.

On error, an "X-Stream-Error" header will be sent via the trailers channel with information about the error, and the stream will be closed via [`sender.abort()`](https://docs.rs/hyper/0.14.16/hyper/body/struct.Sender.html#method.abort).
Depending on the client, the trailer header with error details may not be shown. The error will also be logged in quickwit ("Error when streaming search results").

## Ingest API

### Ingest data into an index
12 changes: 0 additions & 12 deletions quickwit/quickwit-cli/tests/cli.rs
@@ -960,18 +960,6 @@ async fn test_all_local_index() {
let result: Value = serde_json::from_str(&query_response).unwrap();
assert_eq!(result["num_hits"], Value::Number(Number::from(2i64)));

let search_stream_response = reqwest::get(format!(
"http://127.0.0.1:{}/api/v1/{}/search/stream?query=level:info&output_format=csv&fast_field=ts",
test_env.rest_listen_port,
test_env.index_id
))
.await
.unwrap()
.text()
.await
.unwrap();
assert_eq!(search_stream_response, "72057597000000\n72057608000000\n");

let args = DeleteIndexArgs {
client_args: test_env.default_client_args(),
index_id,