feat(opensearch): Implement match highlighting #7437
Merged
Greptile Summary
Implements match highlighting for OpenSearch hybrid queries, returning snippets of matched text with search terms wrapped in highlight tags.
The PR includes a typo in the test file.
Confidence Score: 4/5
Sequence Diagram
sequenceDiagram
participant Client as Search Client
participant DQ as DocumentQuery
participant OSC as OpenSearchClient
participant OS as OpenSearch
participant ODI as OpenSearchDocumentIndex
participant IC as InferenceChunk
Client->>DQ: get_hybrid_search_query()
DQ->>DQ: _get_match_highlights_configuration()
Note over DQ: Configure highlighting with<br/>fragment_size, number_of_fragments,<br/>and highlight tags
DQ-->>Client: Query with highlight config
Client->>OSC: search(query_body)
OSC->>OS: Execute search
OS-->>OSC: Results with highlight field
Note over OSC: Extract match_highlights<br/>from hit.get("highlight")
OSC->>OSC: Create SearchHit with match_highlights
OSC-->>Client: List of SearchHit with highlights
Client->>ODI: Process search results
ODI->>ODI: _convert_retrieved_opensearch_chunk_to_inference_chunk_uncleaned()
Note over ODI: Extract content field highlights<br/>from highlights dict
ODI->>IC: Create InferenceChunkUncleaned
IC-->>ODI: Chunk with match_highlights
ODI-->>Client: Inference chunks with highlights
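The highlight configuration step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `_get_match_highlights_configuration()`: the helper names, the `content` field default, the numeric values, and the `<em>` tags (OpenSearch's default highlight tags) are all assumptions, since the diff itself is not shown here. Only the `highlight` clause keys (`fields`, `fragment_size`, `number_of_fragments`, `pre_tags`, `post_tags`) come from the OpenSearch search DSL.

```python
from typing import Any


def build_highlight_config(
    field: str = "content",          # assumed name of the chunk text field
    fragment_size: int = 100,        # placeholder: max characters per snippet
    number_of_fragments: int = 3,    # placeholder: max snippets per field
    pre_tag: str = "<em>",           # placeholder: the PR's real tags are unknown
    post_tag: str = "</em>",
) -> dict[str, Any]:
    """Return a `highlight` clause for an OpenSearch search request body."""
    return {
        "fields": {
            field: {
                "fragment_size": fragment_size,
                "number_of_fragments": number_of_fragments,
                "pre_tags": [pre_tag],
                "post_tags": [post_tag],
            }
        }
    }


def attach_highlighting(query_body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the query body with a top-level `highlight` clause."""
    return {**query_body, "highlight": build_highlight_config()}


# Hypothetical usage with a plain match query standing in for the hybrid query.
body = attach_highlighting({"query": {"match": {"content": "invoice"}}})
```

Keeping the highlight clause in its own builder means the same settings can be attached to any query shape without touching the query logic.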
evan-onyx approved these changes on Jan 15, 2026
rohoswagger pushed a commit that referenced this pull request on Jan 19, 2026
jessicasingh7 pushed a commit that referenced this pull request on Jan 21, 2026
Description
This PR implements a feature that returns which part of a chunk's content contributed to a text match.
It also bumps the OpenSearch image tag to 3.4.0 because a fix for match highlighting shipped somewhere between 3.0.0 and that release; before the fix, highlighting did not work for hybrid queries. See opensearch-project/neural-search#1215
Note that the highest version of the Python client is 3.1.0, and we happen to be on 3.0.0. What a mess lol. Anyway, from my testing things seem to work. I can't imagine that OpenSearch removed client-facing features between 3.0 and 3.4; more likely it just added things that the client may not reflect yet.
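For readers unfamiliar with how highlights come back, here is a small sketch of pulling them out of a single search hit. It assumes the standard OpenSearch response shape, where each hit may carry a `highlight` object mapping field names to lists of snippets. The function name, the `content` field, and the sample hit are illustrative assumptions rather than code from this PR.

```python
from typing import Any


def extract_match_highlights(hit: dict[str, Any], field: str = "content") -> list[str]:
    """Return highlighted snippets for `field`, or an empty list if none exist.

    Hits that matched only through the vector side of a hybrid query may have
    no `highlight` key at all, so a missing key is treated as "no highlights".
    """
    highlights = hit.get("highlight") or {}
    return list(highlights.get(field, []))


# Hypothetical hit, shaped like an entry in response["hits"]["hits"].
sample_hit = {
    "_id": "doc-1",
    "_source": {"content": "The invoice was sent on Monday."},
    "highlight": {"content": ["The <em>invoice</em> was sent on Monday."]},
}

snippets = extract_match_highlights(sample_hit)
no_snippets = extract_match_highlights({"_id": "doc-2", "_source": {}})
```

Returning an empty list instead of raising keeps downstream chunk conversion simple, since callers do not need to special-case hits without text matches.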
How Has This Been Tested?
test_opensearch_client
Summary by cubic
Adds match highlighting to OpenSearch search results so we return the exact content snippets that matched a query. Highlights use tags and work for hybrid search.
New Features
Dependencies
Written for commit e13798a. Summary will update on new commits.