fix: vertex prompt caching by evan-onyx · Pull Request #7339 · onyx-dot-app/onyx

evan-onyx · 2026-01-11T00:02:26Z

Description

We were seeing a variety of errors when users tried to user vertex models through Onyx. Supporting vertex's explicit prompt caching (not allowed to pass the tool calls or system message) will take a while and will likely be tricky to get right. Hopefully they just make it more convenient in the future.

We were previously just optimistically doing what litellm does in the first code example in their docs:
https://docs.litellm.ai/docs/providers/vertex#context-caching

but it seems pretty clear at this point we need the heavier-weight version they describe below it (making explicit calls to the provider).

How Has This Been Tested?

gemini through vertex works

Additional Options

[Optional] Override Linear Check

Summary by cubic

Disabled Vertex prompt caching to prevent errors with Gemini via Vertex. Removed cache_control injection and will add explicit caching in a future update.

Bug Fixes
- Stop transforming cacheable messages for Vertex (no cache_control added).
- Avoids conflicts with tools and system messages during caching.
Dependencies
- Marked fsevents as dev-only in package-lock.json.

^{Written for commit d1df258. Summary will update on new commits.}

cubic-dev-ai

No issues found across 2 files

greptile-apps · 2026-01-11T00:06:04Z

Greptile Overview

Greptile Summary

This PR disables Vertex AI's explicit prompt caching to fix errors users were experiencing with Gemini models. The change moves from an optimistic caching approach (adding cache_control parameters) to relying on Vertex's implicit caching mechanism.

Key Changes:

Set transform_cacheable=None in prepare_messages_for_caching() to skip cache control transformations
Added explanatory comment noting that explicit caching with tools and system messages requires a more sophisticated implementation
Updated fsevents package metadata in package-lock.json (appears unrelated to the stated purpose)

Code Quality Issues:

The _add_vertex_cache_control function (lines 83-125) is now dead code and should be removed per coding standards
Contains a typo: "mechnism" should be "mechanism"
The package-lock.json change seems accidental and unrelated to the Vertex caching fix

Functional Impact:
The change correctly addresses the immediate problem by removing cache_control parameters that were conflicting with tools and system messages. Vertex AI will now handle caching implicitly, which should resolve the errors users were seeing. The approach aligns with how the OpenAI provider handles implicit caching (no transformation needed).

Confidence Score: 4/5

This PR is safe to merge with minor cleanup needed - the core fix is sound and addresses the reported issue effectively.
The functional change (setting transform_cacheable to None) is correct and follows the pattern used by the OpenAI provider for implicit caching. The approach appropriately defers complex explicit caching to a future PR. Score of 4 (not 5) due to: (1) dead code that should be removed per coding standards, (2) a spelling typo in comments, and (3) an apparently unrelated package-lock.json change that may be accidental. These are all minor style/cleanup issues that don't affect functionality.
backend/onyx/llm/prompt_cache/providers/vertex.py should have the unused _add_vertex_cache_control function removed (lines 83-125)

Important Files Changed

File Analysis

Filename	Score	Overview
backend/onyx/llm/prompt_cache/providers/vertex.py	3/5	Disabled Vertex prompt caching by setting transform_cacheable to None. Contains unused dead code (_add_vertex_cache_control function) and a typo in comment.
web/package-lock.json	4/5	Added "dev": true to fsevents package - appears unrelated to PR's stated purpose of fixing Vertex caching.

Sequence Diagram

sequenceDiagram
    participant Client
    participant VertexProvider
    participant prepare_messages_with_cacheable_transform
    participant LLM_API as Vertex AI API

    Note over VertexProvider: Before this PR
    Client->>VertexProvider: prepare_messages_for_caching(prefix, suffix)
    VertexProvider->>prepare_messages_with_cacheable_transform: transform_cacheable=_add_vertex_cache_control
    prepare_messages_with_cacheable_transform->>prepare_messages_with_cacheable_transform: Add cache_control to messages
    prepare_messages_with_cacheable_transform-->>VertexProvider: Messages with cache_control
    VertexProvider->>LLM_API: Send messages with cache_control
    LLM_API-->>VertexProvider: Error (conflicts with tools/system messages)

    Note over VertexProvider: After this PR
    Client->>VertexProvider: prepare_messages_for_caching(prefix, suffix)
    VertexProvider->>prepare_messages_with_cacheable_transform: transform_cacheable=None
    prepare_messages_with_cacheable_transform->>prepare_messages_with_cacheable_transform: No transformation
    prepare_messages_with_cacheable_transform-->>VertexProvider: Messages unchanged
    VertexProvider->>LLM_API: Send messages without cache_control
    LLM_API-->>VertexProvider: Success (implicit caching)

greptile-apps

_{3 files reviewed, 3 comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-01-11T00:06:11Z

backend/onyx/llm/prompt_cache/providers/vertex.py

    not at the message level. This function converts string content to the array format
    and adds cache_control to the last content block in each cacheable message.
    """
+    # NOTE: unfortunately we need a much more sophisticated mechnism to support


'mechnism' is mis-spelled - should be 'mechanism'

Suggested change

# NOTE: unfortunately we need a much more sophisticated mechnism to support

# NOTE: unfortunately we need a much more sophisticated mechanism to support

Prompt To Fix With AI

This is a comment left during a code review. Path: backend/onyx/llm/prompt_cache/providers/vertex.py Line: 92:92 Comment: 'mechnism' is mis-spelled - should be 'mechanism' ```suggestion # NOTE: unfortunately we need a much more sophisticated mechanism to support ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-11T00:06:12Z

web/package-lock.json

      "version": "2.3.2",
+      "dev": true,


This change appears unrelated to the PR's stated purpose of fixing Vertex prompt caching. The addition of "dev": true to the fsevents package is likely an accidental change from your local development environment.

Consider whether this change should be:

Removed from this PR (if accidental)

Included with explanation (if intentional)

Moved to a separate PR focused on dependency management

fsevents is a macOS-specific file system event watcher that's typically an optional dependency. While marking it as dev-only may be correct, it's unclear why this change is bundled with a backend Python fix.

Prompt To Fix With AI

This is a comment left during a code review. Path: web/package-lock.json Line: 10310:10311 Comment: This change appears unrelated to the PR's stated purpose of fixing Vertex prompt caching. The addition of `"dev": true` to the fsevents package is likely an accidental change from your local development environment. Consider whether this change should be: 1. Removed from this PR (if accidental) 2. Included with explanation (if intentional) 3. Moved to a separate PR focused on dependency management fsevents is a macOS-specific file system event watcher that's typically an optional dependency. While marking it as dev-only may be correct, it's unclear why this change is bundled with a backend Python fix. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-11T00:06:15Z

Additional Comments (1)

backend/onyx/llm/prompt_cache/providers/vertex.py
This function is now unused (since transform_cacheable is set to None on line 51) and should be removed entirely. According to the custom instruction "When hardcoding a boolean variable to a constant value, remove the variable entirely and clean up all places where it's used rather than just setting it to a constant" - the same principle applies here when setting a function parameter to a constant (None).

The function was previously called via transform_cacheable=_add_vertex_cache_control but is no longer used anywhere in the codebase. Keeping dead code adds maintenance burden and can be confusing for future developers.

# Remove lines 83-125 entirely - the function is no longer used

_{Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!}

Prompt To Fix With AI

This is a comment left during a code review.
Path: backend/onyx/llm/prompt_cache/providers/vertex.py
Line: 83:125

Comment:
This function is now unused (since `transform_cacheable` is set to `None` on line 51) and should be removed entirely. According to the custom instruction "When hardcoding a boolean variable to a constant value, remove the variable entirely and clean up all places where it's used rather than just setting it to a constant" - the same principle applies here when setting a function parameter to a constant (None).

The function was previously called via `transform_cacheable=_add_vertex_cache_control` but is no longer used anywhere in the codebase. Keeping dead code adds maintenance burden and can be confusing for future developers.

```suggestion
# Remove lines 83-125 entirely - the function is no longer used
```

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

Co-authored-by: Weves <chrisweaver101@gmail.com>

fix: vertex prompt caching

95b0ca4

evan-onyx requested a review from a team as a code owner January 11, 2026 00:02

cubic-dev-ai bot reviewed Jan 11, 2026

View reviewed changes

Weves approved these changes Jan 11, 2026

View reviewed changes

Weves enabled auto-merge January 11, 2026 00:04

greptile-apps bot reviewed Jan 11, 2026

View reviewed changes

.

d1df258

Weves added this pull request to the merge queue Jan 11, 2026

Merged via the queue into main with commit 22138bb Jan 11, 2026
203 of 204 checks passed

Weves deleted the fix/vertex-prompt-caching branch January 11, 2026 00:28

jessicasingh7 pushed a commit that referenced this pull request Jan 12, 2026

fix: vertex prompt caching (#7339)

f061b22

Co-authored-by: Weves <chrisweaver101@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: vertex prompt caching#7339

fix: vertex prompt caching#7339
Weves merged 2 commits intomainfrom
fix/vertex-prompt-caching

evan-onyx commented Jan 11, 2026 •

edited by cubic-dev-ai bot

Loading

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

greptile-apps bot commented Jan 11, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Jan 11, 2026

Uh oh!

greptile-apps bot Jan 11, 2026

Uh oh!

greptile-apps bot commented Jan 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	# NOTE: unfortunately we need a much more sophisticated mechnism to support
	# NOTE: unfortunately we need a much more sophisticated mechanism to support

Conversation

evan-onyx commented Jan 11, 2026 • edited by cubic-dev-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

How Has This Been Tested?

Additional Options

Summary by cubic

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Jan 11, 2026

Greptile Overview

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 11, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 11, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Jan 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

evan-onyx commented Jan 11, 2026 •

edited by cubic-dev-ai bot

Loading