Conversation
Greptile Overview

Greptile Summary

This PR disables Vertex AI's explicit prompt caching to fix errors users were experiencing with Gemini models. The change moves from an optimistic caching approach (adding cache_control markers to messages) to relying on Vertex's implicit caching.

Confidence Score: 4/5
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant VertexProvider
    participant prepare_messages_with_cacheable_transform
    participant LLM_API as Vertex AI API
    Note over VertexProvider: Before this PR
    Client->>VertexProvider: prepare_messages_for_caching(prefix, suffix)
    VertexProvider->>prepare_messages_with_cacheable_transform: transform_cacheable=_add_vertex_cache_control
    prepare_messages_with_cacheable_transform->>prepare_messages_with_cacheable_transform: Add cache_control to messages
    prepare_messages_with_cacheable_transform-->>VertexProvider: Messages with cache_control
    VertexProvider->>LLM_API: Send messages with cache_control
    LLM_API-->>VertexProvider: Error (conflicts with tools/system messages)
    Note over VertexProvider: After this PR
    Client->>VertexProvider: prepare_messages_for_caching(prefix, suffix)
    VertexProvider->>prepare_messages_with_cacheable_transform: transform_cacheable=None
    prepare_messages_with_cacheable_transform->>prepare_messages_with_cacheable_transform: No transformation
    prepare_messages_with_cacheable_transform-->>VertexProvider: Messages unchanged
    VertexProvider->>LLM_API: Send messages without cache_control
    LLM_API-->>VertexProvider: Success (implicit caching)
```
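The mechanism in the diagram can be sketched roughly as follows. This is a hypothetical simplification based only on the names appearing in the review (`prepare_messages_with_cacheable_transform`, `_add_vertex_cache_control`); the real signatures in `backend/onyx/llm/prompt_cache/providers/vertex.py` may differ.

```python
from typing import Callable, Optional

def _add_vertex_cache_control(message: dict) -> dict:
    """Sketch of the removed behavior: convert string content to the
    array format and tag the last content block with cache_control."""
    content = message["content"]
    if isinstance(content, str):
        content = [{"type": "text", "text": content}]
    content[-1]["cache_control"] = {"type": "ephemeral"}
    return {**message, "content": content}

def prepare_messages_with_cacheable_transform(
    messages: list,
    transform_cacheable: Optional[Callable[[dict], dict]],
) -> list:
    # With transform_cacheable=None (the state after this PR),
    # messages pass through untouched and Vertex's implicit
    # caching is relied on instead.
    if transform_cacheable is None:
        return messages
    return [transform_cacheable(m) for m in messages]

msgs = [{"role": "user", "content": "hello"}]

# Before this PR: cache_control is injected into each cacheable message.
before = prepare_messages_with_cacheable_transform(msgs, _add_vertex_cache_control)
# After this PR: messages are unchanged.
after = prepare_messages_with_cacheable_transform(msgs, None)
```

The "After" path is the one Vertex accepts, since no explicit cache markers conflict with tools or system messages.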
    not at the message level. This function converts string content to the array format
    and adds cache_control to the last content block in each cacheable message.
    """
    # NOTE: unfortunately we need a much more sophisticated mechnism to support
'mechnism' is mis-spelled - should be 'mechanism'
```suggestion
# NOTE: unfortunately we need a much more sophisticated mechanism to support
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/llm/prompt_cache/providers/vertex.py
Line: 92:92
Comment:
'mechnism' is mis-spelled - should be 'mechanism'
```suggestion
# NOTE: unfortunately we need a much more sophisticated mechanism to support
```
How can I resolve this? If you propose a fix, please make it concise.

    "version": "2.3.2",
    "dev": true,
This change appears unrelated to the PR's stated purpose of fixing Vertex prompt caching. The addition of "dev": true to the fsevents package is likely an accidental change from your local development environment.
Consider whether this change should be:
- Removed from this PR (if accidental)
- Included with explanation (if intentional)
- Moved to a separate PR focused on dependency management
fsevents is a macOS-specific file system event watcher that's typically an optional dependency. While marking it as dev-only may be correct, it's unclear why this change is bundled with a backend Python fix.
Prompt To Fix With AI
This is a comment left during a code review.
Path: web/package-lock.json
Line: 10310:10311
Comment:
This change appears unrelated to the PR's stated purpose of fixing Vertex prompt caching. The addition of `"dev": true` to the fsevents package is likely an accidental change from your local development environment.
Consider whether this change should be:
1. Removed from this PR (if accidental)
2. Included with explanation (if intentional)
3. Moved to a separate PR focused on dependency management
fsevents is a macOS-specific file system event watcher that's typically an optional dependency. While marking it as dev-only may be correct, it's unclear why this change is bundled with a backend Python fix.
How can I resolve this? If you propose a fix, please make it concise.
Additional Comments (1)
The function was previously called via `transform_cacheable=_add_vertex_cache_control` but is no longer used anywhere in the codebase.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI

This is a comment left during a code review.
Path: backend/onyx/llm/prompt_cache/providers/vertex.py
Line: 83:125
Comment:
This function is now unused (since `transform_cacheable` is set to `None` on line 51) and should be removed entirely. According to the custom instruction "When hardcoding a boolean variable to a constant value, remove the variable entirely and clean up all places where it's used rather than just setting it to a constant" - the same principle applies here when setting a function parameter to a constant (None).
The function was previously called via `transform_cacheable=_add_vertex_cache_control` but is no longer used anywhere in the codebase. Keeping dead code adds maintenance burden and can be confusing for future developers.
```suggestion
# Remove lines 83-125 entirely - the function is no longer used
```
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.
Co-authored-by: Weves <chrisweaver101@gmail.com>
Description
We were seeing a variety of errors when users tried to use Vertex models through Onyx. Supporting Vertex's explicit prompt caching (which doesn't allow passing tool calls or the system message) will take a while and will likely be tricky to get right. Hopefully they make it more convenient in the future.
We were previously just optimistically doing what litellm does in the first code example in their docs:
https://docs.litellm.ai/docs/providers/vertex#context-caching
but it seems pretty clear at this point we need the heavier-weight version they describe below it (making explicit calls to the provider).
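For reference, the "optimistic" pattern from litellm's docs amounts to embedding a cache marker directly in a message's content blocks. The shape below is a hedged illustration of that convention (the exact text and model string are made up; consult the linked docs for the authoritative example):

```python
# Hedged sketch of the message shape litellm's first Vertex context-caching
# example uses: an Anthropic-style cache_control marker embedded in a content
# block. Per this PR, Vertex rejects this shape when combined with tools or
# system messages, which is why the injection was removed.
cached_messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": "Long shared context to cache...",
                "cache_control": {"type": "ephemeral"},
            }
        ],
    },
    {"role": "user", "content": "Question about the cached context"},
]

# The heavier-weight alternative the description mentions would instead make
# explicit cache-creation calls to the provider before sending messages, e.g.:
# litellm.completion(model="vertex_ai/gemini-1.5-pro", messages=cached_messages)
```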
How Has This Been Tested?
gemini through vertex works
Additional Options
Summary by cubic
Disabled Vertex prompt caching to prevent errors with Gemini via Vertex. Removed cache_control injection and will add explicit caching in a future update.
Bug Fixes
Dependencies
Written for commit d1df258. Summary will update on new commits.