Greptile Overview
Greptile Summary
Fixes Azure OpenAI tool calls in streaming responses by extracting
What changed:
Why this matters:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Azure as Azure OpenAI API
participant Parser as chunk_parser
participant Handler as Streaming Handler
participant Client as Onyx Client
Azure->>Parser: response.created event
Parser->>Handler: GenericStreamingChunk (empty)
Azure->>Parser: response.output_item.added (function_call)
Parser->>Handler: GenericStreamingChunk (with tool_use)
Azure->>Parser: response.function_call_arguments.delta
Parser->>Handler: GenericStreamingChunk (arguments delta)
Azure->>Parser: response.output_item.done
Parser->>Handler: GenericStreamingChunk (tool complete)
Note over Azure,Parser: Azure sends ALL tool calls in final event
Azure->>Parser: response.completed (with all function_calls)
Note over Parser: OLD: Just set finish_reason="tool_calls"
Note over Parser: NEW: Extract tool calls, return first one
Parser->>Handler: GenericStreamingChunk (tool_use=tool_calls[0], finish_reason="tool_calls")
Handler->>Client: Merged tool call data
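To make the streamed half of the diagram concrete, here is a minimal sketch of how the argument deltas could be accumulated per tool call. It is not the code from this PR: the function name `accumulate_tool_calls` is hypothetical, and the event shapes (plain dicts with `type`, `item`, `item_id`, and `delta` keys) are an assumption based on the event names shown above.

```python
import json
from typing import Any


def accumulate_tool_calls(events: list[dict[str, Any]]) -> dict[str, dict[str, str]]:
    """Collect the function name and argument text for each function_call item.

    Assumed event shapes (illustrative, not taken from the PR):
      response.output_item.added            -> {"item": {"id", "type", "name"}}
      response.function_call_arguments.delta -> {"item_id", "delta"}
    """
    calls: dict[str, dict[str, str]] = {}
    for event in events:
        kind = event.get("type")
        if kind == "response.output_item.added":
            item = event.get("item", {})
            if item.get("type") == "function_call":
                calls[item["id"]] = {"name": item.get("name", ""), "arguments": ""}
        elif kind == "response.function_call_arguments.delta":
            call = calls.get(event.get("item_id", ""))
            if call is not None:
                # Argument JSON arrives in fragments; concatenate in order.
                call["arguments"] += event.get("delta", "")
    return calls


# Hypothetical event stream mirroring the diagram above.
sample_events = [
    {"type": "response.created"},
    {
        "type": "response.output_item.added",
        "item": {"id": "fc_1", "type": "function_call", "name": "search"},
    },
    {"type": "response.function_call_arguments.delta", "item_id": "fc_1", "delta": '{"query": '},
    {"type": "response.function_call_arguments.delta", "item_id": "fc_1", "delta": '"onyx"}'},
    {"type": "response.output_item.done"},
]

streamed_calls = accumulate_tool_calls(sample_events)
parsed_arguments = json.loads(streamed_calls["fc_1"]["arguments"])
```

The arguments only become valid JSON once every delta has been appended, which is why the sketch parses them after the loop rather than per event.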
if tool_calls:
    # Return first tool call - the streaming handler will merge them
    return GenericStreamingChunk(
        text="",
        tool_use=tool_calls[0],
        is_finished=True,
        finish_reason="tool_calls",
        usage=None,
    )
Only returning the first tool call when multiple exist. Verify that either: (1) Azure sends each tool call in separate response.output_item.added events before response.completed, or (2) the streaming handler properly merges tool calls even when only the first is returned here.
Path: backend/onyx/llm/litellm_singleton/monkey_patches.py
Line: 515:523
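One way to check the concern raised in this comment is to look at what `tool_calls[0]` would drop when the final event carries more than one function call. The sketch below is an illustration only: `extract_function_calls` is a hypothetical helper, and both the `response.completed` payload shape (`response.output` as a list of items) and the output dict layout are assumptions, not the patched code.

```python
from typing import Any


def extract_function_calls(completed_event: dict[str, Any]) -> list[dict[str, Any]]:
    """Return every function_call item from a response.completed event.

    Assumes the event looks like {"response": {"output": [item, ...]}} and
    that function-call items carry "call_id", "name", and "arguments".
    """
    output = completed_event.get("response", {}).get("output", [])
    calls: list[dict[str, Any]] = []
    for item in output:
        if item.get("type") != "function_call":
            continue  # skip text messages and other output items
        calls.append(
            {
                "id": item.get("call_id"),
                "type": "function",
                "function": {
                    "name": item.get("name"),
                    "arguments": item.get("arguments", ""),
                },
                # Position among tool calls, so a consumer can tell them apart.
                "index": len(calls),
            }
        )
    return calls


# Hypothetical final event containing two tool calls and one text message.
completed = {
    "type": "response.completed",
    "response": {
        "output": [
            {"type": "message", "content": []},
            {"type": "function_call", "call_id": "call_1", "name": "search", "arguments": "{}"},
            {"type": "function_call", "call_id": "call_2", "name": "get_time", "arguments": "{}"},
        ]
    },
}

all_calls = extract_function_calls(completed)
first_only = all_calls[0]   # what a "return the first" strategy emits
dropped = all_calls[1:]     # what must arrive some other way
```

If `dropped` is non-empty for a real Azure response and those calls were not already delivered through earlier `response.output_item.added` events, returning only the first call would lose them. A unit test built on a fixture like `completed` would settle which of the two cases in the comment holds.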
justin-tahara left a comment
If you can validate the Greptile comment, and ideally add a simple unit test, that would be great.
Approving to unblock
Description
Azure OpenAI does not return tool calls as streamed objects; they arrive only in the final response.completed object. This change now extracts them correctly.
How Has This Been Tested?
Works locally, where previously it didn't.
Summary by cubic
Fixes Azure OpenAI tool calls in streaming by extracting function_call items from response.completed and emitting delta.tool_calls with a tool_calls finish reason. Tracks streamed tool calls to avoid duplicates and correctly signal completion, so tools execute reliably with the Azure Responses API.
Written for commit f3a82ad. Summary will update on new commits.
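The duplicate tracking described in this summary could look roughly like the sketch below. It is a guess at the idea, not the commit's implementation: `ToolCallTracker`, its method names, and the `"stop"` fallback finish reason are all hypothetical, and tool calls are represented as plain dicts keyed by an `"id"` field.

```python
from typing import Any


class ToolCallTracker:
    """Remember which tool call ids were already streamed (hypothetical helper)."""

    def __init__(self) -> None:
        self.streamed_ids: set[str] = set()

    def mark_streamed(self, call_id: str) -> None:
        # Called when a tool call was emitted from an earlier streaming event.
        self.streamed_ids.add(call_id)

    def unseen(self, completed_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
        # Only calls that never appeared in the stream still need to be emitted.
        return [c for c in completed_calls if c["id"] not in self.streamed_ids]

    def finish_reason(self, completed_calls: list[dict[str, Any]]) -> str:
        # Signal tool execution if any tool call was streamed or found at the end.
        if self.streamed_ids or completed_calls:
            return "tool_calls"
        return "stop"


tracker = ToolCallTracker()
tracker.mark_streamed("call_1")  # already delivered via a streaming event

final_calls = [
    {"id": "call_1", "function": {"name": "search", "arguments": "{}"}},
    {"id": "call_2", "function": {"name": "get_time", "arguments": "{}"}},
]

to_emit = tracker.unseen(final_calls)
reason = tracker.finish_reason(final_calls)
```

Keying on the call id keeps the check independent of the order in which Azure lists the calls in the final event.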