feat(posthog): track message origin analytics in posthog #7313
rohoswagger merged 5 commits into main
Greptile Overview
Greptile Summary
Overview
This PR adds message origin analytics tracking to PostHog across all chat entry points (webapp, Chrome extension, API, Slack). The implementation includes:
What Changed
- Backend Models: Added `MessageOrigin` enum with 5 values (webapp, chrome_extension, api, slackbot, unknown)
- Request Models: Added `origin` field to both `SendMessageRequest` and `CreateChatMessageRequest`
- Analytics Tracking: Integrated PostHog telemetry in `process_message.py` with fallback for unavailable telemetry module
- Frontend Origin Detection: Detects whether requests come from webapp or Chrome extension using `getExtensionContext()`
- API Endpoint Updates: Set explicit origins (API, SLACKBOT) at relevant entry points
Critical Issue Found
Frontend Origin Not Propagated to Backend: In useChatController.ts, the origin (chrome_extension or webapp) is correctly determined and set in the local CurrentMessageFIFO stack (line 696), but this origin is never actually passed to the backend. The origin remains stuck in client-side state and isn't included in the sendMessage() API call parameters. This means:
- All frontend messages (both webapp and extension) will arrive at the backend with the default origin (WEBAPP)
- The Chrome extension origin tracking will fail silently
- Analytics will show all frontend messages as "webapp" regardless of actual source
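The failure mode described above can be reproduced with a small sketch. This is not the PR's code: `parse_origin` and `build_payload` are hypothetical stand-ins for the backend request parsing and the frontend `sendMessage()` call, and the only facts taken from the review are the five enum values and the WEBAPP default.

```python
from enum import Enum
from typing import Any, Dict, Optional


class MessageOrigin(str, Enum):
    # The five values listed in the review summary.
    WEBAPP = "webapp"
    CHROME_EXTENSION = "chrome_extension"
    API = "api"
    SLACKBOT = "slackbot"
    UNKNOWN = "unknown"


def parse_origin(payload: Dict[str, Any]) -> MessageOrigin:
    """Hypothetical backend parsing: a missing field falls back to WEBAPP."""
    return MessageOrigin(payload.get("origin", MessageOrigin.WEBAPP.value))


def build_payload(
    message: str, origin: Optional[MessageOrigin] = None
) -> Dict[str, Any]:
    """Hypothetical client call: origin is only sent when it is passed in."""
    payload: Dict[str, Any] = {"message": message}
    if origin is not None:
        payload["origin"] = origin.value
    return payload


# The extension detected its origin locally but never put it in the request,
# so the backend silently records the default.
lost = parse_origin(build_payload("hello from the extension"))

# Passing the detected origin through the call preserves it.
kept = parse_origin(
    build_payload("hello from the extension", MessageOrigin.CHROME_EXTENSION)
)
```

Under these assumptions, `lost` is `MessageOrigin.WEBAPP` and `kept` is `MessageOrigin.CHROME_EXTENSION`, which is why the gap produces wrong data without raising any error.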
Secondary Issues
- Inconsistent Default Origins: `SendMessageRequest` defaults to WEBAPP while `CreateChatMessageRequest` defaults to UNKNOWN, causing unpredictable behavior
- No Origin Override in API Endpoints: The OneShotQARequest doesn't accept an origin parameter, so hardcoding API origin may mask other use cases
- Type Duplication: `MessageOrigin` type is duplicated between Python and TypeScript, risking sync issues
- Missing Type Hints: PostHog properties dict lacks explicit typing for better IDE support and correctness checking
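For the missing type hints item, one possible shape is a `TypedDict` for the event properties. This is a sketch only: the property names come from the sequence diagram in this review, while `UserMessageSentProps`, `build_props`, and its parameters are hypothetical and do not reflect the actual code in `process_message.py`.

```python
from typing import Optional, TypedDict


class UserMessageSentProps(TypedDict):
    # Property names as listed for the "user_message_sent" event.
    origin: str
    has_files: bool
    has_project: bool
    has_persona: bool
    deep_research: bool
    tenant_id: Optional[str]


def build_props(
    origin: str,
    file_count: int,
    project_id: Optional[int],
    persona_id: Optional[int],
    deep_research: bool,
    tenant_id: Optional[str],
) -> UserMessageSentProps:
    """Hypothetical helper: derive boolean flags from request fields."""
    return {
        "origin": origin,
        "has_files": file_count > 0,
        "has_project": project_id is not None,
        "has_persona": persona_id is not None,
        "deep_research": deep_research,
        "tenant_id": tenant_id,
    }


props = build_props("api", 0, None, 3, False, "tenant-1")
```

With an explicit type, a static checker can flag a misspelled key or a non-boolean flag before the event is ever sent.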
Risk Assessment
- High Risk: The frontend origin detection is completely non-functional due to missing parameter propagation. Users relying on extension origin tracking will get incorrect data
- Medium Risk: Analytics accuracy depends on internal API endpoints properly setting the origin, which is only partially done
- Low Risk: Fallback mechanism for PostHog module is well-designed and shouldn't cause runtime errors
The feature is incomplete and would send inaccurate telemetry if merged as-is.
Confidence Score: 2/5
- NOT SAFE TO MERGE - Critical functionality gap: frontend origin detection is implemented but never sent to backend, causing silent failure of Chrome extension origin tracking.
- Score reflects a critical logic bug where the frontend correctly detects the message origin (webapp vs extension) but fails to propagate this information to the backend API call. This means the entire Chrome extension origin tracking feature is non-functional. Additionally, inconsistent default values between request models and lack of origin parameter support in some API endpoints create further correctness issues. The PostHog telemetry fallback is well-designed (using fetch_versioned_implementation_with_fallback), which prevents runtime errors, but the data being sent will be inaccurate.
- web/src/app/chat/hooks/useChatController.ts (critical), backend/onyx/server/query_and_chat/models.py (defaults), backend/onyx/server/query_and_chat/query_backend.py (hardcoded origin)
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| backend/onyx/server/query_and_chat/models.py | 4/5 | Added MessageOrigin enum with 5 values (webapp, chrome_extension, api, slackbot, unknown). Added origin field to SendMessageRequest (default: WEBAPP) and CreateChatMessageRequest (default: UNKNOWN). Inconsistent defaults between the two models could cause analytics confusion. |
| backend/onyx/chat/process_message.py | 4/5 | Added PostHog event_telemetry tracking with origin and other analytics properties. Uses fetch_versioned_implementation_with_fallback with noop_fallback for safe loading. The properties dict lacks explicit type hints; the pattern is otherwise sound. |
| backend/onyx/server/query_and_chat/query_backend.py | 3/5 | Sets origin=MessageOrigin.API in get_answer_stream. However, hardcodes the origin without checking if OneShotQARequest might have its own origin field, which could cause the caller's intent to be ignored. |
| web/src/app/chat/hooks/useChatController.ts | 2/5 | Determines messageOrigin correctly (chrome_extension vs webapp) and sets it in CurrentMessageFIFO stack. However, this origin is never actually passed through to the sendMessage API call - the origin sits in the local stack but isn't included in the actual HTTP request parameters. |
Sequence Diagram
sequenceDiagram
participant Frontend as Frontend (useChatController)
participant API as Backend API<br/>(send-chat-message)
participant Chat as Chat Processor<br/>(handle_stream_message_objects)
participant PostHog as PostHog Analytics
Frontend->>Frontend: getExtensionContext()<br/>Determine: webapp vs chrome_extension
Frontend->>Frontend: Set origin in CurrentMessageFIFO
Note over Frontend: ⚠️ ISSUE: origin not passed<br/>to sendMessage() params
Frontend->>API: sendMessage(params)<br/>origin field MISSING!
API->>Chat: SendMessageRequest<br/>with origin (defaults to WEBAPP)
Chat->>Chat: Extract origin from request
Chat->>PostHog: event_telemetry("user_message_sent")<br/>properties: {origin, has_files,<br/>has_project, has_persona,<br/>deep_research, tenant_id}
PostHog-->>Chat: Success
Chat->>Chat: Continue message processing
Chat-->>API: Stream response
API-->>Frontend: Streamed packets
@greptile
Greptile Overview
Greptile Summary
This PR adds comprehensive message origin tracking across all chat message entry points (webapp, Chrome extension, API, Slack) and emits telemetry events to PostHog for analytics.
What Changed
Backend:
- Added `MessageOrigin` enum with values: `webapp`, `chrome_extension`, `api`, `slackbot`, `unknown`
- Extended `SendMessageRequest` and `CreateChatMessageRequest` models with an `origin` field (defaults to `UNKNOWN`)
- Modified `process_message.py` to emit `user_message_sent` PostHog events with origin and contextual properties (has_files, has_project, has_persona, deep_research, tenant_id)
- Updated all backend entry points to set appropriate origin values:
  - API endpoints in `ee/chat_backend.py` and `query_backend.py` set `origin=API`
  - Slack handler sets `origin=SLACKBOT`
  - Main chat endpoint trusts client-provided origin value
Frontend:
- Added `MessageOrigin` TypeScript type matching backend enum
- Created `getExtensionContext()` utility to detect extension usage based on URL pathname
- Modified `useChatController.ts` to determine origin (webapp vs chrome_extension) and pass it when sending messages
- Added frontend PostHog tracking for extension queries with `extension_chat_query` event
- Used safe telemetry loading pattern with `fetch_versioned_implementation_with_fallback` and noop fallback
How It Fits
The changes integrate cleanly with existing telemetry infrastructure (mt_cloud_telemetry, PostHog client). The origin field threads through the entire message flow from entry point → request models → processing → telemetry emission. The implementation follows established patterns for versioned feature loading and falls back gracefully when telemetry is unavailable.
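The graceful fallback mentioned above can be illustrated with a generic loader. This is a minimal sketch of the idea, not the signature of `fetch_versioned_implementation_with_fallback`; `load_with_fallback` and the module names passed to it are assumptions made for illustration.

```python
import importlib
from typing import Any, Callable


def noop_fallback(*args: Any, **kwargs: Any) -> None:
    """Accept any arguments and do nothing, so callers never need a guard."""
    return None


def load_with_fallback(
    module: str, attribute: str, fallback: Callable[..., Any]
) -> Callable[..., Any]:
    """Hypothetical loader: return module.attribute, or the fallback if absent."""
    try:
        return getattr(importlib.import_module(module), attribute)
    except (ImportError, AttributeError):
        return fallback


# A module that does not exist resolves to the no-op, so emitting an event
# becomes a harmless call instead of a runtime error.
event_telemetry = load_with_fallback(
    "no_such_telemetry_module", "event_telemetry", noop_fallback
)
result = event_telemetry("user-1", "user_message_sent", {"origin": "webapp"})
```

The call site stays identical whether or not the telemetry module is installed, which is what keeps tracking non-blocking.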
Confidence Score: 5/5
- Safe to merge - no breaking changes, proper fallback handling, non-blocking telemetry tracking
- The implementation is well-structured with proper error handling (noop fallback for telemetry), type safety (enum on backend, union type on frontend), and follows existing codebase patterns. The only issue is a non-critical data integrity concern where the main web endpoint trusts client-provided origin values, but this doesn't break functionality.
- No files require special attention - the single non-blocking issue flagged in chat_backend.py is a data integrity improvement, not a functional bug
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| backend/onyx/server/query_and_chat/models.py | 5/5 | Adds MessageOrigin enum and origin field to SendMessageRequest and CreateChatMessageRequest models with UNKNOWN default |
| backend/onyx/chat/process_message.py | 5/5 | Adds PostHog telemetry tracking for user_message_sent event with origin, files, project, persona, and deep_research properties; properly propagates origin through message flow |
| backend/onyx/chat/chat_utils.py | 5/5 | Adds optional origin parameter to prepare_chat_message_request, defaulting to UNKNOWN if not provided |
| backend/ee/onyx/server/query_and_chat/chat_backend.py | 5/5 | Sets origin=MessageOrigin.API for both simplified chat endpoints, correctly identifying API-sourced messages |
| backend/onyx/server/query_and_chat/query_backend.py | 5/5 | Sets origin=MessageOrigin.API for one-shot Q&A endpoint with helpful comment explaining the choice |
| backend/onyx/onyxbot/slack/handlers/handle_regular_answer.py | 5/5 | Sets origin=MessageOrigin.SLACKBOT when preparing chat messages from Slack |
| web/src/app/chat/services/lib.tsx | 5/5 | Adds MessageOrigin type and origin parameter to sendMessage function, defaults to 'unknown' with comment about explicit setting |
| web/src/app/chat/hooks/useChatController.ts | 5/5 | Determines origin via getExtensionContext, passes to sendMessage, and tracks extension_chat_query event in PostHog for extension users |
| web/src/lib/extension/utils.ts | 5/5 | Adds getExtensionContext function to detect extension usage based on pathname (/chat/nrf paths) |
Sequence Diagram
sequenceDiagram
participant User
participant Frontend
participant ExtensionUtils
participant SendMessage
participant Backend
participant ProcessMessage
participant PostHog
User->>Frontend: Sends chat message
Frontend->>ExtensionUtils: getExtensionContext()
ExtensionUtils-->>Frontend: {isExtension, context}
Frontend->>Frontend: Determine origin<br/>(webapp or chrome_extension)
Frontend->>SendMessage: sendMessage(message, origin)
SendMessage->>Backend: POST /api/chat/send-chat-message<br/>SendMessageRequest{origin}
Backend->>ProcessMessage: handle_stream_message_objects(new_msg_req)
ProcessMessage->>ProcessMessage: Translate to CreateChatMessageRequest<br/>with origin field
ProcessMessage->>PostHog: event_telemetry("user_message_sent",<br/>{origin, has_files, has_project, etc})
PostHog-->>ProcessMessage: (async tracking)
ProcessMessage-->>Backend: Stream packets
Backend-->>Frontend: SSE stream
alt Extension user
Frontend->>PostHog: capture("extension_chat_query",<br/>{extension_context, assistant_id, etc})
end
Frontend-->>User: Display response
Additional Comments (1)
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/server/query_and_chat/chat_backend.py
Line: 510:516
Comment:
[P2] The `/send-chat-message` endpoint trusts the client-provided `origin` field without validation. API clients calling this endpoint could set `origin="webapp"` or `origin="chrome_extension"`, polluting telemetry data. Consider determining origin server-side based on authentication method or request characteristics (e.g., API key vs web session), or restrict this endpoint to web UI only and direct API clients to use dedicated API endpoints like those in `ee/chat_backend.py`.
How can I resolve this? If you propose a fix, please make it concise.
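One way to act on the comment above is to derive the origin server-side from how the request was authenticated. The sketch below assumes a hypothetical `AuthMethod` enum and `resolve_origin` helper; neither exists in the PR, and the real endpoint may expose authentication details differently.

```python
from enum import Enum


class MessageOrigin(str, Enum):
    WEBAPP = "webapp"
    CHROME_EXTENSION = "chrome_extension"
    API = "api"
    SLACKBOT = "slackbot"
    UNKNOWN = "unknown"


class AuthMethod(Enum):
    # Hypothetical: how the caller authenticated to the endpoint.
    WEB_SESSION = "web_session"
    API_KEY = "api_key"


# Origins a browser session is allowed to claim for itself.
_BROWSER_ORIGINS = {MessageOrigin.WEBAPP, MessageOrigin.CHROME_EXTENSION}


def resolve_origin(
    client_origin: MessageOrigin, auth_method: AuthMethod
) -> MessageOrigin:
    """Override the client-provided origin when it contradicts the auth method."""
    if auth_method is AuthMethod.API_KEY:
        # API-key callers are always recorded as API, whatever they claim.
        return MessageOrigin.API
    if client_origin in _BROWSER_ORIGINS:
        return client_origin
    # A web session claiming api/slackbot/unknown is not trusted.
    return MessageOrigin.UNKNOWN
```

For example, `resolve_origin(MessageOrigin.WEBAPP, AuthMethod.API_KEY)` yields `MessageOrigin.API`, so an API client cannot label its traffic as web app usage.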
1 issue found across 34 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="web/tailwind-themes/tailwind.config.js">
<violation number="1">
P3: The comment contradicts the code - it says the plugin "is not needed here" but then immediately adds it. Consider rewording to clarify that the plugin IS needed for Tailwind v3 and will become unnecessary after upgrading to v4.</violation>
</file>
    const { forcedToolIds } = useForcedTools();
    const { fetchProjects, setCurrentMessageFiles, beginUpload } =
      useProjectsContext();
    const posthog = usePostHog();
Does it make sense to only do this if we know it's the chrome extension (since we only emit for chrome ext)? I.e. could there be unnecessary performance impact in web app?
TBH, I'm really not sure about this, so relying on you to decide :)
idt the performance impact would be measurable, our chat is not super quick anyways LOL
Description
add message origin analytics in posthog for all chat messages
How Has This Been Tested?
in posthog
Summary by cubic
Track the origin of every chat message (web app, Chrome extension, API, Slack) and send analytics to PostHog. Improves visibility into usage across surfaces without changing user behavior.
Written for commit 3ef78fe. Summary will update on new commits.