Truncate long chat sessions #190986
Replies: 2 comments
💬 Your Product Feedback Has Been Submitted 🎉

Thank you for taking the time to share your insights with us! Your feedback is invaluable as we build a better GitHub experience for all our users. Here's what you can expect moving forward ⏩

**Where to look to see what's shipping 👀**

**What you can do in the meantime 💻**

As a member of the GitHub community, your participation is essential. While we can't promise that every suggestion will be implemented, we want to emphasize that your feedback is instrumental in guiding our decisions and priorities. Thank you once again for your contribution to making GitHub even better! We're grateful for your ongoing support and collaboration in shaping the future of our platform. ⭐
---
Hi @OaenHed, this is a known limitation of LLM-based chat systems, and I understand how frustrating it is to lose context mid-session.

**Why This Happens**

Copilot Chat (and similar AI assistants) uses a *context window*: a limited number of tokens (roughly 8k to 128k, depending on the model) that the model can "see" at once. Currently, when a session exceeds this limit, it throws an error rather than automatically dropping older messages to make room for new ones. You're correct that chat hosts don't actively use the earliest messages after a certain point, but the system still keeps them in the context buffer until it overflows.

**Current Workarounds**

While the product doesn't yet support automatic truncation (a sketch of what such truncation could look like follows the list), here are some ways to mitigate the problem:

1. **Manual Truncation.** When a session gets long, start a fresh one and paste in a short summary of the decisions and code so far, so the new session inherits the essential context without the full history.
2. **Use `@workspace` References.** Instead of letting context build up naturally, explicitly reference the relevant files with `@workspace`, so that only the material you actually need enters the window.
3. **Break Complex Tasks into Steps.** Treat each step as a smaller, self-contained conversation; shorter sessions are far less likely to hit the token limit.
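To make the requested behaviour concrete, here's a minimal sketch of what client-side truncation could look like. This is illustrative only, not Copilot's actual implementation: it assumes an OpenAI-style list of message dicts and uses the `tiktoken` library to estimate per-message token counts.

```python
# Minimal sketch only -- not Copilot's implementation.
# Assumes OpenAI-style messages: [{"role": ..., "content": ...}, ...]
# and tiktoken (pip install tiktoken) for token counting.
import tiktoken

def truncate_history(messages, max_tokens=8000):
    """Drop the oldest non-system messages until the history fits.

    The system prompt is preserved: dropping it would change the
    assistant's behaviour, not just its memory of the conversation.
    """
    enc = tiktoken.get_encoding("cl100k_base")
    cost = lambda m: len(enc.encode(m["content"]))

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(cost(m) for m in system)
    kept = []
    # Walk backwards from the newest message, keeping as much
    # recent context as fits in the remaining budget.
    for m in reversed(rest):
        c = cost(m)
        if c > budget:
            break
        kept.append(m)
        budget -= c
    kept.reverse()
    return system + kept
```

Truncating at message boundaries like this avoids handing the model a half-sentence; the same idea can also be expressed as dropping a fixed percentage of the oldest tokens, which is what the original request asks for.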
**How to Get This Feature Added**

Since this is a product feature request, the best way to get it implemented is to file a feature request through the Visual Studio feedback channel and encourage others who hit the same limit to upvote it. Product teams prioritize features based on user demand, so the more visibility this gets, the more likely it is to be addressed.

Hope this helps explain the situation and gives you some useful workarounds in the meantime. Let me know if you'd like help crafting a feature request issue!
---
🏷️ Discussion Type
Bug
💬 Feature/Topic Area
Visual Studio
Body
After any meaningfully long session with a chat host, there inevitably comes a point when an exception is thrown due to too many tokens, forcing users to start a new session and lose the built-up context. This behaviour makes little sense, since chat hosts don't (can't?) actively look at the beginning of the session anyway. This isn't a guess; I've tested it multiple times: anything written at the start of a longish session cannot be directly retrieved, even when explicit instructions are given up front that this retrieval will be requested.
Would it be possible to simply truncate the top x% of the tokens when the limit is reached, to allow the session to continue?
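For illustration, I mean something like the following (a hypothetical sketch; the function and its parameters are made up, not any real chat-host API):

```python
# Hypothetical sketch of "truncate the top x% of tokens" -- the
# function and its parameters are illustrative, not a real API.
def truncate_top_percent(token_ids, limit, drop_fraction=0.25):
    """If the prompt exceeds `limit` tokens, silently drop the
    oldest tokens instead of raising an exception."""
    if len(token_ids) <= limit:
        return token_ids
    # Drop at least the overflow, rounded up to drop_fraction of
    # the history so the session isn't re-truncated every message.
    cut = max(len(token_ids) - limit,
              int(len(token_ids) * drop_fraction))
    return token_ids[cut:]
```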