Mark user message as required #520

Open · wants to merge 2 commits into main

Conversation

@alxmiron (Contributor) commented Apr 2, 2023

I found a conflict with maxResponseTokens @transitive-bullshit

Problem 1

On one hand, buildMessages() now has logic to limit the prompt length, because we include "history messages" and need to know when to stop adding them. On the other hand, the prompt length determines the length of the response, because their total sum must not exceed maxModelTokens. So we get the following:

  1. I don't send parentMessageIds in my chatbot, so my users send short prompts but get responses of only 1000 tokens, because that's the default value
  2. I set maxResponseTokens: 4096 (the maximum allowed value) and expect maximum-length responses
  3. But the current logic filters out the user prompt, so we send just the systemMessage to ChatGPT, which makes no sense and is confusing to the user
  4. I can reduce maxResponseTokens, but I don't know by how much, because I want to use the full capacity of 4096 tokens but don't know the user prompt length (the sketch after this list makes the arithmetic concrete)
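
To make the arithmetic concrete, here is a rough illustration of the conflict; the option names mirror the ChatGPTAPI options, the 4096-token window is gpt-3.5-turbo's limit, and the filtering behaviour is simplified:

```ts
// Illustrative only: how a large maxResponseTokens eats the prompt budget.
const maxModelTokens = 4096     // total context window
const maxResponseTokens = 4096  // what I set, hoping for maximum-length answers

// budget that the prompt/history filtering has to fit into
const maxPromptTokens = maxModelTokens - maxResponseTokens // = 0

// Even a tiny user prompt no longer "fits", so it gets filtered out and only
// the systemMessage is sent. With the default maxResponseTokens of 1000, short
// prompts fit, but responses are capped at 1000 tokens even when most of the
// window goes unused.
```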

Solution

If the user passes a prompt and it exceeds the limits, we should throw an exception instead of skipping the userPrompt and sending only the systemMessage. So let's mark systemMessage and userPrompt as "required" messages, which means that when they are passed they take prioritized token space, and if they exceed the limits we throw an exception.

Now the flow looks like this:

  1. Validate systemMessage and userPrompt against maxModelTokens - 1, ignoring maxResponseTokens, where 1 is the minimum space for a response. If the user enters a prompt of maxModelTokens - 1 tokens, they will get a 1-token response. If they enter a prompt bigger than maxModelTokens - 1, an exception is thrown
  2. Validate historyPrompts against maxModelTokens - maxResponseTokens
  3. Send a calculated size for the response, based on numTokens and maxResponseTokens (a sketch of this flow follows the list)
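
A minimal sketch of that flow, assuming a single combined token budget; the function and variable names below are illustrative, not the exact ones from this diff:

```ts
function buildPromptBudget(opts: {
  maxModelTokens: number     // e.g. 4096 for gpt-3.5-turbo
  maxResponseTokens: number  // caller's desired response budget
  requiredTokens: number     // tokens used by systemMessage + userPrompt
  historyTokens: number      // tokens added by history messages
}) {
  const { maxModelTokens, maxResponseTokens, requiredTokens, historyTokens } = opts

  // 1. Required messages are validated against maxModelTokens - 1,
  //    ignoring maxResponseTokens; 1 is the minimum space for a response.
  if (requiredTokens > maxModelTokens - 1) {
    throw new Error(
      `Prompt is too long: ${requiredTokens} tokens, max ${maxModelTokens - 1}`
    )
  }

  // 2. History messages only count if they fit within
  //    maxModelTokens - maxResponseTokens.
  const historyFits =
    requiredTokens + historyTokens <= maxModelTokens - maxResponseTokens

  // 3. The response budget sent to the API is whatever space is left,
  //    capped by maxResponseTokens.
  const numTokens = requiredTokens + (historyFits ? historyTokens : 0)
  const responseBudget = Math.min(maxResponseTokens, maxModelTokens - numTokens)

  return { historyFits, responseBudget }
}
```

For example, with maxModelTokens = 4096 and a required prompt of 4095 tokens, step 1 passes and the response budget collapses to 1 token, matching point 1 above.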

break
}

messages = nextMessages
numTokens = nextNumTokensEstimate

if (!isValidPrompt) {
@alxmiron (Contributor, Author) commented Apr 2, 2023

This condition almost never triggered, because we previously had if (prompt && !isValidPrompt) { break } above. I removed that check.

@@ -400,21 +400,29 @@ export class ChatGPTAPI {
}
}, [] as string[])
.join('\n\n')
@alxmiron (Contributor, Author) commented:

Problem 2

I found that our calculation of the prompt length in tokens has a mistake.
It's a -6 token diff when we have 1 systemMessage in the list (our estimate is 6 tokens lower than the real count from OpenAI)
It's a -8 token diff when we have 2 messages in the list, regardless of the text in the userPrompt
For other lengths the diff is bigger
But _getTokenCount() works correctly, because it calculates the length of the response the same way OpenAI does.
It's something in our prompt string building. We need to check it (not in this PR); a per-message counting sketch is below.
Why do we add the Instructions: and User: prefixes there? I didn't find any docs about that
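
For reference, OpenAI's cookbook counts chat tokens per message rather than over a flattened prompt string, which could explain fixed per-message offsets. A rough sketch; the constants are those published for gpt-3.5-turbo-0301 and may change with model versions, and encode stands in for whatever BPE tokenizer the project already uses:

```ts
type ChatMessage = { role: string; content: string; name?: string }

function numTokensFromMessages(
  messages: ChatMessage[],
  encode: (text: string) => number[]
): number {
  let numTokens = 0
  for (const message of messages) {
    // every message is framed as <|start|>{role}\n{content}<|end|>\n
    numTokens += 4
    numTokens += encode(message.role).length
    numTokens += encode(message.content).length
    if (message.name) {
      // when a name is supplied, the role is omitted, netting out to -1
      numTokens += encode(message.name).length - 1
    }
  }
  // every reply is primed with <|start|>assistant
  numTokens += 3
  return numTokens
}
```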

@alxmiron changed the title from "Mark user message as required. Fix used tokens estimation" to "Mark user message as required" on Apr 2, 2023