Suggested functionality: Estimate by model_type #1

Open
Somerandomguy10111 opened this issue Oct 19, 2023 · 1 comment

Comments

@Somerandomguy10111

First off: great tool. It saved me the headache of trying to trace the function tokens myself.
A final touch could be to introduce a token estimator class (tokenizer class?) that takes the model type as an attribute and uses the tiktoken.encoding_for_model() function to retrieve the encoding.

That way, if OpenAI ever changes the encoding or uses a different encoding for newer models, the package can stay up to date.
On a side note, I think the following functions are also useful, e.g. to prevent logging of huge inputs to the model:

def get_string_tokens(self, the_str: str) -> int:
    # Number of tokens the string encodes to.
    return len(self.encode(the_str))


def get_limited_string(self, the_str: str, max_tokens: int) -> str:
    # Keep only the first max_tokens tokens, then decode back to text.
    encoded_str = self.encode(the_str)
    return self.decode(encoded_str[:max_tokens])
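
To make the suggestion concrete, here is a rough sketch of what such a class might look like. The class name `TokenEstimator`, the `model_type` attribute, and the optional `encoding` parameter are hypothetical names chosen for illustration, not part of this package. The only tiktoken call assumed is `tiktoken.encoding_for_model()`, which raises a `KeyError` for model names it cannot map.

```python
from typing import Any, List, Optional


class TokenEstimator:
    """Hypothetical wrapper that picks the encoding from the model type."""

    def __init__(self, model_type: str, encoding: Optional[Any] = None) -> None:
        self.model_type = model_type
        if encoding is None:
            # Imported lazily so the class can also be used with a custom
            # encoding object when tiktoken is not installed.
            import tiktoken

            # Raises KeyError if tiktoken does not know the model name.
            encoding = tiktoken.encoding_for_model(model_type)
        self._encoding = encoding

    def encode(self, the_str: str) -> List[int]:
        return self._encoding.encode(the_str)

    def decode(self, tokens: List[int]) -> str:
        return self._encoding.decode(tokens)

    def get_string_tokens(self, the_str: str) -> int:
        # Number of tokens the string encodes to.
        return len(self.encode(the_str))

    def get_limited_string(self, the_str: str, max_tokens: int) -> str:
        # Assumption of this sketch: a negative limit is treated as an error
        # rather than slicing from the end of the token list.
        if max_tokens < 0:
            raise ValueError("max_tokens must be non-negative")
        encoded_str = self.encode(the_str)
        return self.decode(encoded_str[:max_tokens])
```

Usage would be along the lines of `TokenEstimator("gpt-4").get_limited_string(prompt, 200)` before logging. Cutting at a token boundary can split a multi-byte character, so the last character of the truncated string may not match the original.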

Best
Somerandomguy10111

@Somerandomguy10111
Author

If I get around to it, I will implement this and open a pull request myself.
