Generate history conversation filenames in Chinese properly. #1150

Merged

Conversation

Steve235lab
Contributor

@Steve235lab Steve235lab commented Mar 28, 2024

Describe the changes you have made:

Filenames for conversation history have long been generated incorrectly for Chinese and other languages that do not separate words with spaces. If a user's first request is in such a language, the history JSON file is almost always named something like __March_28_2024_19-59-01.json. The cause is the old way of generating the first part of the filename: self.messages[0]["content"][:25].split(" ")[:-1]. This produces an empty prefix when the first 25 characters of the user's first input contain no " " (space). I made a small patch to fix this. Now, when the user's first input is in Chinese, the history file is named like 这是一句中文__March_28_2024_19-59-01.json.
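For readers who want to reproduce the behavior, here is a minimal sketch of the problem and of one possible fallback. Only the expression `[:25].split(" ")[:-1]` and the two example filenames come from the description above; the function names, the `"_"` join, the timestamp format string, and the fallback logic are assumptions for illustration and may differ from the merged patch.

```python
from datetime import datetime


def old_prefix(first_message: str) -> str:
    # Old behavior: take 25 characters, split on spaces, and drop the last
    # (possibly truncated) word. Joining with "_" is an assumption.
    return "_".join(first_message[:25].split(" ")[:-1])


def patched_prefix(first_message: str) -> str:
    # Hypothetical fix: if there is no space to split on, keep the
    # truncated text itself instead of dropping everything.
    head = first_message[:25]
    words = head.split(" ")
    if len(words) >= 2:
        return "_".join(words[:-1])
    return head


def build_filename(first_message: str, when: datetime) -> str:
    # The timestamp format is inferred from the example filenames above.
    stamp = when.strftime("%B_%d_%Y_%H-%M-%S")
    return f"{patched_prefix(first_message)}__{stamp}.json"


if __name__ == "__main__":
    moment = datetime(2024, 3, 28, 19, 59, 1)
    print(repr(old_prefix("这是一句中文")))      # empty prefix, the reported bug
    print(build_filename("这是一句中文", moment))
```

With space-separated input both functions behave the same, so English prompts would keep their existing filenames under this sketch.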

Reference any relevant issues (e.g. "Fixes #000"):

Pre-Submission Checklist (optional but appreciated):

  • I have included relevant documentation updates (stored in /docs)
  • I have read docs/CONTRIBUTING.md
  • I have read docs/ROADMAP.md

OS Tests (optional but appreciated):

  • Tested on Windows
  • Tested on MacOS
  • Tested on Linux

… support languages like Chinese without blank between words.
@Steve235lab
Contributor Author

Would it be a good idea to have the LLM summarize the first turn of the conversation and use that summary as the filename automatically?

@KillianLucas
Collaborator

Great catch @Steve235lab. I think having the LLM summarize the first turn is tough because it uses an LLM call, which folks should be super aware of. Let's think about it in the future if we move into a more advanced UI like in @Notnaton's PR: #976

@KillianLucas KillianLucas merged commit b9571ac into OpenInterpreter:main Apr 5, 2024
0 of 2 checks passed
@Steve235lab
Contributor Author

@KillianLucas OK, I see. What if we make it an optional, configurable feature that is turned off by default? I may implement this later. It won't be a huge improvement anyway: a filename cannot be long enough to hold much information, because different file systems support different maximum filename lengths. There is a solution, though: use a UUID or hash value as the filename of each conversation JSON file, and add one more file that stores a key-value mapping from UUID to conversation metadata. That way we can store more detailed information about past conversations.
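A rough sketch of the UUID-plus-index idea, purely for discussion. Nothing here exists in the project: the function name `save_conversation`, the index filename `index.json`, and the metadata fields `title` and `created` are all hypothetical.

```python
import json
import uuid
from datetime import datetime
from pathlib import Path


def save_conversation(directory: Path, messages: list, index_name: str = "index.json") -> str:
    """Store messages under a UUID filename and record metadata in an index file."""
    directory.mkdir(parents=True, exist_ok=True)

    # The filename is an opaque ID, so its length never depends on user input.
    conversation_id = uuid.uuid4().hex
    conversation_path = directory / f"{conversation_id}.json"
    conversation_path.write_text(
        json.dumps(messages, ensure_ascii=False), encoding="utf-8"
    )

    # The index maps each ID to metadata (hypothetical fields).
    index_path = directory / index_name
    if index_path.exists():
        index = json.loads(index_path.read_text(encoding="utf-8"))
    else:
        index = {}
    index[conversation_id] = {
        "title": messages[0]["content"][:100],
        "created": datetime.now().isoformat(timespec="seconds"),
    }
    index_path.write_text(
        json.dumps(index, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return conversation_id
```

Because the title lives inside the index rather than in the filename, it can hold any language and any reasonable length without running into file system limits. The trade-off is that the index becomes a second file to keep in sync with the conversation files.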


2 participants