
Exotic codestyle #7

Open
Kelin2025 opened this issue Apr 25, 2023 · 3 comments

Comments

@Kelin2025
In my game, I wrote a library that lets me build logic by composing objects, so I can customize it and subscribe to any step.
Then I wrote higher-level presets and operators on top of it to describe characters' skills and perks.
So the code mostly looks like this:

Actions
Perks
How presets/operators are made

I understand that this approach is quite different from how people usually write code (and without explanation it might be frustrating even for a human, haha), so the question is: do you think fine-tuning with common data can help me get better predictions for this approach?

@minosvasilias
Owner

That looks interesting!
I'd say that for cases like this, where code using that structure exists only in a single private codebase, fine-tuning is probably not the ideal way to go.

Instead, I'd suggest either using embeddings to retrieve relevant examples to inject into prompts (llama_index-style), or simply dumping as much example code as possible into the prompt as context for the model to follow. For the latter, a larger context length for the model you use would be very beneficial.

Both could still be done on top of a GDScript-finetuned model like godot-dodo, of course. But larger GPT models might perform better at the zero-shot learning that would be required in this case.
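To make the embedding-retrieval suggestion concrete, here is a minimal sketch of the idea: embed your existing snippets, embed the query, and inject the most similar snippets into the prompt. A toy word-overlap vectorizer stands in for a real embedding model (in practice you would call an embedding API or a llama_index retriever), and the example snippets are hypothetical:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy word-count "embedding"; in practice you would call a real
    # embedding model (e.g. an OpenAI or sentence-transformers one).
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, snippets, k=2):
    # Rank stored example snippets by similarity to the query and
    # return the top k, to be injected into the prompt as context.
    q = embed(query)
    return sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)[:k]

# Hypothetical example snippets from the private codebase.
examples = [
    "skill fireball = damage(30) + burn(3)",
    "perk regen = heal_over_time(5)",
    "skill dash = move(2) + cooldown(1)",
]

context = retrieve("write a skill that deals damage", examples, k=2)
prompt = "Follow the style of these examples:\n" + "\n".join(context) + "\n\nTask: ..."
```

The point of retrieval over fine-tuning here is that the model never needs to have memorized the custom style; it only needs to imitate the few examples placed directly in front of it.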

@SeanR26

SeanR26 commented Apr 28, 2023

I believe this is the answer to my question as well. I work on a medical application that has its own scripting language. However, it is not documented on GitHub, and I was wondering how to use the same technique to simplify rule generation in that scripting language. I have script examples in text files and PDF documents, so if I'm reading your response correctly, embeddings make the most sense. However, given the relatively low number of script lines in this documentation, I'm not sure it will actually be worth the effort.

@minosvasilias
Owner

@SeanR26 It's difficult to say without knowing the exact data available, of course, but I'd suggest simply playing around with some existing APIs (primarily OpenAI's, but also relevant Hugging Face/Google Colab ones) and pasting a decent chunk of existing code into your prompt as context. That should be very easy to do, and it will give you some sense of how well these models perform at in-context learning on your specific data.
