
Discord? #5

Open
yankscally opened this issue Apr 25, 2023 · 5 comments
@yankscally

I know this is usually unnecessary, but I'd like to help out. Do you have a discord?

I have some ideas about datasets, training that I think could be very useful. I'm also a GDScript veteran at this point.

I've successfully and commercially used GPT2 models in 2019 - so I have experience with datasets.

@minosvasilias
Owner

Hey, any ideas and contributions are appreciated!
Would ideally like to keep conversations public in GitHub issues if possible, so this can be followed by anyone interested.

If you do want a private chat though, you can reach me on Discord at markus_#2339 or on Twitter at @minosvasilias.

@yankscally
Author

[screenshot of a conversation with the model in the webui]

Just got a chance to try out your model, and the results seem promising.

Here is a sample of a conversation I had with your model. It grasps basic GDScript terminology and syntax, but it may be set up incorrectly in my webui.

What is the best way to use this model? I am using the 7B model locally in the oobabooga webui.

@minosvasilias
Owner

The model is finetuned on an instruct dataset in the style of stanford-alpaca and similar models. This means all samples conform to a specific prompting template, which in the case of godot-dodo is:

Below is an instruction that describes a GDScript coding task. 
Write code that appropriately completes the request.


### Instruction:
{instruction}

### Response:

I have not tested using the model without that format, and am not familiar with how oobabooga sets things up.
It is very interesting to see it seemingly still retain some natural-language conversation capability, though, despite this fixed format and the very repetitive initial response token(s) (func xxx:) it is trained on.

So if you want to reproduce the model performance as I evaluated it, you will need to follow the exact prompting template above. You can do this via Google Colab using the Jupyter notebook linked in the readme: https://colab.research.google.com/github/minosvasilias/godot-dodo/blob/main/demo/inference_demo.ipynb
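For reference, here is a minimal sketch of what that could look like outside the notebook using Hugging Face transformers. The model id, sampling settings, and the helper function are illustrative assumptions rather than anything taken from this repo, so adjust them to your setup:

```python
# Minimal sketch, assuming the checkpoint is available on the Hugging Face Hub.
# The model id and sampling parameters below are assumptions; substitute your own.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "minosvasilias/godot-dodo-4x-60k-llama-7b"  # assumed id, check the readme

# Prompt template as described above; exact whitespace may differ slightly.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a GDScript coding task. "
    "Write code that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def generate(instruction: str, max_new_tokens: int = 256) -> str:
    # Fill the godot-dodo prompt template and generate a single response.
    prompt = PROMPT_TEMPLATE.format(instruction=instruction)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=0.2,   # illustrative sampling settings
        top_p=0.95,
        do_sample=True,
    )
    # Strip the prompt tokens so only the model's response remains.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

print(generate("Write a function that moves a KinematicBody2D toward a target position."))
```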

@yankscally
Author

[screenshot of the prompt template configured in the webui's 'character' settings]

OK! I set this up in the 'character' section, but the output still isn't full scripts. I think the problem may be in these generation parameters... still figuring it out.

[screenshot of the webui generation parameters]

@minosvasilias
Owner

minosvasilias commented Jun 4, 2023

That looks sensible, though again I'm not sure exactly how they format the context.

However, godot-dodo models are unlikely to generate full scripts anyway if you're looking for that. The training dataset is split into individual methods, and the model therefore learns to implement the instructions within the scope of a single method. It will rarely, if ever, exceed that scope.
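For intuition about that scope, each training sample pairs one instruction with one method, roughly along these lines (the field names and content below are illustrative assumptions, not actual entries from the dataset):

```python
# Illustrative sketch of a single method-level training sample.
# Field names and the example content are assumptions made for explanation only.
sample = {
    "instruction": "Return the player's health clamped between 0 and max_health.",
    "output": (
        "func get_clamped_health(health: int, max_health: int) -> int:\n"
        "\treturn clamp(health, 0, max_health)\n"
    ),
}
```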
