Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom ExperienceMaker #285

Open
mgerstgrasser opened this issue May 1, 2024 · 4 comments
Open

Custom ExperienceMaker #285

mgerstgrasser opened this issue May 1, 2024 · 4 comments

Comments

@mgerstgrasser
Copy link
Contributor

I have a use case where I'd like to use a custom ExperienceMaker class instead of either of the provided ones. As far as I can tell, there isn't currently a way to configure the experience maker class used by PPOTrainer / ActorPPOTrainer.

Would you be open to a PR that makes this configurable?

And if yes, would the best way of doing this be to put it into the DeepspeedStrategy?

@hijkzzz
Copy link
Collaborator

hijkzzz commented May 1, 2024

I think ExperienceMaker has nothing to do with DeepSpeed
I would like to know what kind of ExperienceMaker you need at first

@mgerstgrasser
Copy link
Contributor Author

I would like to know what kind of ExperienceMaker you need at first

I had multiple different ones in mind, actually, for different projects. For instance:

  • reward shaping
  • interaction with external or simulated environments
  • multi-turn conversations

Basically, anything that changes what the model does / how it interacts with the outside world, or how it is evaluated, would be implemented in the experience maker.

I think ExperienceMaker has nothing to do with DeepSpeed

I agree! I only suggest it because the DeepspeedStrategy is a convenient configuration-like object that's already passed to the Trainer anyway. The alternative would be to e.g. pass the experience maker class as an optional argument the the Trainer constructor, but in the Ray case that would require passing it through several other objects as well, i.e. more changes.

I could of course just monkey-patch the experience maker class, or maintain my own fork, so all of this isn't absolutely essential. But I think it might be useful for others to have this configurable as well.

@hijkzzz
Copy link
Collaborator

hijkzzz commented May 3, 2024

The alternative would be to e.g. pass the experience maker class as an optional argument the the Trainer constructor, but in the Ray case that would require passing it through several other objects as well, i.e. more changes.

I think it's probably better this way

Welcome to have related MR contributions~

@mgerstgrasser
Copy link
Contributor Author

OK! I'll test this on my side for a bit to make sure it covers all the use cases I have in mind. I'll open a PR in a couple of weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants