How to store model checkpoints locally as artifacts #3144

Open
sbuschjaeger opened this issue May 13, 2024 · 1 comment
Labels
type / question Issue type: question

Comments

@sbuschjaeger

❓Question

What is the intended workflow for storing artifacts, such as model checkpoints, locally?

As far as I can see (https://aimstack.readthedocs.io/en/latest/using/artifacts.html), run.log_artifact() takes an artifact that is already on disk and uploads it somewhere. This makes sense for S3 / remote storage, but what about local storage? Is there a way to store something directly on disk as an artifact? Basically, I want to store a checkpoint of my model every n epochs.

Something along the lines of the following (here for PyTorch):

log_path = self.run.get_this_from_somewhere()  # placeholder: get the path for the current run
checkpoint_file = os.path.join(log_path, f"model_{epoch}.pt")
torch.save(self.state_dict(), checkpoint_file)
self.run.log_artifact(checkpoint_file, name=f"model_{epoch}.pt")  # ideally without an upload
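Until the intended workflow is clarified, the "every n epochs" part can be sketched independently of Aim. The following is only a minimal illustration under assumptions: `checkpoint_path`, `maybe_save_checkpoint`, the `run_id` string, and the `<root>/<run_id>/model_<epoch>.pt` layout are hypothetical names chosen here, not part of Aim's API, and `pickle.dump` stands in for `torch.save` so the snippet runs without PyTorch. Whether the resulting file can then be registered through run.log_artifact() without an upload is exactly the open question above.

```python
import os
import pickle
import tempfile
from typing import Any, Optional


def checkpoint_path(root: str, run_id: str, epoch: int) -> str:
    """Return <root>/<run_id>/model_<epoch>.pt, creating the run directory if needed.

    The layout and the run_id argument are assumptions made for this sketch.
    """
    run_dir = os.path.join(root, run_id)
    os.makedirs(run_dir, exist_ok=True)
    return os.path.join(run_dir, f"model_{epoch}.pt")


def maybe_save_checkpoint(state: Any, root: str, run_id: str,
                          epoch: int, every_n: int) -> Optional[str]:
    """Write a checkpoint when epoch is a multiple of every_n.

    Returns the path of the written file, or None if nothing was saved.
    """
    if epoch % every_n != 0:
        return None
    path = checkpoint_path(root, run_id, epoch)
    with open(path, "wb") as f:
        # Stand-in for torch.save(model.state_dict(), path).
        pickle.dump(state, f)
    return path


# Small demonstration: 10 epochs, one checkpoint every 5 epochs.
with tempfile.TemporaryDirectory() as root:
    saved = []
    for epoch in range(1, 11):
        state = {"epoch": epoch}  # placeholder for a real state dict
        path = maybe_save_checkpoint(state, root, "demo-run", epoch, every_n=5)
        if path is not None:
            saved.append(os.path.basename(path))
```

In a real training loop the returned path is the value that would be handed on to whatever artifact-logging call ends up being the recommended one.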
@sbuschjaeger sbuschjaeger added the type / question Issue type: question label May 13, 2024
@DavidoF3

The ability to store artifacts locally would be very useful.
