MLflow hook updates MLflow run state based on all op events, not just the final op #21872

Open
JPercivall opened this issue May 15, 2024 · 0 comments
Labels
area: integrations Related to general integrations, including requests for a new integration type: bug Something isn't working

Dagster version

1.7.4

What's the issue?

I have a job that runs multiple ops in serial. On that job I have the end_mlflow_on_run_finished hook set up.

When the job runs, the early ops succeed and the MLflow integration marks the run as finished before the job itself finishes. For some runs, the later ops then fail, and the MLflow run state is updated again to failed.

Jobs still running, but the MLflow runs are already marked as finished:
[Two screenshots, 2024-05-15]

Jobs finished with a failure, and the MLflow runs are marked as failed:
[Two screenshots, 2024-05-15]

I believe the issue is here: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-mlflow/dagster_mlflow/hooks.py#L25

What did you expect to happen?

The MLflow run status should not be updated until the job finishes.
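
To make the difference concrete, here is a minimal, library-free sketch contrasting the reported behavior with the expected one. It does not use Dagster or MLflow and is not the hook's actual implementation; `FakeMlflowRun`, `run_with_per_op_updates`, `run_with_single_final_update`, and the op names are all hypothetical stand-ins that only model when the status gets written.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMlflowRun:
    """Hypothetical stand-in for an MLflow run; records every status written."""
    status: str = "RUNNING"
    history: list = field(default_factory=list)

    def end_run(self, status="FINISHED"):
        self.status = status
        self.history.append(status)


def run_with_per_op_updates(op_results, mlflow_run):
    """Models the reported behavior: the status is written after every op."""
    for _name, succeeded in op_results:
        mlflow_run.end_run("FINISHED" if succeeded else "FAILED")
        if not succeeded:
            break  # a serial job stops at the first failing op


def run_with_single_final_update(op_results, mlflow_run):
    """Models the expected behavior: one status write once the job is done."""
    job_succeeded = all(succeeded for _name, succeeded in op_results)
    mlflow_run.end_run("FINISHED" if job_succeeded else "FAILED")


# Three serial ops; the last one fails (made-up names and outcomes).
ops = [("load", True), ("train", True), ("evaluate", False)]

observed = FakeMlflowRun()
run_with_per_op_updates(ops, observed)
print(observed.history)  # ['FINISHED', 'FINISHED', 'FAILED']

expected = FakeMlflowRun()
run_with_single_final_update(ops, expected)
print(expected.history)  # ['FAILED']
```

In the first case the run shows as finished while later ops are still pending, which matches what I see in the MLflow UI; in the second the status is written exactly once.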

How to reproduce?

  • Create a job with multiple ops in it
  • Add the end_mlflow_on_run_finished hook to the job
  • Set up the environment to talk to an MLflow instance
  • Run the job and watch the MLflow and Dagster UIs

Deployment type

Local

Deployment details

We're using MLflow in Databricks

Using dagster-mlflow version: 0.23.4

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

@JPercivall JPercivall added the type: bug Something isn't working label May 15, 2024
@garethbrickman garethbrickman added the area: integrations Related to general integrations, including requests for a new integration label May 15, 2024