My mpijob gets stuck forever when SyncPodGroup fails and is retried within the same second.
For example:
At 00:00:00.100, SyncPodGroup creates the pod group, but then fails to get the pod group.
At 00:00:00.200, SyncPodGroup tries to update the pod group, but hits a conflict error such as Operation cannot be fulfilled on ...
The controller then sets LastReconcileTime to the same value as in step 1, because both attempts fall within the same second.
Finally, the controller calls UpdateJobStatusInApiServer while the job spec is unchanged, so the next reconcile is never triggered.
shadowdsp changed the title from "mpijob will not reconcile if LastReconcileTime is updated in 1 second" to "mpijob will stuck if LastReconcileTime is updated in 1 second" on May 17, 2024