
Make Optimizer parameters public to enable dynamic learning rate #87

Merged: 1 commit merged into ml-explore:main on May 20, 2024

Conversation

kemchenj (Contributor)

While using mlx-swift, I found that the optimizers do not expose the learning rate property, making it impossible to adjust the learning rate dynamically during training.

I believe this is a significant functionality gap, so this PR makes all relevant Optimizer properties public.
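
For example, with learningRate writable, a training loop can decay it between epochs. A minimal sketch (assuming the SGD optimizer from the MLXOptimizers module; the loss/gradient computation is elided in a comment):

import MLXOptimizers

let optimizer = SGD(learningRate: 0.1)

for epoch in 0 ..< 5 {
    // ... compute gradients and call optimizer.update(model:gradients:) ...
    // hypothetical manual schedule: shrink the rate by 10% each epoch
    optimizer.learningRate *= 0.9
}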

davidkoski (Collaborator) commented on the diff:

var dampening: Float = 0
var nesterov = false
/// The learning rate
public var learningRate: Float

These changes look good to me, but I wonder about changing non-MLXArray values during the run and the interaction with grad().

@awni will this work OK?

I realized there was a scheduler piece that happened on the Python side, and we don't have an equivalent here.

awni (Member)

It should be fine to change non-MLX arrays. The main (and I think only) time you need to be careful is when using mx.compile. In Python we detect when constants (non-arrays) change and recompile, but I don't think we do that in Swift yet.

In Python everything is public by default, so this kind of mimics that. I would say it's OK to do it here. We do have schedulers in Python, which are the typical way of updating a learning rate (and other variables). To the extent that the interface has to change for the schedulers, that may be something to watch out for (e.g. if we add schedulers, will we break the API that this exposes?).
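
Concretely, the compile caveat above could bite like this (a hedged sketch assuming MLX Swift's compile() free function; the stale-constant behavior is the caveat being described, not confirmed mlx-swift behavior):

import MLX

var learningRate: Float = 0.1

// The compiled closure captures learningRate as a plain Swift constant at
// trace time; Python would detect the change and recompile, Swift may not yet.
let sgdStep = compile { (inputs: [MLXArray]) -> [MLXArray] in
    let (param, grad) = (inputs[0], inputs[1])
    return [param - grad * learningRate]
}

var param = MLXArray([1.0, 2.0] as [Float])
let grad = MLXArray([0.5, 0.5] as [Float])

param = sgdStep([param, grad])[0]
learningRate = 0.01  // may be ignored: the old graph can still bake in 0.1
param = sgdStep([param, grad])[0]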

davidkoski (Collaborator)

> To the extent that the interface has to change for the schedulers, that may be something to watch out for (e.g. if we add schedulers, will we break the API that this exposes?).

Possibly, but maybe not. It looks like the optimizer base gets a step, so we could do something like this:

public var step: MLXArray

public var learningRate: Float {
    get {
        // With a scheduler attached, derive the rate from the current step.
        if let scheduledLearningRate {
            return scheduledLearningRate(step).item(Float.self)
        } else {
            return _learningRate
        }
    }
    set {
        if scheduledLearningRate != nil {
            fatalError("cannot set learningRate with a scheduler attached")
        }
        _learningRate = newValue
    }
}

private var _learningRate: Float = 0

/// Optional schedule mapping the optimizer step to a learning rate.
public var scheduledLearningRate: ((MLXArray) -> MLXArray)?

That would keep the same API while adding an optional scheduled parameter.
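
For instance, a caller could then attach an exponential decay (hypothetical usage of the sketch above; none of this is merged mlx-swift API):

// Assumes `optimizer` exposes the sketched scheduledLearningRate property:
// exponential decay driven by the optimizer's step counter.
optimizer.scheduledLearningRate = { step in
    MLXArray(Float(0.1)) * pow(MLXArray(Float(0.99)), step)
}
// Reading optimizer.learningRate now evaluates the schedule at the current step.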

davidkoski (Collaborator) left a comment

Thank you for these changes!

davidkoski merged commit 83efa17 into ml-explore:main on May 20, 2024 (1 check passed).