
Segment Anything Model #552

Merged
merged 24 commits into from
Jun 2, 2024
Conversation

gathierry
Contributor

@gathierry gathierry commented Mar 9, 2024

Add Segment Anything Model (SAM) in MLX.

@awni
Member

awni commented May 20, 2024

@gathierry should we review and merge this? I'm not sure what the current status is, but let me know what you think.

@gathierry
Contributor Author

Hi @awni , yes please review it at your convenience.
I think it's almost done, except that the ConvTranspose2d implementation is very naive and may not generalize well. I can only confirm that it works in this model.
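(For reference, the "naive" approach being discussed can be illustrated by the textbook zero-insertion view of a transposed convolution. This is a hypothetical single-channel NumPy sketch of the general idea, not the PR's actual MLX implementation:)

```python
import numpy as np

def conv_transpose2d_naive(x, w, stride=2):
    """Naive 2D transposed convolution (single channel, no padding).

    Zero-insert `stride - 1` rows/cols between input pixels, then run a
    full correlation with the flipped kernel -- the standard equivalence
    conv_transpose(x, w, s) == conv(zero_insert(x, s), w, pad=k-1).
    """
    h, wd = x.shape
    kh, kw = w.shape
    # Zero-insertion (dilating the input by the stride).
    up = np.zeros(((h - 1) * stride + 1, (wd - 1) * stride + 1))
    up[::stride, ::stride] = x
    # Pad by kernel_size - 1 so every kernel position overlaps the input.
    up = np.pad(up, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    oh, ow = up.shape[0] - kh + 1, up.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    wf = w[::-1, ::-1]  # flip the kernel: correlation -> convolution
    for i in range(oh):
        for j in range(ow):
            out[i, j] = (up[i:i + kh, j:j + kw] * wf).sum()
    return out
```

With a 1x1 input the output is just the kernel itself, which makes the definition easy to sanity-check.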

@awni
Member

awni commented May 26, 2024

@gathierry this is really nicely done! Thanks for the example. I started looking at it today and sent some changes to your branch.

There are two high-level things I think we should aim to improve:

  1. Simplify as much as possible. I understand the model has a lot of moving pieces, but to the extent that we can simplify how they all interact (maybe at the cost of some flexibility), that will make the example much more approachable.
  2. I would like to remove the dependence on torch in the amg class. I notice that there are still some non-trivial sections done using torch. Is that something we could work towards?

@gathierry
Contributor Author

Thanks for the comments and improvements, @awni.
For 2:

I would like to remove the dependence on torch in the amg class. I notice that there is still some non-trivial sections done using torch. Is that something we could work towards?

I tried to use MLX first but found it much slower than torch. I thought it was because of some Boolean-masking and nonzero operators that MLX didn't support yet, so I had to convert to numpy and back.
But that was three months ago; I don't know if it's still the case.
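(The numpy round-trip being described looks roughly like this. A minimal sketch with hypothetical helper names; NumPy stands in on both ends so the snippet runs anywhere, whereas the real code converts from and back to `mx.array`:)

```python
import numpy as np

def mask_filter_via_numpy(values, keep):
    """Boolean-mask indexing done on the NumPy side.

    In the actual example, `values` and `keep` would be mx.array; the
    round-trip is np.array(values)[np.array(keep)], then back to mx.array.
    """
    v = np.asarray(values)
    k = np.asarray(keep, dtype=bool)
    # Output shape depends on the data in `keep`, which is exactly the
    # property that made this op hard for MLX at the time.
    return v[k]

def nonzero_via_numpy(mask):
    """`nonzero` equivalent done on the NumPy side."""
    return np.nonzero(np.asarray(mask))[0]
```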

@awni
Member

awni commented May 26, 2024

I tried to use mlx first but found it much slower than torch.

Do you have a branch with that by any chance? We can profile and improve the ops if it's a bottleneck in MLX.

But that was 3 months ago.

PS sorry for the long delay on this. It kind of fell through the cracks. But I am quite keen to get an object segmentation example working!

@gathierry
Contributor Author

I don't have an existing branch for that, but I can try to write one and compare them. I remember the gap was pretty big, but maybe we can try to improve it.

@gathierry
Contributor Author

gathierry commented May 27, 2024

Hi @awni, I have amg implemented in torch in this branch.
I tested it again, though only roughly, in the notebook, and the speed is just a little slower than torch (<20%). This likely comes from the overhead of converting mx.array to numpy and back in the three places not implemented in MLX:

  • indexing with a boolean mask
  • nonzero
  • torchvision.batched_nms

What do you think? Maybe the third one is the easiest to start with?

@awni
Member

awni commented May 27, 2024

Thanks for adding that!

Which notebook did you test, the amg one?

What do you think? Maybe the third one is the easiest one to start?

Each of those ops is a bit tricky because their output shapes depend on the input data. But I would like to take a look and see where the slowdown is coming from. It might be from the conversion to numpy, but it could be something else, so it would be good to verify.

@gathierry
Contributor Author

Yes, the amg one.

@gathierry
Contributor Author

I tried profiling the filter function while running the amg notebook, and there does indeed seem to be a gap between MLX (with a workaround) and torch. Please correct me if I'm wrong.
For MLX:
0.04520195908344471
For torch:
[screenshot of the torch timing]

    def filter(self, keep: mx.array) -> None:
        import time
        t1 = time.perf_counter()
        for k, v in self._stats.items():
            if v is None:
                self._stats[k] = None
            elif isinstance(v, mx.array):
                # TODO: fix this with mlx
                # self._stats[k] = mx.array(np.array(v)[np.array(keep)])
                self._stats[k] = mx.array([a for i, a in enumerate(v) if keep[i]])
            elif isinstance(v, np.ndarray):
                self._stats[k] = v[np.array(keep)]
            elif isinstance(v, list) and keep.dtype == mx.bool_:
                self._stats[k] = [a for i, a in enumerate(v) if keep[i]]
            elif isinstance(v, list):
                self._stats[k] = [v[i] for i in keep]
            else:
                raise TypeError(f"MaskData key {k} has an unsupported type {type(v)}.")
        t2 = time.perf_counter()
        print(type(v), keep.shape, t2 - t1)
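(For comparison, the commented-out TODO above replaces the per-element Python loop with a single boolean-mask gather. A standalone sketch of that vectorized path, with NumPy standing in for `mx.array` so it runs anywhere:)

```python
import numpy as np

def filter_stats(stats, keep):
    """Vectorized version of the filter above.

    One boolean-mask gather per array replaces the O(N) Python loop;
    in the real code the arrays would be mx.array round-tripped
    through np.array(...).
    """
    keep = np.asarray(keep, dtype=bool)
    out = {}
    for k, v in stats.items():
        if v is None:
            out[k] = None
        elif isinstance(v, np.ndarray):
            out[k] = v[keep]  # single vectorized gather
        elif isinstance(v, list):
            out[k] = [a for a, flag in zip(v, keep) if flag]
        else:
            raise TypeError(f"unsupported type {type(v)} for key {k}")
    return out
```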

@awni
Member

awni commented May 28, 2024

On an M2 Ultra, the fully MLX version seems to be faster overall (looking at total time).

Hybrid:

python main.py --model mlx_model --input notebooks/images/dog.jpg --output   29.91s user 86.16s system 1465% cpu 7.921 total

All MLX:

python main.py --model mlx_model --input notebooks/images/dog.jpg --output   2.06s user 6.41s system 127% cpu 6.628 total

@awni
Member

awni commented May 28, 2024

@gathierry would you mind updating this PR to use the MLX version? Then we can just focus on optimizing it. I don't think we will merge the torch version anyway. You can keep it in a side branch for reference (or it will also be in the git history).

@gathierry
Contributor Author

Updated to the pure MLX version.

@awni
Member

awni commented May 30, 2024

Thanks for sending the pure MLX version. I'm noticing the main script isn't working 😓. I tried:

python main.py --model mlx_model/sam-vit-base --input notebooks/images/dog.jpg --output dogs

It's possible I broke something in a previous refactor, but let me know if you have any ideas.

Member

@awni awni left a comment

Thanks for this amazing addition!!

@awni awni merged commit 8353bbb into ml-explore:main Jun 2, 2024
4 checks passed