multimodal-maestro-0.1.0

@SkalskiP SkalskiP released this 29 Nov 12:54
· 3 commits to main since this release

multimodal-maestro is out 🔥 🔥 🔥

🚀 Added

>>> import cv2
>>> import torch
>>> import multimodalmaestro as mm

>>> image = cv2.imread("...")

>>> generator = mm.SegmentAnythingMarkGenerator()
>>> visualizer = mm.MarkVisualizer()

>>> marks = generator.generate(image=image)
>>> marks = mm.refine_marks(marks=marks)

>>> image_prompt = visualizer.visualize(image=image, marks=marks)
>>> text_prompt = "Find dog."

>>> api_key = "..."  # placeholder; supply your own API key
>>> response = mm.prompt_image(api_key=api_key, image=image_prompt, prompt=text_prompt)
>>> response

"The dog is prominently featured in the center of the image with the label [9]."

>>> masks = mm.extract_relevant_masks(text=response, detections=marks)
>>> masks

{'9': array([
    [False, False, False, ..., False, False, False],
    [False, False, False, ..., False, False, False],
    [False, False, False, ..., False, False, False],
    ...,
    [ True,  True,  True, ..., False, False, False],
    [ True,  True,  True, ..., False, False, False],
    [ True,  True,  True, ..., False, False, False]])
}
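
For intuition, the last step (picking out the masks whose labels appear in the model's reply) can be sketched in plain Python. This is not the library's implementation: `extract_labels` and `select_masks` are hypothetical helper names, and the sketch assumes that labels appear as bracketed integers such as `[9]` and that each label is a zero-based index into the sequence of masks.

```python
import re
from typing import Dict, List, Sequence, TypeVar

Mask = TypeVar("Mask")


def extract_labels(text: str) -> List[str]:
    """Return the unique bracketed integer labels in text, in order of first appearance."""
    seen: List[str] = []
    for label in re.findall(r"\[(\d+)\]", text):
        if label not in seen:
            seen.append(label)
    return seen


def select_masks(text: str, masks: Sequence[Mask]) -> Dict[str, Mask]:
    """Map each label mentioned in text to masks[int(label)].

    Assumption: labels are zero-based indices into masks.
    Labels that fall outside the sequence are skipped.
    """
    return {
        label: masks[int(label)]
        for label in extract_labels(text)
        if int(label) < len(masks)
    }
```

Called as `select_masks(response, masks)`, this returns a dictionary keyed by the label strings found in the reply, which is the same shape as the output shown above.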

🏆 Contributors

@SkalskiP @deependujha