Scripts for use with LongCLIP, including fine-tuning Long-CLIP
Restrict a double-precision floating-point number to a specified range.
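Restricting a float to a range (clamping) can be sketched in a few lines; the function name `clamp` here is illustrative, not taken from the repository:

```python
def clamp(value: float, lower: float, upper: float) -> float:
    """Restrict value to the inclusive range [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    # min() caps the value at the upper bound, max() raises it to the lower bound
    return max(lower, min(value, upper))
```

For example, `clamp(5.7, 0.0, 1.0)` returns `1.0`, while values already inside the range pass through unchanged.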
Paddle Multimodal Integration and eXploration, supporting mainstream multimodal tasks, including end-to-end large-scale multimodal pretrained models and a diffusion model toolbox, designed for high performance and flexibility.
An open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 50+ Hugging Face models, and 20+ benchmarks
[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography
Run zero-shot prediction models on your data
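Zero-shot prediction in the CLIP style reduces to comparing an image embedding against a set of label embeddings by cosine similarity. A minimal NumPy sketch, assuming embeddings have already been produced by some encoder (the function name and toy vectors are illustrative):

```python
import numpy as np

def zero_shot_predict(image_emb: np.ndarray,
                      label_embs: np.ndarray,
                      labels: list[str]) -> str:
    """Return the label whose embedding is most cosine-similar to the image."""
    # L2-normalize so the dot product equals cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb)
    label_embs = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    scores = label_embs @ image_emb  # one similarity score per label
    return labels[int(np.argmax(scores))]
```

With a real model the label embeddings typically come from encoding prompts like "a photo of a cat"; here the mechanics are the same regardless of the encoder.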
Mimix: A Text Generation Tool and Pretrained Chinese Models
This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral].
Effortless data labeling with AI support from Segment Anything and other awesome models.
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Keras beit,caformer,CMT,CoAtNet,convnext,davit,dino,efficientdet,edgenext,efficientformer,efficientnet,eva,fasternet,fastervit,fastvit,flexivit,gcvit,ghostnet,gpvit,hornet,hiera,iformer,inceptionnext,lcnet,levit,maxvit,mobilevit,moganet,nat,nfnets,pvt,swin,tinynet,tinyvit,uniformer,volo,vanillanet,yolor,yolov7,yolov8,yolox,gpt2,llama2, alias kecam
Fine-tuning code for CLIP models
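Fine-tuning CLIP models generally optimizes a symmetric contrastive (InfoNCE) objective over paired image and text embeddings. A self-contained NumPy sketch of that loss, not the repository's actual implementation (the temperature value and function names are illustrative):

```python
import numpy as np

def clip_contrastive_loss(image_embs: np.ndarray,
                          text_embs: np.ndarray,
                          temperature: float = 0.07) -> float:
    """Symmetric InfoNCE loss over an (N, D) batch of paired embeddings.

    Matching image/text pairs share a row index; each row (and column)
    of the similarity matrix is treated as an N-way classification
    whose correct class is the diagonal entry.
    """
    image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = image_embs @ text_embs.T / temperature  # (N, N) similarities
    n = logits.shape[0]

    def nll_diag(mat: np.ndarray) -> float:
        # numerically stable log-softmax per row, then mean NLL of the diagonal
        mat = mat - mat.max(axis=1, keepdims=True)
        log_probs = mat - np.log(np.exp(mat).sum(axis=1, keepdims=True))
        return float(-log_probs[np.arange(n), np.arange(n)].mean())

    # average the image-to-text and text-to-image directions
    return 0.5 * (nll_diag(logits) + nll_diag(logits.T))
```

Perfectly aligned pairs drive the loss toward zero, while mismatched pairs increase it, which is what makes the objective useful as a fine-tuning signal.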
A complete Rust inference engine with multilingual embedding support, leveraging the power of Rust both as a gRPC service and as a standalone library, providing highly efficient text and image embeddings.
An extremely cursed text-to-image AI which generates terrifying parrot abominations.
Famous Vision Language Models and Their Architectures