merve's Collections
October 25 Releases
Updated Oct 25, 2024
ibm-granite/granite-3.0-8b-instruct • Text Generation • 8B • Updated Dec 19, 2024 • 53.3k • 206
ibm-granite/granite-3.0-2b-instruct • Text Generation • 3B • Updated Dec 19, 2024 • 2.73k • 49
CohereLabs/aya-expanse-8b • Text Generation • 8B • Updated Jan 9 • 14.7k • 434
CohereLabs/aya-expanse-32b • Text Generation • 32B • Updated Jan 9 • 9.74k • 294
genmo/mochi-1-preview • Text-to-Video • Updated Sep 4, 2025 • 2.04k • 1.33k
rhymes-ai/Allegro • Text-to-Video • Updated Oct 31, 2024 • 82 • 264
LanguageBind/Open-Sora-Plan-v1.3.0 • Text-to-Video • Updated Dec 5, 2024 • 3 • 74
jadechoghari/Ferret-UI-Llama8b • Image-Text-to-Text • 8B • Updated Jan 8, 2025 • 14 • 68
jadechoghari/Ferret-UI-Gemma2b • Image-Text-to-Text • 3B • Updated Oct 18, 2024 • 136 • 52
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 229 • 1.71k
neuralwork/arxiver • Viewer • Updated Nov 1, 2024 • 63.4k • 1.99k • 368
neulab/Pangea-7B • 8B • Updated Oct 24, 2024 • 1.36k • 133
neulab/Pangea-7B-hf • 8B • Updated Oct 28, 2025 • 219 • 13
Pangea 🚀 • Running • 50 • A Fully Open Multilingual Multimodal LLM for 39 Languages
stabilityai/stable-diffusion-3.5-large • Text-to-Image • Updated Oct 22, 2024 • 16.1k • 3.53k
stabilityai/stable-diffusion-3.5-large-turbo • Text-to-Image • Updated Oct 22, 2024 • 4.94k • 675
Marqo/marqo-GS-10M • Viewer • Updated Oct 23, 2024 • 9.81M • 1.36k • 54
vikhyatk/lofi • Viewer • Updated Oct 26, 2024 • 857k • 1.37k • 86
neulab/PangeaInstruct • Updated Feb 2, 2025 • 478 • 86