MiniGPT-v2
Chat with images and get visual answers
Chat with images and get visual answers
Generate speech from text using a reference voice
Generate audio from text using voice prompts
Fine-tune large language models with a Gradio UI
Generate music from a text description and optional melody
Generate animated videos from images and motion sequences
Translate speech and text between languages
Generate a short video from a single image
Vote for 3D creations and view the leaderboard
Transcribe audio files to text instantly
Generate detailed Stable Diffusion prompts from any image
Predict depth map from a single image
Creative Upscaler: High-Res Image Generation with DemoFusion SDXL
Generate virtual try‑on images for any model and clothing
Chat with Gemini Pro and upload images for responses
Transform your voice into a singer's
Generate speech from text with different voices
Generate speech with cloned timbre
Teleport objects into new backgrounds using masks
Generate human motion from text or audio
Generate academic responses using GPT
Generate AI images featuring your own face
Generate images with text and edit existing images
Display Hugging Face status and loading animation
Animate a portrait from audio speech
Replace objects in images using prompts or reference images
Generate personalized photos of a person from a prompt
Enhance your audio with denoising and quality boost
Generate images from text prompts
Generate personalized images preserving your face identity
Generate DuckDB SQL queries from natural language
Enhance images with custom text instructions
Get a music sample inspired by the mood of an image
Remove background from images instantly
Detect objects in images or videos
Explore Vision Arena visual AI demo online
Generate high‑resolution images from text prompts
Super-fast image generation on SDX
Detect and segment objects in images or videos
Edit images with custom change maps using AI
In-browser object detection with YOLOv9 and Transformers.js
Generate depth maps for video frames
Generate depth map from a single image
Fast, efficient, and multilingual text-to-speech
Generate highly aesthetic images
Generate 3D human motion from text prompts
Generate personalized stylized portraits from your photos
Official Demo Space for Trajectory Consistency Distillation
Generate a 3D model from a single image
Generate comic transcriptions from images
The most opinionated, anime-themed SDXL model
Enhance low‑resolution anime images with AI upscaling
Generate animated videos from images and text prompts
Animate an image into a video using a text prompt
Create your own AI comic with a single prompt
Display a live demo website
Video Editing
Edit images using text instructions
High-fidelity Text-To-Speech
MagicTime: Time-lapse Video Generation Models as Metamorphic
Generate customized scenes with your object and viewpoint
Generate high-res images from text prompts
Generate images from text prompts
Create a 3D model from an image in 10 seconds!
Generate images from text prompts
High-fidelity Virtual Try-on
Relight photos with AI using custom lighting prompts
A private and powerful AI that runs locally in your browser
Annotate and describe images with text prompts
Detect objects in images with customizable YOLOv10 models
Generate audio from text with tuning options
Generate animated video from two images and a prompt
Generate realistic speech and sounds from typed text
Edit image regions using a reference picture
Generate images using a reference image and text prompt
Download and preview ChatTTS speaker embeddings
Generate detailed captions for your images
Generate natural speech in 7000+ languages
Generate high‑quality images from text prompts in seconds
Generate captions, detections, and segmentations for any image
Generate a video from an image
Display a web page
Display a React app with TypeScript
Generate audio for silent videos
Generate personalized portrait images from your photos and prompts
Apply the motion of a video on a portrait
Generate personalized images based on comments
Generate a 3D mesh from a single image
Remove backgrounds from images instantly
LLM for long context
Generate virtual try‑on images of clothes on a person
Text-to-Video
Generate a smooth video between two keyframe images
Convert text to natural-sounding speech audio
Create HD cutouts from any image with just a prompt
Extract and format text from images with advanced OCR
Generate audio‑ready scripts from documents
Chat with Llama about images and text
Transcribe audio or YouTube videos into text
Personalised Podcasts For All - Available in 13 Languages
Create custom AI podcasts from text, URLs, PDFs, and images
A Gradio demo for Posterior-Mean Rectified Flow (PMRF)
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
MaskGCT TTS Demo
An end-to-end (e2e) Voice Language Model by Fish Audio.
Generate and preview app code from a text description
Generate and edit images using text instructions
High-quality virtual try-on ~ Your cyber fitting room
Edit images with scribble‑based color and edge control
Generate new images from a subject photo and text prompt
Execute custom code from an environment variable
Extract garment images from everyday images!
A unified multimodal understanding and generation model.
A unified multimodal understanding and generation model.
Zero-Shot Voice Cloning with Llasa 3B (Unofficial Demo)
Generate text answers or segment objects from images
Gemini 2.0 native image generation co-doodling
Blazingly Fast and Embarrassingly Simple Song Generation
Flexible Photo Recrafting While Preserving Your Identity
Generate human images from text descriptions
Enhance and restore old photos and AI-generated faces
LiveCC-7B-Instruct
A Step Towards Music Generation Foundation Model
Radiology Image & Report Explainer Demo. Built with MedGemma
Generate modified audio from text and voice
Generate web code from your description and view it live
Extract text, tables, formulas, and charts from images
Upload your anndata, get Tx1 embeddings in minutes
Generate spoken audio from your text in many voices
Generate speech from text using voice design, cloning or presets