---
title: Marlin 2B Video Understanding
emoji: 🎬
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.10.0
app_file: app.py
short_description: Dense video captions and timestamp search
python_version: "3.10"
startup_duration_timeout: 1h
models:
  - NemoStation/Marlin-2B
---

# Marlin 2B Video Understanding

ZeroGPU Gradio demo for [NemoStation/Marlin-2B](https://huggingface.co/NemoStation/Marlin-2B), a 2B video VLM for dense video captioning and natural-language temporal grounding.

The app exposes two model-card workflows:

- **Caption**: returns Marlin's parsed scene paragraph plus timestamped events.
- **Find**: resolves an event query into a parsed start/end time span.

The model is loaded once at module startup, and inference runs through a single `@spaces.GPU` endpoint. Example videos are short, attributed clips intended to keep verification fast while still exercising video decoding, caption parsing, and span parsing.
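
The load-once, single-endpoint layout described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: `StubModel`, `parse_span`, `parse_caption`, `run`, and the raw output format (`start - end: text` event lines, `From X to Y seconds.` spans) are all assumptions made for the example, and the real Marlin-2B output may look different. Only the `spaces.GPU` decorator comes from the ZeroGPU `spaces` package; a no-op fallback lets the sketch run locally without it.

```python
import re
from typing import Optional, Tuple

try:
    import spaces  # provided on Hugging Face ZeroGPU Spaces
    gpu = spaces.GPU
except ImportError:
    # Local fallback so the sketch runs without the `spaces` package.
    def gpu(fn):
        return fn


class StubModel:
    """Hypothetical stand-in for the real Marlin-2B model and processor."""

    def generate(self, video_path: str, prompt: str) -> str:
        # Canned text in an ASSUMED output format; the real format may differ.
        if prompt.startswith("find:"):
            return "From 3.5 to 7.25 seconds."
        return (
            "A chef prepares a meal.\n"
            "0.0 - 3.5: chops onions\n"
            "3.5 - 7.25: fries onions"
        )


# Created once at import time, so every request reuses the same object.
MODEL = StubModel()

# Matches "<start> - <end>" or "<start> to <end>", both in seconds (assumed).
SPAN_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:-|to)\s*(\d+(?:\.\d+)?)")


def parse_span(text: str) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds, or None if no valid span is found."""
    match = SPAN_RE.search(text)
    if match is None:
        return None
    start, end = float(match.group(1)), float(match.group(2))
    return (start, end) if start <= end else None


def parse_caption(text: str) -> dict:
    """Split raw text into a scene paragraph and timestamped events."""
    scene_lines, events = [], []
    for line in text.splitlines():
        head, sep, desc = line.partition(":")
        span = parse_span(head) if sep else None
        if span is not None:
            events.append({"start": span[0], "end": span[1], "text": desc.strip()})
        elif line.strip():
            scene_lines.append(line.strip())
    return {"scene": " ".join(scene_lines), "events": events}


@gpu
def run(mode: str, video_path: str, query: str = "") -> dict:
    """Single GPU entry point that serves both workflows."""
    if mode == "caption":
        return parse_caption(MODEL.generate(video_path, "caption"))
    if mode == "find":
        raw = MODEL.generate(video_path, f"find: {query}")
        return {"query": query, "span": parse_span(raw)}
    raise ValueError(f"unknown mode: {mode!r}")
```

Routing both workflows through one decorated function keeps GPU allocation in a single place, while the parsers stay plain Python and can be checked without a GPU. Under these assumptions, `run("find", "clip.mp4", "fries onions")` returns a dict whose `span` is a `(start, end)` tuple in seconds, or `None` when no span can be parsed.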