On Vacation 🏝️

Ji Xie PRO

sanaka87

https://horizonwind2004.github.io/

AI & ML interests

AIGC, Generative Model

Recent Activity

upvoted a paper 5 days ago

Echo-Memory: A Controlled Study of Memory in Action World Models

liked a dataset 13 days ago

XiangpengYang/VideoCoF-50k

liked a model 25 days ago

bytedance-research/Lance

upvoted a paper 5 days ago

Echo-Memory: A Controlled Study of Memory in Action World Models

Paper • 2606.09803 • Published 6 days ago • 32

upvoted 2 papers about 2 months ago

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

Paper • 2604.23781 • Published Apr 26 • 33

UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

Paper • 2604.17565 • Published Apr 19 • 10

upvoted a paper 2 months ago

RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

Paper • 2604.06870 • Published Apr 8 • 43

upvoted an article 3 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 164

upvoted a paper 3 months ago

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Paper • 2603.08652 • Published Mar 9 • 41

upvoted a paper 4 months ago

Next-Embedding Prediction Makes Strong Vision Learners

Paper • 2512.16922 • Published Dec 18, 2025 • 91

upvoted a collection 4 months ago

NEPA

5 items • Updated Dec 19, 2025 • 11

upvoted 2 papers 4 months ago

GENIUS: Generative Fluid Intelligence Evaluation Suite

Paper • 2602.11144 • Published Feb 11 • 55

GEBench: Benchmarking Image Generation Models as GUI Environments

Paper • 2602.09007 • Published Feb 9 • 39

upvoted a paper 6 months ago

Unified Video Editing with Temporal Reasoner

Paper • 2512.07469 • Published Dec 8, 2025 • 47

upvoted 2 papers 7 months ago

Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward

Paper • 2511.20561 • Published Nov 25, 2025 • 33

Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens

Paper • 2511.19418 • Published Nov 24, 2025 • 29

upvoted a collection 7 months ago

CoVT: Chain-of-Visual-Thought

Enrich VLMs’ vision-centric reasoning capabilities via Chain-of-Visual-Thought! • 7 items • Updated Nov 25, 2025 • 6

upvoted 2 papers 8 months ago

SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models

Paper • 2510.12784 • Published Oct 14, 2025 • 20

GIR-Bench: Versatile Benchmark for Generating Images with Reasoning

Paper • 2510.11026 • Published Oct 13, 2025 • 18

upvoted a collection 9 months ago

Fine-Tuning

8 items • Updated Dec 19, 2025 • 1

upvoted a paper 9 months ago

Reconstruction Alignment Improves Unified Multimodal Models

Paper • 2509.07295 • Published Sep 8, 2025 • 40

upvoted an article 9 months ago

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

Article • AI-MO • Jul 10, 2025 • 56

upvoted a collection 10 months ago

RecA

Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning! • 8 items • Updated Sep 22, 2025 • 14