Training Guide
This guide covers how to run training jobs, from basic single-GPU training to advanced distributed setups and automatic model uploads.
⚡ Basic Training (Single GPU)
After preprocessing your dataset and preparing a configuration file, you can start training using the trainer script:
uv run python scripts/train.py configs/t2v_lora.yaml
The trainer will:
- Load your configuration and validate all parameters
- Initialize models and apply optimizations
- Run the training loop with progress tracking
- Generate validation videos (if configured)
- Save the trained weights in your output directory
Agent-Assisted Training
If your environment supports repository skills, the
train-model skill provides an end-to-end
orchestrator for this package. It asks what you want the model to learn, maps that intent to
one of the documented training modes, probes your filesystem and GPU,
prepares/preprocesses the dataset, writes a run-specific config, launches training, and
monitors the job. It uses the trainer docs as its source of truth and stops for approval before
captioning, preprocessing, or starting expensive training work.
Output Files
For LoRA training:
- checkpoints/lora_weights_step_00000.safetensors - LoRA checkpoint weights, with the current step in the filename
- training_config.yaml - Copy of training configuration
- samples/ - Generated validation samples (if enabled)
- checkpoints/training_state_step_00000.pt - Optional resume state, depending on checkpoints.save_training_state
For full model fine-tuning:
- checkpoints/model_weights_step_00000.safetensors - Full model checkpoint weights, with the current step in the filename
- training_config.yaml - Copy of training configuration
- samples/ - Generated validation samples (if enabled)
- checkpoints/training_state_step_00000.pt - Optional resume state, depending on checkpoints.save_training_state
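Because the step number is embedded in each checkpoint filename, the most recent weights file can be picked out programmatically. The following is a minimal sketch, not part of the trainer: the helper name latest_checkpoint is hypothetical, and it assumes only the filename patterns listed above (lora_weights_step_N.safetensors and model_weights_step_N.safetensors inside a checkpoints/ directory).

```python
import re
from pathlib import Path

# Matches the weight filenames listed in this guide. The number of digits in
# the step counter is not assumed; any run of digits is accepted.
_STEP_RE = re.compile(r"^(?:lora_weights|model_weights)_step_(\d+)\.safetensors$")


def latest_checkpoint(output_dir):
    """Return the weights file with the highest step number, or None.

    Hypothetical helper: `output_dir` is the training output directory that
    contains the `checkpoints/` folder.
    """
    ckpt_dir = Path(output_dir) / "checkpoints"
    if not ckpt_dir.is_dir():
        return None

    best_step, best_path = -1, None
    for path in ckpt_dir.iterdir():
        match = _STEP_RE.match(path.name)
        if match is None:
            continue  # skips training_state_step_*.pt and unrelated files
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, path
    return best_path
```

Comparing the parsed integer rather than sorting filenames keeps the result correct even if the zero-padding width ever differs between runs.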
🖥️ Distributed / Multi-GPU Training
We use Hugging Face 🤗 Accelerate for multi-GPU DDP and FSDP.
Configure Accelerate
Run the interactive wizard once to set up your environment (DDP / FSDP, GPU count, etc.):
uv run accelerate config
This stores your preferences in ~/.cache/huggingface/accelerate/default_config.yaml.
Use the Provided Accelerate Configs (Recommended)
We include ready-to-use Accelerate config files in configs/accelerate/:
- ddp.yaml - Standard DDP
- ddp_compile.yaml - DDP with torch.compile (Inductor)
- fsdp.yaml - Standard FSDP (auto-wraps BasicAVTransformerBlock)
- fsdp_compile.yaml - FSDP with torch.compile (Inductor)
Launch with a specific config using --config_file:
# DDP (2 GPUs shown as example)
CUDA_VISIBLE_DEVICES=0,1 \
uv run accelerate launch --config_file configs/accelerate/ddp.yaml \
scripts/train.py configs/t2v_lora.yaml
# DDP + torch.compile
CUDA_VISIBLE_DEVICES=0,1 \
uv run accelerate launch --config_file configs/accelerate/ddp_compile.yaml \
scripts/train.py configs/t2v_lora.yaml
# FSDP (4 GPUs shown as example)
CUDA_VISIBLE_DEVICES=0,1,2,3 \
uv run accelerate launch --config_file configs/accelerate/fsdp.yaml \
scripts/train.py configs/t2v_lora.yaml
# FSDP + torch.compile
CUDA_VISIBLE_DEVICES=0,1,2,3 \
uv run accelerate launch --config_file configs/accelerate/fsdp_compile.yaml \
scripts/train.py configs/t2v_lora.yaml
Notes:
- The number of processes is taken from the Accelerate config (num_processes). Override with --num_processes X or restrict GPUs with CUDA_VISIBLE_DEVICES.
- The compile variants enable torch.compile with the Inductor backend via Accelerate's dynamo_config.
- FSDP configs auto-wrap the transformer blocks (fsdp_transformer_layer_cls_to_wrap: BasicAVTransformerBlock).
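If you launch jobs from a script, the relationship between CUDA_VISIBLE_DEVICES and --num_processes can be made explicit. The sketch below is illustrative only: visible_gpu_ids and build_launch_command are hypothetical helpers, and the only flags used are the ones shown in this guide (--config_file and --num_processes).

```python
import shlex


def visible_gpu_ids(cuda_visible_devices):
    """Split a CUDA_VISIBLE_DEVICES value such as "0,1" into a list of IDs."""
    return [part.strip() for part in cuda_visible_devices.split(",") if part.strip()]


def build_launch_command(train_config, accelerate_config=None, num_processes=None):
    """Assemble the `accelerate launch` argument list used in this guide.

    Hypothetical helper: passing None for an option leaves it out, so
    Accelerate falls back to the value in its own config file.
    """
    cmd = ["uv", "run", "accelerate", "launch"]
    if accelerate_config is not None:
        cmd += ["--config_file", accelerate_config]
    if num_processes is not None:
        cmd += ["--num_processes", str(num_processes)]
    cmd += ["scripts/train.py", train_config]
    return cmd


# Example: one process per visible GPU, using the provided DDP config.
gpus = visible_gpu_ids("0,1")
command = build_launch_command(
    "configs/t2v_lora.yaml",
    accelerate_config="configs/accelerate/ddp.yaml",
    num_processes=len(gpus),
)
print(shlex.join(command))
```

Returning a list instead of a single string means the command can be handed to subprocess.run without shell quoting concerns; shlex.join is used here only for display.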
Launch with Your Default Accelerate Config
If you prefer to use your default Accelerate profile:
# Use settings from your default accelerate config
uv run accelerate launch scripts/train.py configs/t2v_lora.yaml
# Override number of processes on the fly (e.g., 2 GPUs)
uv run accelerate launch --num_processes 2 scripts/train.py configs/t2v_lora.yaml
# Select specific GPUs
CUDA_VISIBLE_DEVICES=0,1 uv run accelerate launch scripts/train.py configs/t2v_lora.yaml
You can disable the in-terminal progress bars with the --disable-progress-bars flag in the trainer CLI if desired.
Benefits of Distributed Training
- Faster training: Distribute workload across multiple GPUs
- Larger effective batch sizes: Combine gradients from multiple GPUs
- Memory efficiency: Each GPU handles a portion of the batch
Distributed training requires that all GPUs have sufficient memory for the model and batch size. The effective batch size becomes batch_size × num_processes.
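As a quick sanity check before launching, the effective batch size can be computed from the two values named above. This is a minimal sketch of that single formula; the function name is hypothetical, and it deliberately ignores any other setting (such as gradient accumulation) that this guide does not describe.

```python
def effective_batch_size(batch_size, num_processes):
    """Return batch_size × num_processes, as stated in this guide.

    `batch_size` is the per-process batch size from the training config and
    `num_processes` is the number of processes Accelerate launches.
    """
    if batch_size < 1 or num_processes < 1:
        raise ValueError("batch_size and num_processes must both be >= 1")
    return batch_size * num_processes


# Example: a per-process batch of 2 on 4 GPUs.
print(effective_batch_size(2, 4))  # prints 8
```

Keeping this number in mind helps when moving a configuration between machines: going from 1 GPU to 4 with the same batch_size quadruples the effective batch, which may call for revisiting the learning rate.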
🤗 Pushing Models to Hugging Face Hub
You can automatically push your trained models to the Hugging Face Hub by adding the following to your configuration:
hub:
  push_to_hub: true
  hub_model_id: "your-username/your-model-name"
Prerequisites
Before pushing, make sure you:
- Have a Hugging Face account - Sign up at huggingface.co
- Are logged in via huggingface-cli login or have set the HUGGING_FACE_HUB_TOKEN environment variable
- Have write access to the specified repository (it will be created if it doesn't exist)
Login Options
Option 1: Interactive login
uv run huggingface-cli login
Option 2: Environment variable
export HUGGING_FACE_HUB_TOKEN="your_token_here"
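For unattended runs, it can help to check for a token in the environment before a long training job starts, so a missing credential does not surface only at upload time. The sketch below is an assumption-laden illustration, not trainer behavior: find_hub_token_var is a hypothetical helper, HUGGING_FACE_HUB_TOKEN is the variable named in this guide, and HF_TOKEN is included as an extra name on the assumption that your installed huggingface_hub version reads it.

```python
import os

# HUGGING_FACE_HUB_TOKEN comes from this guide. HF_TOKEN is an added
# assumption; remove it if your huggingface_hub version does not read it.
TOKEN_ENV_VARS = ("HUGGING_FACE_HUB_TOKEN", "HF_TOKEN")


def find_hub_token_var(env=None):
    """Return the name of the first token variable that is set, or None.

    Only the variable name is returned, so the token itself never ends up
    in logs. `env` defaults to the current process environment.
    """
    env = os.environ if env is None else env
    for name in TOKEN_ENV_VARS:
        if env.get(name, "").strip():
            return name
    return None


found = find_hub_token_var()
if found is None:
    print("No Hub token in the environment; an interactive login may still be active.")
else:
    print(f"Hub token provided via {found}.")
```

A None result is not proof that you are logged out: the interactive login in Option 1 stores credentials outside the environment, which this check does not inspect.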
What Gets Uploaded
The trainer will automatically:
- Create a model card with training details and sample outputs
- Upload model weights
- Push sample videos as GIFs in the model card
- Include training configuration and prompts
📊 Weights & Biases Logging
Enable experiment tracking with W&B by adding to your configuration:
wandb:
  enabled: true
  project: "ltx-2-trainer"
  entity: null # Your W&B username or team
  tags: [ "ltx2", "lora" ]
  log_validation_videos: true
This will log:
- Training loss and learning rate
- Validation videos
- Model configuration
- Training progress
🚀 Next Steps
After training completes:
- Run inference with your trained LoRA - The ltx-pipelines package provides production-ready inference pipelines that support loading custom LoRAs. Available pipelines include text-to-video, image-to-video, IC-LoRA video-to-video, and more. See the ltx-pipelines package for usage details.
- Test your model with validation prompts
- Iterate and improve based on validation results
- Share your results by pushing to Hugging Face Hub
💡 Tips for Successful Training
- Start small: Begin with a small dataset and a few hundred steps to verify everything works
- Monitor validation: Keep an eye on validation samples to catch overfitting
- Adjust learning rate: Lower learning rates often produce better results
- Use gradient checkpointing: Essential for training with limited GPU memory
- Save checkpoints: Regular checkpoints help recover from interruptions
Need Help?
If you encounter issues during training, see the Troubleshooting Guide.
Join our Discord community for real-time help!