=== [qwen3.5-2b-justgfos-nothink-1116] Vast.ai Instance Setup ===
Tue Mar 24 23:48:52 UTC 2026
Installing unsloth (preserving torch)...
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
unsloth 2026.3.11 requires diffusers, which is not installed.
unsloth 2026.3.11 requires nest-asyncio, which is not installed.
unsloth 2026.3.11 requires pydantic, which is not installed.
unsloth 2026.3.11 requires xformers>=0.0.27.post2; ("linux" in sys_platform or sys_platform == "win32") and (platform_machine == "AMD64" or platform_machine == "x86_64"), which is not installed.
unsloth-zoo 2026.3.5 requires msgspec, which is not installed.
unsloth-zoo 2026.3.5 requires torchao>=0.13.0, which is not installed.
unsloth 2026.3.11 requires datasets!=4.0.*,!=4.1.0,<4.4.0,>=3.4.1, but you have datasets 4.8.4 which is incompatible.
unsloth 2026.3.11 requires trl!=0.19.0,<=0.24.0,>=0.18.2, but you have trl 0.29.1 which is incompatible.
unsloth-zoo 2026.3.5 requires datasets!=4.0.*,!=4.1.0,<4.4.0,>=3.4.1, but you have datasets 4.8.4 which is incompatible.
unsloth-zoo 2026.3.5 requires trl!=0.19.0,<=0.24.0,>=0.18.2, but you have trl 0.29.1 which is incompatible.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv.
Use the --root-user-action option if you know what you are doing and want to suppress this warning.
Verifying install...
torch=2.6.0+cu124 cuda=True gpu=NVIDIA H100 80GB HBM3
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
Unsloth: Your Flash Attention 2 installation seems to be broken. Using Xformers instead. No performance changes will be seen.
🦥 Unsloth Zoo will now patch everything to make training faster!
unsloth OK
=== Starting Training ===
=== qwen3.5-2b-justgfos-nothink-1116: Loading Unsloth ===
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
Unsloth: Your Flash Attention 2 installation seems to be broken. Using Xformers instead. No performance changes will be seen.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2026.3.11: Fast Qwen3_5 patching. Transformers: 5.3.0.
   \\   /|    NVIDIA H100 80GB HBM3. Num GPUs = 1. Max memory: 79.205 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 9.0. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
The fast path is not available because one of the required library is not installed. Falling back to torch implementation. To install follow https://github.com/fla-org/flash-linear-attention#installation and https://github.com/Dao-AILab/causal-conv1d
Unsloth: QLoRA and full finetuning all not selected. Switching to 16bit LoRA.
Loading weights: 0%| | 0/617 [00:00
    stats = trainer.train()
            ^^^^^^^^^^^^^^^
  File "/root/unsloth_compiled_cache/UnslothSFTTrainer.py", line 68, in wrapper
    output = f(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 1424, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "", line 81, in _fast_inner_training_loop
  File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 1734, in _run_epoch
    tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/unsloth_compiled_cache/UnslothSFTTrainer.py", line 1389, in training_step
    return super().training_step(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "", line 68, in _unsloth_training_step
  File "/opt/conda/lib/python3.11/site-packages/accelerate/accelerator.py", line 2838, in backward
    loss.backward(**kwargs)
  File "/opt/conda/lib/python3.11/site-packages/torch/_tensor.py", line 626, in backward
    torch.autograd.backward(
  File "/opt/conda/lib/python3.11/site-packages/torch/autograd/__init__.py", line 347, in backward
    _engine_run_backward(
  File "/opt/conda/lib/python3.11/site-packages/torch/autograd/graph.py", line 823, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/autograd/function.py", line 307, in apply
    return user_fn(self, *args)
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/unsloth_zoo/gradient_checkpointing.py", line 612, in backward
    torch.autograd.backward(outputs_with_grad, args_with_grad)
  File "/opt/conda/lib/python3.11/site-packages/torch/autograd/__init__.py", line 347, in backward
    _engine_run_backward(
  File "/opt/conda/lib/python3.11/site-packages/torch/autograd/graph.py", line 823, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/autograd/function.py", line 307, in apply
    return user_fn(self, *args)
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 1710, in backward
    return impl_fn()
           ^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 1700, in impl_fn
    out = CompiledFunction._backward_impl(ctx, all_args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 2065, in _backward_impl
    out = call_func_at_runtime_with_args(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/utils.py", line 126, in call_func_at_runtime_with_args
    out = normalize_as_list(f(args))
                            ^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/_inductor/output_code.py", line 466, in __call__
    return self.current_callable(inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/torch/_inductor/utils.py", line 2128, in run
    return model(new_inputs)
           ^^^^^^^^^^^^^^^^^
  File "/tmp/torchinductor_root/gg/cgg4ncnprssvz5mer65fcitwk5y2e6sa7scujqtkbr7ehlyjhhpe.py", line 807, in call
    buf27 = empty_strided_cuda((s0, s1, 6144), (6144*s1, 6144, 1), torch.bfloat16)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.67 GiB. GPU 0 has a total capacity of 79.20 GiB of which 1.24 GiB is free. Process 6584 has 5.01 GiB memory in use.
Process 2899144 has 72.90 GiB memory in use. Of the allocated memory 71.93 GiB is allocated by PyTorch, and 179.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
=== SETUP/TRAINING FAILED (exit code 1) ===