suhara committed · Commit 00bd6d1 (verified) · 1 Parent(s): 7573cfb

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -303,7 +303,7 @@ print(tokenizer.decode(outputs[0]))
  ### Use it with vLLM

  For more detailed information on how to use the model with vLLM, please see [this cookbook](https://github.com/NVIDIA-NeMo/Nemotron/blob/main/usage-cookbook/Nemotron-3-Nano/vllm_cookbook.ipynb).
- If you are on Jetson Thor, please use this vllm container: `ghcr.io/nvidia-ai-iot/vllm:latest-jetson-thor`.
+ If you are on Jetson Thor or DGX Spark, please use [this vllm container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/vllm?version=25.12.post1-py3).

  ```
  pip install -U "vllm>=0.12.0"
@@ -837,7 +837,7 @@ The following table depicts our sample distribution for the 6 languages and 5 tr
  ## Inference

  - Engines: HF, vLLM, TRT-LLM, SGLang, Llama.cpp
- - Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB, Jetson Thor
+ - Test Hardware: NVIDIA A100 80GB, H100 80GB, B200 192GB, RTX PRO 6000 96GB, Jetson Thor, DGX Spark


  ## Ethical Considerations