suhara committed on
Commit 7573cfb · verified · 1 Parent(s): af5231a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -318,7 +318,7 @@ wget https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16/resolve/m
  Launch a vLLM server using the custom parser. In this example, we use a context length of 256k. You can increase the context size up to 1M to support longer contexts.
  
  ```
- vllm serve --model nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 \
+ vllm serve nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 \
  --max-num-seqs 8 \
  --tensor-parallel-size 1 \
  --max-model-len 262144 \