Update README.md
README.md CHANGED
@@ -205,7 +205,57 @@ You can directly run and test the model with this [Colab notebook](https://colab
### 2. vLLM
You can run the model in [`vLLM`](https://github.com/vllm-project/vllm) by building from source:

```bash
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e . -v
```
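If the build succeeds, a quick sanity check (not part of the upstream instructions) is to confirm that the editable install imports cleanly:

```bash
# Print the installed vLLM version to verify the source build is on the path
python -c "import vllm; print(vllm.__version__)"
```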
Here is an example of how to use it for inference:

```python
from vllm import LLM, SamplingParams

# Three independent single-turn conversations, batched together.
prompts = [
    [
        {
            "content": "What is C. elegans?",
            "role": "user",
        },
    ],
    [
        {
            "content": "Say hi in JSON format",
            "role": "user",
        },
    ],
    [
        {
            "content": "Define AI in Spanish",
            "role": "user",
        },
    ],
]

sampling_params = SamplingParams(
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_tokens=30,
)

llm = LLM(model="LiquidAI/LFM2-8B-A1B", dtype="bfloat16")

# llm.chat applies the model's chat template to each conversation.
outputs = llm.chat(prompts, sampling_params)

for i, output in enumerate(outputs):
    prompt = prompts[i][0]["content"]
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```
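For serving rather than offline inference, vLLM's standard OpenAI-compatible server should also work once the build above succeeds. This is a minimal sketch, not part of the upstream README; it assumes the default host and port:

```bash
# Start an OpenAI-compatible server (listens on http://localhost:8000 by default)
vllm serve LiquidAI/LFM2-8B-A1B

# From another shell, send a chat completion request
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "LiquidAI/LFM2-8B-A1B",
        "messages": [{"role": "user", "content": "What is C. elegans?"}],
        "max_tokens": 30
      }'
```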
### 3. llama.cpp