paulpak58 committed
Commit 19a7cba · verified · 1 Parent(s): 4846f70

Update README.md

Files changed (1): README.md (+51 -1)
README.md CHANGED
@@ -205,7 +205,57 @@ You can directly run and test the model with this [Colab notebook](https://colab

### 2. vLLM

- vLLM support is coming soon!
+ You can run the model in [`vLLM`](https://github.com/vllm-project/vllm) by building from source:
+
+ ```bash
+ git clone https://github.com/vllm-project/vllm.git
+ cd vllm
+ pip install -e . -v
+ ```
+
+ Here is an example of how to use it for inference:
+
+ ```python
+ from vllm import LLM, SamplingParams
+
+ prompts = [
+     [
+         {
+             "content": "What is C. elegans?",
+             "role": "user",
+         },
+     ],
+     [
+         {
+             "content": "Say hi in JSON format",
+             "role": "user",
+         },
+     ],
+     [
+         {
+             "content": "Define AI in Spanish",
+             "role": "user",
+         },
+     ],
+ ]
+
+ sampling_params = SamplingParams(
+     temperature=0.3,
+     min_p=0.15,
+     repetition_penalty=1.05,
+     max_tokens=30
+ )
+
+ llm = LLM(model="LiquidAI/LFM2-8B-A1B", dtype="bfloat16")
+
+ outputs = llm.chat(prompts, sampling_params)
+
+ for i, output in enumerate(outputs):
+     prompt = prompts[i][0]["content"]
+     generated_text = output.outputs[0].text
+     print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
+ ```
+

### 3. llama.cpp
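For reference, the `llm.chat` call added in this commit can also be driven through `llm.generate` by rendering the chat template yourself. The following is a minimal sketch, not part of the commit, and it assumes `transformers` is installed alongside the source-built vLLM:

```python
# Sketch: equivalent inference via llm.generate, applying the chat template
# explicitly (llm.chat performs this step internally).
# Assumption: `transformers` is available in the same environment as vLLM.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "LiquidAI/LFM2-8B-A1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Render a single-turn conversation into the model's prompt format.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is C. elegans?"}],
    tokenize=False,
    add_generation_prompt=True,
)

sampling_params = SamplingParams(temperature=0.3, min_p=0.15, max_tokens=30)
llm = LLM(model=model_id, dtype="bfloat16")

# generate() takes raw prompt strings instead of chat messages.
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```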