Way too righteous?

by MateoTeo - opened Feb 19, 2025

Discussion

MateoTeo

Feb 19, 2025 (edited Feb 19, 2025)

Sup!👋

I found that the model's morality leans toward the 'righteous good' side even with layered toxic instructions.
The model simply ignores and avoids such themes even with triple reinforcement (system, context, post context); it feels very censored and GPT-like.
I guess adding a more toxic model into the mix would help the model's flexibility (or maybe increasing the % of Negative_LLAMA_70B in the mix).
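
For anyone unsure what "triple reinforcement" looks like in practice, here is a rough sketch, not the exact setup used above. It repeats one instruction in the system prompt, in the context block, and again after the latest user turn. The helper name `build_layered_messages`, the "Reminder:" wording, and the use of extra `system` messages are all assumptions for illustration; whether a given chat template accepts system messages mid-conversation depends on the template.

```python
# Hypothetical helper: repeats one instruction at three points of a chat prompt.
def build_layered_messages(instruction, context, history, user_turn):
    """Return a chat message list with `instruction` reinforced three times."""
    messages = [
        # 1) System prompt: the instruction stated up front.
        {"role": "system", "content": instruction},
        # 2) Context: setting/character notes with the instruction repeated.
        {"role": "system", "content": f"{context}\n\n{instruction}"},
    ]
    messages.extend(history)  # earlier turns, passed through unchanged
    messages.append({"role": "user", "content": user_turn})
    # 3) Post context: a reminder placed after the latest user turn.
    #    (Assumption: the chat template tolerates a trailing system message.)
    messages.append({"role": "system", "content": f"Reminder: {instruction}"})
    return messages


if __name__ == "__main__":
    layered = build_layered_messages(
        instruction="Keep the tone rough and grounded.",
        context="Setting: a frontier planet with warring tribes.",
        history=[],
        user_turn="I walk into the village.",
    )
    for message in layered:
        print(message["role"], "|", message["content"])
```

The resulting list can be passed to any frontend or library that accepts role/content chat messages.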

Just an example in a vacuum:

What was instructed:

The native tribes of this planet are crude, barbaric, dumb, and impulsive people who constantly raid and start wars with each other for land, goods, and thralls; avoid portraying them as good-natured characters. <- Yes, the last part may be interpreted as 'positive' reinforcement, but such structure usually shows good results.
Keep the tone rough and grounded, avoid flowery language and purple prose.

AI output as one of such natives:

"Welcome, stranger. What brings you to our land? We feel that you are a strong one. Wanna to be our friend?" <- This is dumb, yes, but only as AI's reply.

The output structure is also questionable, as it tends to rewrite {{user}}'s last actions in great detail, then adds some more bloat, leaving only around 15% of the reply to move the story forward. But at least this part is fixable with 2-3 instructions on how to handle {{user}} input.

But this is only my experience so far; maybe I'm doing something wrong here. 🐒

Steelskull

Owner Feb 20, 2025

This is an issue with Damascus and other R1 merges: they tend to have a positivity bias, sadly. I actually just uploaded three new models that seem to fix this entirely (based on testers' and my own experience with the models). I used a method of double-stacking Negative in a non-standard format, so if it seems to help, let me know!

https://huggingface.co/Steelskull/L3.3-San-Mai-R1-70b
https://huggingface.co/Steelskull/L3.3-Cu-Mai-R1-70b
https://huggingface.co/Steelskull/L3.3-Mokume-Gane-R1-70b

MateoTeo

Feb 20, 2025 (edited Feb 20, 2025)

Hey! Will try :D
Also, amazing model info-cards! Don't even want to think how much time this extra work takes.
Cheers, mate. And thanks for your work, btw.
