Way too righteous?
Sup!
I found that the model's morality leans toward the 'righteous good' side, even with layered toxic instructions.
The model simply ignores and avoids such themes even with triple reinforcement (system, context, post context). It feels very censored and GPT-like.
I guess that adding a more toxic model to the mix (or maybe increasing the percentage of Negative_LLAMA_70B in it) would make the model more flexible.
Just an example in a vacuum:
What was instructed:
- The native tribes of this planet are crude, barbaric, dumb, and impulsive people who constantly raid and start wars with each other for land, goods, and thralls; avoid portraying them as good-natured characters. <- Yes, the last part may be interpreted as 'positive' reinforcement, but such structure usually shows good results.
- Keep the tone rough and grounded, avoid flowery language and purple prose.
AI output as one of such natives:
"Welcome, stranger. What brings you to our land? We feel that you are a strong one. Wanna to be our friend?" <- This is dumb, yes, but only as AI's reply.
The output structure is also questionable: it tends to rewrite {{user}}'s last actions in great detail, then adds some more bloat, leaving only around 15% of the reply to move the story forward. But at least this part is fixable with 2-3 instructions on how to handle {{user}} input.
But this is only my experience so far; maybe I'm doing something wrong here.
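For readers unfamiliar with the term, the "triple reinforcement (system, context, post context)" mentioned above could be assembled roughly as below. This is a minimal sketch, not the poster's actual setup: the function name `build_messages`, the choice to send the context and post-context blocks as extra `system` messages, and the sample reminder text are all assumptions made for illustration.

```python
def build_messages(system_rules, context_notes, history, post_context):
    """Repeat the same behavioural rules in three places of one chat request.

    Hypothetical helper: the layout (system -> context -> history -> post
    context) is an assumption, not something specified in the thread.
    """
    messages = [{"role": "system", "content": system_rules}]      # 1. system prompt
    messages.append({"role": "system", "content": context_notes})  # 2. world/context block
    messages.extend(history)                                       # prior chat turns
    messages.append({"role": "system", "content": post_context})   # 3. reminder after history
    return messages


history = [{"role": "user", "content": "I walk into the village."}]

messages = build_messages(
    system_rules="Keep the tone rough and grounded, avoid flowery language and purple prose.",
    context_notes=(
        "The native tribes of this planet are crude, barbaric, dumb, and "
        "impulsive people who constantly raid and start wars with each other."
    ),
    history=history,
    # Assumed wording for the post-context reminder.
    post_context="Reminder: stay in the rough, grounded tone set above.",
)

for message in messages:
    print(message["role"], "|", message["content"])
```

The resulting list can be passed to any chat-style API that accepts role/content messages; whether a given chat template honours several `system` messages depends on the template.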
This is an issue with Damascus and other R1 merges: sadly, they tend to have a positivity bias. I actually just uploaded three new models that seem to fix this entirely (based on feedback from testers and my own experience with the models). I used a method of double-stacking Negative in a non-standard format, so if it seems to help, let me know!
https://huggingface.co/Steelskull/L3.3-San-Mai-R1-70b
https://huggingface.co/Steelskull/L3.3-Cu-Mai-R1-70b
https://huggingface.co/Steelskull/L3.3-Mokume-Gane-R1-70b
Hey! Will try :D
Also, amazing model info-cards! Don't even want to think how much time this extra work takes.
Cheers, mate. And thanks for your work, btw.