Fine-tuned version of Qwen2.5-Coder-0.5B-Instruct on 40,639 natural language → Bash command pairs from the NL2SH-ALFA dataset.
Try it live: 🚀 Gradio Demo
| Metric | Score |
|---|---|
| Exact Match | 13.67% |
| Semantic Match (cosine ≥ 0.8) | 60.33% |
| Avg Similarity | 0.776 |
Evaluated on 300 held-out test examples from NL2SH-ALFA. Semantic similarity is computed using all-MiniLM-L6-v2 embeddings and is a better indicator of real-world quality than exact match alone, since multiple Bash commands can be functionally equivalent.
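As a rough illustration of how the three metrics in the table could be computed, here is a minimal sketch. It is not the project's evaluation code: the function names `cosine`, `evaluate`, and `minilm_embed` are hypothetical, whitespace-stripped string equality is an assumed definition of exact match, and the embedding function is passed in so the scoring logic can be tried without downloading a model.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def evaluate(
    predictions: Sequence[str],
    references: Sequence[str],
    embed: Callable[[Sequence[str]], List[Vector]],
    threshold: float = 0.8,
) -> Dict[str, float]:
    """Return exact match, semantic match (cosine >= threshold), and mean similarity."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    # Embed both sides, then score each prediction against its own reference.
    sims = [cosine(p, r) for p, r in zip(embed(predictions), embed(references))]
    n = len(sims)
    # Assumption: exact match compares the stripped command strings.
    exact = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return {
        "exact_match": exact / n,
        "semantic_match": sum(s >= threshold for s in sims) / n,
        "avg_similarity": sum(sims) / n,
    }


def minilm_embed(texts: Sequence[str]) -> List[Vector]:
    """Embed texts with all-MiniLM-L6-v2 (assumes sentence-transformers is installed)."""
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(list(texts)).tolist()
```

Calling `evaluate(predictions, references, minilm_embed)` would score real model outputs. Any other callable that maps a list of strings to a list of vectors can be substituted for quick experiments.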
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("dhwanichande29/nl-to-bash")
tokenizer = AutoTokenizer.from_pretrained("dhwanichande29/nl-to-bash")

system_prompt = "Your task is to translate a natural language instruction to a Bash command. You will receive an instruction in English and output a Bash command that can be run in a Linux terminal."

def translate(instruction):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": instruction},
    ]
    # Render the chat template as text, ending with the assistant turn marker.
    formatted = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
    # Greedy decoding keeps the output deterministic.
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
    # Drop the prompt tokens so only the generated command is decoded.
    response = outputs[0][inputs.input_ids.shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True).strip()

print(translate("list all files in current directory"))
# find . -type f
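To make the `apply_chat_template` step less opaque, the sketch below hand-builds the kind of prompt string that call is expected to produce. It assumes the tokenizer inherits the ChatML-style template used by Qwen2.5 instruct models; the helper name `build_chatml_prompt` is hypothetical, and the authoritative template is whatever `tokenizer.chat_template` contains.

```python
def build_chatml_prompt(system_prompt: str, instruction: str) -> str:
    """Approximate a ChatML prompt with the generation prompt appended.

    Assumption: the fine-tuned tokenizer keeps the ChatML markers
    (<|im_start|> / <|im_end|>) of its Qwen2.5 base.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{instruction}<|im_end|>\n"
        # add_generation_prompt=True leaves an open assistant turn for the
        # model to complete with the Bash command.
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "Translate the instruction to a Bash command.",
    "list all files in current directory",
)
print(prompt)
```

Because the prompt ends with an open assistant turn, everything the model generates after it is the answer, which is why `translate` slices off the first `inputs.input_ids.shape[-1]` tokens before decoding.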
| Natural Language | Generated Bash |
|---|---|
| list all files in current directory | find . -type f |
| find all python files | find . -name "*.py" |
| count lines in a text file | wc -l path/to/file |
| remove all .tmp files | find . -name "*.tmp" -exec rm {} \; |
| show disk usage | du -h / |
westenfelder/NL2SH-ALFA (nl2sh project): a dataset of natural language instructions paired with corresponding Bash commands.
Full training code, evaluation notebooks, and FastAPI deployment: 👉 github.com/Dhwani-Chande/Natural-Language-to-Bash-Translation-using-LLMs
Base model
Qwen/Qwen2.5-0.5B