# GPT-OSS: Finetuned on Custom Instruction Data
This repository contains a finetuned version of GPT-OSS, an open-source LLM, trained on custom, high-quality instruction/agent data. The goal of this finetuning is to enhance:
- Instruction following
- Reasoning
- Step-by-step task execution
- Conversational quality
- Task planning & agent-style responses
This finetuned model retains the capabilities of the base LLM while becoming more aligned, structured, and usable for real-world assistant-style tasks.
## Model Details

| Property | Details |
|---|---|
| Base Model | GPT-OSS (20B) |
| Finetuning Type | Supervised Finetuning (SFT) |
| Format Used | Merged LoRA, exported as FP16 / MXFP4 depending on release |
| Tokenizer | Same as the base GPT-OSS tokenizer |
| Architecture | Decoder-only Transformer |
| Context Length | As per the base model (typically 4k-8k tokens) |
## Training

### Training Objective

The goal of finetuning was to improve:
- Natural language understanding
- Structured reasoning
- Agent-style task completion
- Multi-turn conversation coherence
- Instruction following
- Helpfulness & clarity
## Dataset
The model was trained on a custom instruction dataset containing:
- Conversational instructions
- Agent-like ReAct-style prompts
- Structured reasoning demonstrations
- Multi-turn dialog tasks
- Task-planning instructions
- Domain-specific prompts for practical usage

All data was manually curated and aligned for safety and clarity. The dataset is private but can be replaced with your own custom instruction set.
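As a rough illustration (the actual dataset is private and its schema may differ), a single training record could follow a simple instruction/response shape like the hypothetical example below:

```python
# Hypothetical shape of one instruction-tuning record (illustrative only;
# the real dataset schema and contents are private).
example_record = {
    "instruction": "Plan the steps needed to summarize a meeting transcript.",
    "input": "",  # optional extra context for the task
    "output": (
        "1. Read the transcript and identify the speakers.\n"
        "2. Extract key decisions and action items.\n"
        "3. Write a concise summary grouped by topic."
    ),
}
```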
## How the Model Was Trained
Finetuning was performed in Google Colab using:
- HuggingFace Transformers
- PEFT / LoRA
- bitsandbytes
- A custom training notebook
The Colab notebook used for finetuning is included in a separate dataset repo:
Training Notebook: `gpt_oss_(20B)_Fine_tuning.ipynb`

(You may replace this with your actual Hugging Face link.)
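As a rough sketch of how such a LoRA-based SFT setup looks with Transformers and PEFT (the rank, target modules, and other settings below are illustrative assumptions, not the exact values used in the notebook):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_id = "openai/gpt-oss-20b"

# Load the base model; device_map="auto" spreads weights across available devices
# (requires the accelerate package).
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Illustrative LoRA configuration; the actual ranks and target modules used
# in training may differ.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

The wrapped model can then be trained on the instruction dataset with the standard `transformers` Trainer (or a similar SFT trainer).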
## Model Variants Available
| Variant Name | Description | Best Use |
|---|---|---|
| `merged_16bit` | FP16 merged model | vLLM, SGLang, servers |
| `mxfp4` | 4-bit quantized | Local GPU (4-8 GB VRAM), CPU inference |
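For reference, a `merged_16bit`-style variant is typically produced by folding the trained LoRA adapter back into the base weights. A minimal PEFT sketch, assuming a hypothetical adapter path:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_id = "openai/gpt-oss-20b"
adapter_path = "path/to/lora_adapter"  # hypothetical path to the trained adapter

# Load the base model in FP16 and attach the LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_path)

# Fold the adapter weights into the base weights and save a standalone FP16 model.
merged = model.merge_and_unload()
merged.save_pretrained("gpt_oss_finetuned_merged_16bit")
```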
## Intended Use

This model is useful for:
- AI assistants
- Chatbots
- Reasoning tasks
- Educational tools
- Task-execution agents
- General conversation
- Planning / workflow guidance
## Limitations
Like any LLM, GPT-OSS has limitations:
- May hallucinate facts
- Not suitable for legal, medical, or financial advice
- Not trained for harmful or unsafe domains
- May occasionally produce incorrect reasoning
## Safety
The dataset was manually curated to avoid:
- Toxic content
- Hate speech
- Violence
- Personal data
- Malware instructions

Nevertheless, always evaluate outputs before deploying to production.
## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Neo404/gpt_oss_finetuned"

# Load the finetuned model and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

prompt = "Explain how reinforcement learning works in simple terms."
inputs = tokenizer(prompt, return_tensors="pt")

# Generate up to 200 new tokens and decode the result.
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
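Since the model is instruction-tuned, you will usually get better results by formatting the prompt with the tokenizer's chat template rather than passing raw text. A minimal variant of the same example, assuming the tokenizer ships a chat template:

```python
# Build the prompt via the chat template (assumes the tokenizer provides one).
messages = [
    {"role": "user", "content": "Explain how reinforcement learning works in simple terms."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```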
## Deployment Notes
- For vLLM: use the `merged_16bit` model
- For a local GPU (e.g. GTX 1650): use the `mxfp4` 4-bit quantized version
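For serving with vLLM, a minimal offline-inference sketch looks roughly like this (assuming the model ID points at the merged FP16 release):

```python
from vllm import LLM, SamplingParams

# Load the merged FP16 variant; vLLM handles batching and KV-cache management.
llm = LLM(model="Neo404/gpt_oss_finetuned")
params = SamplingParams(temperature=0.7, max_tokens=200)

outputs = llm.generate(
    ["Explain how reinforcement learning works in simple terms."], params
)
print(outputs[0].outputs[0].text)
```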
## License
This model follows the license of the original GPT-OSS base model. Users must comply with the original terms when using derivative models.