πŸ“˜ GPT-OSS β€” Finetuned on Custom Instruction Data

This repository contains GPT-OSS, a finetuned version of the open-source GPT-OSS LLM trained on custom, high-quality instruction/agent data. The finetuning aims to enhance:

- Instruction following
- Reasoning
- Step-by-step task execution
- Conversational quality
- Task planning & agent-style responses

This finetuned model retains the capabilities of the base LLM while becoming more aligned, structured, and usable for real-world assistant-style tasks.

πŸš€ Model Details

| Property | Details |
|---|---|
| Base Model | GPT-OSS 20B (openai/gpt-oss-20b) |
| Finetuning Type | Supervised Finetuning (SFT) |
| Format Used | Merged LoRA β†’ FP16 / MXFP4, depending on release |
| Tokenizer | Same as the base GPT-OSS tokenizer |
| Architecture | Decoder-only Transformer |
| Context Length | As per the base model (128k tokens) |

πŸ› οΈ Training Training Objective The goal of finetuning was to improve:

- natural language understanding
- structured reasoning
- agent-style task completion
- multi-turn conversation coherence
- instruction following
- helpfulness & clarity

πŸ“š Dataset

The model was trained on a custom instruction dataset containing:

- Conversational instructions
- Agent-like ReAct-style prompts
- Structured reasoning demonstrations
- Multi-turn dialog tasks
- Task-planning instructions
- Domain-specific prompts for practical usage

All data was manually curated and aligned for safety and clarity. The dataset is private but can be replaced with your own custom instruction set.
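If you supply your own data, the sketch below shows a hypothetical chat-style record layout; the schema of the actual private dataset is not published, so the field names are assumptions.

```python
# Hypothetical record layout for a chat-style instruction dataset (one JSON
# object per line in a .jsonl file); field names are illustrative assumptions.
example_record = {
    "messages": [
        {"role": "system", "content": "You are a helpful task-planning assistant."},
        {"role": "user", "content": "Plan the steps to publish a short blog post."},
        {"role": "assistant", "content": "1. Choose a topic\n2. Outline\n3. Draft\n4. Edit\n5. Publish"},
    ]
}

# Such records can be loaded with the datasets library, e.g.:
# from datasets import load_dataset
# dataset = load_dataset("json", data_files="instructions.jsonl", split="train")
```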

πŸ§ͺ How the Model Was Trained

Finetuning was performed in Google Colab using:

- Hugging Face Transformers
- PEFT / LoRA
- bitsandbytes
- a custom training notebook
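The exact hyperparameters live in the notebook linked below; as a rough orientation, a minimal LoRA SFT setup of this kind looks like the following sketch (base model ID from this card; rank, dropout, and target modules are illustrative assumptions, not the values actually used).

```python
# Minimal sketch of a LoRA SFT setup with PEFT; hyperparameters are illustrative,
# not the values from the actual training notebook.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "openai/gpt-oss-20b"  # base model named in this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach low-rank adapters to the attention projections (illustrative targets)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights is trained

# From here, train with a standard Trainer / SFT loop on the instruction data,
# then merge the adapters into the base weights for release.
```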

The Colab notebook used for finetuning is included in a separate dataset repo:

πŸ“„ Training Notebook: gpt_oss_(20B)_Fine_tuning.ipynb

(You may replace with your actual HF link.)

πŸ“¦ Model Variants Available

| Variant Name | Description | Best Use |
|---|---|---|
| merged_16bit | FP16 merged model | vLLM, SGLang, servers |
| mxfp4 | 4-bit quantized | Local GPU (4–8 GB), CPU inference |
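The merged_16bit variant is the LoRA adapter merged back into the base weights. A minimal sketch of how such a merge is typically produced with PEFT follows; the adapter path and output directory are placeholders, not actual artifacts from this release.

```python
# Sketch: merging LoRA adapters into the base model to produce an FP16 checkpoint.
# Paths are placeholders; the real merge was performed in the training notebook.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "path/to/lora_adapter").merge_and_unload()
merged.save_pretrained("gpt_oss_finetuned_merged_16bit")
```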

πŸ’‘ Intended Use

This model is useful for:

- AI assistants
- Chatbots
- Reasoning tasks
- Educational tools
- Task-execution agents
- General conversation
- Planning / workflow guidance

⚠️ Limitations

Like any LLM, GPT-OSS has limitations:

- May hallucinate facts
- Not suitable for legal/medical/financial advice
- Not trained for harmful or unsafe domains
- May occasionally produce incorrect reasoning

πŸ”’ Safety

The dataset was manually curated to avoid:

- Toxic content
- Hate speech
- Violence
- Personal data
- Malware instructions

Nevertheless, always evaluate outputs before deploying to production.

▢️ Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Neo404/gpt_oss_finetuned"

# Load the tokenizer and the merged FP16 model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

prompt = "Explain how reinforcement learning works in simple terms."

# Tokenize the prompt and generate up to 200 new tokens
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
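Since the model was finetuned on multi-turn instruction data, prompts are generally better passed through the tokenizer's chat template than as raw text; a short sketch reusing the tokenizer and model loaded above:

```python
# Optional: format the prompt with the tokenizer's chat template instead of raw text.
messages = [
    {"role": "user", "content": "Explain how reinforcement learning works in simple terms."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```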

🌐 Deployment Notes

- For vLLM: use the merged_16bit model
- For a local GPU (e.g. GTX 1650): use the mxfp4 4-bit quantized version
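As an illustration, the merged FP16 weights can be used for offline inference with vLLM's Python API; the repo ID below is the one from the usage example and is assumed to point at the merged_16bit variant.

```python
# Sketch: offline inference with vLLM using the merged FP16 weights (repo ID assumed).
from vllm import LLM, SamplingParams

llm = LLM(model="Neo404/gpt_oss_finetuned", dtype="float16")
params = SamplingParams(temperature=0.7, max_tokens=200)
outputs = llm.generate(
    ["Explain how reinforcement learning works in simple terms."], params
)
print(outputs[0].outputs[0].text)
```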

πŸ“œ License

This model follows the license of the original GPT-OSS base model. Users must comply with the original terms when using derivative models.
