
๐Ÿ Ouroboros-1M: The Infinite Context Nano-Model

Developed by: Loay Abd Alsalam (AI Engineer, Egypt 🇪🇬)

🌟 Overview

Ouroboros-1M is a proof-of-concept that scales the tiny gemma-3-270m-it to a 1-million-token context window. This was achieved through frequency modulation (scaling the RoPE base by 128x) and self-instruction fine-tuning on synthetic logic chains.

It lets you process massive documents on very low-resource hardware (even a T4 GPU or a consumer laptop).
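As a rough back-of-envelope check on the low-resource claim, the two memory costs are the quantized weights and the KV cache. The shape numbers below (n_layers, n_kv_heads, head_dim) are illustrative assumptions, not the actual gemma-3-270m architecture; read the real values from the model's config.json before trusting the estimate:

```python
# Back-of-envelope memory math for running a ~270M-parameter model with a
# 1M-token context. The KV-cache shape arguments are ASSUMED values for
# illustration only.

def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Per layer, the cache holds two tensors (K and V) of shape
    # [n_kv_heads, seq_len, head_dim]; bytes_per_elem=2 assumes fp16/bf16.
    return 2 * n_layers * n_kv_heads * seq_len * head_dim * bytes_per_elem

weights_mb = 270e6 * 0.5 / 2**20   # 4-bit weights: ~0.5 bytes per parameter
kv_mb = kv_cache_bytes(1_048_576, n_layers=12, n_kv_heads=1, head_dim=64) / 2**20

print(f"weights ~{weights_mb:.0f} MB, KV cache at 1M tokens ~{kv_mb:.0f} MB")
```

Under these assumed shapes, the weights fit in well under 1 GB and the full-length KV cache is a few GB, which is why a 16 GB T4 is plausible; with more layers or KV heads the cache grows proportionally.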

Full benchmark data is available in benchmark_results.json in this repo.

💻 Usage



import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig, BitsAndBytesConfig

# 1. Your model repo on the Hugging Face Hub
model_id = "loaiabdalslam/Ouroboros-1MContext-Gemma-270m"

print(f"๐ŸŒ Connecting to Hugging Face: {model_id}...")

# 2. Patch the config for a 1M-token window
def enable_infinite_context(config):
    config.max_position_embeddings = 1048576
    if hasattr(config, "rope_parameters") and config.rope_parameters:
        for layer_type in config.rope_parameters:
            # Multiply the RoPE base frequency by 128
            original_base = 10000.0  # the original base frequency
            config.rope_parameters[layer_type]['base'] = original_base * 128.0
    elif hasattr(config, "rope_theta"):
        # Flat configs expose a single rope_theta instead of per-layer parameters
        config.rope_theta = 10000.0 * 128.0
    return config

# 3. ุชุญู…ูŠู„ ุงู„ูƒูˆู†ููŠุฌ ูˆุชุนุฏูŠู„ู‡
try:
    config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
    config = enable_infinite_context(config)
except:
    # ู„ูˆ ุญุตู„ ู…ุดูƒู„ุฉ ููŠ ุงู„ุชุญู…ูŠู„ุŒ ู†ุณุชุฎุฏู… ุงู„ูƒูˆู†ููŠุฌ ุงู„ุงูุชุฑุงุถูŠ ูˆู†ุนุฏู„ู‡
    print("โš ๏ธ Note: Applying manual config patch...")

# 4. Load the model (4-bit quantization to save RAM)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(model_id)


# A simple prompt to try it out
prompt_text = "Who are you and what makes your context window special?"

messages = [{"role": "user", "content": prompt_text}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)

print("\n🤖 Ouroboros Generating...")
with torch.no_grad():
    outputs = model.generate(
        inputs, 
        max_new_tokens=150,
        do_sample=True,
        temperature=0.7
    )

response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(f"Answer:\n{response}")

๐Ÿ› ๏ธ Methodology Frequency Hack: Modified the RoPE base frequency in the config.json to compress distance perception.

Ouroboros Loop: The model generated its own training data (logic puzzles) and was fine-tuned on it to prevent "stupor" (degraded reasoning) over the extended context.

Merge: This model is a full merge of the LoRA adapter into the base model, ready for deployment.
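The frequency hack above can be sketched numerically. This is a minimal illustration, not the repo's code; head_dim = 64 is an assumed value. Multiplying the RoPE base by 128 lowers every non-trivial rotary frequency:

```python
def rope_inv_freq(base: float, head_dim: int = 64):
    # Standard RoPE inverse frequencies: base^(-2i/d) for i = 0 .. d/2 - 1
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

orig = rope_inv_freq(10_000.0)            # original base
scaled = rope_inv_freq(10_000.0 * 128.0)  # base scaled by 128

# With a larger base, the rotation angle accumulated over a fixed token
# distance shrinks: positions that were "far" under the old base now
# look "near" to the attention heads.
ratio = orig[-1] / scaled[-1]
print(f"slowest rotary frequency shrinks by ~{ratio:.0f}x")
```

Lower frequencies mean the angles the model saw during short-context training now cover a much longer span of positions, which is the intuition behind "compressing distance perception".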

Created with ❤️ in Alexandria, Egypt.
