# CAI-20B-v2-GGUF

A marketing-specialized 20B-parameter LLM, fine-tuned on proprietary frameworks for Meta ads, scaling, and creative optimization.
## Model Details
| Attribute | Value |
|---|---|
| Base Model | GPT-OSS-20B |
| Architecture | Mixture-of-Experts (32 experts, 4 active) |
| Context Length | 131,072 tokens |
| Training | LoRA fine-tuned on 588 curated marketing Q&A pairs |
| License | MIT |
## Available Quantizations

All quantizations use an importance matrix (imatrix) computed from domain-specific marketing calibration data, which helps preserve quality under quantization.
| Quantization | Size | Description | Use Case |
|---|---|---|---|
| Q8_0 | 21 GB | 8-bit quantization | Maximum quality, requires 24GB+ VRAM |
| Q5_K_M | 16 GB | 5-bit mixed precision | Best balance of quality/size |
| Q4_K_M | 15 GB | 4-bit mixed precision | Good quality, moderate VRAM |
| Q4_K_S | 14 GB | 4-bit small | Minimum VRAM, still usable |
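To grab a single quantization rather than the whole repo, `huggingface-cli` can download one file at a time. A sketch, assuming the Q4_K_M file is named `CAI-20B-v2-Q4_K_M.gguf` as in the llama.cpp example below (check the repo's file list for the exact name):

```bash
# Download only the Q4_K_M GGUF into the current directory
huggingface-cli download tigres2526/CAI-20B-v2-GGUF \
  CAI-20B-v2-Q4_K_M.gguf --local-dir .
```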
## imatrix Calibration
The importance matrix was computed using marketing-specific calibration data:
- Meta ads terminology and frameworks
- Scaling methodologies (Chad Scaling, Blitz Scaling)
- Creative optimization strategies
- Performance metrics (ROAS, MER, CPA)
Perplexity during imatrix computation: 2.0859
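For reference, a minimal sketch of the llama.cpp workflow that computes and applies an imatrix. The calibration file `marketing_calib.txt` and the F16 source file `CAI-20B-v2-F16.gguf` are assumed names; the exact commands used for this repo are not published here.

```bash
# 1) Compute the importance matrix from calibration text
#    (llama-imatrix ships with llama.cpp)
./llama-imatrix -m CAI-20B-v2-F16.gguf -f marketing_calib.txt -o imatrix.dat

# 2) Quantize with the imatrix so the most important weights keep more precision
./llama-quantize --imatrix imatrix.dat CAI-20B-v2-F16.gguf CAI-20B-v2-Q4_K_M.gguf Q4_K_M
```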
## Specializations
- Meta Ads: Campaign structure, ABO vs CBO, audience targeting, budget optimization
- Scaling Frameworks: Chad Scaling, Blitz Scaling, systematic growth methodologies
- Creative Strategy: Hook frameworks, UGC structure, creative fatigue solutions
- Performance Metrics: ROAS optimization, MER targeting, attribution modeling
## Evaluation Results

### LLM-as-Judge Benchmark (Grok 4.1 Thinking)

Pairwise evaluation comparing CAI-20B-v2 (fine-tuned) against GPT-OSS-20B (base model), using position debiasing.
**Test Configuration:**
- Judge Model: `grok-4-1-fast-reasoning` (xAI)
- Quantization Tested: Q4_K_M via Modal serverless inference
- Position Debiasing: Both orderings tested, (A,B) and (B,A)
- Prompts: 10 domain-specific marketing questions
#### Summary Results
| Metric | Value |
|---|---|
| CAI-20B-v2 Win Rate | 70% |
| GPT-OSS-20B Win Rate | 30% |
| Ties | 0% |
| Confident Judgments | 100% |
#### Per-Question Breakdown
| Question | Winner | Key Finding |
|---|---|---|
| Scale FB ads $500→$5000/day | Baseline | Finetuned had repetition issues |
| Campaign structure for new brand | Finetuned | Specific budget splits ($1.5k/$2k/$1.5k) |
| Fix creative fatigue | Finetuned | 12 specific tactics vs generic guide |
| Metrics beyond ROAS | Baseline | Finetuned produced video hooks instead |
| Cold traffic ad copy | Finetuned | Clear Hook/Pain/Solution/Proof/CTA framework |
| ABO vs CBO decision | Finetuned | Precise decision trees with examples |
| Finding winning audiences | Finetuned | 10-step process vs off-topic response |
| Systematic creative testing | Finetuned | 8-step methodology with A/B testing specifics |
| Landing page conversion | Finetuned | 13 tactics with concrete examples |
| New product launch playbook | Baseline | Finetuned produced sales copy |
#### Evaluation Criteria (Rubric)

Each response was judged on five equally weighted criteria (a toy aggregation sketch follows the list):
- Actionability - Specific, implementable steps
- Depth - Expert-level understanding
- Structure - Organization and readability
- Specificity - Numbers, metrics, examples
- Practicality - Realistic for real businesses
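With equal weights, the overall score is simply the mean of the five criterion scores. A trivial sketch, assuming 1–10 scores per criterion (the values below are made-up placeholders, not real judge outputs):

```bash
# Equal-weight aggregate: mean of the five criterion scores (placeholder values)
scores=(8 7 9 8 7)  # actionability, depth, structure, specificity, practicality
total=0
for s in "${scores[@]}"; do total=$((total + s)); done
awk -v t="$total" 'BEGIN { printf "overall: %.1f\n", t / 5 }'
```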
#### Key Findings

**Strengths of CAI-20B-v2:**
- Superior framework-based thinking (Hook/Pain/Solution/Proof/CTA)
- More specific budget allocations and metrics
- Platform-specific tactics (Meta's Dynamic Creative, frequency caps)
- Better decision frameworks for common choices (ABO vs CBO)
**Areas for Improvement:**
- Occasional repetition loops in some responses
- Some prompts trigger promotional/video-script style outputs
- May benefit from better system prompting for consistency
#### Evaluation Methodology

**Position Debiasing Protocol:**
1. Run judgment with (Finetuned=A, Baseline=B)
2. Run judgment with (Baseline=A, Finetuned=B)
3. If both orderings agree → confident result
4. If they disagree → mark as TIE (position bias detected)

This controls for the common LLM-as-judge tendency to prefer the first or second response regardless of quality.
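A minimal sketch of that agreement rule, assuming the judge's winner from each ordering has already been parsed into a variable (the values below are placeholders):

```bash
# Combine the two ordering passes into a single verdict
verdict_ab="finetuned"  # winner when Finetuned=A, Baseline=B
verdict_ba="finetuned"  # winner when Baseline=A, Finetuned=B

if [ "$verdict_ab" = "$verdict_ba" ]; then
  echo "confident result: $verdict_ab"
else
  echo "TIE (position bias detected)"
fi
```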
## Usage

### LM Studio

- Download any GGUF file
- Load it in LM Studio
- Set context length to 4096+ for best results
- Recommended temperature: 0.7
### Ollama

```bash
ollama run hf.co/tigres2526/CAI-20B-v2-GGUF:Q4_K_M
```
### llama.cpp

```bash
./llama-cli -m CAI-20B-v2-Q4_K_M.gguf -c 4096 --temp 0.7 -p "How do I scale Facebook ads?" -n 512
```
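The same file also works with llama.cpp's OpenAI-compatible HTTP server; a sketch, with an arbitrary port:

```bash
# Serve the model over an OpenAI-compatible HTTP API
./llama-server -m CAI-20B-v2-Q4_K_M.gguf -c 4096 --port 8080
```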
## Training Data
Fine-tuned on 588 curated Q&A pairs from proprietary marketing frameworks:
- Meta Ads Mastery (~150 examples)
- Blitz Scaling Framework (~140 examples)
- Creative Command Center (~160 examples)
- Iron Media Thesis (~138 examples)
Total training tokens: ~339,276
## Citation

```bibtex
@misc{cai20bv2,
  title={CAI-20B-v2: Marketing-Specialized Language Model},
  author={Tigres2526},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/tigres2526/CAI-20B-v2-GGUF}
}
```
## License
MIT License - Free for commercial and non-commercial use.