# CAI-20B-v2-GGUF

A marketing-specialized 20B-parameter LLM, fine-tuned on proprietary frameworks for Meta ads, scaling, and creative optimization.
## Model Details
| Attribute | Value |
|---|---|
| Base Model | GPT-OSS-20B |
| Architecture | Mixture-of-Experts (32 experts, 4 active) |
| Context Length | 131,072 tokens |
| Training | LoRA fine-tuned on 588 curated marketing Q&A pairs |
| License | MIT |
## Available Quantizations

All quantizations use an importance matrix (imatrix) computed from domain-specific marketing calibration data, which helps preserve quality under quantization.
| Quantization | Size | Description | Use Case |
|---|---|---|---|
| Q8_0 | 21 GB | 8-bit quantization | Maximum quality, requires 24GB+ VRAM |
| Q5_K_M | 16 GB | 5-bit mixed precision | Best balance of quality/size |
| Q4_K_M | 15 GB | 4-bit mixed precision | Good quality, moderate VRAM |
| Q4_K_S | 14 GB | 4-bit small | Minimum VRAM, still usable |
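To grab a single quantization rather than the whole repo, `huggingface-cli` can download one file at a time. A sketch, assuming the Q4_K_M file is named `CAI-20B-v2-Q4_K_M.gguf` as in the llama.cpp example below (check the repo's file list for the exact name):

```bash
# Download only the Q4_K_M GGUF into the current directory
huggingface-cli download tigres2526/CAI-20B-v2-GGUF \
  CAI-20B-v2-Q4_K_M.gguf --local-dir .
```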
## imatrix Calibration
The importance matrix was computed using marketing-specific calibration data:
- Meta ads terminology and frameworks
- Scaling methodologies (Chad Scaling, Blitz Scaling)
- Creative optimization strategies
- Performance metrics (ROAS, MER, CPA)
Perplexity during imatrix computation: 2.0859
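For reference, a minimal sketch of the llama.cpp workflow that computes and applies an imatrix. The calibration file `marketing_calib.txt` and the F16 source file `CAI-20B-v2-F16.gguf` are assumed names; the exact commands used for this repo are not published here.

```bash
# 1) Compute the importance matrix from calibration text
#    (llama-imatrix ships with llama.cpp)
./llama-imatrix -m CAI-20B-v2-F16.gguf -f marketing_calib.txt -o imatrix.dat

# 2) Quantize with the imatrix so the most important weights keep more precision
./llama-quantize --imatrix imatrix.dat CAI-20B-v2-F16.gguf CAI-20B-v2-Q4_K_M.gguf Q4_K_M
```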
## Specializations
- Meta Ads: Campaign structure, ABO vs CBO, audience targeting, budget optimization
- Scaling Frameworks: Chad Scaling, Blitz Scaling, systematic growth methodologies
- Creative Strategy: Hook frameworks, UGC structure, creative fatigue solutions
- Performance Metrics: ROAS optimization, MER targeting, attribution modeling
## Evaluation Results

### LLM-as-Judge Benchmark (Grok 4.1 Thinking)

Pairwise evaluation comparing CAI-20B-v2 (fine-tuned) against GPT-OSS-20B (base model), using position debiasing.
**Test Configuration:**
- Judge Model: `grok-4-1-fast-reasoning` (xAI)
- Quantization Tested: Q4_K_M via Modal serverless inference
- Position Debiasing: Both orderings tested, (A,B) and (B,A)
- Prompts: 10 domain-specific marketing questions
#### Summary Results
| Metric | Value |
|---|---|
| CAI-20B-v2 Win Rate | 70% |
| GPT-OSS-20B Win Rate | 30% |
| Ties | 0% |
| Confident Judgments | 100% |
#### Per-Question Breakdown
| Question | Winner | Key Finding |
|---|---|---|
| Scale FB ads $500→$5000/day | Baseline | Finetuned had repetition issues |
| Campaign structure for new brand | Finetuned | Specific budget splits ($1.5k/$2k/$1.5k) |
| Fix creative fatigue | Finetuned | 12 specific tactics vs generic guide |
| Metrics beyond ROAS | Baseline | Finetuned produced video hooks instead |
| Cold traffic ad copy | Finetuned | Clear Hook/Pain/Solution/Proof/CTA framework |
| ABO vs CBO decision | Finetuned | Precise decision trees with examples |
| Finding winning audiences | Finetuned | 10-step process vs off-topic response |
| Systematic creative testing | Finetuned | 8-step methodology with A/B testing specifics |
| Landing page conversion | Finetuned | 13 tactics with concrete examples |
| New product launch playbook | Baseline | Finetuned produced sales copy |
#### Evaluation Criteria (Rubric)

Each response was judged on five equally weighted criteria (a toy aggregation sketch follows the list):
- Actionability - Specific, implementable steps
- Depth - Expert-level understanding
- Structure - Organization and readability
- Specificity - Numbers, metrics, examples
- Practicality - Realistic for real businesses
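With equal weights, the overall score is simply the mean of the five criterion scores. A trivial sketch, assuming 1–10 scores per criterion (the values below are made-up placeholders, not real judge outputs):

```bash
# Equal-weight aggregate: mean of the five criterion scores (placeholder values)
scores=(8 7 9 8 7)  # actionability, depth, structure, specificity, practicality
total=0
for s in "${scores[@]}"; do total=$((total + s)); done
awk -v t="$total" 'BEGIN { printf "overall: %.1f\n", t / 5 }'
```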
#### Key Findings

**Strengths of CAI-20B-v2:**
- Superior framework-based thinking (Hook/Pain/Solution/Proof/CTA)
- More specific budget allocations and metrics
- Platform-specific tactics (Meta's Dynamic Creative, frequency caps)
- Better decision frameworks for common choices (ABO vs CBO)
**Areas for Improvement:**
- Occasional repetition loops in some responses
- Some prompts trigger promotional/video-script style outputs
- May benefit from better system prompting for consistency
#### Evaluation Methodology

**Position Debiasing Protocol:**
1. Run judgment with (Finetuned=A, Baseline=B)
2. Run judgment with (Baseline=A, Finetuned=B)
3. If both orderings agree → confident result
4. If they disagree → mark as TIE (position bias detected)

This controls for the common LLM-as-judge tendency to prefer the first or second response regardless of quality.
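A minimal sketch of that agreement rule, assuming the judge's winner from each ordering has already been parsed into a variable (the values below are placeholders):

```bash
# Combine the two ordering passes into a single verdict
verdict_ab="finetuned"  # winner when Finetuned=A, Baseline=B
verdict_ba="finetuned"  # winner when Baseline=A, Finetuned=B

if [ "$verdict_ab" = "$verdict_ba" ]; then
  echo "confident result: $verdict_ab"
else
  echo "TIE (position bias detected)"
fi
```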
## Usage

### LM Studio

- Download any GGUF file
- Load it in LM Studio
- Set context length to 4096+ for best results
- Recommended temperature: 0.7
### Ollama

```bash
ollama run hf.co/tigres2526/CAI-20B-v2-GGUF:Q4_K_M
```
### llama.cpp

```bash
./llama-cli -m CAI-20B-v2-Q4_K_M.gguf -c 4096 --temp 0.7 -p "How do I scale Facebook ads?" -n 512
```
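The same file also works with llama.cpp's OpenAI-compatible HTTP server; a sketch, with an arbitrary port:

```bash
# Serve the model over an OpenAI-compatible HTTP API
./llama-server -m CAI-20B-v2-Q4_K_M.gguf -c 4096 --port 8080
```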
## Training Data
Fine-tuned on 588 curated Q&A pairs from proprietary marketing frameworks:
- Meta Ads Mastery (~150 examples)
- Blitz Scaling Framework (~140 examples)
- Creative Command Center (~160 examples)
- Iron Media Thesis (~138 examples)
Total training tokens: ~339,276
## Citation

```bibtex
@misc{cai20bv2,
  title={CAI-20B-v2: Marketing-Specialized Language Model},
  author={Tigres2526},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/tigres2526/CAI-20B-v2-GGUF}
}
```
## License
MIT License - Free for commercial and non-commercial use.