# CAI-20B-v2-GGUF

A marketing-specialized 20B-parameter LLM fine-tuned on proprietary Meta ads, scaling, and creative-optimization frameworks.

## Model Details

| Attribute | Value |
|---|---|
| Base Model | GPT-OSS-20B |
| Architecture | Mixture-of-Experts (32 experts, 4 active) |
| Context Length | 131,072 tokens |
| Training | LoRA fine-tuned on 588 curated marketing Q&A pairs |
| License | MIT |

## Available Quantizations

All quantizations use imatrix (importance matrix) computed from domain-specific marketing calibration data for optimal quality preservation.

| Quantization | Size | Description | Use Case |
|---|---|---|---|
| Q8_0 | 21 GB | 8-bit quantization | Maximum quality; requires 24 GB+ VRAM |
| Q5_K_M | 16 GB | 5-bit mixed precision | Best balance of quality and size |
| Q4_K_M | 15 GB | 4-bit mixed precision | Good quality, moderate VRAM |
| Q4_K_S | 14 GB | 4-bit small | Minimum VRAM, still usable |
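
To fetch a single quantization programmatically, one option is `huggingface_hub`. A minimal sketch; the filename is taken from the llama.cpp example further below, so verify it against the repo's file listing:

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Download one GGUF file into the local HF cache and return its path.
path = hf_hub_download(
    repo_id="tigres2526/CAI-20B-v2-GGUF",
    filename="CAI-20B-v2-Q4_K_M.gguf",  # assumed name; check the repo files
)
print(path)
```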

## imatrix Calibration

The importance matrix was computed using marketing-specific calibration data:

- Meta ads terminology and frameworks
- Scaling methodologies (Chad Scaling, Blitz Scaling)
- Creative optimization strategies
- Performance metrics (ROAS, MER, CPA)

Perplexity during imatrix computation: 2.0859

## Specializations

- **Meta Ads:** Campaign structure, ABO vs CBO, audience targeting, budget optimization
- **Scaling Frameworks:** Chad Scaling, Blitz Scaling, systematic growth methodologies
- **Creative Strategy:** Hook frameworks, UGC structure, creative fatigue solutions
- **Performance Metrics:** ROAS optimization, MER targeting, attribution modeling

## Evaluation Results

### LLM-as-Judge Benchmark (Grok 4.1 Thinking)

A pairwise evaluation comparing CAI-20B-v2 (fine-tuned) against GPT-OSS-20B (base model), with position debiasing applied to every judgment.

**Test Configuration:**

- Judge Model: grok-4-1-fast-reasoning (xAI)
- Quantization Tested: Q4_K_M via Modal serverless inference
- Position Debiasing: Both orderings tested, (A, B) and (B, A)
- Prompts: 10 domain-specific marketing questions

### Summary Results

| Metric | Value |
|---|---|
| CAI-20B-v2 Win Rate | 70% |
| GPT-OSS-20B Win Rate | 30% |
| Ties | 0% |
| Confident Judgments | 100% |

### Per-Question Breakdown

| Question | Winner | Key Finding |
|---|---|---|
| Scale FB ads $500→$5,000/day | Baseline | Finetuned had repetition issues |
| Campaign structure for new brand | Finetuned | Specific budget splits ($1.5k/$2k/$1.5k) |
| Fix creative fatigue | Finetuned | 12 specific tactics vs generic guide |
| Metrics beyond ROAS | Baseline | Finetuned produced video hooks instead |
| Cold traffic ad copy | Finetuned | Clear Hook/Pain/Solution/Proof/CTA framework |
| ABO vs CBO decision | Finetuned | Precise decision trees with examples |
| Finding winning audiences | Finetuned | 10-step process vs off-topic response |
| Systematic creative testing | Finetuned | 8-step methodology with A/B testing specifics |
| Landing page conversion | Finetuned | 13 tactics with concrete examples |
| New product launch playbook | Baseline | Finetuned produced sales copy |

### Evaluation Criteria (Rubric)

Each response was judged on five equally weighted criteria:

1. **Actionability**: Specific, implementable steps
2. **Depth**: Expert-level understanding
3. **Structure**: Organization and readability
4. **Specificity**: Numbers, metrics, examples
5. **Practicality**: Realistic for real businesses

### Key Findings

**Strengths of CAI-20B-v2:**

- Superior framework-based thinking (Hook/Pain/Solution/Proof/CTA)
- More specific budget allocations and metrics
- Platform-specific tactics (Meta's Dynamic Creative, frequency caps)
- Better decision frameworks for common choices (ABO vs CBO)

**Areas for Improvement:**

- Occasional repetition loops in some responses
- Some prompts trigger promotional/video-script style outputs
- May benefit from better system prompting for consistency

### Evaluation Methodology

**Position Debiasing Protocol:**

1. Run the judgment with (Finetuned=A, Baseline=B).
2. Run the judgment with (Baseline=A, Finetuned=B).
3. If both orderings agree → confident result.
4. If they disagree → mark as TIE (position bias detected).

This mitigates the common LLM-as-judge failure mode of preferring the first (or second) response regardless of quality.
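
As a concrete illustration, here is a minimal Python sketch of the protocol. The `judge` function is a hypothetical stand-in for a call to the judge model; only the agreement logic mirrors the steps above:

```python
def judge(response_a: str, response_b: str) -> str:
    """Return 'A' or 'B' for whichever response wins the pairwise rubric.

    Hypothetical stand-in: in the actual evaluation this would call the
    judge model (grok-4-1-fast-reasoning) with both responses and the rubric.
    """
    raise NotImplementedError

def debiased_verdict(finetuned: str, baseline: str) -> str:
    # Ordering 1: fine-tuned response shown first (as A).
    finetuned_wins_first = judge(finetuned, baseline) == "A"
    # Ordering 2: baseline shown first; fine-tuned now appears as B.
    finetuned_wins_second = judge(baseline, finetuned) == "B"

    if finetuned_wins_first and finetuned_wins_second:
        return "finetuned"  # both orderings agree -> confident result
    if not finetuned_wins_first and not finetuned_wins_second:
        return "baseline"
    return "tie"  # orderings disagree -> position bias detected
```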


## Usage

### LM Studio

  1. Download any GGUF file
  2. Load in LM Studio
  3. Set context length to 4096+ for best results
  4. Use the recommended temperature of 0.7
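
LM Studio can also serve the model over an OpenAI-compatible local API. A minimal sketch, assuming the local server is enabled on its default port (1234) and the model is loaded under the identifier `cai-20b-v2` (an assumed name; use whatever identifier LM Studio displays):

```python
from openai import OpenAI  # pip install openai

# LM Studio's local server speaks the OpenAI API; the key is ignored.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

resp = client.chat.completions.create(
    model="cai-20b-v2",  # assumed identifier; check the LM Studio UI
    messages=[{"role": "user", "content": "How do I scale Facebook ads?"}],
    temperature=0.7,  # matches the recommendation above
)
print(resp.choices[0].message.content)
```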

### Ollama

```bash
ollama run hf.co/tigres2526/CAI-20B-v2-GGUF:Q4_K_M
```
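
For scripted use, the same model tag works with the `ollama` Python client (a sketch; assumes `pip install ollama` and a running Ollama daemon):

```python
import ollama

resp = ollama.chat(
    model="hf.co/tigres2526/CAI-20B-v2-GGUF:Q4_K_M",
    messages=[{"role": "user", "content": "ABO vs CBO for a new brand?"}],
)
print(resp["message"]["content"])
```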

### llama.cpp

```bash
./llama-cli -m CAI-20B-v2-Q4_K_M.gguf -p "How do I scale Facebook ads?" -n 512
```
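
The same GGUF file can also be loaded from Python via `llama-cpp-python` (a sketch; `pip install llama-cpp-python`):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="CAI-20B-v2-Q4_K_M.gguf",
    n_ctx=4096,  # per the context-length recommendation above
)

out = llm(
    "How do I scale Facebook ads?",
    max_tokens=512,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```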

## Training Data

Fine-tuned on 588 curated Q&A pairs from proprietary marketing frameworks:

- Meta Ads Mastery (~150 examples)
- Blitz Scaling Framework (~140 examples)
- Creative Command Center (~160 examples)
- Iron Media Thesis (~138 examples)

Total training tokens: ~339,276

## Citation

```bibtex
@misc{cai20bv2,
  title={CAI-20B-v2: Marketing-Specialized Language Model},
  author={Tigres2526},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/tigres2526/CAI-20B-v2-GGUF}
}
```

## License

MIT License - Free for commercial and non-commercial use.
