IntelliSA-220m

IntelliSA-220m is a fine-tuned Salesforce/codet5p-220m model for detecting security vulnerabilities in Infrastructure as Code (IaC) configurations across Chef, Ansible, and Puppet.

Model Details

Base Model: Salesforce/codet5p-220m (220M parameters)
Architecture: T5ForSequenceClassification
Task: Binary classification (secure vs vulnerable)
License: MIT

Performance

Technology	F1 Score
Ansible	0.884
Puppet	0.756
Chef	0.698
Combined	0.779
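
For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it from confusion-matrix counts; the counts are hypothetical and are not taken from this model's evaluation. It also checks, as an observation only, that the unweighted mean of the three per-technology scores rounds to the Combined value. The card does not state how Combined was actually aggregated.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for illustration only.
example_f1 = f1_score(tp=80, fp=15, fn=25)

# Per-technology F1 scores as reported in the table above.
reported = {"Ansible": 0.884, "Puppet": 0.756, "Chef": 0.698}

# Unweighted mean; one possible aggregation, not confirmed by the card.
unweighted_mean = sum(reported.values()) / len(reported)

print(f"example F1: {example_f1:.3f}")
print(f"unweighted mean of reported F1: {unweighted_mean:.3f}")
```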

Usage

from transformers import T5ForSequenceClassification, RobertaTokenizer
import torch

model = T5ForSequenceClassification.from_pretrained("colemei/IntelliSA-220m")
tokenizer = RobertaTokenizer.from_pretrained("colemei/IntelliSA-220m")

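THRESHOLD = 0.61  # Scores at or above this value are classified as vulnerable

def predict_vulnerability(code_snippet):
    """Return (score, is_vulnerable) for a single IaC snippet."""
    # Inputs longer than 512 tokens are truncated
    inputs = tokenizer(code_snippet, return_tensors="pt", max_length=512,
                      truncation=True, padding=True)

    with torch.no_grad():  # Inference only; no gradients needed
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # Index 1 is used as the probability of the vulnerable class
    score = predictions[0][1].item()
    is_vulnerable = score >= THRESHOLD
    return score, is_vulnerable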

# Example
code = """
cookbook_file '/tmp/file' do
  mode '0777'
end
"""
score, is_vulnerable = predict_vulnerability(code)
print(f"Vulnerability score: {score:.3f}, Vulnerable: {is_vulnerable}")
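
The 0.61 threshold is stricter than picking the class with the larger logit, which for two classes is equivalent to a 0.5 cutoff. A minimal sketch in plain Python, using made-up logits rather than real model output, shows a borderline case where the two rules disagree. The helper names `softmax` and `decide` are illustrative and are not part of the model's API.

```python
import math

THRESHOLD = 0.61  # same value as in the usage snippet

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decide(logits, threshold=THRESHOLD):
    """Return (score, is_vulnerable), treating index 1 as the vulnerable class."""
    score = softmax(logits)[1]
    return score, score >= threshold

# Made-up logits: class 1 has the larger logit, but its probability is below 0.61.
borderline_score, borderline_flag = decide([0.0, 0.3])

# Made-up logits with a clear margin in favor of class 1.
confident_score, confident_flag = decide([-1.0, 1.0])

print(f"borderline: {borderline_score:.3f} -> {borderline_flag}")
print(f"confident:  {confident_score:.3f} -> {confident_flag}")
```

Under this rule, snippets scoring between 0.5 and 0.61 are reported as not vulnerable, which trades some recall for fewer false alarms.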

Training

Learning Rate: 4e-5, Batch Size: 8, Epochs: 6, Weight Decay: 0.01
Framework: Transformers 4.45.2, PyTorch
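
As a rough sketch, the listed hyperparameters could be collected into keyword arguments for `transformers.TrainingArguments`. The key names are real `TrainingArguments` parameters, but the mapping is an assumption: the card does not say whether the batch size is per device, and the output directory is a hypothetical placeholder.

```python
# Hyperparameters as listed on this card, keyed by TrainingArguments names.
training_kwargs = {
    "output_dir": "intellisa-220m-finetune",  # hypothetical path
    "learning_rate": 4e-5,
    "per_device_train_batch_size": 8,  # assumes the listed batch size is per device
    "num_train_epochs": 6,
    "weight_decay": 0.01,
}

# With transformers installed, these could be passed as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```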

Citation

PLACEHOLDER


Safetensors: model size 0.2B params, tensor type F32

