IntelliSA-220m

IntelliSA-220m is a fine-tuned Salesforce/codet5p-220m model for detecting security vulnerabilities in Infrastructure as Code (IaC) configurations across Chef, Ansible, and Puppet.

Model Details

  • Base Model: Salesforce/codet5p-220m (220M parameters)
  • Architecture: T5ForSequenceClassification
  • Task: Binary classification (secure vs vulnerable)
  • License: MIT

Performance

Technology    F1 Score
Ansible       0.884
Puppet        0.756
Chef          0.698
Combined      0.779
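
As a quick arithmetic check on the scores above, the Combined figure coincides, to three decimals, with the unweighted mean of the three per-technology F1 scores. The card does not state how Combined was computed, so the sketch below only illustrates that coincidence; the variable names are illustrative.

```python
# Per-technology F1 scores copied from the Performance section.
f1_scores = {"Ansible": 0.884, "Puppet": 0.756, "Chef": 0.698}

# Unweighted (macro-style) mean across the three technologies.
# Assumption: this is only one possible way a combined score could be formed.
macro_mean = sum(f1_scores.values()) / len(f1_scores)

print(f"Unweighted mean F1: {macro_mean:.3f}")  # prints 0.779
```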

Usage

from transformers import T5ForSequenceClassification, RobertaTokenizer
import torch

model = T5ForSequenceClassification.from_pretrained("colemei/IntelliSA-220m")
tokenizer = RobertaTokenizer.from_pretrained("colemei/IntelliSA-220m")

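THRESHOLD = 0.61  # Scores at or above this value are classified as vulnerable

def predict_vulnerability(code_snippet):
    """Return (score, is_vulnerable) for a single IaC snippet."""
    # Tokenize the snippet; inputs longer than 512 tokens are truncated.
    inputs = tokenizer(code_snippet, return_tensors="pt", max_length=512,
                       truncation=True, padding=True)

    with torch.no_grad():
        outputs = model(**inputs)
        # Convert the two class logits into probabilities.
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # Index 1 holds the probability of the "vulnerable" class.
    score = predictions[0][1].item()
    is_vulnerable = score >= THRESHOLD
    return score, is_vulnerable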

# Example
code = """
cookbook_file '/tmp/file' do
  mode '0777'
end
"""
score, is_vulnerable = predict_vulnerability(code)
print(f"Vulnerability score: {score:.3f}, Vulnerable: {is_vulnerable}")
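
Because the usage example truncates input at 512 tokens, anything past that point in a long file is never scored. One possible workaround, sketched below, is to split a file into smaller chunks, score each chunk, and flag the file if any chunk reaches the threshold. The helper names `split_into_chunks` and `score_file`, the 40-line chunk size, and the max-over-chunks rule are all assumptions for illustration and are not part of the model card; a line-based split also does not guarantee that every chunk fits in 512 tokens.

```python
from typing import Callable, List, Tuple

THRESHOLD = 0.61  # same classification threshold as in the usage example


def split_into_chunks(source: str, max_lines: int = 40) -> List[str]:
    """Split a file into consecutive chunks of at most max_lines lines."""
    lines = source.splitlines()
    return ["\n".join(lines[i:i + max_lines])
            for i in range(0, len(lines), max_lines)]


def score_file(source: str,
               scorer: Callable[[str], float],
               max_lines: int = 40) -> Tuple[float, bool]:
    """Score every chunk and return the highest score plus the resulting flag.

    `scorer` is any function mapping a snippet to a vulnerability score,
    so this helper can be exercised without loading the model.
    """
    chunks = split_into_chunks(source, max_lines)
    if not chunks:
        # An empty file has nothing to score.
        return 0.0, False
    top = max(scorer(chunk) for chunk in chunks)
    return top, top >= THRESHOLD
```

With the model loaded as in the usage example, the scorer could be `lambda s: predict_vulnerability(s)[0]`. Taking the maximum is a conservative choice: a single risky chunk is enough to flag the whole file.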

Training

  • Learning Rate: 4e-5, Batch Size: 8, Epochs: 6, Weight Decay: 0.01
  • Framework: Transformers 4.45.2, PyTorch
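
For readers who want to reproduce a similar setup, the stated hyperparameters can be written as a plain dictionary whose keys match keyword arguments of `transformers.TrainingArguments`. This is a sketch, not the author's training script: the card does not say whether the batch size is per device or total, and it gives no optimizer, scheduler, or seed, so those are left at whatever defaults the caller chooses.

```python
# Hyperparameters stated in the Training section, keyed by the matching
# keyword-argument names of transformers.TrainingArguments.
training_config = {
    "learning_rate": 4e-5,
    "per_device_train_batch_size": 8,  # assumption: stated batch size is per device
    "num_train_epochs": 6,
    "weight_decay": 0.01,
}

# Hypothetical usage (output_dir is a placeholder, not from the model card):
# from transformers import TrainingArguments
# args = TrainingArguments(output_dir="out", **training_config)

for name, value in training_config.items():
    print(f"{name} = {value}")
```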

Citation

PLACEHOLDER