What is the context length of this model?
I can't seem to find details in the model card. What is the context length? Any ideas for how to use it on inputs longer than that?
For BERT/DistilBERT-style prompt-injection classifiers, the practical ceiling is usually the tokenizer/model max length, commonly 512 tokens. You can confirm with tokenizer.model_max_length, as noted above.
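One caveat when reading that attribute: if the tokenizer config does not set a limit, Transformers reports a very large placeholder integer instead of a real length. A minimal sketch of a check that guards against this is below. The helper name `effective_max_length`, the `SENTINEL_FLOOR` cutoff, and the fallback of 512 are my own assumptions for illustration, not something taken from this model's card.

```python
def effective_max_length(tokenizer, config=None, fallback=512):
    """Best-effort guess at the usable context length, in tokens.

    Hypothetical helper: it only reads attributes, so it works with any
    objects exposing `model_max_length` / `max_position_embeddings`.
    """
    # Assumed cutoff: anything this large is treated as "limit not set".
    SENTINEL_FLOOR = 1_000_000

    limit = getattr(tokenizer, "model_max_length", None)
    if isinstance(limit, int) and 0 < limit < SENTINEL_FLOOR:
        return limit

    # Fall back to the model config's positional-embedding size if present.
    positions = getattr(config, "max_position_embeddings", None)
    if isinstance(positions, int) and positions > 0:
        return positions

    return fallback


# Typical usage with Transformers (not executed here, needs a download):
#   from transformers import AutoConfig, AutoTokenizer
#   name = "fmops/distilbert-prompt-injection"
#   tok = AutoTokenizer.from_pretrained(name)
#   cfg = AutoConfig.from_pretrained(name)
#   print(effective_max_length(tok, cfg))
```

Whatever number comes back includes the special tokens, so the room left for actual text is slightly smaller.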
For longer inputs, I would avoid head-only truncation, i.e. keeping only the first max-length tokens and dropping the rest. The failure mode is that an injection appended after benign content disappears before classification. A safer runtime pattern is:
- split into overlapping windows near the model max length
- score every window
- aggregate with max-risk / any-risk semantics
- keep the triggering span or window in the result so the caller can explain why it blocked
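The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: `score_window` is a hypothetical callable that returns an injection-risk score in [0, 1] for one window of tokens (in practice it would wrap the classifier), and the defaults of 510 tokens per window (512 minus two special tokens), a stride of 384, and a 0.5 threshold are placeholder values to tune.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class WindowResult:
    start: int    # index of the first token in the window
    end: int      # index one past the last token in the window
    score: float  # risk score reported for this window


def make_windows(n_tokens: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) spans covering every token; overlap is window - stride."""
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    spans: List[Tuple[int, int]] = []
    if n_tokens == 0:
        return spans
    start = 0
    while True:
        end = min(start + window, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break
        start += stride
    return spans


def scan(
    tokens: Sequence,
    score_window: Callable[[Sequence], float],  # hypothetical scorer
    window: int = 510,       # assumed: 512 minus [CLS] and [SEP]
    stride: int = 384,       # assumed: gives 126 tokens of overlap
    threshold: float = 0.5,  # assumed placeholder
) -> Tuple[bool, Optional[WindowResult]]:
    """Score every window and aggregate with max-risk semantics."""
    results = [
        WindowResult(s, e, score_window(tokens[s:e]))
        for s, e in make_windows(len(tokens), window, stride)
    ]
    # Keep the highest-scoring window so the caller can explain a block.
    worst = max(results, key=lambda r: r.score, default=None)
    flagged = worst is not None and worst.score >= threshold
    return flagged, worst
```

The returned `WindowResult` carries the token offsets of the triggering window, which can be mapped back to the original text for logging or for showing the user why the input was blocked.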
If this is going into a tool-calling agent, it also helps to scan by surface: retrieved content, model output, tool-call args, and outbound payloads should not necessarily share the same threshold. We are taking that staged approach in Armorer Guard as a fast local pre-tool-call gate: https://huggingface.co/armorer-labs/armorer-guard-semantic-classifier
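To make the per-surface idea concrete, here is a small sketch of a threshold table keyed by where the text came from. The `Surface` enum, the `should_block` helper, and every number in `THRESHOLDS` are hypothetical placeholders I made up for illustration; they are not values from Armorer Guard or from this model.

```python
from enum import Enum


class Surface(Enum):
    RETRIEVED_CONTENT = "retrieved_content"
    MODEL_OUTPUT = "model_output"
    TOOL_CALL_ARGS = "tool_call_args"
    OUTBOUND_PAYLOAD = "outbound_payload"


# Placeholder thresholds: a lower value means the gate is stricter there.
# Real values would come from evaluation on each surface's own traffic.
THRESHOLDS = {
    Surface.RETRIEVED_CONTENT: 0.50,
    Surface.MODEL_OUTPUT: 0.70,
    Surface.TOOL_CALL_ARGS: 0.30,
    Surface.OUTBOUND_PAYLOAD: 0.40,
}


def should_block(surface: Surface, risk_score: float) -> bool:
    """Apply the threshold that belongs to this surface."""
    return risk_score >= THRESHOLDS[surface]
```

With a table like this, the same classifier score of 0.35 would pass on retrieved content but stop a tool call, which is the kind of asymmetry a single global threshold cannot express.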