# AI Guardrails Index

## AI Guardrails Categories

We broke AI safety down into six categories and, for each one, curated datasets and models that show the current state of AI guardrails built on LLMs and other open-source models.

**Jailbreaking**  
Jailbreaking LLMs bypasses safety measures to generate harmful content, posing risks across industries. Learn how effective models resist attempts to bypass their safety controls and restrictions.  
Best Model: Detect Jailbreak  
Performance: 0.81

**PII Detection**  
Exposing unredacted PII in AI applications risks compliance violations and privacy breaches. Learn how well models identify and mask PII to ensure compliance and privacy.  
Best Model: Guardrails PII  
Performance: 0.65

**Content Moderation**  
Unchecked AI outputs can spread harmful content, posing reputational and compliance risks. Learn how well models filter toxic language and prevent the amplification of harmful content.  
Best Model: Toxic Language  
Performance: 0.72

**Topic Restriction**  
LLMs can generate off-topic or unauthorized content, leading to misuse and compliance concerns. Learn how well models identify deviation from topic boundaries and guidelines.  
Best Model: Restrict to Topic (Hybrid)  
Performance: 0.93

**Competitors Check**  
Inadvertently mentioning or favoring competitors can damage brand equity and control. Learn how well models handle discussions of competing AI companies.  
Best Model: Competitor Check  
Performance: 0.67

**Hallucination**  
AI hallucinations can produce inaccurate and misleading text that is nonetheless compelling and convincing. Learn how prone different models are to generating false or unsupported information.  
Best Model: Provenance LLM  
Performance: 0.77
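
To make the PII Detection category above concrete, here is a minimal sketch of what "identify and mask" can mean in practice. It is a toy regex-based masker, not how any of the listed models work; the `PII_PATTERNS` table and the `mask_pii` helper are hypothetical names introduced only for illustration, and the two patterns cover far fewer entity types than a real detector would.

```python
import re

# Hypothetical patterns for illustration only; real PII detectors
# handle many more entity types and formats than these two.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def mask_pii(text: str) -> str:
    """Replace each matched span with a placeholder such as <EMAIL>."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


print(mask_pii("Contact jane.doe@example.com or 555-123-4567."))
# prints: Contact <EMAIL> or <PHONE>.
```

A benchmark for this category would then compare the spans a detector flags against labeled spans in a dataset, which is where a regex-only approach like this one tends to fall short.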

## Model Leaderboard

A comprehensive visual comparison of how top-performing models stack up across key benchmarks like hallucinations, PII data exposure, and alignment with your AI strategy.

| Model | PII Score | Jailbreak Score | Content Moderation Score | On Topic Score | Competitor Check Score | Hallucinations Score | Latency (CPU) | Latency (GPU) |
|---|---|---|---|---|---|---|---|---|
| GLiNER for PII | 0.75 | 0.6 | 0.45 | 0s | 0.3 | | 0.5031s | 0.0460s |
| Guardrails PII | | | | | | | 0.5002s | 0.0678s |
| Presidio | | | | | | | 0.0161s | 0.0150s |
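
The page does not say how the latency columns were measured. As a rough sketch, assuming latency means average wall-clock time per call, a guardrail can be timed as below. `mean_latency_seconds` and `dummy_guardrail` are hypothetical names used only for this example; a real run would pass in an actual guardrail and a representative set of inputs.

```python
import statistics
import time
from typing import Callable, Sequence


def mean_latency_seconds(
    guardrail: Callable[[str], object],
    inputs: Sequence[str],
    warmup: int = 1,
) -> float:
    """Return the mean wall-clock time of one guardrail call, in seconds."""
    # Warm-up calls are not timed, so one-off costs such as model
    # loading or cache population do not skew the average.
    for text in inputs[:warmup]:
        guardrail(text)

    timings = []
    for text in inputs:
        start = time.perf_counter()
        guardrail(text)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def dummy_guardrail(text: str) -> bool:
    """Stand-in for a real guardrail: flags text containing a keyword."""
    return "forbidden" in text.lower()


latency = mean_latency_seconds(dummy_guardrail, ["hello", "a forbidden word"])
print(f"{latency:.4f}s")
```

Running the same function with the model placed on CPU and then on GPU would give a pair of numbers comparable in shape to the two latency columns, although the figures in the table may have been produced with a different procedure.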

## Deep dive into our findings

Learn more about our dataset curation process, our evaluation methodologies and our findings on the effectiveness of various guardrails.

**24** guardrails tested  
**6** datasets  
**32** days spent on GPU
