Swift Security Docs
  • Introduction to Swift Security
  • Onboarding
    • Tenant Setup
    • Product Deployments
      • Browser Extension
      • LLM Guardrails
        • LLM Guardrails API Integration
      • VS Code IDE Extension
    • Directory Sync
      • Configuring Google Directory Sync
      • Configuring Microsoft Directory Sync
    • MDM
      • Extension Deployment Via Google Workspace
      • Extension deployment via google workspace + MDM at device level
      • Extension Deployment Via Microsoft Intune
        • Chromium Browsers in Windows
        • Edge Browsers in Windows
        • Firefox Browsers in Windows
      • Extension Deployment Via Kandji
        • Chromium Browsers in Mac
    • Infrastructure
      • SaaS Model
      • Hybrid deployment (coming soon)
  • SSO (Single Sign-On)
    • SSO Configurations identity provider - Google workspace
  • Administrative Guide
    • Console Users
      • Role Creation
      • RBAC General Settings for Login Methods (for the Console)
      • User Creation
      • SSO login(okta)
    • Swift Detection Engines
      • Data Identifiers
        • Custom Data Identifiers
      • EDM Dictionaries
        • EDM Rule
        • EDM Profile
        • EDM Extension Policy
      • Data Rules
        • Custom Rules
      • Data Profiles
      • LLM Guardrail Scanners
        • Data Protection Scanner
        • Gibberish Scanner
        • Ban Substrings Scanner
        • Invisible Text Scanner
        • Code Scanner
        • Language Scanner
        • Sentiment Analysis Scanner
        • Jailbreak Scanner
        • Toxicity Scanner
        • Prompt Injection Scanner
        • Token Limit Scanner
        • Reading Time Scanner
        • Language Same Scanner
        • No Refusal Scanner
        • Factual Consistency Scanner
        • Bias Detection Scanner
        • URL Reachability Scanner
        • Nudity Scanner
        • Gender Scanner
        • Celebrity Scanner
        • Face Scanner
        • Race Scanner
        • Performance and Benchmark
    • Browser Extension
      • Extension Installation
      • Granular Policies
        • Control URL access
        • Protect company data
        • Protect against Threats (Coming Soon)
      • Extension Alerts
      • Extension Events
      • Extension Popups
      • Browser Extension Coverage
    • LLM Guardrails
      • LLM Guardrails Policies
      • LLM Guardrails Alert
      • LLM Guardrails Events
    • Regulation Laws
  • Assets
    • Applications
    • Users
    • Extensions
  • Integration
    • Notification
      • Jira
      • ServiceNow
      • Slack
      • Splunk
    • Forensic
    • Feature
      • Rules Glossary
        • United States
        • Canada
        • Latin America
        • European Union
        • Australia
        • APAC (Asia-Pacific)
        • EMEA Countries
        • Others
      • Data identifiers Glossary
        • United States
        • Canada
        • Latin America
        • European Union
        • Australia
        • APAC (Asia-Pacific)
        • EMEA Countries
        • Others
      • Supported MIME Types
      • Supported OCR Format
    • Manage unauthorized access from unmanaged browser
  • Settings
    • Manage Reasons
    • Audit Log
  • Release Notes
    • Version - 1.27
    • Version - 1.26
    • Version - 1.25
    • Version - 1.24
    • Version - 1.23
    • Version - 1.16
    • Version - 1.15
    • Version - 1.14
    • Version - 1.13
    • Version - 1.12
    • Version - 1.11
    • Version - 1.10
    • Version - 1.09
    • Version - 1.08
    • Version - 1.07
    • Version - 1.06
    • Version - 1.05
    • Version - 1.04
    • Version - 1.03
    • Version - 1.02
    • Version - 1.01
Powered by GitBook
On this page
  1. Administrative Guide
  2. Swift Detection Engines
  3. LLM Guardrail Scanners

Jailbreak Scanner

(Input scanner)

PreviousSentiment Analysis ScannerNextToxicity Scanner

Last updated 9 months ago

The Jailbreak Scanner is designed to identify and prevent attempts to bypass security restrictions within AI systems.

How it works

The scanner analyzes the content of the prompt for patterns or keywords commonly associated with jailbreak attempts. This includes examining the text for instructions or commands designed to bypass security measures.

Jailbreak Detection: If the text is classified as Jailbreak, the Jailbreak score corresponds to the model's confidence in this classification.

Threshold-Based Flagging: Text is flagged as Jailbreak if the Jailbreak score exceeds a predefined threshold (default: 0.5).

Jailbreak Detection Policy for AI Chatbot

Create a new policy as same as shown in LLM Guardrails Policy, for Jailbreak detection select scanner Jailbreak.

Optionally, perform a test to ensure the policy is functioning as intended. Check that Jailbreak is detected and blocked as specified.