No Refusal Scanner

(Output scanner)



The No Refusal Scanner is specifically designed to detect refusals in the output of language models.

It is especially useful for detecting when someone is trying to force the model to produce harmful output.

How it works

The scanner also offers a lighter version that uses a simple rule-based approach to detect refusals. This approach is commonly used in research papers when evaluating language models.

No Refusal Detection: If the text is classified as No Refusal, the No Refusal score corresponds to the model's confidence in this classification.

Threshold-Based Flagging: Text is flagged as No Refusal if the No Refusal score exceeds a predefined threshold (default: 0.5).
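For illustration, here is a minimal Python sketch of the two approaches described above: a rule-based check against common refusal phrases (the lighter version) and threshold-based flagging of a classifier confidence score. The phrase list, the `score_fn` stand-in, and the 0.5 default are assumptions for the example, not the product's actual implementation.

```python
from typing import Callable, Tuple

# Illustrative phrases a rule-based (lighter) refusal check might look for.
REFUSAL_PHRASES = [
    "i'm sorry, but i can't",
    "i cannot assist with",
    "i am unable to help with",
    "as an ai, i cannot",
]


def is_refusal_rule_based(output: str) -> bool:
    """Lighter version: flag the output if it contains a known refusal phrase."""
    text = output.lower()
    return any(phrase in text for phrase in REFUSAL_PHRASES)


def is_refusal_classifier(output: str,
                          score_fn: Callable[[str], float],
                          threshold: float = 0.5) -> Tuple[bool, float]:
    """Threshold-based flagging: score_fn stands in for the model's confidence
    that the text is a refusal; the output is flagged when the score exceeds
    the threshold (default 0.5)."""
    score = score_fn(output)
    return score > threshold, score


if __name__ == "__main__":
    sample = "I'm sorry, but I can't help with that request."
    # Dummy scoring function standing in for the real classifier model.
    print(is_refusal_rule_based(sample))                  # True
    print(is_refusal_classifier(sample, lambda _: 0.92))  # (True, 0.92)
```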

No Refusal Detection Policy for AI Chatbot

Create a new policy as shown in LLM Guardrails Policy, and for No Refusal detection select the output scanner No Refusal.

Optionally, perform a test to ensure the policy is functioning as intended: check that refusals are detected and blocked as specified.
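If you prefer to script such a test, the hypothetical request below sends a model output containing an obvious refusal through the guardrail and checks that the policy blocks it. The URL, payload fields, and `blocked` flag are placeholders for illustration only, not the actual Swift Security API; substitute the details from your own LLM Guardrails API integration.

```python
import requests

# Hypothetical endpoint and token -- replace with your tenant's values.
GUARDRAILS_URL = "https://guardrails.example.com/v1/scan/output"
API_TOKEN = "<your-api-token>"

payload = {
    "policy": "no-refusal-chatbot",                      # placeholder policy name
    "prompt": "Summarize this quarterly report.",
    "output": "I'm sorry, but I can't help with that.",  # deliberate refusal
}

resp = requests.post(
    GUARDRAILS_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=10,
)
result = resp.json()

# Expect the refusal to be detected and the output blocked by the policy.
print(result)
assert result.get("blocked") is True
```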
