Invisible Text Scanner

(Input scanner)


Last updated 9 months ago

The Invisible Text Scanner detects and removes non-printable, invisible Unicode characters from text inputs. This is crucial for maintaining the integrity of text sent to Large Language Models (LLMs) and for safeguarding against steganography-based attacks, in which hidden characters carry content that a human reader cannot see.

How it works

The scanner targets invisible Unicode characters, particularly in the Private Use Areas (PUA) of Unicode, which include:

Basic Multilingual Plane: U+E000 to U+F8FF
Supplementary Private Use Area-A: U+F0000 to U+FFFFD
Supplementary Private Use Area-B: U+100000 to U+10FFFD

These characters, while valid in Unicode, are not rendered by most fonts, but their presence can still be checked programmatically.
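As a rough illustration of such a check, the three ranges listed above can be tested by code point. This is a minimal sketch, not the scanner's actual implementation; the names `PUA_RANGES`, `is_private_use`, and `find_private_use` are hypothetical and chosen only for this example.

```python
# Private Use Area ranges listed above (inclusive code point bounds).
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch falls in a Private Use Area."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_private_use(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') pairs for every PUA character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]
```

Reporting the index and code point, rather than the character itself, keeps the result readable even though the character has no visible glyph.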

The scanner detects and removes characters in the Unicode general categories 'Cf' (Format characters), 'Cc' (Control characters), 'Co' (Private use characters), and 'Cn' (Unassigned characters), which are typically non-printable.
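The category filter described above can be sketched with Python's standard `unicodedata` module. This is an assumption-laden illustration rather than the product's own code: `INVISIBLE_CATEGORIES` and `scan_invisible` are hypothetical names, and the real scanner may apply extra rules that are not documented here.

```python
import unicodedata

# General categories named in the text above as typically non-printable.
INVISIBLE_CATEGORIES = {"Cf", "Cc", "Co", "Cn"}


def scan_invisible(text: str) -> tuple[str, bool]:
    """Return (sanitized_text, found).

    found is True if at least one character was removed.
    """
    kept = [
        ch for ch in text
        if unicodedata.category(ch) not in INVISIBLE_CATEGORIES
    ]
    sanitized = "".join(kept)
    return sanitized, len(sanitized) != len(text)


# A zero-width space (U+200B, category Cf) hidden inside a word.
clean, found = scan_invisible("he\u200bllo")
```

Note that newline and tab are in category 'Cc', so this simple version strips them as well. Whether the actual scanner exempts ordinary whitespace controls is not stated in this page.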

Invisible Text Detection Policy for AI Chatbot

Create a new policy in the same way as shown in LLM Guardrails Policy. For Invisible Text detection, select the Invisible Text scanner.

Optionally, perform a test to ensure the policy is functioning as intended. Check that Invisible Text is detected and blocked as specified.
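One way to produce an input for such a test is to append invisible characters to an ordinary prompt. The sketch below uses Unicode tag characters (U+E0020 to U+E007E, which belong to category 'Cf') to carry a hidden ASCII string. This technique is an assumption chosen for illustration, and `hide_in_tags` is a hypothetical helper, not part of the product.

```python
def hide_in_tags(message: str) -> str:
    """Map printable ASCII to invisible Unicode tag characters."""
    return "".join(
        chr(0xE0000 + ord(c))
        for c in message
        if 0x20 <= ord(c) <= 0x7E  # skip anything outside printable ASCII
    )


visible = "What is the weather today?"
test_prompt = visible + hide_in_tags("hidden payload")

# The escaped form shows the hidden characters that a normal display omits.
print(test_prompt.encode("unicode_escape").decode("ascii"))
```

Submitting `test_prompt` to the chatbot should, if the policy works as described, result in the invisible text being detected and blocked.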
