Invisible Text Scanner

(Input scanner)

The Invisible Text Scanner is designed to detect and remove non-printable, invisible Unicode characters from text inputs. This is crucial for maintaining text integrity in Large Language Models (LLMs) and safeguarding against steganography-based attacks.

How it works

The scanner targets invisible Unicode characters, particularly in the Private Use Areas (PUA) of Unicode, which include:

Basic Multilingual Plane: U+E000 to U+F8FF
Supplementary Private Use Area-A: U+F0000 to U+FFFFD
Supplementary Private Use Area-B: U+100000 to U+10FFFD

These code points are valid Unicode, but most fonts do not render them.
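The ranges listed above can be checked directly against a character's code point. The following is a minimal sketch, not the scanner's actual implementation; the names PUA_RANGES, is_private_use, and find_private_use are hypothetical and used only for illustration.

```python
# Private Use Area ranges as listed above (inclusive code points).
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]


def is_private_use(ch):
    """Return True if the single character ch falls in one of the PUA ranges."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in PUA_RANGES)


def find_private_use(text):
    """Return (index, code point label) pairs for every PUA character in text."""
    return [
        (i, "U+{:04X}".format(ord(ch)))
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]
```

Reporting the position and code point of each hit, rather than only a yes/no answer, makes it easier to see where hidden content was embedded in a prompt.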

The scanner detects and removes characters in the Unicode general categories 'Cf' (Format characters), 'Cc' (Control characters), 'Co' (Private use characters), and 'Cn' (Unassigned characters), which are typically non-printable.
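The category-based filtering described above can be sketched with Python's standard unicodedata module. This is an illustrative sketch under stated assumptions, not the product's code: the names BANNED_CATEGORIES and scan_invisible are hypothetical, and the allowed parameter (which keeps newline, carriage return, and tab, all of which fall in category 'Cc') is an assumption added here so ordinary line breaks survive.

```python
import unicodedata

# General categories named in the description above.
BANNED_CATEGORIES = {"Cf", "Cc", "Co", "Cn"}


def scan_invisible(text, allowed="\n\r\t"):
    """Strip characters whose general category is banned.

    allowed is an assumption of this sketch: characters listed there are
    kept even though newline, carriage return, and tab are category 'Cc'.
    Returns (cleaned_text, found), where found is True if anything was removed.
    """
    kept = []
    found = False
    for ch in text:
        if ch not in allowed and unicodedata.category(ch) in BANNED_CATEGORIES:
            found = True  # invisible character detected; drop it
        else:
            kept.append(ch)
    return "".join(kept), found
```

For example, scan_invisible("he\u200bllo") strips the zero-width space (category 'Cf') and reports that invisible text was found. Passing allowed="" follows the category list literally and also removes line breaks and tabs.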

Invisible Text Detection Policy for AI Chatbot

Create a new policy as shown in LLM Guardrails Policy. For Invisible Text detection, select the Invisible Text scanner.

Optionally, test the policy to confirm that it works as intended: invisible text in the input should be detected and blocked as specified.

