Invisible Text Scanner
(Input scanner)
The Invisible Text Scanner detects and removes non-printable, invisible Unicode characters from text inputs. This helps preserve the integrity of text sent to Large Language Models (LLMs) and guards against steganography-based attacks, in which instructions or data are hidden inside text that looks harmless.
The scanner targets invisible Unicode characters, particularly in the Private Use Areas (PUA) of Unicode, which include:
Basic Multilingual Plane: U+E000 to U+F8FF
Supplementary Private Use Area-A: U+F0000 to U+FFFFD
Supplementary Private Use Area-B: U+100000 to U+10FFFD
These characters are valid Unicode, but most fonts do not render them, so they can pass through an input unnoticed.
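The three ranges above can be checked directly against a character's code point. The following is a minimal sketch, not the scanner's actual implementation; the names PUA_RANGES, is_private_use, and find_private_use are hypothetical and chosen only for illustration.

```python
# Private Use Area ranges as listed above (inclusive code point bounds).
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]


def is_private_use(ch: str) -> bool:
    """Return True if the single character falls inside any PUA range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_private_use(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every PUA character found in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]
```

Reporting the position and code point, rather than only a yes/no result, makes it easier to see where hidden characters were placed in an input.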
The scanner detects and removes characters in the Unicode general categories 'Cf' (Format characters), 'Cc' (Control characters), 'Co' (Private use characters), and 'Cn' (Unassigned characters), which are typically non-printable.
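Category-based detection of this kind can be sketched with Python's standard unicodedata module. This is an illustration under stated assumptions rather than the scanner's own code: the names INVISIBLE_CATEGORIES and scan_invisible are hypothetical. Note that 'Cc' also covers tab and newline, so a strict filter like this one strips them as well; whether the scanner itself makes exceptions is not stated here.

```python
import unicodedata

# General categories named above: format, control, private use, unassigned.
INVISIBLE_CATEGORIES = {"Cf", "Cc", "Co", "Cn"}


def scan_invisible(text: str) -> tuple[str, list[str]]:
    """Return (cleaned_text, found_chars) for the given input.

    cleaned_text has every character in INVISIBLE_CATEGORIES removed;
    found_chars lists the removed characters in order of appearance.
    """
    kept = []
    found = []
    for ch in text:
        if unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            found.append(ch)
        else:
            kept.append(ch)
    return "".join(kept), found


if __name__ == "__main__":
    # Zero-width space (U+200B) hidden inside an ordinary word.
    cleaned, found = scan_invisible("he\u200bllo")
    print(repr(cleaned), [f"U+{ord(c):04X}" for c in found])
```

An input would be flagged whenever the list of found characters is non-empty, and the cleaned text is what would be passed on.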
Invisible Text Detection Policy for AI Chatbot
Create a new policy in the same way as shown in LLM Guardrails Policy. For invisible text detection, select the Invisible Text scanner.
Optionally, run a test to confirm that the policy works as intended: submit an input that contains invisible characters and check that Invisible Text is detected and blocked as specified.
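A test input for this check has to contain characters that do not show up when typed or pasted normally. One way to build such an input is sketched below; the helper name make_test_input and the sample prompt are hypothetical, and the hidden characters chosen here (zero-width space, word joiner, and one private use character) are only examples of the categories the scanner targets.

```python
import unicodedata


def make_test_input(visible: str,
                    hidden: str = "\u200b\u2060\ue000",
                    position: int = 4) -> str:
    """Insert hidden characters into visible text at the given index."""
    return visible[:position] + hidden + visible[position:]


if __name__ == "__main__":
    prompt = "What is the weather today?"
    test_input = make_test_input(prompt)

    # The two strings look alike on screen but differ in content.
    print(test_input == prompt)
    print(len(prompt), len(test_input))

    # List the code point and category of each inserted character.
    for ch in test_input:
        if unicodedata.category(ch) in {"Cf", "Cc", "Co", "Cn"}:
            print(f"U+{ord(ch):04X}", unicodedata.category(ch))
```

Sending the resulting string to the chatbot should trigger the policy, while the unmodified prompt should pass.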