Invisible Text Scanner
(Input scanner)
The Invisible Text Scanner is designed to detect and remove non-printable, invisible Unicode characters from text inputs. This is crucial for maintaining text integrity in Large Language Models (LLMs) and safeguarding against steganography-based attacks.
The scanner targets invisible Unicode characters, particularly in the Private Use Areas (PUA) of Unicode, which include:
Basic Multilingual Plane: U+E000 to U+F8FF
Supplementary Private Use Area-A: U+F0000 to U+FFFFD
Supplementary Private Use Area-B: U+100000 to U+10FFFD
These characters are valid Unicode but are not rendered by most fonts. They can still be detected by inspecting their code points.
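As a rough illustration of how such a code point check could look, the sketch below tests each character against the three Private Use Area ranges listed above. The names `PUA_RANGES`, `is_private_use`, and `find_private_use` are hypothetical and chosen for this example only; this is not the scanner's actual implementation.

```python
# Private Use Area ranges from the list above (inclusive code point bounds).
# All names in this sketch are illustrative assumptions.
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch falls in any PUA range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_private_use(text: str) -> list:
    """Return (index, 'U+XXXX') pairs for every PUA character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]
```

For example, `find_private_use("a\ue000b")` reports a single hit at index 1, while ordinary visible text yields an empty list.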
It detects and removes characters in categories 'Cf' (Format characters), 'Cc' (Control characters), 'Co' (Private use characters), and 'Cn' (Unassigned characters), which are typically non-printable.
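The category-based filtering described above can be approximated with Python's standard `unicodedata` module. This is a minimal sketch under the assumption that every character in the four listed categories is dropped; the function name `scan_invisible` and its return shape are hypothetical, not the scanner's real interface.

```python
import unicodedata

# General categories named above; treated here as "invisible".
INVISIBLE_CATEGORIES = {"Cf", "Cc", "Co", "Cn"}


def scan_invisible(text: str):
    """Split text into (sanitized_text, removed_chars).

    A character is removed when its Unicode general category
    is one of INVISIBLE_CATEGORIES; everything else is kept in order.
    """
    kept, removed = [], []
    for ch in text:
        if unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            removed.append(ch)
        else:
            kept.append(ch)
    return "".join(kept), removed


# Zero-width space (Cf) and a private use character (Co) hidden in "hello".
clean, removed = scan_invisible("he\u200bllo\ue000")
is_valid = not removed  # input passes only if nothing had to be removed
```

Note that 'Cc' also covers tab and newline, so a literal implementation like this one strips them as well. Whether the scanner exempts ordinary whitespace controls is not stated here.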
Invisible Text Detection Policy for AI Chatbot
Create a new policy in the same way as shown in LLM Guardrails Policy, and select the Invisible Text scanner for invisible text detection.
Optionally, perform a test to ensure the policy is functioning as intended. Check that Invisible Text is detected and blocked as specified.
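One way to prepare input for such a test is to generate a prompt that looks normal on screen but carries hidden characters. The sketch below is only an assumption about what a useful test string might contain; `make_test_prompt` is a hypothetical helper, and how the prompt is submitted to the chatbot depends on your setup.

```python
# Hidden characters used for the test (illustrative choices).
ZERO_WIDTH_SPACE = "\u200b"  # general category Cf
PUA_CHAR = "\ue000"          # general category Co


def make_test_prompt(visible: str) -> str:
    """Put a zero-width space between every visible character
    and append one private use character at the end."""
    return ZERO_WIDTH_SPACE.join(visible) + PUA_CHAR


prompt = make_test_prompt("hello")
print(repr(prompt))                # escapes reveal the hidden characters
print(len("hello"), len(prompt))   # visible length vs. actual length
```

Pasting `prompt` into the chatbot should trigger the policy, while the plain string `"hello"` should pass.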