Spytech: The Simple Guide to Secret Communication

Image- and audio-based steganography, which changes the least significant bits of pixel values in a photo or of samples in an audio file, has been covered extensively. You may already be familiar with hiding data in images, so here I will show a simple way to build your own secret communication channel by hiding data in plain sight: in regular text.

Data can be hidden almost anywhere, and you don’t even need fancy tools.

Characters Before Unicode

Fundamentally, computers just deal with numbers. They store letters and other characters by assigning a number for each one. Before Unicode was invented, there were hundreds of different systems, called character encodings, for assigning these numbers.
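To make the "characters are just numbers" idea concrete, here is a minimal sketch using PHP's built-in ord(), chr(), decbin(), and bindec(). It only covers single-byte (ASCII-range) characters, and the variable names are just for illustration.

```php
<?php
// Each ASCII character is stored as a number.
$letter = 'A';
$code   = ord($letter);          // numeric value of the character: 65
$binary = decbin($code);         // the same number written in binary: "1000001"
$back   = chr(bindec($binary));  // binary -> number -> character: "A"

echo "$letter -> $code -> $binary -> $back\n"; // A -> 65 -> 1000001 -> A
```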

These early character encodings were limited and could not contain enough characters to cover all the world’s languages. Even for a single language like English, no single encoding was adequate for all the letters, punctuation, and technical symbols in common use. Unicode has changed all that.

After Unicode

Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.

Support of Unicode forms the foundation for the representation of languages and symbols in all major operating systems, search engines, browsers, laptops, and smartphones — plus the Internet and World Wide Web (URLs, HTML, XML, CSS, JSON, etc.).

So how does this work?

Thanks to this monumental scope, Unicode includes formatting characters needed by scripts such as Persian, and it turns out some of them have special powers of invisibility! Two examples are the zero-width space (U+200B) and the zero-width non-joiner (U+200C).

The most important thing about these special characters is that ordinary English text does not use them. As the name suggests, they have zero width and are not rendered on screen. The hidden characters often don’t even show up in text editors like nano!
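A quick way to convince yourself that an invisible character is really there is to compare byte lengths. The sketch below assumes the text is UTF-8, where U+200B is stored as the three bytes E2 80 8B; the variable names are illustrative only.

```php
<?php
// U+200B ZERO WIDTH SPACE, written as its three UTF-8 bytes.
$zwsp = "\xE2\x80\x8B";

$plain  = 'hello';
$marked = 'hel' . $zwsp . 'lo';

// Both strings look the same in most viewers, but they are not equal.
var_dump($plain === $marked);   // bool(false)
echo strlen($plain), "\n";      // 5 bytes
echo strlen($marked), "\n";     // 8 bytes: 5 visible + 3 for the hidden character
echo bin2hex($zwsp), "\n";      // e2808b
```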

This lets us pick two arbitrary zero-width characters and designate one as 1 and the other as 0. We can then hide any message in plain text by splitting it into single characters, converting each character to binary, and writing the bits with the zero-width characters standing in for the ones and zeros. A third zero-width character can mark the boundary between encoded characters. It is best to place the hidden sequence in the spaces between words; otherwise, spellcheckers tend to flag the word as misspelled.

You can do all this in any programming language; as an example, we will use PHP.

Conversion from string to binary and vice versa:

// Convert a string into binary data
function str2bin($text){
    $bin = array();
    for($i=0; strlen($text)>$i; $i++)
        $bin[] = decbin(ord($text[$i]));
    return implode(' ',$bin);
}

// Convert binary data into a string
function bin2str($bin){
    $text = array();
    $bin = explode(' ', $bin);
    for($i=0; count($bin)>$i; $i++)
        $text[] = chr(bindec($bin[$i]));
    return implode($text);
}

Hiding the binary in zero-width characters:

// Convert the ones, zeros, and spaces of the hidden binary data to their respective zero-width characters 
function bin2hidden($str) {
    $str = str_replace(' ', "\xE2\x81\xA0", $str); // Unicode Character 'WORD JOINER' (U+2060) 0xE2 0x81 0xA0
    $str = str_replace('0', "\xE2\x80\x8B", $str); // Unicode Character 'ZERO WIDTH SPACE' (U+200B) 0xE2 0x80 0x8B
    $str = str_replace('1', "\xE2\x80\x8C", $str); // Unicode Character 'ZERO WIDTH NON-JOINER' (U+200C) 0xE2 0x80 0x8C
    return $str;
}

// Convert zero-width characters to hidden binary data
function hidden2bin($str) {
    $str = str_replace("\xE2\x81\xA0", ' ', $str); // Unicode Character 'WORD JOINER' (U+2060) 0xE2 0x81 0xA0
    $str = str_replace("\xE2\x80\x8B", '0', $str); // Unicode Character 'ZERO WIDTH SPACE' (U+200B) 0xE2 0x80 0x8B
    $str = str_replace("\xE2\x80\x8C", '1', $str); // Unicode Character 'ZERO WIDTH NON-JOINER' (U+200C) 0xE2 0x80 0x8C
    return $str;
}
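The functions above convert between text, binary, and zero-width characters, but they stop short of placing the result inside a cover text. Here is one possible end-to-end sketch. It is self-contained, so it repeats the same character mapping; the helper names hideInText and revealFromText, the constants, and the choice to insert after the first space are my own assumptions, not part of any library.

```php
<?php
// Zero-width characters as UTF-8 byte sequences (same mapping as above).
const ZW_SEP  = "\xE2\x81\xA0"; // U+2060 WORD JOINER, separates encoded characters
const ZW_ZERO = "\xE2\x80\x8B"; // U+200B ZERO WIDTH SPACE, stands for 0
const ZW_ONE  = "\xE2\x80\x8C"; // U+200C ZERO WIDTH NON-JOINER, stands for 1

// Hypothetical helper: encode $secret and place it right after the first space of $cover.
function hideInText(string $cover, string $secret): string {
    if ($secret === '') {
        return $cover; // nothing to hide
    }
    $bits = [];
    foreach (str_split($secret) as $char) {
        $bits[] = decbin(ord($char));
    }
    $hidden = strtr(implode(' ', $bits), [' ' => ZW_SEP, '0' => ZW_ZERO, '1' => ZW_ONE]);

    $pos = strpos($cover, ' ');
    if ($pos === false) {
        return $cover . $hidden; // no space in the cover text, so append instead
    }
    return substr($cover, 0, $pos + 1) . $hidden . substr($cover, $pos + 1);
}

// Hypothetical helper: collect every zero-width character in $text and decode it.
function revealFromText(string $text): string {
    $pattern = '/' . ZW_SEP . '|' . ZW_ZERO . '|' . ZW_ONE . '/';
    preg_match_all($pattern, $text, $matches);
    $bin = strtr(implode('', $matches[0]), [ZW_SEP => ' ', ZW_ZERO => '0', ZW_ONE => '1']);
    if ($bin === '') {
        return ''; // no hidden data found
    }
    $out = '';
    foreach (explode(' ', $bin) as $byte) {
        $out .= chr(bindec($byte));
    }
    return $out;
}

$cover = 'Selling a vintage bicycle';
$stego = hideInText($cover, 'Hi');

echo strlen($cover), ' vs ', strlen($stego), " bytes\n"; // 25 vs 70 bytes
echo revealFromText($stego), "\n";                       // Hi
```

The two strings look identical on screen, but the byte count gives the game away: every hidden bit costs three bytes, so long messages noticeably inflate a short cover text.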

Conclusion

You can use the classic spy trick of publishing an article, tweet, Facebook post, or some type of text document in a public space. For example, you could hide a secret message in a Craigslist ad, then have an individual recipient or group of people periodically check local Craigslist ads for a specific keyword. They would know to check the description for hidden zero-width character messages.

There are also tools built to detect these kinds of characters, so a message hidden this way is not safe from anyone who inspects the raw text.
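A detector does not have to be complicated. As a rough sketch, the following counts and strips a handful of zero-width code points using PCRE's UTF-8 mode. The function names and the exact list of code points are my own choices; a thorough detector would likely check more characters.

```php
<?php
// Zero-width code points to look for: U+200B, U+200C, U+200D, U+2060, U+FEFF.
const ZERO_WIDTH_PATTERN = '/[\x{200B}-\x{200D}\x{2060}\x{FEFF}]/u';

// Hypothetical helper: how many zero-width characters does $text contain?
function countZeroWidth(string $text): int {
    $found = preg_match_all(ZERO_WIDTH_PATTERN, $text);
    return $found === false ? 0 : $found; // false means the text was not valid UTF-8
}

// Hypothetical helper: return $text with the zero-width characters removed.
function stripZeroWidth(string $text): string {
    return preg_replace(ZERO_WIDTH_PATTERN, '', $text) ?? $text;
}

$suspect = "a\xE2\x80\x8Bb\xE2\x80\x8Cc"; // "abc" with two hidden characters

echo countZeroWidth($suspect), "\n"; // 2
echo stripZeroWidth($suspect), "\n"; // abc
```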

Also check out whitespace steganography, which conceals messages in ASCII text by appending whitespace to the ends of lines. Because spaces and tabs are generally not visible in text viewers, the message is effectively hidden from casual observers.
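To give a feel for the idea, here is a deliberately simplified sketch that stores one bit per line: a trailing space means 0 and a trailing tab means 1. This encoding and the function names are assumptions made for illustration; real whitespace steganography tools may pack data quite differently.

```php
<?php
// Hypothetical helper: append one bit per line as trailing whitespace
// (space = 0, tab = 1). Bits beyond the number of available lines are dropped.
function hideBitsInLines(string $cover, string $bits): string {
    if ($bits === '') {
        return $cover; // nothing to hide
    }
    $lines = explode("\n", $cover);
    foreach (str_split($bits) as $i => $bit) {
        if (!isset($lines[$i])) {
            break; // ran out of lines
        }
        $lines[$i] = rtrim($lines[$i]) . ($bit === '1' ? "\t" : ' ');
    }
    return implode("\n", $lines);
}

// Hypothetical helper: read the bits back from the trailing whitespace.
function readBitsFromLines(string $text): string {
    $bits = '';
    foreach (explode("\n", $text) as $line) {
        $last = substr($line, -1);
        if ($last === "\t") {
            $bits .= '1';
        } elseif ($last === ' ') {
            $bits .= '0';
        }
    }
    return $bits;
}

$stego = hideBitsInLines("one\ntwo\nthree", '101');
echo readBitsFromLines($stego), "\n"; // 101
```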

Stay curious and hack on!