Monday, August 19, 2013

XOR Encryption

Introduction

With Edward Snowden and NSA spying being in the news recently, I've developed an interest in cryptography. While I'm, by no means, a knowledgable cryptographer, I worked with my friend, Billy Rieger, to implement a 3-pass XOR encryption script in Python. In this post I'll describe how it works, its benefits, and its problems.

XOR Encryption

XOR, or eXclusive-OR, is a binary operation that compares two binary values. XOR only returns true when the two binary values that are being compared are different. Below is an example of XOR in a truth-table:

  p     q   p XOR q
T T F
T F T
F T T
F F F

Notice that if we XOR q and p XOR q, we get p. If we think of p and q as binary numbers (1 or 0), XOR acts as binary addition without carry. Additionally, XOR is commutative and associative, which means that q XOR (p XOR q) is logically equivalent to (q XOR p) XOR q and (q XOR q) XOR p. Since any binary number XOR'd by itself results in a 0, (q XOR q) XOR p is equivalent to 0 XOR p, which is equivalent to just p.

XOR encryption takes advantage of these properties by having the message you want to send be p and the key you want to encrypt the message with be q. When the message and the key are XOR'd, the resulting encrypted message is nearly impossible to crack without the key. However, the encrypted message can be easily decoded by XOR-ing it with the key.

3-Pass Protocol

Car Talk describes an easy to understand version of the 3-Pass Protocol with one of their weekly puzzlers:
Imagine you have a friend who lives in Russia where the KGB spies on everyone and everything and you want to send a valuable object to this friend. So you have a box which is more than large enough to contain the object and you have several locks with keys. Now this box, I suppose you could call it a strongbox, has a lock ring which is more than large enough to have a padlock attached to it. In fact it's large enough to accommodate several locks. But your friend does not have the key to any lock that you have. Now you can't send a key in the mail because the KGB will intercept it and they will copy it. And you can't not lock the box, because the object is very valuable. So you have to send it through the mail. You can't hand deliver it. You want to lock it so that your friend can open it, but the KGB can't.
The trick is to mail the box three times. First, you put your object in the box and secure it with a lock that only you have a key to, and mail it to your friend. Your friend then puts a second lock on the box (a lock that only they have a key to) and mail it back to you. Once received, you remove your lock and mail the box for a third time, back to your friend. Your friend can now unlock the box, rest assured that it never traveled through the mail unlocked.

We can do the same thing with XOR encryption: the first person creates a 'key' of random letters that is the same length as the message they want to send. The message and the key are converted to binary, XOR'd together, and the result is sent to the second person. The second person makes a second key that is the same length as the single-encrypted message, converts both to binary, XOR's them, and sends the, now, double-encrypted message to the first person. Since XOR is a commutative operation, the first person can XOR the double-encrypted message with the first key to remove the encryption from the first transmission. The, now, single-encrypted message is sent to the second person who can decrypt it using the second key to decode the original message.

The Problem(s)

The robustness of 3-pass XOR encryption falls apart in a worst-case scenario where an attacker knows what algorithm is being used and captures all three transmissions. If we represent the message as m, the first key as a, and the second key as b, capturing all three transmissions means the attacker would have copies of: m XOR a, (m XOR a) XOR b, and m XOR b. The attacker can XOR the first two transmissions to isolate b, which can then use to XOR and decrypt m.

One semi-solution would be to mix in different commutative operations into your protocol, such as addition/subtraction, multiplication/division, exponentiation/logarithm, etc, and switch between them after each set of transmissions. However, if an attacker knows what commutative operation is being used, this is susceptible to the same type of attack as XOR.

Another problem with 3-pass XOR encryption is the memory and bandwidth requirements. For any chunk of data that needs to be sent, you need twice that space to make a key of the same length (although this could be mitigated by algorithmically making a key during encryption so the key doesn't need to be stored en masse). Furthermore, you need three-times the bandwidth to send the data. For large messages being sent (say, a 10gb payload of documents), this adds up quickly.

The Code

Even with its problems, 3-pass XOR encryption is still worth learning because its a straight-forward introduction to encryption. Below is a hastily-written Python implementation of this algorithm.

(Note: Python has different behavior when reading and writing files in Windows and Unix, depending on which open() setting you pass it. Billy and I have tested this script and have confirmed that it works on 64 bit Windows and Mac OS X, so I will tentatively say that it is 'cross-platform')


No comments :

Post a Comment