XSE Algorithm Description
This scheme uses Marsaglia's XorShift PRNG set to generate a keystream, and
then uses it to encrypt plaintext by simple XOR operation (in a way possibly
not so easy to break). Each PRNG requires a 32bit seed and generates
a different sequence of 32bit integers of period 2^32-1. As you probably
already know, they are fast (3 shifts and 3 xor to generate a number) and
have good statistical properties (and yes, they can be seen as linear
functions). Shift amounts and directions differentiates the 648 PRNGs that form
the complete set.
The primary key of the algorithm is 128bit long (the MD5 hash of the string
inserted on the command line). Firstly this key is extended to 1040 bytes by
using the MD5 hash function. This key can be viewed as an array of 32bit words.
For every 32bit word of input (read sequentially), 4 (possibly) different PRNGs
are selected from the set. Four sequential key words chosen from the extended
1040-bytes key are used as seed for the 4 PRNGs, generating 4 new numbers.
These four numbers are used to replace the four words of the key and to encrypt
the input word in this way: the first two are xored together and the result is
rotated by a certain amount that depends on the chosen PRNGs.
The same is done for the 3rd and 4th word.
The two resulting values are xored with the input word, together with another
value that depends (again) on the 4 chosen PRNGs. To achieve higher security,
the algorithm periodically regenerates the key and changes the PRNG selection
function. (I know that all this sounds vague and confusing - read carefully
the pseudo code below for a throughtful explanation)
The following technique is used to make the cipher resistant to chosen and
known plaintext attacks, and to verify payload integrity:
upon encryption an MD5 value ("uniqueID") is output before the actual
ciphertext. If the plaintext comes from a file, this value is equal to
MD5(MD5(plaintext)+key); if it comes from stdin, this value is set to 16 bytes
read from "/dev/urandom". The actual base key used during encryption is
MD5(key+uniqueID). By doing this, each bit of the key (and thus of the
ciphertext) depends on the entire plaintext. Upon decryption, a check is made
to verify that the uniqueID of the decrypted data matches the one included in
the ciphertext. If the ciphertext is modified by an attacker, this check will
fail. If the uniqueID comes from "/dev/urandom" then this check is skipped
(...only in this version...it could be made by calculating
MD5(MD5(plaintext)+key) on input stream EOF and by appending it at
the end of the output stream...but I'm lazy! :-).
To be more precise, here is the detailed pseudo code of the encryption process.
(k[0..259] is the extended 1040 bytes key (1040 bytes=260 32bit ints))
- Compute command_line_key=MD5(command line key string)
- Compute uniqueID=MD5(MD5(input file)+command_line_key)
- Compute key[0..3]=MD5(command_line_key+uniqueID)
- While any of the 32bit words k[0..3] equals 0 --> key[0..3]=MD5(key+uniqueID)
- selector=key[0]^key[1]^key[2]^key[3]
- orig_key=key[0..3]
- Extend key from 16 bytes to 1040 bytes: repeat key[i..i+3]=MD5(orig_key+key[i-4..i-1])
for "i" ranging from 4 to 256, step 4.
- Using the same method, generate 81 more md5 sums (1296 bytes). Associate
these bytes to the xorshift PRNGs (2 for each prng - a[0..647] and b[0..647])
- Shuffle the complete set of xorshift PRNGs, depending on the extendend key.
- Choose one of the 648 PRNGs as the function used to alter "selector" during
encryption ("selrng")
- Choose koff = offset in key[0..255]
- Set keyptr=&key[0]
- While more data to process:
- Read one block (possibly) of input data (consisting of 16384 or 1024
bytes, depending on the command line)
- Repeat "rounds" times the following:
For each input 32bit word of the block:
- Use the first byte of "selector" as an index in the shuffled PRNG
array. Use the indexed PRNG to transform the key word koff[0]. Use the
generated number to replace the key word, and annotate it.
- Use second byte of "selector" as an index in the shuffled PRNG
array+131. Use the indexed PRNG to transform the key word koff[1]. Use the
generated number to replace the key word, and annotate it.
- Use third byte of "selector" as an index in the shuffled PRNG
array+261. Use the indexed PRNG to transform the key word koff[2]. Use the
generated number to replace the key word, and annotate it.
- Use last byte of "selector" as an index in the shuffled PRNG
array+392. Use the indexed PRNG to transform the key word koff[3]. Use the
generated number to replace the key word, and annotate it.
- XOR the first two number together, rotate the result right by
b[selector[2]]&31 bits
- XOR the third and the fourth number together, rotate the result right by
b[selector[3]]&31 bits
- XOR the input word with these two rotated values
- XOR the input word with another 32bit int formed by concatenating
a[selector[0]] + a[selector[1]+131] + a[selector[2]+261] + a[selector[3]+392]
- Output the result as encrypted/decrypted word
- Update selector by running it through "selrng"
- Update koff by xoring it with
b[selector[0]] ^ b[selector[1]] ^ b[selector[2]] ^ b[selector[3]]
- Output processed block
- Regenerate 4 words of the key: Compute keyptr[0..3]=MD5(keyptr[0..3]+orig_key) until
keyptr[0..3] does not contain null 32bit words
- Reset selector: selector=keyptr[0]^keyptr[1]^keyptr[2]^keyptr[3]
- keyptr += 4
- Reset selrng and koff
I decided to make the number of rounds configurable so that if you have to
encrypt a small file (say 5MB or less -- should cover 90% of uses) and you
accept to wait a few tenth of second more, you can use 8 rounds with rapid key
regeneration ("e8+" mode) to reach a high level of security. As a general rule,
watch your PC's hd activity led during the "Encrypting..." phase. Try incrementing
the number of rounds (or add the "+" option) until the led starts to blink. By
doing this, you won't waste time waiting for your HD to do it's job. However,
people needing speed (say for streaming encrypted content) achieve speeds faster
than AES with default 1-round mode (unoptimized asm version runs 2x faster than
many AES implementations).
A word on self-decrypting files: by using this mode,
you enable your friends to receive files encrypted with XSE and decrypt them
without having anything installed on their machine. They just double-click on the
self-decrypting executable and, after introducing the key, the original file
appears next to it. It couldn't be easier! Keep in mind that the original filename
is stored inside the executable in plain text (at least in this version).
Two
friends of mine ("Roccio" and "Mandingo"), made a nice little GUI for windows
that uses the drag&drop method to automatically encrypt/decrypt files by calling
the dos executable. It doesn't mess your registry file up, and it's extremely easy
to use...try it!
Encryption raw speed (in MB/sec, not including I/O)
Tested on a 1.4GHz Centrino Notebook on 200MB of data
The use of the "+"
option (fast key regeneration) slows down speed by 1%..4% depending on the
number of rounds
| Rounds | Speed |
| 1 | 81.733 |
| 2 | 41.003 |
| 3 | 27.366 |
| 4 | 20.536 |
| 5 | 16.411 |
| 6 | 13.698 |
| 7 | 11.731 |
| 8 | 10.267 |
Final considerations
I really hope to get some feedback from all this (flames excluded). I've done
my best to make it secure, fast and scalable. Every pass of this scheme has a
reason behind and aims at making an attacker's task more difficult. But while
its security is kind of speculative, perhaps its most important quality is
simplicity. It should be very easy to cryptanalize it and to find
weaknesses/strong points against known attacks (which I know very little, I
admit...but that's why I'm posting it in a sci.crypt sandbox!). I've even
included a command line option ("-") that prevents the periodic state reset, so
that you can study the cipher more "confortably". PLEASE, if you
spend some time on this cipher, share your thoughts (positive or negative) with
me. Maybe it's strong, maybe I can improve it, maybe it's dead in the water,
but I'd really like to know your opinion - and learn something more.
One final word: if someone among you is an assembly code master and
knows how to optimize my little and completely unoptimized asm loop in xse.c,
you're welcome! :-)