Archive for July, 2008

This post continues the series that I have started here.

Some months ago Dark Reverser published a Python script that strips the DRM encryption from a secure Mobipocket ebook. For legal reasons the source code has been removed from the webpage, but it’s too late. The knowledge about the encryption algorithm is out, and I think it’s time to have a look at its strength (or rather weakness).

MobiPocket uses the 128-bit version of the Pukall (PC1) algorithm. The 128-bit key is calculated from the device PID and a secret key (which is not secret anymore…). But that’s not all. As you know, the current format allows a book to be decrypted with up to 4 different PIDs. Technically this problem is solved in a nice way. The final key, which is used to encrypt the book, is itself encrypted with the temp key generated from the PID. The result is stored in the DRM section of the book together with a simple checksum and a verification number. To find the working key, decrypt this data with the temp key generated from your PID.

TempKey  = (PID) encrypt with PC1 using (Secret Key)
FinalKey = (FinalKeyEncrypted) decrypt with PC1 using (TempKey)
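
The two steps above can be sketched in Python. This only illustrates the key-wrapping structure: toy_cipher is a hypothetical XOR stand-in and not the real PC1 algorithm, and SECRET_KEY, the padding of the PID to 16 bytes and all function names are assumptions made up for the example.

```python
import hashlib

SECRET_KEY = b"not-the-real-secret"  # placeholder value (assumption)


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in for PC1 (NOT the real cipher): XOR with a hash-derived keystream.

    XOR is its own inverse, so the same function encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def temp_key(pid: str) -> bytes:
    """TempKey = (PID) encrypted using (Secret Key); the padding is assumed."""
    return toy_cipher(pid.encode("ascii").ljust(16, b"\0"), SECRET_KEY)


def wrap_final_key(final_key: bytes, pids: list[str]) -> list[bytes]:
    """Store one encrypted copy of the same final key per allowed PID."""
    return [toy_cipher(final_key, temp_key(pid)) for pid in pids]


def unwrap_final_key(wrapped: bytes, pid: str) -> bytes:
    """FinalKey = (FinalKeyEncrypted) decrypted using (TempKey)."""
    return toy_cipher(wrapped, temp_key(pid))


final_key = bytes(range(16))                      # made-up 128-bit key
slots = wrap_final_key(final_key, ["ABCDEFG$", "HJKLMNP*"])
```

Each PID recovers the same final key from its own slot, which is how one book can be opened by several devices without being encrypted several times.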

According to the author of PC1, the 128-bit key length should guarantee that the key cannot be cracked in a reasonable timeframe. That is true; however, the implementation in MobiPocket makes finding the PID much easier:

  1. The PID consists of 8 alphanumeric characters (A-Z, 0-9)
  2. Only upper-case letters are used.
  3. The 8th character is a $ for PIDs generated on the PC and * for the Amazon Kindle.

To brute-force all combinations, a program would have to do only 1’838’317’255’040 tests (41 bit instead of 128 bit!). This still sounds like a lot, but one has to remember that the PC1 algorithm has been optimized for speed. On fast hardware I was able to do roughly 250’000 tests per second, which means that a single CPU would be busy for more than 85 days. Today it’s no problem to spread the load though, so if you have 16 cores available the tests would be finished in roughly 5 days, with 32 cores it would be a good 2.5 days.
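
The numbers in this paragraph can be rechecked with a few lines of Python. The quoted total happens to equal 34^7 × 35; reading that as 34 usable symbols in seven positions and 35 in one is my assumption, not something stated above. The function name is made up for the example.

```python
import math

# Assumption: the total factors as 34**7 * 35 (see the note above).
candidates = 34**7 * 35


def days_to_exhaust(total: int, tests_per_second: float, cores: int = 1) -> float:
    """Worst-case wall-clock days needed to try every candidate."""
    return total / (tests_per_second * cores) / 86_400


print(candidates)              # 1838317255040
print(math.log2(candidates))   # between 40 and 41, hence "41 bit"
for cores in (1, 16, 32):
    print(cores, round(days_to_exhaust(candidates, 250_000, cores), 1))
```

This is the worst case; on average a search would hit the right PID after about half of the tests.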

But wait, it’s possible to speed up the test. The block that holds the encrypted key starts with the verification field (4 bytes). Instead of decrypting the whole 32 bytes, it’s enough to decrypt the first 4 bytes and check whether the verification entry matches. Using this trick I was able to get 590’000 tests per second. That’s only about 36 days of work for a single CPU or, with access to 35 cores, roughly 1 day.
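
The idea of checking only the verification field can be sketched as follows. This is a minimal illustration under loud assumptions: toy_keystream is a hypothetical stand-in and not PC1, and the verification value, the key and all names are made up.

```python
import hashlib


def toy_keystream(key: bytes):
    """Hypothetical stand-in keystream (NOT PC1); bytes are handed out lazily."""
    counter = 0
    while True:
        yield from hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1


def toy_crypt(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; zip stops after len(data) bytes."""
    return bytes(d ^ k for d, k in zip(data, toy_keystream(key)))


def key_matches(blob: bytes, key: bytes, verification: bytes) -> bool:
    """Decrypt only the leading verification field instead of the whole blob."""
    prefix = blob[: len(verification)]
    return toy_crypt(prefix, key) == verification


verification = b"\x00\x00\x00\x01"   # made-up 4-byte value
right_key = b"k" * 16                 # made-up 128-bit key
blob = toy_crypt(verification + bytes(28), right_key)  # 32-byte record
```

With a real stream cipher the saving comes from producing 4 keystream bytes per candidate instead of 32. The toy keystream here hashes a full block either way, so it shows the structure of the check rather than the speed-up.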

It looks like PIDs generated for the PC have a $ as the last character. If this is true, the number of tests drops to 1/35th of the total combinations (36 bit)!

An encryption scheme that can be brute-forced in such a short time cannot be considered strong. In theory the algorithm is strong enough, but the implementation fails to make use of it. By the way, if all 10 characters of the PID had been used, the time required to test all combinations would increase by a factor of 1156 (46 bit) and make a brute-force attack impractical.
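
The bit counts quoted for the reduced and the extended search space can be reproduced like this. As before, the figure of 34 symbols per position is an assumption of mine, suggested by 1156 = 34^2, and the variable names are made up.

```python
import math

per_position = 34                    # assumption: 1156 == 34**2
fixed_last = per_position ** 7       # seven free positions, last one fixed to '$'
with_two_more = per_position ** 9    # two additional free positions

print(math.log2(fixed_last))         # a little under 36
print(with_two_more // fixed_last)   # 1156
print(math.log2(with_two_more))      # a little under 46

# Worst-case days on a single CPU at 590'000 tests per second
print(with_two_more / 590_000 / 86_400)
```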



A Better Dictionary in uBook

I have already mentioned that my favourite program for reading eBooks on the Pocket PC is uBook. The only disadvantage is that no decent dictionary is available. This was already a problem when I had to purchase books in secure eReader format. Instead of using the Pocket Oxford English Dictionary I already owned, I had to buy another one that could be integrated into eReader. Now I am in the same situation, except that no serious publisher will sell a dictionary for uBook. Deadlock.

There is hope, my friends. In recent months a couple of tools (with questionable legal status) have become available, among them a script called eReader2Html. With this script it’s possible to convert a secure eReader pdb file into plain html. The original version doesn’t work with dictionary files, but you can open the script in an editor, look for the “Invalid file format” error and change the raise into a print:

print ValueError('Invalid file format')

After entering name and credit card number as unlock keys and waiting loooong minutes you finally have a good old html file.

The uBook dictionary format is a little bit, well, special. You must surround the entries, as described in the FAQ, with <df> and </df>, e.g.

<DF><strong>acid rain</strong></DF> rain or other precipitation with a high concentration of acids produced by sulfur dioxide, nitrogen dioxide, and other such gases that result from the combustion of fossil fuels: it has a destructive effect on plant and aquatic life, buildings, etc.<br>

That’s not all. The html file should be split into several parts to improve performance. The author suggests using splitdict, a program that is available in the tool section on the author’s homepage. However, this program looks for <dd> and <dt> tags, not for <df>. I finally ended up converting the more than 80000 lines manually. This was a great task for UltraEdit, but you can use any editor you like. I have added one more <br> at the end of each line to make the files more readable.
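
For what it’s worth, the same conversion could also be scripted. The following is a minimal sketch, assuming the entries are already available as (headword, definition) pairs whose text is valid html; the function names and the part size are made up, and this is not what splitdict does.

```python
def format_entry(headword: str, definition: str) -> str:
    """Return one dictionary line with the headword wrapped in <DF>...</DF>.

    The second <br> is the extra line break added for readability.
    """
    return f"<DF><strong>{headword}</strong></DF> {definition}<br><br>"


def split_into_parts(lines: list[str], lines_per_part: int) -> list[str]:
    """Group the lines into several html bodies of at most lines_per_part lines."""
    return [
        "\n".join(lines[start:start + lines_per_part])
        for start in range(0, len(lines), lines_per_part)
    ]


entries = [
    ("acid", "a sour substance"),
    ("acid rain", "rain with a high concentration of acids"),
    ("acidity", "the quality of being acid"),
]
lines = [format_entry(word, text) for word, text in entries]
parts = split_into_parts(lines, lines_per_part=2)
```

How many lines go into each part is a matter of trial and error; I have no figure for what performs best.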

The last step is to create a short 000.html file. Usually it serves as the index for the reader but if you don’t need it (I don’t) you can make it very short. It must be there though, otherwise your newly created dictionary won’t be recognized. Put all the files into a ZIP, copy it into the program’s dictionary folder and you are done.
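
Packing the files can be done with any ZIP tool; in Python it might look like the sketch below. Only the name 000.html comes from the description above. The names of the other parts (001.html and so on), the index content and the function name are assumptions for illustration.

```python
import io
import zipfile


def build_dictionary_zip(parts: list[str], index_html: str) -> bytes:
    """Return the bytes of a ZIP holding 000.html plus one file per part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("000.html", index_html)   # the index must be present
        for number, body in enumerate(parts, start=1):
            archive.writestr(f"{number:03d}.html", body)  # hypothetical naming
    return buffer.getvalue()


data = build_dictionary_zip(
    ["<DF><strong>acid rain</strong></DF> rain with acids<br>"],
    "<html><body></body></html>",
)
names = zipfile.ZipFile(io.BytesIO(data)).namelist()
```

Writing data to a file in the program’s dictionary folder is then a single open(..., "wb") call.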

You should try with the desktop version first to see if the dictionary is really recognized. Open the options menu and activate the new dictionary, then click on it. If uBook doesn’t open it, you have done something wrong. If the index page is shown, everything is fine. 🙂

And this is what it looks like:


eBooks and DRM (1)

If you read eBooks, you will sooner or later stumble upon the different DRM methods used by the various book formats. The annoying thing is not that the eBook is restricted but that you are forced to use a specific reader. The big question for me as a customer is: is it legal? Am I allowed to bypass the DRM to be able to read my own private, purchased copy with another program? Let’s keep the question open for a while.

The least appealing program is the Microsoft Reader. After replacing my old PocketPC with a newer WM5 model I had big trouble activating my device. I found some information on the web and finally succeeded, but it’s incredible that it didn’t work out of the box. Unfortunately Microsoft has stopped putting effort into their reader, resulting in a GUI that is barely customizable. There is hope though. The open-source ConvertLit lets you decrypt and explode all available LIT eBooks. Once this is done you can convert the book into another format and use whatever reader you like.

A program that I like is the eReader. In the (now free) Pro version you can integrate dictionaries, which helps a lot if you are not a native speaker. eBooks are in the PDB format. Some clever person was able to figure out the protection scheme. You can use pdbshred to decrypt and explode your pdbs using your name and credit card number (as in eReader). Recently I became aware of another program that not only decrypts an eBook but also converts it from PML into HTML.

The last format, with a reader I basically like, is MobiPocket. The prc format is similar to PDB with some custom (undocumented) extensions. Early versions had a weak protection scheme allowing a generic ID to be used as decryption key. This has changed but again some smart person was able to figure out the protection scheme.

I really wonder how it was possible to reverse engineer the secure file formats. In the case of the Microsoft LIT format I can imagine that someone found the DLL which does the actual decryption. This isolates the responsible code and makes it easier to attack the routines. No inside knowledge would be required.

PDBs and PRCs are a different story. Both require some basic knowledge about the data structures and about the encryption routines used. Take the secure Mobipocket format for example. The key is encrypted using the Pukall stream cipher. Some years ago I was highly interested in cryptography, but I had never heard of this particular algorithm (created in the 90s). So how did the knowledge leak out? Or is it possible to find it out using normal reverse engineering? We will see. In my next blog entries I plan to have a closer look at the DRM routines.

One last point. The status of ConvertLit was discussed heavily when the program came out. I am surprised that an official website exists where tools are published that allow people to easily crack any LIT file. Is it tolerated because LIT comes from Microsoft? Does Microsoft simply not care? (Unlikely.) pdbshred is more hidden but nevertheless easily available. The latest version can only attack secure eReader files and not MobiPocket books, but it comes in source and binary form and is easy to use. The quite new MobiDeDrm, by contrast, is a Python script and requires an interpreter. A normal Windows user, not used to the command line or the inner workings of script files, would have big problems using it. Despite its limited use it’s impossible (?) to find the script. Traces are removed very quickly (e.g. Dark Reverser’s weblog has the source code only as a comment; the pastebin entry is gone). This could mean that the algorithm used is indeed rather weak. We will see.

(If you didn’t know already, Mobipocket is owned by Amazon and the Amazon Kindle is using a slightly changed secure PRC format.)
