Archive for the ‘eBooks’ Category

Amazon Kindle

After much talk about chess, a short switch to another hobby of mine: ebooks. I have always looked at Amazon’s Kindle page with a jealous eye. It has by far the widest selection of ebooks. Usually I look around for something new, and when I see that a book is available in electronic format, I check whether other sites offer it in a normal format for my ebook reader. You might already guess the sad truth – very often the book was available in Kindle format only.

To my great joy I found out that Amazon offers a Kindle reader application for the PC. The books are based on the Mobipocket format, so the next question was: can I get the books working on my reader? Obviously I have missed much of the latest progress and the trouble people went through to make it work, but thanks to skindle it is possible to break the protection scheme and to create a normal Mobipocket book (which can be converted to HTML).

It’s interesting to note that the PIDs required to encrypt a book are generated for each book and used only once. They are also not limited to uppercase characters and numbers but make use of the whole alphabet. A generic brute-force attack (see my previous entry) is no longer feasible. On the other hand it’s amazing that people were able to figure out how to calculate the PID that is used. According to this link, it’s not so difficult at all but believe me, using a disassembler is very time-consuming and requires a good deal of knowledge.

The final question is of course: is it legal? The Terms of Use clearly state that it is not. You may only use the digital content on the PC or on an authorized device. That really sucks. In addition, it makes a difference which country you are from. The price for US citizens can be lower and sometimes there are regional restrictions as well. I first noticed this when I wanted to buy a Vonnegut book from Fictionwise. It was a frustrating experience to browse the online catalog and to be told that you cannot have the ebook. While Amazon only requires you to fill in an address, Fictionwise cannot be fooled that easily because they read the country information from the credit card number.

I don’t mind DRM and I respect that authors own the copyright but I don’t understand the way ebooks are treated and sold.


eBooks and DRM (3) – eReader

Last time I looked at the DRM implementation of the secure MobiPocket format. Now it’s time to see whether eReader has done its homework.

First of all kudos to the Dark Reverser who has written a python script that shows how the DRM protection works (google for eReader2Html). Without him this article wouldn’t have been possible.

For a brute-force attack to succeed, the search space shouldn’t be larger than roughly 40 bits. Even this requires a lot of computing power but it’s manageable.

eReader requires the user to enter two pieces of information to unlock a protected book: the name as it appears on the credit card and the credit card number. After some preparation, two CRC32 checksums are created and used as input (64 bits). Let’s stop here for a while. The first CRC32 checksum can’t be predicted: names are arbitrary, so even after the transformation any of the 2^32 values has to be expected. The second part is different: we are talking about a credit card number of which only the last 8 digits are used. Instead of 2^32 different numbers, this gives only 10^8, which is less than 2^27, so in total there are ~ 2^59 combinations (59 bits).
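The size of that search space can be checked with a few lines of Python. This is only a sketch: the "preparation" step isn't spelled out here, so the normalisation inside `unlock_input` (removing spaces, lower-casing) is an assumption, as are the function names and the sample name and card number. Only the two CRC32 values and the use of the last 8 card digits come from the description above.

```python
import math
import struct
import zlib


def unlock_input(name: str, card_number: str) -> bytes:
    """Build a 64-bit value from two CRC32 checksums (illustrative only)."""
    # Assumed normalisation; the real preparation step is not described here.
    prepared_name = name.replace(" ", "").lower().encode("ascii")
    last_eight = card_number.replace(" ", "")[-8:].encode("ascii")
    name_crc = zlib.crc32(prepared_name) & 0xFFFFFFFF
    card_crc = zlib.crc32(last_eight) & 0xFFFFFFFF
    # Two 32-bit checksums side by side give 8 bytes = 64 bits.
    return struct.pack(">II", name_crc, card_crc)


def search_space_bits() -> float:
    """Bits needed to cover every name checksum and every 8-digit suffix."""
    name_values = 2 ** 32   # any CRC32 value is possible for the name
    card_values = 10 ** 8   # only 8 decimal digits feed the second checksum
    return math.log2(name_values * card_values)


key_input = unlock_input("John Q Public", "4111 1111 1111 1111")
print(len(key_input) * 8, "bit input")
print(f"about {search_space_bits():.1f} bits to search")
```

The second checksum is the weak half: whatever CRC32 does to the digits, only 10^8 different inputs can ever reach it.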

The eReader file is encrypted using the DES algorithm. An encrypted key (which is stored in the DRM section of the file) must be decrypted using the 64-bit input from step 1. This yields the real key with which the file can be decrypted. Before the decryption starts, a sanity check is done using a SHA-1 message digest.

Different things could be done to attack the protection scheme:

1. Brute-force to find the correct decryption key

As mentioned, not the whole 64-bit range is used but only 59 bits. This is still a lot – for the Mobipocket DRM protection only 41 bits had to be tested. The SHA-1 digest can’t be cheated due to the avalanche effect: a slight change will cause the entire digest to change and not only parts of it. I did a test with a simple Python script and could create ~ 400’000 digests in 1 second – a brute-force test with today’s hardware is completely absurd, even when using a program optimized for speed instead of the Python script. On top of the digest creation we have to include the time to decrypt the stored key, which has to be done first.
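To put the 59 bits into perspective, here is a rough worst-case estimate at the measured rate of about 400’000 digests per second. It is a back-of-the-envelope sketch: it ignores the extra time for decrypting the stored key, and the helper name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def seconds_to_exhaust(bits: int, tests_per_second: float) -> float:
    """Worst-case time to try every combination of a `bits`-wide search space."""
    return 2 ** bits / tests_per_second


RATE = 400_000  # digests per second, as measured with the Python script

ereader_years = seconds_to_exhaust(59, RATE) / SECONDS_PER_YEAR
mobipocket_days = seconds_to_exhaust(41, RATE) / SECONDS_PER_DAY

print(f"59 bits: about {ereader_years:,.0f} years")
print(f"41 bits: about {mobipocket_days:,.0f} days")
```

The 18 bits between the two schemes are a factor of 2^18 = 262’144 in running time, which is the difference between weeks and tens of thousands of years on a single machine.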

2. Brute force using knowledge about the encrypted file

As we have seen in point 1, a pure brute-force attack is useless. However, the files are in a markup language called PML and each section is compressed with zlib. If a pattern could be found, it would be enough to decrypt the first bytes and test if the key is valid. The first character of the PML text is always a backslash, and a zlib stream written with default settings always starts with the two bytes 0x78 and 0x9C. Another check could be to look for invalid characters – only printable ASCII characters would be allowed in the file. This would require decrypting the first part of the file (which takes time). Unfortunately, and that’s the decisive factor, such an approach doesn’t reduce the number of combinations that have to be tested, so in the worst case all 64-bit combinations must be tested.
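The two checks could look roughly like this. It is a sketch using Python’s standard `zlib` module; the function names and the sample PML snippet are made up, and the "wrong key" output is simply simulated with arbitrary bytes, since no DES decryption is involved here.

```python
import string
import zlib

ZLIB_DEFAULT_HEADER = b"\x78\x9c"
PRINTABLE = set(string.printable.encode("ascii"))


def looks_like_zlib_start(candidate: bytes) -> bool:
    """Cheap first filter: do the first two decrypted bytes match the zlib header?"""
    return candidate[:2] == ZLIB_DEFAULT_HEADER


def looks_like_pml(candidate: bytes) -> bool:
    """Slower second filter: decompress, then check for backslash + printable ASCII."""
    try:
        text = zlib.decompress(candidate)
    except zlib.error:
        return False
    return text.startswith(b"\\") and all(byte in PRINTABLE for byte in text)


# Made-up PML-like sample standing in for a correctly decrypted section.
good = zlib.compress(b"\\vThis is a comment\\v\\pChapter 1")
# Arbitrary bytes standing in for the output of a wrong key.
bad = bytes(range(40, 72))

print(looks_like_zlib_start(good), looks_like_pml(good))
print(looks_like_zlib_start(bad), looks_like_pml(bad))
```

Both filters only make a single test cheaper. Every candidate key still has to be tried once, which is why the number of combinations stays the same.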

3. Attacking the SHA-1 algorithm

The SHA-1 digest stores a hash of the encryption key. The hash has 160 bits and no brute-force attack is known that would bring the number of combinations down to a reasonable number.
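The avalanche behaviour mentioned under point 1 is easy to observe with Python’s `hashlib`. The two 64-bit keys below are arbitrary example values that differ in a single bit; the helper names are made up for this sketch.

```python
import hashlib


def sha1_bits(data: bytes) -> int:
    """Return the 160-bit SHA-1 digest of `data` as an integer."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the digest bits that differ between two inputs."""
    return bin(sha1_bits(a) ^ sha1_bits(b)).count("1")


key_a = bytes.fromhex("0123456789abcdef")   # arbitrary 64-bit example key
key_b = bytes.fromhex("0123456789abcdee")   # same key with the lowest bit flipped

print(differing_bits(key_a, key_b), "of 160 digest bits differ")
```

Typically about half of the 160 bits change, so a "nearly right" key gives no hint that it is close.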

Conclusion

The protection used in the eReader withstands brute-force attacks.


Other Earths will be available soon. This is an original anthology of stories about alternate history. Usually I am not a big fan of this stuff but this time it’s different. Among the authors who have contributed are Robert Charles Wilson, Jeff Vandermeer, Gene Wolfe, Alastair Reynolds and Lucius Shepard! This is really exciting.

I face an interesting decision though. The book is only available as a paperback and as an eBook. After purchasing a Bebook this year I prefer to buy books as eBooks and consider the hardcover or trade paperback only if there is a chance that I will read the book more than once. The readability on the Bebook is superior to all other formats (paperback, mass-market paperback) because of adjustable fonts, font size and line spacing. Trade paperbacks would be on par and have the slight advantage of a bigger book size but they cost more.

So what about Other Earths? Should I buy the paperback or stick to the eBook, which is – for whatever reason – not even cheaper when you buy a non-Kindle version? I think I will go for the eBook. Fictionwise is currently running one of their special sales and giving a 30% rebate, which reduces the price to $5.19.

By the way, if you think that this collection is hot then wait for the Songs of the Dying Earth with stories from Dan Simmons, Neil Gaiman, George R.R. Martin, Lucius Shepard, Jeff Vandermeer, Robert Silverberg, …!


This post continues the series that I have started here.

Some months ago Dark Reverser published a Python script that strips the DRM encryption from a secure Mobipocket ebook. For legal reasons the source code has been removed from the webpage but it’s too late. The knowledge about the encryption algorithm is out and I think it’s time to have a look at its strength (or rather weakness).

MobiPocket uses the 128-bit version of the Pukall (PC1) algorithm. The 128-bit key is calculated from the device PID and a secret key (which is not secret anymore…). But that’s not all. As you know, the current format allows up to 4 different PIDs with which a book can be decrypted. Technically this problem is solved in a nice way. The final key, which is used to encrypt the book, is itself encrypted with the temp key generated from the PID. The result is stored in the DRM section of the book together with a simple checksum and a verification number. To find the working key, decrypt the data with the temp key that has been generated from the PID.

TempKey  = (PID) encrypt with PC1 using (Secret Key)
FinalKey = (FinalKeyEncrypted) decrypt with PC1 using (TempKey)
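The two lines above describe a classic key-wrapping scheme, which can be sketched in Python as follows. Note the assumptions: `toy_cipher` is a simple XOR stream that merely stands in for PC1 (it is not the Pukall algorithm), `SECRET_KEY`, the PIDs and the verification value are placeholders, and the record layout is reduced to a 4-byte verification field followed by the key, without the checksum.

```python
import hashlib
import os

SECRET_KEY = b"not-the-real-secret-key"   # placeholder, not MobiPocket's value


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for PC1: XOR with a key-derived stream (encrypts and decrypts)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def temp_key(pid: str) -> bytes:
    """TempKey = PID encrypted under the shared secret key."""
    return toy_cipher(SECRET_KEY, pid.encode("ascii").ljust(16, b"\0"))


def wrap(final_key: bytes, pid: str, verification: bytes) -> bytes:
    """One DRM record: verification field first, then the final key."""
    return toy_cipher(temp_key(pid), verification + final_key)


def unwrap(records, pid: str, verification: bytes):
    """Try every stored record; return the final key if the PID fits one."""
    key = temp_key(pid)
    for record in records:
        plain = toy_cipher(key, record)
        if plain[:4] == verification:
            return plain[4:]
    return None


final_key = os.urandom(16)                  # the key the book is encrypted with
verification = b"\x00\x00\x00\x01"          # assumed 4-byte marker
records = [wrap(final_key, pid, verification) for pid in ("ABCDEFG$", "HJKLMNP*")]

print(unwrap(records, "HJKLMNP*", verification) == final_key)
print(unwrap(records, "ZZZZZZZ$", verification))
```

The book itself is encrypted only once. Each authorised PID just adds one small record containing the same final key, which is why several devices can open the same file.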

According to the Pukall author, the 128-bit key length should guarantee that the key cannot be cracked in a reasonable timeframe. That is true in theory, but the implementation in MobiPocket makes finding the PID much easier:

  1. The PID consists of 8 alphanumeric characters (A-Z, 0-9)
  2. Only upper-case letters are used.
  3. The 8th character is a $ for PIDs generated on the PC and * for the Amazon Kindle.

To brute-force all combinations, a program would have to do only 1’838’317’255’040 tests (41 bits instead of 128 bits!). This still sounds like a lot but one has to remember that the PC1 algorithm has been optimized for speed. On fast hardware I was able to do roughly 250’000 tests per second, which means that a single CPU would be busy for a good 85 days. Today it’s no problem to spread the load though, so if you have 16 cores available the tests would be finished in roughly 5 days, with 32 cores in a good 2.5 days.
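The numbers can be reproduced with a short calculation. One observation that is an inference from the total and not something stated above: 1’838’317’255’040 factors as 34^7 × 35, which would correspond to 34 usable symbols in each of the first seven positions and 35 choices for the last one. The constants and the helper name below follow that assumption.

```python
import math

SYMBOLS_PER_POSITION = 34   # inferred from the total, not stated in the post
LAST_POSITION_CHOICES = 35  # inferred from the total, not stated in the post
SECONDS_PER_DAY = 24 * 60 * 60

total_pids = SYMBOLS_PER_POSITION ** 7 * LAST_POSITION_CHOICES


def days_needed(combinations: int, tests_per_second: int, cores: int = 1) -> float:
    """Worst-case days to test every combination, spread evenly over `cores`."""
    return combinations / (tests_per_second * cores) / SECONDS_PER_DAY


print(f"{total_pids:,} PIDs, about {math.log2(total_pids):.1f} bits")
for cores in (1, 16, 32):
    print(f"{cores:>2} cores: {days_needed(total_pids, 250_000, cores):.1f} days")
```

These are worst-case figures. On average the right PID turns up after half of the search space has been tested.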

But wait, it’s possible to speed up the test. The record that holds the encrypted key starts with the verification field (4 bytes). Instead of decrypting the whole 32 bytes, it’s enough to decrypt the first 4 bytes and check if the verification entry matches. By using this trick I was able to get 590’000 tests per second. That’s about 36 days of work for a single CPU now or, with access to 35 cores, roughly 1 day.

It looks like PIDs generated for the PC have a $ as last character. If this is true, the number of tests is 1/35th of the total combinations (36 bit)!

An encryption scheme that can be brute-forced in such a short time cannot be considered strong. Theoretically the algorithm used is strong enough but the implementation fails to make use of it. By the way, if all 10 characters of the PID had been used, the time required to test all combinations would increase by a factor of 1156 (51 bits) and make a brute-force attack impractical.
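The three cases discussed here can be compared side by side at the 590’000 tests per second quoted above. As before, the 34 symbols per position are inferred from the totals (1156 = 34^2), and the labels and helper name are made up for this sketch. A factor of 1156 is about 10 extra bits, i.e. roughly 51 bits in total.

```python
import math

TESTS_PER_SECOND = 590_000          # rate quoted above for the 4-byte check
SECONDS_PER_DAY = 24 * 60 * 60

variants = {
    "8 characters, any last character": 34 ** 7 * 35,
    "8 characters, last one fixed": 34 ** 7,
    "two more free characters": 34 ** 7 * 35 * 34 ** 2,
}


def summarize(combinations: int) -> tuple:
    """Return (bits, single-CPU days) for a given number of combinations."""
    bits = math.log2(combinations)
    days = combinations / TESTS_PER_SECOND / SECONDS_PER_DAY
    return bits, days


for label, combinations in variants.items():
    bits, days = summarize(combinations)
    print(f"{label:<34} {bits:5.1f} bits  {days:12,.1f} days")
```

With the last character fixed, a single CPU gets through the whole space in about a day, while two additional free characters would push the same search to more than a century.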


A Better Dictionary in uBook

I have already mentioned before that my favourite program for reading eBooks on the Pocket PC is uBook. The only disadvantage is that no decent dictionary is available. This was already a problem when I had to purchase books in secure eReader format. Instead of using the Pocket Oxford English Dictionary I already owned, I had to buy another one that could be integrated into the eReader. Now I am in the same situation again, only that no serious publisher will sell a dictionary for uBook. Deadlock.

There is hope, my friends. In the last months a couple of tools (with questionable legal status) became available, among them a script called eReader2Html. With this script it’s possible to convert a secure eReader pdb file into plain HTML. The original version doesn’t work with dictionary files but you can open the script in an editor, look for the “Invalid file format” error and change the raise into a print.

print ValueError('Invalid file format')

After entering name and credit card number as unlock keys and waiting loooong minutes you finally have a good old html file.

The uBook dictionary format is a little bit, well, special. You must surround the entries, as described in the FAQ, with <df> and </df>, e.g.

<DF><strong>acid rain</strong></DF> rain or other precipitation with a high concentration of acids produced by sulfur dioxide, nitrogen dioxide, and other such gases that result from the combustion of fossil fuels: it has a destructive effect on plant and aquatic life, buildings, etc.<br>

That’s not all. The HTML file should be split into several parts to improve performance. The author suggests using splitdict, a program that is available in the tool section on the author’s homepage. However, this program looks for <dd> and <dt> tags, not for <df>. I finally ended up converting the more than 80000 lines manually. This was a great task for UltraEdit but you can use any editor you like. I have added one more <br> at the end of each line to make the files more readable.

The last step is to create a short 000.html file. Usually it serves as the index for the reader but if you don’t need it (I don’t) you can make it very short. It must be there though, otherwise your newly created dictionary won’t be recognized. Put all the files into a ZIP, copy it into the program’s dictionary folder and you are done.
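The splitting and packing could also be scripted. The following is only a sketch under several assumptions: the converted dictionary has one entry per line, the part files are simply numbered 001.html, 002.html and so on, the index page content is arbitrary, and 5000 lines per part is a guess. Only the 000.html index and the ZIP container come from the description above, so check the result against the uBook FAQ.

```python
import tempfile
import zipfile
from pathlib import Path

INDEX_PAGE = "<html><body>Dictionary</body></html>"  # assumed minimal index


def build_dictionary(source_html: Path, out_zip: Path, lines_per_part: int = 5000) -> list:
    """Split a one-entry-per-line dictionary and zip it together with 000.html."""
    lines = source_html.read_text(encoding="utf-8").splitlines()
    entries = [line for line in lines if line.strip()]
    names = ["000.html"]
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        # The index has to exist, even if it is never used.
        archive.writestr("000.html", INDEX_PAGE)
        starts = range(0, len(entries), lines_per_part)
        for number, start in enumerate(starts, start=1):
            name = f"{number:03d}.html"   # assumed naming scheme for the parts
            chunk = entries[start:start + lines_per_part]
            archive.writestr(name, "\n".join(chunk) + "\n")
            names.append(name)
    return names


# Small self-contained demo with twelve made-up entries, five per part.
with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "dictionary.html"
    sample = (f"<DF><strong>word{i}</strong></DF> definition {i}<br><br>" for i in range(12))
    source.write_text("\n".join(sample), encoding="utf-8")
    target = Path(workdir) / "mydict.zip"
    created = build_dictionary(source, target, lines_per_part=5)
    with zipfile.ZipFile(target) as archive:
        stored = archive.namelist()

print(created)
```

For a real dictionary, call `build_dictionary` with the path of the converted HTML file and copy the resulting ZIP into the program’s dictionary folder.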

You should try with the desktop version first to see if the dictionary is really recognized. Open the options menu and activate the new dictionary, then click on it. If uBook doesn’t open it you have done something wrong. If the index page is shown, everything is fine. 🙂


eBooks and DRM (1)

If you read eBooks, you will sooner or later stumble upon the different DRM methods used by the various book formats. The annoying thing is not that the eBook is restricted but that you are forced to use a specific reader. The big question for me as a customer is: is it legal? Am I allowed to bypass the DRM to be able to read my own private, purchased copy with another program? Let’s keep the question open for a while.

The least appealing program is the Microsoft Reader. After replacing my old PocketPC with a newer WM5 model I had big trouble activating my device. I found some information on the web and finally succeeded but it’s incredible that it didn’t work out of the box. Unfortunately Microsoft has stopped putting effort into their reader, resulting in a GUI that is barely customizable. There is hope though. The open-source ConvertLit lets you decrypt and explode all available LIT eBooks. Once this is done you can convert them into another format and use whatever reader you like.

A program that I like is the eReader. In the (now free) Pro version you can integrate dictionaries, which helps a lot if you are not a native speaker. eBooks are in the PDB format. Some clever person was able to figure out the protection scheme. You can use pdbshred to decrypt and explode your pdbs using your name and credit card number (as in eReader). Recently I became aware of another program that not only decrypts an eBook but also converts it from PML into HTML.

The last format, with a reader I basically like, is MobiPocket. The prc format is similar to PDB with some custom (undocumented) extensions. Early versions had a weak protection scheme allowing a generic ID to be used as decryption key. This has changed but again some smart person was able to figure out the protection scheme.

I really wonder how it was possible to reverse engineer the secure file formats. In case of the Microsoft LIT format I can imagine that someone found the DLL which does the actual decryption. This isolates the responsible code and makes it easier to attack the routines. No inside knowledge would be required.

PDBs and PRCs are a different story. Both require some basic knowledge about the data structures and about the encryption routines used. Take the secure Mobipocket format for example. The key is encrypted using the Pukall stream cipher. Some years ago I was highly interested in cryptography but I had never heard of this particular algorithm (created in the 90s). So how did the knowledge leak out? Or is it possible to find it out using normal reverse engineering? We will see. In my next blog entries I plan to have a closer look at the DRM routines.

One last point. The status of ConvertLit was discussed heavily when the program came out. I am surprised that an official website exists where tools are published that allow people to easily crack any LIT file. Is it tolerated because LIT comes from Microsoft? Does Microsoft simply not care? (Unlikely.) pdbshred is more hidden but nevertheless easily available. The latest version can only attack secure eReader files and not MobiPocket books, but it comes in source and binary form and is easy to use. The quite new MobiDeDrm, on the other hand, is a Python script and requires an interpreter. A normal Windows user, not used to the command line or the inner workings of script files, would have big problems using it. Despite its limited use it’s impossible (?) to find the script. Traces are removed very quickly (e.g. Dark Reverser’s weblog has the source code only as a comment; the pastebin entry is gone). This could mean that the algorithm used is indeed rather weak – we will see.

(If you didn’t know already, Mobipocket is owned by Amazon and the Amazon Kindle is using a slightly changed secure PRC format.)
