Kindle xRef link bug


When I first started generating Mobi files (for the Kindle) in addition to my ePub workflow, I came up with a list of annoying formatting issues that I just accepted and chalked up to “Kindle Formatting Limitations”. For example:

  • No embedded fonts
  • Extremely limited stylesheet options
  • Formatting being ignored when linking from one section to another
  • Page elements being pushed to the previous page when linking from one section to another
  • And so on

However, the item highlighted in green (in the above list) bugged me the most because it just didn’t seem right.

After digging much deeper into the issue, I finally found the cause, followed by a solution.
(I use InDesign CS5.5 for the majority of my ePub/Mobi workflows, so that is what I will be referencing in the following examples.)

My workflow:

It seems that the Kindle will choke if an “Anchor” id is placed (inline) with the referenced line of text:

<h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>

This example references the id in “Green” above:

Jumping to Chapter 2 from the TOC link causes no formatting issues on the Kindle

And this example references the id in “Red” above:

Jumping to Chapter 2 from an in-text cross-reference causes the Kindle to ignore formatting for the destination line

Why are there 2 different “id” locations?

InDesign CS5.5 creates ePub “id”s based on 2 types of sources:

  1. In the ePub Export Settings Dialog, you choose which InDesign TOC preset to use in order to create the .ncx file for the ePub

    When InDesign creates these links within the ePub files, it places the “id”s inside the opening html tag:

    <h1 id="toc_marker-2">

    When run through Kindlegen to create the Mobi file, these links work just fine on the Kindle.

  2. When you create internal Hyperlinks and Cross-references within InDesign, they export to the ePub as “id”s on empty <a> tags, which are placed between the opening and closing tags of the corresponding HTML element:
    <h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>

    When run through Kindlegen to create the Mobi file, these links cause the “loss of formatting” issue (illustrated above).

Note: both of these “id” locations are perfectly valid HTML and ePub. The Kindle is the only eBook reader/device (that I am aware of) that does not play nicely with both.
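
To see which kind of “id” a given file contains, something like the following Python sketch could be used. It is only an illustration: the function name `classify_ids` and both regular expressions are my own assumptions, and they are written against the simplified markup shown in this post, not against everything InDesign might export.

```python
import re

# Ids placed directly on an opening tag, e.g. <h1 id="toc_marker-2">.
# The (?!a\b) lookahead skips <a> tags so anchors are not counted here.
TAG_ID = re.compile(r'<(?!a\b)(\w+)\b[^>]*\bid="([^"]+)"[^>]*>')

# Ids carried by an empty <a id="..."/> sitting right after an opening tag.
INLINE_ANCHOR = re.compile(r'<(?!a\b)\w+\b[^>]*>\s*<a id="([^"]+)"\s*/>')

def classify_ids(html):
    """Return (ids on opening tags, ids on inline anchors) found in html."""
    tag_ids = [m.group(2) for m in TAG_ID.finditer(html)]
    anchor_ids = INLINE_ANCHOR.findall(html)
    return tag_ids, anchor_ids

sample = '<h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>'
print(classify_ids(sample))  # (['toc_marker-2'], ['Anchor'])
```

Any id that shows up in the second list is a candidate for the “loss of formatting” issue described above.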

How to fix this on the Kindle?

After some testing, I found that simply moving the “bad id” out of the HTML element, so that it sits in front of the opening tag, solves the issue:

From this:

<h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>

To this:

<a id="Anchor"/><h1 id="toc_marker-2">CHAPTER 2</h1>

How I accomplish this using GREP

I use either Oxygen or TextWrangler (on my Mac) to do my ePub/Mobi post-processing.

  • Oxygen is my preferred program because it does not require cracking open the ePub ahead of time. And it also has built-in ePub 2 validation. However, it is painfully slow when trying to apply batch fixes on large ePub files (hundreds or thousands of internal ePub HTML files), which I do a lot of.
  • I use TextWrangler to run my batch fixes on these larger projects. You need to crack the ePub file open to do so, but it is extremely fast and worth the extra step. And it is also free.
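
For anyone who would rather script the “crack open” step than do it by hand, here is a minimal Python sketch using only the standard library. The function names `unpack_epub` and `repack_epub` are hypothetical, and this is not how Oxygen or TextWrangler handle it. The one detail it takes care of is that an ePub is a zip archive whose `mimetype` file has to be the first entry and stored uncompressed.

```python
import os
import zipfile

def unpack_epub(epub_path, dest_dir):
    """Extract every file in the ePub (a zip archive) into dest_dir."""
    with zipfile.ZipFile(epub_path) as zf:
        zf.extractall(dest_dir)

def repack_epub(src_dir, epub_path):
    """Zip src_dir back into an ePub, writing 'mimetype' first and uncompressed."""
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry goes in first, with no compression.
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        # Everything else is added afterwards, compressed.
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname != "mimetype":
                    zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

With made-up file names, the round trip would be `unpack_epub("book.epub", "book_src")`, then the batch fixes on the files in `book_src`, then `repack_epub("book_src", "book_fixed.epub")`. It is still worth validating the result before running it through Kindlegen.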

Here is the GREP pattern that I would use to fix the example in this post:


(<h\d/?[^\>]+class="/?[^\>]+">)(<a id="Anchor"/>)



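The above pattern splits the search into 2 captured slices (the opening heading tag and the anchor), and the replacement flips them (in TextWrangler, that is \2\1). Note that, as written, the search expects the heading tag to carry a class attribute, which the simplified examples in this post leave out.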

It can (and should) be re-written to account for other HTML tags (e.g. <p>, <pre>, <div>, etc.) and multiple “Anchor id”s (e.g. “Anchor-1”, “Anchor-12”, etc.).
Basically, it can be re-written to account for your specific situations.
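
As a rough idea of what such a re-write could look like, here is a Python sketch of the same flip. It is not the pattern I run in TextWrangler: the tag list, the “Anchor” / “Anchor-<number>” id scheme, and the function name `move_anchors_out` are assumptions for illustration, and unlike the pattern above it does not require a class attribute.

```python
import re

# Group 1: the opening tag of a block element (h1-h6, p, pre, div), any attributes.
# Group 2: one or more empty anchors whose id is "Anchor" or "Anchor-<number>".
INLINE_ANCHORS = re.compile(
    r'(<(?:h[1-6]|p|pre|div)\b[^>]*>)((?:<a id="Anchor(?:-\d+)?"\s*/>)+)'
)

def move_anchors_out(html):
    """Flip the two groups so the anchors sit in front of the opening tag."""
    return INLINE_ANCHORS.sub(r'\2\1', html)

before = '<h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>'
print(move_anchors_out(before))
# <a id="Anchor"/><h1 id="toc_marker-2">CHAPTER 2</h1>
```

Markup that has no inline anchor passes through unchanged, so it should be safe to run over every HTML file in the ePub, but test it on a copy first.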

I hope this is helpful to you all.



Indexes for eBooks (are they necessary?)

I have been banging my head about this issue for a while now.

  • If my printed book has an index, should my eBook?
  • If the book was written and indexed in InDesign, how do I convert the index to clickable hyperlinks for my eBook?
  • Does an eBook index add any more value than the ability to search the digital book?
  • Isn’t the ability to search an eBook much more convenient and intuitive than having an index?

I have converted many types of books to eBooks (ePub and Mobi). Some have included short indexes. Some have no indexes. I have even done one where I included the print book index, but called it “Index of suggested search terms for eBook”, and did not make them clickable.

I am now working on a GIANT conversion of a 900+ page cookbook that includes an 80+ page printed index. Translated to ePub or Mobi, this index will probably run close to 1000 pages when viewed on a Kindle, iPad, or similar device.

Does this really make any sense?

After thinking about this for a while this morning, I have come to some personal conclusions about indexing for eBooks:

  1. Long indexes have no place in an eBook. The ability to search the content covers any need for it.
  2. A very short index is ok, as long as it doesn’t span more than a few pages on an iPad, Kindle, etc.
  3. Printed indexes are convenient because you can quickly flip through the pages with your fingers to find what you are looking for. And then you go to that particular page in your printed book.
  4. Long indexes also tend to break eBooks when reading on certain devices, because it is just too much information for the device to handle.

I am curious as to other people’s thoughts on this subject.