Kindle xRef link bug

Background

When I first started generating Mobi files (for the Kindle) in addition to my ePub workflow, I came up with a list of annoying formatting issues that I just accepted and chalked up as “Kindle Formatting Limitations”. ie:

  • No embedded fonts
  • Extremely limited stylesheet options
  • Formatting being ignored when linking from one section to another
  • Page elements being pushed to previous page when linking from one section to another
  • And so on

However, the item highlighted in green (in the above list) bugged me the most because it just didn’t seem right.

After digging much deeper into the issue, I finally found the cause, followed by a solution.
(I use InDesign CS5.5 for the majority of my ePub/Mobi workflows, so that is what I will be referencing in the following examples.)

My workflow:
InDesign–>ePub–>(Oxygen/TextWrangler)–>ePub–>Kindlegen–>Mobi

It seems that the Kindle will choke if an “Anchor” id is placed (inline) with the referenced line of text:

<h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>

This example references the id in “Green” above:

Jumping to Chapter 2 from the TOC link causes no formatting issues on the Kindle

And this example references the id in “Red” above:

Jumping to Chapter 2 from an in-text cross-reference causes the Kindle to ignore formatting for the destination line

Why are there 2 different “id” locations?

InDesign CS5.5 creates ePub “id”s based on 2 types of sources:

  1. In the ePub Export Settings Dialog, you choose which InDesign TOC preset to use in order to create the .ncx file for the ePub

    When InDesign creates these links within the ePub files, it places the “id”s inside the opening html tag:

    <h1 id="toc_marker-2">

    When run through Kindlegen to create the Mobi file, these links work just fine on the Kindle.

  2. When you create internal Hyperlinks and Cross-references within InDesign, they export to ePub links as “id”s wrapped in <a> tags, and then placed in-between the corresponding HTML opening/closing tags:
    <h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>

    When run through Kindlegen to create the Mobi file, these links cause the “loss of formatting” issue (illustrated above).

**It is worth noting that both of these examples of “id” locations are perfectly valid HTML and ePub. The Kindle is the only eBook reader/device (that I am aware of) that does not play nicely with both.

How to fix this on the Kindle?

After some testing, I found that simply moving the “bad id” outside of the HTML tag, solves the issue:

From this:

<h1 id="toc_marker-2"><a id="Anchor"/>CHAPTER 2</h1>

To this:

<a id="Anchor"/><h1 id="toc_marker-2">CHAPTER 2</h1>

How I accomplish this using GREP

I use either Oxygen or TextWrangler (on my Mac) to do my ePub/Mobi post-processing.

  • Oxygen is my preferred program because it does not require cracking open the ePub ahead of time. And it also has built-in ePub 2 validation. However, it is painfully slow when trying to apply batch fixes on large ePub files (hundreds or thousands of internal ePub HTML files), which I do a lot of.
  • I use TextWrangler to run my batch fixes on these larger projects. You need to crack the ePub file open to do so, but it is extremely fast and worth the extra step. And it is also free.

Here is the GREP pattern that I would use to fix the example in this post:

Find:

(<h\d/?[^\>]+class="/?[^\>]+">)(<a id="Anchor"/>)

Replace:

\2\1

The above pattern splits the search pattern into 2 slices and then flips them.

It can (and should) be re-written to account for other HTML tags (ie. <p>, <pre>, <div>, etc.) and multiple “Anchor id”s (ie. “Anchor-1”, “Anchor-12”, etc.).
Basically, it can be re-written to account for your specific situations.

I hope this is helpful to you all.

Cheers.

Advertisements

7 comments

  1. It’s a nasty MOBI bug (one of many), and this is an interesting method. I never thought about using a self-closing anchor tag. As an alternative I wrap headings in a div id=”xxx” tag and it seems to work okay on the Kindle.

    1. Glad to hear it Howie.
      Also, here is another way to get around this bug:
      For any links or xrefs that you may have within your file, if the target is the first item after a proper “break” (separate xHTML file in the ePub or in the Mobi file), then you don’t need the “anchor” at the end of the url.
      Simply removing the “anchor” from the end of the URL link will also resolve this issue.
      However, do not do this for links that are looking for anchors that are deep within a section. For these, my original suggestion is the way to go.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s