Indexes for eBooks (are they necessary?)

I have been banging my head about this issue for a while now.

  • If my printed book has an index, should my eBook?
  • If the book was written and indexed in InDesign, how do I convert the index to clickable hyperlinks for my eBook?
  • Does an eBook index add any more value than the ability to search the digital book?
  • Isn’t the ability to search an eBook much more convenient and intuitive than having an index?

I have converted many types of books to eBooks (ePub and Mobi). Some have included short indexes. Some have no indexes. I have even done one where I included the print book index, but called it “Index of suggested search terms for eBook”, and did not make them clickable.

I am now working on a GIANT conversion of a 900+ page cookbook that includes an 80+ page printed index. Translated to ePub or Mobi, this index will probably run close to 1000 pages when viewed on a Kindle, iPad, or similar device.

Does this really make any sense?

After thinking about this for a while this morning, I have come to some personal conclusions about indexing for eBooks:

  1. Long indexes have no place in an eBook. The ability to search the content covers any need for it.
  2. A very short index is OK, as long as it doesn’t span more than a few pages on an iPad, Kindle, etc.
  3. Printed indexes are convenient because you can quickly flip through the pages with your fingers to find what you are looking for. And then you go to that particular page in your printed book.
  4. Long indexes also tend to break eBooks when reading on certain devices, because it is just too much information for the device to handle.

I am curious as to other people’s thoughts on this subject.

Ron

35 comments

  1. A good print index is not simply a list of search results for a given keyword. It is a list of significant appearances of a term or combination of terms, as determined by a qualified indexer. As such, I think the print index does have a place in an eBook, but it is worthless without hyperlinks to the location of the referenced text within the eBook.
    If print page location IDs are included in the coding of the eBook, it is easy enough to build a linked index in the eBook based on the print index. It is still a bit of a shotgun approach, as the link will not take you to the exact location of the text referenced, but only the beginning of the print page on which it appears. It may be two or three eBook pages away from the location of that print book ID.
    Far better would be to have an indexer code the content for indexing within the HTML/XML file, with IDs around the indexed content. That way the index link could go directly to the indexed content. Since many of the reading platforms are notorious for going to the first link on an eBook “page,” even this may not make indexes truly useful.
    Until the sophistication of ePUB3 and updated reading platforms allows for creating a useful taxonomy of an eBook’s content that is searchable and filterable, we are left with a bulk keyword search function and/or an index that links “close enough” to the content the reader is searching for. Neither option is optimal. We can only hope that the standard evolves quickly to something more useful.
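
A minimal sketch of the page-anchor approach described in this comment, in Python. It assumes the eBook's HTML already carries print page anchors with ids of the form `page_123` in a single file called `book.xhtml`. Both of those names, the `link_index_entry` helper, and the sample entry are hypothetical and only serve as illustration.

```python
import re

# Assumption: print page breaks are marked in the eBook HTML with anchors
# such as <a id="page_123"/>, all in one content file (hypothetical names).
PAGE_HREF_TEMPLATE = "book.xhtml#page_{page}"

def link_index_entry(entry: str) -> str:
    """Wrap each page locator of a print index entry in a hyperlink."""
    def to_link(match: re.Match) -> str:
        first_page = match.group(1)  # a range links to its first page
        href = PAGE_HREF_TEMPLATE.format(page=first_page)
        return f'<a href="{href}">{match.group(0)}</a>'

    # A locator is a page number or page range that follows ", " and is
    # itself followed by a comma or the end of the entry.
    return re.sub(r"(?<=, )(\d+)(?:[-–]\d+)?(?=,|$)", to_link, entry)

print(link_index_entry("Apples, baked, 12, 45-47"))
```

As the comment notes, each link only lands on the start of the former print page, so the reader may still have to page forward to reach the indexed passage.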

    1. You make some very good points, Matthew.
      But what about a 900-page cookbook with an 80+ page index (printed)?
      As an eBook, that index will stretch up to 1000 pages (depending on device/window size and font size).

      In my opinion, paging or scrubbing through this amount of data just to find a particular index entry/term is total overkill. Not to mention that it could take quite some time to get to the correct spot.

      In this particular situation, wouldn’t you agree that using the search function is a far better choice?
      Especially if the book already contains 2 TOC lists (1 listing the chapters and the other listing every recipe in the book).

      I feel like there needs to be some balance.
      As you said, a well thought out index with helpful terms is very useful for some books, but if it becomes so long that it takes forever to find what you are looking for, then it loses its value (in a digital book).

      For this particular project (giant cookbook) I feel that the full listing of recipes is more than enough to cover what the reader would potentially need.

      Thoughts?

  2. One suggestion I have is to run a regex that will build IDs to each h1, h2, h3, and h4 in the book. Chances are that with all heads linkable, the index entries will conveniently link to a close-enough point in the text. Obviously, depending on the type of content, your mileage may vary.
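
A minimal sketch of that regex idea, in Python. The function name `add_heading_ids` and the `head-N` id scheme are hypothetical choices, and a regex is only a rough tool for HTML: a real conversion might be safer with an HTML/XML parser.

```python
import re

def add_heading_ids(html: str, prefix: str = "head") -> str:
    """Give every h1-h4 opening tag that lacks an id a sequential one."""
    counter = 0

    def add_id(match: re.Match) -> str:
        nonlocal counter
        tag, attrs = match.group(1), match.group(2)
        # Leave headings that already carry an id attribute untouched.
        if re.search(r"(?<![\w-])id\s*=", attrs, re.IGNORECASE):
            return match.group(0)
        counter += 1
        return f'<{tag} id="{prefix}-{counter}"{attrs}>'

    # Matches opening tags only, e.g. <h2> or <h3 class="recipe">.
    return re.sub(r"<(h[1-4])\b([^>]*)>", add_id, html, flags=re.IGNORECASE)

sample = '<h1>Soups</h1><h2 class="recipe">Leek and Potato</h2>'
print(add_heading_ids(sample))
```

Index entries can then point at the nearest heading id, which is the "close enough" linking described in the comment.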

  3. Haven’t seen an eBook yet that benefitted from an index. Then again, haven’t seen many printed books that are well indexed. I fear that too many indexes are not created by professional indexers and perform little better than glorified contents — this is particularly so for cookbooks.

    As you say Ron, the ability to search for terms should be sufficient for a cookbook. This depends, of course, on how you expect people will interact with the book. Do people search through a cookbook for ‘egg’ recipes? If so, is the structure of your recipe listing going to help them?

    m.

  4. I like indexes, but in some epub files, they can be bothersome and almost useless. I bought an epub and PDF of an O’Reilly book on CSS. I naturally preferred the epub, but I found that both the TOC and index were so huge that it was cumbersome to scroll through everything.

    On PDFs, I could get to the Index — it was easier to flip through the pages and peek at nearby index subjects.

    I recently created a specialized index (Works Cited in this Book) which was useful reading in and of itself. But if it had too many elements, it would probably be cumbersome.

    With regard to cookbooks, I imagine the challenge is deciding which ingredients to create an index term for.

  5. I’m so glad I found this post! I work in a small publishing office where our books have numerous, lengthy indexes that are fine for print, but I started to question the necessity of them as I began formatting our first ePub files. The idea of scrolling through a list of 1,600+ names when you could just key in a search term seemed ridiculous… As I paged through my test version of the ePub file, I lost patience with the index structure, and I know our audience would as well. This post confirms my line of thought. Now just to convince the key stakeholders… 🙂

  6. Indexes are far more than just a list of words (that would be a concordance, not an index). While the terms used in an index /may/ occur exactly as written in the text (and therefore be useful as search terms), the added value of a human-created index is that the indexer pulls together alternative terms for the same thing, groups discussions under a single term, references related terms, and more. None of these things happen with a word list or a simple keyword search. To take a simple example, consider a history text where the author uses the words “Charles Edward Stuart” “Bonnie Prince Charlie” and “Prince Charles” (perhaps even “the Jacobite pretender”) interchangeably. There is no single keyword search that will bring up all discussions of this person. But a human-created index can and will.
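
The difference can be illustrated with a small Python sketch. The passages, location ids, and index data below are invented for illustration only. The point is that a plain keyword search finds only the exact wording, while an index entry with "see" cross-references gathers every discussion of the person.

```python
# Hypothetical passages from a history text, keyed by a location id.
passages = {
    "ch3-p12": "Charles Edward Stuart landed in Scotland in 1745.",
    "ch3-p40": "Bonnie Prince Charlie marched south to Derby.",
    "ch4-p07": "The Jacobite pretender fled after Culloden.",
}

def keyword_search(term: str) -> list[str]:
    """Plain search: locations whose text contains the exact term."""
    return [loc for loc, text in passages.items() if term.lower() in text.lower()]

# An indexer's entry gathers every discussion under one heading, and
# "see" cross-references point the alternative names to that heading.
index = {"Stuart, Charles Edward": ["ch3-p12", "ch3-p40", "ch4-p07"]}
see = {
    "Bonnie Prince Charlie": "Stuart, Charles Edward",
    "Jacobite pretender": "Stuart, Charles Edward",
}

def index_lookup(term: str) -> list[str]:
    """Follow a 'see' reference if there is one, then return the locators."""
    return index.get(see.get(term, term), [])

print(keyword_search("Bonnie Prince Charlie"))  # only the passage using that name
print(index_lookup("Bonnie Prince Charlie"))    # all three passages
```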

    macgrunt may not have seen very many good print indexes, but that simply means that the books he’s looked at were poorly indexed. It doesn’t necessarily follow that /all/ indexes are bad, nor that they are unnecessary in e-books.

    I agree that indexes as implemented in ebooks are less than ideal, however. They do not, for example, take advantage of technologies such as autocomplete or autoscrolling (cf. Christine’s and Robert’s complaint about scrolling through a really long index). Fixing this shortcoming requires two things: better encoding of indexes in epub format, including retaining the links, and better functionality by e-readers. Efforts are underway to address this — see for example http://code.google.com/p/epub-revision/wiki/IndexesMainPage .

  7. When you create online help in programs like RoboHelp, you can embed keywords that are used by the search function. It is a great way to embed alternate terms to improve the user experience. It seems logical to me that the same functionality could be embedded into ebooks.

  8. Search often retrieves an unmanageable number of hits or page numbers. We indexers break down that long list of page numbers into a user-friendly set of subheadings by using creative analysis of the text, so that you can actually find what you are looking for in a flood of information.

  9. The fundamental error here is taking a print index and converting it for the ebook. Instead, you hire an indexer and get them to create a fully linked index for the ebook – the index should be born digital. Linking to the ‘page that was’ is really not good enough – with a large font size on a Kindle or iPhone you could easily be 15+ pages away from the desired target. This is discussed here: http://ccgi.jalamb.com/2011/05/kindle-and-the-index/

    1. Hi James,
      Though I agree with pretty much everything that you are saying, there is currently no good way to deal with digital indexes (especially across multiple reader platforms). For now, we are stuck using what is available to us.
      But when the day comes that there is a functional digital indexing solution, it will be worth it to have indexing professionals involved so that it works as it should.
      I look forward to that day.

      1. If you’re starting with paper, then you’re right that at present there isn’t a good non-manual solution. But if you’re working from electronic source files (e.g. InDesign, as in your second bullet point, or XML or Word), can you not use embedded indexing? Rather than linking to the page-that-was, you link right to wherever the term was embedded.

      2. Hi Michele,
        Currently, even if you have an index that is properly done within (and generated by) InDesign, that content is not included when exported to ePub.
        The reason is that the code that generates all the internal links to create the index is still based on page numbers. The links do not actually reference back to the index marker itself.
        I have had some very productive discussions, with some of the Adobe engineers, about solving this problem so that the indexes will eventually be included as clickable links when exported to ePub.
        I am hopeful that this will happen fairly soon.
        Cheers,
        Ron

      1. Hello Ron,

        The plugin is under development at the moment and not very stable. It currently works for PDF; if you have an indexed InDesign document, I can do a HyperIndex so you can see the results.

        Rich

      2. Hi Rich,
        Sorry for the delayed response. I am definitely still interested in looking at your plugin.
        As a matter of fact, some of our (O’Reilly) indexers have heard of the plugin and are starting to ask about it.
        You may be on to something.

        Ron

  10. Dave Ream (Co-chair of the EPUB3 Indexes Working Group as a representative of the American Society for Indexing http://www.asindexing.org)
    The WG has been meeting this year to add a module to the standard for index “tagging”. In addition to the obvious pinpoint linking functionalities noted above, we have ideas for other types of functionality. We are still a year or more from these being implemented in any reading system. Our minutes and other documents are publicly available at https://code.google.com/p/epub-revision/wiki/IndexesMainPage?ts=1322858948&updated=IndexesMainPage

  11. On a separate note, LevTech is working as part of the ASI effort to develop workflow and tools which aid in taking a legacy print index and transforming it into a pinpoint index for use with eBooks. This does require involving an indexer again to review the print index to assign the pinpoint ids.
    For the future, single-source publishing is a better approach, but this requires a lot of upgrades to extant tools, whether InDesign or stand-alone indexing software such as CINDEX http://www.levtechinc.com/publishing-indexing-products/cindex-software. For the latter, Indexing Research’s CINDEX and LevTech’s HTML/Prep can be used to create an EPUB2/3 index outside of InDesign.

  12. And for those of you working in InDesign, we have a workaround for getting both a linked index for your epub and a print index at the same time. http://www.wrightinformation.com/Indesign%20scripts/Indesignscripts.html

    The error with thinking about the index as passe in ebooks is that you lose a lot of important semantic metadata and linkages, as Michele notes above. If you are interested in making your books more discoverable, you will need that pinpointed metadata. Maybe not this minute, but think about repurposing and reusing your content, and about ereaders that have evolved and incorporated better search facilities in the future. For instance, taking the cookbook and creating just a pastry ebook from it: when you pull the chapters, you can also pull the index and its metadata for just that smaller set. If you want to mashup several books and retrieve data across them, index data is invaluable as an entry point.

    Think of indexing as pinpoint metadata, and that you want your content to have that kind of detailed semantic metadata. As Michele points out, you want your content to have intrinsic data that knows “Charles Edward Stuart” “Bonnie Prince Charlie” and “Prince Charles” (perhaps even “the Jacobite pretender”) are all the same person, and indicates it. That’s what indexes can provide. They are controlled vocabularies.

    Current generic eReader-supplied search engines within small bodies of content usually give you generic results. Good search, Bonnie Prince Charlie-style search, comes from an engine being able to learn a specific body of content and from humans adding rules and congruencies to the engine. Indexers provide that detail when the search engine is too primitive or generic.

    One of the goals of the IDPF Indexes Working Group is to provide index markup that eReader manufacturers can customize features around, and incorporate into their search displays, making the search more fully featured. The IDPF can’t dictate that any reader company do this. If you have the data marked up, wouldn’t it be nice to have it fully utilized and easily accessible as a pop-up screen that autoscrolls as you type in your search term? Online help has had similar indexes for decades, so we already have a model for the interface.

  13. I have not tried InDesignCC yet, but the best solution, really the only one out there after searching for months, is a script collection called IndexUtilities by Kerntiff Publishing Systems ( http://www.kerntiff.co.uk/ ). Actually I discovered it on this forum, so I thought this would be an appropriate place to sing its praises.
    I just finished a book for Kindle that was a print book previously, and the material could not go without an index. About 10% of the print book consists of the index, so it has a pretty large footprint. I didn’t have the luxury of waiting for Adobe or the new index standards coming out in a year or two.
    In the IndexUtilities there is a script called HyperIndex, which hyperlinks all your index entries. It creates a beautiful index that works extremely well. It plays well with all the paragraph and character styles. Since it also comes with a ton of other scripts for index editing, importing, exporting, and moving index markers around, it actually streamlines the entire indexing process. Customer service is excellent, too. I hit some snags here and there (mostly me, I must confess) and got great support from Rich over at Kerntiff.
    For those of you who don’t switch over to the Cloud (I used InDesign CS6), I really recommend this solution. In fact, since the jury is still out on InDesignCC’s indexing, I recommend it to anyone.

    1. Leverage Technologies has HTML/Prep which works on a pseudo-tagging scheme that can be produced from CINDEX (www.indexres.com). It generates indexes for the web and for an EPUB2 eBook now. An EPUB3 index will be ready to go when the specification is approved by the IDPF.

  14. I’ve been testing CC, and its EPUB active index output is working great. I think the default index it produces needs to have a bit more of an indent, just for readability. There’s an easy edit to the CSS that can bring that about. More here: http://www.wrightinformation.com/Indesign%20scripts/Indesignscripts.html

    I’m about to test entries in tables and solo text blocks (as you would need if you had a graphic taking up an entire page, and needed to index it somehow: pasting on an empty text block to hold an entry is the workaround.)

  15. I think it depends on the type of book you are wanting to use.

    I used TomeRaider on Symbian for a long time for Hymnbooks and Bibles. If I am in Church, I want to quickly open a book to a particular Hymn number, or to a particular Book:Chapter:Verse.

    An index provides a quick way to do this. For all books, a quick way to jump to any particular Chapter should be available. But a quick search ought to suffice for many types of books.

    For a cookery book, an index is a quick way to find suggestions. Turn to the “Desserts” section and browse the recipes. Got a load of apples that you want to use up? Then find “Apples” in the index, and if the index has been put together well, it makes suggestions as to what you can make.

    I have yet to find anything that I can easily use to replace TomeRaider.

  16. I’ll grab your RSS feed right away, as I cannot find
    your e-mail subscription link or e-newsletter service.
    Do you have any? Kindly let me know so that I may subscribe.
    Thanks.

  17. Do you know of any tools to auto-index EPUBs? I’m looking for one that is based on the EPUB3 CFI standard, preferably.

    1. Hi Bill,
      I have heard of, and even played with, a few plugins for InDesign that were supposed to do this, but they all failed miserably. If you happen to use InDesign to create your ePubs, the newest version (CC) includes the index as part of the exported ePub file, as long as it was built using ID’s Index generation dialog.
      Aside from that, custom scripting may be a good way to go, depending on your workflow.

      If you happen to find something that works, I would love to hear about it.

      Cheers,
      Ron

    2. Good indexes cannot be created by automated tools. I’ve been working with indexes for 40 years and have lost track of the number of software packages that have purported to do this, and they have all faded away. At best you get something that takes so much editorial work to clean up that you should have just had an indexer create it. There are assists that can help a human indexer, but nothing fully automatic, any more than there is full language speech recognition.

      1. I am inclined to agree with David. A good indexer can create a truly functional piece that makes the book usable. Software can sort out terms, but a list of terms with page numbers isn’t an index. You still need a person to sort them out and generate secondaries and tertiaries. That takes reading and comprehension of the topic. You will wait a long time before software can do that.
        In the last book e-conversion I did, we had an indexer give us a terrific index, and then we made it work with Kerntiff’s brilliant InDesign indexer. The reason I liked this work-path so much is that the indexer did not need to know anything about InDesign. She did her work the way she always does. We had the slog to make all the links, but then, we had the software. And once we ran the indexer, out came a thing of beauty.
        And to answer Barry’s question, usability is the key. It completely depends on the type of book. Generally the more informational, the greater the need for an index.

    3. Were you able to find anything on this? We’re currently working on a very similar implementation and would love to know what you found out. Thanks

      1. There has not been much movement in the area of ePub indexing. However, if you are using InDesign CC and your files are properly indexed using ID’s index function, you will get a full (clickable) index as part of your ePub export.
        So, that is at least a step in the right direction.
