Reflections on some early E-book experiments

It’s nearly 2 years since I got a Kindle, and back then I quickly started filling it with as much free stuff as possible from Amazon. I even got about halfway through Middlemarch, before I started exploring the other possibilities of the device.

First, I discovered the Amazon email service for document delivery:

You and your approved contacts can e-mail personal documents to your Kindle Keyboard through your Send-to-Kindle e-mail address. You must ensure that:

  • You have approved the sender’s e-mail address.
  • You gave the sender your Send-to-Kindle e-mail address.
  • Your document is a supported file type.

That proved pretty successful and is a free service too (as long as you use wifi, not 3G, to pick up the document). At a Repositories Support Project event in Sheffield I even suggested that a Download To My Kindle button might be a useful addition to an institutional repository. The germ of an idea!
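As a rough illustration of what a Download To My Kindle button would have to do behind the scenes, here is a minimal Python sketch that builds the email with the document attached. The function name and all addresses are hypothetical placeholders, and the sending step is left as a comment because it depends on a local mail server.

```python
from email.message import EmailMessage


def build_kindle_email(sender, kindle_address, filename, data):
    """Build an email carrying one document as an attachment.

    `sender` has to be an address the Kindle owner has approved;
    `kindle_address` is the owner's Send-to-Kindle address.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = kindle_address
    msg["Subject"] = filename
    msg.set_content("Document attached.")
    # Attach the raw bytes; the file must be a type the Kindle supports.
    msg.add_attachment(
        data,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg


# Hypothetical addresses and placeholder bytes, for illustration only.
message = build_kindle_email(
    "me@example.org", "reader@kindle.example", "anthology.mobi", b"demo"
)

# To actually send it (assumes an SMTP server on localhost):
# import smtplib
# with smtplib.SMTP("localhost") as smtp:
#     smtp.send_message(message)
```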

I next discovered how to use Calibre to automate generating E-books/E-magazines from various content sources, such as online newspapers, in both Kindle’s MOBI format and in the open EPUB format (supported on most other tablet and e-book platforms).

Then I thought about making a ‘proper’ book. To save having to write one myself, I thought I’d create an anthology of short fiction. I downloaded ECUB and started harvesting some classic short stories from Project Gutenberg. The HTML versions of the stories on Gutenberg needed some work to normalise the markup across twenty files – nothing particularly difficult, though I did come to regret deciding to standardise the use of single- and double-quotation marks. The result was a little e-Short Story Anthology – potentially a completely public domain book that I could share or even try to sell online. (In fact one story that I used, and the cover image, are not public domain, but I’d run out of steam by then, and just wanted to shift my proof of concept.)

Playing with ECUB it was easy to see how an EPUB E-book was structured. Essentially it is little different from a simple website. EPUB 2.0 was defined by three open standard specifications, the Open Publication Structure (OPS), Open Packaging Format (OPF) and Open Container Format (OCF), but at the heart of the E-book is a familiar group of HTML, CSS and image files – tightly bound together by the OCF-specified structure, the OPS content specification, and the OPF XML, and zipped into a single file. (EPUB 3 introduces some differences, but is essentially the same.)
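To make that structure concrete, here is a minimal Python sketch that zips up a one-chapter package by hand. The file names under OEBPS/, the title and the identifier are made up for illustration. The sketch also leaves out the NCX navigation file that a fully valid EPUB 2 expects, so treat it as a picture of the layout and not as a validating builder.

```python
import io
import zipfile

# OCF: tells the reader where to find the package (OPF) file.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# OPF: metadata, a manifest of every file, and the reading order (spine).
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Demo Anthology</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="story1" href="story1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="story1"/>
  </spine>
</package>
"""

# OPS: the content itself is ordinary XHTML.
STORY_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Story One</title></head>
  <body><h1>Story One</h1><p>Once upon a time.</p></body>
</html>
"""


def build_epub():
    """Return the bytes of a tiny EPUB-style zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/content.opf", CONTENT_OPF,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/story1.xhtml", STORY_XHTML,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


epub_bytes = build_epub()
```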

I also noticed that ECUB could easily create a Table Of Contents page automatically – just like the web CGI scripts we still sometimes write – and also that it was easy to use hyperlinks between and within the HTML files in the package. This kind of added editorial value, by the way, is noticeably absent from many of the free E-books available on Amazon and elsewhere.
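Generating a contents page really is as simple as it sounds. The following Python sketch is not how ECUB does it (that is not documented here); it just shows the general idea of turning a list of file names and titles into a linked HTML list. The function name and the sample entries are hypothetical.

```python
from html import escape


def build_toc(entries):
    """Turn (href, title) pairs into an HTML list of hyperlinks."""
    items = "\n".join(
        f'  <li><a href="{escape(href, quote=True)}">{escape(title)}</a></li>'
        for href, title in entries
    )
    return f"<ul>\n{items}\n</ul>"


# Hypothetical chapter files inside the package.
toc_html = build_toc([
    ("story1.xhtml", "Story One"),
    ("story2.xhtml", "Cats & Dogs"),
])
```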

In my explorations, I also discovered that, although Calibre could convert my EPUB to an acceptable Kindle/MOBI format, some of the finer points of formatting that I had implemented with CSS in EPUB are not supported by the Kindle. For example, I’d used CSS rules to render quotation marks:

q:before { content: "\2018"; }
q:after { content: "\2019"; }
q q:before { content: "\201c"; }
q q:after { content: "\201d"; }

The reasoning behind this was the idea that it would make it easy to switch between the two common typesetting conventions – single outer/double inner quotation marks and double outer/single inner quotation marks. But to get Kindle to render this, I’d have to revisit the original HTML files and ensure all such niceties were rendered in text or markup only. I may do that one day.
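One way that revisiting could be automated is sketched below in Python: it replaces each q element with literal quotation characters, choosing the pair by nesting depth. This is an assumption about how the job might be done, not something I have run against the anthology. It also differs slightly from the CSS rules, in that it alternates the pairs at every level instead of using double marks for everything below the first, and it assumes the q tags are properly nested.

```python
import re

# Outer single, inner double; reverse the list for the other convention.
QUOTES = [("\u2018", "\u2019"), ("\u201c", "\u201d")]


def flatten_q(html, quotes=QUOTES):
    """Replace <q>...</q> markup with literal quotation marks."""
    depth = 0

    def repl(match):
        nonlocal depth
        if match.group(1):  # a closing </q> tag
            depth -= 1
            return quotes[depth % len(quotes)][1]
        opening = quotes[depth % len(quotes)][0]
        depth += 1
        return opening

    # \b stops the pattern matching other tags that merely start with "q".
    return re.sub(r"<(/?)q\b[^>]*>", repl, html)


sample = "<p><q>He said <q>hi</q> twice</q></p>"
flattened = flatten_q(sample)
swapped = flatten_q(sample, quotes=QUOTES[::-1])
```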

Turning back to EPrints (and also the theme of that presentation for RSP), we know we can get lists of items in the repository in all sorts of formats – RSS, XML, HTML, Endnote, etc. So it seems we have all the building blocks we need to write a script that takes a list of items in a repository, retrieves them, converts them (if necessary and possible), wraps them up according to the EPUB standard, and makes them available for download, or even emails them directly to your tablet or reader. Anthologizr in a nutshell, perhaps.
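The first of those building blocks, reading a list of items, might look something like the Python sketch below. The feed content and URLs are invented for illustration and do not come from a real repository; retrieval, conversion and packaging would follow as separate steps.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for a repository's list export.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Hypothetical repository feed</title>
    <item>
      <title>First paper</title>
      <link>https://repository.example.org/1/</link>
    </item>
    <item>
      <title>Second paper</title>
      <link>https://repository.example.org/2/</link>
    </item>
  </channel>
</rss>"""


def list_items(rss_text):
    """Return (title, link) pairs for every item in an RSS feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


items = list_items(SAMPLE_RSS)
```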


6 thoughts on “Reflections on some early E-book experiments”

  1. One of the problems in terms of creating EPUBs from repository content is that the vast majority of content in repositories is in pdf?

    If it’s any help the CORE API allows you to request a ‘text’ version of papers where we’ve already done the work of converting the pdf to plain text. I can see problems with this as well – maybe loss of structure, headings etc. although I seem to remember there is some work on modelling the structure of research papers?

    I guess the underlying issue is really whether we can move from PDF as the default way of sharing/publishing academic – maybe epub provides an opportunity here…

  2. Hi Owen

    Well indeed, insightful as ever, you have spotted the elephant in the room 🙂 It’s just something we need to take one step at a time.

    On the plus side, many PDF pre-/post-prints in IRs are relatively plainly formatted, so that may be an aid to conversion to something acceptable in EPUB. Perhaps PDF will be increasingly deprecated as anything other than an optional delivery format – in favour of HTML/XML friendly formats that are easily transformable.

    The project is an opportunity to experiment and see how far we can get. And synergies with CoRe are definitely something we’ll be looking at. Perhaps we’ll end up with a model that only works for a single, JORUM-like collection that explicitly eschews PDFs. We’ll see.

    I seriously believe that EPUB offers the strongest hope yet of freeing the world from the tyranny of PDF and the archaic printing paradigms it perpetuates. This para from the EPUB standard states the benefits eloquently – we need to continue to push authors to see dynamic, reflowable e-book output as the gold standard, not the immutable pseudo A4 of PDF. Needless to say I’m totally at one with PMR’s views on the matter (here and many other places); and the E-book/E-reader/Tablet revolution further undermines counter-claims about the merits of PDF for anything other than electronic galley proofs.

  3. I’ve thought a few times about the potential of a repository ‘Send to my Kindle’ button. As Owen mentioned, one small impediment is that the majority of content is often in PDF.

    The other issue, to which there is possibly the need for a centralised solution to be provided by someone, is the need for a trusted broker middleman. If I want repositories to send content to my Kindle, I would need to add each repository’s email address to my Kindle’s approved senders list. If instead there was a central broker service that would do this (think of things like, OA-RJ etc), then I could add a single email address (e.g. and then each repository just needs to implement the service or widget to enable this.

    (It appears that is registered to Maybe this type of service is under development? Perhaps not, if Amazon would prefer most distribution to occur via their site?)

    • Hi Stuart

      I admit I haven’t explored the Email-to-Kindle infrastructure in detail for over a year now, but here’s what I thought at the time:

      If I have told Amazon that “” is an approved email (perhaps even my main Amazon account email address) – then, if my repository profile includes an “Approved Amazon Email” field which I set to “”, can we not simply do something clever in the server email settings to make that the “From” address?

      I am sure we would have to have an “Ebook Reader Preferences” control panel in the repository where this kind of thing would be entered by the user, along with the “To” address for the Kindle.

      But, yes, if this kind of approach gets taken up more widely, maybe a “Calibre-As-A-Web-Service” application would be a central broker. Maybe such things even exist already.

      I was going to discuss in my next post (first draft accidentally trashed, alas) my thoughts on the different ways one can get EPub/Mobi files onto devices – including tablets running the Kindle and Kobo apps – and (*spoiler alert*) even suggest that perhaps we might consider developing our own scholarly e-reader app…. 😉

      • I assumed the email-to-kindle service might check SPF records and things like that for protection – maybe not – perhaps that would break too many things?

  4. I had it working with Calibre a while back – Calibre offers this mailer control panel. We’ve got a perfect opportunity to try similar in Anthologizr – with SNEEP and MePrints between them, Rory and Patrick know enough about personalising EPrints to add a ‘Kindle from’ and ‘Kindle to’ address to the user profile and tweak the mailer functions – then we’ll push the button and see 🙂
