Also arising from promoting Anthologizr at OR2012 was the matter of when, whether and why we perceive electronic documents to truly be ‘books’.
It arose in part from my experience of downloading ‘books’ (in EPUB or MOBI format) from Amazon and elsewhere. Of course these are only electronic files, just like the thousands of PDFs, Word, HTML and text files I have downloaded before, so why do I find it easier to think of these as ‘books’?
Chris Awre has suggested that the significance lies in how the formats are then presented, and I am inclined to agree: it makes a considerable difference that the target device is a dedicated e-book reader – that portable, inexhaustible, infinite book that Borges promised us – rather than a laser printer or a desktop/laptop computer monitor. Not only does the dedicated device have the right dimensions and portability, but it offers “electronic” enhancements, such as automatic bookmarks, and full-text search.
As I was leaving Edinburgh, I was also made aware of the Getty Introduction To Metadata, an “online publication devoted to metadata, its types and uses, and how it can improve access to digital resources.” It is free online in HTML and PDF versions, as well as in a printed version. It doesn’t call itself a “book” and in my views, in its electronic manifestations, most definitely is not an e-book. The HTML version packs all the detritus of being a “website”, and the chunked PDF version no doubt prints beautifully on A4 or Letter, but falls far short of being anything I’d call a “book”.
And yet the Getty Introduction To Metadata could so easily have been issued as an “e-book” too. The chapters are just the kind of material one might expect to find in an Anthologizr-enabled repository, to pick-and-choose as appropriate.
I proved this, by turning it into a passable EPUB and (via Calibre) MOBI on the train journey back from Edinburgh to London. The terrible crimes that the Web version commits against the most fundamental principles of HTML markup made this a more laborious task than it might have been, but it is conceptually trivial, and readily scriptable.