Behind the scenes in an ebook

We all know that an ebook is a book you read on screen instead of on paper. But what is it behind the scenes? What’s in that file you download and what do you need to include when you make one?

(Note to technophobes: What follows is quite technical and is only aimed at those who are interested. You don’t need to understand it if you’re using conversion software on KDP or Draft2Digital to create your ebook.)

Basically, an ebook is a set of files zipped up together into an archive. Instead of being given a .zip extension, the result is called .epub, .mobi or whatever flavour of ebook is intended. (Strictly speaking, epub is a true zip archive while the Kindle formats use their own container, but the idea is the same.) When you click a .zip file on your computer, the files are extracted and expanded to their original form. When you open an ebook, your e-reader does the unpacking and organises the contents of the various files into an appropriate format for you to read.
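If you're curious, you can see this for yourself with a few lines of Python. The sketch below is only an illustration: it builds a tiny epub-like archive in memory and then lists what is inside it. The file names (content.opf, toc.ncx, chapter1.html, style.css) are made up for the example, and a real epub has further requirements that are not shown here.

```python
import io
import zipfile


def build_demo_epub():
    """Build a tiny epub-like zip archive in memory (file names are illustrative only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # An epub starts with an uncompressed 'mimetype' entry.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("content.opf", "<package/>")
        archive.writestr("toc.ncx", "<ncx/>")
        archive.writestr("chapter1.html", "<html><body><p>Hello</p></body></html>")
        archive.writestr("style.css", "p { text-indent: 1em; }")
    return buffer.getvalue()


def list_ebook_files(data):
    """Return the names of the files packed inside a zip-based ebook."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


demo_files = list_ebook_files(build_demo_epub())
print(demo_files)
```

To try it on a real book, read the bytes of an .epub file from disk and pass them to list_ebook_files instead.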

At the time of writing, there are two major types of ebooks – epub on the one hand and various Amazon Kindle formats on the other. Amazon Kindles will read any of the Amazon-supported formats (.mobi, .prc, .azw and others) but not epub. Other readers mostly read epub but not the Kindle formats. The actual differences between the two formats are extremely small, so it’s hard to believe that this inability to read the opposition’s format isn’t just a commercial decision.

OK, so what are the files that get zipped up?

OPF (Open Packaging Format). Every ebook has one and only one .opf file.
It gives all the ‘meta’ information (author, ISBN, etc.), equivalent to the front matter in a paper book.
It lists all the other files in the package.
It defines the order in which the HTML files should be organised into the ebook.
The OPF must be written in XML format according to the OPF open standard.
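To make the three jobs of the OPF more concrete, here is a rough sketch that reads a small, invented OPF file and pulls out the meta information, the list of files and the reading order. The sample content (title, author, file names) is hypothetical, and real OPF files usually carry more detail than this.

```python
import xml.etree.ElementTree as ET

# A minimal, invented OPF file in the style of the OPF 2.0 standard.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>An Example Book</dc:title>
    <dc:creator>A. N. Author</dc:creator>
    <dc:identifier id="bookid">example-id-0001</dc:identifier>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="chapter1.html" media-type="application/xhtml+xml"/>
    <item id="ch2" href="chapter2.html" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_opf(opf_text):
    """Extract the meta information, the file list and the reading order."""
    root = ET.fromstring(opf_text)
    title = root.findtext("opf:metadata/dc:title", namespaces=NS)
    author = root.findtext("opf:metadata/dc:creator", namespaces=NS)
    # The manifest lists every file in the package, keyed by an id.
    manifest = {item.get("id"): item.get("href")
                for item in root.findall("opf:manifest/opf:item", NS)}
    # The spine refers back to manifest ids to give the reading order.
    reading_order = [manifest[ref.get("idref")]
                     for ref in root.findall("opf:spine/opf:itemref", NS)]
    return {
        "title": title,
        "author": author,
        "files": sorted(manifest.values()),
        "reading_order": reading_order,
    }


book = read_opf(SAMPLE_OPF)
print(book["title"], "by", book["author"])
print(book["reading_order"])
```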
NCX (Navigation Control file for XML applications). Every epub file should have one and only one .ncx file, but it is optional for the Kindle.
It provides information to allow the user to navigate around the ebook easily, which is particularly important in non-fiction books and anthologies.
The exact method of using this information varies from e-reader to e-reader.
The NCX must be written in XML format according to the OPF open standard.
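As an illustration of what that navigation information looks like, the sketch below reads a small, invented NCX file and lists each entry with the file it points to. The chapter titles and file names are hypothetical, and the sketch only looks at top-level entries, not nested ones.

```python
import xml.etree.ElementTree as ET

# A minimal, invented NCX file with two navigation entries.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <docTitle><text>An Example Book</text></docTitle>
  <navMap>
    <navPoint id="nav1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.html"/>
    </navPoint>
    <navPoint id="nav2" playOrder="2">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="chapter2.html"/>
    </navPoint>
  </navMap>
</ncx>
"""

NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}


def read_navigation(ncx_text):
    """Return (label, target file) pairs for the top-level navigation entries."""
    root = ET.fromstring(ncx_text)
    entries = []
    for point in root.findall("ncx:navMap/ncx:navPoint", NCX_NS):
        label = point.findtext("ncx:navLabel/ncx:text", namespaces=NCX_NS)
        target = point.find("ncx:content", NCX_NS).get("src")
        entries.append((label, target))
    return entries


nav_entries = read_navigation(SAMPLE_NCX)
for label, target in nav_entries:
    print(label, "->", target)
```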
HTML (HyperText Markup Language). The book will have one or more html (or htm or xhtml or xml) files that contain the body of the book.
HTML is the language used for web pages and, because it anticipates the reflowing of text depending on the space available, it is ideal for ebooks.
One problem is that the web has advanced far beyond ebook reader capability so that only a restricted set of HTML can be used, but the extent of this restriction will vary between e-readers.
HTML provides hyperlinks so that it is possible to let the user jump from one location in the book to another. If the e-reader is connected to the internet, it may also be possible to jump to a web page (for example, the seller of the ebooks that are suitable for the e-reader being used).
HTML is only supposed to provide text. The appearance of the text (size, font, colour, layout etc.) should be defined in separate CSS sheets – see below.
Images are listed as separate files that are called in by the HTML when the e-reader creates the book on screen.
CSS (Cascading Style Sheet). Optional.
The appearance of the text in the HTML files may be controlled by styles defined in one or more CSS files. Different HTML files can share the same CSS file.
Pictures etc. (.jpg, .gif, .png). Images are not included within the HTML files themselves. Instead, the HTML just lists the image file name at the appropriate point, and the e-reader inserts the picture. When HTML is being used with an internet browser, videos and sound clips can be included in the same way. This is not normally available on e-readers at the time of writing, but there is little doubt that it will be available in the future.
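The way HTML only names the image file can be seen in the sketch below, which scans a scrap of invented HTML and collects the image files it asks for. The HTML snippet and the file names in it are hypothetical, and this is just one simple way of finding the references.

```python
from html.parser import HTMLParser

# An invented scrap of chapter HTML that refers to two image files.
SAMPLE_HTML = """
<html><body>
  <h1>Chapter 1</h1>
  <p>Our journey began here.</p>
  <img src="images/map.png" alt="Map of the route"/>
  <p>And this is where it ended.</p>
  <img src="images/harbour.jpg" alt="The harbour">
</body></html>
"""


class ImageFinder(HTMLParser):
    """Collect the file names mentioned in <img> tags."""

    def __init__(self):
        super().__init__()
        self.image_files = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_files.append(src)


finder = ImageFinder()
finder.feed(SAMPLE_HTML)
print(finder.image_files)
```

Note that the HTML contains no picture data at all, only the names; the pictures themselves have to be in the package as separate files.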
HTML Table of Contents. Required for Kindle, optional for epub.
This is just another HTML file with links to various points in the book (typically each chapter).
Its purpose is the same as the NCX file but it will always appear as a ‘page’ of the e-reader rather than in any other way.
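Because the HTML Table of Contents is just a page of links, it is easy to generate. The sketch below builds one from a list of chapter titles and file names. The function name build_html_toc and the chapter data are invented for the example, and the markup is deliberately bare; a real book would probably add styling.

```python
from html import escape


def build_html_toc(chapters):
    """Build a simple HTML contents page from (title, file name) pairs."""
    lines = [
        "<html>",
        "<head><title>Table of Contents</title></head>",
        "<body>",
        "<h1>Table of Contents</h1>",
        "<ul>",
    ]
    for title, href in chapters:
        # Each entry is an ordinary hyperlink to a chapter file.
        lines.append(
            f'<li><a href="{escape(href, quote=True)}">{escape(title)}</a></li>'
        )
    lines += ["</ul>", "</body>", "</html>"]
    return "\n".join(lines)


toc_html = build_html_toc([
    ("Chapter 1", "chapter1.html"),
    ("Chapter 2", "chapter2.html"),
])
print(toc_html)
```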

That looks scary. Can’t I just use PDF? I’m used to that.
Although most e-readers can read PDF (Portable Document Format) files, it’s not the best choice of format for ebooks. PDF documents are designed to exactly mimic the printed page, so the text is often too small to read on e-reader screens. Although it’s possible to zoom in to enlarge it, a PDF is nowhere near as easy to use as a standard ebook.

I’m still scared.
Don’t worry too much about the technical stuff and all those initials. Everything is scary when it’s new, and it’s getting easier all the time. Take a look at Ebook creation – the absolute basics to create a Kindle version really easily.

Steve Kimpton