Building Book Production Platforms p3

The editor

This series is based on HTML as a source file format for book production platforms. I have looked at many HTML editors over the years and can remember when the first in-browser editors appeared…it was a shock. Prior to that, all HTML creation was done by writing directly in HTML code, then came fully-featured environments like Front Page and Dreamweaver which allowed you to create HTML in a desktop app, then came wiki mark-up to liberate us all from the tedium of writing HTML, and then finally…the browser-based WYSIWYG editor…

It’s worth noting that the Wiki markup and WYSIWYG solutions were a different category to the previous solutions in that they weren’t designed for creating web pages, rather they were designed to enable the production of wikis and content management systems.

What-You-See-Is-What-You-Get at that time was a refreshing and liberating idea, a newcomer to this scene (although WYSIWYG as a concept and approach to document creation predates the web, with the first true WYSIWYG editor being a word processing program called Bravo, invented by Charles Simonyi at the Xerox Palo Alto Research Center in the 1970s, the basis for MS Word and Excel). Many WYSIWYG strategies have been explored, and many weaknesses unearthed, including the very important critique that What-You-See-Isn’t-What-You-Get, because the HTML created by these editors is unreliable, but more on this later…

As far as I can tell, the first HTML-based WYSIWYG editor was Amaya World, first released in 1996. I don’t know what WYISWYG editor was the first to be embedded in a browser (if you know, please email me). However, I remember TinyMCE like it was a revelation. According to the Sourceforge page, they started building it around 2004 to solve the need to produce HTML in content management systems. It was, and is a great product. The strategy at the time was pretty much to emulate rendered HTML within an HTML text field. TinyMCE (and the others that followed until contenteditable came along) used a heap of JavaScript to turn a simple editable text field into a window onto the browser’s layout engine.

alt.typesetting

From this point, a number of plugins were developed for use with WYSIWYG editors like TinyMCE to extend the functionality.

Some of these plugins ventured into the ever-important area of typesetting. TinyMCE even tried at times to make up for the lack of browser functionality in this area – for example, there were some early and workable attempts to bring equation editing into TinyMCE. I can’t remember when it was, but surely around 2006/2007 that IMathAs had an experimental jab at this. I thought it was pure genius at the time as there was no other solution (I searched! a lot!). As I can remember they used a very clever round-tripping to achieve the result… essentially, since browsers didn’t then support Math, IMathAs supported inline equation writing using ASCIIMath syntax. When the user clicked out of the field, the editor sent the equation markup to the server, and the server returned the rendered equation as either a bitmap (PNG, JPG etc) or as vector graphics (SVG). It was genius and I built it into the workflow for FLOSS Manuals around 2010 because we wanted to write books with equations for software like CSound (produced in 2010/11). It worked great – the equations always looked a bit ‘bit-mappy’ but we could write and print books with equations using in-browser editors and HTML as source (the HTML produced included equations as images so we could render PDF direct from the HTML). Awesome.

It’s also worth noting that these days math typesetting has largely moved to the client side with the evolution of fantastic libraries like MathJax  and KateX. These are JavaScript typesetting libraries designed to be included in web pages and render math from markup on the client side. There are one or two tools that still use server-side rendering, notably Mathoid, and this is often used to reduce the burden on the client’s browser, however, they have possible additional bandwidth costs as the client and server must remain in communication with each other, otherwise nothing will be rendered or displayed.

Mature solutions for math and other typesetting issues are only just starting to come online – no surprise to historians who inform us that notations such as math and music were the last to come online for the printing press as well. The first book to contain music notation post-Gutenberg, the Mainz Psalter, was printed with moveable type, and the music notation was added manually by scribe. It seems the first thing to get right is the printing of text, all other notations come later in print systems. These solutions are slowly evolving – even music notation has its champions. However what is really surprising, is that Google, a company priding itself on being built from ground up by math-heads, seems to struggle to bring native math typesetting to their own browser . I would say that is embarrassing.

Contenteditable

Moving on from typesetting… The initial WYSIWYG editors proved an admirable solution for many content management systems. The name persisted but the background technology fundamentally changed in when the first implementations of the  W3C contenteditable specification for HTML5 was brought to the browser. Contenteditable is an attribute that you can add to a number of HTML container elements (like ‘P’ or ‘div’) that make their contents editable. So, in essence, you are directly editing the content in the browser rather than through some JavaScript text field trickery. This strategy might be called WYSI (What-you-see-IS). This strategy also spawned a whole new generation of editors leveraging this new native browser functionality. Aloha Editor was one of the first to grab the spotlight but there were many many others to follow. Additionally, the big legacy WYSIWYG editors such as TinyMCE and CKEditor added support for contenteditable although they were a little slow to the party.

Contenteditable at first promised a lot… native editing of the browser … phew … that certainly lowers the technology burden and opens the door to innovation and experimentation. Additionally, the idea that this is a read-write web suddenly comes more keenly into focus when you can just edit the web page right there and inherit all the same JavaScript and CSS that operates on the element you are editing. It’s good stuff.

Inevitably, though, some problems soon emerged, first some wobbly things about not being able to place a caret (your mouse pointer) between block elements (eg between two divs) was really a problem, but later a more serious issue was identified – contenteditable does not produce stable results across different browsers, such that if you edit one page in browser A, the resulting HTML could look different if you edited the same page in browser B. That might not affect many people – if you just want some text with bold and italics and simple things, then it doesn’t really matter… the HTML created will render results that will look pretty much the same across any browser. However there are use cases where this is a problem.

In the world I work in at the moment – scholarly publishing – we don’t want a manuscript that contains inconsistent HTML depending on the browser it was edited in … it hurts us down the road when we want to translate that HTML into different formats (eg JATS) or if we want to render that HTML directly to PDF and get consistent results.

So, unfortunately, editors like CKEditor (used by many book production platforms including Atlas), and TinyMCE (used by Press Books AKA WordPress), or Aloha (used by Booktype 2) have to use a lot of JS magic to produce consistent HTML to overcome the problems with contenteditable, and this doesn’t always succeed. I would recommend reading this article from the Guardian tech team about these issues. You also may wish to look at this video from the Wikimedia Foundation Visual Editor core devs for the comments on contenteditable (audio is lousy, jump to 1.14.00) (readable subtitles can be found here .

A better way

So…what can you do? The answer is kind of threefold.

First choice: decide not to care – an entirely legitimate approach. You can still do huge amounts with these editors, and if you need to tweak the HTML now and then, so what? I can clean up the HTML by hand for a 300-page book in an hour, not too tough really and it enables me to cash in on all the other enormous gains to be had from a single-source HTML environment.

Second choice: provide client-side and server-side cleanup tools. Most editors have these built in, but it’s also good to implement backend clean up tools to ‘consistify’ the HTML at save-time (or at least at pre-render time).

Third choice: find an editor that is designed to produce consistent HTML.

In my opinion, the third choice is the best long term option and the ‘right way’ to do things. Being able to produce reliable results with ease, and without having to do things twice, will make everyone’s life easier.

Thankfully there is a new editor on the scene that is designed to do just this – the Wikimedia Foundations Visual Editor. This editor was developed to help the Wikimedia Foundation solve an uptake problem … essentially there are not enough people these days prepared to sit around learning Wiki markup (which is pretty much a complicated scripting language these days). The resulting need to drop the threshold on the foundation’s contributions environment has resulted in the development of the Visual Editor (VE). New contributors can use an easy WYSIWYG-like environment instead of having to learn markup.

Obviously, the entire Wikimedia universe is already stored as wiki markup, so the editor needs to be able to translate between HTML and wiki markup on-the-fly (interestingly, it is actually part of much larger plan to store all Wikimedia Foundation content in HTML. To do this there is a back end called Parsoid that converts markup to HTML and vice versa. Also, the HTML produced by the editor obviously needs to be tightly controlled, otherwise the results are going to be a mess when converting back to wiki markup. VE does this by replicating the content in its own internal (JSON) model and displaying the results in a contenteditable region. When the content is edited, the edits are strictly controlled by the VE internal rules, and then rendered to display. The result… consistent HTML is produced across any edit session regardless of the browser used…

That’s pretty good news. This is one reason amongst many that the platform I am working on for the Public Library of Science has adopted VE software (we were the first to use it outside of the Wikimedia Foundation) and we are extending it considerably and contributing the results upstream to the VE repos. So far we have added table, equation, and citation plugins – all of which are in an early alpha state. If you want a peek, you can see some of the work here.

I highly recommend to anyone building a book platform, or any other kind of knowledge production platform, that you examine VE more closely. It is a sophisticated software and has been carefully thought through. It is still relatively immature, and development is happening at an incredible pace, which can make testing new plugins against an unstable API a little arduous … still, it is a great solution. VE also approaches content editing in a way that will open the door to concurrent editing via operational transformations in HTML, which is a hard problem and currently only solved by Google and Wikidocs (recently acquired by Atlassian.)

If you are in the process of choosing an editor, choose VE and contribute to the effort to make it not just the best Open Source solution to editing in the browser, but the best solution, full stop.

Building Book Production Platforms p2.

Amongst the core requirements for a book production platform are the source file format and the editor, and of course, these are intimately linked. The development team is usually faced with choosing the format first, then the editor.

Choosing a format

The choice is pretty much HTML? or not HTML?

Currently, HTML is the ruling choice of format for a web-based book production platform. HTML is native to the browser and has associated standards-compliant support, such as CSS and javascript. Inversely, not choosing HTML puts you in a bit of a hole and can create a lot of overhead.

It might be interesting to look back a little and learn from some others since there have already been projects in this space that started down non-HTML roads and then gave it up for HTML. Kathi Fletcher, originally the project manager and technical director for Connexions (now OpenStax) which built a custom XML editing environment for academic materials, later researched in-browser XML vs HTML editing environments for her Shuttleworth Foundation-funded OERPUB project. Kathi became convinced HTML was the way to go and did some great work on HTML editor usability with the Aloha HTML editor.

We have chosen to use HTML5 as the canonical format for open textbooks, because developers and tools are more plentiful for web technologies than XML technologies.

http://www.w3.org/2012/12/global-publisher/statements-of-interest/29-oerpub.html

The (closed source) O’Reilly Atlas platform also started with the complex AsciiDoc format (a form of markdown) and eventually awoke to the power of HTML in 2012.

HTML5-based authoring offers a streamlined production workflow for producing both print and digital outputs, facilitates “digital first” content development, and is a perfect fit for creating a WYSIWYG, web-based writing experience.

They then got an extra dose of religion and started a project called HTML Book which is a suggested ‘spec’ for a subset of HTML elements to be used in books.

So far I have not seen a book production platform travel the reverse direction, from HTML to something else. Instead, we are seeing more and more platforms start with, or change to, HTML as a source file format.

Markdown

Markdown is sometimes put forward as the way to go but I’m not going to go into that in too much detail here. I have talked about this elsewhere. The only additional thing I will say is that markdown causes even more issues for book production platforms than those included in that article. Namely, in an in-browser markdown environment, the markdown will most likely be displayed as rendered HTML next to the authoring pane. That is a huge amount of lost screen space and extra UI junk for no apparent gain. Think of the UX cost. If you don’t have that rendered display then you will most likely only see pure markdown in a text field with no rendered display. The user won’t really know if their document looks right until it is rendered somewhere down the line, which is also a tremendous cost to the user for no apparent gain. Markdown: all pain, no gain.

NB: There is a possible good use case for markdown as a helpful add-on for HTML WYSI editors but I will cover that later.

LaTeX

There is a more valid use case for LaTeX in the browser since some scientists and academics will never use anything else, and you’ll never convince them to adopt HTML regardless of the benefits. You are up against the great Church of Knuth and I don’t fancy your chances. If your audience is comprised of LaTeX addicts, then I think you have no choice other than to support that.

Many times I have talked about remedies for unstructured MS Word documents (for scientific manuscripts) only to have someone earnestly comment that if everyone just learned LaTeX we would be in a much better position… They might be right, but I’m pretty sure it’s never going to happen.

The preference for LaTeX is a legacy issue, and problematic, but needs to be dealt with. (Unfortunately, today’s Markdown heroes are growing legacy issues like this with each passing day, and that is going to cost us down the road).

Recently there has been some interesting work on in-browser LaTeX editing including the (closed source) Authorea platform and, most notably the (open source) ShareLatex platform. ShareLatex round trips the LaTeX syntax displayed and edited in a text area (in the browser), renders that to a bitmap on the server, and returns it to the browser for a side-by-side ‘WYSIWYG’. The effect is that you can see a just-in-time rendered view of the LaTeX as you type. It’s a neat trick and effective if you insist on LaTeX in a web-based platform. Then you just have to live with the UI costs. However ,you only need this approach if you wish to support the full LaTeX syntax. If you wish to just support LaTeX equations, you can use an HTML editor with a LaTeX plugin based on MathJax or the Khan Academies KaTeX(and there are some other solutions such as Mathoid).

Incidentally, if you need to support full LaTeX I highly recommend checking out ShareLaTeX over WriteLaTeX. They both have the same approach but WriteLaTeX is proprietary whereas you can pick up the ShareLaTeX code and integrate it straight away. You could even build your own ShareLaTeX-like interface, it’s not too tricky – together with a colleague – Rizwan Reza – and I (Riz did all the hard work) we managed to develop a workable prototype in about 2 days, but there are many gotchas setting up the LaTeX compiler correctly.

Not many book projects need LaTeX, so I will leave this as an interesting edge case. There are solutions if you need it, but not many people need it.

XML

I think I will just leave it to the words of the brilliant Dave Cramer (Hachette Book Group):

So we’ve chosen to describe our content with HTML, and build our production system around HTML.

When I tell people that, they smile condescendingly, and chuckle a bit. “That’s cute. Why don’t you use real XML?”

I then ask them what you can do in Docbook (or TEI, or NLM) that you can’t do in XHTML? I haven’t heard a good answer to that question yet. XHTML is XML, by definition. Calling something “para” rather than “p” doesn’t get you anything, except carpal tunnel syndrome and invoices from consultants

The problem with non-HTML XML is that it is essentially just XML the browser can’t use. Hence you lose all that other good stuff like WYSI editors, CSS design tools, cool tricks with JavaScript, and all the cool tools that are being developed for HTML. XML just can’t compete, plus you are going to need to convert the XML into HTML anyway. So don’t make life more complicated than it already is – continue your love affair with XML as long as it’s XHTML!

HTML

HTML is king in the browser and it gives you all you need to make books. I don’t want to spend a lot of time arguing the merits of HTML in this post as there is a lot to say and I want to bring that in at other points of the conversation. But in brief:

• HTML is supported by JS and CSS.
• The DOM is known natively by the browser.
• HTML is standards-based.
• It is straightforward.
• HTML is easy to read and easy to clean.
• HTML is the most popular file format on the planet.
• You can use HTML to build structure in documents with assigned class and id values, or microdata formats.
• HTML is the native file format for EPUB.
• PDF can be rendered directly from HTML in the browser (more on this later).
• HTML can be paginated in the browser.
• CSS is moving towards supporting more and more page based elements.
• The browser can act as a design environment.
• You can create real what-you-see-is (WYSI) production environments.
• Basic editing is built into the format itself.
• HTML is supported by an enormous number of tools for conversion (in and out).
• HTML is supported by an enormous repository of examples (the web).
• HTML is cheap to develop with.
• Even book designers are getting used to it.
• Some schools teach it.
• It has a million free tutorials online to help you use it.
• A lot of people know HTML.
• HTML is supported by a rapidly proliferating body of JavaScripts for typography, graph production, animation, interactions, dynamic rendering etc etc etc etc

The basic idea really comes down to this.

• HTML is the cheapest format of our time.
• HTML is the most popular format of our time.
• HTML is the networked document format of our time.

Increasingly HTML is the way stories are told, whether that is in books or on the web. It’s a trite analogy perhaps, but HTML is the paper of our time. As Dave Cramer says:

why start with something other than HTML, when you have to turn it into HTML anyway?

It should be noted that Cramer also turns HTML into paper, and the Hachette Book Group have produced many beautiful paper books using HTML as the source format. Many of these books you will now find in the best-selling sections of your local brick and mortar bookstore.

Other print producers are also using HTML as the source. Print-on-demand services, used to producing very ugly books by ingesting MS Word and dealing with all that ugly conversion, are also adopting HTML production environments. Books on Demand, Germany’s largest Print on Demand service, adopted Booktype so their customers could have an easy in-browser book production environment. The source format is HTML but the users don’t know that, and the books look better. That’s the beauty of HTML.

Finally, helped a lot by the efforts of Dave Cramer and the Hachette Book Group, Sourcefabric, the people at O’Reilly, and others adopting HTML, we might be starting to see the very beginning of the changing of the guard.

WYSIWYG vs WYSI

HTML is the new paper and the new path to paper online editing environments are becoming much more important for publishing. Dominant until now has been the WYSIWYG editor we all know and…err…love? However, the current WYSIWYG paradigm has been inadequate for a long time and we need to update and replace it. Producing text with a WYSIWYG editor feels like trying to write a letter while it’s still in the envelope. Let’s face it…these kinds of online text editors are not an extension of yourself, they are a cumbersome hindrance to getting a job done.

Apart from huge user experience issues the WYSIWYG editor has some big technical issues. Starting with the fact that the WYSIWYG editor is not ‘part of the page’ it is instead its own internally nested world. In essence, it is an emulator that, through Javascript, ‘reproduces’ HTML. As a walled/emulated garden, it is hard to operate on the objects in the garden using standard Javascript libraries and CSS. All interactions must be mediated by the editor. The ‘walled garden’ has little to do with the rest of the page – it offers a window through which you can edit text, but it does not offer you the ability to act on other objects on the page or have other objects act on it.

Thankfully a new era of editors is here and maturing fast. Still in search of a clearly embraced category name they are sometimes called ‘inline editors’ or HTML5 editors. This new generation takes a large step forward because they enable the user to act on the elements in the page directly through the HTML5 ‘contenteditable’ attribute. That allows ‘the page’ to be the editing environment which in turn opens up the possibility for the content to be represented in a variety of forms/views. By changing the CSS of the page, for example, we can have the same editable content shown as multi-column (useful for newspaper layout), as a ‘Google docs type’ clean editing interface, in a semantic view for highlighting paragraphs and other structural elements (important for academics), as well as other possibilities….

Additionally, it is possible to apply other javascript libraries to the page including annotation software such as AnnotateIt or typographical libraries such as kern.js. This opens up an enormous amount of possibilities for any use case to be extended by custom or existing third-party JavaScript libraries.

It is also possible to consider creating CSS snippets and apply them dynamically using the editor. This in effect turns the editor into a design interface which will open the path for in-browser design of various media including, importantly, ebooks and paper books.

There are various attempts at the HTML5 editor, which might also be called a ‘WYSI’ (‘What you see is‘) editor. The most successful are Mercury, Aloha and the recent fork of Aloha called ‘WYSIWHAT‘.

Each of these is treading their own path but things are really opening up. As an example, and with reference to the last post I made about math in browsers, the WYSIWHAT group is making some giant strides in equation editing. Their equation plugin which was first built by Mihai Billy Balaceanu at the September WYSIWHAT hack meet in Berlin has since been improved and extended by the Connexions team and the good people at OERPUB (including the talented trio of Phil Schatz, Kathi Fletcher and Marvin Reimer). The plugin was made by including MathJax in the page and allowing the editor to interact with that. This was not easily possible with previous WYSIWYG editors.

The progress on the equation front is looking very good but what this shows more than anything is that by using WYSI editors the entire page is available for interaction by the user or JavaScript. Anything you can think of that JavaScript can do you can bring to the editing environment, and that is quite a lot…

Published on O’Reilly, 3 December 2012 http://toc.oreilly.com/2012/12/wysiwyg-vs-wysi.html

What’s wrong with WYSIWYG

The current WYSIWYG paradigm has been inadequate for a long time and we need to update and replace it. Using a WYSIWYG editor is pokey and horrible. Producing text this way feels like trying to write a letter while it’s still in the envelope. These kinds of editors are not an extension of yourself, they are cumbersome hindrances to getting a job done.

Apart from huge user experience issues, the WYSIWYG editor has some big technical issues, starting with the fact that the WYSIWYG editor is not ‘part of the page,’ it is instead its own internally nested world. In essence, it is an emulator that, through JavaScript, reproduces the rendering of HTML. However, as a walled emulated garden it is hard to operate on the objects in the garden using standard JavaScript libraries and CSS. All interactions must be mediated by the editor. The ‘walled garden’ has little to do with the rest of the page. It offers a window through which you can edit text, but it does not offer you the ability to act on other objects on the page or have other objects act on it.

The new generation of HTML5 editors have taken a large step forward, not because they integrate HTML5, as such, but because they act on the elements in the page directly. That allows ‘the page’ to be the editing environment which in turn opens up the possibility for the content to be represented in a variety of forms/views. By changing the CSS of the page, we can have the same content represented as a multi-column editing environment (useful for newspaper layout), as a ‘google docs type’ clean editing interface (see demo below), a semantic layout for highlighting paragraphs and other structural elements (important for academics) as well as other possibilities….

Additionally, it is possible to apply other javascript libraries to the page, including annotation software such as Annotate It or typographical libraries such as lettering.js. This opens up an enormous amount of possibilities for any use case to be extended by custom or existing third party JavaScript libraries.

It is also possible to consider creating CSS snippets and applying them dynamically using the editor. This is in effect turns the editor into a design interface, which will open the path for in-browser design of various media.

Lastly, WYSIWYG editors, while marvellous in their day, have had their day. They feel too pokey and ‘old school’. Largely due to the success of Google Docs users no longer want to poke around in a tiny WYSIWYG editor. They want large clean interfaces for content production.

The above screen shot is taken from the semi-functional demo at http://data.flossmanuals.net/mercury/index.html (all demos work only in Chrome for now).

In brief summary, essential problems of the current WYSIWYG world are:

1. it is not easily possible to enable JavaScript libraries to act upon the objects in the editor
2. representing the content in context is difficult
3. the content is not part of a page so additional functionality like (non- intrusive) annotations cannot be added to content
4. dynamic rendering of content retrieved from the server is hard to achieve
5. dynamic creation of content is hard to achieve
6. inclusion of nested JavaScripts is hard to achieve
7. they look ‘pokey’ and old school
8. synchronous editing is possible but hard to achieve
9. users want to see the content they are making in a much cleaner and clearer ‘fullpage’ way
10. some users want semantic views
11. some users want design views
12. all users must be able to edit the content (designers, editors, content creators, etc) and do what they need from the same view

Online book production has special needs such as the ability to display the content as it might appear in the output, annotations, draft view etc.

The Ideal Editor

A short list of starting list of features for a new editor might look something like this:

1. harmonise edit, draft, proof, design features into one view
2. live chat
3. short messages which also support ostatus API (which is built on Twitter spec)
4. send to renderers from within the editor
5. ability to switch on and off third party JS libs
6. apply CSS templates to content
7. create CSS snippets
9. share snippets and templates
10. dynamic snippet rendering
11. live template swapping including semantic layout and ‘output’ view
12. synchronous editing
13. per ‘chunk’ notes

Many of these features are relatively easy to achieve, others will take some careful thought and planning.

The Dream features

1. backend hook up to git
2. versioning of content especially for different outputs (A4 vs A5 pages, html, epub etc)

Where to start

We need to develop with an editor in the hand. The current batch of html editors does pretty well but we need to choose one which has a good support community and a good feature set. Currently, there is just one option – Mercury editor. It currently has a new home page (less than 2 weeks old), and a thriving community.

Some demos if what could be done with Mercury in a book production environment:

http://data.flossmanuals.net/mercury/index_semantics.html

http://data.flossmanuals.net/mercury/index.html

http://data.flossmanuals.net/mercury/index_hyphen.html

http://data.flossmanuals.net/mercury/index_prettify.html

Try typing $f(x)$ into the window to see how it works