Why Vivliostyle is Important

Vivliostyle advertises itself as a “publishing workflow tool.” It is a great tool, but that isn’t a terribly accurate or descriptive tagline, since publishers can’t really use it without changing everything else they do. Hence my preference for calling it an HTML pagination engine (although I used to refer to this process as Browser Typesetting). It can be part of a publisher’s workflow, but only after they have transformed everything else they do into an HTML-first workflow.

On this point, I think Vivliostyle have their sales pitch wrong, but it is hard to know how they might do it better, since Vivliostyle imagines, as do many other tools, a radically different way of doing things from how publishers operate now. Vivliostyle is one necessary part of an approach that is, in effect, a paradigm shift in how publishers work. This ‘new way’ of working is exciting and transformative, but it is also hard to capture paradigmatic changes in a few catchphrases. So, to understand the value of Vivliostyle, you first have to understand how publishers work now, why it is a terrible way to work, and what HTML-first workflows can offer. Once you understand that, you can see how exciting Vivliostyle could be for publishing.

So… what does it do? Vivliostyle is the latest in a long list of tools that, among other things (more on these in later posts), can transform HTML to PDF. These tools have typically been used to produce PDF for printing – book-formatted PDF. There are a few of these tools out there, but most are proprietary and only a handful have been Open Source (notably BookJS, which I was involved in a long time ago, and CaSSius). Vivliostyle is the latest in this family tree, the root category of which I would describe as the HTML Pagination Engine.

What it does is this – it takes an HTML file and paginates it. It flows the HTML through the ‘boxes’ (pages) you have defined and lays the content out nicely in each box, one box after the other, with the right margins (and more, see below). So instead of scrolling through the web page, you page through the document on screen. Vivliostyle converts the HTML into ‘pages’. It is an HTML pagination engine.

It is then possible to do a lot more with these pages, including adding page numbers, using CSS (the style rules used by browsers) to define the look and feel of text, images, etc., and adding headers, footers, and notes. In other words, you can add to each page everything you need to make the result ‘look like a book’.
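To give a flavour of how this is expressed, here is a minimal sketch using the CSS Paged Media and Generated Content for Paged Media specs. Exact property support varies between engines, so treat the details (trim size, margin values) as illustrative:

```css
/* Define the page 'box' the HTML flows through. */
@page {
  size: 140mm 216mm;   /* the trim size of the printed book */
  margin: 20mm 15mm;   /* the margins of the text block */

  @bottom-center {
    content: counter(page);   /* automatic page numbers */
  }
  @top-center {
    content: string(chapter); /* a running header, captured below */
  }
}

/* Capture each chapter title for the running header,
   and start every chapter on a fresh page. */
h1 {
  string-set: chapter content();
  break-before: page;
}
```

The pagination engine does the rest: it pours the HTML through page after page until the content runs out.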

You can do this transformation in the browser because Vivliostyle is written in JavaScript. If you want to see this in action, check out their demos page. For example, look at this page showing the raw HTML of a book by Lea Verou (published by O’Reilly), then open this page (give it a minute to render). Now right-click and print (best in Chrome or Chromium browsers – you may need to turn margins off and background images on when printing). The result should give you a one-to-one correlation of the paginated content in the browser (which is HTML) to the paginated print-ready PDF.

Since the browser can also print to PDF, you can take this newly styled ‘book-looking’ HTML and print it to PDF from the browser. From this, you have a book-formatted PDF that is ready to send to the printer to be printed and bound. I’ve worked this way for many years and printed many books this way for organisations from the World Bank to Cisco. The system works and the printed books look great. Vivliostyle is, at this point, the most sophisticated open source tool for doing this.

It is pretty amazing stuff. From HTML to PDF to printed book at the press of a button. Magic. This process is also catching on. Hachette produce their trade paperbacks using an HTML-to-PDF rendering engine with styling via CSS. So it is no longer a process reserved for small experimental players.

So, why is this interesting? Well, it is interesting almost without an explanation! Transforming HTML in this manner seems pretty magical and it’s kinda neat just to look at the demos and marvel at it. However, the really interesting part comes into play when we start talking about workflows. This is where Vivliostyle can be part of transforming publishing workflows.

As a publisher, if you were to take Vivliostyle ‘out of the box’ it would not be of much use to you. How many web pages do you have that you want to turn into PDF? Not many, if any. How much of your book or journal content in your current workflow is stored in HTML? Probably none. In all likelihood, HTML in your business is restricted to your website and, perhaps, EPUB if you produce them (an EPUB is just a zip archive containing HTML files and some other stuff). But in most publishers’ workflows the EPUB is an end-of-line format. Publishers take the completed copy in MS Word, or (sometimes, regrettably) PDF, and send it to a vendor (typically in India) to transform into EPUB and send back. So chances are, HTML is not the format being used as the basis for your book or journal workflows (apart from possibly being an end-of-line format).

As a publisher, you don’t have manuscript copy in HTML, so Vivliostyle is not going to fit snugly into your workflow. In order to utilise it, you need to transform the way that you work. You have to start working in HTML or, a more difficult path, work in some format that can be transformed into a very tightly controlled HTML output so that Vivliostyle can work with it.

I don’t like the latter style of workflow. This is where you work in something like XML (of some sort) and then transform to HTML at the end of the process. It’s an ugly workflow, not friendly for non-techie users, and typically full of redundancy. If you want good HTML, then just work with, and in, HTML. This comes with additional benefits, since from good HTML you can get to any format you want PLUS you have the advantage of now being able to move your workflow into the browser. And this is where Vivliostyle fits into a toolset, an approach, that could transform how you work – the HTML-first production environment.

The current ‘state of the nation’ in publishing is pretty terrible. Most publishers use MS Word docx as their document format, and Track Changes and email are their primary workflow tools. This means that there is a single document of record – the collection of Word files. These are shareable, in the sense that you can email people copies, but multiple people cannot work on them at the same time. In effect, the MS Word files are like digital paper in the worst possible way. There is only one ‘up to date’ version and only one person can work on that version while they hold onto the files. There is no easy way to follow a document’s history, revert to specific versions, or identify who made what change when. Further, there is no inherent backup strategy built into standalone MS Word files. Everything must be done manually. That means organizing the files in directory structures with naming conventions known only by those in the know (since there is no standard way of doing this). There are also problems with email as a collaboration tool. Did it send? Did they get it? Did they get the right version? Plus there is no way of understanding the status of the documents unless you ask via email or there is some other system that is manually updated for status tracking. The system is not transparent. Further, changing workflows when using systems like this, even for small optimisations, is quite difficult, and the larger the team the more difficult it gets.

Additionally, using Word and email like this is really placing unnecessary gateway mechanisms on the content. If I have the up-to-date versions then you can’t have them or work on them. There is really only one copy and only one person can work on it. That makes for linear workflows and strongly delineated roles. No one can ‘jump in and help’ and it is very difficult to alter the linearity of the process or redistribute the labor to achieve efficiencies.

If publishing is to move on, then workflows need to migrate to the browser. With browser-based workflows, there is no need to have multiple copies of the same file; versioning is taken care of, as is document history; it is easy to add and remove people from the process; and labor can be better distributed over both roles and time to create more elegant, efficient workflows. I wrote about this in an earlier post and will write more in posts to come, since there is a lot more to it. But suffice to say that moving publishing workflows to the browser is a little like ‘sucking all the gaps’ out of the current Word-email workflows (plus a whole lot of other benefits). No more checking your Inbox while you wait for status updates or for someone to send you the files so you, and you alone, can work on the next little part of the process while everyone else waits. Additionally, there can be full transparency as to what needs to be, has been, and is being done (and by whom). There is the opportunity to break down larger tasks into smaller tasks and have them all in play concurrently. There is the opportunity to share the same tools and hence enhance communication and redistribute the work to where (and to whom) it makes most sense. There is so much to be gained.

This is not to say that browser-based workflows are ‘anything goes’ workflows (which is what most publishers think this way of working amounts to). You can still assert rules about who has access and when. But… in my experience, when you migrate workflows to the browser, publishers start rethinking how they work and you often hear comments like “but we don’t need to do it like that anymore”. They then start designing radically better workflows themselves.

So, the point of all this is that Vivliostyle by itself does not achieve this. It is not, in itself, a workflow tool for publishers. You first need all the other things that enable an HTML-first workflow to be in place; once they are, you can utilize Vivliostyle to transform the HTML (at the push of a button) into the PDF you need for printing. That is the radical improvement Vivliostyle can offer. Cut out the file conversion vendors and render the content according to templated style sheets (automated typesetting can produce beautiful results). This means you can check what the book will look like at any moment, plus the CSS stylesheets you use can also be included in your EPUBs (also rendered at the push of a button, since the original content is already in HTML, the content filetype for EPUB) so your printed book and the EPUB look the same.

So, Vivliostyle is a necessary tool for HTML workflows and with an HTML workflow you will radically improve what you do.

This is why Vivliostyle is important to publishers, but you cannot consider it in isolation. You must consider it with regard to migrating to an HTML-first workflow. If you migrate to this kind of workflow, then not only will you experience the efficiencies described above, but your organisational culture will be transformed and the types of content you can then produce will become a lot more open ended. This is the vision that Vivliostyle, and other tools that enable HTML-first workflows (including those developed by UCP and Coko), are imagining and building towards.

Dear reader, out of principle, I do not use proprietary social media platforms and networks. So, if you like this content, please use your channels to promote it – email it to a friend (for example). Many thanks! Adam

Federated Publishing Revisited

Back in 2011, I wrote about Federated Publishing. It is probably time to revisit this topic, given some recent-ish developments in technology, most notably Dat.

When I first encountered Dat, there was some mumble in the air about ‘git for data’. A nice elevator pitch. I’m not sure where that meme came from. I know some of the Dat people (Max Ogden and Karissa McKelvey specifically) and I don’t think I remember them using this phrase. It is often this way with tech – an idea gets out there, no one knows what it means, and then, before you know it, it’s everywhere. I have never been at the center of this kind of viral excitement but I have seen it many times, and, as always, the meme reflects very little truth about what is actually going on with the tech or how it might be useful.

In the case of Dat, it took me some time to work out what it was. At first I understood it as a breed of peer-to-peer technology specifically for the distribution of datasets. Indeed, that is what they say on their website.


Sure… so it’s this sciencey thing that is intended for use by researchers for sharing data. It sounds like many of the things we have also been discussing at Coko, to do with the early sharing of research data. So Kristen Ratan and I approached Dat and started up conversations which are leading us towards some interesting collaborations – not yet around implementing Dat in Coko projects but around developing open source-open science communities (more on this later).

However, it wasn’t until I spent some time in the Maasai Mara that I understood what Dat was all about.

Richard in the Maasai Mara, Kenya

I traveled to Kenya to spend some time with Richard Smith-Unna who had just joined the Collaborative Knowledge Foundation. We spent some time together just outside of Nairobi and then traveled down to the Maasai Mara for a few days camping.

Our camp (that’s my tent in the background!)

It was a pretty basic camp in beautiful surroundings. It’s a wonderful thing to sleep deeply at night while hippos and buffalo walk around your campsite and, on one night, around your tent. During the days we went exploring and talked tech while watching cheetahs, lions, or termite mounds.

Richard had been the primary developer behind Science Fair, which uses Dat libraries. Science Fair is supported, to a small degree, by eLife. eLife received 25 million pounds earlier this year to ‘do their thing,’ some of which includes spending small amounts of money on technology innovations built by people like Substance.io or Richard.

So Richard, mostly, put together Science Fair. In essence, it is a desktop browser with a very specific content focus – research articles – and a very specific distribution strategy – which is where Dat comes in.


Science Fair enables a researcher to search for research articles and read them within a desktop ‘browser’. The browser is actually built with a technology called Electron. Electron enables developers to build cross-platform desktop applications in JavaScript (JS). Welcome to the new world, folks – the promise of write-once-run-anywhere made by Java has finally been realised by JavaScript. Which is one of the reasons the Coko PubSweet framework is all JS. The future is JS, as Jeff Atwood has said in his ‘All programming is Web Programming‘ post:

any application that can be written in JavaScript, will eventually be written in JavaScript.

Science Fair, an entirely JS application running on your desktop, leverages the Dat JS libs… but to do what?

Dat enables content to be stored – you throw stuff into it and get it back later. However, Dat isn’t just a content store, otherwise it wouldn’t be very interesting, since that problem is well solved. Dat is a peer-to-peer store.

In the case of Science Fair, when you download something to read (which is stored with Dat) you become a peer in that content’s network. You become a server for that piece of content. When someone else requests that same article then you may be the one serving the content to them (if you are the closest peer to them).

In other words, Dat is a kind of open source Content Distribution Network (CDN) technology – one with a few interesting extra features to leverage, i.e. a peer-to-peer design.

You don’t have to use the peer-to-peer functionality. You can just use Dat as a single content store – without replicating the content to other nodes. That is quite useful in itself but there are many other technologies that can do this – a normal file system on a server somewhere, for example. You could also use Dat purely as a CDN – a network of content stores which replicate and deliver your content closer to where your users are. Once again there are open source technologies that can do this like jsDelivr. However, what Dat can also do, is turn your CDN into a peer-to-peer network where users become the content servers. When a user fetches some content, they then become another node in that content’s delivery network.
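To make the mechanism concrete, here is a toy model in plain JavaScript. To be clear, this is not the Dat API – the names (`ToySwarm`, `seed`, `fetch`) are invented for illustration – it only sketches the principle that every peer who fetches a piece of content joins the set of peers serving it:

```javascript
// Toy model of peer-to-peer content delivery: fetching content
// makes you a server for that content.
class ToySwarm {
  constructor() {
    // content id -> set of peer ids currently serving that content
    this.seeders = new Map();
  }

  // A publisher (or any peer) announces that it holds the content.
  seed(contentId, peerId) {
    if (!this.seeders.has(contentId)) this.seeders.set(contentId, new Set());
    this.seeders.get(contentId).add(peerId);
  }

  // A peer fetches content from an existing seeder, then
  // automatically joins the swarm for that content.
  fetch(contentId, peerId) {
    const peers = this.seeders.get(contentId);
    if (!peers || peers.size === 0) throw new Error('no peers for ' + contentId);
    const source = [...peers][0]; // a real system picks the closest/fastest peer
    this.seed(contentId, peerId); // the reader is now also a server
    return { from: source };
  }

  peerCount(contentId) {
    return (this.seeders.get(contentId) || new Set()).size;
  }
}

const swarm = new ToySwarm();
swarm.seed('article-A', 'publisher');
swarm.fetch('article-A', 'alice'); // alice downloads from the publisher
swarm.fetch('article-A', 'bob');   // bob may now be served by alice instead
console.log(swarm.peerCount('article-A')); // 3 peers now serve the article
```

The more an article is read, the more nodes serve it – the inverse of the usual economics, where popular content costs the host more to deliver.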

That is pretty interesting. It means Science Fair, while looking like a search-and-read interface for content, is also a peer-to-peer content delivery node for that same content.

The question is – is that interesting or useful? Well… it is a fantastic example of federated content and, possibly in time, federated publishing. As researchers and/or publishers seed content into this network, the boundaries and roles of Journals may start to become a little fuzzy.

For example, Open Access (OA) is interesting because it is a movement for making research materials available for free – free as in no cost, and free through the application of liberal Creative Commons licenses. However, OA still follows many of the norms of the publishing world, in that there are (capital P) Publishers which curate and control the access, display, and ‘functionality’ (although article functionality is a rather impoverished idea in this sector) for content. If an OA Publisher classifies article A as belonging to category B due to their internal taxonomy, then that is where article A will go. If a Publisher enables annotation for ‘their’ content, then you have annotation. If a Publisher enables threaded comments for discussion around the article, then you have one place where you can discuss the findings. But… while Science Fair might sound like this – a place to find content (just like a Publisher) – it is not. Science Fair distributes the content into a Dat network, and how that content is surfaced, tagged, commented on, etc. is entirely up to the type of interface you use to access that content. If you wanted to share user-specific tagging taxonomies, for example, you could build that into Science Fair or a Science Fair-like interface. No need to wait for the Publisher.

The researchers, then, could have complete control over how content is curated, displayed, discussed, etc., since in some sense the users start to become the publishers.

That is a pretty big step sideways.

I’m aware that distribution is not the only thing Publishers do. But it is why they exist in their current form. If Publishers were not the branded content portals they are, then it is unlikely they would exist in the form we know them now; rather, they would be service providers that do all, or part, of the other services they currently provide, like quality control, technical checks, conflict of interest checking, validation and normalization, review management, format conversion, etc. The point is that at the core of these services currently is the Publisher – the brand holding this all together, so to speak. But what happens when one of their primary offerings – sharing/distribution of content – starts to be diminished by other channels? What if researchers decided this is not how they want to access content? What becomes of the Publishing model when faced with an erosion of one of its primary offerings?

Federated publishing breaks down all the ways that we think of publishing, as a way to access content, today. It fundamentally remaps ideas of centralised publishing and opens up many, many interesting de-centralised possibilities and questions. It is a fundamental shift of power from the center to the periphery.

I find this interesting because at the time I wrote the piece on federated publishing, I mentioned that earlier there had been quite a bit of chatter about federation. Diaspora, status.net (now pump.io), and Thimbl were three projects that looked to the centralised power dynamics of social networks and saw federation as a way out. Ward Cunningham also evolved the Federated Wiki around the same time. Everyone felt, for one reason or another, that distributed power worked best. None of these projects were successful in terms of adoption, nor were my attempts to start federated publishing using Booki/Booktype. However, that is possibly exactly what makes Dat and Science Fair interesting.

I have watched many great ideas developed into software over the years and witnessed the death of those same projects. This kind of cycle reflects the well-known Silicon Valley mantra:

to be right too early is the same as being wrong

But often the second, or third, time around, these things get the timing right and something shifts. The idea proliferates and adoption occurs. We may be seeing this at the moment with a new generation of peer-to-peer approaches, many implemented in JavaScript — such as IPFS and Dat.

While Dat, IPFS, Science Fair, etc. might actually herald an interesting new era of federation, one thing is for sure: the change will not occur overnight. It requires persistence, strategy, and working closely with researchers to encourage them to use the tools and to shape them so researchers find them useful. A slow displacement of existing tools and their inherent politic is the better strategy for radical change. Radical change at a slow, persistent pace is far more likely to succeed than a gangbusters approach that will soon lose energy if change isn’t instantly catalyzed. Persistence is the key.

While Dat is not restricted to the sharing of datasets, as their website implies, it is interesting to see how this idea has been realised in part by Science Fair as an interface for browsing and reading articles. The question for me is not ‘is this a good idea’ (it is) but rather, could the timing and execution be right this time? If it is, then could applications like Science Fair evolve more utility than publishers can provide? Could this, in turn, lead to these applications being widely used by researchers? Could this, in time, lead to a huge ‘unbound’ peer-to-peer content store of research data? Do the Science Fair and Dat teams have the patience to strategise and set their collective minds on a persistent, slow change that will enable the radical reshaping of the power dynamics they are addressing? And if so, what happens then?

Coda: Dat is capable of more than what I have described above. It has other very interesting features, such as the ability to cryptographically sign content and the ability to update content by transferring only the difference between versions (as opposed to the entire file). The above post is not an audit of Dat and its total utility, but rather a sense-making piece reflecting on some features of Dat and what they could mean in this use case as exposed by Science Fair.

Learning Node

It’s that time again… time to learn something new. On my radar right now is Node – the environment for executing JavaScript on the server. It’s a long time since I gave up learning to program. I built several systems in Perl and PHP and tinkered in Python. I also learned some JavaScript a long time ago… as I’m writing I am also remembering I did a few projects in Visual Basic and RealBasic (Mac)… eeeek… those were… err… the days… the days, that is, when I couldn’t afford to pay someone to do it properly. I would describe myself as a cut’n’paster, not a programmer.

However, Node really moves the game on. It has been around for a while now but everything interesting I see online these days is more than likely to be using Node in some capacity. So, after having recently finished a reasonably schedule-heavy job, I have time to work on my own projects and start tinkering again. I’m really enjoying it. I don’t want to be a programmer, but I find that gaining a working knowledge of interesting new technologies really helps me imagine what is possible and talk pseudo-intelligently with programmers about ideas.

Node looks to be one very interesting option for whatever I do next. I’m using Talentbuddy to get a start. They have some good free starter tutorials. I’ll try them out and see where I get. Having said that, I have, just this minute, discovered learnjs, so it looks like there’s no shortage of learning opportunities. Awesome.

InDesign vs. CSS

The explosion in web typesetting has gone largely unnoticed by everyone except the typography geeks. One of the first posts that raised my awareness of this phenomenon was From Print to Web: Creating Print-Quality Typography in the Browser by Joshua Gross. It is a great article which is almost a year old yet still needs to be read by those who haven’t come across it. Apart from pointing to some very good JavaScript typesetting libraries, Joshua does a quick comparison of InDesign features vs. what is available in CSS and JS libs (at the time of writing).

It’s a very quick rundown and shows just how close things are getting. In addition, Joshua points to strategies for working with baselines using grids formulated by JavaScript and CSS. Joshua focuses on Hugrid, but also worth mentioning are the JMetronome library and BaselineCSS. These approaches are getting increasingly sophisticated and, of course, they are all Open Source.
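The core trick these libraries automate can be sketched in a few lines of plain CSS: pick a baseline unit and keep every line-height and vertical margin a multiple of it. (This is a hand-rolled sketch of the idea, not code from any of the libraries mentioned.)

```css
/* A 24px baseline grid, maintained by hand. */
html {
  font-size: 16px;
  line-height: 24px;   /* the baseline unit */
}

p, ul, ol, blockquote {
  margin: 0 0 24px;    /* one baseline of space after each block */
}

h2 {
  font-size: 28px;
  line-height: 48px;   /* two baselines, so the grid survives */
  margin: 24px 0;
}
```

The JS libraries earn their keep by doing this arithmetic for you and by correcting the drift that images and other odd-height elements introduce.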

It brings to our attention that rendering engines utilising HTML as a base file format are ready to cash in on some pretty interesting developments. However, it also highlights how rendering engines that support only CSS (such as the proprietary PrinceXML) are going to lose out in the medium and long term since they lack JavaScript support. JavaScript, as I mentioned before, is the lingua franca of typesetting. JS libs enable us to augment, improve, and innovate on top of what is available directly through the browser. Any engine that ignores JavaScript and/or is proprietary loses out doubly, since it essentially exists on a technical island and cannot take advantage of the huge innovations happening in this field which are available to everyone else.

Once again, this points the way for the browser as typesetting engine; HTML as the base file format for web, print, and ebook production; and CSS and JavaScript as the dynamic-duo lingua franca of typesetting. All that means Open Source and Open Standards are gravitating further and faster towards the core of the print and publishing industries. If you didn’t think Open Source was a serious proposition, it might be a good idea to call time-out, get some JS and CSS wizards in, and have a heart-to-heart talk about the direction the industry is heading in.

Correction: The latest release of PrinceXML has limited beta JavaScript support. Thanks to Michael Day for pointing this out.

Originally published on O’Reilly, 19 Nov 2012 http://toc.oreilly.com/2012/11/indesign-vs-css.html


Gutenberg Regions

With the onward march towards the browser typesetting engine – not an invention but a combination of technologies with some improvements (much like Gutenberg’s press) – it’s interesting to think about what that would mean for print production. Although there are various perspectives on how the printing press changed society – and they mostly reflect on the mass production of books – and while the first great demonstration involved a book (the Gutenberg Bible), the invention itself was really about automating the printing process. It affected all paper products that were formerly inscribed by hand, and it brought in new print product innovations (no, not just Hallmark birthday cards, but also new ways of assembling and organising information).

The press affected not just the book but also print production in general. And now that we have browsers able to produce print-ready PDF, and the technology arriving to output PDF from the browser that directly corresponds with what you see in the browser window, we are entering the new phase of print (and book) production – networked in-browser design. We are not talking here of emulated, template-like design but the 1:1 design process you experience with software like Scribus and InDesign.

The evolution of CSS

There are several JavaScript libraries that deal with this, as I have mentioned in earlier posts (here and here), and the evolution of CSS is really opening this field up. Of particular interest is what Adobe is doing behind the scenes with CSS Regions. These regions are part of the W3C specification for CSS, and the engine adopting this specific feature the fastest is the Open Source browser engine WebKit, which is used by Chrome and Safari (not to mention Chromium and other browsers).

CSS Regions allow text to flow from one page element to another. That means you can flow text from one box to another box, which is what is leveraged in the book.js technology I wrote about here. This feature was included in WebKit by the Romanian development team working for Adobe. It appears that Adobe has seen the writing on the wall for desktop publishing, although they might not be singing that song too loudly considering most of their business comes out of that market. Their primary (public-facing) reason for including CSS Regions and other features in WebKit is to support their new product Adobe Edge. Adobe Edge uses, according to the website, WebKit as a ‘design surface’.
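In code, the regions model looks roughly like this (the `-webkit-` prefixes reflect the WebKit implementation at the time of writing; the box dimensions are illustrative):

```css
/* Pour the article's content into a named flow... */
article {
  -webkit-flow-into: book-text;
  flow-into: book-text;
}

/* ...and let it fill a chain of fixed-size boxes ('pages') in order. */
.page {
  -webkit-flow-from: book-text;
  flow-from: book-text;
  width: 140mm;
  height: 216mm;
}
```

When the first `.page` box is full, the text continues into the next one – which is exactly the behaviour a pagination engine needs.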

A moment of respect, please, for the Adobe team for contributing these features to WebKit. Also, don’t forget in that moment to reflect quietly on Konqueror (specifically KHTML), the ‘little open source project’ born out of the Linux desktop KDE which grew into WebKit. It’s an astonishing success story for Open Source and possibly, in the medium term, a more significant contribution to our world than open source kernels (I’m sure that statement will get a few bites).

“HTML-based design surface” is about as close to a carefully constructed, non-market-cannibalising euphemism as I would care to imagine. Adobe Edge is in-browser design in action, produced by one of the world’s leading print technology providers, but the role of the browser in this technology is not the biggest noise being made at its release. Edge is ‘all about’ adaptive design, but in reality it’s about the new role of the browser as a ‘target agnostic’ (paper, device, webpage, whatever – it’s all the same) typesetting engine.

A consortium, not an individual company

However, we should not rely on Adobe for these advances. It is about time a real consortium focused on Open Source and Open Standards started paying for these developments as they are critical to the print and publishing industries and for anyone else wanting to design for devices and/or paper. That’s pretty much everyone.

Gutenberg died relatively poor because someone else took his idea and cornered the market. Imagine if he had put all the design documents out there for the printing press and said ‘go to it.’ Knowing what you know now, would you get involved and help build the printing press? If the answer is no, I deeply respect your honesty. But that’s where we are now with CSS standards like regions, exclusions, and the Generated Content for Paged Media module. The blueprints are there for the new typesetting engine, right there out in the open. The print and publishing industries should take that opportunity and make their next future.

Originally posted on Nov 6, 2012 on O’Reilly’s Tools of Change site: http://toc.oreilly.com/2012/11/gutenberg-regions.html


What’s wrong with WYSIWYG

The current WYSIWYG paradigm has been inadequate for a long time and we need to update and replace it. Using a WYSIWYG editor is pokey and horrible. Producing text this way feels like trying to write a letter while it’s still in the envelope. These kinds of editors are not an extension of yourself, they are cumbersome hindrances to getting a job done.

Apart from huge user-experience issues, the WYSIWYG editor has some big technical issues, starting with the fact that it is not ‘part of the page’; it is instead its own internally nested world. In essence, it is an emulator that, through JavaScript, reproduces the rendering of HTML. As a walled, emulated garden, however, it is hard to operate on the objects inside it using standard JavaScript libraries and CSS. All interactions must be mediated by the editor. The ‘walled garden’ has little to do with the rest of the page: it offers a window through which you can edit text, but it does not let you act on other objects on the page, or let other objects act on it.

The new generation of HTML5 editors has taken a large step forward, not because they integrate HTML5 as such, but because they act on the elements in the page directly. That allows ‘the page’ to be the editing environment, which in turn opens up the possibility of representing the content in a variety of forms/views. By changing the CSS of the page, we can have the same content represented as a multi-column editing environment (useful for newspaper layout), as a ‘Google Docs type’ clean editing interface (see demo below), as a semantic layout that highlights paragraphs and other structural elements (important for academics), as well as other possibilities…
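As a rough sketch of the idea (the view names and the CSS are my own illustration, not any real editor’s API), each ‘view’ can simply be a different stylesheet applied to the same content:

```javascript
// Hypothetical sketch: one document, several editing "views",
// each just a different stylesheet applied to the same in-page content.
// View names and rules below are illustrative assumptions.
const views = {
  // multi-column view, useful for newspaper-style layout
  columns: "article { column-count: 3; column-gap: 2em; }",
  // clean, Google-Docs-like writing view
  clean: "article { max-width: 40em; margin: 0 auto; font-size: 1.1em; }",
  // semantic view: outline structural elements for academic review
  semantic: "article p, article h2 { outline: 1px dashed #999; }",
};

function viewStylesheet(name) {
  if (!(name in views)) throw new Error("unknown view: " + name);
  return views[name];
}

// In a browser, switching views is just swapping the stylesheet, e.g.:
//   document.getElementById("view-style").textContent = viewStylesheet("clean");
```

Because the content itself never changes — only the stylesheet does — every view is always editing the same live elements in the page.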


Additionally, it is possible to apply other JavaScript libraries to the page, including annotation software such as AnnotateIt or typographical libraries such as lettering.js. This opens up an enormous range of possibilities, since any use case can be extended by custom or existing third-party JavaScript libraries.

It is also possible to consider creating CSS snippets and applying them dynamically using the editor. This in effect turns the editor into a design interface, which will open the path for in-browser design of various media.
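For example — and this is a toy sketch under my own assumptions, not the data model of any existing editor — a ‘snippet’ could be a small object the editor renders into a CSS rule, and a ‘template’ just a named collection of snippets:

```javascript
// Toy sketch of dynamically created CSS snippets (hypothetical model,
// not a real editor API). A snippet is a selector plus declarations;
// a template is a named list of snippets.
function renderSnippet(snippet) {
  const decls = Object.entries(snippet.declarations)
    .map(([prop, value]) => `  ${prop}: ${value};`)
    .join("\n");
  return `${snippet.selector} {\n${decls}\n}`;
}

function renderTemplate(template) {
  return template.snippets.map(renderSnippet).join("\n\n");
}

// A designer-created snippet: drop caps on opening paragraphs.
const dropCap = {
  selector: "p.opener::first-letter",
  declarations: { "font-size": "3em", float: "left" },
};

const bookTemplate = { name: "paperback", snippets: [dropCap] };
// renderTemplate(bookTemplate) yields a stylesheet the editor could
// inject into the live page, making the editor a design interface.
```

Snippets built this way could be shared, added to templates, and re-rendered on the fly — which is most of the ‘design view’ feature list below.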

Lastly, WYSIWYG editors, while marvellous in their day, have had their day. They feel too pokey and ‘old school’. Largely due to the success of Google Docs, users no longer want to poke around in a tiny WYSIWYG editor. They want large, clean interfaces for content production.


The above screenshot is taken from the semi-functional demo at http://data.flossmanuals.net/mercury/index.html (all demos work only in Chrome for now).

In brief, the essential problems of the current WYSIWYG world are:

  1. it is not easy to enable JavaScript libraries to act upon the objects in the editor
  2. representing the content in context is difficult
  3. the content is not part of a page, so additional functionality like (non-intrusive) annotations cannot be added to content
  4. dynamic rendering of content retrieved from the server is hard to achieve
  5. dynamic creation of content is hard to achieve
  6. inclusion of nested JavaScripts is hard to achieve
  7. they look ‘pokey’ and old school
  8. synchronous editing is possible but hard to achieve
  9. users want to see the content they are making in a much cleaner and clearer ‘fullpage’ way
  10. some users want semantic views
  11. some users want design views
  12. all users (designers, editors, content creators, etc.) must be able to edit the content and do what they need from the same view

Online book production has special needs, such as the ability to display the content as it might appear in the output, annotations, a draft view, etc.

The Ideal Editor

A short starting list of features for a new editor might look something like this:

  1. harmonise edit, draft, proof, design features into one view
  2. live chat
  3. short messages that also support the OStatus API (which is built on the Twitter spec)
  4. send to renderers from within the editor
  5. ability to switch on and off third party JS libs
  6. apply CSS templates to content
  7. create CSS snippets
  8. add snippets to templates
  9. share snippets and templates
  10. dynamic snippet rendering
  11. live template swapping including semantic layout and ‘output’ view
  12. synchronous editing
  13. per ‘chunk’ notes

Many of these features are relatively easy to achieve, others will take some careful thought and planning.

The Dream features

  1. backend hook up to git
  2. versioning of content especially for different outputs (A4 vs A5 pages, html, epub etc)

Where to start

We need to develop with an editor in hand. The current batch of HTML editors does pretty well, but we need to choose one that has a good support community and a good feature set. Currently there is just one option: the Mercury editor. It has a new home page (less than two weeks old) and a thriving community.

Some demos of what could be done with Mercury in a book production environment:





If anyone would like to work on this, please contact me at adam@booki.cc

If you want some more convincing, try this math editor using Mercury and MathJax: http://data.flossmanuals.net/mercury/index_math.html
Try typing $f(x)$ into the window to see how it works.

Or try this to see how AnnotateIt works in this environment.