We may be looking for another JS dev to add to the Kotahi team… not advertising it yet, but if you are looking for something cool to do and you are a good JS dev then just ping me.


Recently I’ve been amazed at what we can do with Editoria. The complexity of the content we can manage and output is really astounding. I’ll share some EPUB and PDF soon that shows what the content can look like.

This is the result of many years' work. I have been developing book production systems for a long time. The first one I produced was for FLOSS Manuals, and it was me cutting and pasting Perl code to extend TWiki into a book engine. The thing is, it worked. It actually had all the main components Editoria has now – a dashboard, a book builder, a chapter editor, an export engine etc. We could, at the time, produce books which were printed. The output was determined by what the editors and PDF generation tools were capable of back then. I was quite happy with the results at the time, and so I figured this kind of process would soon become the norm in publishing. This was in 2007 or so.

In some sense the output of those systems did become the norm in some circles – it was very similar, for example, to what you can see coming out of book production tools today like PressBooks.

Since then I’ve been involved in all sorts of publishing systems and learned a lot along the way. What I’ve learned is that the road from what we could do in 2007 to where we are today is a long one. At the time I didn’t understand this. I figured the books output to PDF were ‘good enough’. But then I started working directly with publishers – I was no longer an outsider – and I saw what they needed that was not in the systems I had built to date.

What those publishers needed actually sounds small. In general, they needed tools that could give them fine control over both the text and the semantics of the content. I didn’t understand how much effort the latter was going to take. What I didn’t understand was that the tools required to create and manage these display semantics during editing, and then to render them to PDF, would need to be exponentially more sophisticated.

By display semantics I mean this – a book as rendered in print form has various elements that form the structure, look and feel of the book. We are talking about titles for chapters (at the simplest level) or image captions. These things need to be identified in the underlying markup so that they can be styled and positioned when outputting the PDF.

However this is just the tip of the iceberg. There are also things like two-column content, footnotes, floating images, text flow around images, block images (that fill an entire page to the edge), two-page spreads, overlays, inserts, indexes etc. All of these must first be marked up with the right semantics (and those semantics must be maintained throughout the book’s production lifecycle without being screwed up) so that we can style and position the content on the page correctly at render time.
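To make this concrete, here is a minimal sketch (in JavaScript, with entirely hypothetical class names – not Editoria’s actual schema) of how display semantics might be carried in the underlying markup and then targeted by paged-media CSS at render time:

```javascript
// Hypothetical sketch: display semantics carried as classes in the HTML
// so a typesetting engine can find, style and position each element.
function renderChapter({ title, paragraphs, image }) {
  const figure = image
    ? `<figure class="float-left"><img src="${image.src}" alt="${image.alt}">` +
      `<figcaption>${image.caption}</figcaption></figure>`
    : '';
  return [
    '<section class="chapter">',
    `<h1 class="chapter-title">${title}</h1>`,
    figure,
    ...paragraphs.map((p) => `<p>${p}</p>`),
    '</section>',
  ].join('\n');
}

// At render time, paged-media CSS (e.g. via a tool like Paged.js) picks up
// those hooks to control layout on the printed page:
const printCss = `
  .chapter       { break-before: right; }     /* chapters open on a recto page */
  .chapter-title { string-set: chapterTitle content(); } /* feeds running heads */
  .float-left    { float: left; width: 40%; } /* text flows around the image */
  figcaption     { font-size: 0.8em; }
`;
```

If the semantics get lost during production – a caption pasted in as a plain paragraph, say – the CSS has nothing to target, which is why the markup has to be maintained through the whole lifecycle.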

This is also not all of it. It might not be obvious, but in many books we have call outs to content that appears later in the book. How do we enable publishing staff to mark up this kind of relationship? How do we enable this kind of inclusion when it comes to rendering the PDF?

We need a high resolution of semantics in the underlying content so we can achieve complex output for print. We then also need a typesetting engine that understands these semantics and can apply the right style rules to produce the design we are after. These tools have to work together, and they also have to understand referencing of content throughout the book for the creation of indexes etc.

Well, you might think that an easy path is to stay with non-complex output. I did at the time. But, as it happens, publishers aren’t happy with that answer.

Also, I don’t believe many folks really understand how complex this problem is. What I found is that we needed a word processor and a typesetting engine of a kind that did not exist. We needed ones built for the web that met the standards publishers demand. The problem was – there weren’t any.

So the process of building a book production engine became the problem of building a word processor and a typesetting engine. These are not simple categories of technology; these are sophisticated problems. Which organisations have solved them in the past for publishers? Microsoft and Adobe (although Microsoft never solved the issue well, which is one of the many reasons we produced other tools like XSweet).

Anyways… it is a little ridiculous to have to build these things. They should have existed. But they didn’t. I have to say I had a large quantity of crap thrown at me by various folks over the years for not being able to solve this overnight. I guess it wasn’t their fault – they didn’t understand how difficult this problem was.

The path to building a word processor also wasn’t helped by the fact that we had to build it twice, as the first build was on third-party libraries that eventually fell into disarray. That was a tough one.

So it’s been a long journey. Not many will understand what we have achieved. It’s not just the individual components, but what we had to understand to build them and to put them all together in ecosystems (platforms) that work. It’s enormous. Really, really enormous. It’s also been, obviously, a huge team effort by Coko and Cabbage Tree Labs folks. But we did it. I still don’t really comprehend how big it is myself. How did we, a small rough-as team of nobodies, build tools that took Microsoft and Adobe decades to build and refine, with all the resources they had to throw at the problems? Did we do it in a way we wanted, one that brought insights to the problem very few have? I believe we did. It’s really, really astonishing.

However, this week I really felt we got there. It was a more profound feeling of ‘getting there’ than I have had before. I mean, I have kind of known we had solved these problems for a while now. But this time the feeling was deep. It came because we recently made some relatively small changes to Editoria that bring it all together, and the results are quite astonishing. I’ll be sharing some content we have produced recently with Editoria – it is exceptionally good. It puts Editoria in a class of its own. Editoria is now capable of very sophisticated output of the kind that will make many publishers happy.

The same is true of the EPUB and web-based HTML output coming out of Editoria… but more on all this soon.

A brief outline of Kotahi features

Kotahi has been waiting in the wings for a while. Now, however, we are moving ahead with more accelerated development. In a short time the product has been extended in collaboration with eLife, and we now have the platform in production at eLife.

The following is a brief outline of the current state.

Importing content

Kotahi has an extensible import pipeline. Currently, docx import goes a step further than most ingestion pipelines and converts the docx file into a Kotahi-compliant data structure (via XSweet) so the content can be displayed and edited in the browser (using Wax 2). We also have use cases currently in testing for batch import of submissions via the Crossref, bioRxiv and PubMed APIs, as well as from content in a Google spreadsheet.

  • Extensible import pipeline
  • Currently supports docx import and batch import of articles via the bioRxiv and PubMed APIs
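As a rough sketch of what “extensible” means here (hypothetical API names – not Kotahi’s actual code): importers register themselves, declare what they can handle, and normalise everything into one common shape:

```javascript
// Hypothetical sketch of an extensible import pipeline.
const importers = [];

function registerImporter(importer) {
  importers.push(importer);
}

function importSubmission(source) {
  const importer = importers.find((i) => i.canHandle(source));
  if (!importer) throw new Error(`No importer for source type "${source.type}"`);
  return importer.import(source); // normalised { title, content } shape
}

// A docx importer would delegate conversion (e.g. to a service like XSweet)
// before returning editable HTML; the conversion step is stubbed out here.
registerImporter({
  canHandle: (source) => source.type === 'docx',
  import: (source) => ({ title: source.filename, content: '<p>…converted HTML…</p>' }),
});

// A batch importer could wrap a preprint API (bioRxiv, PubMed) the same way,
// mapping each record into the same normalised shape.
```

The point of the registry is that new sources (a spreadsheet, another API) can be added without touching the rest of the system.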


Authentication

Kotahi supports any kind of login: OAuth, integration with existing auth systems, or the standard account creation process. However, we have made the ORCID login process the default.

  • Authentication and login with ORCID

Manuscript submission

Submission objects (article, preprint etc) can take the form of a URL (e.g. a link to a GitHub repo for reviewing software, a link to a Jupyter notebook, or an existing preprint via a DOI link). Submission objects can also be an attachment (e.g. PDF, EPUB, LaTeX file) or content to be converted and edited within the system (e.g. docx).

Additional submission data is captured by author input into a submission form. This form is built by the admin, and the building of the form is all done in the browser with our Form Builder (no need for a developer to do this).

Each field has validation options as well as the option to mark certain fields as required etc.

  • Submit manuscripts from a URL or upload a file.
  • All manuscript metadata is captured in a form using the Form Builder
  • Submission form builder accessed via a browser
  • Options to customise/parse specific object data e.g. DOI URL validation upon submission
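A minimal sketch of how such per-field validation might look (the field names and rules here are hypothetical, not Kotahi’s actual schema):

```javascript
// Hypothetical form definition as a Form Builder might produce it:
// each field can be required and can carry its own validation rule.
const submissionForm = [
  { name: 'title', required: true },
  { name: 'doi', required: false, validate: (v) => /^https:\/\/doi\.org\/10\./.test(v) },
];

// Validate submitted data against the form, collecting errors per field.
function validateSubmission(form, data) {
  const errors = {};
  for (const field of form) {
    const value = data[field.name];
    if (field.required && !value) errors[field.name] = 'required';
    else if (value && field.validate && !field.validate(value)) errors[field.name] = 'invalid';
  }
  return errors;
}
```

Because the form is just data, an admin can add or reconfigure fields in the browser without a developer redeploying anything.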

Manuscript editing

If you have ingested a docx file then the ingested content (micropub, manuscript etc) can be viewed and edited within the system. When content is edited within the revision cycle, separate versions of each submission step are maintained. This is also true for form data captured in the submission form.

  • Edit your manuscript in your browser using Wax 2, a premium word processor developed by Coko
  • Wax 2 and Form Builder support multiple versions of a submission (all revisions)

Manuscript handling

There is a lot here… Admin and Managing Editor / EiC etc share the same ‘god’ role – they have access to everything. Senior Editors, Handling Editors etc are assigned per submission. Everyone accesses their submissions via the dashboard, although there is also a view that shows all submissions (available only to those with Managing Editor etc permissions).

  • Role-specific views and information accessed via the Dashboard
  • Managing Editor/Admin view accessed via the Manuscripts page: status, date modified, created by etc.
  • The Manuscripts page can be configured to include the following actions: view/edit metadata, edit/publish manuscripts and evaluations (reviews)


Kotahi supports ‘triaging’ via a number of tools. Essentially, filtering by topic and labels, combined with the ability to delete/reject submissions, enables multiple types of triage process.

  • Filtering manuscripts view by status, ad hoc labels and/or topics
  • Deletion of submissions

Peer review

Multiple types of peer review process are supported, including various kinds of open and blind review. There is also the ability to assign individual reviewers, or multiple reviewers conducting a shared/collaborative review.

  • Discrete vs shared reviews
  • Public vs anonymous reviews
  • Capture reviewer recommendations
  • Capture Editor decisions (accept/revise/reject)


Collaboration

Multiple tools enable collaboration: live chat, video calls, ‘presence’ indicators etc.

  • Synchronous and asynchronous chat
  • Chat directly with an Author or have a confidential discussion between editors
  • Launch a video chat
  • Built-in email notifications subsystem


Copyediting

  • Copyediting work can be done in Wax (Grammarly integration)


Publishing

This is more or less the export pipeline, and it is similarly extensible (as per the import pipeline). Kotahi has a native publish interface but it can also be integrated with external services. You could, for example, publish instead to the forthcoming product – Flax. Currently, use cases are also built in for publishing (registering) DOIs and metadata with Crossref, and for publishing to a spreadsheet.

  • Publishing to Kotahi’s homepage
  • Registering DOIs via Crossref
  • Publishing to a spreadsheet
  • Extensible export (publishing) pipeline


Reporting

  • Filter and view system-wide, manuscript and role-specific performance data

User management

  • Assign system-wide role permissions
  • Assign manuscript-specific permissions


Configuration

  • Text strings accessible for translation/modification
  • UI for configuration

Kotahi code is open source

New product announcement soon

At Cabbage Tree Labs we have been working on a new product called ‘Flax’. It is a platform that any tool can publish to. For a long time Coko has been focused on building workflow tools for the content production and review process. Now we are extending the natural scope of that flow to the actual publishing of materials.

The framework is designed to be lightweight and extensible. Building workflow products is hard, and we have developed PubSweet to make this easier. Publishing front ends (call them repositories, libraries, collections, shops… whatever you like) aren’t so tricky, but they need to be highly customised per publisher. Consequently we have built a lightweight, extensible framework that can handle any kind of content coming from any tool. The system can then be extended by a local development house (or by yourself if you are geeky) to include all your about pages, FAQs, policies etc.

The key features are:

  • content agnostic
  • lightweight and extensible
  • can handle one item or an entire repository/library of content

Regarding Coko products, we can integrate Flax with Editoria and Kotahi using GraphQL, which means there is a live connection between the workflow and the publishing end point. Consequently you can immediately publish, update, unpublish etc to Flax from Kotahi or Editoria.
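In practice that live connection amounts to the workflow tool posting GraphQL operations to the publishing end point. Here is a hedged sketch – the endpoint URL, mutation name and input fields below are entirely hypothetical, and Flax’s real schema may differ:

```javascript
// Hypothetical GraphQL request for publishing a manuscript to Flax.
function buildPublishRequest(manuscript) {
  return {
    query: `
      mutation Publish($input: PublishInput!) {
        publish(input: $input) { id url }
      }`,
    variables: { input: { id: manuscript.id, html: manuscript.html } },
  };
}

// Posting the request (Node 18+ ships a global fetch):
async function publishToFlax(manuscript) {
  const res = await fetch('https://flax.example.org/graphql', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildPublishRequest(manuscript)),
  });
  const { data } = await res.json();
  return data.publish; // update or unpublish would work the same way, via their own mutations
}
```

Because it is a live connection rather than a one-off export, re-running the mutation updates the published item in place.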

However Flax is built to integrate with any tool, and consequently we have placed it in Cabbage Tree Labs. The Labs are for projects that anyone building a publishing tool can leverage without using any other Coko products.

Anyways… announcement coming soon…