Editoria Vid

If you missed the webinar, the video is here (it will be available from the Coko site also)…starts 9min 30s in:

Also, we had 40 or so live questions during the Q&A. We answered them all and Julien decided to do a quick dive into Right to Left (RTL) support. He posted this screenshot with mixed RTL and LTR text to show it works out of the box (still more to test):

archmerge_2018-05-16_10-25-17

Site Changing

I’ve been moving things around to see how they feel on the site. I am reverting back to how it was a few weeks ago. My thinking at this stage is I like the site as is but I need to improve a few things and add a page with some more polished selected articles. I’m not happy with the longer form articles I have, they are fine as blog posts, but I will find a way to rewrite and create some new ones, and then create the new page.

Whats Involved in Building Editoria

We have a webinar coming up soon for Editoria. In anticipation of that I thought I’d write a little overview of what was involved in building the product. So… Starting with a little hand drawn schematic…

ed
The above shows all the moving parts that we had to consider, and build, to make Editoria what it is. From the user’s perspective, it looks like one app – the Editoria interface – where they can upload a docx file (or files) and start work on producing a book.

From an ‘under the hood’ perspective, it looks like the following…

xsweet

As it happens, we had to build all of the above to make it possible… So, here is a list of all the parts starting from ingest…

Ingest

Ingest consists of 2 parts:

  1. INK
  2. XSweet

INK is a job processing workhorse. Essentially it will run any kind of job you want and ‘steps’ are written for specific kinds of jobs/processes. These are a kind of ‘wrapper’ around executable code. INK has an API that enables platforms like Editoria to send files to it and request a specific set of steps to be run, and then return the result. In this instance, INK runs a set of steps that wrap up XSweet.

XSweet is a set of scripts that take a docx file and convert it to HTML. It is very modular and can be pretty easily extended and improved. It can also be customised per use case.

So, on the ingest stage it looks like the following to the user:

  1. user logins in to Editoria
  2. they create a new book from the book dash
  3. they vlivk on the new book
  4. from the book builder interface they choose ‘upload multiple word files’ or ‘upload word’.
  5. they select the word file(s) from a system dialog
  6. these files are automagically appear ‘in the book’ and can be edited
    1. note : in the case of multiple docs files, if the files have been named according to a simple convention, Editoria will create each new chapter and automatically populate the structure of the book and load the content in the right place

From the tech side the following happens:

  1. when a word doc file or files are selected, the files are sent to INK via the INK API, with a request to convert to docx using the XSweet steps
  2. INK runs the docx file(s) through the XSweet steps, creating HTML
  3. when the job is done, INK signals Editoria that the files are good to go
  4. Editoria grabs each file when it is ready and puts in the the right chapter, creating the chapter if multiple word files are uploaded at once

So… we had to build all of the above. First, INK exists because, crazy as it sounds, we could not find an elegant open source pipeline manager to manage jobs like this. So we decided to build from zero. In addition, which also sounds crazy, there is not a comprehensive open source library out there for converting docx to HTML that was easy to extend or customise for different use cases. We certainly know of many scripts for docx->html (many are listed on the XSweet site), but they are either unmaintained, or difficult to extend. So, unfortunately, we had to build that also.

Editoria App

The Editoria App is what people think of ‘as Editoria’. Everything the user interacts with is, on the surface, Editoria interfaces in the browser. However, Editoria comprises of 3 main parts:

  1. PubSweet
  2. the Editoria interfaces
  3. Wax

PubSweet is the ‘under the hood’ CMS that enables apps like Editoria to be built on top. PubSweet is a Node/React application written entirely in Javascript and it sits on the server and provides a ‘wrapper’ for the Editoria interfaces. You can think of PubSweet as a ‘headless’ CMS that thinks like a publishing system. Once again, when we started (and still today) there was nothing out there that could realise multiple publishing use cases, as diverse as micropubs, books, journals, content aggregation etc, built on top of the same CMS. So we had to build it ourselves. PubSweet is now pretty mature and indeed supports multiple publishing platforms – at last count I think that totals 6 platforms built on top of PubSweet (and it is early days).

The Editoria interfaces are the ‘web pages’ that make up the Editoria application. This includes the dashboard, the book builder etc… these are all custom code written on top of PubSweet to realise the Editoria use case / platform.

Wax is an online editor, it is so sophisticated we prefer to call it a web-based word processor. It is actually an Editoria interface, except that it is so complex and such a critical part of any online publishing tool, that it is a product in its own right. Nothing like this existed either, so we built our own (typical WYSIWYG editors are not sophisticated enough for the publishing use case).

PagedJS

From Editoria, you can press a button and have a book appear in the browser. If you print from the browser, you will then get a PDF ready to take to the printers with all the typographical bells and whistles that books need. It looks magical. We currently use Vivliostyle for this but we have hit many limitations with the software. We reached out to the org that produces the code but they weren’t ready to collaborate to improve the product. Hence, unfortunately, we had to build our own. Soon we will replace Vivliostyle with this new offering. The speed of dev has been incredibly fast on paged.js (working title) and so watch this space for some exciting news.. there are some early demos in the pagedjs repo if you are interested.

Summary

…. So, as you can see, Editoria is more complex than it appears to the eye of the user. Also, the crazy thing is that most of these parts should have existed already, but, crazy as it is, they didn’t exist. There has been no full page, open source, editor out there gives the source control and extensability required for the publishing use case. The docx conversion tools available do not yet do a good enough job of the conversion, and are not easy to customise. There is no headless CMS that ‘thinks like a publisher’ and would help us build Editoria; and we had to create a pagination engine (paged.js) because the one open source option couldn’t give us what we needed and we failed at trying to help them improve the product.

Its not like we wanted to build all this, but the sad fact is for the publishing industry the web is still an innovation they haven’t embraced (except as a distribution medium). So while all of these parts (or many of them) should exist, the sad truth is they did not exist. We did a pretty good look around to see if we could save ourselves the effort before embarking on these projects. We did, for example, bank on Substance Lens Writer for our editor, but it was not maintained and proved unstable, and our eventual assessment was that it was better to start again – hence Wax. We also have gone a long way down the path of using and promoting Vivliostyle, until it became apparent that neither the technology nor the technology providers could get us to where we needed to be… consequently, after implementing Vivliostyle (it is still baked into Editoria) we started a new approach to this problem – Pagedjs.

So, we had to build all of this ourselves… the good news is, that all of these moving parts are reusable in other scenarios… so you can take whatever you like from the above, and add it to your own product and hopefully that will fuel the publisher sectors ability to innovate and transform their products and workflows.

Kindle vs Kobo

Every few years I buy ebook reading devices to get a feel for where things are going. Not much news here really, things seem to have stabilized a lot in terms of formatting, price, and reading experience. They don’t compete on much, now that things have settled a little. So I won’t comment on it much,  just to say that the Kobo Aura wins for me over the Kindle Paperwhite, purely on the basis that the Kindle displays advertisements on the screen when in sleep mode, while the Kobo displays the cover of the book I’m currently reading. It makes a huge difference as to how I feel about the device.

New Term in My Lexicon

I have a new term for my Lexicon – “Timezone Disaster”. In short, that’s me – I am a timezone disaster. Recently I have found myself blankly staring into space when I get meeting requests like “Hows Tuesday at 2?”…. unfortunately, my first question is … “Tuesday, where?” as I often traverse the dateline… My Tuesday could be your Monday… and “2”…well… that’s a roll of the dice.  Could be anything.

Then there is the problem that I might be asked that question by someone in Timezone A, there might be someone else in Timezone B to factor in, and then I am currently in Timezone C, but over the period of their availability, I going to be in Timezone D, E or even F.

It’s no joke. This is just doing my head in!

So… time to admit that I can’t manage this anymore and time to introduce Anne Erika Schmidt. A friend of a friend. A Berliner. Efficient. And that’s all I know 🙂 We haven’t met yet but that’s normal for many people I work with, however next week I will be in Berlin and we’ll get to catch up in person. Which is important because Anne is helping me navigate this timezone mess and other things that a perpetual nomad needs help with. So far Anne has helped me a lot and I’m very grateful for it. So if you get an email from me with Anne cc’ed you will know why… I should have done this a long time ago!….

XSweet 1.0!

We have 1.0 of the docx -> HTML transformation tool XSweet out today! It also has a new site:

http://xsweet.coko.foundation/

XSweet is a finely crafted tool. It takes docx files, those horrible mangy MS Word files, and translates them into clean, lovely, HTML. XSweet is open source, modular, and very nicely done.

A huge tip of the hat to Wendell Piez (XML guru), and to Alex Theg. As geeky as it sounds, I loved watching these two chat about the issues they encountered making this software. The attention to detail was really unbelievable. Amazing work. XSweet is a finely crafted tool.

More info on Coko https://coko.foundation/announcing-xsweet-1-0/

If you want to do a deep dive into why I think this is important, I wrote this some time ago – https://www.adamhyde.net/typescript-redistributing-labor/

A day in Northland

When in NZ, I often have meetings at 1am, 2am, 6am… since most of the team is ‘somewhere else’ 😉 That can be pretty tiring but a bonus is that if I can muster the energy I can sometimes slip away on a weekday, during the day, and enjoy Northland. Some photos from such an adventure today. I took my neighbor Mike out for a day’s surf’n at Ahipara.

The first two photos show a car we passed that had tried to get around the rocks (we drove around there today – it needs a good 4WD). The car broke an axle and then the tide came in and wrecked the car… it was a sad sight to see and a sober reminder of how treacherous the coast can be.

dsc05406

dsc05407 These photos of us returning to the Hokianga on the Rawene ferry. The end of a great day.

img_20180503_171022 img_20180503_170649 img_20180503_170632 img_20180503_170255