Open Source – Adam Hyde

I have been involved in many open source projects, but the list below is a good snapshot of some of the developments I am proud to have been involved with. For a more personal and detailed account of my journey, please see https://www.adamhyde.net/some-publishing-systems-i-have-developed/

Paged.js

This is a JavaScript library for rendering HTML content as PDF, built for pagedmedia.org. I always compare it to the typesetting role of the traditional book printer. It adds JavaScript to a web page and transforms it into a paginated book complete with page breaks, margins, page numbers, table of contents, front matter, headers etc. Paged.js is version 2 of book.js, which I developed in 2012.

PubSweet

PubSweet is the technical foundation of much of my current work with the Coko Foundation. It is an ecosystem for building customized publishing workflows. Anyone involved in publishing, be it academic journals or books, has a specific workflow unique to their organization, team and type of output. A document will bounce from one person to another as it goes through the multiple stages of review and editing, before it is actually put out into the world as a final product. This process is typically done over email with Word documents. PubSweet provides the framework to simplify and speed up this often clunky and frustrating process, allowing developers to work with publishers to design the publishing platform to suit their specific workflow and requirements. PubSweet is designed to be modular and flexible, consisting of a server and client that work together, components that can modify or extend the functionality of the server and/or client, and a command-line tool that helps manage PubSweet apps. PubSweet has a large open source community around it and we hope many new platforms will emerge out of it. Coko is currently working with both scholarly and book publishers to design bespoke publishing systems built on top of PubSweet.

All of these are designed to be configurable and extensible so that the technology can evolve with people’s workflows rather than needing to be rebuilt.

There is now a book about PubSweet, written by our community, including representatives of Coko, Hindawi, eLife and EBI, all of whom have been using PubSweet to build the publishing platforms which complement their workflows.

XSweet

Released in 2018, XSweet at its simplest is a tool to convert Word documents (.docx) to HTML. Content from a Word document is recognized and rendered as HTML, which can be published as is, imported into another application, or converted into other formats such as PDF. In its full functionality XSweet is a suite of XSL (eXtensible Stylesheet Language) tools which allow us to create publishing workflows without the lagging email chain of confusingly named Word documents. By converting documents to HTML, all the processes involved in editing and publishing happen in real time. It also handles the fiddly things which no one wants to do manually, like converting plain-text URLs to hyperlinks and inferring and tagging headings and heading levels based on visual formatting. When the editing is done the book is ready to publish and can be exported in multiple formats. XSweet stands alone or runs within INK. It also works very well with Wax, Coko’s web-based word processor for styling, editing and improving content.
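As a rough illustration of the two clean-up tasks mentioned above, here is a minimal Python sketch. It is not XSweet's code (XSweet itself is written in XSL). The function names `link_urls` and `tag_paragraphs`, the URL pattern, and the rule that any font size larger than the most common one marks a heading are all assumptions made for this example.

```python
import html
import re
from collections import Counter

# Assumed pattern for this sketch: http(s) URLs up to whitespace or '<'.
URL_PATTERN = re.compile(r"https?://[^\s<]+")
# Punctuation that usually ends a sentence rather than a URL.
TRAILING_PUNCTUATION = ".,;:!?)"


def link_urls(text):
    """Escape plain text for HTML and wrap bare URLs in <a> elements."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(TRAILING_PUNCTUATION)
        parts.append(html.escape(text[last:match.start()]))
        safe = html.escape(url, quote=True)
        parts.append(f'<a href="{safe}">{safe}</a>')
        # Anything stripped from the URL is emitted as ordinary text later.
        last = match.start() + len(url)
    parts.append(html.escape(text[last:]))
    return "".join(parts)


def tag_paragraphs(paragraphs):
    """Turn (text, font_size) pairs into HTML strings.

    Assumption: the most common font size is body text, and each larger
    size is a heading, with the largest size mapped to <h1>.
    """
    if not paragraphs:
        return []
    sizes = Counter(size for _, size in paragraphs)
    body_size = sizes.most_common(1)[0][0]
    larger = sorted((s for s in sizes if s > body_size), reverse=True)
    # HTML only defines h1 to h6, so deeper levels are capped at 6.
    level_for_size = {s: min(rank, 6) for rank, s in enumerate(larger, start=1)}
    tagged = []
    for text, size in paragraphs:
        level = level_for_size.get(size)
        tag = f"h{level}" if level else "p"
        tagged.append(f"<{tag}>{link_urls(text)}</{tag}>")
    return tagged
```

A real converter would read sizes and styles from the .docx file itself; the pairs here stand in for that extracted formatting.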

Wax

Wax is an in-browser collaborative editor built on top of JavaScript / Substance.io libraries. It is designed for HTML-first workflows so that content and figures can be edited and produced in the browser. It has most of the functionality one might expect in a word processor, such as adding images, saving, undo, redo, track changes, comments and various text styles. On top of this it provides functionality specific to book publishing: formatting styles such as epigraphs, image annotations and special characters, as well as role-based permissions and locking. Packages can be added and removed as needed. Most importantly, it is not bound to a frontend framework, which ensures its longevity. The UI is also customizable depending on the user’s desired layout. This aspect is currently being extended.

Editoria

Over the course of 12 months in 2017-2018 Coko worked with the University of California Press and the California Digital Library to build Editoria, a low-cost online platform to manage the production of books from the point of the author handing over a manuscript to the point of publication. It has capabilities for writing, editing, workflow management and output to PDF and EPUB in a browser-based platform. It also has permissions based on roles such as “authors”, “production editors” and “copy editors”. I facilitated the Collaborative Product Design process (see Facilitated Methodologies) together with the end-user community. I also set up and worked with the Editoria development team based in Athens. I’m very proud of the product and am looking forward to seeing how it influences the future work of the University of California Press and the California Digital Library.

Xpub

xpub is a family of platforms for the production of journals, built on top of PubSweet by the Coko community. There are currently four different platforms in the family: xpub-collabra, xpub-elife, xpub-faraday (Hindawi) and xpub-epmc, and there will certainly be more. They rely on the headless framework of PubSweet and are then customized to meet the specific needs of the organization’s workflow.

Legacy Projects

INK

INK is Coko’s ingestion, conversion and syndication environment that converts content and data from one format to another, tags with identifiers and normalizes metadata. Often with publishing toolchains, there are several things that need doing to a raw document before it is ready to publish. With INK a user can create a pipeline of steps that are executed in sequence. For example: convert (files from Word to HTML) > clean up (HTML) > modify (image size, colour) > translate (from the original language into another) > analyze (the contents of the document and generate a summary). There are many steps ready to use in the INK framework (and since it’s open source, more will always be added), which the end-user can chain together in whichever order suits their needs. INK consists of an API, written in Rails, and a client, written in JavaScript. Anyone with their own server can set up and run INK.
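The chaining idea can be sketched in a few lines of Python. This is not INK's API (INK is a Rails service with a JavaScript client). The `Pipeline` class and the two toy steps `convert` and `clean_up` are hypothetical names invented for this example, and each step is assumed to be a plain function that takes a document string and returns a new one.

```python
import html
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption for this sketch: a step maps one document string to another.
Step = Callable[[str], str]


@dataclass
class Pipeline:
    """An ordered list of steps that are run one after another."""

    steps: List[Step] = field(default_factory=list)

    def then(self, step: Step) -> "Pipeline":
        """Append a step and return the pipeline so calls can be chained."""
        self.steps.append(step)
        return self

    def run(self, document: str) -> str:
        """Feed the output of each step into the next, in order."""
        for step in self.steps:
            document = step(document)
        return document


def convert(text: str) -> str:
    """Toy 'convert' step: blank-line separated text becomes <p> elements."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


def clean_up(markup: str) -> str:
    """Toy 'clean up' step: collapse runs of whitespace into one space."""
    return re.sub(r"\s+", " ", markup).strip()


if __name__ == "__main__":
    pipeline = Pipeline().then(convert).then(clean_up)
    print(pipeline.run("Hello   world\n\nSecond\nline"))
```

Because every step shares the same shape, steps can be reordered or swapped without changing the pipeline itself, which is the property the paragraph above describes.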

Aperta/Tahi

In 2013, I designed a platform for the submission and peer review of scientific journals for the Public Library of Science (PLOS). It was originally called Tahi but was renamed Aperta. In 2014 I was asked to lead a team to build the platform. I led the 15-strong team to the production-ready 1.0 release of this multi-million dollar project in June 2015, on time and under budget.

I quit the project soon after, when the PLOS Board went against the original intention to build an open project and decided to close the repositories. This seemed particularly imbalanced because the development of Tahi had been paid for by the research community around PLOS. The PLOS Article Processing Charges fuel PLOS, and they committed some of this revenue to the development of Tahi.

Aperta was designed to be highly collaborative and concurrent. The platform included a manuscript production interface, HTML and LaTeX document editing support, Word ingestion, a workflow management system, task management interfaces, admin interfaces, reports, and user dashboards. The platform was built with Ember-CLI and Rails, implemented a highly customised version of the Wikimedia Foundation’s VisualEditor, and used Slanger for concurrency. It was an HTML-first system that introduced many innovative approaches to journal systems and solved many long-standing problems in this space. The project also involved a separate codebase named iHat that provided Aperta with an API service for queue-managed file conversions.

In 2017 development on the project was brought to a halt. Alison Mudditt, CEO of PLOS, explains why in this blog post to the community.

Book Sprints Production Tools

Developed in 2012, this is a book authoring and production platform still in use by my company, Book Sprints. It was built specifically for Book Sprints: strongly facilitated events where a writing and production team conceptualize, write, review, edit, design and publish a book in 3-5 days. The platform is very simple to use and is a fairly easy transition for anyone used to writing in word processing software or on a blogging platform. The main interface for a user is an HTML WYSIWYG editor and a dynamic table of contents with support for multiple chapter types. It also offers dashboards, publishing consoles, card-based workflow management (task manager), discussions and data visualizations of contributions.

PubSweet can produce EPUB and leverages book.js (see below) to produce print-ready PDF. PubSweet is written in PHP, using Node on the backend, and CKEditor as the content editor.

Lexicon

2012
Lexicon is a platform produced for the United Nations Development Programme to collaboratively produce a trilingual (Arabic, French, English) lexicon of electoral terms for distribution in Arabic regions. Lexicon provided concurrent editing for chapters with multiple terms, sorting by language, discussion forums and voting. Lexicon was written quickly in PHP with Node.

“The Lexicon was created with the aid of an innovative collaborative writing tool customized to suit the needs of this project. This web-based software allowed the authors, reviewers, translators and editors to simultaneously input their contributions to successive drafts from their various countries.”

http://www.undp.org/content/undp/en/home/presscenter/pressreleases/2014/11/19/undp-launches-first-lexicon-of-electoral-terminology-in-three-languages.html

Renderer

2012
A headless WebKit renderer that embeds BookJS for server-side rendering of HTML to PDF.

StyleJS

2012
A basic experiment for applying live CSS updates to HTML in the browser. It never made it to production.

BookJS

2012
Book.js is an earlier iteration of Paged.js, described above. Book.js has inspired a number of other JS pagination engines. See Vivliostyle, bookJS Polyfil, Pagination.js, simplePagination.js, and CaSSiuS.

Booktype Designer

2012
This was an in-browser design environment for applying CSS styles to both EPUB and PDF outputs. It was an experiment and never made it to production.

Booktype

2010 / 2012
Booktype is a web-based book production platform built for editorial teams working on complex projects, with the ability to import and output multiple formats. It is a newer iteration of ‘Booki’, which I developed in 2010 and brought to Sourcefabric, the open source developers of software for journalism, in 2012. Booktype is written in Python (Django).

Objavi

2008
Objavi is an API-based software service originally written for Twiki Book (see below) but which also serviced Booki and later Booktype. Objavi converts books from their native HTML into PDF for printing. It also handled other file conversions (e.g. HTML to ODT, HTML to EPUB). I later produced similar API-based conversion software, known as iHat, for the San Francisco-based Public Library of Science (PLOS). Objavi is written in Python.

FLOSS Manuals Production Platform

2006
The first open source publishing system I designed and built never had a name. It was developed so that the global community of free software enthusiasts could collaboratively author and distribute free documentation and how-tos about free software. It was first developed in English but soon became multilingual and even produced right-to-left manuals in Farsi.

The FLOSS Manuals system was an early hack built on TWiki, an open-source Perl-based wiki that had good PDF-generation support and a good plugin and template system. We used the TWiki account creation and permission systems and, with some clever JavaScripting by Alexander Erkalovic, added multiple versions, an HTML WYSIWYG editor, and a separate mechanism for creating an ordered table of contents. You could compare different versions of the same chapter, view overall progress and dynamically change the chapter order. A remix system enabled import of content from other manuals; a side-by-side editor was used for translation, supported by federated content to enable the translation workflow; and output was in many formats, primarily PDF and zipped HTML, and later EPUB.

The system produced many manuals and printed books about free software and is still being used today.