Intro to the Command Line

Many years ago when I was experimenting with Book Sprints I suggested to the Free Software Foundation that we do a Book Sprint at the LibrePlanet meeting in Boston (2009). The idea was to make a book introducing people to the command line. I asked Andy Oram (friend and editor at O’Reilly) if he would help me put it together and so we announced it and both turned up at the event to make it happen.

The Book Sprint was meant to be an opt-in. So ‘anyone’ who was at the event could come in and write the book. Of course ‘everyone’ wanted instead to go to the presentations! So we had almost no one from the event attend…it felt very lonely for a bit…

Then, a few people turned up online and because they couldn’t participate with the conference they wanted to be part of LibrePlanet and contribute to the book. It was an astonishing thing. More and more people turned up. I remember going to bed on the first night (it was a 2 day Book Sprint) and feeling like all was not lost, waking up the next day and a whole lot more of the book was done. It was amazing.

There were some fantastic contributions and we coordinated it all through online chat (I think it was IRC but can’t remember). I remember in particular having a really lovely chat with one participant – I think their name was ‘Freaky Clown’ but I can’t remember for sure. I remember a sense of solidarity and peer-ness when talking to them about the book and the process. There were a lot of things that made that process special.

Anyways…we produced the book in 2 days and it is quite an amazing work.

introduction-to-the-command-line

It was on offer from FLOSS Manuals (still there – http://write.flossmanuals.net/command-line/introduction/), but more interestingly the Free Software Foundation offered it for sale – https://shop.fsf.org/books-docs/introduction-command-line

The cool thing about that was (apart from the FSF publishing it which was already pretty cool) that they used their own cover art etc and made it their own… freedom in action.

Apparently the book has been very successful – its used as a textbook, has been reviewed a lot, and seems to get a lot of use – but imagine my surprise when I saw in the LibrePlanet meet of 2018 this program item:

Introduction to the Command Line brainstorming session

We’re updating the popular 150-page Introduction to the Command Line. What do you think should be in the new edition? We’ll be discussing content and process for updating this important work.

A product of a partnership between the FSF and Floss Manuals, this book gives new computer users a gentle, beginner’s window onto Bash, vim, a few scripting languages, and other key tools offered on the Unix/GNU command line. A lot has happened since the book was released in 2009. We want to include new developments without substantially increasing the length of the book.

Pretty cool! Andy hosted the session and I hope it went well… very nice to think these crazy things you are a part of become something and take on a life of their own….

I even found the only image I know of taken at the event (by me) when (on the second day) some folks took pity on us sitting by ourselves and came in and contributed…Andy on far left.

introcommandline-about-sprint-en

I also found this old blog I did on the FSF site –https://www.fsf.org/blogs/community/introduction-to-the-command-line

Circles…

Well… I was driving from Cambridge (UK) to Newquay and chatting to my buddy Faith about all the things I have been working on (Faith is rewriting some of my website) over the last 10 years. I was commenting on how crazy circular it all is at the moment… I found myself documenting Editoria a few weeks ago, using Editoria to document it… which is just the kind of thing I started doing when I started FLOSS Manuals 10 years ago. And then this week I was in a Book Sprint this week as a participant for the first time whereas it was the first time as a facilitator 10 years ago… and this week I was with a team of Coko people writing a book about a Coko product and at the same time with a team of Book Sprints people helping Coko to write a book…and I started both organisations… its all so recursive….

Its been a bit weird. And as I’m chattn to Faith and driving through a lovely summer’s solstice day in the UK pondering this out loud and what do I accidentally discover?….

img_20180622_154608

dsc00443

img_20180622_154639

Y’know… that place with the rocks in a circle….And I didn’t even know I was near it…I don’t know… gotta take account of things like that I reckin..

Off to Book Sprint!

A long time ago I started FLOSS Manuals to create free documentation for free software. It was my big get rich quick plan. Along that journey, which lasted 4 years or so, I had to come up with some things to make it all work… namely, I had to:

  1. learn how to build community
  2. build tools to help people collaborate to make books
  3. come up with methodologies to make books fast

Out of this came the FLOSS Manuals community, many (open source) book production platforms (most notably Booktype), and Book Sprints – a method to produce books from zero in 3-5 days.

Now… 10 years later…. here I am traveling to the Cambridge, UK, to be part of a Book Sprint facilitated by the CEO of the company I started, to write a book about PubSweet (a publishing toolkit for building publishing platforms built by Coko, which I co-founded), and the book will be written by the Coko team and Coko community…and written using Editoria – a book platform built ontop of PubSweet BY the Coko community!!!

Talk about recursive…. its like dog fooding on steroids…

INKy Thinky

Recently we released INK 1.1, it is a great milestone to arrive at, just a few weeks after 1.0. Charlie Ablett, the lead dev for INK, has been working very hard nailing down the framework and has done an amazing job. It is well engineered and INK is a powerful application. You may wish to ink1 if you wish to learn more about where INK is now.

For the purposes of the rest of this post (and if you don’t wish to read the above PDF) I should provide a bit of background, cut directly from the PDF:

INK is intended for the automation of a lot of publishing tasks from file conversion, through to entity extraction, format validation, enrichment and more.

INK does this by enabling publishing staff to set up and manage these processes through an easy to use web interface and leveraging shared open source converters, validators, extractors etc. In the INK world these individual converters/validators (etc) are called ‘steps’. Steps can be chained together to form a ‘recipe’.

Documents can be run through steps and recipes either manually, or by connecting INK to a platform (for example, a Manuscript Submission System). In the later case files can be sent to INK from the platform, processed by INK automatically, and sent back to the original platform without the user doing anything but (perhaps) pushing a button to initiate the process.

The idea being that the development of reusable steps should be as easy as possible (it is very easy), and these can be chained together to form a chain (recipe) that can perform multiple operations on a document. If we can build out the community around step building then we believe we will be able to provide a huge amount of utility to publishers who either can’t do these things because they don’t know how or they rely on external vendors to perform these tasks which is slow and costly.

A basic step, for example, might be :

  • convert a MS Word docx file to HTML

Which is pretty useful to many publishers who either prefer to work on an HTML document to improve the content, or have HTML as a final target format for publishing to the web. Another step might be :

  • Validate HTML

Which is useful in a wide variety of use cases. Then consider a step like:

  • convert HTML to PDF

That would be useful for many use cases as well, for example, a journal that wishes to distribute PDF and HTML.

While these are all very useful conversions/validations in themselves, imagine if you could take any of these steps and chain them together:

  1. convert MS Word docx to HTML
  2. convert HTML to PDF

So, you now have a ‘recipe’ comprising of several steps that will take a document through each step, feeding the result of one step into another, and at the end you have PDF and HTML converted from a MS Word source file. Or you could then also add validations:

  1. convert MS Word docx to HTML
  2. convert HTML to PDF
  3. validate HTML

That is what INK does. It enables file conversion experts to create simple steps that can be reused and chained together. INK then manages these processes in very smart ways, taking care of error reporting, inspection of results for each individual step, logs, resource management and a whole lot more. In essence it is a very powerful way to easily create pipelines (through a web interface) and, more importantly, takes care of the execution of those pipelines in very smart ways.

I think, interestingly, conversion vendors themselves might find this very useful to improve their services, but the primary target is publishers.

In many ways, INK is actually more powerful that what we need right now. We are using it primarily to support the conversion of MS Word to HTML in support of the Editoria platform. But INK is well ahead of that curve, with support for a few case studies that are just ahead of us. For example, INK supports the sending of parameters in requests that target specific steps in the execution chain, and it supports accounts for multiple organisations, and a few others things that we are not yet using. It is for this reason that we will change our focus to producing steps and recipes that publishers need since it doesn’t make sense to build too far ahead of ourselves (I’m not a fan of building anything for which we don’t currently have a demonstrated need).

These steps will cover a variety of use cases. In the first instance we will build some very simple ‘generic steps’ and some ‘utility steps’.

A Utility Step is something like ‘unzip file’ or ‘push result to store x’ etc…general steps that will be useful for common ‘utility operations’ (ie not file processing) across a variety of use cases.

A Generic Step is more interesting. Since there are a lot of very powerful command line apps out there for file processing we want to expose these easily for publishers to use. HTMLTidy, Pandoc, ebook-convert for example, are very powerful command line apps for performing conversions and validations etc. To get these tools to do what you want it is necessary to specify some options, otherwise known as parameters. We can currently support the sending of parameters (options) to steps and recipes to INK when requesting a conversion, so we will now make generic steps for each of these amazing command line apps.

That means we only have to build a generic ‘pandoc’ step, for example, and each time you want to use it for a different use case you can send the options to INK when requesting the conversion.

The trick is, however, that you need to know each of these tools intimately to get the best out of them. This is because they have so many options that only file conversion pros really know how to make them do what you need. You can check out, for example the list of options for HTMLTidy. They are pretty vast. To make these tools to do what you need it is necessary to send a lot of special options to them when you execute the command. So, to help with this, we will also make ‘modes’ to support a huge variety of use cases for each of these generic steps. Modes are basically shortcuts for a grouping of options to meet a specific use case. So you can send a request to run step ‘pandoc’ to convert an EPUB to ICML (for example) and instead of having to know all the options necessary to do this you just specific the shortcut ‘EPUBtoICML’…

This makes these generic steps extremely powerful and easy to use and we hope that file conversion pros will also contribute their favorite modes to the step code for others to use.

Anyway… I did intend to write about how INK got to where it was, similar to the write up I did about the road to PubSweet 1.0. I don’t think I have the energy for it right now so I will do it in detail some other time. But in short … INK has its roots in a project called Objavi that came out of FLOSS Manuals around 2008 or 2009 or so. I’ll get exact dates when I write it up properly. The need was the same, to manage conversion pipelines. It was established as a separate code base to the FLOSS Manuals book production interfaces and that was one of it’s core strengths. When booki (the 2nd gen FLOSS Manuals production tool) evolved to Booktype Objavi became Objavi2. Unfortunately after I left the conversion code was integrated into the core of Booktype which I always felt was a mistake, for many reasons of which I will leave also for a later post. So when I was asked by PLoS to develop a new Manuscript Submission System for them (Tahi/Aperta) I wanted to use Objavi for conversions but it was no longer maintained. So I initiated iHat. Unfortunately that code is not available from PLoS. Hence INK.

The good news is, this chain of events has meant INK, while conceptually related to Objavi and iHat, is an order of magnitude more sophisticated. It fulfils the use cases better, and does a better job of managing the processing of files. It also opens the door for community extensions (Steps are just a light wrapper, and they can be released and imported as Gem files into the INK framework).

At this moment we are looking at a number of steps, still deciding which to do first so if you have a use case let us know! A beginning list of possible steps is listed below:

Utility steps

  • Zip/tar into an archive (or unzip/untar)
  • Third-party API call
  • Collect all modified files from previous steps
  • Email all files
  • Download a file from a URI
  • SFTP files to external store
  • Generic terminal command

Generic Steps

  • Calibre
  • Mogrify (batch image processing)
  • Convert (Imagemagik command)
  • HTMLTidy
  • PDFTKF
  • Vivliostyle
  • Pandoc
  • WKHTMLTOPDF

Steps

  • HTML Validation
  • JATS Validation
  • Plagiarism checks
  • Interpolation of document structure and extraction/annotation of parts (Abstract, methods, results etc)
  • Subject matter taxonomy identification
  • Image extraction
  • DOI Assignment and registration
  • EpubCheck (being written by Richard Smith-Unna)
  • Natural Language Processing to identify authors, title, institution names, etc in an academic paper
  • Identify place names in a document
  • Identify people names in a document
  • Convert HTML body to JATS
  • Create valid JATS metadata
  • Merge JATS body and metadata
  • Syndicate content to various services
  • Push to Continuum

The Road to PubSweet 1.0

We are pretty close to our PubSweet 1.0 with the RFC now out for PubSweet 2.0, and a PubSweet dev site release next week.

pubsweet

It has been an amazing effort, particularly by Jure Triglav, the lead dev for PubSweet at Coko, but also fantastic work from Richard Smith-Unna, Alf Eaton, Yannis Barlas, and Christos Kokosias. Also more recently some great contribution from Alex Georgantas.

9

So, we are pretty much there and I’m presenting in San Francisco this week as part of a small Coko event to reflect on the future of the framework and discuss the RFC. For this purpose I’d thought I’d write a post to help me think through the thinking that got us here.

So…the thinking behind PubSweet started when I came back from Antarctica around 2007 or so (I was there setting up an autonomous base for artist-scientist collaborations).

5jan2007-2

I decided I wanted to give up the art world and try something new. The something new turned out to be FLOSS Manuals – a community writing free manuals about free software. I started it when I was living in Amsterdam somewhere around 2007. In order to execute on this mission I needed to get a couple of things sorted. Namely, learn how to build community, work out processes for rapid book production, and work out the tooling.

The tooling started with me scratching around with TWiki. A wiki written in Perl that happened to have the best plugins for rendering PDF. I scratched around, writing some Perl and cutting and pasting a whole lot more, and added some crazy .htaccess URL rewriting to produce a basic system for producing books. It was pretty scratchy but it actually worked. Later a buddy helped extend the system and later still I was able to pay him and others to extend it.

At the time it pretty much comprised a page (per book) for creating a table of contents.

developed_blog_chapterlist

and an interface to edit the content (chapters). I ripped out the native wiki markup editor and replaced it with a WYSIWYG editor, I think it was TinyMCE…

developed_fm_farsi

As you can see Right-to-Left content (in this case, Farsi) was also supported. There were also some basic things in place for keeping track of the status of a chapter, the version number, side by side diffs, side by side translation interfaces, and, later, dynamic table of contents organisation and edit locks.

Coupled with some basic PDF rendering stuff and a way to push the content from the ‘draft’ to the publishing front end and we were away.

flossmanuals-1

It actually had some other pretty cool stuff, such as side by side translation interfaces…

flossmanuals-howtotranslate-sideby-en

Remix

..a built in live chat for talking with collaborators…

booksprints-remotecontrib-chat-en

and even a way to send books between different instances (eg for sending a book from the FLOSS Manuals French community site to FLOSS Manuals Finnish for translation)….

flossmanuals-xchange-xchange_1-en
We could even render book formatted PDF and push to the lulu.com print on demand services. I just now checked and some of the books are still there!

Not bad for a Perl-based system, built on top of a wiki that wasn’t supposed to do this kind of thing, and built very with very few resources. The TWiki extensions were contributed back upstream to the TWiki repo and it was all open source but it was pretty hard to rebuild and no one I knew actually had a similar use case.

After this, I embarked on a journey to replace the system with a custom built solution specifically for book production. I can’t remember exactly when this started, maybe 2008 or 2009 or something. It was originally called Booki…

b

…which later became Booktype. Booki (and later Booktype) replaced the FLOSS Manuals tooling, although you can still see the working old tool here. That ole Perl code is still functional with no maintenance after 10 years, I can hardly believe it. The docs on how to use it also still exist.

Booki was built with Django (python) and pretty much had all the same stuff. Although the look and feel was changed quite a bit in the transition. There aren’t too many images around of Booki although I did find these screenshots of Booki taken by someone using it on the OLPC XO! (FLOSS Manuals did all the docs for OLPC/Sugar OS etc).

readingandsugar-booki-en

readingandsugar-booki5-en

It was hard to get financial support for it. Internet Archive gave us $25,000 at the time which seemed like a fortune. The evolution of Booki to Booktype represented me taking the project to a buddy’s in Berlin (I was living there at the time) based org (Sourcefabric) and parking it there so I could get more resources to build it out.

Booki/Booktype pretty much had, and has, the same stuff as the FLOSS Manuals system, just purpose built. So it had, a table of contents manager

images-cms-image-000003634

And a book (chapter) editor…

booktype_version_2_editor_screenshot

…chat…

booktype_2-0_edit_communication_tabs_chat

And the other stuff. Perhaps the only new features (compared to the FLOSS Manuals system) were a dashboard…

booktype_1-6-1_my_dashboard_740x429_1

…groups…

all_groups

and an interesting way to have Twitter-like messaging to pass snippets from chapters to other users.

figure-7_reference

Before I left Sourcefabric I wanted to get some other innovations built but didn’t get there. I did build some prototypes though. There was a task editor…

booktype-1024x531

…and live in-browser book design…

developed_booktype4-1024x768

Booktype is still going strong, now it is its own company (based in Berlin) and they also run the Omnibook commercial service using the software.

I left because John Chodacki and Kristen Ratan from PLoS invited me to come work for PLoS to design a new web-based journal submission system. I agreed…

But, before I leave the book story behind for a bit..I had set up Book Sprints as a company and put a small amount of my own money into building two new book production systems somewhere between leaving Sourcefabric and starting at PLoS. These two systems were PHP-based and Juan Gutierrez built them over some months.

juan

I wanted to do this because I was a little frustrated by Booktype not moving forward and also the platform was becoming more difficult to use. We were using it for Book Sprints but after I left the product took a new UI direction and I was finding Book Sprints participants were not enjoying using the system. So I built a Book Sprints specific system called… PubSweet… the namesake of the current Coko system which has eventually turned into something of a prototype for the new PubSweet… this new system was a lot simpler and easier to use than Booktype. It was initially meant to be modular but I think we lost that somewhere along the way. Cleanly modular systems take a lot of extra effort and time to produce so we gave in for speed of development’s sake.

The old PubSweet had a dashboard….

screenshot-from-2017-07-15-18-37-13

..table of contents manager…

ps1

and editor. Just like before!

screenshot-from-2017-07-15-18-40-08

We also introduced some new innovations including visualisations of the book production process…

developed_seo

Plus annotation (using Nick Stennings annotator software)…

screenshot-from-2017-07-15-18-48-01

 

and other stuff…I think threaded discussions, outline views, review page, an in-browser book renderer, book stats and I can’t remember what.

Anyway …I also built a platform on top of this old PubSweet for the United Nations Development Project. It was called Lexicon. Lexicon was pretty interesting as it opened my mind for the first time to the idea that an editor is not an editor is not an editor. Different content types (in a book) may require different editors or production environments.

Lexicon was produced to collaboratively produce a tri-lingual (Arabic, French, English) lexicon of electoral terms for distribution in Arabic regions.

screenshot-from-2017-07-15-19-09-11

Lexicon had all the same stuff as the old PubSweet but with one major innovation, you could create chapters that were WYSIWYG based, or you could create a chapter which enabled you to add and sort individual terms and provide translations.

screenshot-from-2017-07-15-19-09-34

It was a pretty interesting idea and we were able to make a really cool book which the UNDP printed and distributed across many Arabic-speaking countries. I still have the book on my bookshelf.

The other interesting thing was that the total cost for building this on top of the old PubSweet was $10,000 USD. This was mostly because we could leverage all the existing stuff and just build the difference…interesting idea!

Ok, so then I dropped book production systems around 2013 or so for a while and went to work for PLoS on a system that was called Tahi and then became Aperta. The name Tahi came from the name of the street I was living on in New Zealand before I had a US work visa and was designing the system – Reotahi Road (cool road). Reotahi means ‘one voice’ and ‘tahi’ means ‘one’ in Maori. It was built on Rails with Ember. Essentially the front end and backend were decoupled although it was really pushing the technology at the time to do this. I designed the system and moved to San Francisco to manage the team to build it.

Tahi (Aperta) had a dashboard (surprise!) and editor, just like the book production systems but I introduced two major innovations – Cards, and card-based workflow management interfaces. Unfortunately, while I was asked to come and build an open source system, things went a little weird at PLoS and they closed the repos, effectively making it a closed platform. So I quit. That also means I don’t have any screenshots to show you. Pity. If you sign an NDA with PLoS I believe they might show it to you.

However, you can picture it a little – imagine something like Trello, or Wekan – these are card based kanban systems. But imagine if you could custom make cards to do anything. Effectively cards were first class citizens of the platform and could access the db, perform system operations, make external calls, do validations, whatever you wanted. In hindsight, I think they were as close to an idea of an ‘app’ that you could have in a browser platform, although that wasn’t the way I thought about them at the time. Additionally, cards were imported into the system since each card was actually a gem file. This meant any publisher could custom make their own cards to do whatever they wanted and place them within the kanban-like workflow space (task manager). Pretty neat.

So, cards could be surfaced and used anywhere in the system. We used them for authors to enter submission data, but also for production staff to perform operations, for reviewers etc etc etc. They could also be placed on a kanban board to make a workflow. Cards could be moved around the workflow and deleted or new ones added at any time.

To manage all this my other idea was to let these cards flow through a TweetDeck-like interface. So you could sort cards, per role, per user, at volume.

Tahi essentially had four spaces – a dashboard, a submission page (which displayed the manuscript in an editor, and submission data could be entered through cards), a task manager (workflow for the article, using cards as tasks), and a ‘flow manager’ (the TweetDeck-like interface for sorting all your cards across all your articles). While the FLOSS Manuals, Booki and Booktype platforms were pretty much monolithic systems, the old PubSweet was sort of modular. However, Tahi did decouple the front end and back end but I wanted to also break these four spaces into discreet components. That would have given the system enormous flexibility but unfortunately I wasn’t able to do this before I left.

Anyways, Tahi/Aperta is a little old now but it was pretty cool. I don’t know what happened to Aperta but I believe it is now being used for PLoS Biology.

After I left PLoS I was offered a Fellowship by the Shuttleworth Foundation to continue on the mission to reform publishing. So I started Coko with Kristen Ratan (who was the publisher at PLoS)….

img_9224_792x445

So there are some themes from building the past 7 or 8 publishing systems (depending on how you count it… there were also some other interesting experiments in between). First, the next system you build is always better. That is for sure. It’s an important thing to realise because when I developed the FLOSS Manuals system I thought that was it. Nothing could be better! But I was wrong. Then Booki/Booktype and I felt the same thing. I was so proud of it and nothing could be better! haha… you get the picture. The reason why it’s important to understand this is because I think it gives someone like me a bit of freedom. I can take some risks with systems knowing you get some stuff right, you get some stuff wrong. But the next system will get that bit you got wrong, right. Taking this attitude also takes the pressure off and you can have more fun which is good for your health, the team you are working with, and the system.

As far as technical lessons learned… well… after looking back at all these systems when we started Coko, I realised that the idea of independent ‘spaces’ for publishing workflows had a heap of currency. How many systems did I have to build with baked in dashboards, task managers, editors, table of content managers, etc etc etc before I could realise it doesn’t make sense to do this over and over. I wanted to take the idea of these kinds of spaces forward and not have to build them again and again… so some kind of system where you could include whatever spaces/components you wanted would be ideal… This would have two very important side benefits:

  1. I could learn so much because if the next system you build is always better, what about a framework that would allow you to easily build a whole lot of systems at once! Or build a lot quickly over a short amount of time… just imagine how much you could learn…
  2. It would open the door for others to innovate. I have since given up the idea that my system (so to speak) was the best ever and no one could top it. That’s just the testosterone talking. I’m kinda over it (sorta). I want other people to be able to make better stuff than what I have produced so far, to bring in innovations I never thought of. I want to make that easy for them and now I understand a whole lot better how publishing workflows actually work I’m in a very good position to do that.

That was a lot of the thinking behind the new PubSweet – PubSweet 1.0. But there is some other stuff too. Through my time at PLoS, I came to understand just how many variables affect workflow choices in journal publishing and that each publisher has slightly different conditions and roles that affect this. That means that the access control is complex. We might think there are various roles – author, editor, reviewer etc that shepherd an article through a process but it’s not that simple. Any number of conditions can affect who gets to see or do what and when. So we need to have a very sophisticated way to set and manage this.

There was a lot of other stuff to take into account to but I mention these two specifically because recently when I was talking to Jure (lead PubSweet dev) about PubSweet 1.0 and reflecting on how far we came he nailed it, he identified the two major innovations of the system being:

  1. reusable/sharable components (spaces)
  2. attribute-based access control

I agree entirely. I think I might add another:

  • developer experience

It is pretty easy, and getting easier, for developers to develop publishing platforms/workflows (call them what you will) with PubSweet. I think it is pretty astonishing and I think these 3 characteristics put together enable us to build multiple publishing systems fast and in parallel (with small teams) as well opening the door for other to do the same and huge opportunities for innovation.

If we are successful at building community this will be a huge contribution to the publishing sector.

In a future post, I’ll break PubSweet spaces / components down in more detail. There were also a lot of other similar stories regarding technical innovations on the way (eg Objavi->iHat->INK), but I’ll break them down into posts on another day.

I meant to also talk about Editoria here, the monograph production system built on top of PubSweet, and xpub – the PubSweet-based Journal system.

2016-06-28_15-41-37

They are both pretty amazing and leverage so much more than the previous systems identified above.

Login page for our first Journal platform.

I think the main thing with them is that we are working extremely closely with publishers using the method I developed – the Cabbage Tree Method.

Editoria Design Session

This means that I am no longer involved in building, what I would call, naive publishing systems. Naive in the sense that publishers could use, for example, Booktype, but it’s not really built for publishers. It’s a general book production system built by someone who didn’t know much about publishing at the time. That’s great of course, there is a place for it. However, Editoria is not a naive system. It is designed by publishers for publishers and the difference is enormous.

But I will leave a longer rant about this for another post.

I do however, want to say that I didn’t, of course, build any of the above systems by myself. There were many people involved and I have credited them elsewhere in this blog. I’m not going to do another roll call here except for Jure Triglav.

Jure and I sat down just over 18 months ago to discuss some of the lessons I learned as explained above. We jammed it out over post-its, whiteboards, coffee, and food in Slovenia and you can read a little more about that process in the PubSweet 2.0 RFC. But Jure trusted me, and I trusted him, and he took these ideas and, with a small team in very good speed, made them a reality. As a result, I think PubSweet is an exciting system and will only get better. Congratulations Jure, you deserve special thanks and recognition for the absolutely amazing job you have done.

dsc00516-1

The invisible skill

I do a lot of facilitation and I’m very good at it. I don’t know how I came to be a facilitator exactly, it’s not the sort of thing I think you go to school to learn. It just somehow comes out of you bit by bit on a road to achieving something else entirely. In my case, I think it came about while trying to build FLOSS Manuals (a community that produces free manuals about free software). FLOSS Manuals (FM) needed content, and I realised that it was not going to happen at the scale needed if I wrote it all myself. So I had to learn to build community and building community requires facilitation (even if you don’t know it at the time).

floss-manuals-openmind-4-728

Anyway, long story short, I became a facilitator the hard way – by not knowing that it was facilitation that I was doing. I had no context. ‘How you get things done’ in the world seemed to be all about the doing of those things. If you want to write software manuals, for example, you wrote manuals. That is what publishers and author, editors etc did. Who ever heard of facilitation in the context? No one that I could find.

So it took me a while to even realise that facilitation could play a role in making manuals (books) and I only had that realisation after I first tried it the publishing industry’s way and failed. It’s even fair to say that I had no understanding that facilitation could have a role in helping people to make books until I had been facilitating people to make books for many years. I was that blind to the idea. Instead, I had this strange, slowly evolving awareness that somehow when I got people in a room and I was there also, then books resulted. It seemed like it was the participants that ‘were doing it’ and, bizarrely, every time we did it, it worked, every time we did it it was better than the last time, and I happened to be there to witness it.

8004_11

It took me some time to work out that this result was because of the role I played. It took a very very long time – I would say, possibly, 2-3 years. The awareness didn’t come in one shot either. I first thought it must be the process that made it work. So I started trying some stuff out. I remember very strongly thinking that there was this process that actually existed, like publishing processes exist, and that it was concrete and tangible, but it was just that it was unknown.  As seriously kooky as that might sound, that’s how I thought of it at the time. I felt I was discovering something that pre-existed, some process that just needed to be revealed.

Then, slowly, after Book Sprints were really kicking ass and producing remarkable books I’d have thoughts like ‘I wonder how important this process is?’ I wonder if it might actually be me that is the most important ingredient. Not me in an egoist way, like Adam Hyde is the only person that can do this (interestingly, other people thought this might be the case!), but maybe that it is not so much the process but what I am doing that is making this work.

7917_01

So comes the understanding of the extremely interesting and tangled relationship between facilitation and method. I spent many years untangling it and now I think I have a 1.0 understanding. Like, at the school of facilitation if you get this idea, then you are actually allowed to call yourself a facilitator…. it is that basic, and that central to what facilitation is. My un-battle-hardened synopsis is this series of truths might seem a little contradictory: a good facilitator is better with a good methodology. A good methodology is nothing without a good facilitator. And a good methodology to a good facilitator is nothing but an interesting yet weak navigational instrument.

Anyhow. My sum of this is that I often get people telling me they would be a good facilitator for this or that, or that they would be if they had the opportunity. I also see a world where methodology is seen as king, you just need to read it and follow it to the letter and you’ll be sweet! I can’t blame people for this. How could I when it took me so long to understand that facilitation was a thing? I can’t blame people when they think it’s something anyone can do. But of course, I do find it frustrating. I’m no saint! But after many many years of practice and experimentation, pondering, trying things out when I was terrified they would fail, failing, succeeding, mentoring others to do it well, exploring the weird psychology of it all, seeing others do it so very badly – I can now say I know what facilitation is and how special it is to be able to do well this invisible skill that so many do not know even exists.

16312812953_10cda14f90_b

Simplified, Narrative, and Transient Attribution

I’ve been thinking through attribution models for many years. Mainly for books, but it is also interesting to consider how credit is attributed in open source projects. For both open source and books, I believe narrative attribution is a great model which escapes many of the problems of simplified attribution, but there are some interesting gotchas… so, let’s take a look at it….

A copyright holder is different to those people we attribute credit to for producing a work. Michael Jackson, for example, held the copyright for the Beatles songs for a long time, however we attribute the credit for those songs to Ringo, George, Paul and John.

I like to think of this ‘naming’ of contributors to be a type of simplified attribution. It is a handy tool to be able to hang credit on a couple of names. Nice and clean, not complicated, and easy to remember. It is also the way copyright works. Copyright requires some names to be cited as the copyright holder and many times we, out of habit, co-relate the named copyright holders with the credit for producing the work. It’s just how our brains work.

However, there are a couple of things we need to tease out here. First, named credit is not synonymous with the named copyright holder and we should be a little more careful with this and avoid conflating the two. Michael Jackson, for example, clearly did not write the Beatles songs, so we should be careful to avoid conflating credit and copyright.

Secondly, simplifed attribution doesn’t actually tell us much. It is not rich information. It’s a couple of names and we tend to attribute far too much credit to those names. We have a romantic notion of creation and we tend to want to think there is a ‘solitary genius’ behind things. If not a solitary genius, then a couple of identifiable people that we can attribute genius to collectively. But this is really fantasy. Great things come from many people, not one. One person by themselves isn’t capable of much. How individuals work together is the real story and these stories are far more complicated than the vehicle of simplified attribution can convey.

Those who know the story of the Beatles, for example, know there is far more complexity to how the songs were written and by who. Each Beatles song has its own history and unique way of coming into existence. The Song Michelle, for example, was mainly written by Paul McCartney, with a small portion a collaboration between McCartney and Lennon. Even more interesting, is that other characters came into play. According to Wikipedia:

McCartney asked Jan Vaughan, a French teacher and the wife of his old friend Ivan Vaughan, to come up with a French name and a phrase that rhymed with it. "It was because I'd always thought that the song sounded French that I stuck with it. I can't speak French properly so that's why I needed help in sorting out the actual words", McCartney said.

Vaughan came up with "Michelle, ma belle", and a few days later McCartney asked for a translation of "these are words that go together well" — sont des mots qui vont très bien ensemble. When McCartney played the song for Lennon, Lennon suggested the "I love you" bridge. Lennon was inspired by a song he heard the previous evening, Nina Simone's version of "I Put a Spell on You", which used the same phrase but with the emphasis on the last word, "I love you".

The actual story of how the song was written reveals a much more interesting story than the typical way we assign credit for these songs. I find the story of how something was made, and by whom, way more interesting and informative than a list of a couple of names. I also think it’s a better way to attribute credit as it values the contribution of each individual and brings forward the interesting ways in which they made the contribution. I think this kind of narrative attribution is a far more important way to credit people. It honours both the people and the process a lot better than the more simplified naming of names.

I want to say something about transient attribution in a bit, but before moving on I’d like to give another example of why I think narrative attribution is necessary as we move increasingly into a collaborative future. At FLOSS Manuals, an org I founded 10 years ago to collaboratively produce free software manuals, we automated simplified attribution. If you made an edit, your name was automatically added to the credits page. That worked ok for the naming of names. But how useful is this?

Well, if one person claimed to write the book, this might be useful. But below is an example which illustrates the problem…this is a screenshot of the credits page of a freely licensed book that was updated and rewritten every year by a different team of people for 4 or 5 years. So, the simplified attribution looks like this :

fireshot-capture-1-credits-civicr_-https___docs-civicrm-org_user_en_stable_appendices_credits_

Well…you get the picture! It is pretty meaningless information, or at least, there is not much utility from this kind of attribution. It is, in short, too much information and too little information at the same time. A list of names like this is nothing more than a long list. It doesn’t highlight the many interesting ways individuals contributed, and no one here actually gets much credit at all from such as list because no one will read it. It is pretty much useless.

So…what can we do? Well, I believe we need to get away from simplified attribution at all times. We need to migrate to narrative attribution. We need to learn how to tell stories about how people make things and who was involved. It is for this reason that at FLOSS Manuals we started writing “How this book was written” chapters. Here is an example from the book Collaborative Futures which is now stored in the Wayback Machine. A brief snippet:

This book was first written over 5 days (18-22 Jan 2010) during a Book Sprint in Berlin. 7 people (5 writers, 1 programmer and 1 facilitator) gathered to collaborate and produce a book in 5 days with no prior preparation and with the only guiding light being the title ‘Collaborative Futures’.

These collaborators were: Mushon Zer-Aviv, Michael Mandiberg, Mike Linksvayer, Marta Peirano, Alan Toner, Aleksandar Erkalovic (programmer) and Adam Hyde (facilitator).

It is a short story but it goes some way to bring out a little of the nuance of how the book was made. We could have gone further but it gets the point across. We also made sure that this way of attributing credit was included as a chapter in the book. So wherever the book went, the story of how it was made travelled with it.

I think we could do the same with software. Let’s not conflate copyright with credit, for a start. Next, let’s proceed to a more interesting way of attributing credit. Not a naming of names, but telling a story of who was involved and how. Finally, let’s make sure this story is part of the software and travels with it ie. put the story in the source code repository.

But there is an obvious flaw to this approach. What happens when many people are involved over a long time. If you had been paying attention, for example, to that 29,000 pixel tall monstrosity I included as a screenshot above, you would have already deduced that narrative attribution is not going to do a better job of attributing credit than simplified attribution.

Just how long would the story be for that same book? It would be pretty long! So, how do we deal with this? Surely no one would read that story either? Good point! This is where I believe transient attribution comes into play.

Transient attribution is the recognition that large complicated works take a long time to make. In the case of software, the job pretty much never ends. A software must be updated to add features, or it may need to be refactored, updates needed for security fixes, or to meet the needs of new versions of operating system…it just doesn’t stop. That could mean that we just keep adding to the story, making it longer and longer as we go. But I would like to suggest another approach. I think it is more interesting if we were to tell only the story of the last phase leading up to the current version of the book / movie / software etc…there is really no need to tell the whole story all in one go…break it up into smaller parts. Learn to tell the story as it evolves. The software world already, kind of, does this in Release Notes. We commonly include Release Notes that document the differences between the previous and latest versions of the software. It is, in a way, the story of the software, but it is not the story of the people. Why not adopt this model for attribution too? Focus on the latest part of the story and call people out and celebrate their contributions for this most recent phase.

My recommendation is this:

  1. use narrative attribution to credit people for their work
  2. tell a story that says something about what they did
  3. use transient attribution to tell the story in smaller, more timely, pieces
  4. ensure the story travels with the software (ie. stored in the repo)

It doesn’t feel like such a stretch. It’s not that difficult to do either. The point is, software takes a long time to make, happens in stages, and different people with a diversity of skills come into play at different times. Let’s celebrate all the people that made it happen, and celebrate this as closely as possible to the moment they did the work. Further, the story of how a software is made is far more valuable to everyone than a simple naming of names. If we could take this short step (and it is not far) I think we would have richer attribution, happier communities, and a richer understanding about how software is actually made. And that, if you ask me, is a good thing.

The first not-really Book Sprint

korcula_overview
Korčula

I have been poking around in my own archives recently. It’s pretty cool to remember things that I had completely forgotten. Anyway, the following is a post I found about the first Book Sprint. It was about Pure Data and held on the little island of Korčula, Croatia. A buddy, Darko Fritz, let us use his apartment in the center of the old town. I was to join two friends, Derek Holzer and Luka Prinčič, at the apartment for 3-ish days. However, on the way I had a health freak out and had to participate remotely. It was our first attempt at a Book Sprint and looks almost nothing like how Book Sprints are done now, but it was a starting point and gave me a lot to think about.

The following is the report written by Luka for the FLOSS Manuals site. I love the two images in the post. They are the only images Luka and Derek sent me of this mysterious event.

So, Adam sent us (he was supposed to come too, but couldn't at the end) - he sent Derek Holzer and myself to this Croatian island Korcula, into the little (but very very cute) town ­named the same (Korcula) so that we Sprint-it up and write a Pure Data manual for FLOSS Manuals. I took my little old red car and drove down from Ljubljana over Rijeka and under scary stoney Velebit entering the warm sunny region of Dalmatia and driving further towards Split. Now this was a fun ride, especially under Velebit was very scenic. My cup of tea, as I hate the boredom of motorways. And my clutch was going to hell too, so my fingers were crossed. In split I caught this superfast catamaran-pedestrian-ferry that took me in under three hours to Korcula where Derek was waiting for me. And for phone line. And for internet connection.

news-_img_2484_1-enIn this lovely little town with incredible feel of Venetian architecture, tiny little streets, all set in stone, we waited for ADSLine to appear, which it did right next morning. The evening before we discussed over a plate of Italian pasta and a beer what would be good to work on in the coming days. After the internet magically appeared (yeah right!) we could also discuss with Adam, who was in Amsterdam, over Skype how to work and what to work on. Derek was somewhat the main captain although each of us worked on our own chapter. Adam was going after installation instructions for various platforms, Derek was rewriting and expanding tutorial on how to build a simple synth, and I was going deep into 'Dataflow' tutorials.

news-_img_2496-en On Monday when I arrived, the sea was quite wild as there was a lot of wind. Tuesday was all gray and rainy. We had some talk about doing some hiking, as Derek had already explored some terrain, but Tuesday was a bit too wet, so we stayed in and worked on bits and pieces trying to see how much work there is and how writing feels. It seems from the after-perspective that this was the day to slowly get the grip on how the writing will go.

On Wednesday, we took a bus in the morning to a nearby village by the coast and with a little map from the tourist office tried to find that little path that would correspond to a tiny dashed black line that goes up the hill towards another village about 500m higher. We searched for more than an hour until finally a communicative small but wide man told us that old paths were overridden with a new tarmacced road. "I used to go up those little paths, when I was young..." he said, "but now there's a road so why would one bother, but if you really want there's a little path there, you can find it...." We walked a path for a while and than decended slightly to find a road and walked the road for an hour, maybe two, under not-too-hot spring-ish sun. It was lovely. After icecream in a local shop (those local shops are always something!) we caught a bus and went back 'home'. We started to work immediately, had some lunch and worked quite late into the night. We talked to Adam over Skype and kept working each on our own part.

Interestingly, and maybe because we were supposed to leave on Saturday morning (Derek had a plane ticket), next two days somewhat turned into a bit of frenzied work. As we knew we had just two days left, we tried to finish our parts. In consequence, at the end of Friday, we realized we hadn't gone out much. A real Sprint! One of us would go to the shop now and then, couple of times a day, and cook something (mostly pasta! - but I found some local home-made from Istria, and also Korcula olive oil!) The weather was nice, not cold, with many bits of sun. The voices of kids, sometimes horny cats, and other inhabitants were protruding through an open door and were making a special kind of atmosphere. It was lovely to feel this Venetian-medieval town around you. Our flat was on the third, top floor, we could see the sea from it and the roofs around us.

On the friday evening we found ourselves with two huge finished chapters, both quite satisfied with it. We were tired but happy we made something. We searched the town for fish and found an empty but slick-looking restaurant with a young owner-cook who made us some fish and vegetables. I guess we could have made them ourselves at home, but none of us spotted a fishery and had much time and energy left. I must say it has also been very educational for me to listen to Derek's peculiar playlist which included a lot of metal, old rock (Black Sabbath, etc) and African (Angolan) music from 70s. Luckily, Derek had a near perfect (or at least compatible) feel for sound levels, so it was always very enjoyable to become aquanted with a genre I haven't explored yet in my life much.

In the aftermath of the Sprint of two-and-a-half days, we both felt we wouldn't be able to go like this for another day. If anything, we would need a day, or ideally two days of real break from this and than we could come back to the book with passion. I was thinking this would be great to do three or four times. In waves. Either that, or steadily work for 8, maybe 10 hours per day and then have a break. But I think this also depends on the writer and her/his ability and fitness to write. 

At the last moment I decided to stay for another two days (as our 'landlord' told us we could stay for us long as we liked, and Adam agreed). So Derek left on Saturday morning (6 am, ugh) and I worked on some of my music and had couple of walks around town and surroundings, really enjoying it immensely.

My car was still where I left it in Split, and the clutch was still functioning (barely), so I drove back, under lovely Velebit of course. It was almost like summer all the way to the border with Slovenia, where heavy rain was waiting for me.

Korčula image by PJL (Own work) CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/)

Some publishing systems I have developed

developed_floss

I recently went over some publishing systems with Yannis and Christos from Coko. We looked at various systems and discussed them. As we did this, I realised that I’d actually built quite a few! Although we weren’t just talking about those I had built, I began to think through what I had done right and wrong when building those earlier systems. Each development is a learning process and you will always get things both wrong and right at the same time. The trick is to get less wrong the next time round…

In the ten years that I have been building these systems, I have worked with some pretty talented people, including Luka Frelih, Douglas Bagnall, Alexander Erkalovic, Johannes Wilm, Remko Siemerink, Juan Gutierrez, Lotte Meijer, Fred Chasen, Michael Aufreiter, Oliver Buchtala, Nokome Bentley, Andi Plantenberg, Mike Mazur, Rizwan Reza, Gina Winkler and many others including the entire team of Coko – the most talented bunch of them all. Coko team:

Kristen Ratan – CoFounder, San Francisco
Jure Triglav – Lead PubSweet Developer, Slovenia
Richard Smith-Unna – PubSweet Developer, Kenya
Yannis Barlas – PubSweet Frontend Developer, Athens
Christos Kokosias – PubSweet Frontend Developer, Athens
Charlie Ablett – INK Lead Developer, New Zealand
Wendell Piez – XSL-pro, East Coast USA
Julien Taquet – UX-pro, France
Henrik van Leeuwen – Designer, Netherlands
Kresten van Leeuwen – Designer, Netherlands
Juan Guteirez, Sys Admin, Nicaragua

Alex Theg – Process, San Francisco

All systems except, unfortunately, one (see below) are open source.

FLOSS Manuals

top_read

The first publishing system I designed and built didn’t have a name really. It was the glue behind the FLOSS Manuals English site. FLOSS Manuals was, and is still, a community focussed on building free manuals about free software. I started the development in English but the system needed to be useful to a number of different language communities, of which Farsi was the most interesting. Today, only the French and English communities are still operational.

I built the FLOSS Manuals system in Amsterdam in about 2006. It was based on TWiki, an open-source Perl-based wiki. I chose TWiki because back then it was the only wiki around that had good PDF-generation support – I think this came from some of its plugins. TWiki had a good plugin and template system and I came to think of it as a rapid prototyping application – it was pretty malleable if you knew how. I was a crap programmer so I cut and pasted my way to a system that became usefully functional.

developed_twiki

I leveraged the account creation and permission systems of TWiki, and ripped out the wiki markup editing and replaced it with an HTML WYSIWYG editor (I think it was TinyMCE). So in wiki world terms I had committed something of a crime. Throwing out wiki markup at the time was unheard of in the post-2004 euphoria of Wikipedia. But JavaScript WYSIWYG editors were coming along just fine, and I thought wiki markup no longer made any sense (in fact it had been invented to make making HTML easier than writing raw HTML). But y’know… wiki markup was no easier than using a WYSIWYG editor, despite the sacred status of Wiki markup in the Wikimedia community. And despite my heresy,the Wikimedia did give me a barnstar for my efforts, which was nice of them:)

developed_star

After I reached the limit of my coding skills, a friend, Aco (Alexander Erkalovic), helped build the next bits. I found some money to pay him and that is when things started to move forward. I can’t remember all parts of the system, although it’s still in up and running for some FLOSS Manuals language sites. The core of the system was the manual overview page. This contained a list of all chapters in a manual. You could also set the status, add new chapters, view overall progress etc. from this one page. We had a separate mechanism for creating an ordered table of contents (index) for a manual.

developed_fmindex

Index builder, essentially a dynamic drag and drop mechanism for arranging chapters

developed_fm_overview

The overview page

Essentially, you added a chapter on the overview page and then opened the index builder and arranged the chapters. When saved, the (refreshed) overview page displayed the correct (new) order of chapters. We had to do it like this because back then, in the era of the ‘page refresh,’ it wasn’t possible to synchronise dynamic elements across multiple clients. So we couldn’t have one ‘shared’ index builder that would dynamically update all user sessions simultaneously. Nevertheless, it worked pretty well.

From the overview page, you could choose a chapter to edit. When you did so, you locked the chapter and, through some JS trickery Aco cooked up, we wereable to do this across multiple clients so everyone could see in real time who was editing what.

developed_blog_chapterlist

When editing a chapter, you could save the document and then, when ready, publish it. This way you could have an ‘in progress’ state of a chapter and a published state. At publish time the chapter was copied across to the system’s web delivery cache.

We also added chapter status markers, as you can see from the above image. It was pretty basic but nifty.

Next I hacked in a live chat, initially using IRC (freenode) for a global FLOSS Manuals-wide chat:

developed_irc

developed_irc2-irc-en

Then I hacked a fancy AJAX script into the chapter edit interface so each manual could have its own chat with the interface present while you were editing a chapter. It also looked a little nicer than the IRC channel. From the beginning, I tried to make everything look like it was meant to be there, even if it was a fiddly hack.

developed_adamwindow

FLOSS Manuals also had a lot of other interesting goodies. We had federated content, for example. This was established so one language site could import an entire manual from another and set up a translation workflow.

developed_xchange

We also had a simple side-by-side editor set up for translation that worked pretty well for translators.

developed_fm_metaphorIn addition, we had a remix system for generating new versions using mixed content from other manuals. This was useful for workshops and for making personal manuals, for example.

develped_remix_dropdown

developed_renixed2

One of the cool things about the remix is that you could output in many formats such as PDF and zipped HTML, add your own styles through the interface, PLUS you could embed the remix in your own website 🙂 The embed worked similarly to methods used today to embed YouTube or Vimeo videos in websites (by cut and pasting a simple snippet). The page delivery was ‘live,’ so any updates to the original manual were also displayed in the embed. I thought that was pretty cool. No one used it of course 🙂

developed_remix3

The system also had diffs so you could compare two versions of a chapter. It was good for seeing what had been done and by whom. In addition, it was possible to translate the user interface of the entire system to any language using PO files and a translation interface we custom built:
developed_translateBut by far the most interesting thing for me was building FLOSS Manuals to support Farsi. It was interesting because, at the time, no PDF-renderer I could find would do right-to-left rendering. Behdad Esfahbod advised me to just use the browser to do it. Leslie Hawthorn from Google Open Source Programs Office introduced me to him after I went on a naive search in the free software world for ‘someone that knew something about RTL in PDF’. I was very lucky. The guy is a generous genius. He just suggested an approach a new way to think about it and later I worked with Douglas Bagnall (see below) to work out how to do it. The basic idea being that HTML supported RTL and the PDF print engines rendered it nicely…so…it was my first realisation that the browser could be a typesetting engine.

Implementing RTL in the FLOSS Manuals system was so very tricky, and I was unfortunately on my own for working out Farsi regex .htaccess redirects and other mind-numbing stuff. Just trying to think in right-to-left for normal text did my head in pretty fast, but somehow I got it working.

developed_fm_farsi

Outputs of all language books were PDF and HTML, later also EPUB. I initially used HTMLDoc for HTML-to-PDF conversion. It was pretty good but didn’t really think like a book renderer needs to. This was my first introduction to the overly long wait for a good HTML-to-PDF typesetting engine. Later I found some money and employed Luka Frelih and then Douglas Bagnall to make a rendering engine separate from the FLOSS Manuals site (see Objavi below). Inspired by my chats with Behdad (above) Douglas introduced some clever PDF tricks with the Webkit browser engine to get book formatted PDF from HTML. I can’t remember exactly how he did it but essentially he used xvfb frame buffer to run a ‘headless browser’. In short, and for those that don’t know those terms, he came up with a very clever way to run a browser on a server to render PDF. It did it by rendering HTML to PDF in pages (using the browser PDF renderer) and then rendered another set of slightly larger blank PDF with page numbers etc and embedded one within the other. Wizard. Hence we were able to make PDF books from HTML. It also supported RTL 🙂 I think I need to say here that this whole process was immensely innovative and we did it on a dime. Also, because we refused to use proprietary systems we were forced to innovate. That was a very good thing and I welcomed the constraint and the challenge.

Later we also used WKHTMLTOPDF to make PDF. It worked and we worked with that for a long time. I even tried to start a WKHTMLTOPDF consortium with Jacob Trulson, the founder of the project, it got some way but not far enough (I am very happy WKHTMLTOPDF is still going strong!).

I played a lot with Pisa and Reportlab for generating PDF and finally cracked it when I started book.js (more on this below). Actually, when it comes down to it I think I used everything that wasn’t proprietary to make book formatted PDF from HTML. It was the start of a long love affair with the approach. I promoted this approach for a long time, even calling for a consortium to be formed to assist the approach:
http://toc.oreilly.com/2012/10/the-new-new-typography.html
http://toc.oreilly.com/2012/11/gutenberg-regions.html

As it happens, all this time later I’m still doing the same thing 🙂
http://www.pagedmedia.org

We integrated FLOSS Manuals with the Lulu API (now defunct). This allowed us to generate books automatically from HTML using Objavi (below) and they would be automagically uploaded to the Lulu print-on-demand marketplace for sale…that was amazing! Ah…but also no one used it. Lulu shut down the API as soon as they realised no one was using it.

We made many many manuals with this setup. Many of them printed from the auto-PDF magic and were distributed as paper books. Many of the books were in Farsi. The One Laptop Per Child community even built a FLOSS Manuals app that was distributed on all OLPC laptops with the documentation made with FLOSS Manuals.

developed_books

The system produced loads of manuals and printed books about free software. All free content.

developed_stripofbooks

The FLOSS Manuals platform didn’t have a name. It was a hack of TWiki. While you could get all the plugins online, it would have been a nightmare to deploy. I did deploy it many times but I essentially copied one install to another directory and then cleaned out all the content and user reg etc. and went to town rewriting all the .htaccess redirects. It sounds stupid now, but I spent a lot of time doing URL redirects to ditch the native TWiki URLs (which were extremely messy) and make them readable. Hence the system was a pain to deploy or maintain. It was feeling like we needed a standalone, dedicated, system….

Booki

Booki was the next step. It was clear we needed something more robust and also it was clear no web-based book production system existed, hence the hackery Twiki approach. So it was about time to build one. At the time, though, I remember it being very difficult to convince anyone that this was a good idea. I didn’t have access to publishers, and everyone else thought books were just soooo 1440. What they didn’t realise is that we were building structured narratives and that, IMHO, will have a lot of value for a long time. Call it a book or not. Anyway, we built Booki on Django (a Python framework) and the first time we used it was a Book Sprint for the Transmediale Festival in Berlin.

developed_collab

Aco literally would be building the platform as it was being used in the Book Sprint – restarting when everyone paused to talk. It was an effective trick. We learned a lot working with the tool and building it as it was being used.

At the time I couldn’t imagine book production being anything other than collaborative. FLOSS Manuals collaboratively built community manuals. Book Sprints were short events building books collaboratively. So Booki was meant to be all about collaboration. Booki also was run as a website (booki.cc) which was freely available for anyone to use.

develope4d_booki1

Many groups used it which was cool.

developed_booki2

Booki.cc run from with the OLPC laptop

Booki had all the basic stuff the FLOSS Manuals system had and we advanced the feature set as our needs became more sophisticated, but the basics were really the same.

developed_booki3

The main differences were that we had a dynamic ‘table of contents’ where you could add and rearrange chapters etc and the updated ToC would be dynamically updated across all user sessions. Hence the ToC became a kind of ‘control panel’ for the book.

developed_bookitoc

We also experimented with data visualisations of book production processes.

developed_civicrm

developed_seo

We did some cool stuff with the visualisations. For example, during the Open Web Book Sprint (also in Berlin) we worked in the Hungarian Embassy. They had a huge window that you could backwards project onto so people could see the projections on the street. So we made a visualisation of text from the book being overlayed as we wrote it. Below is an image shot with us standing in front of the projector…I mean..c’mon…how cool is that? 🙂

developed_hungary

developed_hungary2

Booki also had federation. You could enter the target URL of a book from another install and Booki 1 would make a portable archive (booki.zip) and send it to Booki 2. Booki 2 then unpacked the zip and populated the book structure complete with images etc. I liked the idea very much of using EPUB instead as a transport technology instead and was later able to do so for Aperta and PubSweet 1 (see below).

In general, Booki didn’t advance much further from the FLOSS Manuals set up. It kind of did ‘more of the same’. I think the only stuff that went further than the previous system was the dynamic table of contents plus it was easier to install and maintain. Having groups was also new, but that wasn’t used much. It was in some ways just a slightly different version of the previous system.

Booki was also used for producing a tonne of books.

developed_2books

developed_collabfutures

Objavi 1 & 2

Alongside the development of the FLOSS Manuals system and Booki, I headed up the development of Objavi. Objavi is basically a separate code base that was used as a file conversion workhorse. Objavi 1 was a little bit of a hacky maze but it did a good job. It would basically accept a request and then pipe that through a preset conversion pipeline. It did a good job. What I found most useful from this is that each step left a dir that I could open in a browser to inspect the conversion results. So if the conversion needed several steps, this was very helpful in troubleshooting.

Objavi 2 was meant to be an abstraction. However I don’t think it really got there, and Booki, which later became Booktype, came to internalise these conversion processes after I left the project. I always thought internalising file conversion was a bad idea because file conversion is resource-intensive, making it better to throw it out to another external service. And having a separate conversion engine enables you to completely overhaul the book production code without changing the conversion code. Hence FLOSS Manuals could migrate to Booki but still use the same external conversion engine. This was a HUGE advantage. Further, all the FLOSS Manuals instances, as well as booki.cc could use a single Objavi install for their conversions.

Objavi was actually also the magic behind the federated content in both the FLOSS Manuals system and Booki. All content would be passed through Objavi for import and export so Objavi became the obvious conduit for passing a book from one system to another. This gave me a lot of ideas about federated publishing which I have written about elsewhere and archived here.

Booktype

developed_booktype1

Booktype was really Booki taken to market. I was frustrated by not getting much uptake, so I took Booki to Sourcefabric in Berlin and headed up the development there. Booktype gained a UX cleanup. The editor was changed after an event I organised in Berlin called WYSIWHAT. The event was meant to catalyse energy around the adoption of a shared editor for many projects. It was of many things I did in pursuit of the perfect editor. At that event, we chose the Aloha contenteditable editor. I don’t think that was as useful a change as expected at the time, but back then contenteditable looked like the way to go despite there being few editors that supported it. Since then TinyMCE, CKEditor and others all have contenteditable support, further econtenteditable became a bit of a lacklustre implementation in browsers.

Booktype was literally the same code as Booki but rebranded. So many of the same features persisted for quite some time until Booktype eventually took on a life of its own.

developed_booktype2

Displaying ‘diffs’ (differences) between 2 different versions of a chapter

developed_booktype3

 

Activity stream of a book

I prototyped some interesting things in Booktype but not much of it got built. For example, I was keen on editing content in the style of the final output. The example below is using the CSS layout of Open Design Now which I mentioned below with reference to BookJS.

developed_booktype4

I also made a task manager prototype based on kanban cards but it was never integrated into Booktype.

developed_booktype5

I think there are only really three parts of Booktype’s development that occurred while I was in charge of the project. First was the integration of a short messaging system into a user’s dashboard and into the editor. Essentially you could highlight text in a chapter, click on the msg widget and a short message could be sent with that text to whoever you wanted (in the system). It was intended for fact checking or editing snippets etc. The snippets were loaded into an editor to assist with this kind of use case.

developed_booktype6

A good idea but seldom used.

In addition, Booktype supported groups, so users could form groups which were populated by users and books. The idea behind this was that you could form a group to work on a collection of books collaboratively.

Lastly, the renderer was integrated in a more sophisticated way so you had more control of the output from within Booktype. Essentially you could choose a number of output options and style them from within Booktype.

developed_booktype7

These were all interesting additions, but in the end, Booktype was really only Booki taken a step further as a product without offering much that was new. I think we should have probably actually removed a lot of things rather than adding new things that didn’t get used.

Booktype developers added some interesting stuff after I left. I particularly like the image editor and the application of themes to the content.

The image editor enables you to resize and effect an image from within the chapter editor.

developed_booktype8

The theme editor allows you to choose from an array of styles/themes and edit them.

developed_booktype9

developed_booktype_10

Booktype 2 has since been released. It has become a standalone business and is doing good work. Also, the Omnibook service is a ‘booki.cc’ online service based on Booktype 2.

Booktype has been used by many organisations and individuals to produce books.

developed_cryptoparty

developed_booktypeangola

I think Booktype 2 is good software but I didn’t particularly like the direction of the Booktype UX after I left the project – it was becoming too ‘boxy’ and formal. ‘Good UX’ is not necessarily good UX. So I eventually developed another system with a much simpler approach, specifically for using with Book Sprints (but it has had other uses as well). More in this in a bit.

book.js

One innovation, and a further exploration of using the browser as a typesetting engine, that resulted from my time with Booktype is book.js. Essentially I had been looking for a new way to render books from HTML using the browser. I wanted to understand how Google docs could have a page in the browser and then render it to PDF with 1-to-1 accuracy. Surely the same could be done with books? However, no one could tell me how it was done. So I researched and eventually found out about the nascent CSS Regions that would allow you to flow text from one box to another in HTML. I worked with Remko Siemerink at a workshop in Amsterdam to explore PDF production from CSS. We made an interesting book with some of the Sandberg designers.

developed_bookjs

After more research and breaking down what I thought could happen, Remko worked with CSS Regions (& js) to replicate the page design of a book called “Open Design Now.” It was amazing. He got the complex design down plus it was all just HTML and CSS, it looked like a page AND when printed from the browser it retained a 1:1 co-relation. Awesome.

The following summer I was able to employ someone for Booktype to work on the tech and I hired a developer (Johannes Wilm) to work on the PDF rendering. From there we eventually had BookJS that enabled in-browser rendering of books which could then be exported to print-ready PDF by just printing from the browser. Whoot!

developed_contents

After time, BookJS (original site still available) could also formulate a table of contents etc. It was, and is, pretty cool and IMHO is the right way to do this. Unfortunately, Google Chrome decided to discontinue CSS Regions so if you now want to use BookJS you have to use a very old version of Google Chrome. However, better technologies have come along that support the same approach, namely Vivliostyle (which Johannes later worked on).

Github Editor

During the time with Booktype, I also did some experiments in other processes. One was using Github as the store for an EPUB and using the native zip export that Github offered to output EPUB (since EPUB is just a zip formatted in a specific way). Juan Gutierrez put together a quick demo and I published about it here:
http://toc.oreilly.com/2013/01/forking-the-book.html

Sadly the demo is no longer available but later a good buddy, Phil Schatz, adopted the idea and built something similar and (I think) much better:
http://philschatz.com/2013/06/03/github-bookeditor/

PubSweet 1

Since Booktype was going its own way, I wanted to develop a new, lighter, system for Book Sprints. Hence Juan and I worked out PubSweet, a simple PHP-based system. This would later be reformulated as a JavaScript system (see below).

developed_pubsweet1

PubSweet was very simple. Essentially 3 components – a dashboard, a table of contents manager, and an editor. It would later evolve to include more features but it was really the same as the systems I had developed earlier, though simpler. I wanted to get to a cleaner interface and bare basic features. I also wanted to retain some book elements, hence the table of contents manager looks a little like a book table of contents (except the garish colors ;).

PubSweet 1 is still in use by the Book Sprints team and functions well. It uses BookJS for rendering paginated books and for PDF rendering straight out of the system. It can also generate EPUB directly. Apart from that, it is pretty simple and effective. I used a basic card-based task manager for this, based on an earlier prototype I made when working with Booktype. It was a simple kanban type board but we never properly integrated it.

We included annotation using Nick Stennings Annotator project:

developed_pubsweet2

PubSweet 1 has been involved in producing more books than I can count, for everyone from OReilly books to Cisco, to the World Bank, UNECA, IDEA etc etc etc

developed_oreilly

developed_labcraft

 

developed_cisco

developed_f5

PleigOS

Somewhere along the way, I developed a simple system for creating Pleigos – one-page books created by folding a single piece of paper which has text and images. The system places text and images so that when you fold a single page after printing, a small book is formed. It was more an artistic experiment than anything but it was fun.
http://www.pliegos.net/

Note: the Pleigos project was by my good friend Enric Senabre Hidalgo, I just developed the initial Pleigos software.

Lexicon

The UNDP approached me to build software for developing a tri-lingual lexicon of electoral terminology. The languages to be supported were English, French, and regional Arabic. It sounded interesting so I used PubSweet as the basis for this.

The main difference between Lexicon and PubSweet was that you could choose to create a chapter from 2 different types of editor. Editor 1 (‘WYSI’) was a typical WYSIWYG editor. This was used for producing chapters with prose. The second type of editor (‘lexicon’) allowed you to create a list of terms, each with three different translations – English, French, and Arabic. This opened my eyes to the possibilities of having different types of editors for different content types, a strategy I hope to use again.

developed_lexiconThe ‘lexicon’ editor in action

The final print output looked pretty good:

developed_lexicon2

The system enabled a dozen or so people from different Arabic regions to discuss translations and work collaboratively through a list while at remote locations. We built a specific discussion forum for them as part of the system and had up and down voting etc. I liked this project a lot because it was very different from any other project I had worked on yet it employed many of the same strategies.

Aperta

Along the way I was approached by John Chodacki of PLOS to build a Journal system for them. I knew nothing about scientific journals, so I went through a process of re-education 🙂 It sounded interesting! I spent a year-and-a-half heading up a team to design and develop this system. This is pretty much the first time I wasn’t scraping together pennies to build a system.

developed_aperta

Journal systems aren’t that different from book production systems, in fact doing this project helped me realise that the production of knowledge, in general, follows a particular high-level conceptual schema:

  1. produce
  2. improve
  3. manage
  4. share

Each artifact (journal article, chapter, book, issue, legislation, grant application etc) follows its own kind of path, with its own unique processes, through these four stages. I have written more about this elsewhere in this blog so won’t go into more detail about it here.

Research articles, (Aperta wasn’t initially intended to deal with Journal Issues which are a collection of articles) come into a Journal as (predominantly) MS Word. They then need to follow a process of checks (eg. to make sure the article is right for the journal etc) before going through the hands of various editors (handling editors, academic editors, etc) and reviewers, including a back-and-forth with the author(s) and a final pass through a production team and/or external vendor to prep the files for publication. The biggest difference I found from my previous experience with book production systems is that there is more to-and-fro involved. Simply put, there is more workflow. So Aperta addresses this with a simple workflow engine based on the Trello/Wekan kanban card-based model.

Aperta was built in Rails with Ember JS. The front end was pretty much decoupled from the backend. It was pretty ambitious and we decided to use the Wikimedia Foundations Visual Editor at the heart of the project. It was the most sophisticated editor available at that moment. I have come to learn that you have to take what you have at the time and work with it. We committed to adopt it and make contributions. I hired the Substance.io chaps (Michael and Oliver) to work on the editor. They did a great job. (Subsequent development of their own editor libs has enabled us to work with them for Coko (more below). )

Aperta’s approach was to simplify the submission process that was managed by Aries Editorial Manager. We leveraged the OxGarage project for MSWord-to-HTML conversion and imported the manuscripts into Aperta. Then workflow templates could be set using kanban cards (as per above). These cards then enabled a flexible method for gathering information (submission data) from the authors at submission time. A lot of time was spent on getting good clean, simple, UX. The kanban card system was very modular – the cards themselves were their own applications, enabling anyone to build a card to their own requirements. This made the cards extremely powerful as they were essentially portable applications.

Further, I developed a model of sorting a lot of cards into columns much like Tweetdeck’s saved-columns model. It was all rather neat. A ton of innovation. Unfortunately, I don’t have any screenshots apart from what I see PLOS has on their website:
http://journals.plos.org/plosbiology/s/aperta-user-guide-for-authors
http://journals.plos.org/plosbiology/s/file?id=d45b/Aperta-Quickstart.pdf
http://journals.plos.org/plosbiology/s/aperta-user-guide-for-reviewers

Aperta is in production now for PLOS Bio but unfortunately the code still isn’t available.

I think Aperta radically rethought how Journal workflows might be managed. I learned a lot through this process and it informed decisions and architecture for the next platform I was to work with – PubSweet 2, developed by Coko.

I also think taking a step into another realm where many concepts were transferable but the use case was different was very good for me. It helped me abstract a few notions. It was also good to build a sophisticated framework in another language and to have the resources to try things out. It was an excellent incubation period where I learned a lot while building a pretty good platform.

During this period I also developed a kind of Objavi 3 but I’m not sure now what it is called or if it is used.

PubSweet 2

The following systems are a result of a team of very talented people at Coko. This has been the first group of people that I have worked with that enable the systems to get close to what I believe is an ideal state for the problems at hand, the time we live in, and with the technology available. If I have learned one thing over the past years, it is that, generally speaking, development teams often prevent good solutions from happening. The Coko team is a rare case where the solutions can flourish and become what they need to be. I’m very lucky to be working with such people.

PubSweet 2 is not actually a publishing platform, it is a de-coupled JavaScript framework that enables the production of multiple publishing platforms.

developed_pubsweet_2dev

With all the other platforms I have been involved with I have always learned something. So the next platform is always better. Strangely enough, because when I look back I remember, for example, thinking that Booki was the best platform ever, and perhaps the best I could do. But not so. Each subsequent platform has been much better. So… why not build a framework that enables the rapid development of many platforms at once? good idea! Imagine what you could learn… Indeed!

Using PubSweet, we have developed two platforms to date. Science Blogger and Editoria. For the purposes of this blog post I’ll just focus on Editoria as Science Blogger is more of a reference implementation. However, all these front components can be developed independently of each other, and independent of the core framework. Hence we have released some PubSweet NPM modules (sort of like front end plugins) online:
developed_node

If you would like to read more about PubSweet check here:
http://slides.com/eset/ismte

Editoria

https://editoria.pub/blog/

developed_editoria

Editoria Book Builder component

What is interesting about Editoria is that I didn’t design it. I facilitated the staff at the University of California Press to design it. And what is really interesting is that it is a better book production software than any of the above that I did design…. That’s not to say that I finally realised I was a crappy designer 😉 , rather I came to realise that given the right guidance and parameters, the people with the problems can solve the problem with more deep understanding and nuance than I could as an outsider to their processes. It was a great thing to realise.

In general, Editoria follows the same model as PubSweet 1. It is a simple collection of 3 interfaces – dashboard, book builder (in PubSweet 1 we refer to it as a table of contents manager) and a chapter editor. Very similar to PubSweet 1 and also these components fit into the conceptual schema I mentioned above of:

  1. produce – this happens outside the system and we convert authored MS Word files to HTML
  2. improve – styling, editing, reviewing, all occurs in the editor component
  3. manage – basic workflow managed by the Book Builder component
  4. share – when done we export to book formatted PDF, and EPUB

Editoria is a very simple and elegant solution. It’s the only one of the systems I have worked with that is not in production but I’m looking forward to seeing it up and running soon.

It uses the Substance.io libraries to build a custom made and elegant editor. Finally with some real $ to spend on moving this field forward, and as part of my 2015 Shuttleworth Fellowship I committed $60,000 USD (10k a month) or so to Substance, Coko followed it by a similar amount, so they could focus on the development of their libraries to 1.0. Coko also put considerable effort into founding a consortium around Substance (although I’m not convinced it is really working well yet).
http://substance.io/consortium/

Editoria brings some nice implementations to the table including the use of Vivliostyle to render PDF from HTML. Plus MSWord-to-HTML conversion, front and back matter divisions plus body content. Pagination information (left/right breaking) etc. In addition, we are building into Editoria various tools in the editor including its own annotation system and a host of publisher-specific markup options for different styles (quotes, headings etc). Coko has employed Fred Chasen for some part time work to contribute to the Vivliostyle development (although we are still working out what we should work on).

In many ways, Editoria is the system I always wanted to be involved in. It is better than any other system I have seen or been involved in and this is a result of two critical factors:

  1. the technologies have matured, including Vivliostyle and Substance.io, that enable critical solutions to be solved
  2. the people who need the system have designed it

There is more to come from Editoria, so stayed tuned…

INK

INK is like Objavi on steroids. It is possibly an Objavi 4 😉 but done the right way. INK is a Rails-based web service which is primarily built to manage file conversions. However, it can actually be used for the management of any job you wish to throw at it. These jobs are what we call steps, and steps can be compiled into recipes. For example, you might have a step ‘convert MS word to HTML’ and then a second step ‘Validate HTML’. Hence INK enables you to chain together these steps.

develop3ed_ink

Additionally, these steps are plugins written as Gems. A gem is a Ruby-based plugin architecture. Accordingly, you can develop gem steps and distribute them online for others to use. We hope in time there will be a free (as in beer) market around these steps.

Summary thoughts

I think I learned a lot from writing this personal ‘sense making’ piece. I didn’t realise just how strong some of the themes were that I pursued until a wrote them down…for example, I pursued collaborations from inter-organisational consortiums a number of times and none of them have worked. Huh. Interesting. I’ve also seen various practices evolve from an idea to being mature thoughts, prototypes, workable solutions and eventually, eventually, adopted. But over many more years than I expected. Also interesting.

It is also obvious that there are some big ticket technological problems that still need to be solved for good to really move forward. They are mostly there but still in need of work, the top two being:

  1. browser as typesetting engine
  2. sophisticated editor libraries

I add a third which I haven’t noted much in the above. I have worked on this since Aperta onwards and it is necessary only in the publishing world where content is authored in the legacy ‘elsewhere’ (ie MSWORD):

  • reusable, sensible, MSWord to HTML conversion

This has been solved many times but has not yet been solved well.

These all need to be open source solutions. At Coko we are trying to move all these on and we are making good contributions. We, as a sector, are nearly there. Although it would be good to work more together to solve these problems by either making contributions to the existing efforts, or using their technologies. This is really the only way things move forward.

Finally, in the tech world people sometimes quote “Being too far ahead of your time is indistinguishable from being wrong,”. I’m not sure who said it but when I look through the evolution of technologies I have written about above, seeing various things evolve from idea to a production implementation many years later, I’m also thinking that perhaps sometimes waiting might be indistinguishable from being right 🙂