Learning Node

It’s that time again… time to learn something new. On my radar right now is Node – the environment for executing JavaScript on the server. It’s a long time since I gave up learning to program. I built several systems in Perl and PHP and tinkered in Python. I also learned some JavaScript a long time ago…as I’m writing I am also remembering I did a few projects in Visual Basic and RealBasic (Mac)… eeeek…. those were… err… the days… the days, that is, when I couldn’t afford to pay someone to do it properly. I would describe myself as a cut’n paster, not a programmer.

However, Node really moves the game on. It has been around for a while now but everything interesting I see online these days is more than likely to be using Node in some capacity. So, after having recently finished a reasonably schedule-heavy job, I have time to work on my own projects and start tinkering again. I’m really enjoying it. I don’t want to be a programmer, but I find that gaining a working knowledge of interesting new technologies really helps me imagine what is possible and talk pseudo-intelligently with programmers about ideas.

Node looks to be one very interesting option for whatever I do next. I’m using Talentbuddy to get a start. They have some good free starter tutorials. I’ll try them out and see where I get. Having said that I have, just this minute, discovered learnjs , so it looks like there’s no shortage of learning opportunities. Awesome.

publish.js

A list of some interesting Open Source Javascripts (mostly JS) that hold good possibilities for knowledge production and publishing.

Content Production

Lens
License: FreeBSD
Code: https://github.com/elifesciences/lens
WWW: http://lens.elifesciences.org/about/

Tangle
License: MIT
Code: https://github.com/worrydream/Tangle
WWW: http://worrydream.com/Tangle/

Flatsheet
License: MIT
Code: https://github.com/flatsheet/flatsheet
WWW: http://flatsheet.io/

Realtime Markdown Editor
License: TBD (emailed dev)
Code: https://github.com/scotch-io/node-realtime-markdown-viewer
WWW: https://scotch.io/tutorials/building-a-real-time-markdown-viewer(tutorial)
Demo: http://realtimemarkdown.herokuapp.com/

PlumbJS
License: MIT & GPL
Code: https://github.com/sporritt/jsplumb/
WWW: http://jsplumbtoolkit.com/

Ice
License: GPL 2
Code: https://github.com/NYTimes/ice
WWW: http://nytimes.github.io/ice/demo/

Annotator
License: MIT & GPL v3
Code: https://github.com/openannotation/annotator/
WWW: http://annotatorjs.org/

Space.js
License: MIT
Code: https://github.com/gopatrik/space.js
WWW: http://www.slashie.org/space.js/

MathJax
License: Apache
Code: https://github.com/mathjax/MathJax
WWW: https://www.mathjax.org/

KaTeX
License: MIT
Code: https://github.com/Khan/KaTeX
WWW: http://khan.github.io/KaTeX/

Velocity.js
License: MIT
Code: https://github.com/julianshapiro/velocity
WWW: http://julian.com/research/velocity/

JuxtaposeJS
License: Mozilla
Code: https://github.com/NUKnightLab/juxtapose
WWW: https://juxtapose.knightlab.com/

TimelineJS3
License: Mozilla
Code: https://github.com/NUKnightLab/TimelineJS
WWW: http://timeline.knightlab.com/

StoryMap
License: Mozilla
Code: https://github.com/NUKnightLab/StoryMapJS
WWW: https://storymap.knightlab.com/

Chartbuilder
License: MIT
Code: https://github.com/NUKnightLab/Chartbuilder
WWW: http://quartz.github.io/Chartbuilder/

Soundcite
License: Mozilla
Code: https://github.com/NUKnightLab/soundcite
WWW: http://soundcite.knightlab.com/

Pagination

BookJS
License: AGPL
Code: https://github.com/booktype/BookJS
Web: none

Cassius
License: AGPL
Code: https://github.com/MartinPaulEve/CaSSius
Web: https://www.martineve.com/2015/07/24/getting-started-typesetting-with-cassius/

Vivliostyle
License: Apache
Code: https://github.com/vivliostyle
Web: http://vivliostyle.com/

BookJS Polyfil
License: AGPL
Code: https://github.com/BookSprints/bookjs-polyfill
Web: none

Typography

Hyphenation

Sweet Justice
License: BSD
Code: https://github.com/aristus/sweet-justice
WWW: http://carlos.bueno.org/2010/04/sweet-justice.html

Hyphenator
License: LGPL
Code: https://code.google.com/p/hyphenator/downloads/list
WWW: https://code.google.com/p/hyphenator/

Hypher
License: BSD
Code: https://github.com/bramstein/hypher
WWW: http://www.bramstein.com/projects/hypher/

Font Resizing

FlowType.js
License: MIT
Code: http://github.com/simplefocus/FlowType.JS
WWW: http://simplefocus.com/flowtype/

Squishy
License: unspecified (argh! please include a license file!)
Code: https://github.com/lemonmade/squishy
WWW: http://cmsauve.com/projects/squishy/

Fittext
License: WTFPL
Code: https://github.com/davatron5000/FitText.js
WWW: http://fittextjs.com/

SlabText
License: MIT
Code: https://github.com/freqDec/slabText/
WWW: http://freqdec.github.io/slabText/

Responsive Text
License: MIT
Code: https://github.com/ghepting/jquery-responsive-text
WWW: n/a

Line Spacing

Typeset
License: unspecified (argh… )
Code: https://github.com/bramstein/typeset
WWW: http://www.bramstein.com/projects/typeset/

Kerning

Kern.js
License: WTFPL
Code: https://github.com/bstro/kern.js
WWW: http://www.kernjs.com/

Kerning.js
License: MIT (license file is in the wrong place – check the README)
Code: https://github.com/endtwist/kerning.js
WWW: http://kerningjs.com/

TypeButter
License: CC-BY-SA 3.0
Code: https://github.com/hudsonfoo/typebutter
WWW: http://typebutter.com/

Drop Caps

DropCap.js
License: confused (All rights reserved and apache??)
Code: https://github.com/adobe-webplatform/dropcap.js
WWW: http://blogs.adobe.com/webplatform/2014/10/02/drop-caps-are-beautiful/

Color

Color Font
Licence: WTFPL
Code: http://manufacturaindependente.com/colorfont/media/js/colorfont.js
WWW: http://manufacturaindependente.com/colorfont/

Font Tricks

Lettering JS
License: WTFPL
Code: https://github.com/davatron5000/Lettering.js
WWW: http://letteringjs.com/

jqisotext
License: MIT & GPL
Code: https://code.google.com/p/jqisotext/downloads/list
WWW: http://workshop.rs/2010/01/jqisotext-jquery-text-effect-plugin/

Arctext.js
License: MIT
Code: http://tympanus.net/Development/Arctext/Arctext.zip
WWW: http://tympanus.net/codrops/2012/01/24/arctext-js-curving-text-with-css3-and-jquery/

Blast.js
License: MIT
Code: https://github.com/julianshapiro/blast
WWW: http://julian.com/research/blast/

Base Lines

Baseline.js
License: WTFPL
Code: https://github.com/daneden/Baseline.js
WWW: n/a

Baseline CSS
License: CC-BY-SA 3.0
Code: http://baselinecss.com/download/baseline.zip
WWW: http://baselinecss.com/

HUGrid
License: GPL
Code: http://bohemianalps.com/tools/grid/HeadsUpGrid_download.zip
WWW: http://bohemianalps.com/tools/grid/

CSS Stuff

Tufte CSS
License: MIT
Code: https://github.com/daveliepmann/tufte-css
WWW: http://www.daveliepmann.com/tufte-css/

Other Resources
This and That

https://github.com/yumyo/js-type-master
http://stateofwebtype.com/
http://nytlabs.com/projects/editor.html
https://www.youtube.com/watch?v=VbCqFQ1sTYQ
http://dat-data.com/
https://github.com/sharelatex/clsi-sharelatex
https://anmolkoul.wordpress.com/2015/06/05/interactive-data-visualization-using-d3-js-dc-js-nodejs-and-mongodb/
http://webapplayers.com/inspiniaadmin-v2.2/cssanimation.html
http://yusugomori.com/projects/css-sans/fonts
http://codepen.io/juliangarnier/pen/idhuG
https://github.com/Modernizr/Modernizr/wiki/HTML5-Cross-Browser-Polyfills
https://github.com/WebComponents/webcomponentsjs

Some Good Reading

https://medium.com/@mql/self-host-a-scientific-journal-with-elife-lens-f420afb678aa
https://medium.com/@
mql/centralised-vs-decentralised-publishing-626055376c81
http://book.pressbooks.com/

Platforms

Booktype
License: AGPL
Code: https://github.com/sourcefabric/Booktype
WWW: https://www.sourcefabric.org/en/booktype/

PressBooks
License: not specified 🙁 emailed org
Code: https://github.com/pressbooks/pressbooks
WWW: http://pressbooks.com/

PubSweet
License: Apache
Code: https://github.com/BookSprints/PubSweet
WWW: none

More coming…

Building Book Production Platforms p5

Workflow

Much more to come.

Most of the book production platforms in circulation have very little workflow tools to speak of. This is not necessarily a bad thing. A platform that is ‘just an editing environment’ is still pretty powerful. If you do need tools to assist with workflow, then in situations where a small group know each other well they can use email or, if in real space, Post-it notes or paper to track what needs to be done next. In many cases, a live chat in the interface, or integrated topic-based forum, will be enough to satisfy many workflow needs, and in other cases the platform can be augmented by external systems such as wikis, online spreadsheets, content management systems and other tools to meet particular requirements.

However, there are a number of situations where these ‘solutions’ become unsatisfactory. This is especially true for organsiations which have a large number of people involved in processing content, or which have sophisticated content-processing needs (such as book publishers).

Before going too much further, let me clarify what “workflow tools” are. In the broadest sense, they are tools that help you to know what needs to be done, and when it needs to be done by. Using this very broad definition, we can see that mechanisms such as discussion forums and live chats are workflow tools. By chatting with colleagues through a live chat or forum, you can work out what needs to be done next, or get a ‘notification’ (a shout out) that it needs to be done now… From there, systems can evolve into complex technical environments which are either relatively open-ended (such as Trello) or relatively closed, such as hard-coded workflow pipelines.

The first book production system I built for FLOSS Manuals was ‘built’ on top of Twiki in 2006-2007, had some basic workflow tools, namely:

  • a basic live chat
  • a dropdown status-selector for marking chapter statuses (needs content, needs images, finished, and so on)
  • notifications in the table of contents when someone is editing a chapter
  • a mailing list where efforts could be coordinated

blog-notif-en-1

These tools were simple and effective and served us well for a number of years. I also incorporated similar mechanisms into Booktype and PubSweet. In addition, when we used these platforms for Book Sprints, lots of whiteboard scribbles and Post-its were utilised.

whiteboard_scribbles

nameless

In a Book Sprint, notably, the facilitator is the main coordinating workflow mechanism. I point that out because it is important to understand that workflow tools can include humans – often the easiest way to know what needs to be done and when is to be done by, is to get someone else to tell you.

sprinters

And let’s not forget that human factor! We are living at a time when we tend to want to programmatically solve problems with overly prescriptive technical systems. But sometimes underdetermining the technical systems is the right way to go.

I first tried pushing past these basic software workflow tools with Booktype – a book production system I founded, now housed with Sourcefabric. I leveraged the kanban idea of multiple columns (phases) populated by ‘todo’ items to build the equivalent of a digital kanban system, making the first simple prototype in a demo for the Frankfurt Book Fair in 2012. The inspiration came from Pivotal Tracker and the Open Source Fulcrum.

Most often the technology used to set up a kanban system is a whiteboard, with marker pens to draw and label the columns, and Post-it notes as a marker of the tasks. This kind of system is popular in unconferences, and also often used by software development houses. We also use this type of kanban approach a lot in Book Sprints.

booktype

The task manager (as I called it) and the production system were linked to each book and worked nicely. Although this system didn’t make it into the core code of Booktype, this version got the idea across, and later Juan Gutierrez made an integrated version for PubSweet. (During 2014, I also built this idea into a system for PLOS).

The task manager used a whiteboard-like interface in which the user could use to create columns (phases). Cards could be added to each phase and simple notes kept on each card. It was simple but effective.

In time I discovered Trello, and Why Cards are the Future of the Web by Paul Adams – these examples placed cards nicely within evolving design paradigms of the Internet, and I started to think about this model in more detail.

There are many advantages to cards, not the least being that cards can ‘follow the user’ – think of them as powerful work-unit-applications that can be accessed by a user within any context where they are needed.

cards_web

Additionally, when thinking of digital cards within the digital workflow-kanban paradigm, the nice thing is that it is a very simple model. There are essentially just 2 elements – cards and columns. You can create as many of each as you like. Further, you can name the columns and cards anything you like. That means these two devices can be used to represent any number of simple or complex workflows. You can start from the kanban default – three columns marked ‘to do’, ‘doing’ and ‘done,’ and add cards for each task – progressing them from left to right as tasks progress from ‘to do’ to ‘done.’ This is the default configuration when creating a new Trello board.

Replicating this system in an application is pretty easy to do. Trello is an excellent example. While Trello is not easily integrated into another technical system (such as an in-house publishing system), it is interesting in that the designers, while surely tempted by all that a web application could offer, have endeavoured to keep the Trello system true to the kanban ideology of useful but simple. With Trello, therefore, you can add columns, and cards to columns, naming each as required. When you open a card, however, you have some nice widgets for making lists, comments, discussions, attaching files etc. This is something paper cannot easily do, at least not with the small real estate afforded by Post-it notes.

Trello is a lovely application precisely because these systems, like the paper kanban, have been designed to be simple to use and serve as many generic use cases as possible.

However,while digital kanban systems like this are useful as standalone ‘context agnostic’ systems, they could be much more powerful for publishers (or anyone) if this simplicity and flexibility could be preserved while the system also served their specific use case. The trick is to preserve the simplicity and flexibility to allow publishers to model existing and future workflows in an easily ‘grok-able’ drag and drop manner (similar to Trello), while building cards that reflect the publisher’s specific needs (to invite editors, push content to external vendor services, perform peer review etc).

Building cards like this, means pushing cards away from the Trello/kanban generic-use paper metaphor towards a more sophisticated specific-use digital and networked paradigm. This means embracing the idea that cards are networked applications and building cards that precisely serve the publisher’s needs and integrate into their existing internal and external systems.

The Four Phases of Knowledge Production

Knowledge production consists of four basic phases – manage, create, process, and share. When designing knowledge production systems, it pays to keep these four phases in mind and to build platforms that have an eye on this high-level abstraction. These phases can be linear dependencies, overlap and/or be concurrent.

‘Thinking from above’ can help us to better understand where the needs of each phase might be placed within the system we are designing.

Manage – when producing a knowledge asset, there needs to be some management of the context. This mainly includes things like the setting up and processing of users, roles, permissions, and groups. However, ‘manage’ can also be an appropriate way to think about the ways users access content. A dashboard is a management interface, as are interfaces to create a Table of Contents (book) or keep track of a Collection (eg a set of related journals). The usefulness of thinking of management needs in this way is that it really highlights that a dashboard (consider Google Drive as a dashboard) and a Table of Contents interface for managing an online book production system are the same category of interface. They might have a different treatment according to the use case, but they facilitate much the same kinds of activities.

Create – content needs to be created, so we need content-creation interfaces. We are familiar with these: think about blogging and where you write your posts – that is a content creation interface; now think about a book production system where you write a chapter – that is also a content creation interface. These interfaces both belong in the same high level abstract container.

We need to think of content creation interfaces at a high level of abstraction when designing these kinds of systems for many reasons.

First, it is useful to think about how components may be re-used. What, for example, is the difference between an interface that is used to create a blog post, and one used to create a chapter? For a great deal of use cases, the answer is nothing. This re-use approach is taken by PressBooks, for example. PressBooks uses WordPress as a book production platform (note: I don’t agree this is a good idea, as WordPress is an entire suite best suited to blogs and not books, but I am pointing out the similarity of the content production needs at a very high level).

Second, it is interesting to ask ourselves what kind of content we are trying to produce and whether we have the right type of content production interface. Think, for example, of a book production system. All (except one) of the online book production tools I am familiar with have one kind of content creation interface – a WYSIWYG editor attached to a blank page. You use the editor to fill up the page until your chapter is done.

But what if you need to produce a glossary? Most book production systems use the same tool. That doesn’t seem like a good idea. Glossaries have very specific needs – users need to be able to sort, create new items, perhaps even translate terms into other languages and sort by those languages etc. A ‘waterfall’ cascading content creation interface (WYSIWYG and a blank page) doesn’t meet that need very well. What if you wish to produce 2 page spreads (in the case of paper books), or an index ? …then imagine that instead of books, we are are looking to produce annotated data sets.

We need to liberate ourselves from the one-size-fits-all approach to content production, and placing these needs in a high level abstract container allows us to think of the needs and not the UI.

Process – one thing I have come to learn through working in STM publishing, is that publishers add value to content by improving it. That is pretty much what every publisher does. In my mind, we should be re-framing publishers and calling them processors. After all, making content public these days is a doddle. And the difference between ‘self-publishing’ and ‘publishing’… is the processing bit. Self- publishers (generally speaking) do not have access to the same level of experienced and useful processing that can improve a work. As the day goes on we will have less and less value for the ‘publishing’, and in the case of STM, the long wait to making science public is already an impediment to progress.

The processing of content is an important part of knowledge production. If we want good knowledge, we need to be able to bring into play all those people and (sometimes) machines at the right moment to improve the work. That is essentially what workflow is all about. Processing is a high-level abstraction for workflow. It’s important to consider processing at a high-level abstraction because far too many systems build hard-coded workflow pipelines into their platforms, and that retards the opportunity to reconsider, optimise, and even radicalise workflows. I also consider other UI elements such as discussions to belong in the ‘process bucket’. Discussions are workflow and they are often the only type of mechanism that can account for the high level of specificity and issue resolution required on a per-knowledge asset (eg issues for each manuscript, chapter etc) level.

Share – formerly this was getting the book to the bookshop, but under current conditions this is something else entirely. Sharing works in the age of digital assets and network communications is all about file formats, APIs, and syndication. Avenues for sharing and the requirements of this process are prolfierating daily and the needs can be so complex that any system built to manage ‘sharing’ needs to be extremely flexible.

Thinking about sharing at a high-level of abstraction helps us with this enormously. For example, ingestion of .docx to an HTML-based knowledge production system is actually an act of sharing. It consists of file conversion and feeding the result into the production system for others to access. That ‘feeding’ of content into our production system, is a type of syndication. And the next ‘export’ of the finished product to some other system (eg a book sales system), requires a further process of file conversion (to the target book format) and feeding that format into the target system. Exactly the same high-level process as ‘ingestion’. They both belong to the ‘sharing’ bucket.

So, we can see that actually ‘import/ingestion’ and ‘export’ are actually the same thing. Consequently, we can save ourselves a lot of effort by building a framework in any knowledge production system that will ‘do both’ (ie recognise that there is no difference between import and export, and manage both rather than building redundant parallel processes).

The beauty of this ‘conceptual schema’ for knowledge production is that we can apply it to a wide variety of use cases to understand the knowledge production process at hand, and the variation and similarities between any of them. For example, Book Sprints traverse each of these 4 phases, as does a typical book from a publisher, as does a Wikipedia article, as does a grant application to a funder. Thinking of the process that way helps us see where the variance is – and helps us to better focus on designing for the needs of each. This suggests that a really good, efficient, single system can be designed to enable the production of a vast range of knowledge types, and accommodate apparently different processes which were formerly housed in standalone single use case platforms.

Colophon: written in Piha after walking on the beach, then cleaned somewhat by Raewyn. Written using Ghost blogging software (free software!).

Why Persistent Identifiers are the Wrong Idea

I think I will rewrite this. It seems to me it’s only half the solution…more coming shortly.
This afternoon I was reading Elizabeth Eisenstein’s “The Printing Revolution in Early Modern Europe” when I came across this passage:

To consult different books, it was no longer so essential to be a wandering scholar...The era of the glossator and the commentator came to an end, and a new "era of intense cross-referencing between one book and another" began.

The point here is that after the printing press came about, there were more books available. An obvious point. As a result, cross-referencing became a feature, since it was possible to access and, consequently, reference other literature more readily.

This made me think about the current discussions around persistent identifiers for scholarly content. It seems the current solution is to offer a layer of indirection: this enables a stable identifier to persist, and should the ‘actual location’ of the content be changed, then we can re-configure the redirect to point to the new location.

Martin Fenner and Geoff Bilder point to this solution in their very good postings on this topic. However, this method does not overcome the real problem. What if either:

  • no one updates the redirection after the location has changed
  • the content really goes offline (through loss of domain, for example)

It appears to me that we have the wrong solution. There is really no way to solve this issue with URIs. We can only minimise it.

So, how to go about resolving this (so to speak).

One way to get some insight into the issues is to wind back the clock and look at the way content was located in the age of the newly-born printing press. In this age, scholars were liberated because they didn’t have to wander the world looking for a particular book. Instead, identical printed copies proliferated, and it was just a matter of finding a copy of the work you were pursuing. Preferably you found a copy in a library or shop nearby. To find that work, one merely needed the cross-reference information to track it down. That “persistent identifier,” comprised of author’s name, book’s title, page of reference for the quoted material, publisher’s name and location, publication date, was commonly referred to as a citation (we still use that term). The citation helped the reader or researcher find a copy of the work cited. Not a particular printed book, but a copy of that book.

So, how is it that in an age of digital media we have gone backwards? Where copying something is even easier than in the printed age, why are we still pointing to ‘one’ authoritative copy? In essence, we are still referencing a book by stating the exact book that sits in a specific institution, on a particular shelf, with the blue (not green) cover.

It feels a little like the great leap backwards to me.

A way to get around this problem would be simply to allow and encourage content to be copied. Let digital media do what it does best – copy and distribute itself. The ‘unique identifier’ would then not be an URL (with a layer of indirection) but would take the form of a checksum or hash. Finding the right work would then be a matter of searching for a copy of the material with the right checksum.

I don’t know. I’m probably missing something. But it seems we have no problem tracking down YouTube videos when they spawn into the ether. We can also tell one version of software from another, no matter where it is and how it is labelled. Why not just let the content go and provide mechanisms to find a specific version of the content via hash search (also solving the issues of versioning URIs)?

Colophon: written on a lovely Sunday afternoon in the Mission. Adam got up Monday morning with that nagging feeling… rethought it. Talked to Raewyn. Rewrite coming. Written using Ghost (free) software.

Its not US and THEM, its a TEAM, stupid

hi y’all

I have just been reading some posts on Scholarly Kitchen about content creation and the next wave of authoring systems.

It seems the STM sector has long been in need of developing a solution to get their publishing processes out of various traps. The most obvious trap is MS Word. A horrible format, to be sure, but it has long been the default file format for manuscript production, with a small tip of the hat to LaTeX for the technologically gifted. Their recent discussions have mainly been about online ‘authoring systems,’ going beyond MS Word to anticipate documents that are fully transparent to whatever combination of machine and human interactions play a part in understanding and processing the information.

This ‘get-out-of-MS-Word-free’ card is a very attractive proposition. MS Word is basically a binary blob to most publishing systems (even though in actual fact the format of .docx is XML -thereby also abruptly ending the false argument that XML inherently brings structure). As a ‘binary’ (go with me on this for now) MS Word is not transparent to the publication system, there is no record of when the author has worked on it; finding out what they have done since version xxx.xxx and version xxxx.xxxxxxxx is very difficult; nobody else can work on it when the author is also working on it, and there is no control over structure etc etc etc

So getting away from reliance on MS Word is the aim. But getting into (what I might rename as) an ‘authoring only‘ platform – is not the solution.

What is interesting about the SK forum, is that there seems to be a very clear distinction in the minds of publishers between the worlds of the author and the publisher. Most of the comments make this split, and there is much talk of ‘authoring systems’.

It seems a little bizarre to me, as I don’t think it’s wise to think about the author and the publisher as being distinct entities. It’s not a matter of author and publisher working on separate processes to shepherd a manuscript through to publication: it is very much a team effort. Authors and publishers work together in a way that should not be dichotomised: they are a team.

If we don’t acknowledge that, then we will not be able to design good publication systems. There is a lot of unclear thinking around this topic at the moment. The “authoring system” model assumes that content is made in an authoring system by a writer, and then migrates to the publisher’s submission, processing and publishing system, where the publisher does some stuff, and then at various times pings the author back to make changes to metadata, submission information, the manuscript and attendant assets (eg figures)…

In this model, next the author takes the manuscript out of the publisher’s system, ingests to the old authoring system, works on it, exports it, and re-ingests it into the publisher’s system… Hmmm…this cycle is exactly one of the pain points we were trying to avoid by getting away from Microsoft Word.

It seems to me that the current trend to build better authoring systems is a mistake. It is based on the false assumption that ‘MS Word’ is the problem, without realising that there is more to it. Word has been seen as the problem only because it has been the only problem in town. We don’t need better ‘authoring systems’ that repeat the separation between writing and publishing that is inherent in reliance on MS Word. We shouldn’t invest in new authoring systems and believe in them purely because they are ‘not Microsoft Word’. Rather, we need documents to be contained within submission and processing systems for the entire duration of their life, and they need to be completely operational and transparent within that system to all parties that must work on them. Without understanding that need, we are merely mitigating the problem by small steps whilst fooling ourselves that we have solved the larger problem.

We don’t want the author-publisher response/change cycle (a collaborative effort by the team which includes author and publisher) to be in separate systems. We want them working together in the same system. We need teams to work together in the most efficient way possible, and that is in the same (real world- or cyber-) space. Teams work best when they work in the same *space.

Though I see the current efforts towards authoring system development to be interesting, unless they are integrated with processing and workflow features, they will sooner or later be made redundant.

Colophon: written by Adam in 30 mins in a tizz. Tinkered with by Raewyn for another 30 mins. Written using Ghost software (free software!)

Single Voice

version 1.0 ‘not as raw’

During a Book Sprint, or when talking about Book Sprints, the question very quickly arises – ‘what about the author’s single voice?’

The fear is that collaboratively produced books will lose that personal, individual voice that we know so well from all the books we have read and loved.

Wouldn’t Frankenstein be a little lumpy if it was written by a collective? Same goes for any Tom Clancy book (he famously said that “Collaboration on a book is the ultimate unnatural act”). Clancy’s books are not high art, but they do seem to contain a particular ‘Clancy’ style. What about good contemporary literature? Could, for example, the wonderful The Art of Fielding be as wonderful if written by anyone other than Chad Harbach? And what about poetry by the father of English literature – Chaucer? It’s unimaginable that his works could be produced by anyone other than Chaucer.

We believe that both high and low literature would suffer if the works weren’t produced by a single author. There is only one Chaucer, one Clancy (thankfully), one Harbach, one Mary Shelley. We can tell their works apart because each contains a distinctive authorial voice. We know these writers. We know those voices.

We can only imagine what a mess would be created if books were written by more than one person. They would lose the single point of view. That special perspective. That special voice.

Well… first of all, it might be worth knowing that each of these examples actually had more than one contributing author, and each in its own interesting way. From Erick Kelemen’s work in the forensic field of textual criticism, there is good evidence that both Byron and Percy Shelly had a hand in at least some of Frankenstein. According to Kelemen, the extent of the collaboration is not exactly known, and we need to be aware that the discussion is also tainted by a good ole sexist lens. However, there is good evidence of collaboration, not just in the Preface (which some say is written entirely by Percy Shelly), but also in the content of the rest of the story.

Tom Clancy, in his own mind the enemy of collaborative book production, actually collaborated with others on many of his books. Some of the books he has credit for were actually written mostly by others, a common practice amongst authors of best-selling thriller and mystery series for at least the past twenty years.

And in fact, manuscripts produced at the time Chaucer was writing were shared documents, and it is extremely likely the exact words that we now consider to be Chaucer’s were not his at all. As Lawrence Liang has noted, in his discussion of the process of Chaucer’s canonisation, the process was essentially a gathering of manuscripts after Chaucer’s death by experts who decided which words were, and which were not, Chaucer’s, for all time.

In the disclaimer before the Miller’s Tale for instance, Chaucer states that he is merely repeating tales told by others, and that the Tales are designed to be the written record of a lively exchange of stories between multiple tellers, each with different, sometimes opposing, intents.

Interestingly, Chaucer seems not only to recognize the importance of retelling stories, but also a mode of reading that incorporates the ability to edit and write.

If you want to understand the role of collaboration in single-author-culture right now, there is no better story to read than The Book on Publishing which provides a great tale about the publishing of Harbach’s The Art of Fielding and acknowledges the huge value an editor can play in re-writing and restructuring a book.

There are two points here to keep in mind.

Firstly, we don’t know much about how books are written, nor how models of the writing process have changed over time. Paper is not a good medium for preserving versioning, and we lack an on-paper-process mechanism like git blame that can backtrack to show how the text was created. A great pity. The lack of this kind of tool for the vast majority of publishing history means publishing has been able to propagate the very marketable myth of the single author. Collaboration has been obscured and de-valued. Worse, the extent and value of collaboration is not understood. We don’t even have a good language for talking about it.

Secondly, we are left believing claims such as “books have a single voice because they are written by a single author” when this is demonstrably false. Almost every published book has had at least two authorial contributors – the author and the editor; and most books will have been improved during the drafting process by the contributions of test readers.

Collaboration exists to improve works. It is why there are editors in publishing. Editors give feedback and shape the work to, amongst other things, strengthen the impression of the single authorial voice. It is very probably true that an effective single voice can only be achieved by 2 or more people collaborating.

So next time you find yourself asking “how can an authoritative singular voice be preserved in collaborative book production?” it might be better to take a deep breath and ask yourself “how could a single voice ever be effectively realised without collaborating?” That is the real question at play.

Colophon: version 1.0 Written in an hour by Adam Hyde. Raewyn Whyte then improved it (‘made it stronger’). Also, some references still need to be checked as the needed books are in storage in NZ somewhere! Written with Ghost Blog free software (MIT) https://github.com/tryghost/Ghost.

Fantasies of the Library

Fantasies of the Library is a book released last week by Berlin publisher k-verlag. There is an interview in it with me about the future of book publishing beyond the proprietary model. I also talk about my current work for the Public Library of Science and the relationship between Open Access and Open Source.

fantasies_cover

The full interview is also online and can be read here.

My favourite passage is this:
Charles Stankievech: “But why should one value open source and open access? What are the political ramifications of such a philosophy and practice?”

Adam Hyde: “Because both provide more value to humanity. Political ramifications are vast and complex. I like to think about the personal aspects of this choice, however. Living a life of open source and open access forces you to peel away layer by layer the proprietary way of thinking, doing, and being that we have all grown up with. It can be a very painful process, but it’s also extremely liberating and healthy. Largely, it actually means learning to live without fear and paranoia of people ‘stealing your ideas’. That’s quite a freedom in itself.

Books are Evil, Really Evil pt1

Right now books are something of an ironic artefact for me. I am involved in the rapid production of books through a process known as a Book Sprint. We create books. We throw a bunch of people in a room for a week, and carefully facilitate them through a process, progressing them step by step, from zero to finished book, in 5 days or less.Write a book in a week?! An astonishing proposal. Most people who attend a Book Sprint for the first time think it is impossible. Create a book in a week?! Most think that maybe they can get the table of contents done in that time. Maybe even some structure. But a book? 5 days later they have a finished book and they are amazed.

There are many essential ingredients to a Book Sprint. An experienced Book Sprint facilitator is a must. A venue set up just so… Lightweight and easy-to-use book production software. A toolchain that supports rapid rendering of PDF and EPUB from HTML. Good food… A writing team… and a lot more.

One of the contributing factors to success is the terror caused by the seemingly impossible idea that the group will create a book. It is a huge motivator. Such is the enormity of the task in the participants’ minds that they follow the facilitator and dedicate themselves to extremely long hours, working on minute details even when exhausted. There is a lot of chemistry in there. Camaraderie and peer pressure are pushed to maximum effect as a motivational factor, as is fear of failure, especially fear of failure before your peers, both inside and outside the Sprint room. The pleasure of helping your peers is a strong motivator, as is the idea that together we will do this! But the number one motivator is the idea that we are going to produce a book.

We all know that books these days, paper books, are published from a PDF. You send a PDF to the printer, and the final output is a perfect bound book. This happens for most Book Sprints – we send the final PDF to a printer for them to produce the printed book. So what we are creating is actually a PDF (along with an EPUB) …but imagine if we were to call the event “PDF Sprint”. At the beginning of the PDF Sprint we could announce that we have gathered everyone together…so that…at the end of the week…they will have….(gasp!)…a PDF!

Nope. Doesn’t work. Doesn’t even nearly work. A book is the seemingly impossible outcome that Book Sprint participants have come to conquer. Even though the definition of ‘what a book is’ is completely up for grabs, it is abook they are determined to produce. A book is the pinacle of knowledge products, and writing a book is about equal in cerebal achievements to climbing Everest. A PDF is merely getting to base camp, or perhaps the equivalent to planning the trip from your armchair.

So, what’s the problem? Books are good then! A great motivator for Book Sprints. Where exactly is the irony? How can I complain?

Book Sprints are extraordinary events. The people are not just put into a room and left to write. They are led through a process where notions of single authorship and ownership of content just no longer make sense. Such ideas are unsustainable and nonsensical in this environment, and participants slowly deconstruct ideas of authorship over the 5 days.

The participants actively collaborate during the event. Really collaborate. Book Sprints are a kind of collaborative therapy. Each participant learns to let go of their own voice so they can contribute to constructing a new shared voice with the rest of the team. They learn new ways to contribute to group processes, to communicate, to improve each other’s contributions, to synthesize, to empower and encourage others to improve the work without having to ask permission.

The resulting book has no perceivable author. It has been delivered by what is now a community. And as a result, most of the books, about 99% I would say, end up being freely licensed. A book born by sharing is more easily shared. More easily shared than a book created with the notions of author-ownership. The idea of sharing is embedded in the DNA of the Book Sprint, part of the genesis of the product, and sharing more often than not becomes part of the life of the book after the Book Sprint is completed.

But books are evil

So, how is it possible I can take the position that books are evil? Where exactly is the irony? It is a lovely story I just painted. Lots of flowers and warm fuzzy feelings. Wow. Sharing, sharing, sharing… it’s a book love-in!

Well… with some regret, I have to admit that most books do not come into the world this way. They are produced and delivered through legacy processes. Cultural norms shape the production and reception of books, and the ideas contained within them are not born into freedom. These books are, normatively, created by ‘single author geniuses’, born into All Rights Reserved knowledge incarceration, and you cannot recycle them.

Try as we may, we are a little group of people. A small band of Book Sprinters, and it is unlikely that we can sway the mainstream to our way of doing things. We have many victories – Cisco released one of its Book-Sprinted books freely online! Whoot! That’s massive! But… as big as Cisco is, one Cisco book in the sea of publishing is merely a grain of salt in the Pacific. By adding our special grain of salt to this ocean we are by no means making our point more salient.

Books are doomed to be the gatekeepers of knowledge. If you make a book, you are, more than likely, sentencing the words in it to life + 50 years (depending on where you live).

Books are in fact the very artefacts that maintain proprietary knowledge culture.

It comes down to these three issues for me:
1. books gave birth to copyright
2. books gave birth to industrialised knowledge production
3. books gave birth to the notion of the author genius

These three things together are the mainstays of proprietary knowledge culture, and proprietary knowledge culture has been firmly encased and sealed, with loving kisses, between the covers of the book. Ironically these three things, through the process of the Book Sprint, are what we are trying to deconstruct.

many thanks to Raewyn Whyte for improving this post