The below is an excerpt from a chapter on facilitation for the forthcoming book on the Cabbage Tree Method (collaborative product design). The image is by the wonderful Pepper Curry.
EFFECTIVELY FACILITATING THE DESIGN SESSIONS
Facilitation is a key ingredient for the Design Sessions. In fact, no group should attempt to use the Cabbage Tree Method or run a Design Session without a facilitator.
The facilitator is part of your open source team and should not be part of (if possible) the user group that needs the solution. This neutrality frees the facilitator, as best as possible, from existing ideas, and organisational politics and dynamics. In turn, this enables the facilitator to read the group clearly and interact freely.
You will need post-it notes, a large whiteboard and markers (or large pieces of paper).
FACILITATION TIPS
Facilitation is both an art and a science. It’s difficult to master both aspects of facilitation. If you’re new to facilitation then it’s best, if possible, to watch experienced facilitators in action and try some small experiments first. I strongly advise against learning facilitation from a book. There’s no substitute for experience.
Having said that, here are some things to remember when facilitating Design Sessions. They’re intended to help you formulate an approach to this process. They’re not intended as a facilitation ‘how to’:
Build trust before products – As a facilitator, your job is to build trust in you, the process, the group, and the outcomes. Each time you make a move that diminishes trust you’re taking away from the process and reducing the likelihood of success. You must put building trust, working with what people say and towards what people want, ahead of building the product. Trust that the product, and a commitment to it, will emerge from this process, and it will be better for it. I am thankful to Allen Gunn for this wisdom made very kindly to me many years ago.
Change is cultural, not technical – Too often people propose technology as the mechanism for change. Technology doesn’t create change. People create change, with the assistance of technology. Make sure the group is not seeing technology as a magic wand. The group must be committed to changing what they do. The what they do shouldn’t be posited as a function of technology. It’s their behavior that will always need to change, regardless of whether there is a change in the technology they use. Behavioral and cultural change is a necessity, not something that might happen. The group needs to recognise this and be committed to changing how they work. This also means that they must commit to using the solutions you design together.
Even the best solutions have problems – Understand and embrace that even the best solutions, including the one you’re representing, will have problems. There is no perfect solution. If there is then, inevitably, time will make it imperfect.
Leave your know-it-all at the door – If you’re part of a product team, then you may be expected to be somewhat of a domain expert. It’s not possible to leave that at the door, of course, but you should be careful that you don’t use this knowledge as heavy handed doctrine. Instead, your domain knowledge should manifest itself in signposts, salient points, and inspiring examples at the right moments. These points should add to the conversation. They shouldn’t override the conversation. Don’t overplay your product knowledge and vision. You’ll lose people if you do.
Give up your solution before you enter the room – If you’re part of a product team, it’s a mistake to enter into a Design Session and advocate or direct the group towards your product or your vision for a solution. If the group chooses a technology or path you don’t have a stake in, then that’s fine. The likelihood is that they’ll choose your technology or approach, otherwise you wouldn’t be asked to be in the same room with the group. However, if you facilitate the process and drive them to your solution you’ll be reading the group wrong and they’ll feel coerced. You must free yourself from this constraint and be prepared to propose, advocate, or accept a solution that’s not yours or which you didn’t have in mind when you entered the room. This will not only free you up to drive people to the solution they want, but it will earn you trust. When the group chooses your solution, and chances are they will, then they’ll trust it because they chose it and that reflects their trust in you and the process. If they don’t choose it, but you have done a good job, then they’ll trust you and come back another time when they’re ready for what you have to offer. They’ll also advocate you to others.
Move one step at a time – Move through each step slowly. The time it takes to move through a step is the time it takes to move through a step. There is no sense in hurrying the process, as doing that will not lead to better results or faster agreements. It will, in all likelihood, move towards shallow agreements that don’t stick and ill-thought-out solutions that don’t properly address the problem. The time it takes to move through a step will also give you a good indication of the time you’ll need for the steps to come. Be prepared to adjust your timelines if necessary, and to return to earlier unresolved issues if need be.
Ask many questions, get clarifying answers, ask dumb questions, the dumber the better – Try not to ask leading questions. Keep asking questions until you get clear, simple answers. Break down compound answers, where necessary, into fragments and drill down until you have the clarity you need. Dumb questions are often the most valuable questions. Asking a dumb question often unpacks important issues or reveals hidden and unchallenged assumptions. It’s very surprising how fundamental some of these assumptions can be, so be brave on this point.
Reduce, reduce, reduce – The clearest points are simple ones.
Get agreements as you go – Double check that everyone is on the same page as you go. Get explicit agreement, even on seemingly obvious points. Total consensus is not always necessary or possible, but general agreement is.
Document it all – Get it down in simple terms on whiteboards or using whatever tools you have available. At the end of sessions, commit what you’ve documented to digital, shareable media.
Summarise your journey and where you are now – At the end of a session, it’s always necessary to summarise in clear terms the journey you have taken and the point at which you’ve arrived. Get consensus on this. It’s also useful at various points throughout the session to do this as a way of 1) illustrating you’re all on the same journey, and 2) reminding people what the session is all about. In the summary, make sure to illustrate that you understand the key points. Bring out points that may have taken a while to get clarity on or that you may have initially misunderstood. There will be many of those points if you’re doing your job well! Doing that lets the group know you’re embedded with them and not merely guiding them towards your own predesigned end game.
Use their semantics – Use the group’s terminology to describe the problem and the solution. Don’t impose your own semantics. Remember, they are the experts. Use their language. If, instead, you use your terms in the process then inevitably people will become confused. If a term doesn’t exist for something, ask them to define it.
Remember, facilitation is an art of invention – At the very least, facilitation is an act of translating known patterns into a new context and tweaking those patterns to fit that context. But the reality is that much of the what a facilitator does involves making things up. Instinct and the fear of failure force the on-the-fly invention of methods, but a good facilitator makes it appear that the method has existed for a hundred years and has never been known to fail. When it does fail, a really good facilitator will ensure that no one notices and that the outcomes were the ones you were after.
I’m in New Zealand for a few weeks so I’m working and enjoying the summer. Good thing is that the timezones overlap perfectly so I can be in touch with San Francisco in the morning, and in the evening work with the team members in Europe. In between I either work on a book about the design methodology (the Cabbage Tree Method) or go surfing! Perfect!
Recently I have had two streams of negative energy focused on me by others. One was a troll that wished to damage my reputation, and the other has (recently) been negative comments made in response to a post I wrote about diversity in open source.
I think I am getting better in handling this kind of attention. Spending time to consider the issues and the negative responses has, interestingly, led me to reflect on and, consequently, to strengthen my position. Hence in some ways my response has been ‘unequal and opposite’ to the negative energy. Instead of hurting me, as this kind of negative strategy intends, it has helped me to learn a lot about what I am doing and why, and to commit to my position more energetically in a way which is positive in both response and result.
I might actually be growing up. Gasp! (not without the help and patience of many wise friends 🙂
I’ve been thinking about Open Source and diversity. It is true that open source, in my experience, is largely populated by white men who can program. I think this would come as no surprise to most people reading this post. However, I have been stimulated to think about it a little more due to some very odd reactions on an email list I am subscribed to when I, very gently, pointed this out. The reactions to my comments were extremely negative and very personal. Over the last few weeks, in reaction to points I have made on the list (related to diversity and possible new models for open source projects), I have been accused of racism, questioned about my nationality, asked if I speak English, made fun of, and it has been suggested many times that the kind of work I am doing is fruitless.
I find these points say a whole lot more about the persons making them and the associated dominant power dynamic than anything else. Never-the-less, as a result of these comments, I have been prompted to reflect on, and refine, my position regarding diversity in open source.
Essentially, I see the lack of diversity as a problem on so many levels. I am disappointed, on a human level, that there isn’t a greater diversity of ethnicity, and gender, involved in open source projects. I also lament the lack of role and skill diversity in open source projects. I believe this lack of diversity hurts us as humans and hurts the effectiveness of open source. I wish to see a greater diversity in open source.
I am a white male and I am part of the dominant power demographic. I don’t feel guilty for being so, but I do believe, as someone that is part of the dominant power group, that I have a responsibility to address the imbalance. So, if I wish to see greater diversity in open source, then I should do something about it.
After pondering what I can do, I have come to two points, and they reflect how I see the yin/yang personal – political nature of this issue:
I will endeavor to help produce greater diversity – in part I am doing this through my work with the Cabbage Tree Method. Odd as it may sound, I very much believe that the Cabbage Tree Method is not ‘just a design methodology’ but it is a template for a different model of open source organisation, one that requires, at its core, a more inclusive project culture.
I take responsibility for my own networks – when I work on interesting projects I often bring in good people I know who do great work in the area. I have been very lucky to have met many wonderful people that also have great talents, and projects like Coko and Book Sprints reflect that. However most of the people I know come from a pretty homogeneous demographic, at least in terms of ethnicity. So, I need to take responsibility for that and diversify my networks.
In a way, there is nothing new here. But I needed to think this through a little to work out my own position regarding the issue of diversity. How I can better achieve the actions above is something I will need to work on (continually). I’ll ponder it more, and I welcome any thoughts you might have on the issue that can help me on my way.
After posting the possible introduction chapter to a book on the Cabbage Tree Method to a list full of Open Source luminaries, I got some great feedback, but I was also ridiculed by some and called racist by a couple of others. The racism allegation was in response to my statement that in open source I have observed “the dominant power being held by white men who know how to code.”
Amazing the reactions you get from pointing out the obvious and very lightly advocating a reasoned alternative to the current models of open source projects.
So, I’m reading a little about diversity in Open Source. Some great materials I am currently pondering include:
You will note from the first link that there is some suggestion that women, for example, constitute only 1.5% (max 10%, its unclear) of open source project contributors.
The following is an update of the opening chapter for a book about the Collaborative Product Design method I developed. It is meant to be ‘thought provoking’. Feedback is very much welcomed…
Open source is broken. Or, more precisely, it’s partially broken. It works for solving problems developers have, but it doesn’t generally work so well for fixing problems that (so-called) users have.
The part that works is where the famous ‘itch to scratch’ model comes into play. If you’re a developer and you have a problem which could be fixed with some new code, then you’re in a good position to solve it. This model has been thought to work because the developer has the coding skills to create software, so they are in a good position to solve software problems. But I’d like to suggest that while writing code is a necessary condition for creating software, it is not the critical reason this model works. The critical reason is that in these cases the developer is the user. Hence the developer understands the problem in depth since it’s their problem.
This is where open source excels. This is why we have such a proliferation of open source ‘under the hood’ libraries and tools for developers. In fact, it’s nearly impossible to be a developer these days and not use open source tools and libraries other developers have made. The itch to scratch model has worked for developers solving the problems they and, consequently, other developers, have.
But outside of developers solving their own problems, open source has largely failed. There are few convincing ‘user facing’ open source solutions that can beat out their proprietary rivals on approach, utility, and usability. This is because we have incorrectly concluded that the ability to develop software is the same as the ability to solve any problem that involves software. But understanding the problem, developing an approach to the solution, and developing software, are different skills. Open Source culture has not, by and large, recognised this and we have instead conflated these three conditions into one – the ability to develop software. To move on, we need to recognise the non sequiteur inherent in the logic of open source culture and develop models to address it.
The strength in open source is in people fixing their own problems (itching their own scratch). It works because these people bring insights into the problem that no one else has. Consequently, they come up with great solutions. The strength is not developers solving problems, but people solving their own problems. That is why developers do a great job fixing developer problems. The model fails when we extrapolate the developer solving developer problems model and assume that those who can create software can solve any problems where software may be required. We have conflated writing software with solving problems that require software and we miss the point, the fundamental strength, of open source. Open Source succeeds not because of code, but because of people solving their own problems.
This may sound counter intuitive. Open Source’s success is not because of code? But it’s not that difficult to understand. Simply put, software is a means to an end. We have focused too much on the means (developing software) and forgotten that it is not what we are doing. We are actually trying to solve a problem for someone, somewhere. Software, open or closed, is no good if it doesn’t solve that problem. It is completely useless. Open source’s primary goal is to solve problems, not create code, even though software is the dominant means to get there.
Not understanding this has lead Open Source down any number of non-productive wormholes and has lead to a technocentric culture that doesn’t understand its own strengths and limitations. We have been myopic in our approach to problems, seeing all problems as merely issues developers have not yet solved. It has led to terrible cultural issues for open source, including the over-reliance on ‘technical thinking’, the dominant power being held by white men who know how to code, the understanding that software freedom is merely a licensing issue, and a lack of understanding of wider issues of empowerment, the belief that developer tools (such as git) are good for any problem, if only we knew how to use them, the requirement in open source projects that all those involved should use the tool in the developer’s stack, the lack of role diversity in open source projects and the power imbalance that accompanies this, the use of the term ‘non-coders’ to name all skilled people that are not developers, the absence of patience for the ‘user,’ and the distillation of the user (the one with the problem) to the furthermost point from the center of the endeavor.
All of these issues have lead to a lack of cultural diversity, shortcomings on approaches to the many interesting problems the world has to offer open source, and, simply put, many bad open source solutions.
We need to address this issue and fast. Open source is fast losing out in ‘user space.’ We need to turn the corner and understand quickly that the real strength that open source represents is when people solve their own problems. When the user brings insights into the problem that no one else has. When they are deeply involved in scratching their own itch. This is when great solutions evolve, as open source has already proven by the many amazing, innovative, world-beating solutions developers have created for themselves. But since we’ve mistakenly conflated ‘person with the problem’ with ‘developer’, we’ve missed the value of our own proposition.
What are we to do with this? The answer is simple: always have the people with the problem at the heart of the open source project. While the answer is simple, the implications are huge. They include the need to diversify participation on open source models, to push the developers out of the central roles they currently occupy in open source projects, to tear down technical meritocracy as the single determinant of value, and to experiment with new models of open source cultures.
This is why the Cabbage Tree Method exists. It’s a process designed to assist open source projects to develop world-beating, user-facing software. It’s also a process that changes the typical open source model substantially and advocates, by example, for a fresh approach to how open source projects are created, constituted, and operated.
Vivliostyle advertises itself as a “publishing workflow tool.” It is a great tool but this isn’t a terribly accurate or descriptive byline since publishers can’t really use it without changing everything else they do. Hence my preference to refer to it as an HTML pagination engine (although I used to refer to this process as Browser Typesetting). It can be part of a publisher’s workflow but only after they have transformed everything else they do to an HTML-first workflow.
On this point, I think Vivliostyle have their sales pitch wrong but it is hard to know how they might do it better since Vivliostyle imagines, as do many other tools, a radically different way of doing things to how publishers operate now. Vivliostyle is one necessary part of an approach that is, in effect, a paradigm shift in how publishers work. This ‘new way’ of working is exciting and transformative, but it is also hard to capture paradigmatic changes in a few catch phrases. So, to understand the value of Vivliostyle, you first have to understand how publishers work now, why it is a terrible way to work, and what HTML-first workflows can offer. Once you understand that, you can see how exciting Vivliostyle could be for publishing.
So… what does it do? Vivliostyle is the latest in a long list of tools that, among other things (more on these in later posts), can enable the transformation of HTML to PDF. These tools have been typically used to produce PDF for printing – book formatted PDF. There are a few of these softwares out there but most are proprietary and only a handful of them have been Open Source (notably BookJS that I was involved in a long time ago, and CaSSius). Vivliostyle is the latest in this family tree, the root category of which I would describe as an HTML Pagination Engine.
What it does is this – it takes an HTML file and paginates it. It flows the HTML through the ‘boxes’ (pages) you have defined and lays the content out nicely in each box, one box after the other, flowing all the content through it with the right margins (and more, see below). So instead of scrolling through the web page, you page through the document on screen. Vivliostyle converts the HTML into ‘pages’. It is an HTML pagination engine.
It is then possible to do a lot more with these pages, including adding page numbers, using CSS (the style rules used by browsers) to define the look and feel of text and images(etc), adding headers, notes and footers etc. In other words, you can add to each page everything you need to make the result ‘look like a book’.
You can do this transformation in the browser because Vivliostyle is written in JavaScript. If you want to see this in action, check out their demos page. For example, look at this page showing the raw HTML of a book by Lea Verou (published by O’Reilly), then open this page (give it a minute to render). Now right click and print (best in Chrome or Chromium browsers – you may need turn margins off and background images on when printing). The result should give you a one to one co-relation of the paginated content in the browser (which is HTML) to the paginated print-ready PDF.
Since the browser can also print PDF, then you can take this newly styled ‘book looking’ HTML and print it to PDF from the browser. From this, you have a book-formatted PDF that is ready to send to the printer to be printed and bound. I’ve worked this way for many years and printed many books this way for organisations from the World Bank to Cisco. The system works and the printed books look great. Vivliostyle is, at this point, the most sophisticated open source tool for doing this.
It is pretty amazing stuff. From HTML to PDF to printed book at the press of a button. Magic. This process is also catching on. Hachette produce their trade paperbacks using an HTML-to-PDF rendering engine with styling via CSS. So it is no longer a process reserved for small experimental players.
So, why is this interesting? Well, it is interesting almost without an explanation! Transforming HTML in this manner seems pretty magical and it’s kinda neat just to look at the demos and marvel at it. However, the really interesting part comes into play when we start talking about workflows. This is where Vivliostyle can be part of transforming publishing workflows.
As a publisher, if you were to take Vivliostyle ‘out of the box’ it would not be of much use to you. How many web pages do you have that you want to turn into PDF? Not many, if any. How much of your book or journal content in your current workflow is stored in HTML? Probably none. In all likelihood, HTML in your business is restricted to your website and, perhaps, EPUB if you produce them (EPUB is just a zip archive containing HTML files and some other stuff). But in most publisher’s workflows the EPUB is an end-of-line format. Publishers take the completed copy in MS Word, or (sometimes, regrettably) PDF, and send it to a vendor (typically in India) to transform into EPUB and send back. So chances are, HTML is not the format being used as the basis for your book or journal workflows (apart from possibly being an end-of-line format).
As a publisher, you don’t have manuscript copy in HTML, so Vivliostyle is not going to fit snuggly into your workflow. In order to utilise it, you need to transform the way that you work. You have to start working in HTML or, more difficultly, work in some format that can transform into a very tightly controlled HTML output so that Vivliostyle can work with it.
I don’t like the latter style of workflow. This is where you work in something like XML (of some sort) and then transform to HTML at the end of the process. It’s ugly workflow, not friendly for non-techie users and typically full of workflow redundancy. If you want good HTML, then just work with, and in, HTML. This comes with additional benefits since from good HTML you can get to any format you want PLUS you have the advantage of now being able to move your workflow into the browser. And this is where Vivliostyle fits into a toolset, an approach, that could transform how you work – the HTML-first production environment.
The current ‘state of the nation’ in publishing is pretty terrible. Most publishers use MS Word docx as their document format, and Track Changes and email are their primary workflow tools. This means that there is a single document of record – the collection of Word files. These are shareable, in the sense that you can email people copies, but you cannot have multiple people simultaneously accessing them at the same time. In effect, the MS Word files are like digital paper in the worst possible way. There is only one ‘up to date’ version and only one person that can work on that version while they hold onto the files. There is no easy way to follow a document’s history, revert to specific versions, or identify who made what change when. Further, there is no inherent backup strategy built into standalone MS Word files. Everything must be done manually. That means organizing the files in directory structures with naming conventions that are known by only those in the know (since there is no standard way of doing this). There are also problems with email as a collaboration tool. Did it send? Did they get it? Did they get the right version? Plus there is no way of understanding the status of the documents unless you ask via email or there is some other system that is manually updated for status tracking. The system is not transparent. Further, changing workflows when using systems like this, even for small optimisations, is quite difficult and the larger the team the more difficult it gets.
Additionally, using Word and email like this is really placing unnecessary gateway mechanisms on the content. If I have the up-to-date versions then you can’t have them or work on them. There is really only one copy and only one person can work on it. That makes for linear workflows and strongly delineated roles. No one can ‘jump in and help’ and it is very difficult to alter the linearity of the process or redistribute the labor to achieve efficiencies.
If publishing is to move on, then workflows need to migrate to the browser. With browser-based workflows, there is no need to have multiple copies of the same file, versioning is taken care of as is document history, it is easy to add and remove people from the process, and labor can be better distributed over both roles and time to create more elegant, efficient, workflows. I wrote about this in an earlier post and will write more in posts to come since there is a lotmore to it. But suffice to say that publishing workflows to the browser is a little like ‘sucking all the gaps’ out of the current Word-email workflows (plus a whole lot of other benefits). No more checking your Inbox while you wait for status updates or someone to send you the files so you, and you alone, can work on the next little part of the process while everyone else waits. Additionally, there can be full transparency as to what needs to be, has been, and is being done (and by whom). There is the opportunity to break down larger tasks into smaller tasks and have them all in play concurrently. There is the opportunity to share the same tools and hence enhance communication and redistribute the work to where (who) it makes most sense. There is so much to be gained.
This is not to say that browser-based workflows are ‘anything goes’ workflows (which is what most publishers think this way of working amounts to). You can still assert rules of who has access and when. But… in my experience, when you migrate workflows to the browser then publishers start rethinking how they work and you often hear comments like “but we don’t need to do it like that anymore’…They then start designing radically better workflows themselves.
So, the point of all this is that Vivliostyle by itself does not achieve this. It is not, in itself, a workflow tool for publishers. You first need all the other things that enable an HTML-first workflow to be in place and once they are there, then you can utilize Vivliostyle to transform the HTML (at the push of a button) to the PDF you need for printing. That is the radical improvement Vivliostyle can offer. Cut out the file conversion vendors and render the content according to templated style sheets (automated typesetting can produce beautiful results). This means you can check what the book will look like at any moment, plus the CSS stylesheets you use can also be included in your EPUBs (also rendered at the push of a button since the original content is already in HTML, the content filetype for EPUB) so your printed book and the EPUB look the same.
So, Vivliostyle is a necessary tool for HTML workflows and with an HTML workflow you will radically improve what you do.
This is why Vivliostyle is important to publishers but you cannot consider it isolation. You must consider it with regard to migrating to an HTML first workflow. If you migrate to this kind of workflow then not only will you experience the efficiencies described above but your organisational culture will be transformed and the types of content you can then produce will become a lot more open ended. This is the vision that Vivliostyle, and other tools that enable HTML-first workflows (including those developed by UCP and Coko), are imagining and building towards.
Dear reader, out of principle, I do not use proprietary social media platforms and networks. So, if you like this content, please use your channels to promote it – email it to a friend (for example). Many thanks! Adam
I’m currently writing a book about (what is now called) Collaborative Product Design – the methodology I designed to facilitate ‘users’ to design their own software. Scott Nesbitt is helping me produce the book as are Raewyn Whyte, Julien Taquet and Pepper Curry.
Scott asked me to write an intro to the book… so, here is my first attempt. I’d appreciate feedback!
Open Source is broken. Or, more precisely, it is partially broken. It works for solving problems developers have: it doesn't generally work so well for fixing problems 'users' have.
The part that works is where the famous 'itch to scratch' model works. If you are a developer, and you see a problem you have as a developer, then you are in a good position to solve it. Commonly, this model has been thought to work because the developer has the necessary skills to solve the problem, but that is not the critical issue. The critical issue is that, in these cases, the developer is the user, and as such, understands the problem in depth. They own the problem, and hence the solution.
This is where Open Source excels. This is why we have such a proliferation of open source 'under the hood' libraries and tools for developers such that it is near impossible to be a developer these days and not use Open Source tools. The itch to scratch model has worked for developers solving the problems they and, consequently, other developers, have.
But outside of developers solving their own problems, open source has largely failed. There are very few convincing 'user-facing' softwares that can beat out their proprietary rivals on approach, utility, and usability. This is because Open Source works [mostly, only] when people are 'fixing their own problems' (itching their own scratch). This is the fundamental strength of open source. The model's strength is NOT developers solving other's problems, but people solving their own problems. When developers are the single solution provider for 'other peoples problems,' the model fails.
What are we to do with this? The answer is simple but has profound impacts for open source culture. The answer is to always have the 'people with the problem' at the heart of the open source project. The answer is simple, the implications are huge. They include the need to diversify participation in open source projects, to push the developer out of the central role they currently occupy in open source projects as a single solution provider, to tear down technical meritocracy as the single determinant of value, and to experiment with new models of open source cultures.
Seems a week after I suggested it could happen…Medium pivots… anyways, good judgement more than prescience but it illustrates a point.
This illustrates that putting your content in a web-based start up is rolling the dice in a game where you don’t have a say in who is playing, the rules, or who wins.
Apparently small changes in writing technologies can radicalise publishing workflows. Matthew Kirschenbaum makes a good case for this in his book “Track Changes: A Literary History of Word Processing“. There are a number of cases in the book that illustrate this. I’m about 1/3 of the way through and the example that astonished me was related to Issac Asimov.
The setting is mid 1981 and Asimov has just encountered his first word processor at age 61 (Kirschenbaum places 1981 as, pretty much, the start of the word processing revolution). Previous to this, Asimov had written with a typewriter. He was extremely fast and hence many errors were made that could not be corrected easily with a typewriter. Errors, in those days, were the problem of the copy editor who marked the manuscript up with corrections.
As it happens, Asimov finds typing faster with the new electronic word processor and so he makes more mistakes. The copy editor then gets frustrated and in turn frustrates Asimov with many more clarifying queries about the errors made. Dr Asimov then starts correcting his own text using find-and-replace to correct common errors…
“cold” suddenly becomes “could” and no sign exists that it was ever anything else.
The result is cleaner copy which elicits fewer inline queries from the copy editor. Right there, publishing workflows changed and labor was redistributed from the copy editor to the author.
A migration from a typewriter to a word processor is not a small change. However, on a higher, more abstract, level the ability to correct content ‘as you go’ is not a huge cognitive leap. When the technology enabled this, however, the change it effected was great.
Small design changes can have huge consequences for publishing workflows to the point of redefining roles. With the introduction of the word processor came the ability to change copy easily and consequently the author assumed more of this role and the copy editor’s role was redefined.
I think we might have something similar in the systems we are building at the moment – particularly Editoria, the platform for scholarly monograph production. I’m not claiming the system will have an effect on the same scale of the introduction of word processors, rather that a few, apparently small, design decisions will have a large impact on the workflow of those using the system. This change is not just about efficiency, which is certainly what we are aiming to achieve, but how things are done ie. a similar redistribution of labor.
In the current workflow of the University of California Press (UCP), with whom we are designing the system, the following steps occur:
The acquisition dept acquires the book from the author in form of a collection of Microsoft Word (MS Word) files in docx format. One file per chapter. These are sent to the production dept.
Production editors (in the production dept) open each file on the premises of UCP where MS Word is installed with custom macros. These macros enable the editors to select a part of the text and apply custom styles (see image below).
The newly styled MS Word files are then pushed through the review cycle featuring the copy editor and author(s).
Step 2, styling, is a long manual process. Also, it can only occur on computers that have these macros installed. Further, the subsequent edits by the copy editor and author(s) might revert some styling changes. So some of the work will need to be done again. Lastly, when the version of MS Word is updated then the Macros must be checked, potentially rewritten to keep pace with the new version, and re-installed on each machine.
So…the small change we are introducing we call HTML Typescript. It is nothing fancy, nothing new, but it is a fresh look at how we can bring offline word processing workflows into the browser. The critical problem being, how to do you get the MS word files provided by an author into a web-based production workflow? In clearer, slightly more technical, terms how do you transform a document from MS Word to HTML? Well… many people literally say “it can’t be done.” The criticism is not as literal as it sounds and really comes down to what you are trying to achieve. If you wish a one-to-one conversion of an author’s structural intent for the production a correctly marked up HTML document, then automated conversion through any process is not going to achieve this. Not even the very complex processes that go from docx to structured XML (of some type) and then to HTML, which is the approach file conversion specialists prefer, will achieve this.
This is where we think HTML Typescript solves some very interesting problems.
HTML Typescript offers a reframing of the problem that, consequently, enables processes that closely match the publisher’s current workflow. It also offers the potential to radically redefine that workflow.
Docx is a type of XML. It is, laughably, called OpenXML by Miscrosoft. Laughably, because while you can unzip a docx file (a docx file is a zip file containing a simple directory structure and some files), you can open the document.xml and poke around with it (see video demo below), but it is incomprehensible. There are two critical problems that contribute to this:
The XML is pretty unreadable by humans. It is dense and verbose and very very messy.
Due to the editing environment (MS Word) and its inability to constrain authors from using ‘font size 12, bold’ instead of marking the text correctly as a Heading Level 2 (for example), the resulting styles applied by the author, and described in the XML, are erratic at best.
Number 2 is the big problem, and issue number 1 helps obfuscate it.
So, when you want to transfer from docx to HTML, it is very difficult to determine the author’s strucural intent without looking at the original display version of the MS Word file. Is this indented single line marked ‘font size 16.6, bold’ a heading 1,2, or 3…or is it meant to be a block quote? For example…
This is exactly why the macros mentioned above must be used, because named styles are not used ‘upstream’ by the author. So the production editors must work their way through each file and apply the correct designated styles using the custom macros.
With XSLT, the conversion process of choice for file conversion pros, it is very possible to convert from docx to HTML. Nothing impossible there. You don’t end up losing content (there are a few gotchas for this which I will cover in later posts) but you do end up with messy unstructured HTML. This is because the original docx is messy and unstructured. This is what people mean when they say ‘it can’t be done’ ie. you cannot infer all the structural intent of the author from the docx file, because it is a mess, and by some magic subsequently convert it to lovely clean, structured, HTML.
But, that is actually ok. Converting to ‘messy’ HTML really puts you in about the same position as having a messy docx file. They are both equivalently unstructured.
File conversion specialists don’t like this position. They want, by their very nature, documents to be nicely structured. HTML does not, in its raw form, stipulate much structure. Sure you can have headings 1,2,3 but also you can have, as MS Word does, arbitrary font sizes and styles that look like a heading, but the underlying markup doesn’t explicitly state this is the case. In addition, HTML doesn’t worry about document sectional structure too much. What if you wish to identify a section of the text as a ‘method’ section? How do you deal with that? Custom XML can deal with this as you can design a mark-up structure to suit. But HTML, as it is described by the standards, doesn’t enable this (yet…web components will change this to some degree). However, with plain ole HTML you can use any sort of div or span you like to wrap around sections of a HTML document and call them what you like. For example, you can have ‘div class=”method”‘ and there you go…the section is defined, for your intents and purposes, as a method section. This does not have much currency in the outside world (the world outside your platform) since there is no standard way of describing a method section in HTML, but for the purposes of creating production systems, this is really ok. We can work with this. The important thing is that we can now apply document structure with HTML should we wish to. When we wish the document to leave the system we can make that conversion at export time (more on this below).
So… Where does that leave us? Ok… well, at the time of conversion we can move from docx to HTML and make some ‘educated’ guesses as to what the author’s intent is. For example, if there is one single line text with ‘font size 24, bold’ and then 16 of ‘font size 20, bold’ and 6 of ‘font size 14, bold’ and then a bunch of sentences groups with font size 12 – then we can map these to heading 1,2,3 respectively, and the last being standard paragraphs. It is not perfect -it will not catch all structure, and the process will make incorrect inferences. However, it will get us part of the way there. So we can already start improving on the structure of the MS Word file automagically.
Arguably, we are in a better position with an initial rules-based clean up of the file structure as it passes from docx to HTML. The good thing is that we can improve these rules over time. In time, the automated conversion will produce better results.
After a conversion of this kind, we have a partially structured file. This is where file conversion specialists often leave the conversation. They don’t like partially structured anything. It is not in their DNA. That is because they are primarily concerned with conforming file A into structure X. Their metric is ‘ is it well structured?’. It is a pass/fail binary. If you are to look primarily at whether file A or B is well structured then you want well-defined schemas that describe the structure, and you want files that subscribe perfectly to that schema. Anything that falls out of this is a fail. Partially structured documents are a fail.
But with the HTML Typescript approach we see partially structured documents as a strength, not a weakness. We know it is not possible to get to perfectly formed documents in one go, so rather than consider this a fail we accept we must get there progressively. That is the fundamental principle behind what we are calling HTML Typescript. It is the use of HTML as a document format that can be progressively improved to get to the structure you desire over time using both machine and manual processes.
HTML is the perfect format for this. HTML’s lack of formal structure, along with its ability to define any kind of structure you want, enables us to progressively add any kind of structure we need to a document. One part of this process is the automated clean up at conversion time, and the next part of the process is where it starts getting interesting… this is the manual application of structure.
This is where the apparentweakness of HTML becomes a strength – we can manually add structure over time. We can progressively structure the document. For this process, we can build, and the Coko team are building, a suite of tools in the production environment so that a production editor (for example) can click on an element (eg a heading) and choose the style they wish to apply from a menu – similar to how they currently work with macros.
The advantages of adding the correct structure in the browser vs MS Word? Well, firstly, as mentioned above, we can computationally improve the structure before the manual process. This results in less work to do. Secondly, we have complete control over the tools available to the platform’s distributed citizens (I don’t like the term ‘user’ and so are experimenting with other terms. Let’s try citizen for now). Hence we can make the tools available to everyone, not to just those citizens with the MS Word macros installed. That means:
There is no need to update the (equivalent of) macros against the underlying desktop software version across many machines.
If I wish to update the features that enable styling, then all users can leverage these updates immediately.
So, as a publisher, I’m not stuck in the harrowing, expensive, cycle of continual software upgrades and installs against random (or planned) updates of MS Word that maybe conducted by my org. That is already a saving. There is a second tier of savings here – we are building open source systems with the intention that they are used by a large number of orgs. If we share a common platform and common toolkit, we can also share the costs of maintenance and development, and free from vendor upgrade cycles. Each publisher doesn’t have to do their own software development to keep their own styling macros up to date. Yet you can still innovate and develop new features – while any new feature can potentially be available to everyone.
But…more importantly…just as with great power comes great responsibility, we could say that with a shared toolset comes shared responsibility. That doesn’t reference the sharing of costs mentioned above (although it could) but rather it references the roles of the people in the organisation using the toolset. If the tools for styling are available to all citizens, then so might the responsibility for using those tools. Much like Asimov noticed he could now do some of the copy editing, so might a publisher re-distribute the work of styling a text. This is where a small design decision might have a large impact. Publishers that are forced to design workflows based, literally, on where the tools are (what machines the macros are installed on in this case), can now design workflows dependent on who they think would be best placed to do the work. That is pretty interesting. Such a small design decision might actually cause pretty radical changes to workflow.
There are a couple of things I want to comment on with regard to this. Firstly, achieving efficiencies like the above are only available to you if you design systems rather than software. Designing software is a technical endeavor, designing systems is not. Designing systems requires an understanding of what is trying to be achieved, by whom, and with what constraints. It is very difficult, for example, to imagine and design a progressive approach to structuring files if you are simply focused on building software that produces well-structured files. It sounds like a distinction without a difference, but it’s not. The difference is profound.
Secondly, what I have described above requires, in this implementation, HTML Typescript. But HTML Typescript is not a format, it is merely a way of using an existing format – HTML – and seeing its apparent weakness (no strict formal structure) as a strength. So HTML Typescript is not really a technical solution so much as it is an approach — one which enables us to build interesting systems to solve the long-standing problem of ‘getting out of MS Word’ and into the web.
Lastly, there may be some concern that this kind of approach doesn’t get us to structured formats like JATS (or whatever) that may be required by your content sector. Well, that’s not a big deal. You can design your version of HTML with all the structure you need to make an easy conversion to whatever XML format you want. That’s not so tricky. HTML can contain that structure and you can build the production tools (the web-based equivalent of the MS Word macros) to suit. In other words, end-of-line formats (for distribution or syndication) should remain an end-of-line problem. When you press ‘export’ (or whatever), then that is the time, presumably, when the document is nicely structured, that you convert it to whatever you need in a relatively straightforward conversion.
In summary, I’ve been kicking around in this area for a long time and feel that there are some ideas that are surfacing that not only solve some tricky problems, but open the door to new possibilities. I think the HTML Typescript approach is one such case and I’m looking forward to finishing the implementation of the HTML Typescript approach with Editoria and trialing in the weeks to come. More thoughts to come.
HTML Typescript is an approach developed by myself and Wendell Piez.