Athens Road Map Meet

Had a great monthly roadmap meeting with the Coko Greek crew yesterday in Athens. We discussed and planned micropubs, Editoria, Wax 2, xpub-columbia/collabra, and chatted about Dan Morgan’s pizza theory for workflow, and stuff and things. Drank lots of coffee and this time had some yummy good for lunch. Always fun. Always productive. Fantastic bunch 🙂

dsc01090
dsc01091
dsc01102
dsc01095
dsc01103
dsc01098
dsc01099
dsc01100

Cultural Method

I’ve been pondering some stuff in preparation for a presentation at Open Source Lisbon this week. In essence, I’m trying to understand Open Source and how it works… not to say I don’t know how Open Source works, we do it well at Coko…I mean to zoom up a level and really understand the theory and not just the mechanics. It is one thing to facilitate a bunch of people to meld into a community, it is quite another to understand why that is important, and what the upsides and downsides are on a meta level. If you take the ideals out and look purely at the mechanics from a bird’s eye view. then what, ultimately, makes Open Source a better endeavor than proprietary software? What is exactly going on?

I have some clues… some threads…but while each thread makes sense when you consider it on its own, when you combine them all it doesn’t exactly make a nice neat little montage. Or if it does, I am currently not at the right zoom level to see it clearly instead I see lots of different threads criss-crossing each other,

Ok…so enough rambling… what is it I’m trying to understand…well, I think when you embark on making software there is this meta category of methods known as the Systems Development Life Cycles (SDLC). Its a broad grouping that describes the path from conception of the idea, through to design, build, implementation, maintenance and back again etc…

Under this broad umbrella are a whole lot of methods. Agile is one which you may have heard of. As is Lean. Then there are things like Joint Application Design (JAD), and Spiral, Xtreme programming, and a whole lot more. Each has its own philosophy and if you know them you can sort of see them like a bookshelf of offerings…you browse it and intentionally choose the one you want. Except these days people don’t choose really, they go with the fashion. Agile and Lean being the most fashionable right now.

The point is, these are explicit, well documented, methods. You can even get trained and certified in many of them.

But… Open Source doesn’t have that. There isn’t a bookshelf of open source software development methods. There are a few books, with a few clues, but these are largely written to explain the mechanics of things and they seldom acknowledge context. I say that because the books I have read like this make a whole lot of assumptions and those assumptions are largely based on the ‘first wave’ of Open Source – the story of the lone programmer starting off and writing some code then finding out it’s a good idea to then build community instead of purely code therefore magnifying the effect. A la Linus Torvalds.

But its very down-on-the-ground stuff. I’m thinking of Producing Open Source by Karl Fogel, and The Art of Community by Jono Bacon. Both very well known texts and I have found both very useful in the past. But they don’t provide a framework for understanding open source. I’ve also read some research articles on the matter that weren’t very good. They tend to also regurgitate first generation myths as if open source is this magic thing and they struggle to understand ‘the magic’. In other words, I miss a ‘unified theory’, a framework, for open source…

I think it is particularly important these days as we are beyond the first generation and yet our imaginations are lagging behind us. There are many more models of open source now than when Eric Raymond described a kind of cultural method which he referred to as ‘the bazaar’ in his cathedral and the bazaar. We now have a multitude of ways to make open source and so the license no longer prescribes a first generational approach, producing open source is much  richer than that these days.

As it happens, Raymond’s text does attempt to provide some kind of coherent theory about why things work although it often mixes ‘the mechanical’ (do this) with an attempt to explain why these processes work. It doesn’t do a bad job, there is some good stuff in there, but it varies in level of description and explanation in a way that is uneven and sometimes unsatisfying. Also, as per above, it only addresses the first generation ‘bazaar’ model. While this model is still common today in open source circles, it needs a more thorough examination and updating to include the last 15 years of other emergent models for open source. There are, for example (and to stretch the metaphors to breaking point), many cathedral models in open source these days that seem to work, and some that look rather like bazaar-cathedral hybrids.

Recently Mozilla attempted to make some sense of these ‘new’ (-ish) models with their recent paper on ‘archetypes’

https://blog.mozilla.org/wp-content/uploads/2018/05/MZOTS_OS_Archetypes_report_ext_scr.pdf

Here they kind of describe what reads as Systems Development Life Cycle methods…indeed they even refer to them as methods

The report provides a set of open source project archetypes as a common starting point for discussions and decision-making: a conceptual framework that Mozillians can use to talk about the goals and methods appropriate for a given project.

They have even given them names such as ‘Trusted Vendor’ and ‘Bathwater’ and the descriptions of each of these ‘types’ of open source project sound to me like they are trying to make a first stab at a taxonomy of open source cultural practices – so you can choose one, just like a proprietary project would choose, or self identify as, Agile or Lean. Infact, the video on the blog promoting this study pretty much says as much. It’s Mozilla’s attempt at constructing a kind of SDLC  based on project type (which is like choosing a ‘culture’ instead of a method).

However it doesn’t quite work. The paper compacts a whole lot of stuff into several categories and it is so dense that, while it is obvious a lot of thought has gone into it, it is pretty hard to parse. I couldn’t extract much value of what one model meant vs the other, or how I would identify if a project was one or the other. It was just too dense.

Mozilla has effectively written a text that describes a number of different types of bazaars, and also some cathedrals, without actually explaining why they work – except in a few pages that sort of off-handedly comment on some reasons why Open Source works. I’m referring to the section that provides some light assertions as to why Open Source is good to:

  • Improve product quality.
  • Amplify or expand developer base.
  • Increase the size or quality of your organization’s developer hiring pool.
  • Improve internal collaboration within one’s own organization.
  • ..etc…

But this is the important stuff… if these things, and the other items listed in that section are true (I believe they are), they why are they true? Why do they work? Under what conditions do they work and when do they fail?

In other words, I think the Mozilla doc is interesting, but it is cross cutting at the wrong angle. I think a definition of archetypes is probably going to yield as many archetypes as there are open source projects – so choosing one archetype is a hopeful thought. Also the boundaries seem a little arbitrary. While the doc is interesting, I think it is the characteristics listed in the ‘Benefits of Open Source’ section of the Moz doc that are the important things to understand – this is where a framework could be built that would describe the elements that make open source work…..allowing us to understand in our own contexts what things we may be doing well at, what we could improve, what we should avoid, useful tools etc

The sort of thing I’m asking for is a structured piece of knowledge that can take each of the pieces of the puzzle and put them together with an explanation of why they work…not just that they exist and, at times, do work, or are sometimes/often grouped together in certain ways. An explanation of why things work would provide a useful framework for understanding what we are doing so we can improvise, improve our game, and avoid repeating errors that many have made before us.

With this a project could understand why open source works, and then drill down to design the operational mechanics for their context. They could design / choose how to implement an open source framework to meet their needs.

Such texts do exist in other sectors. Some of these actually could contribute to such a model. I think, for example, the Diffusion of Innovations by Everett Rogers is such a text, as is Open Innovations by Henry Chesbrough. These texts, while focused on other sectors, do explain some crucial reasons why open source works. Rogers explains why ‘open source can spread so quickly’ (as referenced in one line in the Moz doc), and Chesbrough provides substantial insights into why innovation can flourish in a healthy open source culture, and how system architecture might play a role in that.

Also the work of John Abele is important to look at and his ideas of collaborative leadership. As well as Eric Raymond’s text…but it all needs to be tied together in a cohesive framework…

This post isn’t meant to be a review of the Moz article. It reflects the enjoyment I have gained from understanding elements of open source by reading comprehensive analysis and explanation of phenomenon like diffusion and open innovation. These texts are compelling and I have learned a lot from them which have helped when developing the model for Coko because at the end of the day, there is no archetype that exactly fits – it is better to construct your own framework, your own theory of open source, to guide how you put things together, than to try and second guess and copy another project from a distance. Its for this reason that I would love to have a unified framework for open source that takes a stab at explaining why all these benefits of open source work so I can decide for myself which ones fit or how they fit with the projects I am involved with.

Open Innovation

Reading Open Innovation – a thesis evolved by Henry Chesbrough in 2003. I have also the follow-up book published in 2006 which is a collaboration with other researchers going through his earlier thesis.

I’m researching this as I’m interested in what current literature exists that explains Open Source and why / how it works which is not from the Open Source domain. Books that emanate from the Open Source domain tend to be religious in nature and it is also true that most attacks against Open Source take it from the religious angle… so having literature that endorses the model which is not open source evangelicalism is very useful.

Previous to this I found a lot of value in The Diffusion of Innovations (originally published in 1962) by Everett Rogers.

Open Innovation and the Diffusion of Innovations separately explain quite a bit about why Open Source works, and I think I’ll post more about this as it becomes clearer in my head.

Chesbrough’s thesis can be summed up in one quote

The Open Innovation paradigm treats R&D as an open system. Open Innovation suggests that valuable ideas can come from inside or outside the company and can go to market from inside or outside the company

Essentially it is the admission that any one company doesn’t have all the smart teams/people/ideas. So how about re-imagining innovation and release it from a so-called ‘vertical innovation’ model, where all the R&D is done inhouse and where IP (Intellectual Property) is jealously guarded, to a open model where innovation essentially comes through collaboration with orgs and individuals outside the company.

From an Open Source point of view this is a ‘duh’ moment… Open Source has long expounded this approach. But…I have never found it well explained…

So it is good to find this argument made elsewhere and in clearer terms…but unfortunately the Chesborough thesis was published in 2003 when Open Source was still very young. Consequently Chesbrough reads Open Source as a idealistic and altruistic movement… he doesn’t really consider open source projects to have a business model and a business model is central to his thesis. Its a pity as Open Source has moved on since then and there are a lot of very successful and interesting examples of Open Source business models. But if you sorta squint while you are reading, and blur out the dated-ness then there is a lot of stuff that could just be quoted verbatim that makes a strong argument for Open Source as seen through the lens of the Open Innovation thesis.

Thats pretty interesting as, combined with the Diffusion of Innovations, these two bodies of work explain the value (and consequently provide a rationale which does not come from the open source sector directly) of open source. Open Innovation explains why open source is a good idea if you are a company whose business requires software to function in its core offerings, and the Diffusion of Innovation theory helps us understand why open source can beat closed source software in the arena of adoption.

The point is, if you can combine the two you have a winner – a model that enables rapid adoption and innovates faster than closed alternatives/competitors. If you can marry successful commercial activity to this you have something very powerful that can potentially wipe out the existing proprietary offerings – which is what we need in the publishing sector. The aim of what we are now doing in Coko, in this post-foundational stage, is to seed the commercial activity around the very healthy core of community technologies we have built.

Anyways… here are some quotes I liked from some of the chapters….Some of the quotes come from this chapter by Joel West and Scott Gallagher http://web.simmons.edu/~weigle/INNOVATION/Patterns%20of%20Open%20Innovation.pdf

Open Innovation is the use of purposive inflows and outflows of knowledge to accelerate internal innovation, and expand the markets for external use of innovation, respectively. Open Innovation is a paradigm that assumes that firms can and should use external ideas as well as internal ideas, and internal and external paths to market, as they look to advance their technology. Open Innovation processes combine internal and external ideas into architectures and systems. They utilize business models to define the requirements for these architectures and systems. The business model utilizes both external and internal ideas to create value, while defining internal mechanisms to claim some portion of that value. Open Innovation assumes that internal ideas can also be taken to market through external channels, outside the current businesses of the firm, to generate additional value

…useful knowledge is scarce, hard to find, and hazardous to rely upon (a root cause of the NIH syndrome). In Open Innovation, useful knowledge is generally believed to be widely distributed, and of generally high quality

IP becomes a critical element of innovation, since IP flows in and out of the firm on a regular basis, and can facilitate the use of markets to exchange valuable knowledge. IP can sometimes even be given away through publication, or donation.

Recently, open source software has emerged as an important phenomenon that utilizes external knowledge in a network structure (Lerner and Tirole 2002; O’Mahoney 2003; Dedrick and West 2004; von Hippel 2005)

 Most software users would face significant switching costs in using some other software package, due to some combination of retraining user skills and converting data stored in proprietary file formats. As Arthur (1996) observes, software thus has tremendous positive returns to scale, generally allowing only one (or a small number) of winners to emerge.

These winners are tempted to extract rents from their customers by increasing prices and creating additional switching costs to protect those rents (Shapiro and Varian 1999). From these production economics, commercial software firms seek to build complete systems to meet a broad range of needs, in hopes of forestalling potential competitors and protecting high gross profit margins

In other cases, a system architecture will consist of various components. Some mature (or highly competitive) components may be highly commoditized, while other pieces are more rapidly changing or otherwise difficult to imitate and thus offer opportunities for capturing economic value. Two open source examples are the IBM’s WebSphere and Apple’s Safari browser…

…Customers access the WebSphere e-commerce software using standard web browsers, so IBM originally developed a proprietary httpd (web page) server. IBM later abandoned its server for the Apache httpd server, recognizing that it would be wasting resources trying to catch up to the better quality and larger market share enjoyed by Apache (West 2003). Today, IBM engineers are involved in the ongoing Apache innovation, both for the httpd server and also related projects hosted by the Apache Software Foundation (Apache.org website)

What is xPub?

So, there are a lot of product names at Coko – PubSweet, Editoria, XSweet, INK, xPub etc etc etc… so becoming tricky to track, but I wanted it seems there are quite a few people interested in xPub right now.

xPub is not really a product as such, it’s more a group of products – each of which relates to Journal workflows. The names for each product indicyae those relationships: xpub-collabra, xpub-elife, xpub-faraday (Hindawi) and xpub-epmc

Each of these is a platform, so yip, you read it right – there are actually no less than 4 journal platforms being developed in the Coko community, not including the Micropublications platforms (of which there are two – one developed with Wormbase, and the other with the Organisation for Human Brain Mapping).

So, right from the get go our ‘offer’ is not standard. We don’t offer one platform to rule them all – there are many journal platforms in production. All of these are built on top of PubSweet… PubSweet is a kind of ‘headless’ publishing platform….it’s more or less the backend brains in that it is a kind of framework that ‘thinks like a publishing system’ but has not determined a workflow for you. So, each of the xpub-* platforms are actually publishing platforms built to meet a specific workflow and built on top of PubSweet…

Here is a crappy diagram (drawn in haste) to get the basic idea across:

ya

In the above, you see a PubSweet platform (eg. xpub-elife). PubSweet itself is the whole app – it sort of ‘encapsulates’ everything. Really though, you get the PubSweet core and then extend it with front-end components to meet your workflow needs. A typical front-end component might be a login screen, or a dash, a submission info page, a reviewer form etc.

You can also extend PubSweet on the back end as well (eg to enable integration with external services etc). The following is a more accurate but slightly techy architectural overview (you can skip this image and the following paragraph if you want to avoid the slightly techie part of this article).

ps2

In case you are wondering – everything is written in Javascript. Why did we choose JS? It was a deliberate choice to go with a language that was prolific. Almost every dev around these days needs to know some Javascript (it’s the most popular language by far on Github), this makes finding a developer to work on your project is as easy as we could possibly make it. JS is also a phenomenal language these days. Fast, sophisticated and more than capable of supporting large-scale publishing requirements. I mean, if it’s good enough for Paypal, Netflix, Linkedin, Uber and ebay, then it is good enough for you.

So each PubSweet platform has its own collection of front and back end components to meet the workflow of someone’s dreams… the idea is that to achieve the platform of your dreams you can reuse what others have already built and then just build what isn’t already available. In a sense, you can ‘assemble’ your platform from existing parts.

c1e22edf10cd018c9d1a697c29fbd891

The nice thing about this is that each of the organisations building Journal platforms are sharing the following:

  1. the same back end/framework (PubSweet)
  2. various front (and back) end components
  3. lessons learned…

Each organisation has a vision of their ideal Journal workflow. They then design and build this on top of PubSweet, but as they do, they build various components (either page-based components such as a dashboard, or smaller UI components we call atoms and molecules) and they share these components with everyone. Hence, you should check the list of components before you start building in case the component you need already exists.

We have various agreed-upon ways to build and share components (see this as an example). These best practices are continuously evolving but you can read some of the latest discussions about this approach here – https://gitlab.coko.foundation/pubsweet/pubsweet/issues/408

Of course, all code is reusable in the Coko community because it’s all open source. The best practices are there to make the code easy to reuse.

All agreements as per above are made by consensus by the community. It is actually a pretty snappy process – don’t believe every crappy thing you read about how open source is built. Open source community processes can be elegant and fast, and the resulting code can beautiful. Coko is a good example of this – a community of professionals collaborating together to make fantastic open source software.

dsc01017

So… back to the xPub story. To make these decisions on how to share components etc, we have regular meetings with various workgroups (we keep the numbers in each group small so we can move fast), and we also have quite a few in-person meetings. Not only do we have PubSweet community meetings where all of these organisations meet, but we have various get-togethers on various topics if required. For example, we met in Cambridge recently to discuss Libero (the eLife delivery product), before that EBI there was an onboarding session in Athens, next month we have a special designers workgroup in-person meet, anbd so on. All this helps us keep in contact with each other, which helps build trust, but also turns out to lift energy levels and boost production. These meetings are fun.

dsc01022

Also at these meetings, we learn from each other. One of the big problems in this sector, and one of the reasons people ask ‘why are publishing platforms so hard to build?‘ (after a number of high profile failures), is that there has not been a focused effort to share experiences on how to build publishing platforms. So, that is what we are doing – at each meeting we talk about what we have learned, how we are thinking about things, and show each other what we have done. The last meeting, for example, each xPub platform gave a deep dive to everyone in the group in a process known as speed geeking (it wasn’t so speedy as each table had 25-30 mins to go through their platform).

dsc00903

dsc00944

So, xPub isn’t a single platform, it’s more the community that is building journal platforms on top of PubSweet and sharing learnings and components. It is a very cool thing.

As a community, we also produced a book entitled ‘PubSweet – how to build a publishing platform’. You can get this for free here – https://coko.foundation/books/

I can also send you a print copy (they look great!) if you send me a postal address.

dsc00851

The book covers many things – a whole lot of technical documentation for how to build and share components etc, as well as some information about to think about workflow and how to map this into a PubSweet system. Personally, I don’t think the technology is very hard when it comes to building platforms – we knew what we wanted from the beginning with PubSweet and went about and built it (not to downplay the extraordinary job Jure Triglav has done in leading this effort). But the real hard stuff is actually thinking about workflow – not many publishing orgs have had the opportunity to think about designing their workflow to meet exactly what they want (rather than shoe-horning it into an existing system), and so we have had to beat this track for other to follow. It has been quite a journey but the book pretty effectively outlines how to think about workflow (which of course is a process that can be accelerated using Workflow Sprints which is a process I designed to facilitate a publisher to design their own workflow and PubSweet platform in one day).

0e7a0345

In that book you can see some high level ‘architectures’ of each of the xPub platforms such as the following (xpub-collabora):

02-adam-workflow-v3

Or this (xpub-epmc):

ebi

So, you might be asking…. at what stage of development  are each of these platforms? … I’ll leave it to each of the teams to say exactly, but it has been shocking to see how fast things have come along. I mean, the xPub community only really got together for the first time mid-last year, and building started for most late last year and early this year. It looks like we will have a lot of platforms landing by the end of this year. In this sector that is lightning fast. EBI are particularly impressive – they started two months ago and are almost ready to go – if it weren’t for the fact we are all replacing the under-the-hood data model we designed together at the last PubSweet meet, they would be good to go already. They can achieve such speed because they are reusing a lot of the components that the other teams built before them (and because EBI are just very good 🙂 ).

I can speak a little more about xpub-collabra since that is the platform Coko is building for and with the Collabra Psychology Journal. It’s looking pretty nice. We have some work tidying up the UI, replacing the data model and a few other things, but it’s looking rather good. We are also putting some time into making it a generic platform since the Collabra workflow follows a fairly ‘plain vanilla’ model. So we are building in some management interfaces and various bits and pieces to make it more widely useful – for example, Giannis just built in a Submission Info builder – enabling a Journal admin to build their own submission forms. It requires some hand-holding to use right now, but we’ll shape it to be very usable by your general journal adminy-type. We also have to integrate ORCID and DOIs, extend the range of submission file types etc… but it’s pretty close.

Below is a video showing xpub-collabra in action. It is a a version from some months ago, but you can see the workflow pretty well in this demo.

I think most of the xpub community is going to offer their new platforms to the market in various ways. This will be interesting as there are very different approaches at play. Hindawi, for example, is looking to make their platform a multi-tenant platform, eLife will put JATS at the centre of the workflow etc… So look for more news on that also. For our part, Coko will offer xPub through a partnership with a hosting provider – probably with the same organisations we will work with for Editoria hosting services. Since everything we do is open source, we will also be supplying all the Docker, Helm and Kubernetes scripts so that you can set up your own commercial hosting service if that is your cup of tea (or you can extend your offering should you already be a hosting provider). Coko is pretty close to getting our first hosting partnership set up, so look for news of that soon!

One last thing – because of the modular nature of any platform built on top of PubSweet, it is possible to take any of the xPub platforms and customise it to meet your needs. No need to start building from zero. Additionally, the modularity means you can extend the systems with your own interesting new innovation – finally a place for innovation to call home in the publishing platform world….

So… there is a lot going on in the xPub world….we look quite different to the rest of the market because we are not building one platform – we have instead focused on building a community to support the development of multiple platform solutions for journal workflows. You can pick and choose which one you want, or build something else, reusing as much of the other systems as you can to reduce your development costs (we also spend a lot of time onboarding new folks and supporting them as much as we can).

But it’s not just about improving the game today – supporting the optimisation of workflows is one part of what we are trying to do, the other is to support future innovations.

If you want to know more, feel free to jump into the Coko community channel and chat – https://mattermost.coko.foundation

Come and join the party. We are happy to support you and happy to learn from you…. not-for-profit or commercial we don’t care, build a better journal platform on PubSweet than the rest of us and we’d be happy!  We are in it for the mission – come to talk to us! no preciousness here 🙂

The Awesome Paged.js

So, I’ve been pursuing this dream for many years… every since I started rendering books in the browser using an ad-hoc collection of tools around 10 years ago…. then I instigated the book.js project (which unfortunately died due to lack of browser support of CSS Regions), and now… paged.js…

Built by the talented trio of Julie Blanc, Fred Chasen and Julien Taquet – it’s all open source and modular… there is a lot to this story, but we’ll get to that. Full release in a few weeks, this is a sneak peak:

Paged.js – sneak peeks

This project is entirely funded by the awesome Shuttleworth Foundation.

Hanging in Athens

I’m going to try and be in Athens every month now to work with the team, setting dev priorities for the month (roadmaps are stored in our READMEs). Awesome bunch – doing great work and heaps of fun to hang out with. Can you believe we have four devs in this office and they account for 4 publishing platforms? (wax, micropubs, Editoria, xpub)… I like that we are punching above our weight by some considerable margin… awesome…

Some pics from the week. Geeking and graffiti from around the ‘hood…

dsc00337_small

dsc00316_small dsc00324_small dsc00333_small

CodeMirror Embedded in Wax

The Wax (a web based word processor we are developing) lead dev -Christos Kokosias – yesterday embedded another editor, in the editor. We need this for the PubSweet Book Sprint where we will be including code snippets in  the documentation we create. So we need a nice way to manage that… consequently Christos embedded the CodeMirror code editor into Wax. So it has things like syntax highlighting, line numbering, auto complete etc all built into the code blocks…its pretty cool… below is are two short vids Christos made to show it in action:

 

Pretty cool stuff and extremely useful for writing documentation that includes code…ask yourself… can MS Word or Google Docs do that? #opensourceisbetter

Editoria Vid

If you missed the webinar, the video is here (it will be available from the Coko site also)…starts 9min 30s in:

Also, we had 40 or so live questions during the Q&A. We answered them all and Julien decided to do a quick dive into Right to Left (RTL) support. He posted this screenshot with mixed RTL and LTR text to show it works out of the box (still more to test):

archmerge_2018-05-16_10-25-17

XSweet 1.0!

We have 1.0 of the docx -> HTML transformation tool XSweet out today! It also has a new site:

http://xsweet.coko.foundation/

XSweet is a finely crafted tool. It takes docx files, those horrible mangy MS Word files, and translates them into clean, lovely, HTML. XSweet is open source, modular, and very nicely done.

A huge tip of the hat to Wendell Piez (XML guru), and to Alex Theg. As geeky as it sounds, I loved watching these two chat about the issues they encountered making this software. The attention to detail was really unbelievable. Amazing work. XSweet is a finely crafted tool.

More info on Coko https://coko.foundation/announcing-xsweet-1-0/

If you want to do a deep dive into why I think this is important, I wrote this some time ago – https://www.adamhyde.net/typescript-redistributing-labor/