Leave PDF and JATS as end-of-line formats

Over the last few years, I’ve seen a few attempts at, and musings about, building journal submission processes around end-of-line formats, i.e. PDF and JATS. I call them end-of-line formats because they are ‘outputs’ of other formats and processes. No one, for example, sits down and authors in JATS or PDF. Most researchers produce their work in MS Word, LaTeX, and sometimes HTML and (ugh) Markdown… and there are also, of course, the R Markdown and similar schools that render to (for example) MS Word for submission purposes.

It bears repeating: no one authors in JATS or PDF. If they do, they are treading a long, painful path when there are far simpler and less painful ways to get there.

My small plea to the world is to leave these end-of-line formats where they are: at the end of the line. Keep content in flexible containers (file formats) so the works can be altered, reshaped, completely overhauled, fluidly re-flowed… whatever is needed… but don’t make the mistake of either prematurely optimising document structure or choosing, as your authoring format, a file format that was not designed for authoring. It just doesn’t make sense. Instead, build systems that enable flexible authoring formats and layer structure onto the content as it moves through the process, then output to all the end-of-line formats you desire and QA those using the appropriate tools.
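
As a rough illustration of that last point, here is a minimal sketch in Python that treats HTML as the editable source and derives a JATS-like XML output from it only at the end. Everything in it is an assumption made for illustration: the names `BlockCollector` and `html_to_jats_like` are hypothetical, only `<h1>` and `<p>` are handled, and the result borrows a few JATS element names (`article`, `body`, `sec`, `title`, `p`) without being valid JATS.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class BlockCollector(HTMLParser):
    """Collect (tag, text) pairs for <h1> and <p> elements in an HTML source."""

    KEEP = {"h1", "p"}  # illustrative subset; a real system would cover far more

    def __init__(self):
        super().__init__()
        self.blocks = []      # list of (tag, text) in document order
        self._current = None  # block tag currently being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.KEEP:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        # Text inside inline children (e.g. <em>) is kept; the markup is dropped.
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.blocks.append((tag, "".join(self._buffer).strip()))
            self._current = None


def html_to_jats_like(html_source: str) -> str:
    """Render an HTML source to a JATS-like XML string (hypothetical helper)."""
    collector = BlockCollector()
    collector.feed(html_source)
    collector.close()

    article = ET.Element("article")
    body = ET.SubElement(article, "body")
    sec = ET.SubElement(body, "sec")
    for tag, text in collector.blocks:
        name = "title" if tag == "h1" else "p"
        ET.SubElement(sec, name).text = text
    return ET.tostring(article, encoding="unicode")


# The HTML stays the thing people edit; the XML is regenerated on demand.
source = "<h1>Results</h1><p>HTML stays the editable source.</p>"
jats_like = html_to_jats_like(source)
print(jats_like)
```

The direction of travel is the point here: the HTML can be revised as often as needed, and the end-of-line output is simply rebuilt from it and then checked with whatever QA tools suit that format.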

For my part, obviously, I believe that format should be HTML. I do see that it will take some time for publishers and researchers to unanimously agree with me 🙂 But I’ve seen enough people come round to the idea over the years to believe that the HTML tide in scholarly publishing is steadily rising, and that it will eventually lift all ships…

Coda: It seems my post above was not clear (https://twitter.com/martin_eve/status/894866889343721472). I’m not arguing that we need JATS at all, except that ‘to segue to the future’ we need to support it. I believe the research world would be better off without both PDF and JATS. My point is: don’t build Manuscript Submission Systems that use JATS or PDF as the primary source file format. Use HTML and (where necessary, and unfortunately) niche formats like LaTeX, where you have very little chance of convincing authors to use anything else.