September 06, 2011 | By: Nick Morris

SOLO11: Day 1 - Workshops at #solo11

On day 1 of Science Online London (solo11) there were a number of workshops available and I decided to attend three:

Workshop 1: If you build it - APIs and developers

The session was hosted by: Chandran Honour (nature.com) and Ian Brown (Mashery.com)

This was a very busy and popular session. So popular that we had to move to a larger room!

The workshop kicked off with an important question: "Who in this room is a developer?" It turned out that most of the people present considered themselves developers, a promising start. This meant that most of the people in the room knew what an API (application programming interface) was and how to use one. However, things then went a little weird...

The group was split into a number of 'working parties' to discuss APIs in terms of what users wanted from them, how they wanted to use them, and so on. This was a good idea, but it was wasted as these parties never reported back on their discussions, so in most cases people in the session had no idea what had been discussed just a few feet away.

There then followed a series of presentations from the session hosts. Ian Brown asked the question "Why do people build APIs?" and then answered it with "Basically, to add value to data". He gave a good example of this by describing the iPad as a "presentation layer of the future". However, as it had already been established that most people in the room were developers, this was all slightly 'preaching to the choir'.

Chandran Honour then spoke about how Nature wanted to do "APIs better", and said that it would be launching a developer site, http://developers.nature.com, in the autumn of 2011. Basically, all of this was a PR job for what nature.com are planning to do.

Tony, also from nature.com, gave an overview of the Nature APIs. One interesting point that did emerge was that Nature is archiving scientific blogs other than its own; however, no indication was given of how these blogs were picked or which ones were included.
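
To make the 'developer' angle concrete, here is a rough sketch of the kind of client code an API like this enables. The endpoint, parameters and response fields below are invented for illustration only; they are not the actual nature.com API described in the session.

import json
import urllib.parse
import urllib.request

# Placeholder endpoint: NOT a real nature.com (or Mashery) URL.
BASE_URL = "https://api.example-publisher.com/v1/articles"

def search_articles(query, page=1, per_page=10):
    """Fetch article metadata matching a free-text query (hypothetical API)."""
    params = urllib.parse.urlencode({"q": query, "page": page, "per_page": per_page})
    with urllib.request.urlopen(f"{BASE_URL}?{params}") as response:
        return json.load(response)  # assume the service returns JSON

# "Adding value to data": raw metadata records become something a developer
# can re-present, e.g. as a reading list, a visualisation or a mash-up.
results = search_articles("open science")
for article in results.get("articles", []):
    print(article.get("doi"), "-", article.get("title"))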

Finally, Hilary (nature.com - can you see a pattern?) pretty much repeated all that had previously been said about http://developers.nature.com and announced that it would be supported by Mashery.com (which I had guessed).

Summary
I felt cheated by the session, as it was essentially a PR job by Nature for a future product. If this had been explicitly spelt out in the title of the workshop (e.g. "If you build it - Nature.com APIs and developers") I might have picked another session.

Workshop 2: Bridging the divide: Building around the PDF - Steve Pettifer, Philip McDermott

This was a very good session and definitely the best workshop I attended on day 1.

The session introduced a program called Utopia (Attwood TK, Kell DB, McDermott P, Marsh J, Pettifer SR, Thorne D. Utopia documents: linking scholarly literature with research data. Bioinformatics. 2010 Sep 15;26(18):i568-74) - but that alone doesn't fully do the session justice.

Steve Pettifer introduced the 'This is not a pipe' (René Magritte - "Ceci n'est pas une pipe") and 'The Hamburger' problems of PDFs of scientific papers: the PDF is NOT the science, but a representation of the science. The science is to the paper as the cow is to the hamburger. Basically, the problem posed was: how do you get from the PDF back to the underlying methods and the scientific data? So, if PDFs are so bad, why are they used? Well, PDFs are convenient, look nice, and act as a record of the science because they can't easily be altered (plus, once downloaded, publishers can't get them back).

One scary fact from the session is that two life-sciences papers are published every minute, so trying to understand, follow and curate all that information is a big problem. So, how do we get back from the PDF of the paper to the data? And is there a better way to get the data published?

So, where does the problem lie? With the publishers? The scientists? Somewhere else?

Scientific publishers are, in most cases, in the business of making money (the exception being society publishers), and scientists face a 'prisoner's dilemma' of sorts - publish or perish - since stepping out of the system means the scientist disappears. None of this is new, and it was really one of the underlying themes of SOLO11: how can non-traditional scientific publication be recognised and rewarded? As became evident in the last panel of the day (see SOLO11: Science Online London 2011 (#solo11) - day 1, panel 2 - "What's in it for us?"), grant-awarding bodies also play their part in this process, as they need a metric to show that they are getting good value for money, and the metric they tend to use is the traditional scientific publication.

Steve argued that a sort of deadly embrace exists between the scientists, the publishers and the grant-awarding bodies, and that there is no real incentive to break it up. The best we can do for now is to use tools such as Utopia to 'de-hamburger' the paper back to the science and the data.

How to 'de-hamburger'
This is not an easy process. Humans and machines speak different languages. A toddler can follow the simple instruction to sort something by size, whereas for a computer this is a very difficult concept and process.

A good example of this was given by Steve: take the sentence 'Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo', which is not as nonsensical as it may appear (see the Wikipedia article of the same name), but even for a human it is difficult to parse. What would a machine make of it?
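
To make the contrast concrete, here is a small sketch (my own example, not one from the session) showing why 'sort by size' is trivial for a machine once the data is structured, and why the real difficulty is recovering that structure from prose. The objects and the sentence are invented for illustration.

# Trivial for a computer: the sizes are explicit, machine-readable numbers.
objects = [
    {"name": "mitochondrion", "size_um": 1.0},
    {"name": "ribosome", "size_um": 0.025},
    {"name": "E. coli cell", "size_um": 2.0},
]
for item in sorted(objects, key=lambda o: o["size_um"]):
    print(f"{item['name']}: {item['size_um']} um")

# Much harder: the same information buried in a sentence, as in a PDF.
sentence = ("The cell was larger than the mitochondrion, "
            "which in turn dwarfed the ribosome.")
# There is no simple, reliable way to sort the entities in `sentence` by size
# without natural-language understanding - which is exactly the gap that makes
# scientific PDFs hard for machines to read.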

Scientific papers are not machine readable, and to fix the problem we need either revolution or evolution. However, as Steve pointed out, there is the deadly embrace between the scientist, the publisher and the grant-awarding body, and so it is very risky for a scientist to step outside this triad: it is publish or perish. Interestingly, these observations also built on the ideas in Michael Nielsen's keynote at the start of the day, which looked at why open science was struggling to be accepted.

The Utopia software that Steve was introducing can add links to data within the PDF, and can also link out to external sources of additional information. As was pointed out, this type of approach is not ideal: it relies in part on the computer correctly identifying the data and other things to link to in the text, and also on the journals (one example being the Biochemical Journal), and possibly other users, marking up the relevant sections of the papers.
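
As a rough illustration of the kind of identification step such a tool depends on (this is my own sketch, not Utopia's actual implementation), the snippet below spots UniProt-style accession numbers in text extracted from a paper and turns them into links. The pattern and the example text are deliberately simplified.

import re

# Simplified UniProt-style accession pattern, e.g. P69905 (illustrative only).
ACCESSION_PATTERN = re.compile(r"\b[OPQ][0-9][A-Z0-9]{3}[0-9]\b")

def link_accessions(text):
    """Return (accession, URL) pairs found in a chunk of extracted PDF text."""
    return [(acc, f"https://www.uniprot.org/uniprot/{acc}")
            for acc in ACCESSION_PATTERN.findall(text)]

extracted_text = "Haemoglobin subunit alpha (UniProt P69905) was purified..."
for accession, url in link_accessions(extracted_text):
    print(accession, "->", url)

# As noted in the session, this only works when the identifier is spotted
# correctly - hence the reliance on journals (and readers) marking papers up.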

(One interesting point that was made was that journals are no longer really journals in the sense of something in which you record your daily thoughts and work, and letters in journals are no longer really letters, as they are all peer reviewed. Does this make blogs the journals and letters of years gone by?)

Summary
Good session, and good demonstration. However, is this really just a sticking plaster for the problem? Or is there a better solution?

Workshop 3: How are wikis being used to carry out and communicate science? - Michael Peel, Henry Scowcroft, Alok Jha

Again, this was another session with a misleading title. The title should have been: "How IS WIKIPEDIA being used to carry out and communicate science?" (I thought it was kind of arrogant that the people from Wikipedia had confused Wikipedia with wikis in general.)

For me, only two really interesting things came out of the session:

1. Cancer Research UK has received training from Wikipedia in how to edit and manage articles. What was interesting was that the Wikipedia community reacted negatively to the edits, changes and corrections made by Cancer Research UK, and in a number of cases reverted articles back to their original form, as the community perceived the actions of the cancer charity as interference by a large corporation. As a result of Cancer Research UK's involvement, a number of other organisations are now receiving training and will get involved in the editing and creation of entries.

2. Alok - who writes for the Guardian and has also written two popular science books - said that he never quotes Wikipedia, but will often use it for leads (references) to the primary sources of information.
