Sunday, June 1, 2014

Skeptics continued ...

I've done more than my fair share of posting to Skeptics, enough to reach a reputation of 599, which is an average of 30 points a day.  I haven't actually been working that hard, though, since I got a bonus of 100 reputation points each for being a member of stackoverflow and of superuser.  Actually I have another alias on stackoverflow as well ... but let's not get into that right now!

Anyway, the point of today's post is that it has become somewhat tiresome to reformat references to add to one's post.  Sure, one can inline them, but we don't do this when writing papers for publication so I think it's better to just stick with what's accepted.

To show what I mean, let's pick a reference I used in one of my answers.  The pubmed link is http://www.ncbi.nlm.nih.gov/pubmed/23648697.

Now if you go there, you see this:

Gastroenterology. 2013 Aug;145(2):320-8.e1-3. doi: 10.1053/j.gastro.2013.04.051. Epub 2013 May 4.

No effects of gluten in patients with self-reported non-celiac gluten sensitivity after dietary reduction of fermentable, poorly absorbed, short-chain carbohydrates.

However, I need to turn it into this:

Biesiekierski JR, Peters SL, Newnham ED, Rosella O, Muir JG, Gibson PR. [No effects of gluten in patients with self-reported non-celiac gluten sensitivity after dietary reduction of fermentable, poorly absorbed, short-chain carbohydrates.](http://www.ncbi.nlm.nih.gov/pubmed/23648697) Gastroenterology 2013 Aug;145(2):320-8.e1-3. doi: 10.1053/j.gastro.2013.04.051. PubMed PMID: 23648697.

Now, we can get part of the way there by registering with NCBI from the pubmed pages.  Once you do that, you can use the "Send to" link at the top right to send the reference to "My Bibliography".  You can then log in to My NCBI, which gives you nicely formatted references, but not in markdown format, which is what we need for stackexchange.

With an idle Sunday afternoon on the 1st day of the Southern Hemisphere winter 2014, I decided to script a tool to do this.  Rather than parsing the html, I decided to look at the XML.  Now, if you're on the pubmed page for the above, you can change the display settings to XML using the display menu at the top left, which gives you http://www.ncbi.nlm.nih.gov/pubmed/23648697?report=xml&format=text but it's not really XML at all, just a text version of XML, so we would have to massage it further.  Furthermore, it doesn't contain the same data as their actual XML link, which is at a completely different address: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=xml&id=23648697

Now, this is something we can work with.  Thanks to the blogger where I found this information.

The current working script, written in Rebol3, is now at my github repository, and I used it to produce the output above.  I intend to turn it into a CGI script soon, once I have tested it somewhat more.
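For anyone who doesn't read Rebol, here is a minimal Python sketch of the same idea.  It is not a port of the Rebol3 script, just an illustration of pulling the fields out of the efetch XML and gluing them into a markdown reference.  The function names (`fetch_xml`, `author_name`, `to_markdown`) are made up for this example, and `SAMPLE` is a hand-trimmed fragment that keeps only the elements used here, not the full record NCBI returns.  It also assumes the record has a Year/Month publication date, a volume, an issue and page numbers, which not every record does.

```python
import urllib.request
import xml.etree.ElementTree as ET

EFETCH = ("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
          "?db=pubmed&retmode=xml&id={pmid}")
PUBMED = "http://www.ncbi.nlm.nih.gov/pubmed/{pmid}"


def fetch_xml(pmid):
    """Download the efetch record for one PubMed ID (needs network access)."""
    with urllib.request.urlopen(EFETCH.format(pmid=pmid)) as resp:
        return resp.read()


def author_name(author):
    """'LastName Initials', falling back to CollectiveName for group authors."""
    last = author.findtext("LastName")
    if last is None:
        return author.findtext("CollectiveName", default="")
    return f"{last} {author.findtext('Initials', default='')}".strip()


def to_markdown(xml_data):
    """Format the first article in an efetch response as a markdown reference."""
    cit = ET.fromstring(xml_data).find(".//MedlineCitation")
    art = cit.find("Article")
    pmid = cit.findtext("PMID")
    authors = ", ".join(author_name(a) for a in art.findall("AuthorList/Author"))
    # itertext() keeps the words inside any inline markup in the title
    title = "".join(art.find("ArticleTitle").itertext())
    journal = art.findtext("Journal/ISOAbbreviation")
    issue = art.find("Journal/JournalIssue")
    # Assumption: PubDate carries Year and Month (some records differ)
    date = " ".join(filter(None, (issue.findtext("PubDate/Year"),
                                  issue.findtext("PubDate/Month"))))
    where = f"{issue.findtext('Volume')}({issue.findtext('Issue')})"
    pages = art.findtext("Pagination/MedlinePgn")
    doi = art.findtext("ELocationID[@EIdType='doi']")
    ref = f"{authors}. [{title}]({PUBMED.format(pmid=pmid)}) "
    ref += f"{journal} {date};{where}:{pages}."
    if doi:
        ref += f" doi: {doi}."
    return ref + f" PubMed PMID: {pmid}."


# Hand-trimmed fragment for offline testing, NOT the full NCBI record
SAMPLE = """
<PubmedArticleSet><PubmedArticle><MedlineCitation>
<PMID Version="1">23648697</PMID>
<Article>
<Journal>
<JournalIssue><Volume>145</Volume><Issue>2</Issue>
<PubDate><Year>2013</Year><Month>Aug</Month></PubDate></JournalIssue>
<ISOAbbreviation>Gastroenterology</ISOAbbreviation>
</Journal>
<ArticleTitle>No effects of gluten in patients with self-reported non-celiac gluten sensitivity after dietary reduction of fermentable, poorly absorbed, short-chain carbohydrates.</ArticleTitle>
<Pagination><MedlinePgn>320-8.e1-3</MedlinePgn></Pagination>
<ELocationID EIdType="doi">10.1053/j.gastro.2013.04.051</ELocationID>
<AuthorList>
<Author><LastName>Biesiekierski</LastName><Initials>JR</Initials></Author>
<Author><LastName>Peters</LastName><Initials>SL</Initials></Author>
<Author><LastName>Newnham</LastName><Initials>ED</Initials></Author>
<Author><LastName>Rosella</LastName><Initials>O</Initials></Author>
<Author><LastName>Muir</LastName><Initials>JG</Initials></Author>
<Author><LastName>Gibson</LastName><Initials>PR</Initials></Author>
</AuthorList>
</Article>
</MedlineCitation></PubmedArticle></PubmedArticleSet>
""".strip()

if __name__ == "__main__":
    print(to_markdown(SAMPLE))
    # With network access: print(to_markdown(fetch_xml("23648697")))
```

Keeping the download (`fetch_xml`) separate from the formatting (`to_markdown`) means the formatting can be tried out on a saved record without hitting NCBI every time.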

In the end this script should also prove useful for any papers I write.  Some journals have their own peculiar requirements.  For instance, The New Zealand Medical Journal wants a maximum of 3 authors, with the remaining authors consigned to obscurity with an et al.  Pity, really, since the senior author is often one of those in the et al.
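A journal-specific rule like that is easy to bolt on.  As a rough sketch in Python (the function name `limit_authors` is hypothetical, and the exact punctuation around "et al" is an assumption, so check the journal's own style guide):

```python
def limit_authors(authors, max_authors=3):
    """Join author names, replacing any beyond max_authors with 'et al'."""
    if len(authors) <= max_authors:
        return ", ".join(authors)
    return ", ".join(authors[:max_authors]) + ", et al"


authors = ["Biesiekierski JR", "Peters SL", "Newnham ED",
           "Rosella O", "Muir JG", "Gibson PR"]
print(limit_authors(authors))
```

With the six authors of the paper above, only the first three survive, and the last-listed author is among those dropped.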

PS: CGI script is here http://www.rebol.info/cgi-bin/pubmed.cgi