Sometimes it pays to get up early, even on a holiday

July 4th, 2008

On the way to meet up with a few out of town friends at a diner for breakfast this morning, we got to hear a great interview on NPR with Laura Marling, a musician from England.  It’s only 7 minutes or so, give it a listen.

There were two parts of the short interview that really struck me. The song “Alas, I Cannot Swim”, and a quote that comes out of the discussion:

You know, you are what you can prove you’ve done. That’s how people judge you. You know, I’ve released an album so I can prove to people that I was a songwriter. People who’ve finished university can prove that they’ve got a degree and that they’ve worked with something. And I think like, I love so many things but I haven’t got any qualifications. Apart from the album; the album is my qualification I guess.

Nature takes a look at PLoS finances & business model

July 3rd, 2008

Yesterday, Nature published an article in which they take a look at the finances and business model of one of the major Open Access publishing houses, PLoS. The article is generating a buzz in the blogs, mainly because of a perceived negative slant that many are chalking up to conflict of interest. Let’s use the time-honored tradition of pulling some quotes out of context to look for the bias. First, the byline:

Science-publishing firm struggles to make ends meet with open-access model

Ok, sounds negative to me…

relying on bulk, cheap publishing of lower quality papers to subsidize its handful of high-quality flagship journals.

Insult (they publish crap) followed by an implicit compliment that they have high-quality journals.

But its financial future is looking brighter thanks to a cash cow in the form of PLoS One, an online database that PLoS launched in December 2006

Use of the term “cash cow” and calling PLoS One a “database”… ok now we’re firmly in the realm of hatchet work.  I’m not even going into the repeated F.U.D. stirring of the “open access means huge author fee” pot that the article does as well.

Ok, so I think I agree that the article is sort of unnecessarily rude and demeaning, but I wouldn’t really expect anything different from a for-profit publisher. The worst part is that everything Dr. Butler tries to imply is a failing of PLoS has been done many times over in the closed-access for-profit journal community.

Right, so let’s try to look past the blatant attack and take a look at the actual facts, shall we? PLoS has received about $17 million in grants, and last year spent $6.7 million against revenues of $2.86 million. The high number of papers being published in PLoS One is bringing in a lot of revenue.

So, what this is saying is that PLoS has made a sound business decision to release and promote PLoS One, as it’s helping their financial situation. The implication that the articles published in PLoS One are sub-standard is fallacious, and shows a complete misunderstanding of this new method of crowd-sourced peer review. A paper is made available in PLoS One as long as the board of reviewers finds that it is methodologically sound. After it is available, anyone in the community is free to comment, rate, and discuss the paper in a public forum which is permanently attached to that paper. In other words, if I wonder what the community thinks of the paper, all I have to do is look to the sidebar. This is a vast improvement over standard papers, for which I have no real indication of community acceptance. Perhaps a paper was published because it managed to “slip through” past a sympathetic reviewer, but is widely considered flawed. You have no clear way of knowing this, especially as a young grad student. With commented, rated, and annotated PLoS One papers, however, this is clear from the moment you first read the manuscript.

I’m going on a bit of a tangent on journal quality here, so let’s get back to finances.  To my eyes, the facts of the article indicate that PLoS is actually not doing too poorly financially.  As PLoS One becomes even more popular and more people appreciate the freedom of publishing in the “top tier” journals as well, they should move quickly towards breaking even.  And although I’m sure Nature and other profit-hungry journals might look down on this business performance, I think it will be a great day for OA and the larger availability of scientific knowledge.

EDIT: Bora has a good roundup of all of the other commentary on this (apparently quite inflammatory) article.

Identi.ca launches open source Twitter lookalike

July 2nd, 2008

The number of microblogging services out there keeps growing. While Twitter went through a burst of popularity (and is still in heavy use), frequent service outages have users looking elsewhere. There is a moderately active group over at FriendFeed, and via that resource I’ve heard of a new microblog called Identi.ca. This one looks quite a bit like Twitter (140-character text messages); however, it is open source and content is released under a Creative Commons license. Being so new, it is light on features for the moment. As soon as they release an API, I’ll try to work on some manner of WordPress integration.

The good news is that since it is open source, it will likely be possible to hook into Twitter, meaning that you don’t necessarily have to leave behind any contacts you have on that service. Indeed, I expect this to be one of the first features that the community develops, if the actual dev team doesn’t do it first.

Meanwhile, follow me.

Shameless and unsolicited plug for BioMedCentral

July 2nd, 2008

The plethora of job listings over at BioMedCentral is almost enough to make me want to cross the pond. They have positions available in production, web work, editorial roles, and sales.

If you are in the U.K. and interested in OA jobs, make sure to give them a look. They also seem to be one of the better companies at keeping their job offer pages up to date, so even if there is nothing that strikes your fancy now it might be worthwhile to check back later.

Full disclosure: I had some extremely preliminary contact with some people over at BMC in regards to jobs, but this was before they realized that I was a yankee. I had of course been looking at their website for some time before this in the hope of finding a telecommuting-type position. They didn’t seem to be against remote employment, but I’m not entirely sure how the logistics work out for such a long separation.

Playing the waiting game.

July 1st, 2008

Yesterday was the closing date for applications on a job prospect that I’m hoping for. I don’t think I’ve mentioned in this space which job this is (I think I’m starting to get superstitious about these things), but it’s an editorial position at an OA journal. The job offer is relatively close to one of Mrs. PA’s post-doc prospects, and therefore it’s just sort of a conglomeration of good things(tm). Of course this makes me even more nervous about the whole thing.

Add onto that the tough spot we’re in. Mrs. PA defended her thesis a few months ago, unfortunately without having a job lined up in advance. Since then it’s been a ticking clock until her contract runs out, which is very soon now. Because of that hard deadline, there is some urgency to resolve the situation as soon as possible. We’d prefer not to commit to living apart for a terribly long time, and the two jobs we’re hoping for here are the first that we’ve found close enough to one another so that we can live together.

The situation has made it tougher for me to get off the fence with my graduate school trajectory as well. I feel like if we had a concrete lead on where we were going next, it would ease my decision on how best to finish up my work here. The nebulous situation that we’re in, without any sort of specific location or date of a move, makes it tougher to set hard deadlines in my own work.

What makes the situation even a little more bothersome is that we might be doing something similar in a few years, when Mrs. PA finishes her post-doc. Once again, this is a selling point of the location that our current prospects are in; it’s a major area, and hopefully she’d be able to find gainful employment there without us having to move again.

There have been one or two other times that I felt we were close to sorting all of this out, and those have all fallen through so far. It makes me a bit pessimistic, but at the same time one can always hope.

A weekend of hiking and old games

June 30th, 2008

I spent the end of last week wrestling with Django and not getting very far. I’m trying to recode this site more or less from the ground-up in order to implement a few features that I’d like. Unfortunately I seem to progress rather quickly through the verbose error messages to the non-verbose and impossible to decipher sort. After a few days of this I was in need of a nice break.

Fortunately the weekend came around, and it was off to the woods for another round of geocaching. We chose 4 sites in a state park near us and took off, finding 3 of the 4. We probably would have gotten all of them if we hadn’t bitten off a bit more than we could chew for the actual hiking portion; we ended up going something like 6 or 7 miles if the park map is to be believed.

After collapsing back at the house, I decided to install and play around with some old computer games. First I gave Space Quest 6 a shot, but that one seems to have a bug that makes a certain portion unpassable. After that I installed Grim Fandango, a LucasArts puzzle game. It was a lot of fun (although frustrating as only these types of games can be) to play through some of that with Mrs. PA.

I also installed two demos of newer games off of Steam: Overlord and Audiosurf. Overlord received some good reviews in the gaming press when it came out a while back, but the demo was sort of “meh”, and I think at $39.99 I’m going to pass. Audiosurf, on the other hand, was a lot of fun. This $10 game allows you to play sort of a musical tetris, with your own audio files as the source material. You select one of your MP3s, and the game auto-generates a level based on the sonic profile. You then fly your ship through the level, collecting colored blocks in different lanes to make rows. It’s a bit hard to describe, but I highly recommend checking it out.

Oh, and I also got to watch Spain beat Germany in Euro 2008 :) It was a nice weekend! Now the week is upon us once again, so it’s off to the lab…

ChemSpider tantalizes me with the promise of article markup tools

June 26th, 2008

From the ChemSpider blog:

We are adding our finishing touches to some markup tools for Open Access articles at present and they will unveil shortly.

Don’t leave me hanging! Are these automated tools that put XML all over the manuscript, making it facile to slice and dice the articles as we please? Or do the tools you’ve developed largely do the slicing and dicing themselves?

Regardless, I’ll be keeping my eye on further developments.

More fun & games with ElementTree

June 25th, 2008

I’m still deeply in love with ElementTree. Here’s a script that will give you all the lines of a given player. Note that the import area has changed, as on this machine I’m using Python 2.4:

#!/usr/bin/python

# Initialization
import elementtree.ElementTree as ET
import os, string

# Read in our XML file
infile = raw_input("Input XML file name > ")
xmltree = ET.ElementTree(file=infile)
rootelem = xmltree.getroot()

speaker_find = raw_input("Which player's lines do you wish? > ")

act_list = rootelem.findall('ACT')
for act in act_list:
	scene_list = act.findall('SCENE')
	for scene in scene_list:
		speech_list = scene.findall('SPEECH')
		for speech in speech_list:
			speaker_list = speech.findall('SPEAKER')
			for speaker in speaker_list:
				if speaker.text == speaker_find.upper():
					print speaker.text
					lines = speech.findall('LINE')
					for line in lines:
						print line.text
					print ''

and the output:

Input XML file name > hamlet.xml
Which player's lines do you wish? > osric
OSRIC
Your lordship is right welcome back to Denmark.

OSRIC
Sweet lord, if your lordship were at leisure, I
should impart a thing to you from his majesty.

OSRIC
I thank your lordship, it is very hot.

OSRIC
It is indifferent cold, my lord, indeed.

OSRIC
Exceedingly, my lord; it is very sultry,--as
'twere,--I cannot tell how. But, my lord, his
majesty bade me signify to you that he has laid a
great wager on your head: sir, this is the matter,--

OSRIC
Nay, good my lord; for mine ease, in good faith.
Sir, here is newly come to court Laertes; believe
me, an absolute gentleman, full of most excellent
differences, of very soft society and great showing:
indeed, to speak feelingly of him, he is the card or
calendar of gentry, for you shall find in him the
continent of what part a gentleman would see.

OSRIC
Your lordship speaks most infallibly of him.

OSRIC
Sir?

OSRIC
Of Laertes?

OSRIC
I know you are not ignorant--

OSRIC
You are not ignorant of what excellence Laertes is--

OSRIC
I mean, sir, for his weapon; but in the imputation
laid on him by them, in his meed he's unfellowed.

OSRIC
Rapier and dagger.

OSRIC
The king, sir, hath wagered with him six Barbary
horses: against the which he has imponed, as I take
it, six French rapiers and poniards, with their
assigns, as girdle, hangers, and so: three of the
carriages, in faith, are very dear to fancy, very
responsive to the hilts, most delicate carriages,
and of very liberal conceit.

OSRIC
The carriages, sir, are the hangers.

OSRIC
The king, sir, hath laid, that in a dozen passes
between yourself and him, he shall not exceed you
three hits: he hath laid on twelve for nine; and it
would come to immediate trial, if your lordship
would vouchsafe the answer.

OSRIC
I mean, my lord, the opposition of your person in trial.

OSRIC
Shall I re-deliver you e'en so?

OSRIC
I commend my duty to your lordship.

OSRIC
Ay, my good lord.

OSRIC
A hit, a very palpable hit.

OSRIC
Nothing, neither way.

OSRIC
Look to the queen there, ho!

OSRIC
How is't, Laertes?

OSRIC
Young Fortinbras, with conquest come from Poland,
To the ambassadors of England gives
This warlike volley.

It’s pretty great… 25 lines of code, and I’m still doing things in sort of a long way. For instance, I’ve already figured out how to rewrite the counting script from my previous post using the getiterator function:

#!/usr/bin/python

# Initialization
import elementtree.ElementTree as ET
import os, string

# Read in our XML file
infile = raw_input("Input XML file name > ")
xmltree = ET.ElementTree(file=infile)
rootelem = xmltree.getroot()

speaker_find = raw_input("Which player's lines do you wish? > ")

i = 0
speaker_list = xmltree.getiterator('SPEAKER')
for speaker in speaker_list:
	if speaker.text == speaker_find.upper():
		i = i + 1
print i

This steps through all the levels of the XML document for us, so there isn’t any need to do that manually. It’s facile to rewrite the “Player’s Lines” script similarly:

#!/usr/bin/python

# Initialization
import elementtree.ElementTree as ET
import os, string

# Read in our XML file
infile = raw_input("Input XML file name > ")
xmltree = ET.ElementTree(file=infile)
rootelem = xmltree.getroot()

speaker_find = raw_input("Which player's lines do you wish? > ")

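speeches = xmltree.getiterator('SPEECH')
for speech in speeches:
	# A SPEECH can list more than one SPEAKER, so check all of them
	# (find() alone would only ever look at the first one)
	speakers = speech.findall('SPEAKER')
	lines = speech.findall('LINE')
	for speaker in speakers:
		if speaker.text == speaker_find.upper():
			print speaker.text
			for line in lines:
				print line.text
			print ''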
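
For anyone reading this on a newer Python: getiterator was deprecated and then removed in Python 3.9, and iter() does the same job. Below is a rough Python 3 sketch of the same idea, not a drop-in replacement for the script above. The helper name speeches_for and the tiny SAMPLE document are my own inventions for illustration; the sample only mimics the ACT/SCENE/SPEECH/SPEAKER/LINE layout and is not the real hamlet.xml.

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical sample in the same ACT/SCENE/SPEECH/SPEAKER/LINE layout
SAMPLE = """
<PLAY>
  <ACT>
    <SCENE>
      <SPEECH>
        <SPEAKER>OSRIC</SPEAKER>
        <LINE>Sir?</LINE>
      </SPEECH>
      <SPEECH>
        <SPEAKER>HAMLET</SPEAKER>
        <LINE>What's his weapon?</LINE>
      </SPEECH>
      <SPEECH>
        <SPEAKER>OSRIC</SPEAKER>
        <LINE>Rapier and dagger.</LINE>
      </SPEECH>
    </SCENE>
  </ACT>
</PLAY>
"""


def speeches_for(root, name):
    """Return one list of line texts per speech spoken by `name`."""
    wanted = name.upper()
    found = []
    # iter() walks every level of the tree, as getiterator() used to
    for speech in root.iter("SPEECH"):
        speakers = [s.text for s in speech.findall("SPEAKER")]
        if wanted in speakers:
            # LINE.text is None when a line starts with a child element
            found.append([line.text or "" for line in speech.findall("LINE")])
    return found


root = ET.fromstring(SAMPLE)
for speech in speeches_for(root, "osric"):
    print("OSRIC")
    print("\n".join(speech))
    print()
```

Returning a list of lists instead of printing inside the loop is just a design preference: it keeps the tree walking separate from the output, so the same function could feed a counter or a file writer later.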

So far I’m still learning a lot with every poke at ElementTree. I’m sure I’ll be adding more as I go.

Hope you find this sort of thing interesting.

Using Python to parse XML is easier than it should be

June 24th, 2008

A few months back when I was just starting to poke around with Python, I saw this XKCD comic come through my RSS feed (my apologies if this clashes with the right hand sidebar; maximizing your window might help):
[XKCD comic: “import soul”]
At the time, I thought it was sort of funny, more for the complete nerdiness of creating a pet from an Eee PC and a hamster ball than anything else. The kicker at the end about importing a soul was just icing.

I bring this up because in preparation for the Elsevier Article 2.0 Challenge coming up in September, I wanted to start spending more time learning how to handle XML files. Since Python has become my language of choice (ok, full honesty - it’s the only language I can speak at all really, and even then only in primitive grunts), I wanted to see how hard it would be to work up an XML parser. It’s really easy. You just have to import it.

import xml.etree.ElementTree as ET

I wrote a very very simple and short script just to make sure that it was as easy as I thought it was, and sure enough this is the case.

xmlparser.py

# The one import we need
import xml.etree.ElementTree as ET

# Read in our XML file
infile = raw_input("Input XML file name > ")
xmltree = ET.ElementTree(file=infile)
rootelem = xmltree.getroot()

print "This should be root_element"
print rootelem.tag

print "This should print two subelement tags"
for subelement in rootelem:
	print subelement.tag

print "This should print out the content of the sub elements"
for subelement in rootelem:
	print subelement.text

And I used a self-generated test file, test.xml:

<root_element>
	<sub_element>This is a sub element</sub_element>
	<sub_element id="2">This is a sub element with the ID set to "2"</sub_element>
</root_element>

and the output pretty much matches what you would guess:

Input XML file name > test.xml
This should be root_element
root_element
This should print two subelement tags
sub_element
sub_element
This should print out the content of the sub elements
This is a sub element
This is a sub element with the ID set to "2"
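
The test file sets an id attribute on the second sub element, but the script above never reads it. As far as I can tell, attributes are just as easy to get at: each element has a get() method and an attrib dictionary. Here is a small sketch in Python 3 syntax, with the test file pasted in as a string (the TEST_XML and ids names are mine) so it runs on its own:

```python
import xml.etree.ElementTree as ET

# Same content as test.xml, inlined so the example is self-contained
TEST_XML = """<root_element>
    <sub_element>This is a sub element</sub_element>
    <sub_element id="2">This is a sub element with the ID set to "2"</sub_element>
</root_element>"""

root = ET.fromstring(TEST_XML)

ids = []
for sub in root:
    # get() returns None (or a default you supply) when the attribute is absent
    ids.append(sub.get("id"))
    print(sub.tag, sub.get("id", "no id"), sub.attrib)
```

Attribute values always come back as strings, so the id here is "2" rather than the number 2.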

This took all of about 10 minutes to do… I’m still sort of stunned. I’m sure the programmers/Python jockeys are laughing right now, but c’est la vie I suppose.

I mean, it’s really almost frighteningly simple.  Let’s try playing with Hamlet, available online in XML format of course.  We can write a quick script to count how often Rosencrantz speaks:

#!/usr/bin/python

# Initialization
import xml.etree.ElementTree as ET

# Read in our XML file
infile = raw_input("Input XML file name > ")
xmltree = ET.ElementTree(file=infile)
rootelem = xmltree.getroot()

i = 0
act_list = rootelem.findall('ACT')
for act in act_list:
	scene_list = act.findall('SCENE')
	for scene in scene_list:
		speech_list = scene.findall('SPEECH')
		for speech in speech_list:
			speaker_list = speech.findall('SPEAKER')
			for speaker in speaker_list:
				if speaker.text == "ROSENCRANTZ":
					i = i + 1

print i

It’s 49, in case you are wondering.
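
If you would rather have a tally for every player at once instead of hard-coding one name, something like the following should work. It is a sketch in Python 3 syntax, and the speech_counts name and the little SAMPLE document are made up for illustration (placeholder lines, not the real play). It leans on iter() to walk the whole tree and on collections.Counter to do the counting. Like the script above, it counts SPEAKER elements, so a speech shared by two players counts once for each.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature play with placeholder lines
SAMPLE = """
<PLAY>
  <ACT>
    <SCENE>
      <SPEECH>
        <SPEAKER>ROSENCRANTZ</SPEAKER>
        <LINE>first placeholder line</LINE>
      </SPEECH>
      <SPEECH>
        <SPEAKER>GUILDENSTERN</SPEAKER>
        <LINE>second placeholder line</LINE>
      </SPEECH>
      <SPEECH>
        <SPEAKER>ROSENCRANTZ</SPEAKER>
        <SPEAKER>GUILDENSTERN</SPEAKER>
        <LINE>third placeholder line</LINE>
      </SPEECH>
    </SCENE>
  </ACT>
</PLAY>
"""


def speech_counts(root):
    """Count SPEAKER elements by name, at any depth in the tree."""
    return Counter(speaker.text for speaker in root.iter("SPEAKER"))


root = ET.fromstring(SAMPLE)
counts = speech_counts(root)
for name, total in sorted(counts.items()):
    print(name, total)
```

The path expression root.findall('.//SPEAKER') finds the same elements, in case you prefer getting a list back.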

I’m pretty excited about my experimentation with ElementTree so far. As usual I’ve got a ton to learn, but it’s great to know that this powerful tool was lurking inside of Python the whole time.

All right, this is getting a little scary

June 23rd, 2008

There is a date on the calendar I can point to.  As of this writing, on that day:

  • We won’t have a place to live (lease runs out on our house)
  • Which is sort of good, because our household income will consist of a single grad student stipend

Needless to say, since that date is less than 3 months away, I’m a little nervous these days.

Mrs. PA and I are going into major job hunt mode.  She’s got a few candidates kicking around, but unfortunately I have work to do on getting my grad school situation sorted.  I’ve got a few things out there, but I think it’s (past) time for “the talk” with my adviser, and also a redoubling of my own hunt for a position.

More later.