Thoughts on open notebooks for software scientists
Open notebook science
People are increasingly interested in open notebooks: lab notes that scientists make available for anyone to view. There seems to be a split in features, though I'm not sure whether it reflects what people want or what the tools offer.
On one hand, people seem to get by with what are essentially text repositories: simple wikis (TiddlyWiki, MediaWiki), note-taking apps (from Stickies to Evernote), or blogs (WordPress). These are great: after all, most of our lab notes are just text. I don’t work much with hardware or wetware, so my own work doesn’t raise the question, but I’m curious how the translation is made from the physical to the digital. Do people take photos? Sketches? How do these translate into the digital domain? Is a simple wiki or blog enough?
The other side of things is more of a ‘life-stream’, as demonstrated by Cameron Neylon. Here, every research activity is streamed onto the Web: papers read, code committed, presentations given, and so on. The primary technologies are RSS/Atom feeds, FriendFeed, and Twitter. For example, my readings are available via Mendeley and my source code on GitHub, all synchronized via FriendFeed. Perhaps more importantly, each of these activities has a (more or less) permanent resource locator, enabling people to re-assemble my work or link to pieces of it.
The difference is between tracking specific experimental activities (move sample A to Erlenmeyer flask B) and research activities in the large. As a subscriber with non-specific interests (what does Cameron do for his day job, anyway?), I prefer the latter approach: I can see the large-scale activities without being bothered with the minutiae. But if I’m a close colleague (or competitor), then the small-scale notes are more useful; now I want information that will help me replicate the study. The audience for the small-scale notes is probably fewer than 10, and at least 1 (yourself, I’m assuming).
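To make the life-stream idea concrete, here is a minimal sketch of merging several Atom feeds into a single newest-first activity stream, roughly the job the post assigns to FriendFeed. It is only an illustration under stated assumptions: the two sample feeds, their `example.org` URLs, and the function names `parse_atom` and `merge_streams` are hypothetical, and real feeds from Mendeley or GitHub may carry different fields.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace

# Hypothetical sample feeds; real services will publish richer documents.
CODE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Commit: fix parser</title>
    <link href="https://example.org/code/commit/1"/>
    <updated>2010-03-02T10:00:00Z</updated>
  </entry>
</feed>"""

READING_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Read: a paper on mining repositories</title>
    <link href="https://example.org/library/paper/42"/>
    <updated>2010-03-03T09:30:00Z</updated>
  </entry>
</feed>"""


def parse_atom(source_name, xml_text):
    """Return one dict per <entry>; assumes every entry has an <updated> element."""
    root = ET.fromstring(xml_text)
    items = []
    for entry in root.iter(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        stamp = entry.findtext(ATOM + "updated").replace("Z", "+00:00")
        items.append({
            "source": source_name,
            "title": entry.findtext(ATOM + "title", default=""),
            # The link is the (more or less) permanent locator for the activity.
            "url": link.get("href") if link is not None else None,
            "updated": datetime.fromisoformat(stamp),
        })
    return items


def merge_streams(feeds):
    """Merge a {name: atom_xml} mapping into one newest-first list of activities."""
    merged = []
    for name, xml_text in feeds.items():
        merged.extend(parse_atom(name, xml_text))
    return sorted(merged, key=lambda item: item["updated"], reverse=True)


stream = merge_streams({"code": CODE_FEED, "reading": READING_FEED})
for item in stream:
    print(item["updated"].date(), item["source"], item["title"], item["url"])
```

Because each merged item keeps its own URL, a reader can follow the stream at a high level and still link to any individual piece of work.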
Neil’s Open Notebook
What would an open notebook look like for my work? I’ll look at my objectives in using one, my ‘sciency’ activities, what I do now, and possible obstacles.
Objective
To place my “research activities” online, to track my methodologies, to enable others to see and copy what I do, and to keep a record of what I have done.
What I do (and how to open each activity):
- Brainstorm
  - Twitter questions; whiteboard meetings with digital photos; email logs posted
- Logics and proofs
  - Publish figures; publish theorems and proofs; write papers. Add inline references to previous work or definitions (e.g., similar proofs of non-monotonicity).
- Reading journals and conference papers
  - Publish commentaries on blogs; RSS feeds of papers added to library
- Data collection and manipulation
  - Blog post on procedures; source code posted; archive data online (e.g., PromiseData); save workflow in repeatable form
- Paper writing
  - Write paper on GitHub with LaTeX source available; write introductory blog post; post completed drafts; post figures
- Figures and presentations
  - Post on SlideShare; post on Flickr or Picasa
- Coding
  - GitHub projects; tarred files on personal web page; Python virtualenv to host ‘contextualized’ lab environments
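The ‘contextualized’ lab environment mentioned under Coding could look something like the sketch below. Note the substitution: it uses the standard library's `venv` module rather than the third-party virtualenv tool named above, simply so the example needs nothing installed. The function name `create_lab_environment` and the `.lab-env` directory name are hypothetical choices for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_lab_environment(project_dir, env_name=".lab-env"):
    """Create an isolated Python environment inside a project directory.

    The directory name is an assumption for this sketch; any name works.
    """
    env_path = Path(project_dir) / env_name
    # with_pip=False skips installing pip, which keeps the sketch quick;
    # pass with_pip=True to get a pip inside the environment.
    venv.create(env_path, with_pip=False)
    return env_path


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        env = create_lab_environment(tmp)
        print((env / "pyvenv.cfg").read_text())
```

Keeping one environment per project means the interpreter and libraries behind an analysis stay attached to that project, which makes it easier for someone else to rebuild the same setup.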
Current procedures
- TiddlyWiki notes for early-stage ideas or shots in the dark (my safe place).
- LaTeX documents for creating publications. Includes plenty of formulas.
- GitHub for some projects. Post source code and (depending on size) data.
- PDFs on my website of past projects. Defeat the paywalls, increase citability.
- Blog posts (e.g. msr tag). Currently this is more retrospective, sharing in the hopes it saves someone else time.
- Posts on Daytum – track personal activities and productivity, e.g. amount of time spent on projects.
- RSS feed of Mendeley activity – lists papers I’ve read.
Possible obstacles
- collaborators won’t want everything open and accessible
- IP theft
- forgetting to update feeds
- diversion from ‘publishable’ activities (an open notebook won’t get you tenure, and may get you fired).
Other software scientists using open notebooks (if you have more, please let me know)
- Abram Hindle – notes on projects
- Aran Donohue – ‘open thesis’
- PROMISE data repository