<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Semantic Werks &#187; science 2.0</title>
	<atom:link href="http://neilernst.net/tag/science-2-0/feed/" rel="self" type="application/rss+xml" />
	<link>http://neilernst.net</link>
	<description>Thoughts on people, machines and systems.</description>
	<lastBuildDate>Mon, 30 Jan 2012 19:08:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='neilernst.net' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://1.gravatar.com/blavatar/1f438fa0e70f195d1594a7a5a6b0aaed?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Semantic Werks &#187; science 2.0</title>
		<link>http://neilernst.net</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://neilernst.net/osd.xml" title="Semantic Werks" />
	<atom:link rel='hub' href='http://neilernst.net/?pushpress=hub'/>
		<item>
		<title>More on open notebooks</title>
		<link>http://neilernst.net/2010/02/13/more-on-open-notebooks/</link>
		<comments>http://neilernst.net/2010/02/13/more-on-open-notebooks/#comments</comments>
		<pubDate>Sun, 14 Feb 2010 00:31:11 +0000</pubDate>
		<dc:creator>Neil</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[open notebook]]></category>
		<category><![CDATA[r]]></category>
		<category><![CDATA[science 2.0]]></category>
		<category><![CDATA[wave]]></category>

		<guid isPermaLink="false">http://neilernst.net/?p=1062</guid>
		<description><![CDATA[I recently posted about what an open notebook in software science might look like. I think I confused a life stream (where life == work) with a notebook. From what I&#8217;ve seen looking at projects like OpenWetWare, they seem more like Trac or GitHub than a FriendFeed account. You get a wiki to write on, image handling, [...]]]></description>
			<content:encoded><![CDATA[<p>I <a href="http://neilernst.net/2010/01/28/thoughts-on-open-notebooks-for-software-scientists/">recently posted</a> about what an open notebook in software science might look like. I think I confused a life stream (where life == work <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  with a notebook. From what I&#8217;ve seen looking at projects like OpenWetWare, they seem more like Trac or GitHub than a FriendFeed account. You get a wiki to write on, image handling, etc., but it isn&#8217;t automated: you have to enter all the data yourself.</p>
<p>This is incredibly useful, but am I right in thinking it is similar to tools software engineers have known for decades? It seems like the innovations are in collaborative editing, version control, and digital data.</p>
<p>What I was imagining was more automatic: whenever your microarray machine ran an experiment, it would auto-enter the results on your open notebook. Similarly for code you might run for statistical analysis (like the R workspace question I raised earlier).</p>
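<p>A minimal sketch of that kind of automatic entry, written in Python rather than R purely for illustration. Everything here is an assumption: the <code>NOTEBOOK</code> list is a hypothetical in-memory stand-in for a public notebook, and <code>recorded</code> and <code>mean</code> are made-up names, not part of any existing tool.</p>

```python
import functools
from datetime import datetime, timezone

# Hypothetical stand-in for an open notebook: a real one would be a wiki page
# or a public repository rather than an in-memory list.
NOTEBOOK = []


def recorded(func):
    """Wrap an analysis step so every call is logged without manual entry."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        NOTEBOOK.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "step": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "result": result,
        })
        return result
    return wrapper


@recorded
def mean(values):
    """A toy analysis step: the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


mean([2.0, 4.0, 6.0])
for entry in NOTEBOOK:
    print(entry["when"], entry["step"], entry["result"])
```

<p>The point of the decorator is that the person running the analysis never writes the entry by hand: the record is a side effect of doing the work.</p>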
<p>I like the idea of &#8216;recording&#8217; what you HAVE done (not what you will do, which is more brainstorming, mind-mapping, whiteboarding, etc.). It is a very important part of selfish science, which is to say, self-replication (presumably the sine qua non of scientific reproducibility). Here are a few features I think are useful for personal lab notes:</p>
<ul>
<li>A wiki with dates. </li>
<li>Separate entries. </li>
<li>Graphviz-Dot conversion. </li>
<li>Semantic markup. </li>
<li>Inline photos.</li>
<li>Inline LaTeX.</li>
</ul>
<p>I&#8217;m not saying these notebooks have no value: clearly they do. But I think there is a lot more that could be done with the concept. Particularly using linked data <span style="color:#888888;">(oh noes! the semantic web!)</span> to import other researchers&#8217; results.</p>
<p>What we really want is a list of steps &#8211; some small <a href="http://friendfeed.com/axiomsofchoice/85b96f76/whats-smallest-possible-scientific-artefact">&#8216;unit of science&#8217;</a> that can be repeated. We should show this using process models, so we can model loops and branches, and possibly execute and recompose them. <a href="http://wave.google.com/">Google Wave</a> is touted as the best thing for this, and I think it&#8217;s true. SAP has a <a href="http://www.sapweb20.com/blog/2009/10/sap’s-gravity-prototype-business-collaboration-using-google-wave/">version of its business process editor</a> in Wave, and Google <a href="http://www.whatisgooglewave.com/2010/02/11/my-extension-wish-workflow-in-wave/">itself sees a need for it</a>.  Wave&#8217;s collaboration feature is useful, but I don&#8217;t think it is the real advantage &#8211; yet. Right now, Wave&#8217;s support for version control (well, history) and its ability to incorporate agents/bots and arbitrary JavaScript extensions are more useful. For example, someone has written <a href="http://www.scienco.org/2009/watexy-latex-robot-for-google-wave/">&#8216;Watexy&#8217;</a>, a Wave bot which can interpret LaTeX equations.</p>
<p>It&#8217;s truly an exciting time to be working in science. </p>
]]></content:encoded>
			<wfw:commentRss>http://neilernst.net/2010/02/13/more-on-open-notebooks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>43.659070 -79.396681</georss:point>
		<geo:lat>43.659070</geo:lat>
		<geo:long>-79.396681</geo:long>
		<media:content url="http://1.gravatar.com/avatar/141272f109fbf660ffa001647f17d368?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fink08</media:title>
		</media:content>
	</item>
		<item>
		<title>Open science and workflows</title>
		<link>http://neilernst.net/2010/02/01/open-science-and-workflows/</link>
		<comments>http://neilernst.net/2010/02/01/open-science-and-workflows/#comments</comments>
		<pubDate>Mon, 01 Feb 2010 16:41:43 +0000</pubDate>
		<dc:creator>Neil</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[climate change]]></category>
		<category><![CDATA[literate programming]]></category>
		<category><![CDATA[science 2.0]]></category>

		<guid isPermaLink="false">http://neilernst.net/?p=1039</guid>
		<description><![CDATA[I was talking to Jon Pipitone about scientific computing. For a long time this field was mired in the relatively obscure (yet vitally important) field of numerical analysis. Now, however, with the attention generated by &#8216;ClimateGate&#8217; and open-source software, interest in scientific computing &#8212; by which is typically meant computing for scientific disciplines, such [...]]]></description>
			<content:encoded><![CDATA[<p>I was talking to <a href="http://skoolr.blogspot.com">Jon Pipitone</a> about scientific computing. For a long time this field was mired in the relatively obscure (yet vitally important) field of <a href="http://en.wikipedia.org/wiki/Numerical_analysis">numerical analysis</a>. Now, however, with the attention generated by &#8216;ClimateGate&#8217; and open-source software, interest in scientific computing &#8212; by which is typically meant computing for scientific disciplines, such as biology, chemistry, physics, and in particular, the software supporting that computing &#8212; has grown, particularly with respect to the repeatability of computational experiments. An excellent introduction is the <a href="http://research.microsoft.com/en-us/collaboration/fourthparadigm/">Microsoft Research report</a> on &#8220;4th Paradigm science&#8221;.</p>
<p>Spurred on by a <a href="http://blog.okfn.org/2010/01/28/clear-climate-code-and-data/">post by programmers who have converted</a> relatively opaque C/Fortran code to Python, I wondered what other such projects might be around. The goal is to make the procedures followed more open and understandable to laypeople (as much as that might be possible &#8212; just because we know what rain is doesn&#8217;t mean we are all climatologists).</p>
<p>I asked him what might be worth trying to convert:</p>
<blockquote><p>A &#8230; particularly nasty, but possible idea would be to convert a single fortran module from an existing climate model over into python, and then use some fancy python-fortran bridge to make the two talk to each other.  That way you could slowly convert a model over to python. You&#8217;d be forced into, at least partially, keeping the original model architecture.  That wouldn&#8217;t be ideal, but at least you&#8217;d know you were being true to the model (because you could compare output).</p>
<p>Sounds nasty to me.  If you were considering rewriting a chunk of a model, I&#8217;d suggest starting with NASA&#8217;s ModelE (or a newer version). It&#8217;s the simplest and littlest, big GCM I&#8217;ve seen.</p></blockquote>
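<p>Jon&#8217;s point about comparing output can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name <code>outputs_match</code>, the tolerance values, and the sample numbers are all hypothetical, and a real climate model would compare large gridded fields rather than short lists.</p>

```python
import math


def outputs_match(reference, candidate, rel_tol=1e-9, abs_tol=0.0):
    """Return True when two runs agree value by value within a tolerance.

    `reference` stands for output of the original (e.g. Fortran) module and
    `candidate` for output of the ported (e.g. Python) module. Exact equality
    is too strict for floating point, so each pair is compared with
    math.isclose.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


# Made-up sample values, purely to show the call.
original_run = [288.15, 288.21, 288.30]
ported_run = [288.15, 288.21, 288.30 + 1e-12]
print(outputs_match(original_run, ported_run))
```

<p>Converting one module at a time keeps this check cheap: each port is accepted only once its output agrees with the module it replaces.</p>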
<p>But then I realized that moving code from C/Fortran to Python gains you a little readability and a lot of maintainability, sacrifices speed, and ultimately leaves you back where you started: with computer code (procedural at that).</p>
<p>There&#8217;s a parallel to <a href="http://en.wikipedia.org/wiki/Literate_programming">&#8216;literate programming&#8217;</a>. What we would really like to do is write these tools in a form that is independent of both platform and programming language.</p>
<p style="text-align:center;">Here&#8217;s how I see the transition:<br />
<img class="aligncenter" title="Science workflow" src="https://dl.dropbox.com/u/340814/sci-workflow.png" alt="Science workflow" width="299" height="165" />1. Cognitive understanding &#8212;&gt; 2. Language of science (mathematics, with bio/phys/chem extensions) &#8212;&gt; 3. Language of platform (R, Mathematica, custom code) &#8212;&gt; 4. Bytecode &#8212;&gt; 5. Computer processing &#8212;&gt; 6. Output representation</p>
<p>We would  like to get rid of having to do the second translation, right? So that  you can just write in the language of mathematics and have the output (prediction, in  the form of graph, chart, numbers) be correct. So I guess there should  be two sides to this workflow: one from the natural language to the  bytecode, and the other from the bytecode back out to natural representation.</p>
<p>The assumption I&#8217;m making is that the further away from bytecode you get, the more people have a chance of understanding your work.</p>
<p>Some of this discussion is (uncomfortably) similar to model-driven approaches, of course. The challenge there, for me, has always been that you cannot represent *all* of the problem in the model &#8211; so you end up with a bunch of custom code anyway. Jon again:</p>
<blockquote>
<div>Yup. And the climate scientists will tell you that all the time. There are all sorts of optimisations and workarounds that have to be specified in the code. Not to mention the fact that the way you decide to discretize the mathematics in the papers and which algorithms you choose as implementations are also dependent on the rest of the model/compiler/platform, etc. So it&#8217;s not that we&#8217;re trying to replace the second step, but just make it clear what&#8217;s happening along the way.</div>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://neilernst.net/2010/02/01/open-science-and-workflows/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<georss:point>43.659070 -79.396681</georss:point>
		<geo:lat>43.659070</geo:lat>
		<geo:long>-79.396681</geo:long>
		<media:content url="http://1.gravatar.com/avatar/141272f109fbf660ffa001647f17d368?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fink08</media:title>
		</media:content>

		<media:content url="https://dl.dropbox.com/u/340814/sci-workflow.png" medium="image">
			<media:title type="html">Science workflow</media:title>
		</media:content>
	</item>
		<item>
		<title>A better scientific notebook</title>
		<link>http://neilernst.net/2010/01/11/a-better-scientific-notebook/</link>
		<comments>http://neilernst.net/2010/01/11/a-better-scientific-notebook/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 17:00:27 +0000</pubDate>
		<dc:creator>Neil</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[msr]]></category>
		<category><![CDATA[r]]></category>
		<category><![CDATA[science 2.0]]></category>
		<category><![CDATA[workflow]]></category>

		<guid isPermaLink="false">http://neilernst.net/?p=1008</guid>
		<description><![CDATA[Cameron Neylon is an advocate of open science (along with others like Michael Nielsen). Among other things, open science or Science 2.0 means keeping track of your mundane day-to-day activities (I like to call it &#8220;sciencing&#8221;) using some publicly accessible repository. This way other researchers (your competition?) can see what you are doing and comment/question/improve [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:left;"><a href="http://friendfeed.com/cameronneylon">Cameron Neylon</a> is an <a href="http://blog.openwetware.org/scienceintheopen/">advocate of open science</a> (along with others like <a href="http://friendfeed.com/michaelnielsen">Michael Nielsen</a>). Among other things, open science or Science 2.0 means keeping track of your mundane day-to-day activities (I like to call it &#8220;sciencing&#8221;) using some publicly accessible repository. This way other researchers (your competition?) can see what you are doing and comment/question/improve your work.  This also has benefits for the researcher herself, of course, in keeping track of what steps were followed. Nothing is worse than getting a good result and not being able to replicate it (&#8220;I swear it was there! I saw it!&#8221; &#8220;Yes of course you did, here, just take this pill..&#8221;). Actually there is something worse and that is losing data and not having a backup. Like me.</p>
<p style="text-align:left;">Recently I was working on a project &#8211; <a href="http://neilernst.net/tag/msr/">MSR/data mining of open source software</a> &#8211; and saw one aspect of this that could be improved. I&#8217;m working with <a href="http://www.r-project.org/">R, the open-source statistics toolkit</a>, to do data analysis. I&#8217;ve never used it before, so there&#8217;s a fair bit of reading the manual, copying/pasting of examples, and experimentation involved. This can be dangerous: am I selecting a result from my analysis just because it &#8220;looks right&#8221;, or because I really understand what the analysis is saying? I am also going to want to repeat the steps that produced the particular graph I end up with (e.g., changing my numeric yearweek dates to R date objects).</p>
<p style="text-align:left;">Helpfully, R has a command line history you can store in a &#8216;<a href="http://www2.warwick.ac.uk/fac/sci/moac/degrees/modules/ch923/r_introduction/workspace_scripts/">workspace</a>&#8217;. But one thing I would like to be able to do with this, and any command line environment, is to somehow identify the <em>key</em> commands. For instance, I modify an online example to <a href="http://www.cyclismo.org/tutorial/R/linearLeastSquares.html">plot a regression line</a> for my data. I would like to somehow send that command to a &#8220;repeatable steps&#8221; repository to preserve a sense of useful workflow (not just history). The vanilla history typically has a lot of missteps, and going back through it can be a challenge. To make this tool even better, it should somehow parse out the dataset-specific pathnames and variable names. Then I would be left with something like:</p>
<p><pre class="brush: matlabkey;">
&gt; &lt;data.frame.variable&gt; &lt;- read.csv(file=&quot;&lt;csv file name&gt;&quot;, header=TRUE, sep=&quot;,&quot;)
&gt; summary(&lt;data.frame.variable&gt;)
&gt; &lt;data.frame.variable&gt;$&lt;index column&gt; &lt;- seq_len(nrow(&lt;data.frame.variable&gt;))
&gt; &lt;dates.variable&gt; &lt;- strptime(as.integer(&lt;data.frame.variable&gt;$&lt;dateweek.variable&gt;), &quot;%Y%U&quot;)
&gt; plot(&lt;dates.variable&gt;, &lt;data.frame.variable&gt;$&lt;values.variable&gt;)</pre></p>
<p>This would allow me to insert my own variable definitions (a meta-language, I guess) and have the result run automatically as a script in R.</p>
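<p>One way such a fill-in-the-blanks template might work is plain string formatting. The sketch below uses Python only to generate the R script text; it is an illustration, not an existing tool. The function <code>render_r_script</code>, the field names, and the sample values (<code>commits</code>, <code>commits.csv</code>, and so on) are all hypothetical.</p>

```python
# Hypothetical R template: each {name} field stands in for a dataset-specific
# name that was parsed out of the recorded history.
R_TEMPLATE = """\
{frame} <- read.csv(file="{csv_file}", header=TRUE, sep=",")
summary({frame})
{frame}${index_col} <- seq_len(nrow({frame}))
{dates} <- strptime(as.integer({frame}${week_col}), "%Y%U")
plot({dates}, {frame}${value_col})
"""


def render_r_script(template, **names):
    """Fill every {name} field; a missing name raises KeyError."""
    return template.format(**names)


# Made-up names for one particular dataset.
script = render_r_script(
    R_TEMPLATE,
    frame="commits",
    csv_file="commits.csv",
    index_col="idx",
    dates="commit_dates",
    week_col="yearweek",
    value_col="count",
)
print(script)
```

<p>The rendered text could then be saved to a file and run with R. Because a missing name raises an error instead of leaving a blank, a half-filled template never reaches R.</p>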
]]></content:encoded>
			<wfw:commentRss>http://neilernst.net/2010/01/11/a-better-scientific-notebook/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/141272f109fbf660ffa001647f17d368?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fink08</media:title>
		</media:content>
	</item>
	</channel>
</rss>
