Some notes on integrating Mendeley, Scrivener, MultiMarkdown and (Xe)Latex
I’m using Scrivener for my thesis. It has good outlining options, full-screen mode, rich text, and this neat feature that lets you mark things as comments (annotations). I can use MultiMarkdown to mark up the text (commonly * for italics and ** for bold), and lists are much simpler than the \begin{itemize} ... \end{itemize} format of Latex.
To manage my references I have been using Mendeley. It’s a little alpha still. There’s no way to search for specific fields, for example. And it doesn’t integrate at all with Scrivener, although it will with other editors. There is, however, a web API and it uses a SQLite backend, so hacking is possible.
For example, if there’s one thing I hate it is trying to remember the citation key for a reference – it completely kills my flow. So I’ve hacked up an Apple Automator script that takes the highlighted text, searches Mendeley’s SQLite db for that text, and returns a list of possible matches (yes, this is how I procrastinate about actually writing). Then I can paste the right key in. Here’s the gist (code).
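The core of that lookup can be sketched in a few lines of Python. This is a minimal sketch, not the Automator script itself: it assumes the Mendeley database has a `Documents` table with `citationKey` and `title` columns, and the function name `find_citation_keys` is made up for illustration. Check the schema of your own copy before relying on it.

```python
import sqlite3


def find_citation_keys(conn, text, limit=10):
    """Return (citationKey, title) pairs whose title contains `text`.

    Assumes a `Documents` table with `citationKey` and `title` columns
    (an assumption about the Mendeley schema, not a documented API).
    SQLite's LIKE is case-insensitive for ASCII by default; any % or _
    in `text` is treated as a wildcard.
    """
    pattern = f"%{text}%"
    cursor = conn.execute(
        "SELECT citationKey, title FROM Documents "
        "WHERE title LIKE ? AND citationKey IS NOT NULL "
        "ORDER BY title LIMIT ?",
        (pattern, limit),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # Stand-in database so the sketch runs without a Mendeley install.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE Documents (citationKey TEXT, title TEXT)")
    demo.execute(
        "INSERT INTO Documents VALUES (?, ?)",
        ("Watts2003", "Six Degrees"),
    )
    for key, title in find_citation_keys(demo, "degrees"):
        print(key, "-", title)
```

Against the real database you would pass `sqlite3.connect(path)` instead of the in-memory stand-in, where `path` is wherever Mendeley keeps its SQLite file on your machine (the location varies by platform and version).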
Generally the Latex export works quite well. You must set an XSLT transform in Scrivener which takes the MultiMarkdown export and converts it to Latex. The code I have online shows how to export this to the U of Toronto thesis class.
A few tips:
– generally Latex code, e.g. math mode, is passed through with no problems. However, I’ve found that it is a good idea to surround complex Latex with HTML comments.
– special characters like % and & get escaped, which is not always what you want, particularly if you have cut and pasted text.
– you can’t use a double dash inside HTML comments, so switch to a single dash or the Mac’s en-dash (Opt-dash). If you use XeTeX, you can specify a font like Times New Roman which will display this properly.
– you can’t yet easily show characters like φ directly in Latex. You are better off with $\phi$ until math fonts improve.
IT failure statistics
There’s an excellent IT project dashboard from the US government reporting on success/failure rates, project size, and amount of spending (which is frankly jaw-dropping). It is a very useful site, because most of the information we have seems to come from self-interested consultants. It’s certainly in their interests to emphasize how projects are always in a perpetual state of failure. Which doesn’t mean they aren’t, of course. If we look at the very coarse-grained US data, it’s clear that an uncomfortably large number of projects are in trouble. E.g., the figure below shows 7% of projects are in serious trouble, and 34% need attention. 
There’s a reporting bias in the press, too, of course: Man Bites Dog syndrome. If the project works and saves money, it won’t make headlines. For example, the Ontario Telemedicine Network connects physicians, nurses and patients online every single day, with high uptime rates. Every patient that doesn’t have to travel to Toronto from Sudbury saves the government — and the patient — mucho dinero.
There’s another comparison I rarely see, as well: how many projects of any kind are successful? If we look at the recent decision by the Canadian government to dole out ‘stimulus’ money for infrastructure projects (which invariably means building new roads, for some reason), we see failure rates which are comparable to those in IT. For the 2-year, $4 billion fund, only 25% of projects were finished before the March 2011 deadline, with the most likely scenario being that some 900 projects will not meet the deadline. Manitoba is apparently hoping for favorable weather to meet the target.
Perhaps the difference is that when it is something physical, like a road, it is much harder to leave it half-finished than a piece of software.
The relevance of CS research
I came across a post by Seb Paquet on Quora.com about the relevance of CS research(ers). Seb’s position seems to be that academics are doomed to failure when it comes to innovation. I think he comes up with some good reasons why academia will struggle to innovate, but he is comparing oranges to pomegranates. I should preface by noting I’m not enamoured with the current situation, but he is unfair – I don’t see CS research as being in any danger of being pushed into a corner of irrelevance.
The first problem is who we are comparing. Most people will point to the successful companies in industry as an example of innovation, such as Apple with the iPad. Then, they will dredge up some aging prof who publishes in the same obscure journals each year as an example of academia and irrelevance. But we should really look at the most successful academics, like Duncan Watts or Jim Hendler. Then, we have a survivor bias in industry: companies which don’t innovate get pushed aside (although this is in support of Seb’s position). And finally, I’m not convinced industry is all that innovative anyway. I mean, how many Y Combinator companies are doing some form of social media site? Often, what we mean by innovation is a combination of marketing and excellent product engineering (definitely realms in which innovation is useful, but not what we usually mean).
The second issue is what we are innovating. Do we want research labs at university to compete with Google or Apple on product development? Of course not. In fact, if a lab did come up with a killer product, it would probably move straightaway into industry – witness Bumptop, OpenText, or Google itself.
What we want from research labs are the long-term innovations, things like ArpaNet, hypermedia, REST (a Ph.D. thesis that is just now being fully understood). These enable whole new areas for research and industrial innovation. And let’s not discount the merits of heading down the wrong path. A lot of academic research consists of proving that something is in fact the case (beyond anecdote) or establishing what doesn’t work.
Finally, regarding the original question (about learning from industry), I think in fact academia is highly responsive to industrial innovations (perhaps too much so!). At our department, we have embraced Python, Subversion, Scrum, wikis, etc. all within a few years of their development. Keep in mind that teaching, unlike at a career college, needs to focus beyond what is immediately useful. Professors here have worked with IBM to understand DB2 provisioning, used blogging to understand IR, and leveraged Hollywood to create new animation algorithms. There’s a new academic conference on “Xtremely Large Databases” to keep up with Google-scale problems. I think the key is that the tools produced are not necessarily the most usable – and it is this last mile that industry is great at.
If there is one thing that concerns me as a researcher it would be access to data. In the earlier days of computing, a researcher could claim to be working on similar problems to industry, because universities sank millions of dollars into computing infrastructure to maintain this parity. Today, though, it seems as though universities are falling behind when it comes to ‘real-world’ data to work with. Unless you are privileged to work with Google, you will have a very hard time duplicating that scale of problem. I’m not sure what the last multi-million dollar investment by a university in cutting-edge computing infrastructure was; perhaps next-generation shared systems like SHARCNET.
REFSQ summary
The Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ) just concluded. It is a great conference with plenty of discussion and provocative ideas.
I tweeted periodically about the conference, and here are some final thoughts:
Statements from the concluding plenary I disagreed with:
- Social scientists never do anything with their theories; social science theories are not generalizable; requirements engineering (RE) should avoid social science techniques.
- You cannot gather data without a theory.
- We shouldn’t wait for data to start creating RE theories.
- Replication refers to repeating an experiment, not re-doing a case study.
- Studies shouldn’t just collect data; they should also propose theories.
It was refreshing to be involved in general discussions about the role of theory and empiricism in requirements, as it is something the field has long ignored. Jorge would be happy: there seems to be acknowledgement that we ought to be working towards better theory building in RE. There was also some muted acknowledgement that whatever we did in the past did not work, and that those ‘theories’ — better to call them ‘conjectures’ or just ‘wild guesses’ — need revisiting.
However, some people don’t seem to understand that there are many ways of doing science in RE. Nearly everyone agrees new techniques are NOT needed; what is necessary is better ways of understanding how existing tools work or don’t work. And social sciences have a lot to teach us here, as a cursory examination of the literature would reveal. This is not physics! And we can’t use “just wing it” as our epistemic theory. Some feel we should jump to wild conjectures about what ought to work, and seek to test that. In fact, what often works better is to adopt grounded theory approaches.
Case in point. Someone mentioned that often in interviews you go to a person and ask about X, and they respond by cursorily mentioning X and then talking about Z five times. Z is the thing you should be interested in! And indeed a grounded theory approach will allow this to appear.
But these are quibbles. I think in general, there is broad acceptance of the need for rigorous empirical techniques, and also acceptance that we need to aim as a community for comprehensive, well-verified explanatory (and perhaps predictive) theories.
I’ll end with a few provocative statements of my own (a theme of the working conference):
I think in requirements it is easy to mistake the trees for the forest. We seem to focus so much on “making RE better” that we lose sight of the ultimate goal, which is to make better (software) products. Every RE theory should tie in to this goal, in my opinion.
And perhaps more controversially, although everyone at the conference probably fears it, it might be the case that all our tools and techniques are irrelevant in the face of the human aspect of the problem. That is, I wager it is easier to remedy poor tools when you have a mature and intelligent organization; in fact, with such an organization it doesn’t matter what tools you choose. You could do waterfall and be successful (as NASA seems to). Here’s a great quote from Watts Humphrey to conclude:
[During my time managing complex projects at IBM] I found that the problems were never technical; they were always management problems.
Climate models and computing talk with Balaji
Yesterday the U. Toronto Atmospheric Physics group hosted a talk by Balaji [1], head of the modeling systems group at the Geophysical Fluid Dynamics Lab (GFDL) at Princeton University, a school in New Jersey. Balaji’s interests include High Performance Computing (HPC) and computer modeling. His Ph.D. is in Physics but I get the impression he is ‘computationally-oriented’. [2] I don’t have his slides, but it looks as though he posts talks on his website.
The gist of his talk was that climate modeling, while of critical importance to the world, is facing several challenges:
- Scale: doubling the resolution of the model in the X/Y dimensions means a factor of 4 increase in problem size, and often we want to increase the temporal resolution as well (for a factor of 8 increase).
- Climate models need to integrate new science components. For example, we should model hurricane formation by integrating sea-temperature models with atmospheric models. These integrations pose several challenges.
- Models need to be reproducible, according to current scientific dogma, but this is tough when a single model run can take many days and is subject to hundreds if not thousands of parameter choices. There are research efforts underway to understand what ‘reproducibility’ ought to mean at this scale: is probabilistic reproducibility enough? Even understanding the results of a model run can be difficult.
- More and more, products of climate modeling are being sought as input into other models or decision-making. For instance, policy makers need to know drought predictions for the next 50 years in order to do land-use planning. The problem is that a) there are more of these requests than GFDL, for instance, can handle; b) the models are not suitable for these predictive tasks, and require expert interpretation; c) selecting a single model is not desirable when the average of all models gives better results. Balaji mentioned a proposal to create a ‘climate service’, akin to the weather service, for doing this sort of thing.
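The arithmetic behind the scale point is worth making concrete. Here is a small sketch, assuming the factor of 4 comes from doubling the resolution in both X and Y, that every refined dimension gets the same refinement factor, and that problem size grows with the number of grid points times time steps. The function name is hypothetical and vertical levels are ignored.

```python
def problem_size_factor(refinement, spatial_dims=2, refine_time=False):
    """Multiplier on problem size when resolution is refined.

    `refinement` is how many times finer each refined dimension becomes
    (2 means doubling). Time counts as one extra dimension if refined.
    """
    dims = spatial_dims + (1 if refine_time else 0)
    return refinement ** dims


if __name__ == "__main__":
    print(problem_size_factor(2))                    # X/Y only: 2**2 = 4
    print(problem_size_factor(2, refine_time=True))  # X/Y and time: 2**3 = 8
```

The growth is exponential in the number of refined dimensions, which is why each further doubling gets expensive so quickly: two successive doublings in X, Y and time already mean a factor of 64.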
A few other notes:
Balaji described the FRE, a configuration management system (of sorts) for recording experimental parameters and workflows. This is how GFDL tries to keep track of model runs, and maintain reproducibility. He did mention that the system can still be tweaked at the instance level, so the FRE may not capture everything that was done for that run.
I asked Balaji why these research centres were so intent on building and maintaining supercomputer clusters. After all, it isn’t something they should be experts in. I suggested the real experts were companies like Google and Amazon who routinely operate thousands of processors in data centres around the world.
His response was that they needed the control. The models need control over configuration, for reproducibility (after all, they are interested in bit-level reproducibility); they also needed control over core cycles, so that the models could run uninterrupted. He gave as an example the Department of Energy supercomputer centre, where other needs (more processing intensive) would bump climate models from the queue. Furthermore, he thought it likely that running on Google App Engine, for example, might cost even more than maintaining and running your own cluster.
That answer is understandable, but the problem does seem solvable. These are essentially business problems that can be negotiated: cost of CPU time, service level guarantees, etc. It’s hard to see how GFDL can compete with Google’s engineers in maintaining and building massive clusters. As an example, DNA sequencing is now taught on Amazon EC2 ‘machines’.
Finally, I would think from a reproducibility standpoint that relying on knowledge of specific machine configurations is way too detailed. It shouldn’t matter to your model that this machine runs VMS 3 while this other machine ran Linux 2.6.24. I know it *does* currently matter; but it shouldn’t.
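One reason bit-level results can depend on the machine is that floating-point addition is not associative, so anything that changes the order of operations (compiler, processor count, reduction order) can change the last bits of a result. This was not part of the talk; it is just a small Python illustration, using an explicit loop so the order of additions is unambiguous.

```python
import math


def sum_in_order(values):
    """Add the values one at a time, strictly left to right."""
    total = 0.0
    for value in values:
        total += value
    return total


values = [0.1, 0.2, 0.3]
forward = sum_in_order(values)                   # (0.1 + 0.2) + 0.3
backward = sum_in_order(list(reversed(values)))  # (0.3 + 0.2) + 0.1

print(forward == backward)              # False: the last bit differs
print(math.isclose(forward, backward))  # True: equal to within tolerance
```

Scale that up to a long model run spread over many processors and bit-for-bit agreement becomes a property of the whole configuration, not just of the model code. Comparing within a tolerance, as `math.isclose` does here, is closer in spirit to the probabilistic reproducibility mentioned above.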
It was a fascinating talk; he managed to tailor it to the diverse group of people listening in very well. I wonder if Steve Easterbrook will get to visit the GFDL lab as part of his sabbatical research.
[1] I believe this is his last name, but used in the Brazilian fashion, a la Ronaldo/Pele.
[2] I apologize for being sleepy mid-way through!

