Thoughts from a CodeFest

This past weekend was the Steel City Codefest. The idea is that community non-profits present some problem for which an “app” would help them, and coders spend 24 hours coming up with some solution. It was a lot of fun. You can see our team’s solution at Our challenge was to create an easier way for people to find the city of Pittsburgh’s GrubUp food program, which offers free lunch and breakfast at 80+ sites around the city in the summer (sadly, a lot of Pittsburgh youth are food insecure).

Me and my teammate Monica starting off Saturday morning.

We didn’t win the challenge, but I learned a lot on the way.

We created tons of  technical debt : code clones, code comments, no testing, no design. It was code as fast as possible, get it working, fix the obvious user facing bugs. We shipped. But even during that 24h span the design hit us, as it became harder to change things since logic and UI were wrapped together. Even something as trivial as renaming a media folder became a massive headache. We had no tests, so any change had to be “tested” by running the app and running through a few scenarios. Error handling was likewise left for later work, so if faulty input was entered the whole thing crashed.

It took a long time simply to do infrastructure setup: what Github repository, what web host, what database, how do we communicate together. Part of it was this was only my 2nd time building a node application, so I was unfamiliar with its internal expectations and capabilities. Things like “don’t send headers twice” caused problems for me that a more experienced developer would not have had. In a 24h period this stuff needs to be like riding a bike, so deciding on a framework that I had little experience in was costly. It’s like going to a marathon without having trained at all.

We were three people: two coders and a designer/QA person. Three was the minimum, and it really wasn’t enough. There were tasks like entering data into the database (the Citiparks staff provided excel spreadsheets) that took me a few hours but had zero payoff. In a codefest, data quality is not a factor in the judging (the judges don’t come from the clients). In an enterprise situation, the data is probably as important as anything else, but here it was wasted effort, and sample data would have worked fine.

We had somewhat of an idea how things would work, but wireframing it beforehand, and being much clearer about what steps were necessary, would have been better (you could not write code before, but this sort of sketching was allowed). A simple design plan and backlog would have been easier to work off, and help to resist the temptation to simply start hacking away. A number of times I would push back from the table, and say to myself “do I even need to do this?”

Writing code this way is a great way to learn these lessons. I have a number of academic publications about finding requirements, for example, but it is only when you do it yourself that you realize how much is lost between the quick IM conversations you have with teammates and the actual issue tracker. I do wonder, however, if these 24h codefests promote ‘code first’ over the value of design. For example, my sense is that a lot of what we did simply wouldn’t work in an enterprise environment: there are design guidelines, authentication, security, data integration, lifecycle maintenance concerns, none of which you have the luxury to spend much time with. The cool thing about the Steel City event, however, is that the organizers do make a series of $10k grants available, in order to take the app to a more integrated and polished version.

It was a great event – very well organized, with great food and volunteers. And the Citiparks staff were amazing, sending their director and deputy director to do user testing at 7pm Saturday, and bringing amazing treats for us twice during the event. It also focused on an underserved area, in my view: social justice and not-for-profits. Many have quite simple needs, that in many cases amount to adding data to a Google Map, but even that is beyond their budgets.


Filed under Uncategorized

Frameworks, libraries, and dependencies

I’ve been doing a little thinking about frameworks lately. They fascinate me as 1) a realization of the vision of ‘pluggable software’ and reusable components desired since probably 1968; 2) what you are getting into when you rely on one. This is prompted by this great post on libraries vs frameworks.

Now, we’ve used libraries for ages, viz. glibc etc. And the notion of ‘code that someone else wrote and maintains that I need’ was likely established in the design of Unix and pipe and filter architectures. But it really seems like the past 10 years have seen this wonderful explosion of creativity in writing ‘little libraries’ for various different systems.

I’ll take a common example. I’ve previously used Node.js for a small visualization I did for my brother’s work on genetics (in progress!). Although an academic, I like to try to stay on top of things, so I tried out Node, the Javascript web server. Now JS itself has at least 60+ frameworks and libraries, and that list doesn’t even include Node or some of the ones I’ll describe below. This is amazing considering although JS has been around a long time, only recently (would we say JQuery is the prototypical case?) has this explosion happened.

The trouble is that like the Cambrian explosion, some of these libraries and frameworks are doomed to extinction. If you are BigCo, that makes choosing one very tricky, in addition to the licensing and security questions you will need to ask.

Consider. I wrote the application for the Node server, using Express as a web framework (that means it automates some of the routing and layout of files and directories for you). To get to the database I used the Node PostGres library. To do UI I relied on JqueryUI and Stylus for CSS, with Jade for templating. Then I used Morgan for logging, Gulp to automate the style generation from the Stylus files, and was toying with D3 to do the display. Not to mention I need a Platform as Service from Heroku, so I have their command line tools installed as well.

So that gives about 10 different libraries to run this app. On the plus side, they automate a ton of code I no longer have to worry about, letting me focus on the key value-add of the app (realized in the SQL code I write and custom request handling code).

But I just upgraded to Express 4, and they’ve broken the back-compatibility, so I must now understand what the changes mean and how to retrofit them. Who maintains these libraries? Will he or she keep updating it? These are by no means new questions, but I think what has changed is that now it is very hard to avoid using them. And once you commit to it, re-architecting for the problems you will inevitably face with leaky abstractions seems challenging, because everything is deeply connected. You cannot just drop in a new back end server with the same libraries.

Now imagine that multiplied times 10 years and instead of my simple app, a mission critical information system, and you start to get a sense of the problem that legacy applications can pose. Fortunately, I work at a place with lots of experience solving those problems, so give us a call if you need help!

1 Comment

Filed under Uncategorized

The Gap Between User Requirements and Software Capabilities as Technical Debt

One of my favorite graphics is from Al Davis, in 1988. Aside: it is depressing how often we re-invent the wheel in this business.

Al Davis requirements growth

The nice thing is how one can map various software development concepts to parts of the diagram. I actually think there is another thing you can grab there. Well, two things. One, the environment is not captured in this picture, but only user needs and the specification. In most cases (maybe this is what wasn’t clear in 1988) the user requirements are constrained by the environment, that is itself changing. This is part of our re-definition of the requirements problem of Zave and Jackson.

Two, I think you can use this to show how the rate of growth in the gap between needs and system (what Davis calls “inappropriateness”, the shaded area) is also an issue. I think this captures the technical debt problem more succinctly. You will see a growth if, for example, you chose a technology solution that constrains your use of web browser (eg. Activex controls mandating IE8). That forces your red line (development/specification/software) to grow slower. Now the question becomes, at what point do you refactor/reengineer so that the rate of adaptability (the slope) increases again?

(I don’t actually know where I got this — maybe Steve Easterbrook, he likes Comic Sans a LOT — or the original source for this but maybe here.)

1 Comment

Filed under Uncategorized

Measuring programmer productivity is futile.

(I’ve typically posted long-form entries but so infrequently … )

The arguments and debates about 10x productivity in “programmers” rage on (this time to defend/reject H1B visas). This debate is doomed to never be concluded. I think the reason why is nicely captured in Andrew Gelman’s post on p-values: they work best when noise is low and signal is high, something which can never be the case when we talk about productivity. As he says,

If we can’t trust p-values, does experimental science involving human variation just have to start over?

Given a random sample of (let’s say) Microsoft software developers, can you devise a test that would show the statistical differences? Are you convinced you would have high power? A big effect size? One person online (via HackerNews) says it is about tool competence. But the recent Latex/Word study leaves me doubting even that conclusion (although I have trouble with that study too, which just reinforces my overall point).

More importantly, I think this calls into question almost any controlled experiment in software engineering. Short of replicated results, I’m skeptical the information content is very high. Instead, I would like more qualitative research. Why do people say there is this difference? What traits are important? Can they be taught? How do we share productivity improvements? These questions seem much more important than trying to attach a p-value to whether one group is better than another.

Leave a comment

Filed under Uncategorized

Cults of Personality and Software Process Improvement

I’m a fan of the Cynefin framework. I find it a great tool for understanding what type of problem you are trying to solve. The notion of complex/complicated/simple is quite helpful. You could do worse then to read Dave Snowden’s blog, as he explores each of the domains in the context (most often) of software projects.

Recently Mr Snowden has been critiquing the Scaled Agile Framework (SAFe) put together by Dean Leffingwell. This attack on SAFe is not unprecedented. It’s hard to take attacks like this too seriously when their proponents don’t put forth data, but merely theory.

One of the most difficult parts of doing research in Software Engineering is its inherently uncontrollable, one-off nature. Sure, in some cases—like websites for restaurants, for example—we see repeatability. But the most interesting projects, the ones SAFe is applied to, the complex or perhaps complicated ones, there is no repeatability (by definition). This makes it impossible to say with any degree of accuracy what factors are contributing to the success or failure of the project. 1

In particular, when you have strong, intelligent, experienced consultants like Mr Snowden, or Mr Leffingwell, or various other graybeards, I don’t think you can control for the ‘personality’ factor. That is, what portion of the success of the initiative (say, applying Sensemaker or SAFe or Scrumban or what have you) is due to the tool/process improvement/methodology, and what portion is due to the smart person effect of the consultant? This is made more difficult when that consultant has a very strong economic incentive to point to the methodology as the distinction, since their business is inextricably tied together with that methodology. Furthermore, just the fact that a company has reached out for help indicates some level of self-awareness.

My feeling is that given a successful team, led by an enlightened manager, it wouldn’t matter what methodology they used (which I mentioned previously in the context of tools). And there is some evidence to support this: Capers Jones suggests RUP and TSP have higher quality than Scrum or other approaches. Now that is just one dataset, but it is exactly one more than Mr Snowden has produced, as far as I can tell (the plural of anecdote is not data).

Does all this mean it doesn’t matter if we choose RUP, SAFe, Scrum, Kanban, Six Sigma, or Sensemaker? To some extent, I think that is true. I would guess that your measurable outcomes after implementing TSP would be similar to the outcomes after implementing SAFe. But the point is, one cannot measure these things in isolation! You will never know (Heisenberg-like) whether something else would have been better. The local project context is so important that the principles are more important than the specific practices (e.g., the agile manifesto, Cynefin domains, good organizational practices, etc).

  1. With one exception I am aware of: this paper from Simula in Norway. They paid 4 different companies to develop to the same set of requirements in order to understand the maintainability characteristics of different approaches. But even there, the results are difficult to generalize. Anda, B.C.D.; Sjoberg, D.; Mockus, A, “Variability and Reproducibility in Software Engineering: A Study of Four Companies that Developed the Same System,” IEEE Transactions on Software Engineering, vol.35, no.3, pp.407,429, May-June 2009 doi: 10.1109/TSE.2008.89 

Leave a comment

Filed under Uncategorized

Software research shouldn’t be about the tools

It comes down to essential vs. accidental complexity, as outlined by Fred Brooks. What we research is new ways to ‘nibble’ at the accidental complexity: new languages (GO, Swift), new abstractions (Actors vs. functional programming in distributed systems), new methodologies (random test case generation). It’s what nearly every story on Hacker News is about.

But ultimately, I think most problems come down to two factors: the problem complexity itself, and the team tackling it. To me, many of the problems highlighted as software/IT failures, like the FBI registry, have nothing to do with a lack of good tools or techniques. These are ultimately management failures: scope creep, poor leadership, insufficient budget, too much budget, negative work environments, etc. It is ‘executing’ that is the problem, not the technology. How many errors have been caused by the US reliance on imperial units?

Look at this quote by a senior VP at Oracle on failures in implementing CRM projects:

[M]y comments apply to ALL CRM vendors, not just Oracle. As I perused the list, I couldn’t find any failures related to technology. They all seemed related to people or process. Now, this isn’t about finger pointing, or impugning customers. I love customers! And when they fail, WE fail.

I’d be willing to say that software engineers have all the tools they need. We need some form of continuous integration and deployment, abstraction mechanisms to simplify the problem, tests to verify our solution, version control to maintain a history of changes, and some form of requirements (whiteboard, paper, spreadsheet, what have you) to keep track of what needs to be built. I don’t even think it particularly matters how you use those tools. If you have a mature organization and process then you can all into the following matrix (James Montier via Jonathan Chevreau):

Good Outcome Bad Outcome
Good Process Deserved Success Bad Break
Bad Process Dumb Luck Poetic Justice

But just having the right tools, the good people, and a mature process is not enough to guarantee success, of course. You could be tackling a ‘wicked problem‘. You could have a team of misfits and losers. You could have a manager who refuses to accept responsibility or make decisions. Most software research does not address those issues. I’m not convinced there is any research that addresses those issues: leadership, management, sociology… nothing can help when your team lead is having a marital crisis and can’t devote any time to product development.


Filed under Uncategorized

Evidence in Software Engineering

This post is spurred by a line in a paper of Walker Royce, son of Winston Royce, he of the “waterfall model” (misunderstood model). He says

without quantified backup data, our software estimates, proposals and plans look like long-shot propositions with no compelling evidence that we can deliver predictably or improve the status quo.

Bold text is mine, to emphasize this notion of evidence. The question then becomes what evidence is acceptable. And here I think we get into some hoary philosophical questions concerning truth in science (epistemology, really).

I think one of the fundamental impedance mismatches in software engineering for large-scale systems, or software engineering in a systems engineering environment (e.g., airplanes, military software, ultra-large-scale software, safety-critical software) is that a number of people on those teams have a positivist view of evidence. They subscribe to the notion that sufficient “data” can show whether something will work or not. So if you design a missile system’s rocket engines, those engines either deliver the necessary thrust, navigability, etc. or they do not (actually, I think this is probably not the case in those so-called hard sciences either, but the point is that people from those domains think it is the case).

Agile software delivery works much more from the falsificationist or even outright constructionist approach. I would estimate that the majority of agile practitioners believe in the test and refine approach, where you deliver an increment, test it to determine how well the ‘theory’ of that software matches reality, and then iterate. The key difference with the positivists being, of course, that there is no a priori evidence you can use to show things will work. It is hard to do simulations in Simulink or Matlab as to how well software will perform. This is why Alistair Cockburn calls software development a cooperative game. And some people, I would say, would go even further and treat software development as a post-modernist exercise in building the reality you want to see and dispensing with evidence altogether (“if it works, it is right”). Those people probably don’t get a lot of government contracts, however.

Back to the quote: the issue remains, how easy is it to produce “compelling evidence” and what does it consist of? In Royce’s view, evidence takes the form of historical context for productivity, function points, etc. In that case we are almost measuring the team’s capability to deliver more than any particular artefact.

At some level, cynically one might say, there is a need to show the ‘evidence’ that will get the job or contract. People hire companies like Thoughtworks because they have a track record of getting the job done to people’s satisfaction. We don’t much care how long it took, or how many lines of code were written, if the software was valuable.

Which is fine, but as an engineering discipline one would like a bit more to chew on.

Leave a comment

Filed under Uncategorized