Over on my employer’s blog, I’ve written up our survey results on technical debt.
This past weekend was the Steel City Codefest. The idea is that community non-profits present some problem for which an “app” would help them, and coders spend 24 hours coming up with some solution. It was a lot of fun. You can see our team’s solution at http://citipark.herokuapp.com. Our challenge was to create an easier way for people to find the city of Pittsburgh’s GrubUp food program, which offers free lunch and breakfast at 80+ sites around the city in the summer (sadly, a lot of Pittsburgh youth are food insecure).
We didn’t win the challenge, but I learned a lot on the way.
We created tons of technical debt: code clones, code comments, no testing, no design. It was code as fast as possible, get it working, fix the obvious user-facing bugs. We shipped. But even during that 24h span the lack of design hit us: it became harder to change things because logic and UI were wrapped together. Even something as trivial as renaming a media folder became a massive headache. We had no tests, so any change had to be “tested” by running the app and walking through a few scenarios. Error handling was likewise left for later, so if faulty input was entered the whole thing crashed.
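To make the "logic and UI wrapped together" point concrete, here is a minimal sketch of what pulling the lookup logic out of the request handling could look like. This is not the code from the event: the `Site` shape, the function names, and the idea of searching by latitude/longitude are all assumptions made for illustration. The point is only that a pure function can be tested without running the app, and that faulty input becomes an error value instead of a crash.

```typescript
// Hypothetical shape of a meal site; the real data came from spreadsheets.
interface Site {
  name: string;
  lat: number;
  lon: number;
}

// Either a list of sites or a message the UI can show to the user.
type SearchResult =
  | { ok: true; sites: Site[] }
  | { ok: false; error: string };

// Turn raw form input into a number, or null if it is empty, not numeric, or out of range.
function parseCoordinate(raw: string, min: number, max: number): number | null {
  const value = Number(raw);
  if (raw.trim() === "" || !isFinite(value) || value < min || value > max) {
    return null;
  }
  return value;
}

// Great-circle distance in kilometres (haversine formula).
function distanceKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const earthRadiusKm = 6371;
  const sinLat = Math.sin(toRad(lat2 - lat1) / 2);
  const sinLon = Math.sin(toRad(lon2 - lon1) / 2);
  const a =
    sinLat * sinLat +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * sinLon * sinLon;
  return 2 * earthRadiusKm * Math.asin(Math.sqrt(a));
}

// Pure logic: no request, response, or template objects in sight.
function findNearestSites(
  sites: Site[],
  latRaw: string,
  lonRaw: string,
  limit: number
): SearchResult {
  const lat = parseCoordinate(latRaw, -90, 90);
  const lon = parseCoordinate(lonRaw, -180, 180);
  if (lat === null || lon === null) {
    return { ok: false, error: "Please enter a valid latitude and longitude." };
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = sites.slice().sort(
    (a, b) =>
      distanceKm(lat, lon, a.lat, a.lon) - distanceKm(lat, lon, b.lat, b.lon)
  );
  return { ok: true, sites: sorted.slice(0, limit) };
}
```

A route handler would then only translate between HTTP and `findNearestSites`, so renaming folders or swapping templates would not touch the logic.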
It took a long time simply to do infrastructure setup: which GitHub repository, which web host, which database, how we would communicate with each other. Part of the problem was that this was only my 2nd time building a Node application, so I was unfamiliar with its internal expectations and capabilities. Things like “don’t send headers twice” caused problems for me that a more experienced developer would not have had. In a 24h period this stuff needs to be like riding a bike, so deciding on a framework that I had little experience in was costly. It’s like going to a marathon without having trained at all.
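For readers who have not hit the “don’t send headers twice” problem, the usual cause is a handler that responds in an error branch and then falls through to respond again. The sketch below reproduces the shape of the bug without Node or Express: `FakeResponse` is a hypothetical stand-in that only mimics the "a response can be sent once" rule, and the handler names and messages are invented for illustration.

```typescript
// Hypothetical stand-in for an HTTP response object. It is NOT the Node or
// Express API; it only enforces the rule that a response can be sent once.
class FakeResponse {
  headersSent = false;
  body = "";

  send(body: string): void {
    if (this.headersSent) {
      throw new Error("Cannot set headers after they are sent");
    }
    this.headersSent = true;
    this.body = body;
  }
}

// Buggy: after answering the error case, execution falls through
// and tries to send a second response.
function buggyHandler(siteId: string, res: FakeResponse): void {
  if (siteId === "") {
    res.send("400: missing site id");
  }
  res.send("200: details for site " + siteId);
}

// Fixed: return as soon as a response has been sent.
function fixedHandler(siteId: string, res: FakeResponse): void {
  if (siteId === "") {
    res.send("400: missing site id");
    return;
  }
  res.send("200: details for site " + siteId);
}
```

The fix is a single `return`, which is exactly the kind of thing that costs nothing once you know the framework and costs far too long at 3am when you do not.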
We were three people: two coders and a designer/QA person. Three was the minimum, and it really wasn’t enough. There were tasks like entering data into the database (the Citiparks staff provided Excel spreadsheets) that took me a few hours but had zero payoff. In a codefest, data quality is not a factor in the judging (the judges don’t come from the clients). In an enterprise situation, the data is probably as important as anything else, but here it was wasted effort, and sample data would have worked fine.
We had somewhat of an idea how things would work, but wireframing it beforehand, and being much clearer about what steps were necessary, would have been better (you could not write code beforehand, but this sort of sketching was allowed). A simple design plan and backlog would have been easier to work from, and would have helped us resist the temptation to simply start hacking away. A number of times I would push back from the table, and say to myself “do I even need to do this?”
Writing code this way is a great way to learn these lessons. I have a number of academic publications about finding requirements, for example, but it is only when you do it yourself that you realize how much is lost between the quick IM conversations you have with teammates and the actual issue tracker. I do wonder, however, if these 24h codefests promote ‘code first’ over the value of design. For example, my sense is that a lot of what we did simply wouldn’t work in an enterprise environment: there are design guidelines, authentication, security, data integration, lifecycle maintenance concerns, none of which you have the luxury to spend much time with. The cool thing about the Steel City event, however, is that the organizers do make a series of $10k grants available, in order to take the app to a more integrated and polished version.
It was a great event – very well organized, with great food and volunteers. And the Citiparks staff were amazing, sending their director and deputy director to do user testing at 7pm Saturday, and bringing amazing treats for us twice during the event. It also focused on an underserved area, in my view: social justice and not-for-profits. Many have quite simple needs, that in many cases amount to adding data to a Google Map, but even that is beyond their budgets.
I’ve been doing a little thinking about frameworks lately. They fascinate me for two reasons: 1) they are a realization of the vision of ‘pluggable software’ and reusable components desired since probably 1968; and 2) they raise the question of what you are getting into when you rely on one. This is prompted by this great post on libraries vs frameworks.
Now, we’ve used libraries for ages, viz. glibc etc. And the notion of ‘code that someone else wrote and maintains that I need’ was likely established in the design of Unix and pipe and filter architectures. But it really seems like the past 10 years have seen this wonderful explosion of creativity in writing ‘little libraries’ for various different systems.
The trouble is that like the Cambrian explosion, some of these libraries and frameworks are doomed to extinction. If you are BigCo, that makes choosing one very tricky, in addition to the licensing and security questions you will need to ask.
Consider. I wrote the application for the Node server, using Express as a web framework (that means it automates some of the routing and layout of files and directories for you). To get to the database I used the node-postgres library. To do UI I relied on jQuery UI and Stylus for CSS, with Jade for templating. Then I used Morgan for logging, Gulp to automate the style generation from the Stylus files, and was toying with D3 to do the display. Not to mention I need a Platform as a Service from Heroku, so I have their command line tools installed as well.
So that gives about 10 different libraries to run this app. On the plus side, they automate a ton of code I no longer have to worry about, letting me focus on the key value-add of the app (realized in the SQL code I write and custom request handling code).
But I just upgraded to Express 4, and they’ve broken backward compatibility, so I must now understand what the changes mean and how to retrofit them. Who maintains these libraries? Will he or she keep updating them? These are by no means new questions, but I think what has changed is that now it is very hard to avoid using them. And once you commit to them, re-architecting for the problems you will inevitably face with leaky abstractions seems challenging, because everything is deeply connected. You cannot just drop in a new back end server with the same libraries.
Now imagine that multiplied over 10 years and, instead of my simple app, a mission-critical information system, and you start to get a sense of the problem that legacy applications can pose. Fortunately, I work at a place with lots of experience solving those problems, so give us a call if you need help!
One of my favorite graphics is from Al Davis, in 1988. Aside: it is depressing how often we re-invent the wheel in this business.
The nice thing is how one can map various software development concepts to parts of the diagram. I actually think there is another thing you can grab there. Well, two things. One, the environment is not captured in this picture, but only user needs and the specification. In most cases (maybe this is what wasn’t clear in 1988) the user requirements are constrained by the environment, which is itself changing. This is part of our re-definition of the requirements problem of Zave and Jackson.
Two, I think you can use this to show how the rate of growth in the gap between needs and system (what Davis calls “inappropriateness”, the shaded area) is also an issue. I think this captures the technical debt problem more succinctly. You will see the gap grow if, for example, you chose a technology solution that constrains your choice of web browser (e.g. ActiveX controls mandating IE8). That forces your red line (development/specification/software) to grow more slowly. Now the question becomes, at what point do you refactor/reengineer so that the rate of adaptability (the slope) increases again?
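The refactoring question can be played with numerically. The sketch below is a deliberately crude model, not Davis’s: it assumes needs and system capability both grow linearly per month, that refactoring pauses feature work for a while and then raises the slope, and that the accumulated gap (the shaded area) is what we care about. The `Plan` fields and every number in the usage note are made up.

```typescript
// All rates are in arbitrary "capability units per month" (an assumption).
interface Plan {
  needsRate: number;      // how fast user needs grow
  systemRate: number;     // how fast the system grows before refactoring
  refactorAt: number;     // month the refactoring starts (Infinity = never)
  refactorMonths: number; // months with no feature progress while refactoring
  rateAfter: number;      // how fast the system grows after refactoring
}

// Sum of the monthly gap between needs and system: a discrete
// stand-in for the shaded "inappropriateness" area.
function cumulativeGap(plan: Plan, months: number): number {
  let needs = 0;
  let system = 0;
  let area = 0;
  for (let month = 1; month <= months; month++) {
    needs += plan.needsRate;

    let rate = plan.systemRate;
    if (month >= plan.refactorAt) {
      const refactorEnds = plan.refactorAt + plan.refactorMonths;
      rate = month < refactorEnds ? 0 : plan.rateAfter;
    }
    system += rate;

    area += Math.max(0, needs - system);
  }
  return area;
}

// Made-up scenarios: keep limping along, or stop for two months in month 3
// and double the slope afterwards.
const keepGoing: Plan = {
  needsRate: 1, systemRate: 0.5,
  refactorAt: Infinity, refactorMonths: 0, rateAfter: 0.5,
};
const refactorEarly: Plan = {
  needsRate: 1, systemRate: 0.5,
  refactorAt: 3, refactorMonths: 2, rateAfter: 1,
};
```

With these invented numbers, `cumulativeGap(keepGoing, 6)` is 10.5 against 12.5 for `refactorEarly`, so over a six-month horizon refactoring looks like a loss; over ten months it is 27.5 against 24.5 and the refactoring has paid for itself. The answer to "when" depends entirely on the horizon you are judged on.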
(I’ve typically posted long-form entries but so infrequently … )
The arguments and debates about 10x productivity in “programmers” rage on (this time to defend/reject H1B visas). This debate is doomed to never be concluded. I think the reason why is nicely captured in Andrew Gelman’s post on p-values: they work best when noise is low and signal is high, something which can never be the case when we talk about productivity. As he says,
If we can’t trust p-values, does experimental science involving human variation just have to start over?
Given a random sample of (let’s say) Microsoft software developers, can you devise a test that would show the statistical differences? Are you convinced you would have high power? A big effect size? One person online (via HackerNews) says it is about tool competence. But the recent LaTeX/Word study leaves me doubting even that conclusion (although I have trouble with that study too, which just reinforces my overall point).
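The power question is easy to explore with a simulation. The sketch below is only an illustration under stated assumptions: productivity is a single normally distributed number in arbitrary units, the two groups differ by a fixed `effect`, and significance is judged with a large-sample z approximation at the 5% level rather than a proper t-test. The generator and all parameter names are choices made for this example.

```typescript
// Park-Miller "minimal standard" generator: small and seedable,
// good enough for a sketch. Returns values strictly between 0 and 1.
function makeRandom(seed: number): () => number {
  let state = (Math.abs(Math.floor(seed)) % 2147483646) + 1;
  return () => {
    state = (state * 48271) % 2147483647;
    return state / 2147483647;
  };
}

// One normally distributed draw via the Box-Muller transform.
function normal(rand: () => number, mean: number, sd: number): number {
  const u1 = rand();
  const u2 = rand();
  return mean + sd * Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

function mean(xs: number[]): number {
  let total = 0;
  for (const x of xs) total += x;
  return total / xs.length;
}

// Unbiased sample variance.
function variance(xs: number[]): number {
  const m = mean(xs);
  let total = 0;
  for (const x of xs) total += (x - m) * (x - m);
  return total / (xs.length - 1);
}

// Fraction of simulated studies that reach |z| > 1.96, i.e. estimated power.
function estimatePower(
  effect: number,   // true difference in mean productivity between groups
  noiseSd: number,  // person-to-person variation within each group
  n: number,        // developers per group
  trials: number,   // number of simulated studies
  seed: number
): number {
  const rand = makeRandom(seed);
  let significant = 0;
  for (let t = 0; t < trials; t++) {
    const a: number[] = [];
    const b: number[] = [];
    for (let i = 0; i < n; i++) {
      a.push(normal(rand, 0, noiseSd));
      b.push(normal(rand, effect, noiseSd));
    }
    const standardError = Math.sqrt(variance(a) / n + variance(b) / n);
    const z = (mean(b) - mean(a)) / standardError;
    if (Math.abs(z) > 1.96) significant++;
  }
  return significant / trials;
}
```

Calling `estimatePower(0.2, 1, 30, 2000, 42)` models a small true difference buried in person-to-person noise; raising `effect` or `n`, or shrinking `noiseSd`, is what it takes to push the result toward 1. When the noise dwarfs the effect, most such studies would find nothing even though the difference is real.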
More importantly, I think this calls into question almost any controlled experiment in software engineering. Short of replicated results, I’m skeptical the information content is very high. Instead, I would like more qualitative research. Why do people say there is this difference? What traits are important? Can they be taught? How do we share productivity improvements? These questions seem much more important than trying to attach a p-value to whether one group is better than another.
I’m a fan of the Cynefin framework. I find it a great tool for understanding what type of problem you are trying to solve. The notion of complex/complicated/simple is quite helpful. You could do worse than to read Dave Snowden’s blog, as he explores each of the domains in the context (most often) of software projects.
Recently Mr Snowden has been critiquing the Scaled Agile Framework (SAFe) put together by Dean Leffingwell. This attack on SAFe is not unprecedented. It’s hard to take attacks like this too seriously when their proponents don’t put forth data, but merely theory.
One of the most difficult parts of doing research in Software Engineering is its inherently uncontrollable, one-off nature. Sure, in some cases (websites for restaurants, for example) we see repeatability. But for the most interesting projects, the ones SAFe is applied to, the complex or perhaps complicated ones, there is no repeatability (by definition). This makes it impossible to say with any degree of accuracy what factors are contributing to the success or failure of the project. [1]
In particular, when you have strong, intelligent, experienced consultants like Mr Snowden, or Mr Leffingwell, or various other graybeards, I don’t think you can control for the ‘personality’ factor. That is, what portion of the success of the initiative (say, applying Sensemaker or SAFe or Scrumban or what have you) is due to the tool/process improvement/methodology, and what portion is due to the smart person effect of the consultant? This is made more difficult when that consultant has a very strong economic incentive to point to the methodology as the distinction, since their business is inextricably tied together with that methodology. Furthermore, just the fact that a company has reached out for help indicates some level of self-awareness.
My feeling is that given a successful team, led by an enlightened manager, it wouldn’t matter what methodology they used (which I mentioned previously in the context of tools). And there is some evidence to support this: Capers Jones suggests RUP and TSP have higher quality than Scrum or other approaches. Now that is just one dataset, but it is exactly one more than Mr Snowden has produced, as far as I can tell (the plural of anecdote is not data).
Does all this mean it doesn’t matter if we choose RUP, SAFe, Scrum, Kanban, Six Sigma, or Sensemaker? To some extent, I think that is true. I would guess that your measurable outcomes after implementing TSP would be similar to the outcomes after implementing SAFe. But the point is, one cannot measure these things in isolation! You will never know (Heisenberg-like) whether something else would have been better. The local project context is so important that the principles are more important than the specific practices (e.g., the agile manifesto, Cynefin domains, good organizational practices, etc).
[1] With one exception I am aware of: this paper from Simula in Norway. They paid 4 different companies to develop to the same set of requirements in order to understand the maintainability characteristics of different approaches. But even there, the results are difficult to generalize. Anda, B.C.D.; Sjoberg, D.; Mockus, A., “Variability and Reproducibility in Software Engineering: A Study of Four Companies that Developed the Same System,” IEEE Transactions on Software Engineering, vol. 35, no. 3, pp. 407-429, May-June 2009. doi: 10.1109/TSE.2008.89
It comes down to essential vs. accidental complexity, as outlined by Fred Brooks. What we research is new ways to ‘nibble’ at the accidental complexity: new languages (Go, Swift), new abstractions (Actors vs. functional programming in distributed systems), new methodologies (random test case generation). It’s what nearly every story on Hacker News is about.
But ultimately, I think most problems come down to two factors: the problem complexity itself, and the team tackling it. To me, many of the problems highlighted as software/IT failures, like the FBI registry, have nothing to do with a lack of good tools or techniques. These are ultimately management failures: scope creep, poor leadership, insufficient budget, too much budget, negative work environments, etc. It is ‘executing’ that is the problem, not the technology. How many errors have been caused by the US reliance on imperial units?
Look at this quote by a senior VP at Oracle on failures in implementing CRM projects:
[M]y comments apply to ALL CRM vendors, not just Oracle. As I perused the list, I couldn’t find any failures related to technology. They all seemed related to people or process. Now, this isn’t about finger pointing, or impugning customers. I love customers! And when they fail, WE fail.
I’d be willing to say that software engineers have all the tools they need. We need some form of continuous integration and deployment, abstraction mechanisms to simplify the problem, tests to verify our solution, version control to maintain a history of changes, and some form of requirements (whiteboard, paper, spreadsheet, what have you) to keep track of what needs to be built. I don’t even think it particularly matters how you use those tools. If you have a mature organization and process then you can fall into the following matrix (James Montier via Jonathan Chevreau):
| |Good Outcome|Bad Outcome|
|---|---|---|
|Good Process|Deserved Success|Bad Break|
|Bad Process|Dumb Luck|Poetic Justice|
But just having the right tools, the good people, and a mature process is not enough to guarantee success, of course. You could be tackling a ‘wicked problem’. You could have a team of misfits and losers. You could have a manager who refuses to accept responsibility or make decisions. Most software research does not address those issues. I’m not convinced there is any research that addresses those issues: leadership, management, sociology… nothing can help when your team lead is having a marital crisis and can’t devote any time to product development.