The fuzzy notion of “business value”

Software development is rife with references to business value, particularly in agile approaches: the Agile Manifesto declares that “Our highest priority is to satisfy the customer
through early and continuous delivery of valuable software.”

The trouble is that it isn’t clear what ‘valuable’ means. I’m sure that the point of this phrase, as with most of the Manifesto, is to start a discussion rather than to behave as a prescriptive methodology. I believe “value” is inherently context-dependent, so in that sense it is reasonable to leave it vague.

On the other hand, many people refer to business value as the Holy Grail of software development: this is what you are supposedly optimizing in Scrum. Other methodologies focus on “impact”, and Lean approaches have you remove ‘waste’ from the value stream. And yet no one has ever pinned value down, as Racheva et al. have shown [1] (in the software domain, anyway – many attempts have been made in economics).

Business value does have the nice property of communicability, though. It gives developers that one number to sell the project to the business, and allows for a conversation about scope, cost of delay, and prioritization that is difficult to do with purely qualitative methods. And for the mathematically inclined, it lends itself to algorithms like linear programming for optimization.
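To make the “algorithms for optimization” point concrete, here is a tiny sketch of what optimizing value can mean in practice: given per-feature cost and value estimates (all invented for this example), pick the subset that maximizes total value under a capacity budget. This is the 0/1 knapsack formulation rather than a full linear program, but the idea is the same.

```python
# Hypothetical illustration: prioritize features by estimated business
# value under a cost budget, via 0/1 knapsack dynamic programming.
# Feature names, costs, and values are all invented.

def prioritize(features, budget):
    """features: list of (name, cost, value); budget: integer capacity.
    Returns (total_value, chosen_names) maximizing value within budget."""
    # best[c] = (total_value, chosen_names) achievable with capacity c
    best = [(0, [])] * (budget + 1)
    for name, cost, value in features:
        # Iterate capacities downward so each feature is used at most once
        for c in range(budget, cost - 1, -1):
            cand_value = best[c - cost][0] + value
            if cand_value > best[c][0]:
                best[c] = (cand_value, best[c - cost][1] + [name])
    return best[budget]

features = [
    ("reporting", 3, 7),   # (name, cost in sprints, estimated value)
    ("sso-login", 2, 4),
    ("dark-mode", 1, 1),
    ("search",    4, 9),
]
value, chosen = prioritize(features, budget=6)
```

With these made-up numbers the optimizer picks "sso-login" and "search" (value 13) over the intuitively obvious "reporting" feature – exactly the kind of conversation a single value number enables.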

One paper does try to break business value into more reasonable components, which I quite liked. It is by Heidenberg et al. [2]. They break business value into six dimensions, each of which is ranked on an ordinal scale with four possible categories:

  1. Monetary Value – this is the number calculated by, say, a business analyst.
  2. Market Enabler – does delivering this feature create new market opportunity?
  3. Technical Enabler – does this feature help prepare the company for other features?
  4. Competence Growth – measures how much the work will improve the team’s skill.
  5. Employee Satisfaction – do the developers like working on this feature?
  6. Customer Satisfaction – how much will customers appreciate this work?

One of the big problems with agile planning is that making categories 2 through 5 visible is often hard. It is comparatively easy to sell a customer a feature that ranks highly in monetary value or customer satisfaction – these are the slick and cool UI widgets, the mission-critical Word reporting function, and so on. But making architectural work (category 3) visible is very challenging. With this six-factor model, prioritizing important architectural work would be easier.
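Here is a sketch (my own, not Heidenberg et al.’s actual procedure) of how the six ordinal dimensions might be combined into a comparable score; the items, ratings, and weights are all invented:

```python
# Sketch: represent a backlog item's business value as a vector over the
# six dimensions, each rated on an ordinal 0-3 scale, and compare items
# with explicit, negotiable weights. All numbers below are invented.

DIMENSIONS = ["monetary", "market_enabler", "technical_enabler",
              "competence_growth", "employee_satisfaction",
              "customer_satisfaction"]

def score(item, weights):
    """Weighted sum over ordinal ratings; the weights make trade-offs
    between dimensions explicit and discussable."""
    return sum(weights[d] * item[d] for d in DIMENSIONS)

# An architectural refactoring scores low on monetary value but high on
# the enabler dimensions; a weighting that values enablers keeps it
# competitive with a flashy UI feature.
refactor = dict(monetary=0, market_enabler=1, technical_enabler=3,
                competence_growth=2, employee_satisfaction=2,
                customer_satisfaction=0)
ui_widget = dict(monetary=3, market_enabler=0, technical_enabler=0,
                 competence_growth=1, employee_satisfaction=2,
                 customer_satisfaction=3)
weights = dict(monetary=2, market_enabler=1, technical_enabler=3,
               competence_growth=1, employee_satisfaction=1,
               customer_satisfaction=2)
```

With these weights the refactoring scores 14 against the widget’s 15 – close enough to argue about, which is the point: the architectural work is now visible in the prioritization conversation instead of silently losing to monetary value every time.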

[1] Z. Racheva, M. Daneva, and K. Sikkel, “Value Creation by Agile Projects: Methodology or Mystery?,” presented at Product-Focused Software Process Improvement, 2009, vol. 32, no. 12, pp. 141–155.

[2] J. Heidenberg, M. Weijola, K. Mikkonen, and I. Porres, “A Model for Business Value in Large-Scale Agile and Lean Software Development,” presented at EUROSpi: Systems, Software and Services Process Improvement, 2012, pp. 49–60.


Filed under Uncategorized

Obtaining a Pennsylvania Driver’s Licence with an H1-B

In case this helps other people:

PennDOT rules on what paperwork is needed can be found here. In addition, keep in mind the following:

  • We needed a letter from my employer, I94+passports, old licences, 2 proofs of residence (lease+bills), and a rejection letter from Social Security for the SSN (for my wife) and a letter with the number on it for me (haven’t got the physical card yet).
  • If you have an H1-B and your spouse has an H4, you will need to go with your spouse if s/he is getting a licence as well – you can’t go separately.
  • PennDOT does not take cheques drawn on foreign banks, only US banks. Fortunately you can get money orders easily at grocery stores. There is a Giant Eagle that does this near the Penn Hills licence centre.
  • Staff are pleasant but extremely over-worked, so be patient. Downtown Pittsburgh was less busy during the week than Penn Hills on the weekend.


Teaching Advanced Software Engineering


The course covers software architecture, with a focus on quality attributes, security, and formal methods. I liked the range of material, even though my expertise in formal methods is limited. It is difficult to teach architecture to students in a three-month time frame, so we supplemented the lectures with the AOSA textbooks. Each student gave a five-minute presentation as a way of exposing the class to a variety of architectures.

The other large component of the class is the course project. This semester the students had to build a location-aware, social application. There were great projects, including my personal fave, a zombie-fighting location-based game.

My favorite part of this course, like the third-year course, is seeing how the students approach the project. Some are truly excellent coders and put an enormous amount of effort into the project.

I introduced a few new lecture topics in addition to the pre-existing ones. I added a topic on service orientation and SOA, an important trend particularly in enterprise architecture; a new topic on REST, which connected well to the project; and a topic on agility and architecture, based in part on Dean Leffingwell’s book Agile Requirements. I thought all of these were useful, although they are less easily tested than, say, model checking, so perhaps students are simply forgetting them. I even mentioned CMMI!


I feel that these types of SE courses should be less about the lecture-and-project model and more about what Mark Guzdial, pointing to the work of Donald Schön, calls reflective apprenticeship. In other words, there should be more focus on feedback about the way students do design, more experiential learning, and less memorization of specific techniques such as formal methods (which should really be a separate course, in my opinion).

A parallel might be drawn with the way law schools operate. If you want to learn about how one practices law, argues cases, etc., you hire experienced practitioners as adjunct faculty. There seems to me to be a real difference between the skills of an academic SE faculty member and a person who has spent years building high-availability, mission critical software. There are a few of the latter at UBC, such as Philippe Kruchten, but in general it is exceedingly difficult for them to be hired without academic credentials. Not to mention the increasingly large salary gap between industrial SE and academic SE.

The other thing I disliked is that a few students seemed to hide out during the project, latching on remora-like to their more capable teammates to secure a good mark with no work. It irritates me to pass students who are not able to write code (not good code, or even mediocre code – just not write code! See FizzBuzz). It is very difficult to (defensibly) identify these people, however. One technique I should have used is to ask individual students questions during their final demos. This would help identify who actually knows what the heck is going on.

Academic dishonesty and software engineering

On the one hand, we cannot prohibit collaboration and code reuse: these are fundamental practices in software engineering. On the other hand, we need to assess each student’s actual contribution. I had a few interesting cases that suggest our policies in this area need more attention.
1. One group used a sample Ruby on Rails project to bootstrap their application. It came with most of the controllers they needed. They then customized the UI and logic to implement the functionality (poorly).
2. Another team hired a third-party designer to do custom artwork for the project (which looked fantastic).
3. A different team had a friend with web design expertise work on the CSS for the project.
4. Several teams used Twitter’s Bootstrap UI library or JqueryUI to simplify their efforts on the design end.
5. Many groups used third-party libraries to simplify their life, like JQuery, Rails, image libraries, etc.

Obviously, most of these are exactly what would happen in industry. On the other hand, it definitely gains one an advantage. The Twitter Bootstrap apps all looked an order of magnitude better than the custom apps.

My principle was that remixing and reuse were fine, as long as they were properly acknowledged. In the design case, we could try to discount that aspect of the UI in the marking. But it is almost certainly the case that some groups did NOT acknowledge their use of other people’s IP, and yet benefited from it. I don’t have a good solution to the problem of detecting code reuse. Furthermore, the burden of proof for calling something cheating is quite high, and requires more than a gut feeling or a commit to GitHub that touched hundreds of files at once (i.e. a bulk commit of third-party code).

IT role

One of the things which I think will only become more prevalent is the use of third-party services to manage the course. In the past, students would do their assignments on CS department machines and servers, use a CS database server, and store code on the department Subversion or IBM RTC servers.

This semester I don’t think we used a single department resource, save for email (and that only because I was forced to for privacy reasons) and the course webpage. Class discussions took place on a third-party service; code and issues were managed with GitHub; and students nearly always had their own laptops and Android devices. (Not a single group chose iOS, incidentally, although nearly half the class has MacBooks. I think the $99 fee is a real stumbling block – that and Objective-C.)

I did not get any support or material from the department, apart from the classroom and photocopier. I could just as easily have run this course from my home. So what should the IT section do? They could manage GitHub for me (they were extremely reluctant to do this, and apparently very hesitant about even installing Bugzilla). They could provide more AV services to record classes. They could manage virtual machines for me, so that each student could install the same setup – tools like Puppet and Vagrant will be key in the coming years.

Finally, the UBC wireless infrastructure is truly terrible – you get better wifi at Starbucks. Latency between two machines in my office was 200 ms! The connection constantly drops or slows to a crawl, to the point where even in-class demos suffer.

Student perceptions

In an unscientific survey, I asked the following questions:

1. How could the TAs and I improve your experience?

Students were either positive (but their names were attached to their responses, so that isn’t unexpected) or asked for more help. One of the big challenges they face is sorting out silly configuration problems. They would also like more advice on design choices. I think this is a real opportunity to make the project more like an apprenticeship model, à la Software Craftsmanship: take some senior developers and have them do an hour of code review, an hour of design feedback, and so on. And there seem to be many companies eager to help out (and recruit) for whom this might be doable.

The other issue was that, due to 4) below, the TAs and I often did not know much about the technology (e.g., Microsoft’s C#/Azure platforms). Tracking down answers on your own, however, is definitely a learning objective of the course. Admittedly, in industry one could often ask senior devs these questions, but the ability to find the answers yourself is invaluable, I feel.

2. Were the AOSA readings useful?

Most students responded that they appreciated the opportunity to present to a large audience. But the overall lessons of the architecture of these systems were lost on them, because the readings did not have much relevance to the project, which consumed the majority of their time. Writing exam questions was also difficult, as there was a lot of material to study. I think I would keep this module but be stricter about the time limit (5 minutes) and give some introductory examples/prep beforehand.

3. Did Github work for you?

Students loved GitHub. EGit was less good (and personally I find it less usable than the command line). Overall, a major improvement over RTC.

4. Was the freedom to choose language good or bad?

Most students loved this aspect as well. In the past the project, worth 40% of the course, has been in, e.g., Java+Tomcat for everyone. Feedback here indicated that the main problems were finding team members with similar interests (in, e.g., RoR), getting help from TAs, and a possible penalty on the final exam, where the code snippets are in Java.

5. What annoyed you about the project?

Unsurprisingly, most complaints were about the time the project took – one student spent 80 hours over two weekends on what was, admittedly, a really cool UI – and about the vagaries of group work: teammates get sick, abandon the team, or simply are not good at programming. Students would appreciate more help scoping the project and getting it started earlier. We tried to address this by insisting on an early ‘project idea’ review in the first three weeks, and by doing a 30-minute design review midway. However, some people have to learn the hard way, and ultimately we are constrained by the number of teams – 27 in this case. Multiply that by 30 minutes and you can see the magnitude of the challenge. I had three TAs to help, but that is still a ton of work. And I think students got frustrated, since they see a 1-to-1 interaction, not the 27-to-1 that I see.



A stitch in time…

This blog post from the excellent complexity blog Gödel’s Lost Letter is on the theory behind branch-and-bound search. One of my favourite things about this sort of analysis is how it can eliminate, with mathematical certainty, hours and hours of programming effort. Consider this statement:

There is an issue of being odd or even, which matters but not hugely, since pruning the bottom layer is not so valuable.

I have spent many hours working on problems that might fall into the “matters, but is not so valuable” category. A few hours of analysis might well have saved me a lot of trouble.


One aspect of an “engineering discipline” …

In the ACM Software Engineering Code of Ethics, part 8, “SELF”, refers to one’s ethical duty to oneself. Software developers should ensure they “Further their knowledge of developments in the analysis, specification, design, development, maintenance and testing of software and related documents, together with the management of the development process.”

Recently Tim Bray tweeted “Software engineering needs consensus on Best Practices in terms of tools and architecture”, linking to his blog in response to Steve Yegge’s post on liberalism and conservatism in software engineering.

I respect immensely the work that engineers do in the field. You cannot fault the quality and longevity of Tim’s accomplishments as a software engineer. But as a researcher I do find it incredibly frustrating that there remains this perception that software is not an engineering discipline, or that best practices are not codified. It sounds to me like Tim is calling for a conference of people facing similar software challenges.

The problem is, we’ve already had this discussion. In 1968. And arguably in 2001. And while there is an awful lot of garbage and noise in academic research, we have some really good, empirically justified understanding of what these best practices are.

They aren’t hard to find: you can find excellent case studies and experiments in numerous sources like the book Making Software, IEEE Software magazine, Microsoft’s Empirical SE group, ICSE papers, Crosstalk Journals, Cutter Journals, and of course practitioner blogs. This need for evidence is a big reason for our efforts in the Never Work in Theory blog.


The Trouble with Data Mining in Software Engineering

A lot of work has gone into applying statistical tools to software engineering artifacts, like code, tests, and social networks. The premiere venue for this is the International Working Conference on Mining Software Repositories (MSR).

As an example, other researchers and I have been trying to automatically extract requirements from project artifacts, including emails, issue trackers, and code commit messages. You can read more about my work here; it was recently extended by Abram Hindle here. Another interesting tack was taken by Radu Vlas at Georgia State University, using ontologies and part-of-speech tagging.

The trouble is that all of the techniques we’ve used to date give a precision/recall of at best 60/60 [1], meaning we miss 40% of the actual requirements, and only 60% of what we extract is in fact a requirement. This isn’t very satisfactory: you miss possibly important requirements and still have to wade through a lot of noise. We’d like to get this to 100/100, of course, but anything would be an improvement. The benefits would be immense: requirements traceability would be simplified, allowing us to talk about whether a requirement was implemented, which requirements interact, what requirements a project currently satisfies, and so on.
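For concreteness, here is how those precision/recall figures are computed against a hand-labeled gold set; the requirement identifiers are invented:

```python
# Sketch: precision/recall of an extractor against a hand-labeled gold
# set of requirements. The identifiers below are invented.

def precision_recall(extracted, gold):
    tp = len(extracted & gold)                       # true positives
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

gold = {"REQ-1", "REQ-2", "REQ-3", "REQ-4", "REQ-5"}   # analyst's labels
extracted = {"REQ-1", "REQ-2", "REQ-3", "NOISE-1", "NOISE-2"}
p, r = precision_recall(extracted, gold)   # 3/5 and 3/5: the 60/60 case
```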

Unfortunately, I wonder if we haven’t hit the fundamental limit of the natural language parsing/machine learning toolkits (in this domain). I say this because the most obvious successes of machine learning, such as spell-checking or flu tracking, owe at least as much to the vast amount of data involved as to the novelty of the techniques themselves. The problem is that in software projects the amount of information is measured in tens of thousands of data points, when it really needs to be millions or tens of millions to be successful.

For example, one of the longest-lived open projects, Mozilla, has only about 800,000 issues in its issue tracker. That is not bad, but we need to remove duplicates and plain bug reports to find the ‘new features’ or ‘requirements’ – leaving more like a few thousand. And it seems unlikely that commercial sources would have orders of magnitude more to offer. This is like training a spell-checker on a corpus of ten thousand English sentences – it will perform terribly. Similarly, on the Google Flu Trends link above, you can see that for the Canadian provinces/territories of PEI, NWT and Nunavut there is not enough data (due to small populations) to make any predictions.

Getting more data isn’t obvious, either. We might aggregate the projects together, but each individual project tends to be very different in how they use language, so essentially it is like configuring ten different languages for your spell-checker each with ten thousand sentences. Impossible!

So assuming automating requirements extraction from project data is useful, how might we do this? I think there are two approaches. One, we downgrade our definition of ‘automation’ to allow for human judgement. This might mean asking developers to define a common project lexicon, or to be more diligent about annotating requirements automatically (many projects already separate these). However, asking developers to do things other than their core activities (writing code!) is usually doomed to failure unless it is very painless.

I think the other approach is to move past the bag of words model. In most statistical learners, you throw documents or sentences into a huge corpus. This works great for standard information retrieval examples like the Reuters corpus. But in software projects, it feels like by doing this we are losing a lot of the metadata, like dates or people, that might be relevant. Perhaps if we somehow annotate the training data with this information, and feed that into a learner, we would have more success.
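A minimal sketch of that idea (the field names and the tiny example issue are invented): fold the metadata into the same feature dictionary as the word counts, so a standard vectorizer and learner can use both.

```python
# Sketch: augment a bag-of-words representation of a project artifact
# with metadata (author role, artifact type, age) before handing it to
# a learner. Field names and the example issue are invented.

from collections import Counter

def featurize(issue):
    feats = Counter(issue["text"].lower().split())   # plain bag of words
    # Metadata features live in the same dict; a dict-based vectorizer
    # can then map every feature name to a column.
    feats["meta:author_role=" + issue["author_role"]] = 1
    feats["meta:artifact=" + issue["artifact"]] = 1
    feats["meta:age_bucket=" + ("old" if issue["age_days"] > 365 else "recent")] = 1
    return dict(feats)

issue = {"text": "Add export to CSV for reports",
         "author_role": "maintainer", "artifact": "issue_tracker",
         "age_days": 12}
features = featurize(issue)
```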

  1. I may be ball-parking these numbers!  ↩


RE-KOMBINE: Paraconsistent Reasoning for Requirements

Here I try to summarize the work and motivation for my dissertation, or parts thereof. One of the bigger components of my thesis is RE-KOMBINE, which is a reasoning engine for requirements models. You can find the code and examples here.

A central problem in requirements engineering (RE) is understanding what to build next. Requirements models are representations of what your system needs to accomplish. They encompass both what exists – the current payroll system, for example – and what needs to be created – say, the interface between the payroll system and the point-of-sale system. Like any model, requirements models are imperfect copies of the real world, with the degree of imperfection trading off against how much cognitive support the model provides. For example, some of the most popular requirements models are informal lists, kept on paper or in a spreadsheet.

My dissertation research was about better understanding what a good model of evolving requirements ought to do. And I’ve come to the conclusion (somewhat diametrically opposed to my beliefs at the start of my Ph.D.) that formality is not only NOT harmful, it is under-utilized.

Formality considered essential

One’s representation of a problem is either informal or formal. I don’t understand people who say they are ‘semi-formal’. That’s like being a little pregnant. Formality means very specific things in this domain. Furthermore, if we want a machine like a computer to do anything for us, we cannot give it inconsistent instructions: most of the machines we’ve built handle inconsistent information very badly (although we are working on that). So even though you may claim to have a ‘semi-formal’ language, at some point it will assuredly be formal. It’s just that the translation happens at a different point in the chain. E.g.:

problem domain → req. elicitation → informal model → Rational RequisitePro/Blueprint → knowledge/software engineer (makes many decisions that were ‘implicit’) → formal representation (code, typically)

versus:

problem domain → elicitation → formal model → translation engine (human/automated) → source code

But what do we mean by ‘formal’?

Opponents of formalization often characterize it as unwieldy, non-scalable, confusing for end-users, etc. Is it difficult for humans to represent things in a formal system? Sure.
But as this book says, “the power of formalization is that, once formalized, an area of interest can be worked in without understanding” (Reeves, S., & Clarke, M. (2003). Logic for Computer Science. Addison Wesley).

This makes formalization particularly important for sharing with others, or for use by computer programs. Is reading a formalization challenging? Yes. There is much abuse of notation in formal research papers, and worse, the demands formality places on precision can make for an unwieldy presentation (you cannot just wave your hands and ‘claim’ something is true). But that is also the beauty of it – once you understand the formalization (for example, propositional logic), it becomes impossible (or rather, very difficult) to hide woolly thinking. Is formalization subject to scalability challenges? Sure. If a problem is inherently intractable, formalization won’t remove that. But neither will being informal.

I actually think titles like “formal methods” or “formalization” are not helpful: they carry so much baggage that the terms have become pejorative. Also, if you are a researcher in software and don’t understand propositional logic, lambda calculi, complexity theory, etc., shame on you. Even if you work on human-computer interaction, these topics will make you a better researcher.

Techne (τέχνη)

We have been working on another requirements modeling language, Techne (see this paper). Why another language? We have plenty of requirements languages already: KAOS, Problem Frames, i*, Tropos, UML (sort of)… In Techne we tried to write a language that is minimal and captures the essence of the requirements problem: given your requirements, find the implementation choices that will satisfy them (originally proposed by Zave and Jackson). We are working on ways to extend Techne with concepts for more expressive models, using actors, numeric weights, and so on. But the initial language is merely requirements, tasks, and constraints (domain assumptions).

Requirements and tasks (nodes) are encoded as propositions, and the only formalization necessary is the relationships between propositions. These come in two flavours. Implications capture the notion that (for instance) implementing task T will satisfy requirement R. Conflict relations represent the situation where doing one thing means another cannot be done, which we write as D ∧ E → ⊥ (read: satisfying requirements D and E together is inconsistent). We deliberately avoided propagating ‘partial’ anything, as some other languages do, because it is unclear what partial satisfaction means.
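A toy encoding of this fragment might look as follows (my own sketch, not official Techne syntax; T, R, D, and E are placeholder node names):

```python
# Toy encoding of a Techne-style model: propositions partitioned into
# tasks and requirements, plus implications and conflict relations.
# This is an illustrative sketch, not the official Techne notation.

model = {
    "tasks": {"T"},
    "requirements": {"R", "D", "E"},
    # Horn implication: satisfying everything on the left satisfies
    # the proposition on the right (here, task T satisfies requirement R).
    "implications": [({"T"}, "R")],
    # Conflict relation: D and E jointly entail bottom (D ∧ E → ⊥).
    "conflicts": [{"D", "E"}],
}

def violates(satisfied, model):
    """True if the set of satisfied propositions triggers a conflict."""
    return any(c <= satisfied for c in model["conflicts"])
```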

Why is this stuff useful? Techne makes two main contributions. One is a formal requirements modeling language with very well-understood proof theories and theorem provers (using Horn logic): finding inconsistencies is decidable and polynomial, whereas full SAT is NP-complete. The second is the addition of the following notions: mandatory and optional states for nodes; preferences between minimal solutions in a set of requirements; and approximations of softgoals using quality constraints.

Node labels

Typically in requirements models we want to associate with each element a label reflecting the satisfiability of that node (hence the use of SAT solvers). We might monkey around with this notion, for example by using four-valued logic to handle ‘partial satisfaction’ (e.g. Sebastiani et al.).

Some nodes, certainly domain assumptions, and possibly tasks we have chosen, will start with labels (let’s call them T or F for now, although the mapping from these letters to the notions of Truth or Falsehood is not straightforward). The outcome of ‘evaluating’ a goal model (or other requirements model) is a labeled model, which tells the analyst which elements can be satisfied (hopefully the high level requirements!). We can also call this a ‘solution’ to the requirements problem – it tells us which elements will satisfy our optative properties.

Types of reasoning

I’m primarily considering logical reasoning; there are other pseudo-logical algorithms available. I am trying to collect these algorithms on GitHub. I’d love pull requests.

All requirements languages rely on consistent models. That is, if an inconsistency is found (that bottom, ⊥, can be derived), the entire model is trivialized; the inconsistency must be removed.

Two main approaches are forward and backward reasoning. In forward reasoning we start with a set of ‘facts’, and try to determine what ‘goals’ we can fulfill. Expert systems work like this, too. Considerations: can you support cyclic graphs? Does the algorithm terminate? Is the algorithm scalable?
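A minimal forward-chaining sketch over Horn implications, with invented node names. Note that it answers the termination consideration directly: the loop terminates even on cyclic graphs, because the set of derived facts only grows.

```python
# Minimal forward chaining over Horn implications: start from a set of
# 'facts' (chosen tasks, domain assumptions) and saturate. Node names
# are invented for illustration.

def forward_chain(facts, implications):
    """implications: list of (body_set, head). Returns the closure."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in implications:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True          # keep iterating until a fixpoint
    return derived

rules = [({"t1"}, "r1"), ({"t2", "r1"}, "r2"), ({"r2"}, "goal")]
closure = forward_chain({"t1", "t2"}, rules)   # derives r1, r2, goal
```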

Backward reasoning starts with a goal, or goals, and works backwards to find the facts that make it true. This is how Prolog works: resolution proofs. I’m not aware of any requirements languages that support backward chaining; Datalog might be an example, though, and if we generalize requirements models to KR systems, there is a lot of work here.
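For contrast, a purely propositional sketch of backward chaining in the Prolog style (invented names; a `seen` set guards against cycles):

```python
# Backward chaining sketch: to prove a goal, find a rule whose head
# matches and recursively prove its body. Propositional only; the
# `seen` set prevents infinite loops on cyclic rule sets.

def prove(goal, facts, rules, seen=frozenset()):
    if goal in facts:
        return True
    if goal in seen:
        return False                    # cycle guard
    for body, head in rules:
        if head == goal and all(
                prove(g, facts, rules, seen | {goal}) for g in body):
            return True
    return False

rules = [({"t1"}, "r1"), ({"t2", "r1"}, "r2")]
result = prove("r2", {"t1", "t2"}, rules)   # True: t1 gives r1, then r2
```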

A final way to reason about goal models is to try to ‘label’ the graphs consistently. I don’t hold this to be either backward or forward reasoning: instead, you are just brute-forcing the problem into a set of conjunctive normal form (CNF) formulas, and then trying to satisfy the resulting wff. This problem is NP-complete, but SAT solvers have advanced to the point where most requirements problems should be readily solvable. The nice thing about the CNF representation is that there are a variety of twists on the boolean satisfiability problem, such as WeightedSAT, MaxSat, MinCostSat, etc.
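The brute-force version of this framing can be sketched directly. It is exponential, which is exactly what real SAT solvers avoid with clever search, but it shows what ‘labelling the graph consistently’ amounts to (the clauses and variable names are invented):

```python
# Brute-force CNF satisfiability sketch: each clause is a list of
# (variable, polarity) literals; try every assignment. Exponential --
# real SAT solvers do much better -- but it makes the framing concrete.

from itertools import product

def brute_force_sat(variables, clauses):
    """Return a satisfying assignment dict, or None if unsatisfiable."""
    for bits in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, bits))
        if all(any(assign[v] == pol for v, pol in clause)
               for clause in clauses):
            return assign
    return None

# Encoding (T -> R) and (R or D) and (not D) in CNF:
clauses = [[("T", False), ("R", True)],    # ~T v R
           [("R", True), ("D", True)],     # R v D
           [("D", False)]]                 # ~D
assignment = brute_force_sat(["T", "R", "D"], clauses)
```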


To extend these approaches, we noted that it is often desirable to support paraconsistency, that is, tolerating inconsistency. There are at least four reasons for allowing inconsistent statements and working around them (after Nuseibeh et al.):

  • to facilitate distributed collaborative working,
  • to prevent premature commitment to design decisions,
  • to ensure all stakeholder views are taken into account,
  • to focus attention on problem areas [of the specification].

RE-KOMBINE is the name of the tool I wrote to support paraconsistent reasoning over Techne models. You can view a presentation which summarizes it on Slideshare, or read the CAiSE 2012 paper for more in-depth discussion.

If we continue with the Techne notion that we should have propositions which are either requirements, tasks, or domain assumptions, and then allow for refinements and conflict relations between them, then our paraconsistent approach simply says that we credulously accept minimal solutions which entail the desired requirements. That is, we are merely looking for a subset of the tasks which satisfy our requirements.

This is nice, because it means that even if there is a possible conflict between two requirements, or between a domain assumption (like “Don’t Use WEP”) and a requirement (“Use WIFI for remote terminals”) we can ignore that conflict as long as there is a ‘workaround’ solution. We like this, because it means we can be more flexible (agile) by looking for the immediately implementable solution, and worry later about how we might actually make the conflict disappear.
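A toy version of this search (my sketch, not RE-KOMBINE itself), using the WEP/WIFI flavour of example with invented node names: enumerate task subsets, keep those whose closure entails the mandatory requirements without triggering a conflict, and retain only the minimal ones.

```python
# Toy paraconsistent solution search: credulously accept minimal task
# subsets whose closure entails every mandatory requirement and triggers
# no conflict. Node names are invented for illustration.

from itertools import combinations

def closure(facts, implications):
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in implications:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

def minimal_solutions(tasks, implications, conflicts, mandatory):
    found = []
    for k in range(len(tasks) + 1):      # smallest subsets first
        for subset in combinations(sorted(tasks), k):
            c = closure(set(subset), implications)
            if mandatory <= c and not any(x <= c for x in conflicts):
                # keep only subsets not subsumed by a smaller solution
                if not any(set(s) <= set(subset) for s in found):
                    found.append(subset)
    return found

tasks = {"use_wifi", "use_wep", "use_wpa2"}
implications = [({"use_wifi"}, "remote_terminals"),
                ({"use_wep"}, "encrypted"), ({"use_wpa2"}, "encrypted")]
conflicts = [{"use_wep"}]                # domain rule: don't use WEP
solutions = minimal_solutions(tasks, implications, conflicts,
                              mandatory={"remote_terminals", "encrypted"})
```

Even though choosing WEP would conflict with the domain assumption, the search simply routes around it and reports the WPA2 workaround, which is the paraconsistent behaviour described above.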

The only constraints we impose are that our domain assumptions are internally consistent, and that the requirements we are seeking to satisfy are consistent with each other (if this isn’t the case, then presumably the operator is confused).

We used RE-KOMBINE on the Payment Card Industry case study (PCI-DSS) as a proof-of-concept. Our next focus is to make this tool integrate with existing requirements management and work tracking tools, in order to seamlessly fit into existing workflows.

It’s possible this post was too long.
