Knowledge and complexity

Somewhat inspired by +Rob England, I tried a mapping of Rumsfeldian terminology to Cynefin (yes, I know this predates the SecDef!).

Known knowns – either a simple or complicated case. If simple, we do it routinely. E.g. landing an aircraft in good weather.

Known unknowns – we have a plan for accommodating them. E.g. landing an aircraft in Edmonton with high crosswinds.

Unknown unknowns – we are in a complex domain and we have to find out how things work through experiments, e.g. edge-of-envelope flying. One of the things that the Kennedys made clearer for me is how experiments have to be well conceived, in particular by controlling variables properly (see their SysEng Journal paper).

Unknown knowns – we didn’t realize we could plan for this, but now that we sense it, we can delegate to existing routines, or perhaps adapt to it. That pathway is available but unused. An example would be looking through the manual and realizing the system can actually do this (maybe the Apollo 13 case? There they did the exaptation that Snowden often mentions).

Snowden would no doubt criticize my limited understanding, but there is some use in seeing how these frameworks co-exist, for me at least.

Update: Dave Snowden replies with a pointer to his HBR article with Mary Boone in which “knowns” are discussed. Summary: simple contexts are the domain of known knowns; complicated contexts are the domain of known unknowns, since experts are required; and complex contexts are the domain of unknown unknowns. Interestingly, they categorize the Apollo 13 case as being in the complex context. In the sense that there was no clear answer, that makes sense, but to me it also highlighted the idea of unknown knowns: these skilled engineers did “know” the answer (since the astronauts survived), but not consciously. So we could perhaps characterize that as relying on the “expert within”.

Filed under Uncategorized

Some Advice on Doing a PostDoc in Software Engineering

Post-doc positions in CS are a growing part of the research landscape, as seen in this figure from the CRA:

CRA 2011 Research positions

So if you are a senior doctoral student, should you take a post-doc offer? Herewith a few tips based on my own experience (seven years as a doctoral student in Toronto, 1.5 years as a postdoc at UBC, now a researcher at the SEI).

1. Figure out your long-term goal. Do you want a research-intensive faculty post? Or a teaching-intensive job? Industrial research lab? Industry development job? I would not bother with a post-doc if I wanted a programming job (even a PhD is a hindrance here, in most cases).

2. Think about networking. Most of the people who get the top jobs in the field are well-connected to the main community, via supervisor connections, industry internships, and collaborations. You will need to secure 3-4 people who will write strong letters for you. You need to get onto the short list. I am assuming you already understand what the bar is for high quality research.

3. Avoid teaching positions if you want to do research, and vice-versa. I did a dual position, and while I love teaching, research suffered. There is plenty of time to think about teaching later, and in my experience, top research schools almost never ask about teaching experience. On the other hand, if you apply to teaching universities, then demonstrable ability to manage a large class should serve you well, as will good evaluations.

4. Evaluate how well the position will accommodate your existing research contributions. You simply do not have time in the standard 2 year postdoc to shift your research interests dramatically. To me this is one big difference with life science postdocs, where you get lab experience and the positions are typically for 3-4 years. Ideally, you will be able to submit to ICSE, CAV, FMCAD, PLDI etc. immediately after starting. I’m not convinced that hiring committees are at all sympathetic to *any* gaps in your publication record.

5. Be realistic about the quality of the lab’s past research. Are they publishing in the top venues? Is there a history of collaborative work, where you might be able to tag along on a paper as you start your post?

6. Finally, all the other criteria: is it a nice city? Are the people friendly? What is the salary? 50K Canadian is at the upper end for most positions.

These tips are mainly pragmatic and from a research hiring point of view, where the main reason to do a post-doc is to improve your publication record and increase your social network. That being said, one of the things I most liked about my postdoc was meeting and working with great people. That sticks with you longer than any paper submissions will.

EDIT: I’ve been reminded of another criterion, namely, does your partner (if any) support your decision!

The fuzzy notion of “business value”

Software development is rife with references to business value, particularly in agile approaches: the Agile Manifesto declares that “Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.”

The trouble is that it isn’t clear what ‘valuable’ means. I’m sure that the point of this phrase, as with most of the Manifesto, is to start a discussion rather than to behave as a prescriptive methodology. I believe “value” is inherently context-dependent, so in that sense it is reasonable to leave it vague.

On the other hand, many people refer to business value as the Holy Grail of software development: this is what you are supposedly optimizing in Scrum. Other methodologies help focus on “impact”. Lean approaches have one remove ‘waste’ from the value stream. And yet no one has ever pinned value down, as Racheva et al. have shown [1] (in the software domain, anyway – many attempts have been made in economics).

Business value does have the nice property of communicability, though. It gives developers that one number to sell the project to the business, and allows for a conversation about scope, cost of delay, and prioritization that is difficult to do with purely qualitative methods. And for the mathematically inclined, it lends itself to algorithms like linear programming for optimization.
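To make the optimization remark concrete, here is a minimal sketch, assuming each feature has a single value number and a single cost estimate (the feature names and figures are invented for illustration). Maximizing total value under one capacity constraint is a small linear program; in that special case it is the fractional knapsack problem, so a greedy pass by value-to-cost ratio finds an optimal answer without a solver. Real planning would have more constraints and would need a proper LP library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str     # hypothetical feature label
    value: float  # estimated business value, arbitrary units
    cost: float   # estimated effort, e.g. person-days (must be > 0)


def plan_release(features, capacity):
    """Maximise sum(value * x) subject to sum(cost * x) <= capacity, 0 <= x <= 1.

    With a single capacity constraint this linear program is the fractional
    knapsack problem, so taking features in descending value/cost order
    yields an optimal solution.
    Returns (total_value, {feature name: fraction selected}).
    """
    remaining = capacity
    total = 0.0
    plan = {}
    ranked = sorted(features, key=lambda f: f.value / f.cost, reverse=True)
    for feature in ranked:
        if remaining <= 0:
            break
        # Take the whole feature if it fits, otherwise the affordable fraction.
        fraction = min(1.0, remaining / feature.cost)
        plan[feature.name] = fraction
        total += fraction * feature.value
        remaining -= fraction * feature.cost
    return total, plan


# Invented backlog, for illustration only.
backlog = [
    Feature("reporting", value=60, cost=10),
    Feature("ui-widget", value=100, cost=20),
    Feature("refactoring", value=120, cost=30),
]
total_value, plan = plan_release(backlog, capacity=50)
```

A fractional answer (here, only part of the refactoring fits) is what a linear program gives you; if features are all-or-nothing, the problem becomes an integer program and the greedy shortcut no longer guarantees the optimum.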

One paper does try to break business value into more reasonable components, which I quite liked. It is by Heidenberg et al. [2]. They break business value into six dimensions, each of which is ranked on an ordinal scale with four possible categories:

  1. Monetary Value – this is the number calculated by, say, a business analyst.
  2. Market Enabler – does delivering this feature create new market opportunity?
  3. Technical Enabler – does this feature help prepare the company for other features?
  4. Competence Growth – measures how much the work will improve the team’s skill.
  5. Employee Satisfaction – do the developers like working on this feature?
  6. Customer Satisfaction – how much will customers appreciate this work?

One of the big problems with agile planning is that making categories 2, 3, 4, 5 visible is often hard. It is comparatively easy to sell a customer a feature that ranks highly in monetary value or customer satisfaction – these are the slick and cool UI widgets, the mission-critical Word reporting function, and so on. But making architectural work (category 3) visible is very challenging. If we had this six-factor model, prioritizing important architectural work would be easier.
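As a rough sketch of how the dimensions listed above could be recorded per backlog item, assuming ranks 0 to 3 stand in for the four ordinal categories (the rank labels, class name, and helper are my own invention, not taken from the paper). Since the scale is ordinal, the sketch avoids summing ranks and instead just surfaces the dimensions that are hard to sell.

```python
from dataclasses import dataclass, fields

# Hypothetical labels for the four ordinal categories.
RANKS = ("none", "low", "medium", "high")


@dataclass(frozen=True)
class ValueProfile:
    monetary_value: int
    market_enabler: int
    technical_enabler: int
    competence_growth: int
    employee_satisfaction: int
    customer_satisfaction: int

    def __post_init__(self):
        # Reject anything outside the four allowed categories.
        for f in fields(self):
            rank = getattr(self, f.name)
            if rank not in range(len(RANKS)):
                raise ValueError(f"{f.name} must be 0..{len(RANKS) - 1}, got {rank}")

    def describe(self):
        """Map each dimension name to its ordinal label."""
        return {f.name: RANKS[getattr(self, f.name)] for f in fields(self)}


def hidden_value(profile):
    """Dimensions ranked 'high' other than the two that are easy to sell."""
    easy_to_sell = {"monetary_value", "customer_satisfaction"}
    return [
        name
        for name, label in profile.describe().items()
        if label == "high" and name not in easy_to_sell
    ]


# Invented example: architectural work with no direct monetary value.
refactor_auth = ValueProfile(
    monetary_value=0,
    market_enabler=1,
    technical_enabler=3,
    competence_growth=3,
    employee_satisfaction=2,
    customer_satisfaction=0,
)
```

Calling `hidden_value(refactor_auth)` lists the technical-enabler and competence-growth dimensions, which is the kind of visibility that a single monetary number hides.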

[1] Z. Racheva, M. Daneva, and K. Sikkel, “Value Creation by Agile Projects: Methodology or Mystery?,” presented at Product-Focused Software Process Improvement, 2009, vol. 32, no. 12, pp. 141–155.

[2] J. Heidenberg, M. Weijola, K. Mikkonen, and I. Porres, “A Model for Business Value in Large-Scale Agile and Lean Software Development,” presented at EuroSPI: Systems, Software and Services Process Improvement, 2012, pp. 49–60.

Obtaining a Pennsylvania Driver’s Licence with an H1-B

In case this helps other people:

PennDOT rules on what paperwork is needed can be found here. In addition, keep in mind the following:

  • We needed a letter from my employer, I94+passports, old licences, 2 proofs of residence (lease+bills), and a rejection letter from Social Security for the SSN (for my wife) and a letter with the number on it for me (haven’t got the physical card yet).
  • If you have an H1-B, and your spouse has an H4, you will need to go with your spouse if s/he is getting a licence as well – you can’t go separately.
  • PennDOT does not take cheques drawn on foreign banks, only US banks. Fortunately you can get money orders easily at grocery stores. There is a Giant Eagle that does this near the Penn Hills licence centre.
  • Staff are pleasant but extremely over-worked, so be patient. Downtown Pittsburgh was less busy during the week than Penn Hills on the weekend.

Teaching Advanced Software Engineering

Material

The course covers software architecture, with a focus on quality attributes, security, and formal methods. I liked the range of material, even though my expertise with formal methods is limited. It is difficult to teach architecture to students in a three-month time frame, so we expanded on it using the AOSA textbooks. Students gave five-minute presentations as a way of exposing them to a variety of architectures.

The other large component of the class is a course project. In this semester they had to build a location-aware, social application. There were great projects including my personal fave, a zombie fighting location-based game.

My favorite part of this course, like the third-year course, is seeing how the students approach the project. Some are truly excellent coders and put an enormous amount of effort into the project.

I introduced a few new lecture topics in addition to the pre-existing ones. I added a topic on Service Orientation and SOA, an important trend, particularly in enterprise architecture; a new topic on REST, which was well connected to the project; and a topic on agility and architecture, based in part on the book by Dean Leffingwell, Agile Requirements. I thought all of these were useful, although they tend to be less easily tested than e.g. model checking, so perhaps students are just forgetting them. I even mentioned CMMI!

Overall

I feel that these types of courses in SE should be more about reflective apprenticeship than the lecture and project model. In other words, there should be more focus on feedback about the way in which students do design, more experiential learning, and less memorization of specific techniques such as formal methods (which should really be in a separate course, in my opinion). Mark Guzdial uses the term reflective apprenticeship for this and points to the work of Donald Schön.

A parallel might be drawn with the way law schools operate. If you want to learn about how one practices law, argues cases, etc., you hire experienced practitioners as adjunct faculty. There seems to me to be a real difference between the skills of an academic SE faculty member and a person who has spent years building high-availability, mission critical software. There are a few of the latter at UBC, such as Philippe Kruchten, but in general it is exceedingly difficult for them to be hired without academic credentials. Not to mention the increasingly large salary gap between industrial SE and academic SE.

The other thing I disliked is that it seemed to me that a few students hid out during the project, latching on remora-like to their more capable teammates to secure a good mark with no work. It irritates me to pass students who are not able to write code (not good code, or even mediocre code – just not write code! See Fizzbuzz). It is very difficult to (defensibly) identify these people, however. One technique which I should have used is to ask questions of the individual students during their final demos. This would help to identify who actually knows what the heck is going on.
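For anyone who has not met the Fizzbuzz reference: it is a well-known screening exercise in which you print the numbers 1 to 100, replacing multiples of 3 with "Fizz", multiples of 5 with "Buzz", and multiples of both with "FizzBuzz". A minimal version, as one of many possible solutions, looks like this:

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def fizzbuzz_lines(limit=100):
    """Collect the sequence for 1..limit so it can be printed or checked."""
    return [fizzbuzz(i) for i in range(1, limit + 1)]


if __name__ == "__main__":
    print("\n".join(fizzbuzz_lines()))
```

The point of the exercise is not the problem itself but that it takes a few minutes for someone who can program at all, which is why it works as a floor rather than a measure of skill.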

Academic dishonesty and software engineering

On the one hand we cannot prohibit collaboration and code re-use: these are fundamental practices in software engineering. On the other hand, we need to assess the student’s actual contribution. I had a few interesting cases that suggest our policies in this area need more attention.
1. One group used a sample Ruby on Rails project to bootstrap their application. It came with most of the controllers they needed. They then customized the UI and logic to implement the functionality (poorly).
2. Another team hired a third-party designer to do custom artwork for the project (which looked fantastic).
3. A different team had a friend with web design expertise work on the CSS for the project.
4. Several teams used Twitter’s Bootstrap UI library or jQuery UI to simplify their efforts on the design end.
5. Many groups used third-party libraries to simplify their life, like jQuery, Rails, image libraries, etc.

Obviously, most of these are exactly what would happen in industry. On the other hand, they definitely confer an advantage. The Twitter Bootstrap apps all looked an order of magnitude better than the custom apps.

My principle was that remixing and reuse were fine, as long as they were properly acknowledged. In the design case, we could try to discount that aspect of the UI in the marking. But it is almost certainly the case that some groups did NOT acknowledge their use of other people’s IP, and yet benefited from it. I don’t have a good solution to the problem of detecting code reuse. And furthermore, the burden of proof is pretty high to call something cheating, and requires more than a gut feeling or a commit to GitHub that touched hundreds of files at once (i.e. a bulk commit of 3rd party code).
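As a sketch of how one might at least flag such bulk commits for a closer look, the function below counts files per commit in the text produced by `git log --pretty=format:@@@%H --name-only`. The `@@@` marker, the function name, and the threshold of 100 files are my own assumptions, and a hit is only a prompt for a conversation, not evidence of cheating.

```python
def bulk_commits(log_text, threshold=100, marker="@@@"):
    """Return [(commit_hash, file_count)] for commits touching >= threshold files.

    Expects the output of:
        git log --pretty=format:@@@%H --name-only
    where each commit starts with a marker line followed by one path per line.
    """
    counts = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith(marker):
            # New commit header: start counting its files.
            current = line[len(marker):]
            counts[current] = 0
        elif line and current is not None:
            # Any other non-blank line is a path changed by the current commit.
            counts[current] += 1
    return [(sha, n) for sha, n in counts.items() if n >= threshold]


# Invented log text: one ordinary commit and one that adds 250 vendored files.
sample_log = "\n".join(
    ["@@@aaa111", "app/models/user.rb", "README.md", "", "@@@bbb222"]
    + [f"vendor/lib/file_{i}.js" for i in range(250)]
)
suspects = bulk_commits(sample_log, threshold=100)
```

In this invented log only the second commit is flagged. A legitimate framework scaffold would be flagged too, which is exactly why the result needs a human follow-up.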

IT role

One of the things which I think will only become more prevalent is the use of third-party services to manage the course. In the past, students would use CS department machines and servers to do their assignments, a CS database server, and store code on the department Subversion or IBM RTC servers.

This semester I don’t think we used a single department resource, save for email (and that only because I was forced to for privacy reasons) and the course webpage. Class discussions took place on Piazza.com; code and issues were managed with GitHub; and students nearly always have their own laptops and Android devices (there was not a single group that chose iOS, incidentally, although nearly half the class has MacBooks. I think the $99 fee is a real stumbling block, along with Objective-C).

I did not get any support or materiel from the department, apart from the classroom and photocopier. I could just as easily have run this course from my home. So what should the IT section do? They could manage GitHub for me (they were extremely reluctant to do this, and very hesitant about even installing Bugzilla, apparently). They could provide more AV services to record classes. They could manage virtual machines for me, so that each student could install the same setup; things like Puppet and Vagrant will be key in the coming years.

Finally, the UBC wireless infrastructure is truly terrible. You get better wifi at the Starbucks. Latency between two machines in my office was 200ms! The connection is constantly dropping or extremely slow, such that even demos are affected by the web performance.

Student perceptions

In an unscientific survey, I asked the following questions:

1. How could the TAs and myself improve your experience?

Students were either positive (but they had names attached to their responses, so that isn’t unexpected) or asked for more help. One of the big challenges they face is sorting out silly configuration problems. They would like more advice on design choices as well. I think this is a real opportunity to make the project more like an apprenticeship model, a la Software Craftsmanship: take some senior developers, get them to do an hour of code review, an hour of design feedback, etc. And there seem to be many companies eager to help out (and recruit) for whom this might be doable. The other issue was that, due to 4) below, the TAs and I often did not know much about the technology (e.g., Microsoft’s C#/Azure platforms). However, learning to find such answers independently is definitely a learning objective in the course. Admittedly, in industry one would often be able to ask senior devs these questions, but the ability to track the answers down is invaluable, I feel.

2. Were the AOSA readings useful?

Most students responded that they appreciated the opportunity to present to a large audience. But the overall lessons of the architecture in these systems were lost on them, because they did not have a lot of relevance to the project, which consumed the majority of the time. Asking exam questions was difficult, as there was a lot of material that would have to be studied. I think I would keep this module but be more strict about the time limit (5 minutes) and give some introductory examples/prep beforehand.

3. Did GitHub work for you?

Students loved GitHub, which was a major improvement over RTC. EGit was less good (and personally I find it less usable than the command line).

4. Was the freedom to choose language good or bad?

Most students loved this aspect as well. In the past the project, worth 40% of the course, has been in e.g. Java+Tomcat for everyone. Feedback here indicated that the main problems were finding team members with similar interests (in, e.g. RoR), getting help from TAs, and a possible penalty on the final, where the code snippets are in Java.

5. What annoyed you about the project?

Unsurprisingly, most complaints were about the time it took – one student spent 80 hours over two weekends on what was, however, a really cool UI – and the vagaries of group work with fellow team members, some of whom get sick, abandon their teammates, or simply are not good at programming. Students would appreciate more help on scoping the project, and getting the thing started earlier. We tried to address this by insisting on an early ‘project idea’ review in the first 3 weeks, and by doing a 30 minute design review midway. However, some people have to learn the hard way, and ultimately, we are constrained by how many teams there are – 27 in this case. Multiply that by 30 minutes and you get 13.5 hours for the design reviews alone, so you can see the magnitude of the challenge. I had 3 TAs to help, but that is still a ton of work. And I think students got frustrated, since they see a 1-to-1 interaction, not the 27-to-1 that I see.

A stitch in time…

This blog post from the excellent complexity blog Gödel’s Lost Letter is on the theory behind branch and bound search. One of my favourite things about this sort of analysis is how it can eliminate, with mathematical certainty, hours and hours of programming effort. Consider this statement:

There is an issue of being odd or even, which matters but not hugely, since pruning the bottom layer is not so valuable.

I have spent many hours working on problems that might fall into the “matters, but is not so valuable” category. A few hours of analysis might well have saved me a lot of trouble.

One aspect of an “engineering discipline” …

In the ACM Software Engineering Code of Ethics, part 8, “SELF”, refers to one’s ethical duty to oneself. Software developers should ensure they “Further their knowledge of developments in the analysis, specification, design, development, maintenance and testing of software and related documents, together with the management of the development process.”

Recently Tim Bray tweeted “Software engineering needs consensus on Best Practices in terms of tools and architecture”, linking to his blog in response to Steve Yegge’s post on liberalism and conservatism in software engineering.

I respect immensely the work that engineers do in the field. You cannot fault the quality and longevity of Tim’s accomplishments as a software engineer. But as a researcher I do find it incredibly frustrating that there remains this perception that software is not an engineering discipline, or that best practices are not codified. It sounds to me like Tim is calling for a conference of people facing similar software challenges.

The problem is, we’ve already had this discussion. In 1968. And arguably in 2001. And while there is an awful lot of garbage and noise in academic research, we have some really good, empirically justified understanding of what these best practices are.

They aren’t hard to find: you can find excellent case studies and experiments in numerous sources like the book Making Software, IEEE Software magazine, Microsoft’s Empirical SE group, ICSE papers, Crosstalk Journals, Cutter Journals, and of course practitioner blogs. This need for evidence is a big reason for our efforts in the Never Work in Theory blog.
