Avoiding Broken Windows in Software Teams
I once asked a senior engineer, shortly before she left the company, what the main problem with our project was. The project had been facing multiple hurdles, including massive tech debt and a high turnover rate. She asked me if I knew what the broken windows theory was.
The broken windows theory states that visible signs of crime and disorder create an environment that encourages further crime and disorder.
One broken window, left unrepaired for any substantial length of time, instills in the inhabitants of the building a sense of abandonment — a sense that the powers that be don’t care about the building. So another window gets broken. People start littering. Graffiti appears. Serious structural damage begins. In a relatively short span of time, the building becomes damaged beyond the owner’s desire to fix it, and the sense of abandonment becomes reality.
Just as the flap of a butterfly’s wings may cause a tornado, seemingly small events or actions from individuals can drastically alter a team’s behavior and norms in the long run. Thus, small actions matter much more than one expects. Looking back on the teams I have worked with over my career, I realise that many of them suffered from broken windows.
Examples of Broken Windows in IT
In a Software Engineering project, broken windows usually materialize as occasional lapses in team norms or processes by a few individuals. Below are some examples:
- Team members consistently join team meetings late because earlier meetings overran or they had other errands. Each meeting starts 5 to 10 minutes late or more, and when this happens frequently, other team members start joining late too.
- Certain team members do not participate in the ceremonies, leading to other team members skipping them too.
- A meeting gets postponed over and over at the last minute by the organizer, which undermines both the importance of that meeting and the organizer’s credibility in planning schedules properly.
- Someone fails to follow up on a meeting or action items.
- Team duty rosters become disorganized and untracked.
They could also be technical problems:
- Team members do not adhere to the coding style or the code review process.
- The existing codebase is messy and never refactored, so new features are hard to add and the code gets poorer with each change.
- New security warnings or issues are ignored because fixing them is no one’s duty.
- Common pipelines or tests are failing but no one fixes them.
- Unused code branches or libraries are left in the repositories with no one cleaning them up.
- Someone raises an issue with performance or architecture, but no one else looks into it.
Why Fix Broken Windows?
Broken windows may seem like insignificant problems initially, and anyone who keeps pointing them out may come across as a jerk. However, a team must deal with them swiftly for two main reasons.
The first and most insidious problem with broken windows is that they lead to a gradual deviation from standards as team members start to normalize them over time. This is a common problem that also happens in many other fields.
In a common fictitious example taught in engineering classes, a maintenance engineer notices that a boiler’s reading deviates from the expected 100.0°C by just 0.1°C and accepts it as normal since it is only a small deviation. After some weeks, the reading becomes 100.2°C, and it is again deemed acceptable since it is only a small deviation from the previous reading. A few months later, the reading becomes 100.3°C and still no one raises any warnings. The process continues until one day a major mishap happens.
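The boiler example can be sketched in a few lines of Python. This is only an illustration: the tolerance of 0.15°C and the function names are assumptions made for the sketch, not part of the original example. It shows how the same readings pass when each one is compared to the previously accepted reading, yet are flagged when compared to the original standard.

```python
EXPECTED = 100.0   # expected boiler reading in °C, from the example
TOLERANCE = 0.15   # hypothetical tolerance in °C, chosen for illustration


def flagged_vs_previous(readings, expected=EXPECTED, tolerance=TOLERANCE):
    """Flag readings that differ from the previously accepted reading."""
    flagged = []
    previous = expected
    for reading in readings:
        if abs(reading - previous) > tolerance:
            flagged.append(reading)
        # Each accepted reading silently becomes the new "normal".
        previous = reading
    return flagged


def flagged_vs_baseline(readings, expected=EXPECTED, tolerance=TOLERANCE):
    """Flag readings that differ from the original, fixed standard."""
    return [r for r in readings if abs(r - expected) > tolerance]


readings = [100.1, 100.2, 100.3]
print(flagged_vs_previous(readings))  # []
print(flagged_vs_baseline(readings))  # [100.2, 100.3]
```

Comparing against the previous reading never raises a warning because every step is small, while comparing against the fixed standard catches the drift by the second reading. The same applies to team norms: measure against the agreed standard, not against last week's behavior.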
Broken windows are like small warnings that are always around but that no one pays attention to until a major problem happens. In IT, broken windows such as poor adherence to team processes or clean coding are usually less life-threatening, but they often fester into massive tech debt down the road.
The second problem with broken windows is that they lead to nonchalance among team members. Imagine a diligent team member who has been painstakingly pointing out issues while no one else cares enough to take any action. This is extremely exhausting and discouraging for that team member. For example, a unit test suite that previously took 3 minutes may have grown to 30 minutes due to poor code changes. Or the architecture may be wrongly designed for a specific problem, but no one bothers to change it. Such situations often lead to more tech debt over time. If no one else cares about such issues, why should this individual? In the worst case, this can also lead to the attrition of talented team members, since they are often craftsmen who care about the quality of their work.
Quality is a team issue. It is important to stem broken windows as soon as they surface. This can be as simple as reminding team members to correct their course (be it bad code, bad designs, or bad decisions) as quickly as possible, before the lapse becomes a team norm.