Reflecting on Close Calls
Root Cause Analysis (RCA), also called postmortems, examine unexpected system outcomes following incidents. These should be blameless—directing focus toward systemic improvements rather than individual blame. Atlassian has excellent guidance on conducting them properly.
Close Calls in Software Engineering
Through conversations with my partner, an ecologist, I learned that her field required reporting near-misses—situations where harm nearly happened. This Health & Safety Executive requirement prompted me to consider parallel scenarios in software engineering: catching database migration errors mid-execution, revoking problematic changes before deployment, or cancelling destructive builds.
The Key Argument
Just because nothing bad happened, doesn’t mean there isn’t an opportunity to fix the process, allowing us to continue our mission of delivering high quality software at speed.
I advocate treating close calls as legitimate incidents worthy of investigation, not as non-events simply because actual damage was averted. The same systemic weaknesses that allowed a near-miss could easily result in a real incident under slightly different circumstances.