Unblocking a Project
I’ll tell you about how I helped a project that had gone off the rails.
I’ll give a bit of background, walk through my diagnosis of the situation, explain how I addressed the two key issues, and then cover the final result of the project.
Background
I had just started working with a very senior team. I realized that their key project was in trouble: it had already been delayed by 3 months, the engineers were anxious about their productivity, and my org’s leadership didn’t know where the project would be by the end of the quarter.
I started with a listening and learning tour with my directs, stakeholders and partners. I learned there were two separate issues at play: a tense relationship with another team at the company and some fear of disappointing the senior leaders in the org.
The Diagnosis
The team was behind because they were writing a bespoke solution rather than using an off-the-shelf one for a critical component. (For the curious, they were rolling out a generic communication layer responsible for the practical details of distributed systems, a “service mesh”. There are two components in a modern service mesh: a high performance network proxy that sits between services, the “data plane”, and a command service that configures the proxies, the “control plane”. They were using the industry-standard data plane, Envoy, but they were writing their own control plane rather than using Istio, etc.) Worse, this bespoke component was written in a language that was new to everyone on the team and was using a network protocol that had been deprecated by the rest of the industry. (The choice to use Go was actually quite reasonable given that they were going to write a bespoke control plane: there was a great library for building this component in Go and no equivalent library in any other language. As for the protocol, they were using the deprecated v2 version of the xDS protocol instead of the v3 version. They had gotten so far behind they weren’t even going to ship before the v2 version was end-of-lifed.) This choice required them to reinvent a whole series of wheels and to become deep experts in a complicated new technology stack.
From talking to the engineers on the project and other team historians, I gleaned that the team had probably made these decisions to work around an organizational dysfunction. In particular, my team historically had a tense relationship with the team that would have otherwise maintained the off-the-shelf solution, and the previous project lead had probably decided that writing a bespoke component was a better approach than relying on that team. (This relationship had deteriorated over the previous two years due to missteps on both sides. The other team was constantly overloaded and thus had consistently failed to execute in a timely manner on projects to help my team. My team believed they were responsible for the highest priority project in the company and had sometimes ignored traditional team boundaries to achieve their goals. On top of that, previous management on both sides had not sufficiently prioritized building trusting relationships between the two teams.)
In hindsight, every engineer on the team agreed this was the wrong approach, but they felt trapped by their initial decisions. The previous project lead had left the company after failing to make headway on the project, and there was immense pressure from our org leadership to ship something before a major migration started at the company the following year. My team thought it was safer to stick with the current broken solution than to start from scratch.
Unblocking the team
To unblock the team, I had to work on two issues: the relationship with the other team and the fear of disappointing my org’s leaders.
I threw myself into improving the relationship with the team in the other org. I had one-on-one meetings with everyone from that org, ranging from engineers to the vice president. I asked for their perspectives and listened. (This overwhelmingly helped: one engineer said that in his 5 years at the company this was the first time “anyone from my org had asked for his opinion.”) I showed respect for them by teaching myself their tools and personally sacrificing alongside them. (At one point I stayed up until 3am helping them diagnose an incident that affected the whole company.) Last, I encouraged my team to build deeper relationships with that org, and they started a technical working group between engineers on both teams. (I coached my engineers to put in the initial “proof of work” to demonstrate to the other team that we were willing to listen to them.) These steps improved the relationship to the point where the other team was open to the idea of a joint project and my team was starting to brainstorm ideas that involved doing just that.
The fear of disappointing my org’s leaders was comparatively “easier” to work on. Through conversations with them, I understood that their chief worry was that the upcoming migration would bet the company on a new, untested technology that wouldn’t be able to handle the scale of the company. They wanted to answer two questions: “will the project be able to handle their performance requirements?” and “is the team able to reliably deliver on the needs of the business?”
I got my org leadership to agree to a clear, short step towards both goals: for this quarter, we would perform a large load test that would definitively show the project could handle the most demanding services in the company. For future quarters, I laid out a roadmap that would stay ahead of the migration. I then came back to the team with this new goal and helped them identify the key org stakeholders that we would need to convince with our load test.
Final result
Ultimately, the team wrapped up the quarter with a successful load test and a presentation that assuaged the concerns of the most important stakeholders in my org. We also entered the new quarter strong, with a clear joint roadmap with our partner team to deliver the project using an off-the-shelf solution for all critical components.