Debugging tech teams: not so straightforward

Aug 29

Recently someone asked me, “How do you debug an engineering team?” By “debug”, they meant: how do you troubleshoot what’s wrong with the team, and how do you fix those issues?

There’s no easy or play-by-play answer. It DEPENDS!!!

It depends on YOU, your team(s), and the business goals. Let’s break it down.

The first step is to quantify that seemingly overused word, “performance”. How do you measure or define what good and bad mean for you?

The metrics I truly care about are:

Planning and delivery accuracy - How often the tech teams accurately estimate the work and deliver on time.
Quality issues - Is the work of high quality, and is the team holding itself to a high bar? This is very important for multiple reasons. Bugs are not only a drain on productivity, but also one of the biggest reasons for a bad customer experience. Imagine going to Google, searching for something, and getting no results or the wrong ones.
Productivity - An overly defensive team is worse than a hasty team. Sometimes, while trying to optimize for the metrics stated above, teams slip into an overly defensive mindset. Another reason for low productivity can be a lack of tooling. Is the team deploying code by hand? Is there too much manual testing? Too much tech debt?
ROI - I don’t like to use lines of code or PRs as a measure of impact. I use ROI as the measure of impact. A team can do a bunch of busy work that doesn’t add value to the business. This needs to be avoided at all costs. Some people ask me how to quantify ROI for tech debt or an on-call task. To be clear, every team should set aside some time for keep-the-lights-on (KTLO) work. But anything the team does should map to incremental revenue, cost optimization, or strategic investment.
Morale - This one is the hardest to measure. As a leader, you should be able to read the team's pulse. Are they players or pawns? Listen to your team, provide the context, and build consensus to get their buy-in, rather than just telling them what to do.
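
To make the first metric concrete, here is a minimal sketch of how planning and delivery accuracy could be computed from a list of finished work items. The `WorkItem` fields and both formulas are assumptions for illustration, not a standard definition; adapt them to whatever your tracker actually records.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WorkItem:
    # Hypothetical record; the field names are assumptions for this sketch.
    estimated_days: float
    actual_days: float
    due: date
    delivered: date


def on_time_rate(items):
    """Fraction of items delivered on or before their due date."""
    if not items:
        return 0.0
    on_time = sum(1 for item in items if item.delivered <= item.due)
    return on_time / len(items)


def mean_estimate_error(items):
    """Mean absolute estimation error, relative to the actual effort."""
    valid = [item for item in items if item.actual_days > 0]
    if not valid:
        return 0.0
    total = sum(
        abs(item.actual_days - item.estimated_days) / item.actual_days
        for item in valid
    )
    return total / len(valid)
```

A team that delivers one of two items on time, and underestimates the late one by half, would score an on-time rate of 0.5 and a mean estimate error of 0.25 under these definitions.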

OK, so we have laid out a few metrics, but how do we use them to fix issues? Remember, Disney World wasn’t built overnight, and your team won’t be fixed overnight either.

Build KPIs and Monitor

Debugging is an ongoing process. Continuously monitor the organization’s performance. Regularly check in with teams, track key performance indicators (KPIs), and remain open to feedback. If new symptoms arise, revisit the debugging process to address them.

Iterative improvement is key. The landscape of software development is always evolving, and organizations need to be agile in adapting to new challenges.
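
As a rough sketch of what "monitor" can mean in practice, the snippet below compares the current KPI values against targets and returns the ones that miss. The KPI names, the thresholds, and the `flag_kpis` helper are all hypothetical; the point is only that each KPI needs an explicit target and a direction.

```python
def flag_kpis(current, targets):
    """Return the names of KPIs that miss their target.

    targets maps a KPI name to (direction, threshold), where direction is
    "min" (value should be at least the threshold) or "max" (at most).
    """
    flagged = []
    for name, (direction, threshold) in targets.items():
        value = current.get(name)
        if value is None:
            # Assumption of this sketch: a KPI with no data is worth a look.
            flagged.append(name)
        elif direction == "min" and value < threshold:
            flagged.append(name)
        elif direction == "max" and value > threshold:
            flagged.append(name)
    return flagged


# Example values are made up for illustration.
targets = {"on_time_rate": ("min", 0.8), "escaped_bugs": ("max", 5)}
current = {"on_time_rate": 0.7, "escaped_bugs": 3}
missed = flag_kpis(current, targets)
```

Anything that lands in `missed` is a candidate symptom to take into the steps that follow.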

Identify the Symptoms

Pick a few areas where the teams are consistently missing the mark. Don’t try to attack every problem at once. Go for the ones that matter the most to your company.

Trace the Root Cause

Once symptoms are identified, the next step is to trace back to the root cause. This often involves looking at various aspects of the organization:

Processes: Are the development, testing, and deployment processes optimized? Are they well-documented and followed by all teams? Inefficient or poorly defined processes can lead to delays and quality issues.
Team Structure: Examine the organization’s structure. Are there too many layers of management, or is the team too flat? Are responsibilities clearly defined? Misalignment in roles and responsibilities can lead to confusion and duplication of effort.
Communication: Investigate how information flows within the organization. Are there clear channels for communication? Are teams working in silos? Poor communication can lead to misunderstandings and misaligned goals.
Tools and Technology: Assess the tools and technologies in use. Are they outdated, overly complex, or not suited to the team’s needs? The wrong tools can hamper productivity and lead to frustration.
Culture: Consider the organizational culture. Is there a blame culture, or do teams feel safe to take risks and make mistakes? A toxic culture can stifle innovation and lead to burnout.

Isolate the Problem Areas

After identifying potential root causes, it's crucial to isolate the problem areas. This involves prioritizing which issues need immediate attention and which can be addressed later. For instance, if communication breakdowns are leading to missed deadlines, it may be necessary to address this before tackling process inefficiencies.

Engage with team members at all levels to get their perspectives. Often, those on the front lines have valuable insights into the problems and potential solutions.
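
One simple way to make the prioritization explicit is to score each problem area and sort. The sketch below assumes hypothetical 1-to-5 `impact` and `effort` scores that you and the team agree on; it orders by impact first and breaks ties by lower effort. It is an illustration, not a prescribed method.

```python
def prioritize(issues):
    """Sort issues by impact (highest first), then by effort (lowest first)."""
    return sorted(issues, key=lambda issue: (-issue["impact"], issue["effort"]))


# Scores are made up for illustration.
issues = [
    {"name": "process", "impact": 3, "effort": 2},
    {"name": "communication", "impact": 5, "effort": 3},
    {"name": "tooling", "impact": 5, "effort": 1},
]
ordered = [issue["name"] for issue in prioritize(issues)]
```

The scores themselves should come out of the conversations with the team, not from the person doing the sorting.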

Test and Implement Solutions

With a clear understanding of the issues, the next step is to implement solutions. This may involve:

Redesigning Processes: Streamline workflows to eliminate bottlenecks, automate repetitive tasks, and ensure that everyone understands their roles and responsibilities.
Restructuring Teams: Consider reorganizing teams to improve collaboration and reduce hierarchy. Cross-functional teams, where members from different disciplines work together, can often be more effective.
Improving Communication: Implement regular check-ins, create clear documentation, and establish open channels for feedback. Encouraging transparency can help prevent misunderstandings and align everyone toward common goals.
Upgrading Tools: Invest in tools that enhance productivity, streamline workflows, and reduce friction in the development process.
Cultivating a Positive Culture: Foster a culture of continuous improvement, where experimentation is encouraged, and failures are seen as learning opportunities.

It's essential to monitor the impact of these changes and be prepared to iterate. Just like debugging code, solutions may not work perfectly on the first attempt, and adjustments may be necessary.

Remember, this is not a one-size-fits-all solution. Start with your OKRs, and identify the strategic goals for your teams. That will define what’s most important to your leaders: speed, quality, innovation, and so on. GOOD LUCK!!!

viplav mishra