Useless metrics
I had an interesting interaction in an EE store the other day, and it got me thinking about the metrics we get judged by and the purpose they serve.
9/14/2023 metrics
Sat in the EE store with my wife and child, talking away to the salesperson about a possible contract upgrade, when my 14-month-old daughter started to get a bit fed up. She's bored, she's a confident walker who doesn't want to be strapped into a buggy, and she's letting everyone know.
"I'll go next door, keep her moving," my wife says. "Then maybe nip into the garden centre."
"OK," I reply. "Just pop back in before the garden centre and see how we're getting on."
The wife stands up and goes to walk out.
"Just so you know, every time you come into the store it registers you as a new customer," comes the call from a different salesperson from the one we've been interacting with.
"OK, so what does that mean for us?" replies my wife.
"No, it's just that it messes with our numbers if you keep coming in and out; we've been told to ask people not to do that."
My wife and I look at each other quizzically.
...
Quite apart from how rude the salesperson was in going about it, I found this a very interesting request from the EE employee.
Essentially, it turns out the employees at this store are judged by how many people they convert to paying customers.
It seems reasonable enough on the face of it, and as developers and product-oriented people we're almost certainly all accustomed to conversion rates, drop-offs and funnels. But this particular store is now gaming the system by cutting down the number of people walking through the door. Imagine that: restricting the number of people returning to your website in the hope that only people who have already decided to buy will find it, handing you a high conversion rate. Does that ultimately drive up the number that really counts: sales, aka revenue?
It then got me thinking about software engineering (as most things do) and the "metrics" that we like to capture and review.
So often we capture metrics that are meaningless to our ultimate goal: delivering value and increasing revenue (the second of which we may not like to admit to ourselves, but unless you're working for a charity it is your ultimate aim).
Take velocity, for instance: the measure of how many story points we achieve per sprint. Let's start with the fact that story points are intended as a measure of how long something will take to complete. I can hear a lot of "Agilists" screaming at me that "you can't equate story points to time", but it turns out you can. Not only can you, but story points started out life as "ideal days", with a load factor (commonly around 3) applied to get real days, so a 3-"point" task took around 9 days to complete.
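As a toy sketch of that original conversion (the load factor of 3 is the commonly cited early figure, not a universal rule; real teams varied):

```python
# Toy sketch, not a real estimation tool: story points began as "ideal
# days", multiplied by a load factor to get expected calendar days.
LOAD_FACTOR = 3  # commonly cited figure; teams measured their own

def calendar_days(story_points: int, load_factor: int = LOAD_FACTOR) -> int:
    """Convert 'ideal day' story points into expected calendar days."""
    return story_points * load_factor

print(calendar_days(3))  # a 3-point task -> 9 calendar days
```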

If we then look at this metric through the lens of what a "story point" really is, what we're measuring here is how much complexity we are delivering on average. That's great if all a team cares about is churning out complex solutions. But if team A spends a sprint delivering a complex solution, churning through 30 story points, that no customer gives a damn about, have they had a better or worse sprint than team B who did 5 points, took the rest of the 2 weeks off and increased customer satisfaction, retention and revenue?
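To put hypothetical numbers on that comparison (the revenue figures are invented purely for illustration), ranking the two teams by velocity and by value gives opposite answers:

```python
# Hypothetical sprint outcomes; story points and revenue figures invented.
sprints = {
    "Team A": {"story_points": 30, "revenue_delta": 0},       # complex work no customer wanted
    "Team B": {"story_points": 5,  "revenue_delta": 10_000},  # small change customers loved
}

by_velocity = max(sprints, key=lambda t: sprints[t]["story_points"])
by_value    = max(sprints, key=lambda t: sprints[t]["revenue_delta"])

print(f"Highest velocity: {by_velocity}")   # Team A
print(f"Most value shipped: {by_value}")    # Team B
```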
One team will likely get a severe talking-to, and it won't be the team with the "high velocity".
So what can team B do? Do more tickets? Engage in busy work to increase their velocity? Maybe just start estimating things higher? Will any of these increase velocity? Yes. Will any increase the value they ship? Unlikely.
All of those suggestions are rather cynical, and most teams wouldn't jump at them. But they are solutions to the problem of "increase velocity", and they show just how easily weak metrics can be gamed.
Even solid metrics such as lead and cycle time can easily be gamed if you measure them wrong.
Lead time is the measure of how long it takes for a customer to receive value after the need has been "noticed": an order placed, a problem identified, and so on. Cycle time is the measure of how long it takes us to produce that value.
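As a minimal sketch (all dates invented for illustration), both measures fall out of three timestamps on a piece of work:

```python
from datetime import date

# Hypothetical ticket timeline; all dates invented for illustration.
noticed   = date(2023, 9, 1)   # order placed / problem identified
started   = date(2023, 9, 8)   # we begin producing the value
delivered = date(2023, 9, 12)  # the customer receives it

lead_time  = (delivered - noticed).days   # the customer's wait
cycle_time = (delivered - started).days   # our production time

print(lead_time, cycle_time)  # 11 4
```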
How do we improve cycle time and lead time? The obvious answer is smaller stories. Deliver smaller chunks of work more frequently and those times will come down. Great.
However, imagine you have a large feature that you deliver in chunks because no one likes a 3000-line, 80-file change PR to review. But you deliver those chunks behind feature flags. You congratulate yourself on a job well done, cycle times are below some arbitrary goal set by management and everyone is happy. Except for your customer, who still has their problem and still hasn't seen your chunked solution.
This is the gamified way to artificially improve your metrics: you have the appearance of improving, but your customer sees no real benefit.
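A sketch of how that gap shows up in the numbers (all dates hypothetical): each chunk's cycle time looks great, while the lead time the customer actually experiences runs until the flag is finally flipped.

```python
from datetime import date

# Three chunks of one feature, merged and deployed behind a flag
# (all dates invented for illustration).
chunks = [
    (date(2023, 9, 1), date(2023, 9, 2)),  # (work started, deployed)
    (date(2023, 9, 5), date(2023, 9, 6)),
    (date(2023, 9, 8), date(2023, 9, 9)),
]
problem_noticed = date(2023, 8, 28)
flag_enabled    = date(2023, 9, 20)  # when the customer finally sees it

avg_cycle_days = sum((done - start).days for start, done in chunks) / len(chunks)
customer_lead_days = (flag_enabled - problem_noticed).days

print(avg_cycle_days, customer_lead_days)  # 1.0 23
```

The per-chunk numbers say the team is lightning fast; the customer waited over three weeks.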
Imagine instead that you worked to slice your stories vertically, in the true elephant carpaccio way. The improvement in your metrics may not be as drastic (vertical story slicing is much harder than "deliver the frontend behind a feature flag until the backend can catch up"), but instead of your customer waiting on the entire project for a big-bang improvement, they'll see incremental improvements gradually making their life better. They may even stick around on your platform because they can see it improving, instead of jumping ship because your competitor beat you to it.
Back to EE though. Rudeness aside, you have to wonder whether this is going to increase or decrease the number of sales they make. I didn't buy the phone on that occasion; I had to get back to work as I was on my lunch break. Am I likely to go back in? No. I can get the same deal online, and the human touch or personal experience you look for when going into a shop tends to go out the window when you know they're looking at you thinking "this guy again, he'll mess up our numbers." So the question is: is tracking the ratio of people walking in who convert to paying customers the right metric? If, on average, you have 30 people come in and sell to 15, does increasing footfall to 60 and selling to 20 mark a failure? Or a success? Your revenue has increased by a third, but your conversion rate has dropped by a third, from 50% to 33%.
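Running those hypothetical shop numbers makes the tension explicit:

```python
# The hypothetical footfall figures from the text.
before_sales, before_footfall = 15, 30
after_sales,  after_footfall  = 20, 60

rate_before = before_sales / before_footfall   # 0.50
rate_after  = after_sales / after_footfall     # ~0.33

revenue_change    = (after_sales - before_sales) / before_sales   # up a third
conversion_change = (rate_after - rate_before) / rate_before      # down a third

print(f"revenue {revenue_change:+.0%}, conversion {conversion_change:+.0%}")
```

Optimising for the conversion ratio punishes exactly the outcome (more sales) the business presumably wants.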
I guess the thing to take away is that no metric should be taken in isolation, and that velocity is never useful. But that last bit might be its own post.