
Measures worth doing

The previous criticism of the measures that a business takes does not mean that all data is useless. It means, instead, that all data should be treated with circumspection.

We have two opticians' practices within a group practice; one is in a seaside town, the other in a much larger inland town. Both collect data that they send 'upwards'. Their area manager, Stefan, has never been a practising optician but is considered to be an able manager, in the sense that he does what is expected of him. Most of his role in the course of a year is centred around moving staff to fill temporary spaces. He thinks of himself as good at this.

At no time would it occur to Stefan to attempt to discover whether he could do his job 'better'. An observant elderly customer can see that the 'temporary' member of staff moved from the seaside to the town has noticeably different skills; this transferred optician is personable in a different way and in a sense is well suited to the older customer, taking sufficient time and leaving many ways to exit the transaction. The equivalent observation at the coast comes from a twenty-something, when the transferred townie matches the hurry-up pace and is, on reflection, quite as keen to maximise the sale as anything else, perhaps more so.

If Stefan were able to discover these two contrasting observations, what might he do? There is a range of possibilities here. Much depends on how Stefan sees the business and what he judges its core objectives to be. If, for example, his perception is that maximal profit is what is essential, then he will, consciously or not, prefer the 'townie' characterisation. If perhaps turnover is the key (and you should ask whether that is not the same as profit), then maybe the length of time in consultation matters. And in both cases the amount of time that the optician declares as 'down time', time not spent with customers, becomes something he is likely to make assumptions about.

Yet it might be more useful to look at the queueing of customers and how that queueing is managed. At the seaside practice the population is generally older and a high proportion are retired. They rely upon there being available nearby parking and they combine a visit to the optician with a number of other activities in town. They are not in a hurry; they want the glasses to be 'right' but they are not automatically going to approve of every suggestion. Because the customers include a relatively high proportion of the elderly, each consultation takes longer than the national norm. But this does not automatically make the optician less efficient or less effective if the waiting time is close to zero. We might also look at drop-in business, of which there is relatively little; but the receptionist at the coastal practice has a welcoming chat for every customer and so the older folk do tend to drop in. This is quite likely to mean that there are little tasks that keep the customers happy, so that the level of repeat business is high.

Inland, in the larger town, rent is somewhat higher and the expectation is 'throughput'. In several senses, customers are dropped into categories and given the 'standard' treatment for that category. That is not to say that the standard is low, just that there are so many customers that in some sense they blur into sameness. The choice of frames placed in front of customers leans towards the range on which the practice gets the biggest return. The length of the queue for attention is, as at the supermarket, treated as a key measure and the few unqualified staff are instructed to ensure that no customer is left unattended for longer than whatever the acceptable time is, say three minutes. Relative to the national norms, the distribution of customer age is a good match. Every measure that Stefan might find to apply works at least as well here as anywhere else. He therefore assumes that this is good information.

Indeed, however Stefan is likely to look at the data he collects, the coastal practice appears to be under-performing; it has fewer customers per hour, it has fewer new customers. Yet turnover is steady and the costs are low. 

It is sad that it is unlikely that Stefan can find nice things to say about the coastal practice, because he is not going to chat with the customers — heavens, he's a manager, he doesn't meet the customers! If he did, he would perhaps find that the customers are happy. If he had the data, he might discover that the level of repeat business is high and, if he only knew it, a steady stream of referrals comes from the customers talking to each other. How is he ever going to learn about this?

Stefan has a load of data, but he is most unlikely to even begin to ask questions as to what it is that the data is telling him. It is likely that the data he has will tell him that a customer has been on the books for years, but it is unlikely that there is a count of the number of times each customer comes through the door. There is very little chance that the data shows anything of 'customer satisfaction'. If Stefan began to wonder whether the customers were actually different at his two sites, would he have any ways of describing this to himself, let alone to his manager, even more remote from the business?

Each time I discover a situation such as this, I fail to understand what any of the layers of managers actually think their job is. I despair over the apparent gap between any job description and what could be done in terms of learning and understanding. Not least, there is no obvious connection between what the customer might see as of value and what the impersonal business sees as of value. Which is what happens when the driving force is to give the investors a steady return bigger than they'd have from other investment opportunities.

This scenario raises several issues:

There may well be data but it doesn't respond well to interrogation. One should therefore expect that any question will result in needing to ask a load more questions and to collect a lot more data.

Data collected for one purpose may be used for other purposes. But any bias in the data may well cause new questions to give bad (incorrect, misleading) results.

Very often an aggregation of data hides the variability of that data.
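A minimal sketch of that point, using invented weekly takings for two branches (the figures are mine, purely illustrative): the aggregate is identical, but the variability it hides is not.

```python
from statistics import mean, stdev

# Hypothetical daily takings (in pounds) for two branches over a week.
# The numbers are invented purely to illustrate the point.
coastal = [500, 510, 490, 505, 495, 500, 500]   # steady trade
inland  = [100, 900, 200, 800, 150, 850, 500]   # wildly variable trade

# The aggregate (mean daily takings) is identical...
assert round(mean(coastal)) == round(mean(inland)) == 500

# ...but the variability, which the aggregate hides, is not.
print(f"coastal: mean={mean(coastal):.0f}, sd={stdev(coastal):.1f}")
print(f"inland:  mean={mean(inland):.0f}, sd={stdev(inland):.1f}")
```

Anyone shown only the means would judge the two branches identical; the day-to-day spread, which is where most of the managerial information lives, has been averaged away.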

Things we learned eventually from covid

There are many assumptions made when an analysis of data occurs. One appears to be that the audience is remarkably stupid and quite definitely inept with numbers. In May of 2022 the case rate in England was at 1 in 60 [evidence, ONS report


This is at a similar level, some 48,000 positive tests, to the case count in March a year earlier, when we were hoping such a 'low' figure meant we were coming out of the winter peak. But the case count might better be expressed as positives out of the number of tests occurring. Given how the landscape has changed, it might be very much more useful, now, to look instead at the numbers of cases sufficiently serious to require hospitalisation (shown). But this graph fails to give us any idea at all of the numbers with long covid, still under-performing physically and likely to have societal costs both now (not measured) and later (as yet unmeasurable). Yet I'd like to explore this to see if there are indirect ways one might count this.

However, within this very large data set some of us learned about variability. While for some, catching covid was almost a non-event, for too many it led directly or indirectly to death. We have been persuaded, largely thanks to having successful vaccines in this country, to treat covid currently as just another respiratory ailment, maybe no worse than the 'flu. There was a lot written about relative risk; this included the relative risks attached to catching covid – one number that has stuck with me is that the risk (of hospitalisation or death, I forget which) doubled for every six years of increased age. Assuming, of course, that the person with whom you are comparing is otherwise similar in health, et cetera.
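Taken at face value, that rule of thumb is easy to turn into arithmetic. A sketch, assuming only the doubling-per-six-years figure quoted above and otherwise similar people (the function name is mine):

```python
# Relative risk under the rule of thumb quoted above: risk doubles for
# every six years of additional age, other things being equal.
def relative_risk(age_difference_years: float) -> float:
    """Risk multiplier for a person older by the given number of years."""
    return 2 ** (age_difference_years / 6)

# A person 30 years older carries 2**5 = 32 times the risk...
print(relative_risk(30))          # 32.0
# ...and a 60-year gap implies a factor of about a thousand.
print(round(relative_risk(60)))   # 1024
```

The exponential shape is the point: a rule like this does not scale gently, which is exactly why blanket statements about covid risk misled so many readers.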

I've inserted one of the better covid graphs; it is 'better' because it answers a question. This shows how the numbers of tests have changed and also shows how many of these tests prove positive. That the two lines coincide on the right does not mean that there is 100% positivity; that is around 4%, while the testing level is around 200 thousand. But we should also recognise that testing is no longer freely available as it was, that there are pressures to return to work irrespective of health issues, that working from home is no longer encouraged as it was and so there are very much fewer reasons to be tested at all.
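The positivity figure is just the ratio described above, positives out of tests conducted. A small sketch using the approximate numbers in this paragraph, around 200 thousand tests and 4% positivity (the function name and the exact figures are mine):

```python
# Positivity: positive tests expressed as a fraction of tests conducted.
def positivity(positive_tests: float, total_tests: float) -> float:
    return positive_tests / total_tests

daily_tests = 200_000        # approximate testing level quoted above
rate = 0.04                  # roughly 4% positivity
positives = daily_tests * rate

print(f"{positives:.0f} positives from {daily_tests} tests "
      f"is {positivity(positives, daily_tests):.0%} positivity")
```

The same arithmetic run in reverse is what makes the graph readable: when the two lines appear to coincide, it is the scale of the chart, not 100% positivity, doing the work.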


Customer satisfaction

A very bad measure indeed.

An example from education. A school at which I worked asked the students every year a whole load of questions that were supposed to measure how good a teacher was. In practice this produced complete rubbish. The questionnaire was handed out to all at the same time, through the very same teachers who were to be assessed; the teachers were given advance warning of the event. The 'smart' teacher with whom I shared an office made sure that her class had a round of sweets (she called them candies) at least once in the week before. I said to her that the result measured temporary popularity.

Under protest, the same questionnaire was handed out to my pupils, a different sub-set of the larger school. No notice was given and I collated the results. The questionnaire was handed out at a time of day when the students would be working, not tutor time and not lesson time; many saw this as an intrusion upon very valuable study (rather than as an excuse to do less study) and so, quite visibly, many chose to answer in whatever way made the interruption as short as possible. The culture of the country in which this occurred was such that the role of teacher is considered very superior and that, in the main, teachers were not to be challenged. Our sub-set of the school was trying to change that approach so that the students might better survive at university in the US or the UK. Even so, the majority reaction, on recognising the questionnaire, was to dash through the several pages, maybe 12, with a blanket response of 'excellent'. So the very few students who saw this as an opportunity to render criticism, observation or suggestion for improvement had responses whose effect was extraordinary. In effect, the outliers controlled the perception of the result. So one teacher with three non-perfect responses (out of at least 100) was supposed to be told off by me, his boss.
The teacher whom I was already convinced was not going to have a renewed contract had no non-perfect responses.
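To put rough numbers on that outlier effect (the figures are invented, in the spirit of the story above): with 100 responses scored out of 5, three considered answers barely move the aggregate, yet they are the only responses carrying any information.

```python
from statistics import mean

# Invented figures in the spirit of the survey above: 100 students,
# scores out of 5, where 97 tick 'excellent' without reading and
# three engage enough to give considered (lower) scores.
responses = [5] * 97 + [3, 4, 4]

# The aggregate barely registers the dissent...
print(f"mean score: {mean(responses):.2f}")   # 4.96

# ...yet these few responses are the only signal in the data, and in
# the story above they ended up controlling management's perception.
non_perfect = sum(1 for r in responses if r < 5)
print(f"non-perfect responses: {non_perfect} of {len(responses)}")
```

A mean of 4.96 out of 5 looks like near-perfection; acting on the three dissenting responses as if they were representative is precisely the mistake described.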

What did this tell me? I decided that having fewer than 5% of the students give me any non-perfect responses told me nothing about the student body opinion. It told me perhaps who was prepared to engage in a conversation about how teaching and learning might be done. It told me nothing about the local staff, who clearly understood the system and could make it work for them. 

What did it tell those further up the hierarchy? That depended on where they were physically; the manager resident in place understood my explanation and eventually applied the commentary to her wider remit. The non-resident managers decided that the evidence said the foreign teachers were evidently not as good as the locals. This 'result' flew in the face of the primary objective, preparation for overseas university. Eventually it resulted in there being very much fewer foreign staff (which made for a much cheaper business model) and, while no doubt the local exam grades were in the short term not much different, this inevitably meant that the operating objective was no longer to be equipped for overseas university, but instead to have the best possible grades. I continue to see this as misunderstanding what such an institution should be offering. I am not at all saying that you cannot have both results, grades and preparation for beyond; I am instead trying to ensure that both occur. If the local culture works very well for one and not the other, then shifting the emphasis towards what is relatively easy is, in my view, a short-term strategy. Further, it assumes that the customer base, here the fee-paying parents, is no cleverer than such a strategy. My experience says this is absolutely not so, but that if one were to pursue any short-term strategy, the clientele would soon fit that model. This might be good business but I say it is poor education.

How could one measure customer satisfaction better?

One suggestion, from a study by the Wellcome Foundation on what makes happiness, is that we adjust our expectations. As with the student survey above, many users of surveys are encouraged to interpret 'no complaint' as 'excellent', when it might be more useful to ask about some particular facet, such as whether the "temperature of the fries from McDonald's at the point when they are sampled" falls within a range of responses. My own opinion here is that if I want the fries (chips, in my version of English) to be hot, then (i) I time my order to fit with there being none ready to serve and (ii) I eat soon after receipt of the food. That is, some of this issue is down to 'them' and some down to 'me'. So I address my expectations.

I had a little hunt for advice from those in business on how one might be expected to measure customer satisfaction. The consensus seems to be that a happy customer will return and an unhappy customer will not. If so, then one wants happy customers, and we should ask what might make customers 'happy' and what 'happy' might mean when they are asked about it. This assumes that any criticism or complaint is taken on board openly and acted upon, that this is done in the hope of converting an unhappy customer into one sufficiently satisfied to return and, perhaps more importantly, that the treatment of an 'unhappy' customer is seen by other customers as a positive experience.

McKinsey [5, below] indicates that happier customers go along with reduced costs and increased revenue. I liked the use of descriptors here and the brevity of the range of points. I have paraphrased:

• Overall satisfaction, which subdivides into perception of quality, perceived reliability and fulfilment of the customer's needs. We could call this a matter of attitude throughout the experience.

•  Loyalty, which is often asked as whether you would recommend the service or product. The perception is that this attribute of loyalty expresses the likelihood that the product or service will be purchased again. If broken into sub-questions, this will ask about overall satisfaction, continuation of use of the brand or product, and then that question about making recommendations. My immediate family is useless in this regard, because we won't offer a recommendation at all unless specifically asked. Even then, the response is likely to be a description of what the product is and what it does, perhaps with admissions of misunderstanding the product – thinking it would do some things that it clearly cannot. We could label this as consequent behaviour.

•  Satisfaction with attributes, the details, calls for fine decisions such as whether the fries were hot enough. As ever, these are subjective, measured against some unexplained personal judgement that is itself based upon preconceptions. Not least among the attributes here is the emotional effect that the product or experience provides. While this is all very subjective, it is also the critical part of being 'happy' about the whole (of the product, or the service, or the sales experience).

•  Intent to repurchase is what, as supplier, you want to hear, but this is entirely hypothetical. I could have a wonderful experience at a restaurant on holiday, but if I am unlikely to revisit the country, then I am automatically unlikely to revisit that restaurant. Many experiences are wonderful because they occur just once; I might wish to visit Machu Picchu, but not twice. If there is any useful question to ask, it would be whether I found the experience so extremely good that I'd recommend it, even though it cost an arm and a leg (i.e., a lot) to go at all. An associated question might attempt to rank Machu Picchu against, say, the Grand Canyon or the South Pole and to judge which might be more worthwhile. Intent to repurchase is an expression of satisfaction: discovering whether a repurchase occurs is a different datum altogether.

[3]  gives some pointers on how to sample customers.

[4]  points to having happy customers.


Are there any metrics really worth the bother? This Google response showed, on the first page when I tried it, a list of 5, one of 12 and one of 64 metrics that you 'really must know'. I suggest immediately that these are things that every manager thinks every other manager ought to have. Also, I can think of several matters that are more important, such as being clear quite what business you are in and what its objectives are. If these objectives are clearly expressed, then what measures can you apply to check that you really are achieving them? It may be that you include some factors such as making responsible decisions, that you are sufficiently green, that your business integrity is intact; you may consider your colleagues in regard to their happiness, ethos, integrity or honesty; you might apply some of those attributes to your customers too. You might declare that you have only one objective, to make money. If so, then personally, I hope you fail.

Bias in data

Data can be bad, in that it is misleading. The most likely cause of this is unconscious bias. If your data was consciously biased then you deserve whatever misleading result occurs. Bias can occur accidentally when you trust a source that you do not realise is biased. An example here would apply to almost anything you use as some sort of measure of 'normal', so that you hope to convince yourself that your own data is on the 'right' side of the norm.

Unconscious bias is, obviously, that of which we are unaware. Much can be attributed to what I want to call herd behaviour (and, googling the term, I found I'm not alone). "We show that positivity bias and skewed risk/reward assessments, exacerbated by the insular nature of the community and its social structure, contribute to underperforming investment advice and unnecessary trading. Discussion post sentiment has a negligible correlation with future stock market returns, but does have a positive correlation with trading volumes and volatility. Our trading simulations show that across different timeframes, this misinformation leads 50-70% of users to underperform the market average. We then examine the social structure in communities, and show that the majority of market sentiment is produced by a small number of community leaders, and that many members actively resist negative sentiment, thus minimising viewpoint diversity." [6, discussing financial markets].

Much of the material about unconscious bias is directed at representation, the bias towards 'people like me'. Mostly that can be translated as white and male, but it has an unconscious spread both toward a particular group of people and away from people who are in some way seen as different. In general, this sort of bias, separating people into groups on the basis of something like gender or colour or background or some other basis including a political preference, is frowned upon. If you think your product or service is open to all, then you really do need to establish that you are meeting the needs of society as a whole; conversely, if you are unconsciously targeting one particular sub-group of the population, you rather need to be aware of this. In a biased society this might mean that, if you are succeeding in developing a market with one particular group, this may affect your market with another group that sees itself as different or somehow superior. In the sense that you will wish to protect your customer base, this can lead to some situations that look very odd. [Good example; cosmetics for a variety of skin tones. Different packaging for a product aimed at a different market.]


Is it a good thing or a bad thing to employ more people like yourself? As an employer, is diversity in the workplace automatically a good thing? Is this relevant here? Potentially, it is very important. Studies [7,8] show that employees who perceive that bias is occurring are: three times more likely to be disengaged while at work; three times more likely to confirm that they are planning to change jobs in the next year; two-and-a-half times more likely to have (said that they) withheld ideas / solutions over the last half-year; five times as likely to be negative about their company on social media.

The authors of [8], now Coqual, give a short list of what a leader could do to "disrupt bias", as they put it. Several of these have relevance to MfU:- making it safe to propose novel ideas, ensuring everyone is heard, giving actionable feedback, implementing feedback, empowering team members, and sharing credit with team members.

A lot of what we argue in this work assumes that the management is as yet uninterested in seeing the value of change, and so Management from Underneath is to an extent about action that may be seen as subversive, causing movement in the desired direction without setting up confrontation. This is an odd form of bias, one which separates Us from Them without needing to be in any way precise about that bias; it may be characterised as simply old-school Management and old-school Workers. This is fundamentally flawed because it assumes that the workers are uninterested in the well-being of the business. Worse, it encourages such an attitude. So, while it would disrupt bias if it were possible to propose a new idea, one suggestion here has been to persuade a higher-up that they had the idea, that they own it and that therefore you are giving support to them in pursuing it. In the language of defeating bias, this encourages the immediate boss to see themselves as an innovator, while at the same time being given the opportunity to see the team ('below', in an old sense of management structure) as a source; with only a little more persuasion, the cycle of feedback will occur, encouraging more ideas, more voices heard and a steady move in a direction that all 'below' will appreciate.


[8]  "a recent study from the non-profit Center for Talent Innovation measured the impact on employees who perceive implicit bias in the workplace" [7].  I think this is it:-

[9]  Direct, accessible language, clearly stated results. Book form, 2017. The Center for Talent Innovation has become Coqual. There is some excellent material in the archive, too.

Possible to rewrite the last few paragraphs as beginning from the concept of an unconscious division such as Us/Them = Management/Workers. Then to describe this as a form of bias and that therefore all ideas to combat bias in any form are applicable. Within that, MfU is attempting to describe how to move towards such a situation. Perhaps ideally to proffer both in different chapters and bring them together towards the end?

Also, [8] and [9] point to larger issues, such as the imperative of innovation, which has to include both encouraging employees to share ideas and for leaders to action these. Thus MfU is about coping strategies where your perception of the leadership and its layers precludes the movement of ideas and any innovative actions.

Side issues from previous chapter:-   People are bad with data. Data bias. Unconscious bias. Pointers.

From the last chapter:    With all such metrics, we should ask, frequently, whether this is correctly measuring what we think it does. If we are clear what it is that is being measured, we are then able to resist attributing the resulting measure with properties it doesn't reflect. For example, in education there is a natural correlation between results gained and the standard of the incoming student. If a 'good' school produces superb results, is that a direct result of the teaching, or do the entrance criteria factor into this? 

Omit. After I'd left, the advertised objectives remained much the same, but some 30-50% of lessons were in 'local' and 90% of the staff were local (as opposed to 100% and 50% while I was there); the objective was grades and only grades, to the point where I accidentally caught a previous boss creating certificates. Of course, I reported this behaviour and discussed the cultural lean towards this sort of behaviour with the next visit of (UK exam board) inspectors at some length.


Remainders from Measures of Success

Suppose your office has a timesheet whose objective is to record what projects (jobs, chargeable accounts, label-able tasks) you have worked on and how much time you spent on these tasks. What checks occur? If no checks are possible, are there jobs to which you are not allocated but to which you may have contributed? Are there jobs we might call 'sensitive' that, if you record time against them, will cause questions from on high? Are there jobs in the opposite direction ('hospital jobs') to which it is understood that loads of otherwise useless or unproductive time will be allocated? Is it permissible to record that you spent ANY time being unproductive, or working on something personal, or looking at a task that we might call speculative? To what uses does the recipient of the timesheets put them? Is that the same as what those who fill in the timesheets believe? Is there any feedback to the workers about these timesheets (and should there be)? What are the genuine reasons for having these measurements?

So if you're not allowed to be idle, you are indirectly required to find a labelled task to which you can attribute useless time. Let's further assume that you're not allowed to quit for the day even when you have nothing to do. Of course you can find things to do, but is it acceptable to find something against which to charge this time?

Examples and case studies to find

© David Scoins 2021