Measures of Success | Scoins.net | DJS

Measures of Success

Many businesses have a need for metrics, ways of measuring what is being done.

"You can't manage what you can't measure." This is treated as true, when it is perhaps one of the great lies of business.

Despite that, there is a lot to be said for having metrics, provided attention is paid to what they reveal. Or, more significantly, what they fail to reveal.

What all too often happens is that this metric tail ends up wagging the business dog. The metric—and there may be more than one, but very often there is a KPI, a key performance indicator—makes the life of the managerial staff simple and they fall into the trap of believing that this indicator tells the truth, tells all of the truth and tells all that they need to know.  By which time the metric is in charge.

When the metric is sufficiently important in a local environment, behaviour adapts to perform in accord with the metric. So work which will not be reflected in the metric is skimped or not done; work which will clearly reduce the metric score is dumped on targeted people or dumped altogether; and conflict arises as people fight over the obvious cherries that will add well to the metric. And throughout this, way too often, at no time does anyone dare question whether the metric is as all-seeing and wonderful as management obviously thinks it is.


Side issues:-   People are bad with data. Data bias. Unconscious bias. Pointers.  Put in Data chapter


Slow-time story needed to exemplify this.


In the context of the school environment, where the metric is resulting grades, there are fights to persuade the academically able to join your course and, at the other end of the spectrum, to push the disadvantaged and noticeably less able away from your course, so that some subjects become dumping grounds, especially those that are perceived as requiring different skills. So all of those with difficulty in English as a second language are pushed towards courses such as art, graphics, design and technology, occasionally maths — places where the perception is that language is less important. Whether this deals with the underlying problem is irrelevant, it moves the problem student ('this student is a problem for me and my subject') away from your departmental protected space (Chapter reference) and that is enough to protect the measure of success. Someone else's problem.

So this exemplifies several flaws: conflict, gaming the system, masking problems and, often, unintended consequences.

In the same scenario and later in the course, there are efforts to persuade candidates to withdraw so as to protect the subject score; there are delicate nudges to scores around the grade boundaries; there are grade estimates that are played (gamed) in line with departmental policies. These moves demonstrate a different sort of conflict, between protecting an interest (the grade score for the department) and the education of the student (including the possible benefit of a failure as a prompt into different behaviour), which generally brings into question the ethos of the school and the integrity of the staff, and falls into what I call the 'political' class of decision.

No part of such behaviour measures whether the students have learned enough to cope with the changed style of learning required for the next phase of their education. The metric doesn't measure ability to adapt, to have ideas, to work independently or in a team. Instead it measures the ability to pass exams and, often, ability to conform and to follow instruction. Which in turn begs the question whether this is what we want from education.

I worked in a quantity surveyor's office for several years, but actually the relevance is that this was an office job. At the end of every week in theory, but in practice every month, we each filled in a timesheet that allocated work to jobs. It took a very short time to learn that having time left over was considered very bad and only a little longer to discover that a job on which very few people worked was going to be scrutinised more closely, so in the same way one soon learned to identify certain large contracts as the place where you dumped all the time lost, including the time spent filling in the timesheet. If I worked only on one job that month then obviously I must have spent all of those hours on that job. But this is not true and, chasing that idea and clocking myself as off-task when true, I soon discovered that 90% on-task was hard work, and rare. Time would disappear into coffee breaks, what Americans call water-cooler conversations and the genuinely approvable business-related conversations such as sharing experience, learning what was better performance (training, in effect). So my whole month on Job A was, at best, 90% accurate and probably nearer 75% of a month, after subtracting me helping others with calculations (the only mathematician on the staff, used as substitute calculator) or being sent out for sandwiches, off to do the photocopying and so on. Things improved when the boss was persuaded that perhaps there was some general office administration that we ought to be booking time to, but he visibly viewed this as an admission of failure in some way.
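To put rough numbers on that timesheet story, here is a minimal sketch in Python. Every figure in it is invented for illustration: the 150-hour month, the category names and the split between them are assumptions, not records from that office. It only shows how booking a whole month to one job overstates the time genuinely spent on it.

```python
# Hypothetical month: every name and number below is an illustrative assumption.
actual_hours = {
    "Job A": 112.5,
    "helping colleagues with calculations": 15.0,
    "coffee and chatter": 12.0,
    "errands and photocopying": 7.5,
    "filling in the timesheet": 3.0,
}


def booked_timesheet(hours_by_activity, chargeable_job):
    """Book every hour of the month to a single chargeable job."""
    return {chargeable_job: sum(hours_by_activity.values())}


def on_task_fraction(hours_by_activity, job):
    """Fraction of all hours genuinely spent on the named job."""
    return hours_by_activity[job] / sum(hours_by_activity.values())


booked = booked_timesheet(actual_hours, "Job A")
fraction = on_task_fraction(actual_hours, "Job A")
overstatement = booked["Job A"] - actual_hours["Job A"]

print(f"Booked to Job A: {booked['Job A']} hours")
print(f"Genuinely on Job A: {fraction:.0%} of the month")
print(f"Hours the timesheet overstates: {overstatement}")
```

With these made-up figures the timesheet says 150 hours on Job A while only three quarters of that was spent there; anyone costing the job from the timesheet inherits the difference.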

It was later, working on a civil engineering site, that the phrase 'hospital job' cropped up, describing a task which, on the critical path analysis, had such slack in it that the resources allocated to it would cause no problems if they failed. So, for example, the JCB that needed a lot of maintenance would be used to do the support work on the land compulsorily purchased (from farmers, in this case); it didn't matter how long it took. It was as if the fault-prone JCB was in hospital. Which had me confused, because the jobs at which I'd learned to throw all the leftover hours (hours that had to be attributed to a big enough job) were, you guessed it, on hospitals.


With all such metrics, we should ask, frequently, whether this is correctly measuring what we think it does. If we are clear what it is that is being measured, we are then able to resist attributing the resulting measure with properties it doesn't reflect. For example, in education there is a natural correlation between results gained and the standard of the incoming student. If a 'good' school produces superb results, is that a direct result of the teaching, or do the entrance criteria factor into this? 

I worked at a school where a (very small) part of our entrance requirement was an IQ test. We had learned that an IQ below 88 pointed at someone we could not successfully teach. I'm not saying that such people cannot be taught, I'm saying that at this particular school the way we worked failed these students. Of course, measures of IQ only form one part of the picture and so one learned to look at the students in school with IQs at the bottom of the spectrum, to wonder what we could do to better their learning (without, ideally, spending more than the matching income). And as a result we showed that we were successful with IQs of 90 and upwards and that IQs of 88-90 consistently gave us problems both in teaching and in learning. At no time did this simple test become the sole factor, but over the years I was on the staff the yardstick was tested often enough to show that the below-88 score meant that this child would not succeed with us. Which, being the place we were, was couched as 'we will fail as educators', not that the child would fail. Not suitable for here and here not suitable for you, if you like. After more decades in education I look back and marvel at the open honesty in those observations.

At another site and in another country, our significant test lay in a correlation between scores in English and Mathematics. We were quite careful to couch the mathematics to test necessary skills and to evaluate as best we could ability to think without depending on fine meaning of language, while in the English we explored the adequacy of language and, again, we were up-front about the vocabulary size we needed so as to be able to teach in English. Some people took several tests and showed that they were progressing toward the standard we required, so we repeated the test at entry. Yes, we wrote many equivalent tests and we worked hard to keep them secure. We then produced a correlation chart for each year-group and for course applicants on which we could see quite clearly the elliptical region that encompassed those with whom we could succeed (in reaching the desired standards), in this case achieving good enough A-level grades to enter university in the US or UK. For a given success in maths there was a minimum of English and vice versa. We had so many students on the edge of the envelope that it became relatively easy to show a fairly fine distinction between probable success and failure; to show an ability for our processes to cause learning to occur sufficient for the targeted grades and result. No, it was not the only criterion, but it gave us, like the IQ test above, a bottom edge.  Unsurprisingly, the marketing department found it very easy to come up with candidates who fit their requirements (able to pay the fees) but not ours. It mattered not how much we complained, every year we would be overridden and more students below the success line would join us – and, time after time would fall away. The very few who changed their behaviour and became successes would be less than one in ten of those at the edge. But just one candidate doing so was enough for Marketing (really, Sales) to produce an extra ten hopefuls the following year. 
And it must be said, the hope lay with Marketing. Picture of example chart, perhaps? A sotto voce comment from within Marketing observed that every failure produced a year's fees, which they would call a win. My persistent response was that this was not a win, since we spent more resource on every one of these weak candidates, plus the losses connected to having failed another customer. Marketing didn't care, because their measures did not include provision of education, only the sale of the possibility of education. Thus the superior management allowed conflict to occur and showed that its primary goal was not education, but sales. I always said that I thought sales would improve if the quality could be assured; my bosses (all outside education) preferred the short-term solution, which served their own individual metrics and so buffered the opinions held by those higher still.

I should point out that this correlation chart was produced when trying to identify if there was any correlation between intake standard and result (if you like, exit standard) and in attempting to discover if we could identify early the third of students who quite clearly were not learning in lessons delivered in English. The whole point of the course on offer was to prepare for university study in English, so there was, in my view, no point at all in allowing use of the local language (I'm resisting saying which language) except for the occasional shortcut to understanding, equivalent to the use of a dictionary. I was not prepared to have lessons delivered in 'local' since the objective was to learn overseas and we were advertising that we did not do so. That's integrity and honesty from a different chapter. We learned that we could do much more about English failure (lists of necessary vocabulary) than we could about the maths, which was our route to measuring thinking skills. Later, we spent effort on a third aspect, being able to ask questions, which turned out to have cultural implications.
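For anyone wanting to try the same kind of chart, the 'bottom edge' idea can be sketched in a few lines of Python. This is not the method we used, only one simple way to approximate it: the records are invented, the ten-point maths bands are an assumption, and the function names are hypothetical. For each band of maths score it records the lowest English score among past students who went on to reach the target grades, then asks whether a new applicant falls under that edge.

```python
# Invented records for illustration only:
# (maths score, English score, reached the target grades?)
past_students = [
    (45, 70, True), (48, 62, False),
    (55, 58, True), (52, 50, False),
    (65, 52, True), (68, 45, False),
    (75, 48, True), (72, 40, False),
]

BAND_WIDTH = 10  # assumed width of each maths band


def maths_band(score):
    """Lower bound of the maths band that a score falls into."""
    return (score // BAND_WIDTH) * BAND_WIDTH


def bottom_edge(records):
    """Lowest English score among successful students, per maths band."""
    edge = {}
    for maths, english, succeeded in records:
        if succeeded:
            band = maths_band(maths)
            edge[band] = min(english, edge.get(band, english))
    return edge


def below_edge(maths, english, edge):
    """True if under the observed edge, False if not, None if the band has no data."""
    band = maths_band(maths)
    if band not in edge:
        return None
    return english < edge[band]


edge = bottom_edge(past_students)
print(edge)
print(below_edge(66, 45, edge))  # applicant with maths 66, English 45
```

An edge like this is only a prompt for a conversation about an applicant, never the sole criterion, and it is only as good as the number of past students sitting near it.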

Omit. After I'd left, the advertised objectives remained much the same, but some 30-50% of lessons were in 'local' and 90% of the staff were local (as opposed to 100% and 50% while I was there); the objective was grades and only grades, to the point where I accidentally caught a previous boss creating certificates. Of course, I reported this behaviour and discussed the cultural lean towards this sort of behaviour with the next visit of (UK exam board) inspectors at some length.

_____

Remainders

Suppose your office has a timesheet whose objective is to record what projects (jobs, chargeable accounts, label-able tasks) you have worked on and how much time you spent on these tasks. What checks occur? If no checks are possible, are there jobs to which you are not allocated but to which you may have contributed? Are there jobs we might call 'sensitive' that, if you record time against them, will cause questions from on high? Are there jobs in the opposite direction ('hospital jobs') to which it is understood that loads of otherwise useless or unproductive time will be allocated? Is it permissible to record that you spent ANY time being unproductive, or working on something personal, or looking at a task that we might call speculative? To what purposes does the recipient of the timesheets put them? Is that the same as what those who fill in the timesheet believe? Is there any feedback to the workers about these timesheets? (Should there be?) What are the genuine reasons for having these measurements?

So if you're not allowed to be idle, you are indirectly required to find a labelled task to which you can attribute useless time. Let's further assume that you're not allowed to quit for the day even when you have nothing to do. Of course you can find things to do, but is it acceptable to find something against which to charge this time?


Examples and case studies to find

Repetition; can it be turned into useful repetition?

There are several issues attached to metrics:

(i) they are susceptible to gaming. 

(ii) they don't measure what management thinks they measure

(iii) there is sufficient imprecision in what is recorded that the information produced is not actually useful. In such cases, any action taken on the basis of what is recorded is going to be flawed, perhaps fatally so.

Yes, Fred is here for 35 hours every week. That tells you how long he is onsite, not what he did. If he worked only on Project Alpha, then that doesn't recognise the time spent doing other stuff, such as internal administration, personal administration and office habits like coffee and chatter. Management might well need to recognise that the effective time onsite is quite considerably lower. This supports the argument for reduction of the working week, or for changing the methods of working more radically. If Fred's job is to work on Project Alpha until that project is complete, with incentives to produce quality and timeliness (without censure, etc.) then perhaps Fred (and the team to which he contributes) is capable of causing the project to be done more effectively, or with fewer hours (or, in a wider sense, fewer resources). Does Fred have any incentives to finish this sooner? Or the opposite? How does Fred measure his progress? Would it be better if Fred had some input or response to any measures of progress?

Schools are often measured on exam grades. There is little dispute what the grades are. But these things are capable of manipulation; students can be, and are, encouraged to 'drop' subjects at which they will not 'succeed', where, very often, this 'success' is something that reflects on the school or department or teacher, such that these individuals have opinions at odds with the interests of the student. This has multiple effects and causes (or should cause) recognition of these conflicts. So the grades can be 'improved' by entering only those students who will produce whatever is judged as 'success'. Some subjects are obligatory, such as Maths and English, but there may be wriggles the school can offer, such as English as a foreign language, or commercial arithmetic, so that the possibility of a 'better' grade occurs and at the same time some vaunted metric ('90% top three grades at Key Stage 4'; '98% pass rate', 'no failures') is protected.
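The arithmetic of that manipulation is worth seeing once. The sketch below uses an invented cohort of ten predicted scores and an assumed pass mark of 40; none of it is real data. It shows the pass rate moving from 70% to 100% purely by choosing who is entered, with no change at all in what anyone has learned.

```python
# Invented predicted scores for a cohort of ten; the pass mark is an assumption.
cohort = [82, 74, 66, 58, 51, 47, 44, 38, 35, 29]
PASS_MARK = 40


def pass_rate(scores, pass_mark):
    """Share of entered candidates who reach the pass mark."""
    return sum(1 for score in scores if score >= pass_mark) / len(scores)


# Honest entry: everyone who took the course sits the exam.
rate_all_entered = pass_rate(cohort, PASS_MARK)

# Gamed entry: candidates predicted to fail are persuaded to withdraw.
entered = [score for score in cohort if score >= PASS_MARK]
rate_after_withdrawals = pass_rate(entered, PASS_MARK)

print(f"Everyone entered: {rate_all_entered:.0%}")
print(f"After withdrawals: {rate_after_withdrawals:.0%}")
print(f"Students no longer counted: {len(cohort) - len(entered)}")
```

The three withdrawn students have vanished from the metric, which is exactly why the metric looks better.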

'If it can be measured, it must be measured.' Great lies of business.

_____________

Some things cannot be measured. That does not make them valueless. In Britain, age 5-19 education is free; just because it is valued at zero cost does not make it without value. Some of the things that make a job worth doing and some of the things that a business purports to do can quite easily be difficult to measure. Customer satisfaction for example: people are 'happy' when performance exceeds expectation. If this is taken as true, then the quickest route to happiness is to lower your expectations. If my business claims that 90% of customers rate us as 'excellent', that will fairly soon result in having expectations rise above what is deliverable. You don't promise what you cannot deliver. 

For those who agree that they 'can't manage what they can't measure' it is very tempting to attempt a reversal of the logic and try to say that only what can be measured can be managed. In a post-covid world many people have rediscovered all sorts of well-being, itself very difficult to provide measures for. But if well-being is among those things that are consistently ignored by managers (because they cannot measure it) then that gives great strength to all those who want to continue working from home as much as possible. I agree with everyone who wants to enjoy their work; I do not demand that all work be enjoyable, nor do I deny that there are bits of many jobs that are difficult to like, but that does not mean that there cannot be satisfaction in a difficult job done well, which can sometimes be a joy and sufficient reward itself.

______________

The model of manager who recognises that his job is to act as a protective umbrella for 'his' staff includes the work that puts all those staff 'onside', acting for the benefit of the business, interested in it doing well; thus the manager’s job is then to go around and listen, suggest and persuade. That puts the workers at the centre of the business, no bad thing, provided that all are interested not in making their own lives as cushy as possible but in furthering the (clearly stated) interests of the firm. One of which, in a very recursive manner, is to have happy, motivated staff.

In some places, Higher Education for one, this describes the way the union reps behave, not the managers.

_______________

Resilience 

While discussing content for this, I had a suggestion that the idea that one can effect change in one's workplace is probably beaten out of people quite early on if they work for a large company/institution, as many people do in their first jobs (e.g. a supermarket, a cafe chain etc.). Thus the initial enthusiasm to want to work is, so runs the suggestion, beaten out of people early. I place this squarely on the desktops of unenlightened management and blame them for any such occurrence. It then also becomes a subsequent problem with new employees. It would be interesting to see whether people whose early jobs are in small businesses feel differently about this.

So generating the trust and the ethos that makes (returns, one would hope) the workplace into a positive experience where moaning and whinging is replaced by constructive criticism and can-do attitudes is, I say, entirely down to management. Thus management from underneath, MfU, serves to indirectly educate incompetent and insensitive managers to the possibilities therein.

It is difficult to shed bad habits in the workplace. If one has learned that (for example) one's ideas aren't taken very seriously in Role A, then it's hard to recalibrate when moving to Role B.  Both MfU and resilience say one (we) should work on finding ways to have ideas accepted.

Resilience is a word whose frequency of use has been raised by covid – one hidden benefit of the pandemic. At the same time, resilience (its lack) is a term that's also often used to shame individuals for being unable to withstand problems not of their making. One hears a lot about 'resilience' and 'robustness' at university, while hearing a lot less on what the university is doing to reduce the things that one is trying to 'resist'. Thus, has some utter twit managed to turn resilience into a negative, derogatory term? Not quite; one is often asked to 'build resilience' as individual employees, placing the burden on us as supposedly weak or brittle, when in fact the organisation (which is itself the source of whatever onslaught we are supposed to be 'resilient' against) does little to mitigate its own behaviour. So while individuals and employees should be robust, just telling people to 'be resilient' simply shifts responsibility from the organisation to the individual in a way that is unfair and unhelpful.

The Army has been chasing resilience as a desirable for decades and I thought we'd learned from the pandemic that resilience is the primary goal of any business that wishes to survive change. If a lack of resilience is the central criticism in such discussion, then one has to ask what the business is doing to improve that – it is not the role of the individual to find this, but the role of the business to engender it. Thus, with the ideas of MfU internalised, one must find ways of assisting all concerned to have resilience, to engender it and encourage it, while at the very same time working hard to show that when resilience is needed, there is recognition of the attached additional work, the extra stress and perhaps appropriate compensation.

One might distinguish three situations demanding resilience: those factors external to the business that demand change; internal changes, perhaps arising from errors of various classes; and those one might describe as personal to the employee. These all need different sorts of support but inherently a team spirit will reduce the attached stress.

Examples?

external to the business that demand change - how about the level of inflation we see in 2022 that we haven't seen since the 1980s? How about the 2020-2 covid pandemic? How about Russia invading Ukraine or fuel prices doubling? How about all the extra paperwork caused by Brexit?

internal changes, perhaps arising from errors of various classes;   explain at length

? recognition that a metric doesn't measure what it needs to; a new product line in manufacturing; a revised process within the business; a change of ownership; a change of personnel (e.g. your boss); union unrest; customer unrest; ...... more, leading to better categorisation.

those one might describe as personal to the employee. Anything stressful, from a death in the family, a relationship break-up, moving house, having a medical issue or an accident. Some of those in the first category also belong here, where they hit at a personal level, such as the inflationary pressures and the significant loss of disposable income.




It is the job of the business to protect its staff from whatever it can. I claim that the principal role of every middle manager is to act as an umbrella for those 'below' in keeping off the shit that falls from on high. I see this as 'protecting one's staff'. That no such situations should be occurring is a quite different matter.


Short version:

Metrics mask problems

Metrics create conflict, they lack credibility and they lead to unintended consequences

People focus on the metric. Staff game the metric. Performance goes out the window because it has become irrelevant. Not only managers focus on the metric.

from [2] less is more and know thyself


[1] Lies, damn lies and metrics. Mitchell Osak.

[2] here, essay 84, but also 82, 85, 211, 232, 233 238

other sources to look at.

Academic-style papers  1 (paywall)     2 


Aside: I have issues with all sorts of measurement that mark attainment as extremes and then demand that these extremes be common. An example is the assessment of schools, where Good is barely acceptable and Outstanding is expected. That abuses the word outstanding, which ought at most to indicate the top 5%, and arguably only the 2.5% or so who sit two standard deviations above the mean. Similarly, when assisting with an application to university in the US, one was often asked to indicate the class position of the candidate; clearly, one was expected to show that all candidates but a few were outstanding. For my school unit, the candidates were self-selecting to have that outstanding property, studying A-levels taught in English, their second or third language. Outstanding in the general population, but not outstanding in the population that formed that year's cohort. So we can very easily have a problem in terminology, exacerbated by using a familiar word in ways which are only accurate within a minimised context.
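How a 'top x%' claim relates to standard deviations can be checked directly, assuming attainment follows a normal distribution (a convenient assumption for the sketch, not something the assessment bodies promise). The helper names below are my own; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1


def share_above(sd_above_mean):
    """Share of a normal population lying above a cut set in standard deviations."""
    return 1.0 - standard.cdf(sd_above_mean)


def cut_for_top_share(share):
    """Standard deviations above the mean needed to mark off a given top share."""
    return standard.inv_cdf(1.0 - share)


print(f"Above two standard deviations: {share_above(2.0):.1%}")
print(f"Cut for the top 5%: {cut_for_top_share(0.05):.2f} standard deviations")
```

On that assumption, two standard deviations above the mean marks off a little over 2% of the population, and the top 5% begins at roughly 1.64 standard deviations. Either way, 'outstanding' cannot describe most schools.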

For marginal costs see the runnerstribe entry and How Will You Measure Your Life? (download available). The relevant part here is marginal thinking; I think the term self-explanatory but you might characterise this as not allowing yourself to lose integrity. No 'just this once', no losing sight of what is right. That might be interpreted as not taking short-cuts. Or, just perhaps, recognising that when you do there is work left undone that cannot be left forever undone.


Is there a possibility of cross-referencing this to diets, or to any habit changing? And to the deceits that go with 'affairs', as discussed between the authors in May 22?



Email: David@Scoins.net      © David Scoins 2021