I have been wondering about data recently. The only data of any size that I think is mine is my personal data. Thanks to the internet I have access to a lot of other data, but it rapidly becomes clear that, while I can't access someone else's personal data (nor would I want to), I can often find the aggregated data of the nation. It is quite likely, though, that this data is only partial: a representation not of the population but of the population that has been counted. Quite what causes someone to be included or excluded is far from clear.
We read quite often about 'big data', without it being terribly clear quite who has it, or owns it, or should own it. We read of businesses that sell stuff using 'our' data to inform what ads or services they push in our direction and I'm fairly sure that we are largely unaware (i) what form this selection takes (ii) what data is being used (iii) quite when we agreed to what it is that is being used.
Let's explore that for a while: What I go buy at the supermarket is quite easily collected by the vendor and is their information quite as much as mine [Issue? Care to dispute this?]. The information collected by such a vendor has at least two distinct uses, which I'd like to separate into individual, meaning me or my household, and collected, meaning the aggregated data for that shop, that locality, that nation.
The individual case raises issues such as whose data this is and what rights if any attach to that ownership. I suggest immediately that, unless you go spend only cash and unless you lose all loyalty cards, you have chosen to share your personal data with the vendor. ¹ Further, it is unlikely that this situation can change unless payment can be somehow separated from connection with the content of the generalised shopping basket.
The general case is surprisingly different and has few downsides I can see. Surely it is in everyone's interest that what we in general buy is used to inform the vendor what needs replacing, what is selling or not, how trends in purchasing are changing. That does not help you at all if your choices (opinions, etc) differ radically from the norm, but for everyone who belongs in the bulk of the population, this is a positive.
The aggregated data, such as what all Asda Blackpool customers have bought this month, has obvious value to Asda not only per store but for the business. Knowing also that those who bought A also bought B and C may usefully inform the location of {A, B, C} in the shop and may indeed offer an opportunity to increase sales of all three. This is data still attached to Asda though, and as such is their data. However, once the data is anonymised further, we have opportunity to declare this to be of general ownership and accessible to many. There are obvious issues with the anonymisation process and at some point therein there is a good deal of trust being applied: mostly that the anonymisation performs as advertised, which rather demands that it be clear who owns quite what, or perhaps that information is free.²
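To make the 'those who bought A also bought B and C' idea concrete, here is a minimal sketch of how such co-occurrence might be counted. The baskets are invented for illustration and the function names are my own; nothing here describes how Asda or any other vendor actually does it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-anonymised baskets: one set of item labels per shopping trip.
baskets = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C", "D"},
    {"B", "D"},
    {"A", "B", "C"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def also_bought(baskets, item):
    """Among baskets containing `item`, the share that also contain each other item."""
    with_item = [b for b in baskets if item in b]
    if not with_item:
        return {}
    others = Counter(x for b in with_item for x in b if x != item)
    return {x: n / len(with_item) for x, n in others.items()}

counts = pair_counts(baskets)
shares = also_bought(baskets, "A")
print(counts.most_common(3))
print(shares)
```

Note that no customer identity appears anywhere in the input, which is the point: the aggregate pattern is useful to the shop without knowing who filled which basket.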
So, if you had a choice, would you share your information? Does it have value? Did you somehow sell it? This pondering rapidly led me to David Siegel, [1] whose material is worthy of a read. He concludes that monetizing your personal data is like trying to win money in a casino. You will not successfully sell your data and gain financially. Most of us 'give' data away in return for increased convenience, in a non-monetized trade. ³ The professionals on the other side of the trade are going to win in the long run, or they just won’t play. They will get their investment back.
Siegel continues: On the other hand, not selling your personal data has many benefits. Your personal digital assistant will get better and better at helping you find bargains, opportunities, and ways to take advantage of the ever-increasing flow of data in our lives. And this assistant will work for no one but you, quickly providing you a return on that investment. To sum up: Generic drugs are never going to buy your data and promote themselves to you, but their higher-priced competitors will. [...] There are plenty of issues to deal with in this exponentially growing world of personal data. Privacy and data sovereignty are important factors. But not directly monetizing our personal data sounds like a win to me.
Quite what data is personal but not already given away is perhaps not immediately obvious. Here are some examples:
• Who has your medical data? Is all of it collected in a single place? Are your dental and optical records separate from that which your doctor might have? Do you have rights to hold that data the medics have and if so can you exercise them? Do you own your DNA? Exclusively?
• Does anyone have ALL of your purchase data? I doubt that you do, though surely you are the one party with the ability to collect this correctly. Do you use that in any way to inform further decisions? Could you? Could you be bothered? Does this information, if collected, have value to you? Would you put a value on, say, all your travel data? As Siegel says, When your full DNA profile is available, drug companies will be happy to rent that data from you and pay you a decent amount each year, as long as you don’t sell it to their competitors.
I think I'd like to have general access to anonymised aggregated data and I don't object in principle to a (nominal) fee, though I'd prefer to have no gatekeeping, much as I think we have for access to national information in Britain.⁴ There is a can of worms here as a battle grows over quite who has any rights of ownership to certain classes of data. In so doing, there is the issue of who may sell (in any sense) data to others. An example that comes to mind immediately is at the heart of the Cambridge Analytica scandal, thence the big players such as Google, Facebook and Amazon. Just where is the line to be drawn as to quite what permission one gives away in using a service? If it has value, then we return to the monetization issue. Or we start to demand that there be a point at which information cannot be shared.
Suppose you're interested in who changed political allegiance at the last election. Last week, for example, we had (local) council elections; several thousand seats changed hands as the electorate (the 30% or so that turned out) demonstrated their mixed opinions of matters political, quite possibly taking out on their local councillors the perceived ills of their national-level representatives. The way we vote should anonymise the data, so you cannot tell that I voted differently from my previous stance. So we cannot tell who individually changed their vote, only that a particular ward (or equivalent grouping) has changed its vote. That can be caused by demography as much as political attitude.⁵
You may have come across GDPR, the General Data Protection Regulation, which was passed in April 2016 but only became enforceable in May 2018. [5] sheds a little light on that and the coloured table I lifted from there gives an insight, almost literally, into classes of data you might want to consider. While [5] is actually discussing data tags, an interesting concept on its own, I found the content instructive.
GDPR tries to move control of personal data to the hands of the individual. I say tries because we have yet to see whether it succeeds. It certainly encourages the sort of behaviour discussed above. Quite what form the consent you and I must give takes is far from obvious. [Did you notice it? Were you aware of what consent you gave? What do you perceive as being denied in return for you denying consent? Will this be like the small print, something you agree to do so as to get on with life and hope it never comes back to bite?] The regulation demands that every data collector clearly disclose what is collected, how long it is kept and who it is shared with. I wonder if that will behave as well as freedom of information demands and how well we will find ourselves informed on the (rare) occasions we decide to test the system. Perhaps we should do so? It would make an interesting newspaper article.⁶ ⁹
I wondered if I am allowed to hold my medical records. I looked at [8], which says Under the GDPR, practices will not be able to charge patients for access to their records. The BMA is currently updating its guidance on subject access requests, but in its GDPR guidance it says cases where practices can charge for access to records are likely to be 'rare'. Practices will have to comply with a subject access request within one calendar month of receipt under the new rules. You can extend this period by a further two months if the request is complex or there are numerous requests, but you must inform the individual of this within one month of receipt of the request and explain why the extension is necessary. Requests can be made verbally or in writing, rather than in writing only as is currently the case. Practices should have procedures in place for dealing with requests for information from third parties, including solicitors, and ensure they have clearly obtained consent from patients to share this information.
Actually [9] says that we do have the right in Britain to copies of 'most' of your medical records, under different legislation. I wonder again how many have exercised that right. So, having established that it is allowed and doable and free, is the anonymised data available? No success in that regard as yet. Neither have I yet found an easy way to access my own records.
Looking at this leads me to wonder if I am asking the right question. The general label for this is, I discovered, a subject access request (SAR), which is an application under §7 of the Data Protection Act 1998 (DPA), updated 2018 to enact the GDPR.⁷ You request information typically under the auspices of the Freedom of Information Act 2000. Thus I give you the framework for an official and polite request for information. So, for example, [10] shows what happened at the Office for National Statistics (the ONS), usually a good source from my point of view. This suggests that take-up is very low; 15 (=7+5+3) such requests were received in the period 2014-6, of which 3 were refused as exempt under §33 of the DPA. The ONS did not record the time taken to do this, nor does it charge, so there is no cost record. GOV.UK has a similar page indicating requests to the Intellectual Property Office (IPO), matching figures (0+3+2=) 5 in 2014-6, with one refusal and costs under £20 per, so 'It probably wouldn’t be worth the cost of raising invoices just for very occasional SARs at a maximum of £10 per request.' As an employer you should be aware that some requests are onerous (time-consuming, expensive) but one cannot refuse compliance on these grounds. [10] explains succinctly what might be successful grounds for refusing a request (SAR). Recent case law suggests that refusal is not a good idea. [10] says businesses ... have seen the number of SARs increase significantly over the last six months as individuals have become more aware of their rights. But that doesn't lead me to any statistics, so I have written to ask if they might point me to some appropriate data or its source.
Pause for lunch, which will include wondering if this has come to a natural end.
I succeeded in finding police reporting: NPCC FOI requests for 2019 came to 72 (=19+28+25) over 3 months, eight of which were refused in full. I didn't find any earlier figures, though I did find a lot of content labelled as on topic. This strikes me as remarkably few at a national level, but I wonder whether in time it is that requested information that I will be able to view. Indeed, I wonder why it is I am not already allowed to see the question and answer; also, if one is allowed to do so, how would you go about that?
It seems that the trick is to ask the right question. 'FOI monitoring' turned up the GLA's performance review. Bar chart mostly in blue to the right. This shows 2016/7 to have 800 requests increasing at perhaps 50 per year as a long-term average. Now this is the sort of analysis I have been looking for. See the link above or go to [12]. NHS Glasgow and Clyde do something similar [13] with fewer pretty pictures and around 1000 requests a year, rising at between 10 and 20% a year. Now we're cooking: As expected UK.gov does keep track and does publish something - it is finding the right button to push that is really the issue.
In 2018 there were 49,961 FOI requests received across all monitored bodies. This is the highest number of requests received in a year since 2013 and is an increase of 3,280 (+7%) on 2017. Of the 49,961 FOI requests received, 36,498 were resolvable. Of these 43% were granted in full, and 39% were withheld in full. This is down three percentage points for those granted in full and up two percentage points for those withheld in full on 2017 levels. The remaining resolvable requests were not yet processed or were partially withheld. Of the 19,270 requests withheld in full or in part, 34% were withheld due to the cost of response exceeding the statutory limit, 3% were withheld as vexatious or repeated, and the remaining 63% fell under other exemptions [14]. Half of all requests go to the DWP, the MOD, the MOJ and the HO; I do not see this as surprising. Half of what is left goes to the HSE and the National Archives.⁸ But no-one is sharing what was asked or answered. Nor is anyone obviously telling what class of body was asking questions. There is not even any comment I could see that says that, as a result of an FOI request, reporting has changed in some way that might be described as 'better'. A minor point is to wonder what is recorded as an FOI request, where one might expect that the Act itself makes sufficiently clear what should be counted. I am not at all convinced that a Freedom of Information request is the same as a Subject Access Request (FOI≠SAR, indeed).
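The bulletin gives a mix of counts and rounded percentages, so a few lines of arithmetic help to see what the percentages imply. This is only a rough check of my own using the figures quoted above from [14]; the derived counts are approximate because the published percentages are rounded, and the variable names are mine.

```python
# Figures as quoted from the 2018 bulletin [14].
received_2018 = 49_961
increase_on_2017 = 3_280
resolvable = 36_498
granted_in_full_share = 0.43     # rounded percentage, so derived counts are approximate
withheld_in_full_share = 0.39    # likewise rounded
withheld_full_or_part = 19_270

# Implied 2017 total and year-on-year growth.
received_2017 = received_2018 - increase_on_2017
growth_pct = 100 * increase_on_2017 / received_2017

# Approximate counts implied by the rounded percentages.
granted_in_full = round(resolvable * granted_in_full_share)
withheld_in_full = round(resolvable * withheld_in_full_share)
partially_withheld = withheld_full_or_part - withheld_in_full

print(f"2017 total: {received_2017:,}; growth {growth_pct:.1f}%")
print(f"granted in full ~{granted_in_full:,}; withheld in full ~{withheld_in_full:,}")
print(f"partially withheld ~{partially_withheld:,}")
```

On these numbers roughly 5,000 requests were withheld only in part, a category the summary paragraph mentions but does not count.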
I looked at OfWat, whose report [15] does indicate the subject matter of FOI requests - and allows one to read the response, which I think brilliant. As the letter I read says, Once an FOI request is answered, it is considered to be in the public domain. To promote transparency, Ofwat may publish the response and any material released on our website in the FOI disclosure section. Any personal information in the letter will be removed before publishing.
I found that OfCom behaves much the same way and I link to one such at [16].
I found that the Home Office is a disaster area. I don't find this a surprise but it is a disappointment.
Short of responses and unable to round this off in a sensible way until I have some answers. I like what OfWat and OfCom are doing with their FOI responses. I failed to find equivalence from UK.gov (I'm not saying they don't, only that I failed to find any). But because SARs are so different, it seems to me that we don't even recognise what it is that is held that can be called personal data and that therefore we should be concerned that we are not doing so. It is one thing to know of some sources, like your doctor; it is another thing entirely to discover who else holds data that you might think of as belonging to you.
DJS 20190508
top pic from thewindowsclub.com [4]
[1] https://blog.usejournal.com/ad-agency-of-one-a5a3085d54d2 You can read more about this in his book, Pull.
[2] https://en.wikipedia.org/wiki/Information_wants_to_be_free
[3] https://www.instituteforgovernment.org.uk/explainers/local-elections-2019
[4] https://www.thewindowsclub.com/why-do-companies-collect-sell-buy-or-store-personal-data
[6] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation implementation date 25/5/2018
[7] https://thenextweb.com/eu/2018/12/27/gdprs-impact-was-too-soft-in-2018-but-next-year-will-be-different/ The most recent review I found in the first ten seconds of looking. A summary of what is supposed to happen. I found very similar content here; is this the echo chamber effect?
[8] https://www.gponline.com/does-general-data-protection-regulation-gdpr-affect-gps/article/1460998
[9] https://www.google.com/search?client=safari&rls=en&q=can+I+have+my+medical+records%3F&ie=UTF-8&oe=UTF-8
See also Health Insurance Portability and Accountability Act of 1996.
[10] https://globaldatahub.taylorwessing.com/article/hr-subject-access-requests-under-gdpr-six-months-on This looks like a good source on this general topic, but I failed to find any hard data. It does not restrict itself to the UK.
[11] https://www.google.co.uk/search?source=hp&ei=ZBnUXNvkOMeBjLsPr-qzoAo&q=public+health+england+subject+access+record+2018&oq=Public+Health+England+subject+access+record+&gs_l=psy-ab.1.0.33i160.11367.45768..48908...8.0..0.191.3357.50j2......0....1..gws-wiz.....0..0i131j0j33i22i29i30j33i21.UugmXlLIpiQ SN07103.pdf provides good background, in my opinion.
[12] https://www.london.gov.uk/sites/default/files/2016-17_annual_foi_performance_report.pdf
[13] https://www.nhsggc.org.uk/about-us/freedom-of-information-foi/annual-foi-monitoring-reports/#
[14] https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/798935/foi-statistics-annual-2018-bulletin.pdf Collected UK.gov statistics.
[15] https://www.ofwat.gov.uk/2017-information-disclosure-log-quarter-1/
[16] https://www.ofcom.org.uk/__data/assets/pdf_file/0017/138050/Complaints-about-fake-news-FOI.pdf
1 Let's not be silly. The (two) parties in any trade both own their own part of the data; I know I bought, you know you sold. We both know what and to whom. How much more we know and share is perhaps moot. Whether or not we keep and use that information is no business of the other party, however much they might like it to be. Story possibility for the inventive. You might like to play with the idea that this is not so; at what point could you insist that a part of the data (and you probably mean the bit that associates your identity with a trade) be deliberately discarded? 'What right do you have to hold data on another party?' reduces rapidly to asking exactly what is yours or theirs or shared. I think that any trade has shared information and I think that the issue worth exploring is the extent to which you permit your identity to be associated in some way with any trade. Yes, John Smith bought this, but which John Smith? Allowing a body (think data system) to associate J Smith with a bank account and an address may or may not have been a choice. Perhaps it should be a conscious one? It's a bit different for me, as almost certainly the only Scoins in the county.
2 Is information free? Should it be? Is there a spectrum to be applied or is it far more binary, free / not free? Not a topic I feel I want to write about right now, but one worthy of your attention, certainly. See [2] as a start-point and I suggest that the reference list is helpful.
3 Monetize, with a z. I agree with the Canadian convention that -ise is the 'correct' spelling form. But to me this monetize is an exception, because I read the word as US American so much that I don't think it belongs in British English except as a loan word, and I think we should preserve the spelling of loan words. I found monetise as a British spelling, but Google was forever shoving me toward a company of the same name. Mind, I hit a very similar problem with anonymisation and chose differently, so that I needed to force my spell-checker to accept this.
4 I continue to use Britain as others outside the country use England, as the general label for this nation. The UK, the UK and NI, just England, somehow merely 'here' — I don't much care to worry about the subtle differences. I feel national attachment to the largest of these national units not the smaller ones, recognising that what I write about Britain does not apply to the larger still unit of 'Europe'. Whatever that might mean these days, it too has vagueness around the edges (UK in or out? Turkey? Ukraine? Switzerland?). I'm pretty sure I mean Britain to mean the largest amalgam, recognising all the while that around the edges the relevance may fade somewhat. So much of what I write may well be true in Eire or the Orkneys or even the Falklands and Gibraltar but is accepted as less so; meanwhile I expect that what I write as being a British attitude or truth or datum is not applicable to neighbouring France or Denmark.
5 I did, I cast a vote for a red candidate for the first time ever, largely in reaction to the printed offering from the other candidates. The Tory paper was a vehicle for bad spelling, poor grammar and worse editing, so much so I went and found a red pen and marked it; the Labour offering was actually well-written. The Tories disagreed with the spending made by the existing council; I disagree with them and wish to support the incumbents — who I think have been doing a difficult job fairly well under very straitened circumstances. So I voted for an incumbent. I had two votes, apparently (it said so on the ballot paper); I used the other vote in a different way more in line with the press perception of penalising the major parties. As a separate issue, I do not see local politics as being connected to national politics, so I find party allegiance to be somewhat unhelpful, in the sense that what <label> might mean locally is not the same as I take <label> to mean nationally. And, to be consistent, I realise I've taken the same stance in MEP elections, where, for example, a green vote might actually cause my choice to be elected.
I struggled to find useful analysis of the results. There were about 8400 positions being contested, with an average (which?) of 50 councillors per council, not all of whom are elected each time. [BBC results] [council election patterns] [electoral cycles]. Perhaps 1600 changed hands in terms of party but I have no idea how that compares with past such events. Analysts persist in telling us about control, which is not helpful when every council is reduced to arguing what can be supported, rather than by how much. I'll keep trying, but it looks to me as though I'm required to do the work myself, in which case it looks like too much work for a small gain in knowledge. MSc in Politics, anyone?
Argument at band about number of crosses to make. I found some detail for Blackpool on the local gov't site; every ward (of 21) had two councillors to elect for the borough council. The turnout varied from 27% to 38% but if I want to know the aggregate for Blackpool I must do this myself. I did; 29.03% turnout. The party workers must be kicking themselves in the several very close results - there's even one where second and third differ by a single vote. Most concerning of all, though, is the dramatic number of votes that had only one cross, not two; this is not recorded, but I'm estimating around 40% of those who voted only used a single vote. I do not believe that the rules prevent the officials from explaining what ballot is expected. I'm not sure I can be bothered to find out, as I think it is the voter's responsibility to read what the ballot paper says. And to be able to do that.
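For anyone wanting to repeat the aggregation, here is a rough sketch of the arithmetic. The ward figures are invented for illustration and are not the Blackpool results; the function names are mine. The point is that the aggregate turnout is total ballot papers over total electorate, which is not the same as averaging the ward percentages. The single-cross estimate rests on an assumption I am stating plainly: every valid ballot paper carries either one cross or two.

```python
# Hypothetical ward figures (NOT the Blackpool results): (ballot papers, electorate).
wards = {
    "Ward 1": (1_350, 5_000),
    "Ward 2": (1_900, 5_000),
    "Ward 3": (810, 3_000),
}

def turnout_pct(ballots, electorate):
    """Turnout as a percentage: ballot papers issued over registered electors."""
    return 100 * ballots / electorate

per_ward = {name: turnout_pct(b, e) for name, (b, e) in wards.items()}

# Simple mean of ward percentages: treats small and large wards alike.
mean_of_wards = sum(per_ward.values()) / len(per_ward)

# Aggregate turnout: weight by electorate by summing before dividing.
total_ballots = sum(b for b, _ in wards.values())
total_electorate = sum(e for _, e in wards.values())
aggregate = turnout_pct(total_ballots, total_electorate)

def single_cross_share(ballots, total_crosses):
    """Share of ballots with one cross, ASSUMING each ballot has exactly one or two."""
    return (2 * ballots - total_crosses) / ballots

print(per_ward)
print(f"mean of wards {mean_of_wards:.2f}%, aggregate {aggregate:.2f}%")
print(single_cross_share(1_000, 1_600))
```

With two seats per ward, the total of all candidates' votes falls short of twice the number of ballot papers by exactly the number of single-cross papers, which is one way such an estimate could be made from published counts.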
6 Pause to write to the Guardian with a suggestion. Do an upload of 'to here' while I think where to go next. Implementation was a year ago, not this month as I'd misread it, says [6].
7 Short version: 2018 DPA acts on the GDPR. Only extreme requests can be charged for. Insurance companies may access medical records for underwriting purposes and GPs charge for this. Medical records will not be released if they are considered likely to cause harm (physical or mental). A 12-year-old may deny a parent access to their records. Issues over those who don't have sufficient capacity at any age; issues over access to records of the deceased; issues over who can share what and with whom (obviously). Search words you might explore: Caldicott principles, Statutory disclosure of information, public interest disclosures, summary care records, National Data Guardian.
8 Department for Work and Pensions, DWP; Ministry of Defence, MoD; Ministry of Justice, MoJ; Home Office, HO; Health & Safety Executive, HSE. Of the refusals, about half conflicted with §40, personal information, 10% conflicted with §31 on law enforcement. See page 15 of [14].
9 Found in late May, the day after Mrs May indicated she will resign on June 7th: an article from Tech Republic [17] describing what we can learn from the fines imposed as a result of GDPR. I'll confine this footnote to the conclusions and encourage you to read up on this for yourself. Tech Republic shows you the evidence followed by their conclusion, so you can decide for yourself if you agree.
Lesson 1: It does not matter to the European data protection authorities whether violations of the provisions of the GDPR are unintentional mistakes stemming from neglect, laziness, sloppiness, or ignorance. A violation for any reason is punishable and businesses had better take compliance with the GDPR seriously.
Lesson 2: Willful, deliberate, and blatant violations of the provisions of the GDPR will receive the harshest of fines from European data protection authorities. Businesses who attempt to test the resolve of the regulatory authorities will pay dearly for their arrogance.
Lesson 3: The provisions of the GDPR, particularly amongst citizens of the EU, are well-known and individuals who feel those provisions have been violated are more than willing to report offending behavior to the data protection authorities. Unscrupulous businesses who count on the ignorance or passiveness of individuals are likely to pay a heavy price for that cynical attitude to personal data security and protection.
Lesson 4: While serious violations of the provisions of the GDPR are still subject to fines, timely reporting of security breaches to data protection authorities and quick action to reduce the risk of exposure of personal data by violating businesses could reduce levied fines significantly. All businesses handling sensitive personal data should have appropriate security and compliance policies in place to mitigate the risk from GDPR violations.