HideLast Call: Our Black Friday weekend sale ends tonight! Shop now to save on all Techdirt gear »
HideLast Call: Our Black Friday weekend sale ends tonight! Shop now to save on all Techdirt gear »

16K COVID-19 Cases Go Missing In UK Due To Government's Use Of Excel CSVs For Tracking

from the excel? dept

Yes, yes, you're sick of hearing about COVID-19. Me too. But the dominant force of 2020 continues to provide news, often times with a technology focus. This mismanaged pandemic has already given us an explosion of esports, students gaming remote learning systems, and enough dystopia to make George Orwell vomit in his grave.

But to really get your anger bubbles gurgling, you need turn only to the myriad of ways far too many governments have taken a moment that requires real leadership and forethought, and pissed it all down their legs. America appears to be trying to lead the charge in this, with our shining city on the hill mostly being illuminated by headlights of cars carrying sick passengers looking to get tested for this disease. Still, we're not alone when it comes to sheer asshatery. The UK recently managed to lose thousands of COVID-19 cases... because it was tracking them in Excel CSVs.

The issue was caused by the way the agency brought together logs produced by commercial firms paid to analyse swab tests of the public, to discover who has the virus. They filed their results in the form of text-based lists - known as CSV files - without issue.

PHE had set up an automatic process to pull this data together into Excel templates so that it could then be uploaded to a central system and made available to the NHS Test and Trace team, as well as other government computer dashboards.

Public Health England (PHE) decided to put all of this information into a file using the XLS format. XLS was first introduced in 1987 and was replaced by the XLSX format over a decade ago. Putting aside the use of Excel to monitor positive COVID-19 cases in a major industrialized nation for just a moment, just the use of an antiquated format managed to lose PHE over sixteen thousand positive cases.

How? Well, XLS has restrictions as to how many rows of data it can record.

As a consequence, each template could handle only about 65,000 rows of data rather than the one million-plus rows that Excel is actually capable of. And since each test result created several rows of data, in practice it meant that each template was limited to about 1,400 cases.

When that total was reached, further cases were simply left off.

Which means the people that had COVID-19 weren't tracked for contact tracing. The government and its people didn't have a complete picture as to either the total case count for the disease, nor its positivity rate. In other words, the agency in charge of national health failed to keep the nation informed as to its risk exposure because it didn't know how to properly use a common office application that it repurposed to record COVID-19 data.

Labour's shadow health secretary Jonathan Ashworth said lives had still been put at risk because the contact-tracing process had been delayed.

"Thousands of people [were] blissfully unaware they've been exposed to Covid, potentially spreading this deadly virus at a time when hospital admissions are increasing," he told the House of Commons. "This isn't just a shambles. It's so much worse."

The UK's Health Secretary told the House of Commons that PHE had decided to replace the use of Excel, or what he called a "legacy system", two months ago. But apparently PHE hadn't gotten around to it yet.

And still hasn't, actually. In fact, PHE's plan to temporarily fix all of this is... more Excel!

To handle the problem, PHE is now breaking down the test result data into smaller batches to create a larger number of Excel templates. That should ensure none hit their cap.

But insiders acknowledge that the current clunky system needs to be replaced by something more advanced that excludes Excel, as soon as possible.

When you hear complaints that governments are not taking this pandemic seriously, this is what they mean.

Hide this

Thank you for reading this Techdirt post. With so many things competing for everyone’s attention these days, we really appreciate you giving us your time. We work hard every day to put quality content out there for our community.

Techdirt is one of the few remaining truly independent media outlets. We do not have a giant corporation behind us, and we rely heavily on our community to support us, in an age when advertisers are increasingly uninterested in sponsoring small, independent sites — especially a site like ours that is unwilling to pull punches in its reporting and analysis.

While other websites have resorted to paywalls, registration requirements, and increasingly annoying/intrusive advertising, we have always kept Techdirt open and available to anyone. But in order to continue doing so, we need your support. We offer a variety of ways for our readers to support us, from direct donations to special subscriptions and cool merchandise — and every little bit helps. Thank you.

–The Techdirt Team

Filed Under: covid-19, excel, public health england, uk


Reader Comments

Subscribe: RSS

View by: Time | Thread


  • identicon
    Anonymous Coward, 7 Oct 2020 @ 8:07pm

    Hint for other agencies trying to avoid this mistake: SQLite files are well-documented and have size limits rather larger than Excel's.

    reply to this | link to this | view in chronology ]

  • identicon
    Anonymous Coward, 7 Oct 2020 @ 9:00pm

    The UK recently managed to lose thousands of COVID-19 cases... because it was tracking them in Excel CSVs.

    This is a misreading. The very next paragraph says the CSV files (presumably not having anything to do with Excel) were fine. The format to blame is XLS; or, better, the software to blame is Excel (any reasonable software would warn the user before throwing away part of their data).

    Still, CSV often leads to trouble, because it's really a family of formats. RFC 4180 attempts to standardize it, but if nobody specified this as the format to use, probably at least one piece of software will be using something subtly different—subtle enough that things will appear to work at first, until there's a string containing a comma, backslash, double-quote, or newline; or the first line starts with "Sep=". And you could still be screwed if someone opens that in Excel, since it doesn't default to 4180.

    reply to this | link to this | view in chronology ]

    • identicon
      Anonymous Coward, 7 Oct 2020 @ 9:05pm

      Re:

      It think the point is rather not unrecoverable data loss, but that those cases were in fact lost with respect to anything useful which could have been done with the data in a timely fashion.

      reply to this | link to this | view in chronology ]

      • identicon
        Anonymous Coward, 7 Oct 2020 @ 9:28pm

        Re: Re:

        The claim is that Excel is limited to 65535 or 65536 records in certain circumstances. One would think there would be some hard-to-ignore errors/warnings when dealing with larger datasets, e.g.:

        • ERROR: you're trying to save a file with more rows that this format allows; delete some data so there are no more than X rows and try saving again, or save in XLSX instead
        • WARNING: the file you're trying to open has Y rows, and Excel can only load X: cancel / load the first X rows only

        But, apparently, Excel sometimes silently truncates the datasets. If so, that's bad design.

        reply to this | link to this | view in chronology ]

        • icon
          PaulT (profile), 7 Oct 2020 @ 11:58pm

          Re: Re: Re:

          It's been a while since I've seen spreadsheets that large, due to thankfully being employed in areas where people are capable of picking the right tool for the job. But, I do recall having seen such errors in the past.

          I'd say it's more likely that the errors you described were being displayed, but the monkeys using the spreadsheets just ignored them. I hate to side with Microsoft, but if the choice is between assuming competence on their part or on the part of people who thought that a spreadsheet was the correct tool for this job, I'll side with them.

          reply to this | link to this | view in chronology ]

          • identicon
            Anonymous Coward, 8 Oct 2020 @ 6:56am

            Re: Re: Re: Re:

            I'd say it's more likely that the errors you described were being displayed, but the monkeys using the spreadsheets just ignored them.

            If so, those monkeys are masters of spin to be claiming it's a Microsoft problem. It seems rather unlikely to me, but I don't have a copy of Excel to check with.

            Were they errors, they couldn't have just been "ignored", because Excel would've aborted the saving/loading and nobody would have received the truncated data. Warnings could be ignored, but software should make it hard to do by accident and should make the consequences obvious (e.g., it shouldn't be an OK/Cancel dialog defaulting to OK).

            reply to this | link to this | view in chronology ]

          • identicon
            Anonymous Coward, 8 Oct 2020 @ 12:45pm

            Re: Re: Re: Re:

            The standard error when Excel fails reading a file says that there is an error with the file (not Excel) and would you like Excel to fix the file. Most people just click Yes and Excel throws away stuff until it can read the file.

            Recent case in point: was using XLSX but kept getting errors with an outputted file with 150,000 rows. Turned out XLSX wasn't having a problem with the row count since it will let you have a million rows in the new format, but it was choking on the fact that it only allows 65,535 hyperlinks in a single sheet. And it kept saying the file had an error. No, the file was perfect, Excel just failed and blamed everyone else. How did Microsoft ever become the software giant it is today with bullshit like this?

            reply to this | link to this | view in chronology ]

            • identicon
              Anonymous Coward, 8 Oct 2020 @ 12:55pm

              Re: Re: Re: Re: Re:

              How did Microsoft ever become the software giant it is today with bullshit like this?

              Marketing, and getting its OS on the IBM PC.

              reply to this | link to this | view in chronology ]

            • identicon
              Anonymous Coward, 8 Oct 2020 @ 5:11pm

              Re: Re: Re: Re: Re:

              there is an error with the file (not Excel) and would you like Excel to fix the file.

              Wow. Given the option to fix a file, of course one would want Excel to try and would expect it to show an error if it were unable. Throwing away half the file is an... unconventional definition of "fixing".

              reply to this | link to this | view in chronology ]

            • icon
              PaulT (profile), 8 Oct 2020 @ 11:41pm

              Re: Re: Re: Re: Re:

              "How did Microsoft ever become the software giant it is today with bullshit like this?"

              With office suites, they indulged in a lot of deliberate sabotage that prevented competitors like Novell from having their software work properly on Windows 95, leading customers to buy Microsoft Office instead...

              For some time they had a monopoly in the space, and while free alternatives do exist now, since they had to reverse engineer Microsoft's file formats to be compatible they don't always work 100% or offer the same collaboration tools, so corporate environments refuse to switch.

              reply to this | link to this | view in chronology ]

            • icon
              Scary Devil Monastery (profile), 9 Oct 2020 @ 5:10am

              Re: Re: Re: Re: Re:

              "How did Microsoft ever become the software giant it is today with bullshit like this?"

              Bullying. Also, swarming the market with what, for lack of a better term, they called "standards".

              reply to this | link to this | view in chronology ]

        • icon
          R2_v2.0 (profile), 8 Oct 2020 @ 5:26pm

          Re: Re: Re:

          It does.

          It pops a big box that says 'Significant loss of functionality' and then a decent blurb explaining exactly what's about to happen.

          I suspect the issue is probably that someone wrote a VBA macro that may not have the same safeguards.

          reply to this | link to this | view in chronology ]

    • identicon
      Anonymous Coward, 8 Oct 2020 @ 5:13am

      Re:

      That's why they really should just use xml. It's easy to parse. There's plenty of tools available for it so that you can transform the data into whatever format you need it. Plus, it's extensible. so if you find you need some additional information, you don't need to "add another column" possibly messing up existing parsers.

      Also, I really suspect the name of the file where they stored the data was called "Excel.ppt".

      reply to this | link to this | view in chronology ]

      • identicon
        Anonymous Coward, 8 Oct 2020 @ 5:48am

        Re: Re:

        The problem wasn't the CSV file, but rather the import into an old excel format, and xml would not fix that problem.

        reply to this | link to this | view in chronology ]

        • identicon
          Anonymous Coward, 8 Oct 2020 @ 7:00am

          Re: Re: Re:

          The problem wasn't the CSV file, but rather the import into an old excel format, and xml would not fix that problem.

          It kind of would—as I recall, Excel can't import XML files. While not a CSV problem per se, it's really easy to import a CSV to Excel incorrectly. XML is harder to deal with in general, which should also make it harder for non-programmers to fuck up (although there's no shortage of opportunity for programmers to fuck it up).

          reply to this | link to this | view in chronology ]

  • identicon
    Paul, 8 Oct 2020 @ 2:35am

    Dev: To create a quality database for tracking these test results will take 6 weeks and X amount of money Project Manager: Piffle, I can knock something up in Excel before lunch Dev: Have at it then, just don't come to me when it goes wrong *some weeks passes* Project Manager: Oi, dev, why didn't you warn me about this

    reply to this | link to this | view in chronology ]

    • icon
      PaulT (profile), 8 Oct 2020 @ 2:50am

      Re:

      "Dev: Have at it then, just don't come to me when it goes wrong"

      It's much easier now that everyone's working remotely, but I quickly learned to routinely demand that any decisions were made via email, ticketing or IM and not verbally. It's amazing how many times that's saved my skin when some middle manager tried blaming techs for not doing something that they had explicitly ordered not to be done. There's not a company I've worked for where some penny pinching management type hasn't ignored all warnings to save a few hundred, then went into full blame mode when a preventable outage cost thousands.

      reply to this | link to this | view in chronology ]

      • identicon
        Anonymous Coward, 8 Oct 2020 @ 6:33am

        Re: Re:

        <OT>

        Documenting decisions, incidents and actions helps with vermin like lawyers too. Lawyers hate opponents and individuasl who document out the wazoo. They like everything to be verbal, that way everything is "he said, she said" in front of the judge. Arguing is easy-peasy.

        On the other hand, document everything, and in writing ask for clarification/confirmation. That way in front of a judge, there is a document and the lawyer has to explain why he didn't clarify/explain when invited to.

        Not a cure all for the scum of law. However, fighting someone who documents thoroughly adds significantly to a lawyers costs and often isn't worth it. Even better, on RARE occasions, proper documentation can lead to a scum-bag lawyer getting what they deserve; unfortunately not often enough due to the lawyer conspiracy of "Professional Courtesy".

        Similarly, managers hate fighting documents. Documenting isn't fun, but unless you are one of those who go through life lucky, then it's better to be prepared. Documents which exist but weren't needed are a small waste of time. Documents who don't exist and are needed can make a big difference in the course of one's future.

        </OT>

        reply to this | link to this | view in chronology ]

      • icon
        Khym Chanur (profile), 8 Oct 2020 @ 3:41pm

        Re: Re:

        Or, as /r/TaleFromTechSupport would say, "CYA" (Cover Your Ass).

        reply to this | link to this | view in chronology ]

      • icon
        Scary Devil Monastery (profile), 9 Oct 2020 @ 5:13am

        Re: Re:

        "I quickly learned to routinely demand that any decisions were made via email, ticketing or IM and not verbally."

        This, right there. Always make sure there's a correspondence chain detailing who asked for what and when.

        Because if such correspondence doesn't exist, it'll be the fault of the grunt who did the work, not the one who ordered the work done to those specifications.

        reply to this | link to this | view in chronology ]

  • icon
    jilocasin (profile), 8 Oct 2020 @ 6:04am

    nuclear facepalm

    Wow just doesn't seem appropriate enough.

    • Using a spreadsheet when a database should be used
    • Using an antiquated spreadsheet when modern versions are available
    • Losing track of thousands of Covid-19 cases, causing more misery and death

    Priceless

    If it was a matter of cost, someone should introduce them to LibreOffice. The standard .ods format can handle 1,048,576 (2^20) rows and it's free.

    reply to this | link to this | view in chronology ]

  • icon
    Stephen T. Stone (profile), 8 Oct 2020 @ 7:12am

    The larger issue isn’t that so many cases went missing.

    The larger issue is that they had that many cases to lose in the first place.

    reply to this | link to this | view in chronology ]

  • identicon
    Mr. Underhill, 8 Oct 2020 @ 8:31am

    The wrong tool . . .

    So they lost Access?

    reply to this | link to this | view in chronology ]

  • identicon
    Dr evil, 8 Oct 2020 @ 9:09am

    This one wonders...

    Hmm a file small enough to email.. and as a file instead of a trackable database query? Guess no worries of.. imdunno..data theft?

    reply to this | link to this | view in chronology ]

  • icon
    Toom1275 (profile), 8 Oct 2020 @ 9:28am

    Top-rated comment on this from Ars, by deet:

    It's nice to be in IT and to know the right way to do everything and to have everything you need in order to do whatever you want.

    It sucks when your job is to get data from one organization to another organization and you're crushed under a pile of conflicting demands where the stakes are high, there's no budget or time to go hiring a database application developer, everyone else has kicked the can down the road, and you're the last person willing to figure out some way — any way — of getting the task done because it must be done and nobody really cares how you do it, as long as you get it sent and can move on to the next thing, which is also on fire. Your only hope in that situation is that any errors can be sorted out later.

    reply to this | link to this | view in chronology ]

    • icon
      PaulT (profile), 8 Oct 2020 @ 11:43pm

      Re:

      "It's nice to be in IT and to know the right way to do everything and to have everything you need in order to do whatever you want."

      ...then you wake up.

      reply to this | link to this | view in chronology ]

  • icon
    R2_v2.0 (profile), 8 Oct 2020 @ 5:10pm

    Not too much self reflection here

    Tech people on here who have worked with large orgs and government should really understand how 'new and emerging requirements' get handled in these situations.
    Firstly, almost no development will be getting done in-house. Your new reporting solution will need to go to an external party who is just dying to milk time and money from your government department.
    People are here talking about databases which might be part of the solution but then how are you extracting data and where's it going to? I mean, do people really think the contact tracers have SQL access/skills?
    No, contact tracers are a hastily assembled call-centre group with no CRM system who probably work by having a team-leader assigning cases out of a spreadsheet (yes, a spreadsheet).
    The real story here is how, in the era of ubiquitous IT, it is still almost impossibly hard for technology solutions to respond quickly to changing circumstances in a way which is reasonable for front-line workers, manages data security, is flexible and is accurate. And that's not even considering cost.

    I'm not suggesting Excel is the 'right way' but there are very good reasons it gets used.

    As a footnote, I notice with interest mention of the XLS 'format'. I'd be interested to know more about that, it's a common data-analyst trick to add an XLS extension to a tab-delimited text file. File associations will cause this to open in a recipients Excel with a warning about file format. In this scenario you are not limited in the number of rows but I wonder what happens if someone tries to save said file. I imagine it will apply the 'correct' XLS format and truncate the file.

    reply to this | link to this | view in chronology ]


Add Your Comment

Have a Techdirt Account? Sign in now. Want one? Register here



Subscribe to the Techdirt Daily newsletter




Comment Options:

  • Use markdown. Use plain text.
  • Remember name/email/url (set a cookie)

Close

Add A Reply

Have a Techdirt Account? Sign in now. Want one? Register here



Subscribe to the Techdirt Daily newsletter




Comment Options:

  • Use markdown. Use plain text.
  • Remember name/email/url (set a cookie)

Follow Techdirt
Insider Shop - Show Your Support!

Essential Reading
Techdirt Insider Chat
Recent Stories

This site, like most other sites on the web, uses cookies. For more information, see our privacy policy. Got it
Close

Email This

This feature is only available to registered users. Register or sign in to use it.