The OCR, Tech Stack and Team Behind an (Almost) Fully Digital Vaccination Center

“Move fast and vaccinate” could have been the motto of the most digital vaccination center in Europe. Against all odds, the deadline was reached, and the center vaccinated over a million people in 2021.

Behind the scenes, there was an unstructured process driven by a newly formed team. Hardware and software had to be built from scratch. Apart from the vaccination, data security and documentation had to be maintained, resulting in up to 100K sheets printed daily.

In this article, we break down the project:

  • Why they ditched the iPad idea and decided to print and then scan

  • How QR codes saved the day five months later

  • What tech the core team built and how it worked out in production


Flashback: The time we got the vaccines

A picture of the vaccination center under construction

It was the end of 2020, and most parts of the world were still in lockdown. While people struggled to cope with the new reality, logistical shortages, and working from home, scientists were busy sequencing the SARS-CoV-2 virus, developing a vaccine, and running trials.

Once the trials were completed and the vaccine was deemed effective in November, there was a rush to get the systems ready to perform vaccinations at scale.

The problem was that nobody had built systems like this before. The last pandemic of this scale was a century ago. Although information technology has advanced substantially since then, no one anticipated this.

Even now, we have not moved into a fully digital world. The center operators required the use of paper due to regulatory demands. Meanwhile, all the information needed to be recorded digitally at the Robert Koch Institute (RKI), the public health agency that kept track of vaccinations.

And all of it was built in a kamikaze-style sprint of just five weeks!

This is the story of the vaccination center of a big city in a big country in Europe. We try to keep this article anonymous so that we can give you as much information as possible. Our policy is to keep our sources anonymous and give them a safe space to talk about project failures or successes in detail.


Project setup: Facing rough realities and timelines

When we began at the end of November 2020, we had nothing—just a deadline and a huge number of people looking to get vaccinated.

Apart from the deadline, we also had a couple of constraints:

  1. The IT infrastructure had to be completely shielded from the outside world. As a compromise, the developers got a VPN connection into the machines to debug and deploy.

  2. Every vaccination needed to be signed by a doctor and stored on paper in an archive.

  3. The printouts needed to contain a preliminary proof of vaccination.

  4. Around 8000 people were expected to go through the centers every day.

Project timeline

Let’s talk about risk mitigation.

The biggest risk was not being able to open the vaccination center by January 4th. That’s why an agile approach was chosen. Everyone had their responsibilities, and they had to work together to come up with a production-ready system by January 4th.

The next biggest threat was IT failure. Several things, from a failed network to software errors, could lead to the system not running properly or not running at all. That’s partly why paper was chosen: it would serve as a backup. If the IT stood still, paper could still be used to keep up the throughput of the system.

Of course, the paper introduced another problem: data synchronization. Everything recorded on paper had to be printed first and scanned later, and scanning can introduce errors.

The calm before the storm


Implementation: How to design a print-and-scan pipeline

The center was completely shielded from the internet, only reachable via VPN

The basic workflow was this: appointments were booked by phone or requested online. The hotline was notoriously overloaded, and people waited for hours to get an appointment. The call center recorded the names and assigned appointment slots, so that only a certain number of patients would be served every day.

The call center would then send the patient data via an API into the data center, which prepared the list of patients for the day. The data center ran a Ruby on Rails app, which printed and scanned the pages. We’ll talk about the tech stack later.

The problem with this approach was data validity. When a patient arrived, the recorded name, address, or birthday could turn out to be wrong. The front desk staff needed to check the patients' passports and correct the data if it deviated.

Test run at the front desk

This picture was taken during a test run of the front desk. At 8000 patients per day, each front desk handled 30 to 60 patients per hour. Each patient was handed 10 sheets of paper. That’s up to 600 sheets of paper per hour processed at each front desk.
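The paper volume can be checked with a quick back-of-the-envelope calculation. The sketch below uses only figures from this article (8000 patients per day, 10 sheets per patient, 30 to 60 patients per hour per desk). The 12-hour opening day is an assumption for illustration, not a number from the project, so the resulting desk counts are only indicative.

```python
import math

PATIENTS_PER_DAY = 8000    # from the article
SHEETS_PER_PATIENT = 10    # from the article
OPENING_HOURS = 12         # ASSUMPTION: opening hours are not stated in the article


def sheets_per_day(patients: int = PATIENTS_PER_DAY) -> int:
    """Total sheets printed for one day of appointments."""
    return patients * SHEETS_PER_PATIENT


def sheets_per_desk_hour(patients_per_hour: int) -> int:
    """Sheets one front desk hands out per hour at a given patient rate."""
    return patients_per_hour * SHEETS_PER_PATIENT


def desks_needed(patients_per_hour_per_desk: int,
                 opening_hours: int = OPENING_HOURS) -> int:
    """Front desks required to serve the daily load within the opening hours."""
    patients_per_hour = PATIENTS_PER_DAY / opening_hours
    return math.ceil(patients_per_hour / patients_per_hour_per_desk)


if __name__ == "__main__":
    print("sheets per day:", sheets_per_day())
    for rate in (30, 60):
        print(rate, "patients/h:",
              sheets_per_desk_hour(rate), "sheets per desk and hour,",
              desks_needed(rate), "desks needed")
```

The daily total of 80,000 sheets is in line with the "up to 100K sheets printed daily" mentioned at the start of the article.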

Fun fact on the side: printing that much paper led to constant printer failures and replacements. The printers frequently ran out of toner and paper, and 8 runners were constantly racing around the centers to refill them.

After the front desk, the patients were guided through the so-called “Cluster”: the unit of operation with front desks, a waiting room, 4 doctors, and checkout.

At the checkout, their 10 pages were scanned. After the scan, the patients were handed the original of their vaccination document (one of their 10 pages) and could go home.

Scaling fun fact: the center used a single printer model so that it could purchase toner in bulk. The downside, of course, was the reliance on a single type of toner. By mid-2021, this one center had bought up the worldwide supply of it.


How to scale a scan process

Each scanner worked without much manual setup. Just drop in all pages at once and hit a button. All the bundled pages would then arrive in an SMB file share as a multi-page PDF.

So far so good. As expected, pages did arrive as orphans or mixed in with other patients' pages, but this wasn’t a major headache.
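Orphaned and mixed pages are manageable when every page identifies itself. The following is a minimal sketch of that idea, assuming each scanned page has already been decoded into a patient ID and a page ID from its QR code. The `ScannedPage` type, the page numbering from 1 to 10, and the function names are hypothetical and not taken from the actual system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

# 10 sheets per patient is from the article; numbering them 1..10 is an assumption.
EXPECTED_PAGES = frozenset(range(1, 11))


@dataclass(frozen=True)
class ScannedPage:
    patient_id: str  # decoded from the QR code on the page
    page_id: int     # decoded from the QR code on the page
    bundle: str      # name of the multi-page PDF the page arrived in


def group_by_patient(pages: Iterable[ScannedPage]) -> Dict[str, Dict[int, ScannedPage]]:
    """Sort pages to their patients, regardless of which bundle they came in."""
    grouped: Dict[str, Dict[int, ScannedPage]] = defaultdict(dict)
    for page in pages:
        grouped[page.patient_id][page.page_id] = page
    return dict(grouped)


def find_incomplete(grouped: Dict[str, Dict[int, ScannedPage]]) -> Dict[str, List[int]]:
    """Return the missing page IDs for every patient with an incomplete set."""
    missing = {}
    for patient_id, by_page in grouped.items():
        gaps = sorted(EXPECTED_PAGES - set(by_page))
        if gaps:
            missing[patient_id] = gaps
    return missing


def mixed_bundles(pages: Iterable[ScannedPage]) -> List[str]:
    """Return the bundles that contain pages of more than one patient."""
    patients_per_bundle = defaultdict(set)
    for page in pages:
        patients_per_bundle[page.bundle].add(page.patient_id)
    return sorted(b for b, ids in patients_per_bundle.items() if len(ids) > 1)
```

Because grouping relies on the QR data rather than on the order in the scanner tray, a page that ends up in the wrong bundle still lands in the right patient record.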

The PDFs were then picked up by a cron job and fed back into the Rails app, which ran these steps:

  1. Splitting the PDF into single pages

  2. Attaching each page of the PDF to a database entry

  3. Sending each page to an OCR API, which returned JSON

  4. Merging the JSON data back into the patient record

  5. Offering the patient record alongside each PDF page in an interface to ensure data validity
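The five steps can be sketched as a small pipeline. The production system was a Rails app, so this Python version is only an illustration of the flow, not the actual code. The PDF splitter and the OCR client are passed in as functions because the article does not name the tools used, and the JSON shape (`patient_id`, `fields`) as well as the two `fake_*` stand-ins are assumptions made so that the sketch runs on its own.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class PageEntry:
    """One scanned page, stored next to its OCR result (step 2)."""
    image: bytes
    ocr: Optional[dict] = None


@dataclass
class PatientRecord:
    patient_id: str
    fields: Dict[str, object] = field(default_factory=dict)
    pages: List[PageEntry] = field(default_factory=list)
    needs_review: bool = False


def process_bundle(
    pdf_bytes: bytes,
    split_pdf: Callable[[bytes], Iterable[bytes]],
    ocr_page: Callable[[bytes], str],
    records: Dict[str, PatientRecord],
) -> List[PatientRecord]:
    """Run steps 1 to 5 for one scanned bundle; return the records to review."""
    touched: Dict[str, PatientRecord] = {}
    for image in split_pdf(pdf_bytes):                      # step 1: split
        entry = PageEntry(image=image)                      # step 2: one entry per page
        entry.ocr = json.loads(ocr_page(image))             # step 3: OCR returns JSON
        patient_id = entry.ocr["patient_id"]
        record = records.setdefault(patient_id, PatientRecord(patient_id))
        record.fields.update(entry.ocr.get("fields", {}))   # step 4: merge into record
        record.pages.append(entry)
        record.needs_review = True                          # step 5: queue for a human check
        touched[patient_id] = record
    return list(touched.values())


# Stand-ins so the sketch runs without a scanner or an OCR service.
def fake_split(pdf_bytes: bytes) -> List[bytes]:
    return pdf_bytes.split(b"\n")


def fake_ocr(image: bytes) -> str:
    patient_id, checkbox = image.decode().split(":")
    return json.dumps({"patient_id": patient_id, "fields": {checkbox: True}})


if __name__ == "__main__":
    db: Dict[str, PatientRecord] = {}
    bundle = b"P1:consent\nP1:dose_given\nP2:consent"
    for record in process_bundle(bundle, fake_split, fake_ocr, db):
        print(record.patient_id, record.fields)
```

Keeping the page image attached to the record is what makes the last step possible: a reviewer can compare the merged fields with the scan they came from.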


The manual team behind the scanning

At first, we expected to have many errors during the scanning process. In order to keep the data clean, the center operators hired a team of eight to work through the data every evening after closing.

Why? Each center needed to report correct data to the RKI for statistics. There was a window of four hours between the center closing and sending the final report to the RKI. We wanted to use this window to make sure correct data was being sent every day.

The interface in the Rails app looked something like this. Each of the 8 data cleaners was able to log in and work on their own.

The Rails app interface

Later, it became clear that the OCR worked so well that at most one person was necessary to polish all arriving patient records.


Wait… OCR—what?

It’s likely that you have heard of OCR but haven’t worked with it. The more digital our world gets, the less often the term optical character recognition (OCR) comes up.

OCR basically means converting a scanned document into machine-readable data. A whole separate team worked on the conversion from the scanned page (as PDF) to the result (as JSON).

Some smart cookie might say that this wasn’t exactly OCR, as hardly any characters were recognized. The software mostly checked whether checkboxes were filled and read the label of the vaccination dose.

This is how a typical page from the vaccination document looked:

A sample page of the vaccination document, with the QR code in its fixed position

Every page contained a QR code in the same spot. The OCR software expected it to be in the same position to read it. From the QR, the OCR API would figure out what page the image was from. Based on that information, all other data fields would be retrieved. Those other fields were mostly checkboxes.

Why only checkboxes? Well, they did not need the patient information. They already had it in the database. So, in the end, only the vaccination batch number, agreement checkboxes, potential side effects, and signatures were recorded and then returned as JSON to be used by the Rails app.
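A template-based reader of this kind can be sketched in a few lines. The example assumes the page ID has already been decoded from the QR code (that step is omitted here) and that the page is available as a grayscale pixel grid. The template coordinates, the field names, and the fill threshold of 30% dark pixels are all hypothetical; the article does not describe how the real OCR service decided whether a box was ticked.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # grayscale rows: 0 = black, 255 = white
Box = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)

# HYPOTHETICAL templates: page ID -> field name -> checkbox region in pixels.
TEMPLATES: Dict[int, Dict[str, Box]] = {
    3: {
        "consent_given": (2, 2, 6, 6),
        "had_side_effects": (10, 2, 14, 6),
    },
}


def dark_ratio(image: Image, box: Box, dark_below: int = 128) -> float:
    """Share of pixels inside the box that are darker than the cutoff."""
    left, top, right, bottom = box
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    return sum(1 for p in pixels if p < dark_below) / len(pixels)


def read_checkboxes(image: Image, page_id: int,
                    fill_threshold: float = 0.3) -> Dict[str, bool]:
    """Look up the template for this page and report which boxes are filled."""
    template = TEMPLATES[page_id]
    return {
        name: dark_ratio(image, box) >= fill_threshold
        for name, box in template.items()
    }


if __name__ == "__main__":
    # A tiny white page with the first checkbox filled in black.
    page = [[255] * 16 for _ in range(8)]
    for y in range(2, 6):
        for x in range(2, 6):
            page[y][x] = 0
    print(read_checkboxes(page, page_id=3))
```

This also shows why the QR code had to sit in the same spot on every page: the field regions only line up if the reader knows which template to apply and the scan is aligned with it.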


About batch numbers

One of the most important things in a vaccination is the batch of the vaccine dose injected. In case of side effects, the batch number could be correlated and inspected. A strict requirement was to read out the batch numbers correctly via the OCR interface. They used a different dot matrix code, similar to QR but more compact and carrying less data. You might see such codes on letterheads sometimes.

A vaccination room

Side fact: during these million vaccinations, only very few people had side effects (in the per-mille range). I leave the conclusions to the reader, but I encourage everyone to check the data behind any theory they are reading.

Back to the batch numbers. Every vaccine dose came with a small label and a bar code, but as one vial served five patients, the label needed to be printed many times. The team used small label printers with a barcode scanner. This also helped the OCR process (printed letters are much easier to detect than handwritten ones).


Why those QR codes saved the taxpayer 5 million EUR

We have looked at the QR code above. The reader might notice the size of the code. The more data (and redundancy) you store in the code, the bigger it gets. In our case, it was actually a URL that pointed back to the system. The URL was a cool idea, but it was never used in the end.

However, what was used was the data in the URL, namely:

  • The patient ID

  • Cluster ID

  • Page ID

  • Date

  • Checksum
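A payload with these five fields can be packed into a URL and verified on the way back in. The sketch below is one way to do it, not the format the center used: the base URL, the parameter names, and the secret are placeholders, and the checksum is a truncated HMAC-SHA256, chosen here as an assumption because the article does not say how the real checksum was computed.

```python
import hashlib
import hmac
from datetime import date
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"                     # ASSUMPTION: placeholder key
BASE_URL = "https://vaccination.example/p"  # placeholder, not the real URL


def _checksum(patient_id: str, cluster_id: str, page_id: str, day: str) -> str:
    """Short keyed checksum over the four data fields."""
    message = f"{patient_id}|{cluster_id}|{page_id}|{day}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()[:8]


def build_url(patient_id: str, cluster_id: str, page_id: int, day: date) -> str:
    """Encode the fields and their checksum as the URL stored in the QR code."""
    iso = day.isoformat()
    query = urlencode({
        "patient": patient_id,
        "cluster": cluster_id,
        "page": page_id,
        "date": iso,
        "check": _checksum(patient_id, cluster_id, str(page_id), iso),
    })
    return f"{BASE_URL}?{query}"


def parse_url(url: str) -> dict:
    """Decode a scanned URL; raise ValueError if it is incomplete or altered."""
    params = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
    try:
        expected = _checksum(params["patient"], params["cluster"],
                             params["page"], params["date"])
        valid = hmac.compare_digest(expected.encode(), params["check"].encode())
    except KeyError as missing:
        raise ValueError(f"missing field: {missing}") from None
    if not valid:
        raise ValueError("checksum mismatch")
    return {
        "patient_id": params["patient"],
        "cluster_id": params["cluster"],
        "page_id": int(params["page"]),
        "date": date.fromisoformat(params["date"]),
    }


if __name__ == "__main__":
    url = build_url("P123", "C2", 3, date(2021, 3, 1))
    print(url)
    print(parse_url(url))
```

A short checksum keeps the QR code small while still catching misreads and manual typos; the shorter it is, the easier it becomes to guess, so the length is a trade-off.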

When April 2021 came around, Europe deployed the digital vaccination certificates. This huge effort involved cryptography and a number of certificate authorities to make sure nobody would be able to change or transfer their code.

Other vaccination centers (or the municipalities they operated in) had to pay pharmacies to issue these certificates. Usually, in Europe, pharmacies charged up to 48 euros per patient.

But not for patients of this vaccination center. They could download it to their phone, by using one simple thing others didn’t have: authentication.

Every QR code included the number of vaccinations, and the Rails app contained records of all these. All the patient needed to do was type in the identifier below the QR code and add their personal information, and they received the vaccination passport right away.
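The check behind such a download could look roughly like this. It is a sketch under assumptions: the article only says that patients entered the identifier and "their personal information", so the choice of last name and birth date, the record layout, and the function names are hypothetical.

```python
import hmac
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class VaccinationRecord:
    identifier: str   # the code printed below the QR code
    last_name: str
    birth_date: date
    doses: int


def _normalize(text: str) -> str:
    """Ignore case and stray whitespace when comparing names."""
    return " ".join(text.split()).casefold()


def authenticate(records: Dict[str, VaccinationRecord], identifier: str,
                 last_name: str, birth_date: date) -> Optional[VaccinationRecord]:
    """Return the record only if the identifier and personal data both match."""
    record = records.get(identifier.strip().upper())
    if record is None:
        return None
    name_ok = hmac.compare_digest(
        _normalize(record.last_name).encode(),
        _normalize(last_name).encode(),
    )
    if name_ok and record.birth_date == birth_date:
        return record
    return None


if __name__ == "__main__":
    db = {"AB12CD": VaccinationRecord("AB12CD", "Müller", date(1980, 5, 17), 2)}
    print(authenticate(db, " ab12cd ", "müller", date(1980, 5, 17)))
    print(authenticate(db, "AB12CD", "Müller", date(1981, 5, 17)))
```

The identifier alone is something the patient has (the printed sheet), and the personal data is something they know, which is why the combination could stand in for a visit to the pharmacy.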

There was no need to go to the pharmacy, stand in line, and send a 48-euro bill to the municipality.

Eventually, Europe didn’t use the vaccination passports much after all. Maybe that’s a good thing. But in that moment, it felt like a huge win and offset some of the huge cost of operating the center.

The center closed at the end of August.

It was the first center to issue a digital vaccination passport, and allegedly the only center that reported digitally to the RKI every day, without fail.

Over that time, 1.1 million people received one to three vaccination doses.

