In a very nice visualisation, the “Information is Beautiful” people present an overview of the major data breaches of the past few years, categorised by method of leak and by type of organisation from which the data were stolen: banks, health services, insurance companies, you name it. We have all experienced such incidents, for example when we got a message to change our LinkedIn, Dropbox or Evernote password after a breach occurred.
The underlying data of this visualisation can be examined in a separate file, which describes each incident with a brief explanation and, when available, a reference to the original source. This information offers some interesting examples of things that went wrong in real life, incidents that might also happen in the world of digital preservation. We are familiar with lists of (security) risks; this visualisation shows the evidence for these risks.
Stolen laptops are a frequently recurring cause, although I don’t expect many organisations to keep their preserved data on them. But (unhappy) former employees and a lack of strict authorisation procedures for hired companies can also lead to sensitive personal information being revealed. Both small and big organisations can fall victim to hackers, theft or stupidity: incidents that might lead to leaks of information that should be kept private, such as credit card numbers, health information, address information and so on.
One could think that this kind of sensitive data is less likely to be present in National Libraries than in, for example, data centres preserving social science data. But (National) Libraries also preserve material with a commercial value, for example contemporary e-books, e-journals, movies, music etc. These materials are in their custody and should not be exposed to these risks. We talk a lot about how digital objects might be affected by technical risks. But are we sure we take enough measures to prevent preserved collections from appearing in this visualisation?