Loss of Croatian web resources

At the last IFLA conference in Cape Town, a paper was published by Karolina Holub and Ingeborg Rudomino: “A decade of web archiving in the National and University Library in Zagreb”.

My attention was drawn to the following passage:

The NSK [National and University Library in Zagreb] started cataloguing online content in 1998 after the Law was passed and up until 2003 783 resources have been catalogued. Unfortunately, during that period, owing to financial and technical difficulties, and inadequate infrastructure the ingest and storage of these type of resources was not done. This resulted in an irreversible loss of significant part of web content.

I appreciate the frankness of this article, but it would be very interesting to know what really happened. The rest of the article shows that they have learned from this experience, as they are “by now the only web archive with its metadata available in Europeana.”

Loss of research data … or not?

There is a growing awareness among scientists that their data sets need attention if they are to remain usable in the future, and some of them have learnt this the hard way. An interesting article by T. Vines et al. in Current Biology, Volume 24, Issue 1, 94-97, 19 December 2013 describes a study into the availability of research data years after the related article was published. The researchers took 516 publicly available articles, published between 1991 and 2011, and tried to obtain the related data sets via the authors' email addresses, found either in the article or by searching the web. Vines and his colleagues received 101 data sets, and another 20 data sets were reported to be still in use. For older papers in particular, the related data sets were no longer readily available. The original authors were asked for the data, and they gave a variety of reasons why they could not provide them.

Responses included authors being sure that the data were lost (e.g., on a stolen computer) or thinking that they might be stored in some distant location (e.g., their parent’s attic) to authors having some degree of certainty that the data are on a Zip or floppy disk in their possession but no longer having the appropriate hardware to access it. In the latter two cases, the authors would have to devote hours or days to retrieving the data.

The article was discussed in Nature, where two other cases of lost data sets were mentioned. I cite them here, as they are too small to put in the Stories part of the Atlas.

They show that “benign neglect” is, after all, often not the way to preserve digital information.

Agricultural researcher Melvin McCarty, for instance, spent 15 years between 1958 and 1973 recording the life cycles of plants and grasses near Lincoln, Nebraska. Forty years later, ecologist Lizzie Wolkovich went searching for McCarty’s data as part of an effort to tie together experiments exploring how rising temperatures affect plant life cycles. But McCarty had died, and his raw data could not be found. “There is nothing we can replicate now. The loss of the long-term data set is very sad,” says Wolkovich, who works at the University of British Columbia in Vancouver.

A similar fate befell the raw data collected in the 1980s by Otto Solbrig, a biologist at Harvard University in Cambridge, Massachusetts, on species of violets in New England. Plant biologist Sydne Record at Michigan State University in East Lansing wrote to him in 2009 asking for the original data, to test out a mathematical analysis of population viability that she was developing — but Solbrig didn’t have them. “We had at least 20 big folders with those data, but nobody was interested in them so we threw them away,” he says.


Hyves, a social media network


In 2004 the social media network Hyves started in the Netherlands and became a big success, with at one point 10 million subscribers (the Netherlands has a population of 17 million). In 2010 the Telegraaf Media Group (TMG) took over the network and added functionality to it, with a focus on online gaming. Recently TMG announced that it will close the social media site and restrict its activities to online gaming. But over almost ten years, users had added 1 petabyte of content to the network. Would this be deleted, as has happened in other cases? No! Instead, Hyves offers a service to rescue the personal content: users can request a copy of their content (images, blogs, conversations etc.) via email, and this will be sent to them by the end of the year. Because, as Hyves says in the press release, “this content is owned by the users”.

Very quickly after this announcement, another company, MijnAlbum.nl, offered users a way to download their images from the Hyves site, even before Hyves starts its own service. This immediately became a success, with 2 million downloads a day.

It seems that awareness of the need to preserve personal digital archives is growing among both the general public and service providers: a good sign!