The news was announced to the wider world on the BBC Breakfast news programme by Richard Gibby, who is currently one of the institution's project managers. It may be fascinating to learn that there are nearly 5 million UK websites and that the Library plans to archive one billion pages a year (a US billion, i.e. 1,000,000,000), but it is a) hardly news and b) only one billion pages a year.
The British Library has in fact been operating a dedicated web archiving service since 2004. This is actually an extension of the Legal Deposit Scheme, which has five other participating libraries in the British Isles.
It remains to be seen whether the average lifespan of a webpage is really only 75 days, as Richard Gibby said. It is true, though, that many pages, including blogs, are either moved or augmented; for those who wish to archive blogs in particular, there is WebCite. When he says that many pages and sites have been lost forever, he is undoubtedly correct, but what he didn't mention is the Internet Archive, which at the moment claims to have archived 281 billion webpages, and that is without counting its own dedicated pages, which constitute a repository for books, concerts, documentaries, radio broadcasts and much more.
You can do your bit to preserve the Internet. Although its robots crawl the web constantly, the Internet Archive does not detect and archive every website, much less every single page. So when you alight on a blog, especially a recent one, and want to preserve that particular page, copy its URL and add it to the Wayback Machine (see below). If the page has not already been captured and is not protected by a robots.txt file, it will be archived in due course.
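For readers who would rather script this than paste URLs by hand, here is a minimal sketch of the two checks described above: whether a robots.txt file would keep a page out of the archive, and what address to visit to request a capture. It makes no network requests; it only builds the URLs. The function names are hypothetical, the "ia_archiver" crawler name is an assumption about which user agent the rules are matched against, and the two archive.org addresses reflect the publicly documented save and availability endpoints as I understand them, which may change.

```python
from urllib.parse import quote, urlencode
from urllib.robotparser import RobotFileParser

# Assumed endpoints (not taken from the article; they may change over time).
SAVE_PREFIX = "https://web.archive.org/save/"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def save_url(page_url):
    """Build the address to open in a browser to request a capture."""
    # Keep URL punctuation intact; only unsafe characters such as spaces are escaped.
    return SAVE_PREFIX + quote(page_url, safe=":/?&=%#")


def availability_url(page_url):
    """Build the address that reports whether a capture already exists."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def blocked_by_robots(robots_txt, page_url, agent="ia_archiver"):
    """Return True if the given robots.txt text disallows the page for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, page_url)


if __name__ == "__main__":
    page = "https://example.org/blog/post"
    robots = "User-agent: *\nDisallow: /private/\n"
    if blocked_by_robots(robots, page):
        print("This page is excluded by robots.txt.")
    else:
        print("Check for an existing capture:", availability_url(page))
        print("Request a new capture:", save_url(page))
```

The robots.txt text is passed in as a string so that the check can be tried offline; in practice you would first download the file from the root of the site in question.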
You can also, if you wish, donate a CD of your own website to a Legal Deposit Library for posterity; in the UK this is the British Library and/or the Agency for the Legal Deposit Libraries.