Friday 26 March 2010

NLA newspaper digitisation program

This is our proud national collaboration project to digitise historic (out-of-copyright) newspapers (1803-1954) owned by each state. The greatest thing about this project is that it invites the public to join in correcting errors in the digitised newspaper articles. It was initiated by the National Library of Australia in 2007. I have browsed and read through its program overview. I was amazed by its Project Details page, which covers: project details and progress reports, the processing and OCR (Optical Character Recognition) of newspapers, metadata, system architecture, content management system, search and delivery system, statistics of service usage, and public collaborative text correction and tagging of newspaper articles. It also lists which newspapers are available and how we, as members of the public, can get involved in the project.
The public contribution to text correction opened my eyes to newspaper digitisation, and to how a national digitisation program can benefit the entire country through the collaboration of each state and community libraries. This program not only provides search capability for historic newspaper articles, it also serves as a teaching tool for primary, secondary and ESL students learning Australian history, English and so on. It also benefits researchers, especially those who specialise in Australian history and genealogy.


Here comes the actual site, with its wonderfully quick and easy online tutorial on how to search and correct the text. The article I chose was about the launch of the Melbourne Symphony Orchestra (MSO) in 1906. I guess the original article must be really old, and the digitised text produced by OCR is really poor for this particular article. Please refer to the first 13 lines of the article on the left. I can barely make out the correct spelling, and neither can the OCR. The accuracy of the OCR technique used in this project ranges from 71% to 98.02%. Still, I can read the original with the zoom function, and I have made some corrections to the first 6 lines of the text. That's why the public is encouraged to participate and contribute to the correction of the digitised text.
I love contributing to this project. It is simple to use and has no strict rules, unlike Wikipedia and other scholarly collaboration sites. This is good and suits my current level of collaborative work on public projects.
THE PROJECT IS REALLY COOL!!!

1 comment:

  1. I liked being able to edit articles that were of particular interest to ME. In this way both the user and the library benefit from the project: me, by learning about the history of The Athenaeum, and the NLA, through the now-corrected text. KP
