The development of new media such as the Internet and the World Wide Web challenges libraries to identify, catalogue and preserve Internet resources. As with traditional resources, the National and University Library in Zagreb is responsible for collecting, describing, storing and providing access to web resources as an integral part of the national cultural heritage.
Web resources differ from traditional publications in many respects: their location, content and size change frequently, their life cycle on the Internet is short and unpredictable, and so on. To preserve these resources for posterity, the National and University Library in Zagreb, in cooperation with the University Computing Centre (Srce), established in 2004 a system for archiving Internet content, the Croatian Web Archive.
Archiving. A process of cataloguing, crawling and providing access to web resources.
Crawler (harvester, gatherer). A robot used for crawling web resources.
Integrating resource. A bibliographic resource that is added to or changed by means of updates that are integrated into the whole and do not remain discrete, e.g. updating web sites.
Legal deposit copy. A legal provision by which publishers and producers have to deliver a fixed number of copies of each publication to a library or a similar institution.
Publisher. The person or corporate body with the financial and/or administrative responsibility for the release of a publication. Everybody who publishes contents on the web is considered to be a publisher.
Web archive. A system enabling long-term storage, protection and access to electronic resources published on the Internet.
Web publication (web resource, online publication). An electronic document made available to the public via the Internet.
Web site (web page, Internet page). A location on the World Wide Web; a set of web pages identified by a unique URL that make up a whole.
The same general criteria are applied to printed and web resources:
The following are catalogued and archived: news portals, thematic portals and other portals; web sites of institutions, associations, clubs, and scientific and research projects; journals; books; selected personal pages; personal, collective and thematic blogs that serve as sources of information on contemporary culture and on economic, social and political trends; blogs with significant influence on public life whose authors write under their real names; and selected forums that conform to the criteria given above.
The following are not catalogued: search engines, games, advertising pages, pages of companies and businesses, preliminary versions of publications, mailing lists, chat, resources distributed exclusively by e-mail, intranet resources, most personal pages, blogs and forums, and pages that contain links to texts from other resources.
Digitised resources in digital collections of other institutions and other web archives are not catalogued.
The National and University Library in Zagreb (NSK), in collaboration with the University of Zagreb University Computing Centre (Srce), crawls the national web domain (.hr) once a year. In addition, NSK periodically crawls web sites related to topics and events of national importance.
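As an illustration only, the scope of a national domain crawl could be expressed as a simple host check. The sketch below is not the software used by the Croatian Web Archive; the function names in_national_domain and select_seeds are hypothetical, and it assumes that a URL is in scope when its host name ends in the .hr top-level domain.

```python
from urllib.parse import urlsplit


def in_national_domain(url: str, tld: str = "hr") -> bool:
    """Return True if the URL's host lies under the given top-level domain.

    Hypothetical helper: the scoping rule (host ends in ".hr") is an
    assumption made for illustration, not the archive's documented policy.
    """
    host = urlsplit(url).hostname  # lower-cased host without port, or None
    if not host:
        return False
    host = host.rstrip(".")  # tolerate a trailing root dot
    return host == tld or host.endswith("." + tld)


def select_seeds(urls):
    """Keep only the URLs that fall inside the national domain."""
    return [u for u in urls if in_national_domain(u)]
```

For example, select_seeds would keep "https://www.nsk.hr/" and drop "http://hr.example.com/", because in the latter "hr" is only a subdomain label and not the top-level domain. Web sites crawled for topics and events of national importance may lie outside .hr, so a thematic crawl would need its own list of seed URLs.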