Presentation "Screen scraping with ScraperWiki" from the Big Clean workshop,
Would you like to know more? Many more presentations, conference videos, photos, reports, and other documents are available in the NTK institutional repository: http://repozitar.techlib.cz (English version: http://repozitar.techlib.cz/?ln=en)
Talk from the 4th WP conference: WordPress security. Current statistics, basic attacks, scanning WordPress, iThemes Security, Fail2Ban, Web Application Firewall.
More info at: http://edu.lynt.cz/course/bezpecnost-wordpressu
The National Library of the Czech Republic has been archiving Czech websites since 2000 through its Webarchiv project. It currently holds over 245 TB of data, representing billions of digital objects and over 1.6 million second-level domains under .cz. The full archive is accessible on site at the library, and selective harvests are accessible online. The project focuses on long-term preservation of Czech web content while addressing legal issues around copyright and access. The department employs 3.5 people and uses software such as Heritrix for crawling and OpenWayback for access.
This document discusses web archiving in the Czech Republic. It explains who archives the web, how it is archived, and why web archiving is important. The National Library of the Czech Republic leads web archiving efforts and works with international partners such as the International Internet Preservation Consortium. It has archived over 200 TB of web data using software such as Heritrix and OpenWayback. Formats such as WARC (for storing captured content) and CDX (for indexing captures) are used to record archived web pages and their relationships over time.