About

Zeit.shift

Zeit.shift is a cooperation between the Dr. Friedrich Teßmann Library (Bolzano, Italy), the Universitäts- und Landesbibliothek Tirol (Innsbruck, Austria) and Eurac Research (Bolzano, Italy). Funded by the European Regional Development Fund and Interreg V-A Italia - Austria 2014-2020, the project seeks to preserve, develop and communicate the cultural and textual heritage of the historical region of Tyrol.

The project focuses on historical newspapers written in German, mostly in Fraktur script, which are currently scattered across Tyrol and only partially digitised. The goal of the project is twofold:

  1. digitise some 500,000 pages of Tyrolean papers published between 1850 and 1950 and gather them in a single, freely accessible web platform;
  2. promote participatory culture research by inviting citizens to actively curate, explore and engage with the data to accelerate research and create new knowledge.

Given the size of the corpus (approx. 500,000 pages), the more people help curate the data and spread the word about Zeit.shift, the more searchable the newspapers become and the longer the historical memory of Tyrol will be preserved.

Digitisation

Zeit.shift is scanning the newspapers and running the scans through Optical Character Recognition (OCR). OCR is the automatic conversion of printed text (e.g., an image of a newspaper page) into machine-readable digital text (e.g., a Word document). This process often introduces errors, especially with historical sources: faded ink, complex typefaces or poor-quality scans all reduce the accuracy of the OCR engine. A digital text containing OCR mistakes is known as "noisy" text. The noisier the text, the harder it is to use and search.


Here is an example of a "noisy" digital text in Zeit.shift:
Tiroler Land-Zeitung, 21st December 1918, p. 8: Digital scan.

Tiroler Land-Zeitung, 21st December 1918, p. 8: Noisy OCR (mistakes in red).
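
One common way to put a number on how noisy an OCR output is involves comparing it with a manually corrected transcription and computing the character error rate (CER): the edit distance between the two texts divided by the length of the correct one. The sketch below is only an illustration and is not necessarily the measure used by Zeit.shift; the function names and the sample strings are invented for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def character_error_rate(ocr_text: str, reference: str) -> float:
    """Edit distance between OCR output and reference, per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(ocr_text, reference) / len(reference)


# Invented example: the OCR engine read "u" as "n" in one word.
reference = "Tiroler Land-Zeitung"
ocr_output = "Tiroler Land-Zeitnng"
print(f"CER: {character_error_rate(ocr_output, reference):.2%}")
```

A CER of 0 means the OCR output matches the reference exactly; the higher the value, the noisier the text.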

The newspapers

To be digitised

The Tyrolean newspapers we are digitising are held at the Friedrich Teßmann and Innsbruck University libraries. Here is the complete list (titles shortened): Innsbrucker Zeitung, Alpenland, Alpenländer Bote, Der Arbeiter, Volksruf, Gardasee-Post, Neueste Zeitung, Neueste Morgenzeitung, Innsbrucker Neueste, Innsbrucker illustrierte neueste Nachrichten, Abendblatt, Innsbrucker Abendblatt, Der Oberländer, Der Südtiroler, Tiroler Bauernzeitung, Tiroler Landbote, Der Landbote, Tiroler Grenzbote, Tiroler Volksblatt, Tiroler Land-Zeitung, Tiroler Gemeindeblatt, Alpenrosen, Oberinntaler Wochenblatt, Neue Inn-Zeitung, Tiroler Post, Die Post, Tiroler Sonntagsbote, Der Tiroler Wastl, Der Widerhall, Tiroler-Vorarlberger Bienen-Zeitung, Tiroler Bienen-Zeitung, Alpenländische Bienenzeitung, Unterinntaler Bote, Haller Wochenblatt, Sterne und Blumen, Volkszeitung Innsbruck, Deutsche Volkszeitung.

Available in Historypin

Online
  • Unterinntaler Bote, 1900 (110 adverts)
  • Schwazer Lokalanzeiger Kreisblatt, 1928 (203 adverts)
  • Meraner Zeitung, 1920 (1897 adverts)
  • Tiroler Grenzbote, 1923 (1602 adverts)
  • Tiroler-Vorarlberger Bienen-Zeitung, 1924-1925 (54 adverts)
  • Lienzer Zeitung, 1919 (104 adverts)
  • Pustertaler Bote, 1924 (409 adverts)
  • Bozner Nachrichten, January-April 1921 (2109 adverts)
Upcoming
  • Unterinntaler Bote, 1910
  • Reuttener Nachrichten, 1931
  • Der Oberländer, 1909
  • Haller Lokalanzeiger, 1934
  • Brixner Chronik, 1918

Technical overview

OCR was performed using ABBYY FineReader Engine 11. The digitised material (TIFF image scans and ALTO XML files) amounts to some 13 TB of data.
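
To give an idea of what the ALTO XML files contain, here is a minimal sketch that reads the recognised words from an ALTO document using Python's standard library. In ALTO, each recognised word is a String element whose CONTENT attribute holds the text and whose optional WC attribute holds the engine's word confidence (between 0 and 1). The sample document, the function names and the 0.5 threshold are assumptions made for illustration; they are not taken from the Zeit.shift data or pipeline.

```python
import xml.etree.ElementTree as ET

# Invented, heavily shortened ALTO sample (real files also carry
# layout coordinates, styles and metadata).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="Tiroler" WC="0.95"/>
    <SP/>
    <String CONTENT="Zeitnng" WC="0.41"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def extract_words(alto_xml: str) -> list:
    """Return (word, confidence) pairs; confidence is None if WC is absent."""
    root = ET.fromstring(alto_xml)
    words = []
    for element in root.iter():
        if local_name(element.tag) == "String":
            wc = element.get("WC")
            confidence = float(wc) if wc is not None else None
            words.append((element.get("CONTENT", ""), confidence))
    return words


def low_confidence_words(alto_xml: str, threshold: float = 0.5) -> list:
    """Words the OCR engine was unsure about (hypothetical threshold)."""
    return [word for word, confidence in extract_words(alto_xml)
            if confidence is not None and confidence < threshold]


print(extract_words(SAMPLE_ALTO))
print(low_confidence_words(SAMPLE_ALTO))
```

Words with a low confidence value are natural candidates for manual checking, although a high value does not guarantee that a word was recognised correctly.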

Citizen science

Zeit.shift falls under the crowdsourcing and distributed intelligence typology of citizen science projects (Haklay, 2013). As such, it relies on the cognitive and observation abilities of the participants to crowdsource research data.

Historypin

  • Task type: macrotask
  • Purpose: tag adverts geographically and semantically, collect additional expert knowledge from the public, foster remix culture

Ötzit! game

  • Task type: microtask
  • Purpose: correct the OCR of individual words

Dataset

The citizen science component of the project focuses on newspaper issues published approximately 100 years ago (1918-1924).

Project details

Team

  • Landesbibliothek Dr. Friedrich Teßmann (Lead partner): Johannes Andresen
  • Universitäts- und Landesbibliothek Tirol: Silvia Gstrein, Maritta Horwath, Christian Kössler, Barbara Laner, Johanna Walcher
  • Eurac Research: Andrea Abel, Paolo Brasolin, Greta Franzini, Verena Lyding, Egon Stemle, Anna Tramarin (intern), Giovanni Moretti (external advisor)

Associated partners

  • Euregio Tirol-Südtirol-Trentino
  • Abteilung Tiroler Landesarchiv der Tiroler Landesregierung
  • Tiroler Landesmuseen
  • Südtiroler Kulturinstitut
  • Tiroler Bildungsforum
  • Südtiroler Landesarchiv der Autonomen Provinz Bozen-Südtirol
  • Bibliotheksverband Südtirol

Duration

  • October 2020 - June 2023

Budget

  • 658,000 Euros

Objectives

  1. Digitisation of Tyrolean historical newspapers
  2. Development and implementation of a citizen science initiative to curate and enrich the digitised newspapers
  3. Computational linguistic processing of the digitised newspapers to improve data search and visualisation options
  4. Digital access to the complete collection of newspapers via a web platform to support research and education

Keywords

citizen science, games with a purpose, digitisation, historical newspapers, Tyrol, natural language processing, heritage science, digital cultural heritage, cultural heritage volunteering, cultural heritage crowdsourcing, digital humanities, public history, public digital humanities

Licensing

  • Full newspaper scans from the Teßmann Digital Library are licensed under CC BY-NC 4.0
  • Full newspaper scans from the UniInnsbruck Digital Library are licensed under CC BY 4.0
  • Website and activity images are licensed under CC BY 4.0
  • Data and annotations contributed by citizens are licensed under CC0 1.0
  • Ötzit! is licensed under the MIT licence

Credits

Zeit.shift sources icons from Icongeek26 and Freepik at Flaticon, and audio files from Freesound.