Data modelling and Citizen Science: impact of user-generated content within the PIA research project

Abstract:

<aside> 💡 Participatory Knowledge Practices in Analogue and Digital Image Archives (PIA) is an interdisciplinary project based on three photographic collections (Atlas of Swiss Folklore, Family Kreis, Ernst Brunner) of the Swiss Society for Folklore Studies (SSFS). It aims to "design a visual interface with machine learning-based tools to make it easy to annotate, contextualise, organise and link both images and their meta-information, to deliberately encourage the participatory use of archives".

The human dimension is at the heart of the PIA project: crowdsourcing allows metadata to be enriched and corrected, and end users can also generate their own collections, or stories, with new associated metadata and upload personal content (call for images).

A new data model must therefore take particular account of the following elements:

  1. Management of existing (non-controlled) keywords and the forthcoming folksonomy;
  2. Metadata correction and enrichment through crowdsourcing;
  3. Validation of metadata created by Machine Learning methods (notably Object Detection and Visual Text Co-Embedding);
  4. Annotation and transcription of digital surrogates and born-digital content, in a form compatible with the W3C Web Annotation Data Model;
  5. Linked Data reconciliation of entities (agent, concept, place) on the basis of the existing data model and the user-generated metadata;
  6. Content upload that could, to some extent, be compatible with the International Image Interoperability Framework (IIIF).

PIA is as much about traditional crowdsourcing as it is about creating a modern Citizen Science platform that enables new uses and helps to streamline scholars' research process. For this purpose, a user interface and several application programming interfaces (APIs) will be deployed to accommodate different forms of re-use by third parties.

The presentation will outline the current state of the data model, the different knowledge representations and the ways forward.

</aside>
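
To make element 4 of the list above more concrete, the following is a minimal sketch of what a user-generated annotation on an image region could look like when serialised according to the W3C Web Annotation Data Model. It is an illustration only, not the PIA data model: the helper `make_annotation`, all `example.org` identifiers and the sample tag text are hypothetical, and the target is assumed to be an IIIF-style canvas addressed by a URI.

```python
import json


def make_annotation(anno_id, target_id, xywh, text, creator_id, motivation="tagging"):
    """Build a W3C Web Annotation (as a dict) on a rectangular image region.

    All identifiers passed in are assumed to be URIs; the function name and
    signature are hypothetical and only serve this illustration.
    """
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        # "tagging" and "commenting" are motivations defined by the W3C model
        "motivation": motivation,
        "creator": {"id": creator_id, "type": "Person"},
        "body": {
            "type": "TextualBody",
            "value": text,
            "format": "text/plain",
        },
        "target": {
            "source": target_id,
            # Media fragment selector: region given as x,y,width,height in pixels
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical example: a user tags a region of a digitised photograph
annotation = make_annotation(
    anno_id="https://example.org/annotations/1",
    target_id="https://example.org/iiif/photo-1/canvas/1",
    xywh=(10, 20, 300, 400),
    text="hay harvest",
    creator_id="https://example.org/users/42",
)

print(json.dumps(annotation, indent=2))
```

Keeping the creator and motivation explicit in each annotation is one way user-generated tags could later be distinguished from machine-generated or curated metadata during validation.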

Resources: