Metadata

Examples? Ideas?

Retro Classic

VHS metadata labels

Rockstars of

Metadata
  • Artist
  • Title
  • Album
  • Date
  • Cover art

MD Types for Preservation

  • Descriptive
  • Technical
  • Environment
  • Agent
  • Rights
  • Event
  • Provenance
  • Structure

Examples:

  • Descriptive: title, album
  • Technical: resolution, bitrate
  • Environment: tech required
  • Agent: person, organization, tool
  • Rights: copyright, ownership
  • Event: an action that has a date
  • Provenance: who owned it before
  • Structure: file is part of a sequence

Important things

  • Controlled vocabulary!
  • Date/time: ISO 8601
    (2019-05-30T10:34:47+00:00)
  • Standards: Interoperability

CoVoc Examples

  • 35mm = 35 mm = 35 millimètre
  • dup pos = duplicate positive
  • de = deu = german = German = alemán
  • YUV,4:2:2,10bpc = yuv422p10le
  • Director = Directed by
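
A controlled vocabulary can be enforced with a simple lookup table that maps every accepted variant to one preferred term. The sketch below is only an illustration: the variants come from the list above, but which spelling counts as "preferred" (e.g. `deu` for German) is an assumption, and `VOCABULARY` and `normalize_term` are hypothetical names.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
# The choice of preferred term is an assumption made for this sketch.
VOCABULARY = {
    "35 mm": ["35mm", "35 mm", "35 millimètre"],
    "duplicate positive": ["dup pos", "duplicate positive"],
    "deu": ["de", "deu", "german", "alemán"],
    "yuv422p10le": ["YUV,4:2:2,10bpc", "yuv422p10le"],
    "Director": ["Director", "Directed by"],
}


def _key(value: str) -> str:
    """Collapse whitespace and ignore case so lookups are forgiving."""
    return " ".join(value.split()).casefold()


# Reverse index: normalized variant -> preferred term.
LOOKUP = {
    _key(variant): preferred
    for preferred, variants in VOCABULARY.items()
    for variant in variants
}


def normalize_term(value: str) -> str:
    """Return the preferred term, or raise if the value is not in the vocabulary."""
    try:
        return LOOKUP[_key(value)]
    except KeyError:
        raise ValueError(f"not in controlled vocabulary: {value!r}") from None


print(normalize_term("German"))       # deu
print(normalize_term("Directed by"))  # Director
```

Rejecting unknown values (instead of storing them as-is) is a design choice: it forces new terms to be added to the vocabulary deliberately.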

Normalize date/time

  • 1. May 2019
  • May 4th, 2019
  • 4.5.2019
  • 5/4/2019
  • ISO 8601
    (2019-05-30T10:34:47+00:00)
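
A minimal sketch of how the notations above could be normalized to ISO 8601 with Python's standard library. Note that `4.5.2019` and `5/4/2019` are ambiguous by themselves, so the sketch assumes the source convention is known and passed in explicitly. The convention names and `to_iso_date` are hypothetical, and month names are assumed to be English.

```python
import re
from datetime import datetime, timezone

# Hypothetical convention names mapped to strptime patterns.
FORMATS = {
    "day_month_name": "%d. %B %Y",  # 1. May 2019
    "month_name_day": "%B %d, %Y",  # May 4th, 2019 (ordinal suffix stripped first)
    "dotted_dmy": "%d.%m.%Y",       # 4.5.2019  (day first)
    "slashed_mdy": "%m/%d/%Y",      # 5/4/2019  (month first)
}


def to_iso_date(text: str, convention: str) -> str:
    """Parse a date written in a known convention and return YYYY-MM-DD."""
    # Remove ordinal suffixes such as "4th" -> "4".
    cleaned = re.sub(r"(\d)(st|nd|rd|th)\b", r"\1", text.strip())
    return datetime.strptime(cleaned, FORMATS[convention]).date().isoformat()


print(to_iso_date("May 4th, 2019", "month_name_day"))  # 2019-05-04
print(to_iso_date("4.5.2019", "dotted_dmy"))           # 2019-05-04
print(to_iso_date("5/4/2019", "slashed_mdy"))          # 2019-05-04

# A full timestamp with an explicit UTC offset:
stamp = datetime(2019, 5, 30, 10, 34, 47, tzinfo=timezone.utc)
print(stamp.isoformat())  # 2019-05-30T10:34:47+00:00
```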

Controlled Vocabulary

Your suggestions for:

  • Descriptive?
  • Technical?
  • Other?

Standards

One To Rule Them All?

XKCD: Universal Standards “Issue”

Exercise

Imagine some regular (preservation) workflows and identify:

  • Events
    Which action is done?
  • Agents
    Who is (involved in) doing it?
  • Objects
    What does it apply to?
  • Rights
    Who holds which rights to what?
  • Relationships
    How do these entities relate to each other?

Exercise

  • Event, e.g. “transcode”
  • Agent, e.g. “tool:ffmpeg”
  • Object, e.g. “preservation file”
  • Rights, e.g. “item owner”
  • Relationship, e.g. “converts”, “is copy of”

Draw (and name) relationships between entities (one way/arrow). Write down what metadata you think should be captured in that entity.
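
One possible way to write such a drawing down as data, offered only as a sketch and not as a model answer: every entity gets a kind, a name and a metadata dictionary, and every relationship is a named one-way arrow. The class names, the "access copy" object and the labels "performs" and "holds rights to" are assumptions added for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    kind: str                      # "event", "agent", "object" or "rights"
    name: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Relationship:
    source: Entity
    name: str                      # label of the one-way arrow
    target: Entity

    def __str__(self) -> str:
        return f"{self.source.name} --{self.name}--> {self.target.name}"


ffmpeg = Entity("agent", "tool:ffmpeg")
# An event is an action that has a date, so the date is captured as metadata.
transcode = Entity("event", "transcode", {"date": "2019-05-30T10:34:47+00:00"})
preservation_file = Entity("object", "preservation file")
access_copy = Entity("object", "access copy")      # hypothetical second object
owner = Entity("rights", "item owner")

relationships = [
    Relationship(ffmpeg, "performs", transcode),                # hypothetical label
    Relationship(transcode, "converts", preservation_file),
    Relationship(access_copy, "is copy of", preservation_file),
    Relationship(owner, "holds rights to", preservation_file),  # hypothetical label
]

for relationship in relationships:
    print(relationship)
```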

PREMIS

PREMIS - Environment Stack

Environment Stack

Good to Know

  • Dublin Core: Core fields (& their names)
  • METS: (MD container) descriptive, administrative, structural
  • PREMIS: Preservation metadata framework (objects, events, agents, rights)
  • EBUCore: descriptive & technical (broadcast use case focus)
  • CEN EN 15907: Comprehensive description of cinematographic works
  • FRBR (“furbur”): Comprehensive description of bibliographic works
  • Mediainfo XML: Technical metadata (AV)

Dublin Core: Elements

  1. Contributor
  2. Coverage
  3. Creator
  4. Date
  5. Description
  6. Format
  7. Identifier
  8. Language
  9. Publisher
  10. Relation
  11. Rights
  12. Source
  13. Subject
  14. Title
  15. Type

See: DC Reference Description
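
As an illustration of how the 15 elements might be used, the sketch below builds a small record with Python's standard library. The `dc` namespace URI is the one published for the Dublin Core element set. The `<record>` wrapper element and the function name `dublin_core_record` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = [
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
]


def dublin_core_record(fields: dict) -> ET.Element:
    """Build a record from {element name: [values]}; all elements are repeatable."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name!r}")
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


record = dublin_core_record({
    "title": ["Sherlock"],
    "date": ["2010-07-25"],
    "contributor": ["Benedict Cumberbatch", "Martin Freeman"],
})
print(ET.tostring(record, encoding="unicode"))
```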

FRBR

FRBR Entities

EN 15907

EN 15907 Entities

Interoperability

  • Use a standard.
    (at least as basis)
  • Use English field terms.
    (at least in the background)

Metadata creation and capture

  • When is which metadata created?
    • Ingest
    • Modifications (e.g. transcoding, rewrapping)
    • Restoration
    • Meta-Metadata:
      Who edited/updated which MD-field?

Metadata creation and capture

  • How can metadata be captured?
    • Automatic
    • Manual
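
Automatic capture can be as simple as asking the file system. A minimal sketch, assuming only the Python standard library: it records file name, size and modification time (as ISO 8601 in UTC). The function name and the dictionary keys are hypothetical. AV-specific values such as resolution or bitrate would need a dedicated tool and are not covered here.

```python
from datetime import datetime, timezone
from pathlib import Path


def capture_file_metadata(path) -> dict:
    """Collect basic technical metadata without any manual input."""
    file = Path(path)
    info = file.stat()
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "filename": file.name,
        "size_bytes": info.st_size,
        "modified": modified.isoformat(timespec="seconds"),
    }


if __name__ == "__main__":
    # Example: describe this script itself.
    print(capture_file_metadata(__file__))
```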

Metadata: Usage & Storage

  • What is metadata kept for?
  • How is metadata stored?

Plain Text

Sherlock
========================
Released in 2010 in United Kingdom
Actors: Benedict Cumberbatch, Martin Freeman

CSV

Comma-Separated Values

"Title","Release Date","Country","Actor 1","Actor 2"
"Sherlock",2010-07-25,"UK","Benedict Cumberbatch","Martin Freeman"

The CSV file
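
Because CSV is structured, a program can read it back field by field. A short sketch using Python's standard `csv` module, with the record held in a string so the example is self-contained. `skipinitialspace=True` is set so that files with a space after each comma are also read correctly.

```python
import csv
import io

CSV_TEXT = (
    '"Title","Release Date","Country","Actor 1","Actor 2"\n'
    '"Sherlock",2010-07-25,"UK","Benedict Cumberbatch","Martin Freeman"\n'
)

# DictReader uses the first row as field names.
reader = csv.DictReader(io.StringIO(CSV_TEXT), skipinitialspace=True)
rows = list(reader)

for row in rows:
    print(row["Title"], row["Release Date"], row["Country"])
```

The fixed columns "Actor 1" and "Actor 2" show a limit of flat tables: a third actor would need a new column.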

XML

A simple XML example

The XML file
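
For comparison with the plain text and CSV versions, here is a sketch of how the same record might look in XML and how it can be read with Python's standard library. The element names (`work`, `releaseDate`, `actors`, ...) are invented for this example and do not follow any particular standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML structure for the record used in the earlier examples.
XML_TEXT = """<work>
  <title>Sherlock</title>
  <releaseDate>2010-07-25</releaseDate>
  <country>UK</country>
  <actors>
    <actor>Benedict Cumberbatch</actor>
    <actor>Martin Freeman</actor>
  </actors>
</work>
"""

work = ET.fromstring(XML_TEXT)
title = work.findtext("title")
release_date = work.findtext("releaseDate")
actors = [actor.text for actor in work.findall("actors/actor")]

print(title, release_date, actors)
```

Unlike the CSV columns "Actor 1" and "Actor 2", the repeated `<actor>` element allows any number of actors.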

DOC / PDF?

  • Pros?
  • Cons?

Embedded vs Sidecar

  • Embedded =
    inside the media file

  • Sidecar =
    additional files next to media file
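
A sidecar can be sketched as an XML file written next to the media file. The naming scheme (media file name plus `.xml`), the `<metadata>` root element and both function names are assumptions for illustration. The field names passed in must be valid XML element names.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def sidecar_path(media_path) -> Path:
    """video.mkv -> video.mkv.xml, in the same folder (assumed naming scheme)."""
    media = Path(media_path)
    return media.with_name(media.name + ".xml")


def write_sidecar(media_path, fields: dict) -> Path:
    """Write the given fields as a small XML sidecar and return its path."""
    root = ET.Element("metadata")  # hypothetical root element
    for name, value in fields.items():
        ET.SubElement(root, name).text = str(value)
    target = sidecar_path(media_path)
    ET.ElementTree(root).write(str(target), encoding="utf-8", xml_declaration=True)
    return target


print(sidecar_path("archive/sherlock_s01e01.mkv").name)  # sherlock_s01e01.mkv.xml
```

Keeping the full media file name in the sidecar name makes it easy to see which file the metadata belongs to, even if two media files differ only in their extension.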

Good practice

  • Use controlled vocabulary
  • …from public/common sources
  • Prefer structured text (CSV or XML)
  • …over DOC(X) and PDF
  • Embed at least basic info to ID the file
  • …and select which data may go better as sidecar
  • Machine and human readable
  • Field names (“backstage”): English, please.

Questions?

Comments?

Example XMLs