Understanding and Selecting a File Format for AV

Peter Bubestinger-Steindl
(peter @ ArkThis.com)

Why bother? - Let’s just have:

  • Best quality
  • Smallest filesize
  • Preserve original properties
  • Fast and easy to open/use/edit
  • Lasts forever
  • +cherries 🍒 & ice cream 🍦 on top!
The Holy Grail

Tempting…

  • Hey, it’s a standard!
  • Hey, everyone’s using it!
  • Hey, the “big ones” are using it!
  • Hey, it’s from a major company!
  • Hey, it can do everything!
  • Hey, it’s so easy to use!
  • Hey, it’s gratis!

Digital AV formats…

  • Which do you know?
  • Which do you use?
  • Which would you like to know more about?
 

What are your wishes, needs, expectations of a format? 🤔️

Digital Video Trinity

What’s a Container / Wrapper?

"A container format […] is a file format that allows multiple data streams to be embedded into a single file.

Usually along with metadata for identifying and further detailing those streams."

Source: Wikipedia: Container format (computing)

What’s a Container?

Think of a regular paper folder…

  • It’s a wrapper around content.
  • Contains Metadata.
  • Structures the content streams.
Videofile paper mockup

What’s a Codec?

“A codec is a device or computer program which encodes or decodes a data stream or signal.”

Source: Wikipedia: Codec

What’s a Codec?

Think of a human language…

  • It’s coded information.
  • There may be dialects.
  • Different people may
    “speak / understand” differently.

Format Naming

Triplet notation (video codec / audio codec in container) greatly helps reduce confusion:

  • H.264 / AAC in MP4
  • FFV1 / PCM in MKV (Matroska)
  • ProRes / PCM in MOV
  • DPX / WAV (PCM) in a folder
  • etc

See The MPEG-4 confusion 😜️
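The triplet notation above can be sketched in code. This is only an illustration: the `Triplet` type and the `parse_triplet` helper are made-up names, not part of any tool, and the parser assumes the exact "VIDEO / AUDIO in CONTAINER" spelling used on this slide.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """One format combination: video codec, audio codec, container."""
    video: str
    audio: str
    container: str

    def __str__(self):
        # Render in the notation used on the slide.
        return f"{self.video} / {self.audio} in {self.container}"


def parse_triplet(text):
    """Split 'VIDEO / AUDIO in CONTAINER' into its three parts."""
    codecs, container = text.rsplit(" in ", 1)
    video, audio = codecs.split(" / ", 1)
    return Triplet(video.strip(), audio.strip(), container.strip())


print(Triplet("H.264", "AAC", "MP4"))
print(parse_triplet("FFV1 / PCM in MKV (Matroska)"))
```

Keeping the three parts separate makes it harder to say "MP4" when you mean "H.264", or the other way around.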

“Rebranded” Format Names

Some (professional) video formats are actually a set of profiles for existing (standard) formats.

This is good! 😇️

Let’s look inside! :)

VLC / MediaInfo

VLC website: videolan.org/vlc
MediaInfo (“Easy View”) website: mediaarea.net/MediaInfo

Characteristics / Properties

                  File 1        File 2         File 3
Container         MOV           MOV            MOV
Video codec       UYVY          H.264          XviD
Resolution (px)   720 x 576     1920 x 1080    640 x 480
FPS               25            24             30000/1001

Audio codec       PCM           AAC            MP3
Sample rate       48 kHz        48 kHz         44.1 kHz
Channels          Stereo        Surround 5.1   Mono

Bitrate = Data per Time

= How much data per second your (lossy) encoder may spend on the quality of the material.

 

  • Higher bitrate: better quality, but larger files.
  • Better codec: better quality at the same size, or the same quality at a smaller size.
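A rough worked example of "data per time", as a sketch only. The uncompressed figures use File 1 from the properties table (720 x 576, 25 fps, UYVY, which is 8-bit 4:2:2 and therefore 16 bits per pixel). The 5 Mbit/s lossy bitrate is an assumed value picked purely for comparison, and the function names are made up for this illustration.

```python
def uncompressed_video_bitrate(width, height, fps, bits_per_pixel):
    """Bits per second needed to store every pixel of every frame."""
    return width * height * bits_per_pixel * fps


def file_size_bytes(bitrate_bits_per_s, duration_s):
    """File size = bitrate x duration (video stream only, no container overhead)."""
    return bitrate_bits_per_s * duration_s // 8


one_hour = 3600  # seconds

# File 1: 720 x 576, 25 fps, UYVY (16 bits per pixel).
sd_uncompressed = uncompressed_video_bitrate(720, 576, 25, 16)
print(sd_uncompressed)                             # 165888000 bit/s
print(file_size_bytes(sd_uncompressed, one_hour))  # 74649600000 bytes

# Assumed lossy encode at 5 Mbit/s (illustrative value only).
print(file_size_bytes(5_000_000, one_hour))        # 2250000000 bytes
```

Same hour of material: about 74.6 GB uncompressed versus about 2.25 GB at the assumed lossy bitrate. What the encoder throws away to get there is the quality question.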

Significant properties

Knowing and deciding which properties to safeguard and which are allowed to change.

 

See:
LoC FADGI: DRAFT Significant Properties for Digital Video
Nestor (DE): Leitfaden DLTP AV Medien

Significant properties

Depend on media type (and use case).

Video:
  • resolution
  • framerate
  • aspect ratio
  • colorspace
  • subsampling

Audio:
  • “resolution” (= samplerate, bit-depth)
  • channels
  • channel layout
  • language

Metadata:
  • title
  • author
  • rights information
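Once you have decided which properties are significant, checking a derivative against its original can be as simple as the sketch below. The property names, the `changed_properties` helper and the access-copy values are assumptions for illustration. In practice the values would come from a tool such as MediaInfo.

```python
from fractions import Fraction


def changed_properties(original, derivative, significant):
    """Return {property: (original value, derivative value)} for each
    significant property whose value differs between the two files."""
    return {
        name: (original.get(name), derivative.get(name))
        for name in significant
        if original.get(name) != derivative.get(name)
    }


# Hypothetical property sets for a master file and an access copy.
original = {
    "resolution": (720, 576),
    "framerate": Fraction(25),
    "subsampling": "4:2:2",
    "video_codec": "UYVY",
}
access_copy = {
    "resolution": (720, 576),
    "framerate": Fraction(25),
    "subsampling": "4:2:0",
    "video_codec": "H.264",
}

# The codec is allowed to change here; these three properties are not.
significant = ["resolution", "framerate", "subsampling"]

changes = changed_properties(original, access_copy, significant)
print(changes)  # {'subsampling': ('4:2:2', '4:2:0')}
```

The codec changed too, but it is not reported because it was not declared significant. Which list you pass in is the actual policy decision.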

“Different strokes for different folks” 😉

  • Digitization: As-original, as-untouched as possible. Records in realtime?
    (Plus: has headroom for optional restoration/improvements.)

  • Preservation: Stand the tests of time.
    (Highest ‘original’ quality)

  • Production: For daily work. High quality.
    (Optional, if preservation format can be used for this)

  • Access: For quick and easy access.
    (Quality not necessarily best/high, but very convenient to play)

For audio: we’re lucky.

PCM/WAV is used from digitization to preservation - and if bandwidth ain’t narrow, even for access.

Why? Because it became “small enough”.
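How small is "small enough"? A quick estimate, as a sketch: 48 kHz stereo matches File 1 in the properties table, while the 24-bit depth is an assumption (the slides do not state one). The function names are made up for this example.

```python
def pcm_bitrate(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


def bytes_per_hour(bitrate_bits_per_s):
    """Storage needed for one hour at the given bitrate."""
    return bitrate_bits_per_s * 3600 // 8


stereo_24bit = pcm_bitrate(48_000, 24, 2)
print(stereo_24bit)                  # 2304000 bit/s
print(bytes_per_hour(stereo_24bit))  # 1036800000 bytes
```

Roughly 1 GB per hour for uncompressed stereo audio under these assumptions, which current storage and most networks handle without special effort.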

“Different strokes for different folks” 😉

  • Digitization: (V210, FFV1, MPEG-4 / PCM) uncompressed/lossless or very-high quality lossy.

  • Preservation: (V210, FFV1, MPEG-2, MPEG-4 / PCM) uncompressed/lossless or (very) high-quality lossy, open & documented, error-robust.

  • Production: (MPEG-2, MPEG-4, ProRes / PCM) high-quality/high-bitrate (lossy).

  • Access: (H.264 / AAC in MP4) Most often lossy-compressed currently-popular format combination.

Format choice = A balance of …

  • Size vs Quality
  • Features
  • Performance
  • Sustainability
  • plus: time, budget, staff
  • and of course: convenience

Good starting point for assessing practical usefulness.

Risks to format longevity

  • Data errors
  • Obsolescence
  • Vendor lock-in
  • Interoperability/complexity issues
  • Quality degradation

Countermeasures?

Data errors: Error resilience?

  • Bitstream checksums:
    Ability to know if the content is intact.

  • Error concealment: Optional choice of decoder to “mask” decoding issues. (decoder specific)

  • Make backup copies! 😇
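Bitstream checksums live inside the format itself. A related but simpler idea is a file-level checksum stored next to the file, which at least tells you whether the file as a whole is still intact. The sketch below shows that file-level variant only, using Python's standard `hashlib`. The helper names and the throwaway test file are made up for this example.

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large AV files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected, algorithm="sha256"):
    """True if the file still matches the checksum recorded earlier."""
    return file_checksum(path, algorithm) == expected


# Demonstration on a throwaway file: record a checksum, then damage one byte.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as f:
        f.write(b"\x00" * 1024)
    stored = file_checksum(path)
    intact_before = is_intact(path, stored)

    with open(path, "r+b") as f:
        f.seek(512)
        f.write(b"\x01")  # simulate a single corrupted byte
    intact_after = is_intact(path, stored)

print(intact_before, intact_after)  # True False
```

A file-level checksum says that something changed, not where. Checksums inside the bitstream can narrow the damage down to a part of the content.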

Open vs Closed

Enigma encryption rotor windows

Theory vs Practice

“Implementation overrules paper specs. Always.”

The Eternal Replayer


Format Complexity

Format Complexity: Less is More

Good rule = “Minimalistic Data Format”:

  • As simple as possible
  • As complicated as necessary

Simpler = more stable, easier to use, keep alive, reconstruct or fix.

Your use cases/priorities?

  • Who will want/need to work with these files?
  • Under which conditions?
  • For how long?
  • Digitization vs Production vs Preservation vs Access?
  • Which properties are significant to you?

Yagni Kiss Moscow?

YAGNI / KISS / MoSCoW

Exercise: Your Format Policy

Must Should Could Won’t
______ ______ ______ ______
______ ______ ______ ______
______ ______ ______ ______
______ ______ ______ ______

Choose a use-case and try to phrase your “wishes”.
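If it helps, a filled-in table can also be written down as data and checked against a candidate format. This is a sketch under assumptions: the property names, the example policy and the `evaluate` helper are all invented for illustration. "Won't" is read in the MoSCoW sense of "out of scope for now", so those entries are simply not checked.

```python
# Hypothetical preservation policy; every entry is an example, not a recommendation.
policy = {
    "must": {"lossless": True, "open_specification": True},
    "should": {"container": "MKV"},
    "could": {"embedded_checksums": True},
    "wont": ["hardware_playback"],  # out of scope: not evaluated
}


def evaluate(candidate, policy):
    """List unmet wishes per priority; only unmet 'must' items reject a format."""
    report = {}
    for level in ("must", "should", "could"):
        report[level] = [
            name
            for name, wanted in policy.get(level, {}).items()
            if candidate.get(name) != wanted
        ]
    report["acceptable"] = not report["must"]
    return report


# Hypothetical candidate format description.
candidate = {
    "lossless": True,
    "open_specification": True,
    "container": "MOV",
    "embedded_checksums": True,
}

report = evaluate(candidate, policy)
print(report)
# {'must': [], 'should': ['container'], 'could': [], 'acceptable': True}
```

The candidate passes, with one unmet "should". That unmet item is the kind of trade-off worth writing into the policy on purpose.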

Examples of published Policies

Checklist for a Preservation Format

Sustainability:

  1. Disclosure?
  2. Open reference implementation/libs?
  3. Adoption/popularity?
  4. Complexity?
  5. Independence vs external contingencies?
  6. Artificial restrictions?
  7. Self descriptive?

Quality and functionality:

  1. Preserve “original”?
  2. Image/sound quality?
  3. Interoperability?
  4. Editing?
  5. Support for (additional/expected) properties?
  6. Performance & data size?

Source: meemoo.be

Preservation Checklist: translated! ⭐

Sustainability:

  1. Documentation openly accessible?
  2. Open reference implementation?
  3. How likely is it to be supported in tools/devices for which userbase?
  4. Which features are implemented/tested/stable?
  5. Which choice/requirements do I have to handle it beyond “shelf life”?
  6. Is it legal/possible to handle it in future/different situations?
  7. Can it contain proper metadata?

Quality and functionality:

  1. Preserve significant properties?
  2. Sufficient image/sound quality and robustness to multi-generation copies?
  3. Interoperability / ease of usage & access?
  4. Direct use for editing?
  5. How many different formats will I need (pile up)?
  6. Handle performance / data size requirements?

Original source: meemoo.be

- Fin -

Questions?

Comments?

Peter Bubestinger-Steindl

Peter @ ArkThis.com