This the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

ESB3021 - ew-vodingest

Ingest VoD assets into ESF and DASH OnDemand format

1 - Introduction

Introduction to ew-vodingest

ew-vodingest (ESB3021) reads and ingests an MP4 audio and video + subtitles asset defined by a SMIL file into an ESF (Edgeware Storage Format) asset in a way that is compatible with the DASH OnDemand profile. A corresponding DASH MPD file is generated. The syntax and usage is explained in the SMIL section.

Alternatively, instead of a SMIL file path, a path to a directory containing media files can be specified. In this case, all files with relevant file extensions will be used, but some metadata like subtitle role cannot be specified. For best control of metadata, it is therefore recommended to use a SMIL file.

ESB3021 ew-vodingest is intended as an improvement and replacement for the ESB3005 ew-recorderingest tool. However, at present, ew-vodingest only supports a small subset of the use cases that ew-recorder supports, so the tools need to be used in parallel for some time.

The currently supported codecs are:

video: AVC, HEVC audio: AAC, HE-AACv1, HE-AACv2, AC-3, EC-3 subtitles: WebVTT, TTML, STL, SRT

A main focus of this tool is to ingest VoD content that can be used to generate live content by combining VoD assets using ESB3019 ew-vod2cbm. There is therefore a restriction on the ingested content that it must be possible to discover a common constant GoP duration (sync-frame distance) for all video tracks, and that all sample durations are exactly the same in their respective timescales. The timescales may differ, but only by integer factors, for example 50 for 50Hz video and 25 for 25Hz video.

Directory processing

It is possible to specify a directory of media files without a SMIL file. ew-vodingest will list all files in that directory and extract all files with extensions:

	.mp4 for audio and video
	.ttml, .webvtt, .vtt, .stl, .srt for subtitles

This may be convenient, but gives fewer possibilities to set parameters for the tracks. For subtitles, the language can be automatically extracted from the filename.

Usage

ew-vodingest is a command-line tool that is run as

$ ew-vodingest [options]

  -hls
    	Output HLS playlists (experimental)
  -i string
    	SMIL file or dir defining the asset (mandatory)
  -o string
    	output directory (mandatory)
  -slim
    	Only output content_info and dat files
  -version
    	Get version, date and possible expiration date
  -w int
    	minimal output segment duration in milliseconds (default 4000)
  -y int
    	maximal output segment duration in milliseconds (default 12000)

The -i, -o, -w, -y options are the same as for ESB3005 ew-recorder.

2 - Installation

Installation

The command line tool ew-vodingest can be installed on a Linux distribution compatible with RedHat Enterprise Linux 7 or 8 with:

$ sudo yum install ew-vodingest-x.y.z-1.el8.x86_64.rpm

To uninstall, do:

$ sudo yum remove ew-vodingest.x86_64

3 - SMIL manifest

Input specification using a SMIL file

Input SMIL syntax

SMIL is an XML format for multimedia presentation. It can be used instead of an HLS master playlist or DASH Media Presentation File to tell what media files should be combined into an asset and some associated metadata parameters. The level of detail, however, is much lower than in the other formats. ew-vodingest supports the import of MP4 files for video and audio, in connection to subtitle files.

Our usage of SMIL follows legacy format of Wowza, but we have added some parameters as displayName for HLS and role for subtitles.

Basic structure

SMIL files should specify all relevant media files in a switch block in the body:

<?xml version="1.0" encoding="UTF-8"?>
<smil>
    <body>
        <switch>
            <video src="movie1.mp4" ... />
            <video src="audio.mp4" ... />
            <textstream ... />
            <srt ... />
        </switch>
    </body>
</smil>

Stream types

The supported stream types are <video> which either means video, audio or both, <texstream> that is a subtitle file or <srt> which is a subtitle file in SRT format.

Video/Audio streams

Audio and video streams are identified by the video tag.

Minimal configuration

The simplest variant is to just give a src attribute:

<video src="videoAndAudio.mp4"/>

If this is the only information give, all video and audio tracks will be extracted and given names following the patterns

media type pattern example
video video_<codec>_<bitrate> video_hevc_9000kbps
audio audio_<codec>_<lang>_<bitrate> audio_aac_en_256kbps

depending on what tracks and codecs are available. The bitrate is calculated from the file size and duration and the language is extracted from the mp4 file if available in an elng, of in the mdhd box as a fallback.

If multiple files are specified in the SMIL file, tracks will be extracted from all of them, but if the resulting names (mediatype_codec_bitrate_language) coincide, only one copy will be kept. This makes it possible to import files which all include the same audio bitrate and language, but different video bitrates.

Specifying the bitrate

The system-bitrate attribute is used to specify the bitrate for the stream.

<video src="video.mp4" system-bitrate="2500000" />

Optionally it is possible to specify bitrates as <param> values as

<video src="videoAndAudio.mp4">
    <param name="videoBitrate" value="2500000"/>
    <param name="audioBitrate" value="128000"/>
</video>

Track selection and track-specific parameters

To achieve higher control over the extraction of tracks, and their parameters, it is possible to add extra parameters.

In particular, one can use the audioOnly and videoOnly keys to specify that only one type of media track should be extracted.

To extract audio and video in this way, one could use the following snippet:

<video src="videoAndAudio.mp4" system-bitrate="2500000">
    <param name="videoOnly" value="TRUE"/>
</video>
<video src="videoAndAudio.mp4" system-bitrate="128000">
    <param name="audioOnly" value="TRUE"/>
</video>

In addition, one can add an audioindex query index to extract a specific audio track. The audioindex value relates to the track ID inside an mp4 file, but is zero-based. The mapping is that ?audioindex=0 refers to the audio track with the lowest track ID, ?audioindex=1 to the second, and so on.

Here is an example that extracts the first two audio tracks, and gives them different parameters for language, bitrate, and displayName:

<video src="hev1_aac_mc.mp4?audioindex=0" system-language="dk" audio-bitrate="256000">
<param name="audioOnly" value="TRUE"/>
<param name="displayName" value="Danish 6ch"/>
</video>
<video src="hev1_aac_mc.mp4?audioindex=1" system-language="dk" audio-bitrate="192000">
<param name="audioOnly" value="TRUE"/>
<param name="displayName" value="Danish 2ch"/>
</video>

Audio Language

The language for an audio stream can be set using the system-language attribute:

<video src="mp4:video3.mp4?audioindex=0" system-language="eng">

For legacy reasons, one can alternatively use the attribute systemLanguage or language.

If the language is not specified, 3-letter language code in the mdhd box will be used. It will in turn be overridden by the optional elng box that can contain any language code.

Subtitle input

The supported input subtitle formats are TTML, WebVTT, STL, SRT. In all cases, a complete side-loaded file is expected. As part of the ESF format, the subtitles will be transformed into segmented wvtt. A complete WebVTT file will also be generated and referred to in the generated DASH manifest. The name of the output subtitle tracks are of the form

media type pattern example
subtitles subtitles_wvtt_<lang>_<role> subtitles_wvtt_se_caption

The role can be either caption or subtitle. If not specified, the role will be not be in the track name.

Subtitle streams in TTML, WebVTT, STL, or STT files are specified with the <textstream> or <srt> tags. The language can be specified with language attribute, or, for textstream, with the system-language attribute like:

<textstream src="subtitles.ttml" system-language="en" />
<srt src="swedish.srt" language="se"/>

The format is auto-detected from the file extension, which must be one of:

 .ttml, .webvtt, .vtt, .stl, .srt

The case of multiple languages in the same TTML file is not supported.

There is no bitrate specified for text streams. It will always be set to 1kbps.

Extracting language from subtitle file name

For the case where the SMIL-file is missing or there is no language attribute for the subtitle files, ew-vodingest will try to extract a language from the file name. The language extraction algorithm works like this:

  1. the file extension is removed
  2. split the name on “-” characters
  3. if the last part is at most three characters, use it as a language code

If a language is not found, the subtitle languages will be denoted “und”, “und1”, “und2” etc.

Example SMIL files

In the following, we give examples to show some possible variations of supported SMIL files.

Example 1 - video and audio from all files

This example has width and height for the video. That information will be discarded. There are only two distinct combinations of language and bitrate for audio, so only two variants audio_aac_eng_128kbps and audio_aac_eng_192kbps will be generated.

<?xml version="1.0" encoding="UTF-8"?>
<smil>
    <body>
    <switch>
        <video height="360" src="profile1.mp4" systemLanguage="eng" width="480">
        <param name="videoBitrate" value="500000"/>
        <param name="audioBitrate" value="128000"/>
        </video>
        <video height="480" src="profile2.mp4" systemLanguage="eng" width="720">
        <param name="videoBitrate" value="800000"/>
        <param name="audioBitrate" value="128000"/>
        </video>
        <video height="540" src="profile3.mp4" systemLanguage="eng" width="960">
        <param name="videoBitrate" value="1300000"/>
        <param name="audioBitrate" value="128000"/>
        </video>
        <video height="720" src="profile4.mp4" systemLanguage="eng" width="1280">
        <param name="videoBitrate" value="2300000"/>
        <param name="audioBitrate" value="192000"/>
        </video>
        <video height="1080" src="profile5.mp4" systemLanguage="eng" width="1920">
        <param name="videoBitrate" value="5000000"/>
        <param name="audioBitrate" value="192000"/>
        </video>
    </switch>
    </body>
</smil>

Example 2 - audioindex queries

This example shows extraction of audio tracks using the audioindex query parameter. The mp4: “scheme” is not needed, but supported for legacy reasons. The mp4:/// scheme is also supported for the same reason.

<?xml version="1.0"?>
<smil>
    <body>
    <switch>
        <video src="mp4:video1.mp4?audioindex=0" system-language="eng" audio-bitrate="96000">
        <param name="audioOnly" value="TRUE"/>
        </video>
        <video src="mp4:video1.mp4?audioindex=1" system-language="ger" audio-bitrate="96000">
        <param name="audioOnly" value="TRUE"/>
        </video>
        <video src="video2.mp4" system-bitrate="2000000">
        <param name="videoOnly" value="TRUE"/>
        </video>
        <video src="video1.mp4" system-bitrate="5000000">
        <param name="videoOnly" value="TRUE"/>
        </video>
        <textstream src="subtitles.ttml" system-language="eng">
        </textstream>
    </switch>
    </body>
</smil>

Example 3 - displayName and role parameters

This example uses parameters for displayName for audio and subtitles, and role for subtitles.

<?xml version="1.0" encoding="utf-8"?>
<smil>
  <body>
    <switch>
      <video src="video800.mp4" system-bitrate="800000">
        <param name="videoOnly" value="TRUE"/>
      </video>
      <video src="video400.mp4" system-bitrate="400000">
        <param name="videoOnly" value="TRUE"/>
      </video>
      <video src="audio.mp4?audioindex=0" system-language="eng" audio-bitrate="256000">
        <param name="audioOnly" value="TRUE"/>
        <param name="displayName" value="English 6ch"/>
      </video>
      <video src="audio.mp4?audioindex=1" system-language="eng" audio-bitrate="192000">
        <param name="audioOnly" value="TRUE"/>
        <param name="displayName" value="English 2ch"/>
      </video>
      <srt src="swe.srt" language="swe">
        <param name="displayName" value="svenska"/>
        <param name="role" value="subtitle"/>
      </srt>
      <srt src="swe_cc.srt" language="swe">
        <param name="displayName" value="svenska (CC)"/>
        <param name="role" value="caption"/>
      </srt>
      <textstream src="eng.stl" system-language="eng">
        <param name="displayName" value="English"/>
        <param name="role" value="caption"/>
      </textstream>
    </switch>
  </body>
</smil>

Example 4 - The simplest possible - same as no SMIL

This example shows a SMIL file where all parameters are extracted automatically. In this case all video sources contain audio with language set to engin the mdhd box in the mp4 files, and the subtitle languages can be extracted from the file names.

<?xml version="1.0" encoding="UTF-8"?>
<smil>
  <body>
    <switch>
      <video src="0.mp4"/>
      <video src="1.mp4"/>
      <video src="2.mp4"/>
      <video src="3.mp4"/>
      <srt src="xyz-eng.srt"/>
      <srt src="xyz-spa.srt"/>
    </switch>
  </body>
</smil>

The generated tracks are:

  • 4 video tracks with different bitrates
  • 2 audio track with different bitrates, but the same language “eng”
  • 2 subtitle tracks with language codes “eng” and “spa”

In this particular case, the SMIL file just provides a list of files and no extra parameter. Therefore, it is also possible to specify the directory of this file as input to ew-vodingest and get exactly the same result.

4 - Releases

ESB3021 - ew-vodingest releases

4.1 - Release 0.6.2

Initial beta release of ew-vodingest

Features:

  • Ingest SMIL + mp4 + subtitle assets in combined ESF + DASH OnDemand format
  • Support AVC, HEVC video
  • Support AAC, AC-3 audio
  • Support TTML, WebVTT, SRT, STL subtitles
  • Support displayName parameter for audio and subtitles
  • Support role (caption or subtitle) parameter for subtitles

Known limitations:

  • Requires mounted file system for input and output
  • Require constant GoP duration and constant sample steps
  • Does not preserve subtitle styling
  • Logging is not nice

Release information

  • Date: 2022-03-28
  • Type: Beta for testing
  • Expiration Date: 2022-09-24