ESB3021 - ew-vodingest
- 1: Introduction
- 2: Installation
- 3: SMIL manifest
- 4: Releases
- 4.1: Release 0.6.2
1 - Introduction
ew-vodingest
ew-vodingest (ESB3021) reads an asset consisting of MP4 audio and video plus subtitles, as defined by a SMIL file, and ingests it into an ESF (Edgeware Storage Format) asset in a way that is compatible with the DASH OnDemand profile. A corresponding DASH MPD file is generated. The SMIL syntax and usage are explained in the SMIL manifest section.
Alternatively, instead of a SMIL file path, a path to a directory containing media files can be specified. In this case, all files with relevant file extensions will be used, but some metadata like subtitle role cannot be specified. For best control of metadata, it is therefore recommended to use a SMIL file.
ESB3021 ew-vodingest is intended as an improvement and replacement for the ESB3005 ew-recorder ingest tool. However, at present, ew-vodingest only supports a small subset of the use cases that ew-recorder supports, so the tools need to be used in parallel for some time.
The currently supported codecs are:
- video: AVC, HEVC
- audio: AAC, HE-AACv1, HE-AACv2, AC-3, EC-3
- subtitles: WebVTT, TTML, STL, SRT
A main focus of this tool is to ingest VoD content that can be used to generate live content by combining VoD assets using ESB3019 ew-vod2cbm. The ingested content is therefore restricted: it must be possible to discover a common constant GoP duration (sync-frame distance) for all video tracks, and all sample durations in a track must be exactly the same in that track's timescale. The timescales may differ, but only by integer factors, for example 50 for 50Hz video and 25 for 25Hz video.
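The restriction above can be illustrated with a small sketch. It is not the check that ew-vodingest actually performs; it assumes that "differ only by integer factors" means that, for any two timescales, the larger one is an exact multiple of the smaller one. The function names are hypothetical.

```python
from itertools import combinations


def timescales_compatible(timescales):
    """Return True if every pair of timescales differs by an integer factor.

    Assumption: for each pair, the larger value must be an exact
    multiple of the smaller one.
    """
    for small, large in combinations(sorted(set(timescales)), 2):
        if large % small != 0:
            return False
    return True


def has_constant_sample_duration(durations):
    """Return True if all sample durations of one track are identical."""
    return len(set(durations)) <= 1
```

With these definitions, the timescales 25 and 50 from the example above are compatible, while 25 and 30 are not.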
Directory processing
It is possible to specify a directory of media files without a SMIL file. ew-vodingest will list all files in that directory and pick up all files with the following extensions:
.mp4 for audio and video
.ttml, .webvtt, .vtt, .stl, .srt for subtitles
This may be convenient, but gives fewer possibilities to set parameters for the tracks. For subtitles, the language can be automatically extracted from the filename.
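As an illustration of the selection by extension, the following sketch sorts a list of file names into media and subtitle files. The function name is hypothetical, and matching the extension case-insensitively is an assumption; the tool itself may behave differently.

```python
from pathlib import PurePath

MEDIA_EXTENSIONS = {".mp4"}
SUBTITLE_EXTENSIONS = {".ttml", ".webvtt", ".vtt", ".stl", ".srt"}


def classify_files(names):
    """Split file names into (media, subtitles) based on their extension.

    Files with other extensions are ignored.
    Assumption: extensions are compared case-insensitively.
    """
    media, subtitles = [], []
    for name in sorted(names):
        ext = PurePath(name).suffix.lower()
        if ext in MEDIA_EXTENSIONS:
            media.append(name)
        elif ext in SUBTITLE_EXTENSIONS:
            subtitles.append(name)
    return media, subtitles
```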
Usage
ew-vodingest is a command-line tool that is run as
$ ew-vodingest [options]
-hls
Output HLS playlists (experimental)
-i string
SMIL file or dir defining the asset (mandatory)
-o string
output directory (mandatory)
-slim
Only output content_info and dat files
-version
Get version, date and possible expiration date
-w int
minimal output segment duration in milliseconds (default 4000)
-y int
maximal output segment duration in milliseconds (default 12000)
The -i, -o, -w, -y options are the same as for ESB3005 ew-recorder.
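When the tool is driven from a script, the argument list can be assembled as in the sketch below. Only the options listed above are used. The helper name build_command is hypothetical, and the check that the minimal duration does not exceed the maximal one is an assumption about sensible input, not documented behavior.

```python
def build_command(input_path, output_dir, min_ms=4000, max_ms=12000,
                  hls=False, slim=False):
    """Build an argument list for ew-vodingest from the documented options."""
    if min_ms > max_ms:
        # Assumption: a minimum above the maximum is never intended.
        raise ValueError("min_ms must not exceed max_ms")
    cmd = ["ew-vodingest", "-i", input_path, "-o", output_dir,
           "-w", str(min_ms), "-y", str(max_ms)]
    if hls:
        cmd.append("-hls")
    if slim:
        cmd.append("-slim")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where ew-vodingest is installed.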
2 - Installation
The command line tool ew-vodingest can be installed on a Linux distribution compatible with Red Hat Enterprise Linux 7 or 8 with:
$ sudo yum install ew-vodingest-x.y.z-1.el8.x86_64.rpm
To uninstall, do:
$ sudo yum remove ew-vodingest.x86_64
3 - SMIL manifest
Input SMIL syntax
SMIL is an XML format for multimedia presentations. It can be used instead of an HLS master playlist or a DASH Media Presentation Description (MPD) to tell which media files should be combined into an asset, together with some associated metadata parameters. The level of detail, however, is much lower than in the other formats. ew-vodingest supports the import of MP4 files for video and audio, together with subtitle files.
Our usage of SMIL follows the legacy Wowza format, but we have added some parameters, such as displayName for HLS and role for subtitles.
Basic structure
SMIL files should specify all relevant media files in a switch block in the body:
<?xml version="1.0" encoding="UTF-8"?>
<smil>
  <body>
    <switch>
      <video src="movie1.mp4" ... />
      <video src="audio.mp4" ... />
      <textstream ... />
      <srt ... />
    </switch>
  </body>
</smil>
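To make the structure concrete, the sketch below parses a small SMIL document with the Python standard library and lists the streams found in the switch block. The sample file names and the function name list_streams are made up for illustration; this is not how ew-vodingest itself reads the file.

```python
import xml.etree.ElementTree as ET

STREAM_TAGS = {"video", "textstream", "srt"}

SAMPLE_SMIL = b"""<?xml version="1.0" encoding="UTF-8"?>
<smil>
  <body>
    <switch>
      <video src="movie1.mp4"/>
      <video src="audio.mp4"/>
      <textstream src="subtitles.ttml" system-language="en"/>
      <srt src="swedish.srt" language="se"/>
    </switch>
  </body>
</smil>
"""


def list_streams(smil_bytes):
    """Return (tag, src) for every stream element in the switch block."""
    root = ET.fromstring(smil_bytes)
    switch = root.find("./body/switch")
    if switch is None:
        raise ValueError("SMIL file has no <switch> block in <body>")
    return [(el.tag, el.get("src")) for el in switch if el.tag in STREAM_TAGS]
```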
Stream types
The supported stream types are <video>, which means video, audio, or both; <textstream>, which is a subtitle file; and <srt>, which is a subtitle file in SRT format.
Video/Audio streams
Audio and video streams are identified by the video tag.
Minimal configuration
The simplest variant is to just give a src attribute:
<video src="videoAndAudio.mp4"/>
If this is the only information given, all video and audio tracks will be extracted and given names following the patterns
media type | pattern | example |
---|---|---|
video | video_<codec>_<bitrate> | video_hevc_9000kbps |
audio | audio_<codec>_<lang>_<bitrate> | audio_aac_en_256kbps |
depending on what tracks and codecs are available. The bitrate is calculated from the file size and duration, and the language is extracted from the mp4 file: from an elng box if available, or from the mdhd box as a fallback.
If multiple files are specified in the SMIL file, tracks will be extracted from all of them, but if the resulting names (built from media type, codec, language, and bitrate) coincide, only one copy will be kept. This makes it possible to import files that all include the same audio bitrate and language, but different video bitrates.
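The naming patterns and the removal of duplicates can be sketched as follows. The function names are hypothetical. Rounding the bitrate to whole kbps and keeping the first track for each name are assumptions; the source only states that one copy is kept.

```python
def track_name(media_type, codec, bitrate_bps, lang=None):
    """Build a track name such as video_hevc_9000kbps or audio_aac_en_256kbps."""
    parts = [media_type, codec]
    if media_type == "audio" and lang:
        parts.append(lang)
    # Assumption: the bitrate is rounded to whole kbps.
    parts.append(f"{round(bitrate_bps / 1000)}kbps")
    return "_".join(parts)


def unique_track_names(tracks):
    """Return track names in order of first appearance, without duplicates.

    Each track is a dict with the keyword arguments of track_name.
    """
    names = []
    for track in tracks:
        name = track_name(**track)
        if name not in names:
            names.append(name)
    return names
```

For two input files that carry the same 128 kbps English AAC audio but different video bitrates, this yields two video names and a single audio name.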
Specifying the bitrate
The system-bitrate attribute is used to specify the bitrate for the stream.
<video src="video.mp4" system-bitrate="2500000" />
Optionally, the bitrates can be specified as <param> values:
<video src="videoAndAudio.mp4">
  <param name="videoBitrate" value="2500000"/>
  <param name="audioBitrate" value="128000"/>
</video>
Track selection and track-specific parameters
To achieve finer control over the extraction of tracks and their parameters, it is possible to add extra parameters.
In particular, one can use the audioOnly and videoOnly keys to specify that only one type of media track should be extracted.
To extract video and audio as separate streams from the same file, one could use the following snippet:
<video src="videoAndAudio.mp4" system-bitrate="2500000">
  <param name="videoOnly" value="TRUE"/>
</video>
<video src="videoAndAudio.mp4" system-bitrate="128000">
  <param name="audioOnly" value="TRUE"/>
</video>
In addition, one can add an audioindex query parameter to extract a specific audio track.
The audioindex value relates to the track ID inside an mp4 file, but is zero-based. The mapping is that ?audioindex=0 refers to the audio track with the lowest track ID, ?audioindex=1 to the second lowest, and so on.
Here is an example that extracts the first two audio tracks, and gives them different parameters for language, bitrate, and displayName:
<video src="hev1_aac_mc.mp4?audioindex=0" system-language="dk" audio-bitrate="256000">
  <param name="audioOnly" value="TRUE"/>
  <param name="displayName" value="Danish 6ch"/>
</video>
<video src="hev1_aac_mc.mp4?audioindex=1" system-language="dk" audio-bitrate="192000">
  <param name="audioOnly" value="TRUE"/>
  <param name="displayName" value="Danish 2ch"/>
</video>
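The mapping from audioindex to track ID described above can be sketched like this. Both function names are hypothetical, and the track IDs in the usage note are made-up values; the sketch only illustrates the zero-based ordering by track ID.

```python
from urllib.parse import parse_qs


def split_src(src):
    """Split a src value into (path, audioindex), audioindex is None if absent."""
    path, _, query = src.partition("?")
    values = parse_qs(query).get("audioindex")
    index = int(values[0]) if values else None
    return path, index


def track_id_for_audioindex(audio_track_ids, audioindex):
    """Map a zero-based audioindex to a track ID, ordered by ascending ID."""
    ordered = sorted(audio_track_ids)
    if not 0 <= audioindex < len(ordered):
        raise IndexError(f"audioindex {audioindex} out of range")
    return ordered[audioindex]
```

For a file whose audio tracks have the IDs 2, 3, and 5, audioindex=0 selects track 2 and audioindex=1 selects track 3.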
Audio Language
The language for an audio stream can be set using the system-language attribute:
<video src="mp4:video3.mp4?audioindex=0" system-language="eng"/>
For legacy reasons, one can alternatively use the attribute systemLanguage or language.
If the language is not specified, the 3-letter language code in the mdhd box will be used. It will in turn be overridden by the optional elng box, which can contain any language code.
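The resulting order of precedence (SMIL attribute first, then the elng box, then the mdhd box) can be written as a small sketch. The function name and its parameters are hypothetical, and returning None when no source provides a language is an assumption made only for this illustration.

```python
def resolve_audio_language(smil_language=None, elng=None, mdhd=None):
    """Pick the audio language: SMIL attribute, then elng box, then mdhd box."""
    for candidate in (smil_language, elng, mdhd):
        if candidate:
            return candidate
    # Assumption: no language is known if all three sources are empty.
    return None
```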
Subtitle input
The supported input subtitle formats are TTML, WebVTT, STL, and SRT. In all cases, a complete side-loaded file is expected. As part of the ESF format, the subtitles will be transformed into segmented wvtt. A complete WebVTT file will also be generated and referred to in the generated DASH manifest. The names of the output subtitle tracks are of the form
media type | pattern | example |
---|---|---|
subtitles | subtitles_wvtt_<lang>_<role> | subtitles_wvtt_se_caption |
The role can be either caption or subtitle. If not specified, the role will not be included in the track name.
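A sketch of the subtitle naming rule follows. The function name is hypothetical, and rejecting roles other than caption and subtitle with an error is an assumption; the source does not say how other values are handled.

```python
VALID_ROLES = {"caption", "subtitle"}


def subtitle_track_name(lang, role=None):
    """Build a name such as subtitles_wvtt_se_caption; the role is optional."""
    if role is not None and role not in VALID_ROLES:
        # Assumption: unknown roles are treated as an error in this sketch.
        raise ValueError(f"unsupported role: {role}")
    parts = ["subtitles", "wvtt", lang]
    if role:
        parts.append(role)
    return "_".join(parts)
```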
Subtitle streams in TTML, WebVTT, STL, or SRT files are specified with the <textstream> or <srt> tags.
The language can be specified with the language attribute or, for textstream, with the system-language attribute, like:
<textstream src="subtitles.ttml" system-language="en" />
<srt src="swedish.srt" language="se"/>
The format is auto-detected from the file extension, which must be one of:
.ttml, .webvtt, .vtt, .stl, .srt
The case of multiple languages in the same TTML file is not supported.
There is no bitrate specified for text streams. It will always be set to 1kbps.
Extracting language from subtitle file name
For the case where the SMIL file is missing or there is no language attribute for the subtitle files, ew-vodingest will try to extract a language from the file name.
The language extraction algorithm works like this:
- remove the file extension
- split the name on “-” characters
- if the last part is at most three characters, use it as a language code
If a language is not found, the subtitle languages will be denoted “und”, “und1”, “und2” etc.
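The steps above can be sketched as follows. This is a literal reading of the listed algorithm, not the tool's actual code, and the function names are hypothetical. Note that with this reading a file name without any "-" and a stem of at most three characters (for example swe.srt) also yields a language code.

```python
from pathlib import PurePath


def language_from_filename(name):
    """Return the language code found in a subtitle file name, or None."""
    stem = PurePath(name).stem          # remove the file extension
    last = stem.split("-")[-1]          # split on "-" and take the last part
    if 0 < len(last) <= 3:              # at most three characters
        return last
    return None


def assign_languages(names):
    """Map file names to languages, using und, und1, und2, ... as fallback."""
    result = {}
    unknown = 0
    for name in names:
        lang = language_from_filename(name)
        if lang is None:
            lang = "und" if unknown == 0 else f"und{unknown}"
            unknown += 1
        result[name] = lang
    return result
```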
Example SMIL files
In the following, we give examples to show some possible variations of supported SMIL files.
Example 1 - video and audio from all files
This example has width and height for the video. That information will be discarded.
There are only two distinct combinations of language and bitrate for audio, so only two variants, audio_aac_eng_128kbps and audio_aac_eng_192kbps, will be generated.
<?xml version="1.0" encoding="UTF-8"?>
<smil>
  <body>
    <switch>
      <video height="360" src="profile1.mp4" systemLanguage="eng" width="480">
        <param name="videoBitrate" value="500000"/>
        <param name="audioBitrate" value="128000"/>
      </video>
      <video height="480" src="profile2.mp4" systemLanguage="eng" width="720">
        <param name="videoBitrate" value="800000"/>
        <param name="audioBitrate" value="128000"/>
      </video>
      <video height="540" src="profile3.mp4" systemLanguage="eng" width="960">
        <param name="videoBitrate" value="1300000"/>
        <param name="audioBitrate" value="128000"/>
      </video>
      <video height="720" src="profile4.mp4" systemLanguage="eng" width="1280">
        <param name="videoBitrate" value="2300000"/>
        <param name="audioBitrate" value="192000"/>
      </video>
      <video height="1080" src="profile5.mp4" systemLanguage="eng" width="1920">
        <param name="videoBitrate" value="5000000"/>
        <param name="audioBitrate" value="192000"/>
      </video>
    </switch>
  </body>
</smil>
Example 2 - audioindex queries
This example shows extraction of audio tracks using the audioindex query parameter.
The mp4: “scheme” is not needed, but is supported for legacy reasons. The mp4:/// scheme is also supported for the same reason.
<?xml version="1.0"?>
<smil>
  <body>
    <switch>
      <video src="mp4:video1.mp4?audioindex=0" system-language="eng" audio-bitrate="96000">
        <param name="audioOnly" value="TRUE"/>
      </video>
      <video src="mp4:video1.mp4?audioindex=1" system-language="ger" audio-bitrate="96000">
        <param name="audioOnly" value="TRUE"/>
      </video>
      <video src="video2.mp4" system-bitrate="2000000">
        <param name="videoOnly" value="TRUE"/>
      </video>
      <video src="video1.mp4" system-bitrate="5000000">
        <param name="videoOnly" value="TRUE"/>
      </video>
      <textstream src="subtitles.ttml" system-language="eng">
      </textstream>
    </switch>
  </body>
</smil>
Example 3 - displayName and role parameters
This example uses the displayName parameter for audio and subtitles, and the role parameter for subtitles.
<?xml version="1.0" encoding="utf-8"?>
<smil>
  <body>
    <switch>
      <video src="video800.mp4" system-bitrate="800000">
        <param name="videoOnly" value="TRUE"/>
      </video>
      <video src="video400.mp4" system-bitrate="400000">
        <param name="videoOnly" value="TRUE"/>
      </video>
      <video src="audio.mp4?audioindex=0" system-language="eng" audio-bitrate="256000">
        <param name="audioOnly" value="TRUE"/>
        <param name="displayName" value="English 6ch"/>
      </video>
      <video src="audio.mp4?audioindex=1" system-language="eng" audio-bitrate="192000">
        <param name="audioOnly" value="TRUE"/>
        <param name="displayName" value="English 2ch"/>
      </video>
      <srt src="swe.srt" language="swe">
        <param name="displayName" value="svenska"/>
        <param name="role" value="subtitle"/>
      </srt>
      <srt src="swe_cc.srt" language="swe">
        <param name="displayName" value="svenska (CC)"/>
        <param name="role" value="caption"/>
      </srt>
      <textstream src="eng.stl" system-language="eng">
        <param name="displayName" value="English"/>
        <param name="role" value="caption"/>
      </textstream>
    </switch>
  </body>
</smil>
Example 4 - The simplest possible - same as no SMIL
This example shows a SMIL file where all parameters are extracted automatically.
In this case, all video sources contain audio with the language set to eng in the mdhd box in the mp4 files, and the subtitle languages can be extracted from the file names.
<?xml version="1.0" encoding="UTF-8"?>
<smil>
  <body>
    <switch>
      <video src="0.mp4"/>
      <video src="1.mp4"/>
      <video src="2.mp4"/>
      <video src="3.mp4"/>
      <srt src="xyz-eng.srt"/>
      <srt src="xyz-spa.srt"/>
    </switch>
  </body>
</smil>
The generated tracks are:
- 4 video tracks with different bitrates
- 2 audio tracks with different bitrates, but the same language “eng”
- 2 subtitle tracks with language codes “eng” and “spa”
In this particular case, the SMIL file just provides a list of files and no extra parameters.
Therefore, it is also possible to specify the directory of this file as input to ew-vodingest and get exactly the same result.
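Going the other way, a minimal SMIL file of this kind can be generated from a list of file names and then extended with parameters by hand. The sketch below uses the Python standard library. The function name is hypothetical, and the choice of the srt tag for .srt files and textstream for the other subtitle extensions is one valid mapping according to the stream types described earlier, not a requirement.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePath

TEXTSTREAM_EXTENSIONS = {".ttml", ".webvtt", ".vtt", ".stl"}


def smil_from_files(names):
    """Return a minimal SMIL document (as a string) listing the given files."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    switch = ET.SubElement(body, "switch")
    for name in names:
        ext = PurePath(name).suffix.lower()
        if ext == ".mp4":
            tag = "video"
        elif ext == ".srt":
            tag = "srt"
        elif ext in TEXTSTREAM_EXTENSIONS:
            tag = "textstream"
        else:
            continue  # files with other extensions are skipped
        ET.SubElement(switch, tag, src=name)
    return ET.tostring(smil, encoding="unicode")
```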
4 - Releases
ESB3021 - ew-vodingest releases
4.1 - Release 0.6.2
Features:
- Ingest SMIL + mp4 + subtitle assets in combined ESF + DASH OnDemand format
- Support AVC, HEVC video
- Support AAC, AC-3 audio
- Support TTML, WebVTT, SRT, STL subtitles
- Support displayName parameter for audio and subtitles
- Support role (caption or subtitle) parameter for subtitles
Known limitations:
- Requires mounted file system for input and output
- Requires constant GoP duration and constant sample steps
- Does not preserve subtitle styling
- Logging is not nice
Release information
- Date: 2022-03-28
- Type: Beta for testing
- Expiration Date: 2022-09-24