SMIL manifest
Input SMIL syntax
SMIL is an XML format for multimedia presentations. It can be used instead of an HLS master playlist or a DASH Media Presentation Description (MPD) to specify which media files should be combined into an asset, along with some associated metadata parameters. The level of detail, however, is much lower than in the other formats. ew-vodingest supports the import of MP4 files for video and audio, together with subtitle files.
Our usage of SMIL follows the legacy Wowza format, but we have added some parameters, such as displayName for HLS and role for subtitles.
Basic structure
SMIL files should specify all relevant media files in a switch block in the body:
<?xml version="1.0" encoding="UTF-8"?>
<smil>
<body>
<switch>
<video src="movie1.mp4" ... />
<video src="audio.mp4" ... />
<textstream ... />
<srt ... />
</switch>
</body>
</smil>
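As an illustration of this structure, the following Python sketch reads a SMIL document and lists the stream elements found in the switch block. It is not part of ew-vodingest; the function name list_sources is hypothetical, and the src values for the subtitle elements are made up for the example.

```python
import xml.etree.ElementTree as ET

SMIL = """<?xml version="1.0" encoding="UTF-8"?>
<smil>
  <body>
    <switch>
      <video src="movie1.mp4"/>
      <video src="audio.mp4"/>
      <textstream src="subtitles.ttml"/>
      <srt src="swedish.srt"/>
    </switch>
  </body>
</smil>"""


def list_sources(smil_text):
    """Return (tag, src) pairs for every element inside body/switch."""
    # Parse from bytes so that the XML encoding declaration is honoured.
    root = ET.fromstring(smil_text.encode("utf-8"))
    switch = root.find("body/switch")
    if switch is None:
        raise ValueError("no <switch> block found in <body>")
    return [(child.tag, child.get("src")) for child in switch]


for tag, src in list_sources(SMIL):
    print(tag, src)
```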
Stream types
The supported stream types are <video>, which means video, audio, or both; <textstream>, which is a subtitle file; and <srt>, which is a subtitle file in SRT format.
Video/Audio streams
Audio and video streams are identified by the video tag.
Minimal configuration
The simplest variant is to give just a src attribute:
<video src="videoAndAudio.mp4"/>
If this is the only information given, all video and audio tracks will be extracted and given names following the patterns
media type | pattern | example |
---|---|---|
video | video_<codec>_<bitrate> | video_hevc_9000kbps |
audio | audio_<codec>_<lang>_<bitrate> | audio_aac_en_256kbps |
depending on what tracks and codecs are available. The bitrate is calculated from the file size and duration, and the language is extracted from the mp4 file if available in an elng box, or in the mdhd box as a fallback.
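The naming patterns can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the size is taken to be the number of bytes belonging to one track, and rounding to whole kbps is a guess at how the bitrate is formatted.

```python
def bitrate_kbps(size_bytes, duration_seconds):
    """Average bitrate in kbps from size and duration (rounding is an assumption)."""
    return round(size_bytes * 8 / duration_seconds / 1000)


def video_track_name(codec, kbps):
    """Name following the video_<codec>_<bitrate> pattern."""
    return f"video_{codec}_{kbps}kbps"


def audio_track_name(codec, lang, kbps):
    """Name following the audio_<codec>_<lang>_<bitrate> pattern."""
    return f"audio_{codec}_{lang}_{kbps}kbps"


# 32,000,000 bytes over 1000 seconds is 256,000 bits per second.
kbps = bitrate_kbps(32_000_000, 1000)
print(audio_track_name("aac", "en", kbps))
print(video_track_name("hevc", 9000))
```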
If multiple files are specified in the SMIL file, tracks will be extracted from all of them, but if the resulting names (built from media type, codec, language, and bitrate) coincide, only one copy will be kept. This makes it possible to import files that all include the same audio bitrate and language, but different video bitrates.
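The effect of coinciding names can be sketched as below. The helper deduplicate is hypothetical, and keeping the first copy encountered is an assumption; the text above only states that one copy is kept.

```python
def deduplicate(tracks):
    """Keep one source per track name.

    tracks is an iterable of (track_name, source_file) pairs.
    Keeping the first occurrence is an assumption of this sketch.
    """
    kept = {}
    for name, source in tracks:
        kept.setdefault(name, source)
    return kept


tracks = [
    ("audio_aac_eng_128kbps", "profile1.mp4"),
    ("audio_aac_eng_128kbps", "profile2.mp4"),
    ("audio_aac_eng_192kbps", "profile4.mp4"),
]
print(deduplicate(tracks))
```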
Specifying the bitrate
The system-bitrate attribute is used to specify the bitrate for the stream.
<video src="video.mp4" system-bitrate="2500000" />
Alternatively, bitrates can be specified as <param> values:
<video src="videoAndAudio.mp4">
<param name="videoBitrate" value="2500000"/>
<param name="audioBitrate" value="128000"/>
</video>
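To make the two ways of declaring bitrates concrete, here is a small Python sketch that collects every bitrate declared on a single video element. The function name declared_bitrates is hypothetical, and the sketch deliberately does not say which value wins if several are present, since that is not stated here.

```python
import xml.etree.ElementTree as ET

BITRATE_ATTRIBUTES = ("system-bitrate", "audio-bitrate")
BITRATE_PARAMS = ("videoBitrate", "audioBitrate")


def declared_bitrates(video_xml):
    """Collect bitrates (in bits per second) declared on one <video> element."""
    elem = ET.fromstring(video_xml)
    found = {}
    for attr in BITRATE_ATTRIBUTES:
        if elem.get(attr) is not None:
            found[attr] = int(elem.get(attr))
    for param in elem.findall("param"):
        if param.get("name") in BITRATE_PARAMS:
            found[param.get("name")] = int(param.get("value"))
    return found


snippet = """<video src="videoAndAudio.mp4">
  <param name="videoBitrate" value="2500000"/>
  <param name="audioBitrate" value="128000"/>
</video>"""
print(declared_bitrates(snippet))
```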
Track selection and track-specific parameters
To gain finer control over the extraction of tracks and their parameters, it is possible to add extra parameters.
In particular, one can use the audioOnly and videoOnly keys to specify that only one type of media track should be extracted.
To extract video and audio as separate tracks from the same file, one could use the following snippet:
<video src="videoAndAudio.mp4" system-bitrate="2500000">
<param name="videoOnly" value="TRUE"/>
</video>
<video src="videoAndAudio.mp4" system-bitrate="128000">
<param name="audioOnly" value="TRUE"/>
</video>
In addition, one can add an audioindex query parameter to the src value to extract a specific audio track.
The audioindex value relates to the track ID inside an mp4 file, but is zero-based. The mapping is that ?audioindex=0 refers to the audio track with the lowest track ID, ?audioindex=1 to the one with the second lowest, and so on.
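The mapping from audioindex to track ID can be sketched as follows. The function name and the example track IDs are made up for illustration.

```python
def track_id_for_audioindex(audio_track_ids, audioindex):
    """Map a zero-based audioindex to an mp4 track ID.

    Index 0 is the audio track with the lowest track ID,
    index 1 the second lowest, and so on.
    """
    ordered = sorted(audio_track_ids)
    if not 0 <= audioindex < len(ordered):
        raise IndexError(f"audioindex {audioindex} out of range")
    return ordered[audioindex]


# A file whose audio tracks have IDs 2, 3 and 5 (track 1 being video).
print(track_id_for_audioindex([3, 2, 5], 0))
print(track_id_for_audioindex([3, 2, 5], 1))
```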
Here is an example that extracts the first two audio tracks, and gives them different parameters for language, bitrate, and displayName:
<video src="hev1_aac_mc.mp4?audioindex=0" system-language="dk" audio-bitrate="256000">
<param name="audioOnly" value="TRUE"/>
<param name="displayName" value="Danish 6ch"/>
</video>
<video src="hev1_aac_mc.mp4?audioindex=1" system-language="dk" audio-bitrate="192000">
<param name="audioOnly" value="TRUE"/>
<param name="displayName" value="Danish 2ch"/>
</video>
Audio Language
The language for an audio stream can be set using the system-language attribute:
<video src="mp4:video3.mp4?audioindex=0" system-language="eng"/>
For legacy reasons, one can alternatively use the attribute systemLanguage or language.
If the language is not specified, the 3-letter language code in the mdhd box will be used.
It will in turn be overridden by the optional elng box, which can contain any language code.
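The order of precedence described above can be sketched in Python. The function name and argument names are hypothetical, and returning None when nothing is available is an assumption of this sketch.

```python
def resolve_audio_language(smil_language=None, elng=None, mdhd=None):
    """Pick the audio language: SMIL attribute, then elng box, then mdhd box."""
    for candidate in (smil_language, elng, mdhd):
        if candidate:
            return candidate
    return None  # assumption: nothing is known about the language


print(resolve_audio_language(smil_language="eng", elng="en-GB", mdhd="und"))
print(resolve_audio_language(elng="en-GB", mdhd="eng"))
print(resolve_audio_language(mdhd="eng"))
```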
Subtitle input
The supported input subtitle formats are TTML, WebVTT, STL, and SRT. In all cases, a complete side-loaded file is expected. As part of the ESF format, the subtitles will be transformed into segmented wvtt. A complete WebVTT file will also be generated and referred to in the generated DASH manifest. The names of the output subtitle tracks have the form
media type | pattern | example |
---|---|---|
subtitles | subtitles_wvtt_<lang>_<role> | subtitles_wvtt_se_caption |
The role can be either caption or subtitle. If not specified, the role will not be included in the track name.
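A minimal Python sketch of the subtitle naming rule follows, with a hypothetical function name. It only shows how the optional role affects the name.

```python
def subtitle_track_name(lang, role=None):
    """Build subtitles_wvtt_<lang>_<role>, leaving out the role if not given."""
    parts = ["subtitles", "wvtt", lang]
    if role:
        parts.append(role)
    return "_".join(parts)


print(subtitle_track_name("se", "caption"))
print(subtitle_track_name("se"))
```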
Subtitle streams in TTML, WebVTT, STL, or SRT files are specified with the <textstream> or <srt> tags.
The language can be specified with the language attribute or, for textstream, with the system-language attribute, like:
<textstream src="subtitles.ttml" system-language="en" />
<srt src="swedish.srt" language="se"/>
The format is auto-detected from the file extension, which must be one of:
.ttml, .webvtt, .vtt, .stl, .srt
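Detection by file extension could look like the sketch below. The function name is hypothetical, and treating the extension case-insensitively is an assumption that the text above does not confirm.

```python
import os

SUBTITLE_FORMATS = {
    ".ttml": "TTML",
    ".webvtt": "WebVTT",
    ".vtt": "WebVTT",
    ".stl": "STL",
    ".srt": "SRT",
}


def detect_subtitle_format(path):
    """Return the subtitle format implied by the file extension."""
    # Lower-casing the extension is an assumption of this sketch.
    extension = os.path.splitext(path)[1].lower()
    try:
        return SUBTITLE_FORMATS[extension]
    except KeyError:
        raise ValueError(f"unsupported subtitle extension: {extension!r}") from None


print(detect_subtitle_format("subtitles.ttml"))
print(detect_subtitle_format("swedish.srt"))
```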
Multiple languages in the same TTML file are not supported.
There is no bitrate specified for text streams. It will always be set to 1kbps.
Extracting language from subtitle file name
When the SMIL file is missing, or there is no language attribute for the subtitle files, ew-vodingest will try to extract a language from the file name.
The language extraction algorithm works like this:
- remove the file extension
- split the name on “-” characters
- if the last part is at most three characters long, use it as the language code
If a language is not found, the subtitle languages will be denoted “und”, “und1”, “und2”, etc.
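The steps above can be sketched in Python as follows. This is an interpretation of the description, not the actual implementation: the function names are hypothetical, and the sketch assumes that “und” suffixes are assigned in the order the files are listed.

```python
import os


def language_from_filename(filename):
    """Extract a language code from a subtitle file name, or return None."""
    stem = os.path.splitext(os.path.basename(filename))[0]  # drop the extension
    last_part = stem.split("-")[-1]  # split the name on "-" characters
    if 0 < len(last_part) <= 3:  # at most three characters
        return last_part
    return None


def assign_languages(filenames):
    """Return one language per file, using und, und1, und2, ... as fallbacks."""
    languages = []
    unknown_count = 0
    for name in filenames:
        lang = language_from_filename(name)
        if lang is None:
            lang = "und" if unknown_count == 0 else f"und{unknown_count}"
            unknown_count += 1
        languages.append(lang)
    return languages


print(assign_languages(["xyz-eng.srt", "movie-first.srt", "movie-second.srt"]))
```

Note that, read literally, a name without any “-” and with a short stem, such as swe.srt, also yields a language code, because the whole stem is then the last part.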
Example SMIL files
In the following, we give examples to show some possible variations of supported SMIL files.
Example 1 - video and audio from all files
This example specifies width and height for the video, but that information will be discarded.
There are only two distinct combinations of language and bitrate for audio, so only two variants, audio_aac_eng_128kbps and audio_aac_eng_192kbps, will be generated.
<?xml version="1.0" encoding="UTF-8"?>
<smil>
<body>
<switch>
<video height="360" src="profile1.mp4" systemLanguage="eng" width="480">
<param name="videoBitrate" value="500000"/>
<param name="audioBitrate" value="128000"/>
</video>
<video height="480" src="profile2.mp4" systemLanguage="eng" width="720">
<param name="videoBitrate" value="800000"/>
<param name="audioBitrate" value="128000"/>
</video>
<video height="540" src="profile3.mp4" systemLanguage="eng" width="960">
<param name="videoBitrate" value="1300000"/>
<param name="audioBitrate" value="128000"/>
</video>
<video height="720" src="profile4.mp4" systemLanguage="eng" width="1280">
<param name="videoBitrate" value="2300000"/>
<param name="audioBitrate" value="192000"/>
</video>
<video height="1080" src="profile5.mp4" systemLanguage="eng" width="1920">
<param name="videoBitrate" value="5000000"/>
<param name="audioBitrate" value="192000"/>
</video>
</switch>
</body>
</smil>
Example 2 - audioindex queries
This example shows extraction of audio tracks using the audioindex query parameter.
The mp4: “scheme” is not needed, but is supported for legacy reasons.
The mp4:/// scheme is also supported for the same reason.
<?xml version="1.0"?>
<smil>
<body>
<switch>
<video src="mp4:video1.mp4?audioindex=0" system-language="eng" audio-bitrate="96000">
<param name="audioOnly" value="TRUE"/>
</video>
<video src="mp4:video1.mp4?audioindex=1" system-language="ger" audio-bitrate="96000">
<param name="audioOnly" value="TRUE"/>
</video>
<video src="video2.mp4" system-bitrate="2000000">
<param name="videoOnly" value="TRUE"/>
</video>
<video src="video1.mp4" system-bitrate="5000000">
<param name="videoOnly" value="TRUE"/>
</video>
<textstream src="subtitles.ttml" system-language="eng">
</textstream>
</switch>
</body>
</smil>
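How a src value like the ones in this example might be split into a file path and an audioindex is sketched below. The function name parse_src is hypothetical, and stripping the legacy prefixes by simple string matching is an assumption about how they are handled.

```python
from urllib.parse import parse_qs

LEGACY_PREFIXES = ("mp4:///", "mp4:")  # longest prefix first


def parse_src(src):
    """Split a src attribute into (path, audioindex or None)."""
    for prefix in LEGACY_PREFIXES:
        if src.startswith(prefix):
            src = src[len(prefix):]
            break
    path, _, query = src.partition("?")
    params = parse_qs(query)
    audioindex = int(params["audioindex"][0]) if "audioindex" in params else None
    return path, audioindex


print(parse_src("mp4:video1.mp4?audioindex=1"))
print(parse_src("video2.mp4"))
```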
Example 3 - displayName and role parameters
This example uses the displayName parameter for audio and subtitles, and the role parameter for subtitles.
<?xml version="1.0" encoding="utf-8"?>
<smil>
<body>
<switch>
<video src="video800.mp4" system-bitrate="800000">
<param name="videoOnly" value="TRUE"/>
</video>
<video src="video400.mp4" system-bitrate="400000">
<param name="videoOnly" value="TRUE"/>
</video>
<video src="audio.mp4?audioindex=0" system-language="eng" audio-bitrate="256000">
<param name="audioOnly" value="TRUE"/>
<param name="displayName" value="English 6ch"/>
</video>
<video src="audio.mp4?audioindex=1" system-language="eng" audio-bitrate="192000">
<param name="audioOnly" value="TRUE"/>
<param name="displayName" value="English 2ch"/>
</video>
<srt src="swe.srt" language="swe">
<param name="displayName" value="svenska"/>
<param name="role" value="subtitle"/>
</srt>
<srt src="swe_cc.srt" language="swe">
<param name="displayName" value="svenska (CC)"/>
<param name="role" value="caption"/>
</srt>
<textstream src="eng.stl" system-language="eng">
<param name="displayName" value="English"/>
<param name="role" value="caption"/>
</textstream>
</switch>
</body>
</smil>
Example 4 - The simplest possible - same as no SMIL
This example shows a SMIL file where all parameters are extracted automatically.
In this case, all video sources contain audio with the language set to eng in the mdhd box of the mp4 files, and the subtitle languages can be extracted from the file names.
<?xml version="1.0" encoding="UTF-8"?>
<smil>
<body>
<switch>
<video src="0.mp4"/>
<video src="1.mp4"/>
<video src="2.mp4"/>
<video src="3.mp4"/>
<srt src="xyz-eng.srt"/>
<srt src="xyz-spa.srt"/>
</switch>
</body>
</smil>
The generated tracks are:
- 4 video tracks with different bitrates
- 2 audio tracks with different bitrates, but the same language “eng”
- 2 subtitle tracks with language codes “eng” and “spa”
In this particular case, the SMIL file just provides a list of files and no extra parameters.
Therefore, it is also possible to specify the directory of this file as input to ew-vodingest and get exactly the same result.