- Edition of the DASH video subsection in the state-of-the-art

acarlier 2019-10-17 21:03:43 +02:00
parent 966a294538
commit 965d82533c
4 changed files with 78 additions and 17 deletions


@ -40,6 +40,15 @@
year = {2011}
}
@techreport{rtp-std,
author={Schulzrinne, H. and Casner, S. and Frederick, R. and Jacobson, V.},
type={Standard},
key={RFC 1889},
month={january},
year={1996},
title={{RTP: A Transport Protocol for Real-Time Applications}}
}
@techreport{dash-std-full,
author={DASH},
type={Standard},
@ -668,3 +677,48 @@
year={2013},
organization={ACM}
}
@inproceedings{sideris2015mpeg,
title={MPEG-DASH users' QoE: The segment duration effect},
author={Sideris, Anargyros and Markakis, E and Zotos, Nikos and Pallis, Evangelos and Skianis, Charalabos},
booktitle={2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX)},
pages={1--6},
year={2015},
organization={IEEE}
}
@inproceedings{stohr2017sweet,
title={Where are the sweet spots?: A systematic approach to reproducible dash player comparisons},
author={Stohr, Denny and Fr{\"o}mmgen, Alexander and Rizk, Amr and Zink, Michael and Steinmetz, Ralf and Effelsberg, Wolfgang},
booktitle={Proceedings of the 25th ACM international conference on Multimedia},
pages={1113--1121},
year={2017},
organization={ACM}
}
@inproceedings{chiariotti2016online,
title={Online learning adaptation strategy for DASH clients},
author={Chiariotti, Federico and D'Aronco, Stefano and Toni, Laura and Frossard, Pascal},
booktitle={Proceedings of the 7th International Conference on Multimedia Systems},
pages={8},
year={2016},
organization={ACM}
}
@inproceedings{yadav2017quetra,
title={Quetra: A queuing theory approach to dash rate adaptation},
author={Yadav, Praveen Kumar and Shafiei, Arash and Ooi, Wei Tsang},
booktitle={Proceedings of the 25th ACM international conference on Multimedia},
pages={1130--1138},
year={2017},
organization={ACM}
}
@inproceedings{huang2019hindsight,
title={Hindsight: evaluate video bitrate adaptation at scale},
author={Huang, Te-Yuan and Ekanadham, Chaitanya and Berglund, Andrew J and Li, Zhi},
booktitle={Proceedings of the 10th ACM Multimedia Systems Conference},
pages={86--97},
year={2019},
organization={ACM}
}


@ -1,26 +1,26 @@
\frontmatter{}
\input{introduction/main}
%\input{introduction/main}
\resetstyle{}
\mainmatter{}
\input{foreword/main}
%\input{foreword/main}
\resetstyle{}
\input{state-of-the-art/main}
\resetstyle{}
\input{preliminary-work/main}
%\input{preliminary-work/main}
\resetstyle{}
\input{dash-3d/main}
%\input{dash-3d/main}
\resetstyle{}
\input{system-bookmarks/main}
%\input{system-bookmarks/main}
\resetstyle{}
\backmatter{}
\input{conclusion/main}
%\input{conclusion/main}


@ -1,5 +1,6 @@
\fresh{}
In this chapter, we present the related work on video and 3D.
As discussed in the previous chapter, video and 3D share many similarities and that is why this chapter will start with a review on video streaming.
Then, we proceed with presenting various 3D streaming techniques, including compression, geometry and texture compromise, and viewpoint dependent streaming.
We end this end this chapter by reviewing the related work regarding 3D navigation and interfaces.
In this chapter, we review part of the state of the art on multimedia streaming and interaction.
As discussed in the previous chapter, video and 3D share many similarities. Since there is already a large body of work on video streaming, we start this chapter with a review of this domain, with a particular focus on the DASH standard.
Then, we present various topics related to 3D streaming, including compression, geometry and texture compromise, and viewpoint-dependent streaming.
Finally, we end this chapter by reviewing the related work regarding 3D navigation and interfaces.


@ -2,17 +2,22 @@
\section{Video\label{sote:vide}}
Accessing a remote video through the Web has been a widely studied problem since the 1990s. The Real-time Transport Protocol (RTP, \cite{rtp-std}) was an early attempt to formalize audio and video streaming. The protocol allowed data to be transferred unidirectionally from a server to a client, and required the server to handle a separate session for each client. While this protocol can be useful in particular scenarios, such as video-conferencing, it cannot realistically scale to modern video streaming platforms such as YouTube or Netflix, which must serve millions of simultaneous clients.
Because of this limitation, and as increasing network capabilities made video streaming more and more common, a new trend emerged during the 2000s. Building on the widespread availability of HTTP servers, many industrial actors (Apple, Microsoft, Adobe, etc.) developed HTTP streaming systems to deliver multimedia content over the network. In an effort to bring interoperability between these actors, the MPEG group launched an initiative which eventually became a standard known as DASH, Dynamic Adaptive Streaming over HTTP.
\subsection{DASH\@: the standard for video streaming\label{sote:dash}}
Dynamic Adaptive Streaming over HTTP (DASH), or MPEG-DASH \citep{dash-std,dash-std-2}, is now a widely deployed
standard for streaming adaptive video content on the Web \citep{dash-std-full}, made to be simple and scalable.
DASH is based on a clever way of preparing and structuring a video in order to allow a great adaptability of the streaming without requiring any server side computation.
standard for adaptively streaming video on the Web \citep{dash-std-full}, made to be simple, scalable and interoperable.
DASH describes guidelines to prepare and structure video content, in order to make the streaming highly adaptable without requiring any server-side computation. The client should be able to make good decisions on which part of the content to download, based only on an estimation of the network constraints and on the information provided in a descriptive file: the MPD.
\subsubsection{DASH structure}
All the content structure is described in a Media Presentation Description (MPD) file, written in the XML format.
This file has 4 layers: the periods, the adaptation sets, the representations and the segments.
A MPD has a tree-structure, meaning that it has multiple periods, each period can have multiple adaptation sets, each adaptation set can have multiple representation, and each representation can have multiple segments.
An MPD has a hierarchical structure: it has multiple periods, each period can have multiple adaptation sets, each adaptation set can have multiple representations, and each representation can have multiple segments.
\paragraph{Periods.}
Periods are used to delimit content depending on time.
@ -34,8 +39,8 @@ Until this level in the MPD, content has been divided but it is still far from b
In fact, a representation of the images of a chapter of a movie is still a long video, and keeping such a big file is not possible since heavy files prevent streaming adaptability: if the user requests to change the level of resolution of a video, the system would either have to wait until the file is totally downloaded, or cancel the request, making all the progress done unusable.
Segments are used to prevent this issue.
They typically encode files that contain approximately one second of video, and give the software a greater ability to dynamically adapt to the system.
If a user wants to seek somewhere else in the video, only one second of data can be lost, and only one second of data needs to be downloaded for the playback to resume.
They typically encode files that contain two to ten seconds of video, and give the software a greater ability to dynamically adapt to the system.
If a user wants to seek somewhere else in the video, only one segment of data can be lost, and only one segment of data needs to be downloaded for the playback to resume. The impact of the segment duration has been investigated in many works, including \citep{sideris2015mpeg, stohr2017sweet}. For example, \citet{stohr2017sweet} discuss how the segment duration affects the streaming: short segments lower the initial delay and give the best Quality of Experience with respect to stalling, but make the total downloading time of the video longer because of overhead.
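As a rough illustration of this overhead, assume (this is our simplification, not a model taken from the cited works) that each segment request costs a fixed overhead $t_o$, that the bandwidth $B$ is constant, and that segments are downloaded sequentially. A video of duration $T$ encoded at bitrate $r$ and cut into segments of duration $d$ then takes
\[
    T_{\mathrm{dl}}(d) = \frac{T}{d}\left(t_o + \frac{r\,d}{B}\right) = \frac{T\,t_o}{d} + \frac{T\,r}{B}
\]
to download. The second term does not depend on $d$, while the first one grows as segments get shorter. With the hypothetical values $T = 600$\,s and $t_o = 0.1$\,s, one-second segments add $60$\,s of overhead, whereas ten-second segments add only $6$\,s.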
\subsubsection{Content preparation and server}
@ -45,9 +50,10 @@ All the intelligence and the decision making is moved to the client side.
This is one of the strengths of DASH: no powerful server is required, and since static HTTP servers are stable and efficient, all DASH clients can benefit from them.
\subsubsection{Client side computation}
\subsubsection{Client side adaptation}
A client typically starts by downloading the MPD file, and then proceeds on downloading segments of the different adaptation sets that he needs, estimating itself its downloading speed and choosing itself whether it needs to change representation or not.
A client typically starts by downloading the MPD file, and then proceeds to download segments from the different adaptation sets. While the standard describes well how to structure content on the server side, the client may be freely implemented to take into account the specificities of a given application.
The most important part of any implementation of a DASH client is called the adaptation logic. This component takes into account a set of parameters, such as network conditions (bandwidth or throughput, for example), buffer states or segment sizes, to decide which segments should be downloaded next. Most industrial actors have their own adaptation logic, and many more have been proposed in the literature. A thorough review is beyond the scope of this state of the art, but examples include \citet{chiariotti2016online}, who formulate the problem in a reinforcement learning framework, \citet{yadav2017quetra}, who formulate the problem using queuing theory, and \citet{huang2019hindsight}, who use a formulation derived from the knapsack problem.
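To make the role of the adaptation logic concrete, one of the simplest possible strategies is purely throughput-based. As a minimal sketch (the notation is ours and does not correspond to any of the methods cited above), let $r_1 < \dots < r_n$ be the bitrates of the representations listed in the MPD, $\hat{B}$ the current throughput estimate of the client, and $\alpha \in (0, 1]$ a safety margin. The client then requests the next segment at bitrate
\[
    r^\star = \max\left(\{r_1\} \cup \{r_i : r_i \leq \alpha \hat{B}\}\right),
\]
that is, the highest bitrate that fits within the estimated throughput, falling back to the lowest one when none does. The approaches mentioned above can be seen as replacing this rule with more elaborate decision procedures that also consider, for instance, the buffer state.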
\subsection{DASH-SRD}
DASH has already been adopted in the setting of video streaming.