|
|
|
|
@@ -1,9 +1,7 @@
|
|
|
|
|
\fresh{}
|
|
|
|
|
|
|
|
|
|
\section{Video\label{sote:vide}}
|
|
|
|
|
|
|
|
|
|
Accessing remote video over the Web has been a widely studied problem since the 1990s. The Real-time Transport Protocol (RTP,~\cite{rtp-std}) was an early attempt
to formalize audio and video streaming. The protocol allowed data to be transferred in one direction, from a server to a client, and required the server to maintain a separate session for each client. While this protocol can be useful in particular scenarios, such as video conferencing, it cannot realistically scale to modern video streaming platforms such as YouTube or Netflix, which must serve millions of simultaneous clients.
|
|
|
|
|
|
|
|
|
|
Because of this limitation, and as increasing network capacity made video streaming more and more common, a new trend emerged during the 2000s. Building on the widespread availability of HTTP servers, many industrial actors (Apple, Microsoft, Adobe, etc.) developed HTTP streaming systems to deliver multimedia content over the network. To bring interoperability between these systems, the MPEG group launched an initiative that eventually became a standard known as DASH, Dynamic Adaptive Streaming over HTTP.
|
|
|
|
|
|
|
|
|
|
@@ -11,7 +9,7 @@ Because of this limitation, and while the increasing network capabilities made v
|
|
|
|
|
|
|
|
|
|
Dynamic Adaptive Streaming over HTTP (DASH), or MPEG-DASH \citep{dash-std,dash-std-2}, is now a widely deployed
|
|
|
|
|
standard for adaptive video streaming on the Web \citep{dash-std-full}, designed to be simple, scalable, and interoperable.
|
|
|
|
|
DASH describes guidelines for preparing and structuring video content so that streaming can adapt to varying conditions without requiring any server-side computation. The client should be able to make good decisions about which parts of the content to download, based only on an estimate of the network constraints and on the information provided in a descriptive file: the Media Presentation Description (MPD).
|
|
|
|
|
|
|
|
|
|
\subsubsection{DASH structure}
|
|
|
|
|
|
|
|
|
|
@@ -52,12 +50,12 @@ This is one of the DASH strengths: no powerful server is required, and since sta
|
|
|
|
|
|
|
|
|
|
\subsubsection{Client side adaptation}
|
|
|
|
|
|
|
|
|
|
A client typically starts by downloading the MPD file, and then proceeds to download segments from the different adaptation sets. While the standard describes precisely how to structure content on the server side, the client may be freely implemented to take into account the specificities of a given application.
The most important part of any DASH client implementation is the adaptation logic. This component takes into account a set of parameters, such as network conditions (bandwidth or throughput, for example), buffer state, or segment sizes, to decide which segments should be downloaded next. Most industrial actors have their own adaptation logic, and many more have been proposed in the literature. A thorough review is beyond the scope of this state of the art, but examples include \citet{chiariotti2016online}, who formulate the problem in a reinforcement learning framework, \citet{yadav2017quetra}, who formulate the problem using queuing theory, and \citet{huang2019hindsight}, who use a formulation derived from the knapsack problem.
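
As an illustration only, a very simple adaptation logic could be sketched as follows. This is not the method of any of the works cited above: it is a minimal throughput-based heuristic, and the names \texttt{Representation}, \texttt{estimate\_throughput} and \texttt{choose\_representation}, as well as the smoothing factor, the safety margin and the buffer threshold, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Representation:
    """One encoding of the content, as it might be listed in an MPD (hypothetical model)."""
    rep_id: str
    bandwidth_bps: int  # bitrate required to stream this representation


def estimate_throughput(samples_bps, alpha=0.8):
    """Smooth measured download throughputs with an exponential moving average.

    samples_bps must contain at least one measurement; alpha is an assumed
    smoothing factor (closer to 1 means slower reaction to new samples).
    """
    estimate = samples_bps[0]
    for sample in samples_bps[1:]:
        estimate = alpha * estimate + (1 - alpha) * sample
    return estimate


def choose_representation(representations, throughput_bps, buffer_s,
                          low_buffer_s=5.0, safety=0.8):
    """Pick the highest bitrate that fits within a safety margin of the estimate.

    When the playback buffer is nearly empty, fall back to the lowest bitrate
    to reduce the risk of a stall. Thresholds are illustrative assumptions.
    """
    ordered = sorted(representations, key=lambda r: r.bandwidth_bps)
    if buffer_s < low_buffer_s:
        return ordered[0]
    budget = safety * throughput_bps
    affordable = [r for r in ordered if r.bandwidth_bps <= budget]
    return affordable[-1] if affordable else ordered[0]
```

A client would call \texttt{choose\_representation} before each segment request, using the throughput observed on previous downloads. The more elaborate approaches cited above replace this greedy rule with a learned or optimized policy.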
|
|
|
|
|
|
|
|
|
|
\subsection{DASH-SRD}
|
|
|
|
|
Now widely adopted for video streaming, DASH has been adapted to various other contexts.
|
|
|
|
|
DASH-SRD (Spatial Relationship Description,~\citep{dash-srd}) is a feature that extends the DASH standard to allow streaming only a spatial subpart of a video to a device.
|
|
|
|
|
It works by encoding a video at multiple resolutions, and tiling the highest resolutions as shown in Figure~\ref{sota:srd-png}.
|
|
|
|
|
That way, a client can choose to download either the low resolution of the whole video or higher resolutions of a subpart of the video.
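
To make this choice concrete, the following sketch computes which high-resolution tiles a client would need for a given viewport. It is only an illustration under stated assumptions: the tiling is taken to be a uniform grid, coordinates are expressed in pixels of the full-resolution video, and the function name \texttt{tiles\_for\_viewport} is hypothetical rather than part of the standard.

```python
import math


def tiles_for_viewport(viewport, video_size, grid):
    """Return the (col, row) indices of the tiles that intersect the viewport.

    viewport: (x, y, width, height) in pixels of the full-resolution video.
    video_size: (width, height) of the full-resolution video.
    grid: (columns, rows) of the tiling, assumed uniform in this sketch.
    """
    x, y, w, h = viewport
    video_w, video_h = video_size
    cols, rows = grid
    tile_w = video_w / cols
    tile_h = video_h / rows

    # Clamp the viewport to the video area.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(video_w, x + w), min(video_h, y + h)
    if x0 >= x1 or y0 >= y1:
        return []  # the viewport lies entirely outside the video

    first_col = int(x0 // tile_w)
    first_row = int(y0 // tile_h)
    # ceil(...) - 1 avoids selecting the next tile when the viewport
    # ends exactly on a tile border.
    last_col = math.ceil(x1 / tile_w) - 1
    last_row = math.ceil(y1 / tile_h) - 1

    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

A client could request only the tiles returned by this function at high resolution, and rely on the low-resolution version of the whole video for everything else.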
|
|
|
|
|
|
|
|
|
|
@@ -93,11 +91,10 @@ An example of such a property is given in Listing~\ref{sota:srd-xml}.
|
|
|
|
|
]{assets/state-of-the-art/video/srd.xml}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
Essentially, this feature is a way of achieving view-dependent streaming, since the client only displays a part of the video and can avoid downloading content that will not be displayed. While Figure~\ref{sota:srd-png} illustrates how DASH-SRD can be used in the context of zoomable video streaming, the ideas developed in DASH-SRD have proven particularly useful in the context of 360-degree video streaming (see, for example, \citep{ozcinar2017viewport}).
|
|
|
|
|
This is especially relevant to 3D streaming, where the same pattern occurs: the user views only a part of the content at any given time.
|
|
|
|
|
|
|
|
|
|
% \subsection{Prefetching in video streaming}
|
|
|
|
|
% \copied{}
|
|
|
|
|
%
|
|
|
|
|
% We briefly survey other research on prefetching that focuses on non-continuous interaction in other types of media.
|
|
|
|
|
%
|
|
|
|
|
|