Thomas Forgione 2019-08-29 16:51:42 +02:00
parent afb8bee33d
commit ae089a0da2
No known key found for this signature in database
GPG Key ID: 203DAEA747F48F41
15 changed files with 123 additions and 22 deletions

View File

@ -3,3 +3,7 @@
\newcommand{\missing}[1]{{\noindent\color{Red}Missing #1}}
\newcommand{\argmax}[1]{\underset{#1}{\mathrm{argmax}\ }}
\newcommand{\copied}{\color{blue}}
\newcommand{\fresh}{\color{ForestGreen}}
\newcommand{\resetstyle}{\color{black}}

View File

@ -1,3 +1,4 @@
\copied{}
\section{Client}\label{sec:dashclientspec}
In this section, we specify a DASH NVE client that exploits the preparation of the 3D content in an NVE for streaming.
@ -24,7 +25,7 @@ The utility is a function of a segment, either geometry or texture, and the curr
Let us first detail all the parameters available from the offline/static preparation of the 3D NVE\@.
These parameters are stored in the MPD file.
First, for each geometry segment $s^G$ there is a predetermined 3D area $\mathcal{A}_{3D}(s^G)$, equal to the sum of all triangle areas in this segment (in 3D); it is computed as the segments are created.
Note that the texture segments will have similar information, but computed at \textit{navigation time} $t_i$.
Note that the texture segments have similar information, but computed at \textit{navigation time} $t_i$.
The second piece of information stored in the MPD for all segments, geometry and texture alike, is the size of the segment (in kB).
Indeed, geometry segments have a similar number of faces, so their size is almost uniform.
For texture segments, the size is usually much smaller than that of geometry segments but varies a lot, since the number of pixels is divided by 4 between two successive resolutions.
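As an illustration, the 3D area $\mathcal{A}_{3D}(s^G)$ defined above can be computed with the minimal sketch below.
It assumes that a segment is given as a list of triangles, each made of three $(x, y, z)$ vertices; the function names are hypothetical and are not taken from the actual implementation.

```python
import math


def triangle_area(a, b, c):
    """Area of the 3D triangle (a, b, c): half the norm of the cross product of two edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def segment_area(triangles):
    """Sum of the areas of all triangles in a geometry segment (hypothetical layout)."""
    return sum(triangle_area(a, b, c) for a, b, c in triangles)


# Two right triangles that tile the unit square in the z = 0 plane.
unit_square = [
    ((0, 0, 0), (1, 0, 0), (0, 1, 0)),
    ((1, 0, 0), (1, 1, 0), (0, 1, 0)),
]
area = segment_area(unit_square)
```

Since this value only depends on the geometry, it can be computed once, when the segments are created.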
@ -46,7 +47,7 @@ Given the coordinates of the bounding box $\mathcal{BB}(AS^G)$ and the viewpoint
\subsubsection{Utility for geometry segments}
We now have all parameters to derive a utility measure of a geometry segment.
Utility for texture segments will follow from the geometric utility.
Utility for texture segments follows from the geometric utility.
The utility of a geometric segment $s^G$ for a viewpoint $v(t_i)$ is:
\begin{equation*}
@ -68,7 +69,7 @@ So, we define the utility:
where we sum over all geometry segments received before time $t_i$ that intersect $\Delta(T,t_i)$ and such that the adaptation set it belongs to is in the frustum.
This formula defines the utility of a texture segment by computing the linear combination of the utility of the geometry segments that use this texture, weighted by the proportion of area covered by the texture in the segment.
We compute the PSNR by using the MSE in the MPD and denote it $psnr(s^T)$.
We do this to acknowledge the fact that a texture at a greater resolution will have a higher utility than a lower resolution texture.
We do this to acknowledge the fact that a texture at a greater resolution has a higher utility than a lower resolution texture.
The equivalent term for geometry is 1 (and does not appear).
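For reference, the usual conversion from MSE to PSNR is recalled below.
It assumes 8-bit colour channels (peak value of 255), which is not stated in the text, and $\mathrm{mse}(s^T)$ is a hypothetical notation for the MSE stored in the MPD.
\begin{equation*}
    psnr(s^T) = 10 \log_{10}\left(\frac{255^2}{\mathrm{mse}(s^T)}\right)
\end{equation*}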
Having defined a utility on both geometry and texture segments, the client uses it next for its streaming strategy.

View File

@ -1,3 +1,4 @@
\copied{}
\section{Content preparation}\label{sec:dash3d}
In this section, we describe how we preprocess and store the 3D data of the NVE, consisting of a polygon soup, textures, and material information into a DASH-compliant Media Presentation Description (MPD) file.

View File

@ -1,3 +1,4 @@
\copied{}
\section{Evaluation}\label{sec:eval}
We now describe our setup and the data we use in our experiments. We present an evaluation of our system and a comparison of the impact of the design choices we introduced in the previous sections.

View File

@ -0,0 +1,43 @@
\fresh{}
\section{A detour by DASH for video}
\copied{}
Dynamic Adaptive Streaming over HTTP (DASH), or MPEG-DASH~\cite{stockhammer2011dynamic,Sodagar2011}, is now a widely deployed
standard for streaming adaptive video content on the Web~\cite{dashstandard}, made to be simple and scalable.
\fresh{}
DASH structures the content in a way that allows great adaptability during streaming without requiring any server-side computation.
\subsection{Content structure}
The content is split into pieces that are described in a Media Presentation Description (MPD) file, written in XML.
This file has four layers: periods, adaptation sets, representations, and segments.
Each period can have many adaptation sets, each adaptation set can have many representations, and each representation can have many segments.
\subsubsection{Periods}
Periods delimit content in time. They can be used to separate chapters, or to insert advertisements at the beginning, during, or at the end of a video.
\subsubsection{Adaptation sets}
Adaptation sets delimit content according to its format.
Each adaptation set has a MIME type, and all the representations and segments that it contains share this MIME type.
In videos, most of the time, each period has at least one adaptation set containing the images and one adaptation set containing the sound.
\subsubsection{Representations}
Representations are the level at which DASH offers the same content at different levels of resolution.
For example, an adaptation set containing images has a representation for each available resolution (e.g., 480p, 720p, 1080p).
This allows users to choose a representation and change it during the video. More importantly, since the client can estimate its download speed from the time past downloads took, it can find the optimal resolution, that is, the highest resolution that arrives on time and avoids stalling.
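As an illustration, the selection rule described above can be sketched as follows.
This is a minimal sketch and not the behaviour of any particular DASH player: the names \texttt{Representation} and \texttt{pick\_representation}, the \texttt{safety} margin and the bitrate values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    name: str            # e.g. "720p"
    bitrate_kbps: float  # average bitrate needed to play this representation in real time


def pick_representation(representations, throughput_kbps, safety=0.8):
    """Return the highest-bitrate representation that fits the estimated throughput.

    `safety` (hypothetical parameter) keeps a margin below the estimate to reduce stalls.
    Falls back to the lowest bitrate when nothing fits.
    """
    if not representations:
        raise ValueError("no representation available")
    budget = throughput_kbps * safety
    affordable = [r for r in representations if r.bitrate_kbps <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.bitrate_kbps)
    return min(representations, key=lambda r: r.bitrate_kbps)


# Illustrative bitrate ladder; the values are made up for the example.
ladder = [
    Representation("480p", 1500),
    Representation("720p", 3000),
    Representation("1080p", 6000),
]
choice = pick_representation(ladder, throughput_kbps=5000)
```

With an estimated throughput of 5000 kbps and a margin of 0.8, the budget is 4000 kbps, so the sketch selects the 720p representation.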
\subsubsection{Segments}
Down to this level of the MPD, content can be long.
For example, a representation of the images of a chapter of a movie can be large and slow to download.
However, downloading large files is not suitable for streaming because it prevents dynamic adaptation: if the user requests a change of resolution, the system would either have to wait until the file is completely downloaded, or cancel the request and discard all the progress made.
Segments prevent this behaviour. They typically hold approximately one second of video each, which lets the client adapt dynamically. If a user seeks somewhere else in the video, at most one second of data is lost, and only one second of data has to be downloaded for the playback to resume.
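To summarise the four layers, the hierarchy can be modelled as nested containers.
The sketch below is only an illustration: the class and field names are hypothetical and do not mirror the XML schema of the MPD, and the URLs are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    url: str           # where the client fetches this segment over HTTP
    duration_s: float  # playback duration covered by the segment


@dataclass
class Representation:
    label: str  # e.g. "720p"
    segments: List[Segment] = field(default_factory=list)


@dataclass
class AdaptationSet:
    mime_type: str  # shared by all representations and segments it contains
    representations: List[Representation] = field(default_factory=list)


@dataclass
class Period:
    name: str
    adaptation_sets: List[AdaptationSet] = field(default_factory=list)


def count_segments(periods):
    """Total number of segments reachable from a list of periods."""
    return sum(
        len(r.segments)
        for p in periods
        for a in p.adaptation_sets
        for r in a.representations
    )


def one_second_segments(prefix, n):
    """Build n one-second segments with made-up URLs under `prefix`."""
    return [Segment(f"{prefix}/{i}.m4s", 1.0) for i in range(n)]


video = AdaptationSet("video/mp4", [
    Representation("480p", one_second_segments("video/480p", 3)),
    Representation("720p", one_second_segments("video/720p", 3)),
])
audio = AdaptationSet("audio/mp4", [
    Representation("stereo", one_second_segments("audio/stereo", 3)),
])
periods = [Period("chapter-1", [video, audio])]
total = count_segments(periods)
```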
\subsection{Client side computation}
Once a video is encoded in DASH format, that is, once the files have been structured and the MPD has been generated, they can simply be put on a static HTTP server that does no computation other than serving files when it receives requests.
All the intelligence and the decision making are moved to the client side.
A client typically starts by downloading the MPD file, and then proceeds to download segments of the adaptation sets that it needs, estimating its own download speed and deciding by itself whether it needs to change representation.
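The download speed estimation mentioned above can be sketched as follows.
This is one possible estimator among many, given under stated assumptions: it takes the harmonic mean of the speeds of the last few downloads, and the class name and the window size are hypothetical.

```python
from collections import deque


class ThroughputEstimator:
    """Estimate download speed from the most recent segment downloads."""

    def __init__(self, window=5):
        # Keep only the last `window` samples (hypothetical default).
        self.samples = deque(maxlen=window)

    def add_download(self, size_bytes, duration_s):
        """Record one finished download as a speed sample in bytes per second."""
        if size_bytes <= 0 or duration_s <= 0:
            raise ValueError("size and duration must be positive")
        self.samples.append(size_bytes / duration_s)

    def estimate(self):
        """Harmonic mean of the samples, or None before any download finishes."""
        if not self.samples:
            return None
        return len(self.samples) / sum(1.0 / s for s in self.samples)


estimator = ThroughputEstimator()
estimator.add_download(1_000_000, 1.0)  # 1 MB in 1 s
estimator.add_download(1_000_000, 2.0)  # 1 MB in 2 s
speed = estimator.estimate()
```

The harmonic mean gives less weight to short bursts of high speed than the arithmetic mean, which makes the estimate more conservative.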
\subsection{On our way to DASH-3D}
DASH is made to be format-agnostic, and even though it is almost only applied to video streaming nowadays, we believe it is also suitable for 3D streaming. Periods are of little use for a scene that does not evolve over time, but adaptation sets allow us to separate our content into geometry and textures, and give answers to the questions raised in the conclusion of the previous chapter.

View File

@ -1,20 +1,14 @@
\chapter{DASH-3D}
\input{dash-3d/introduction}
\resetstyle{}
\input{dash-3d/content-preparation}
\resetstyle{}
\input{dash-3d/client}
\resetstyle{}
\input{dash-3d/evaluation}
\resetstyle{}
% \section{Streaming}
%
% \subsection{Content preparation}
%
% \written{ACMMM 18}
%
% \subsection{Client}
%
% \written{ACMMM 18}
%
% \section{Rendering}
%
% \missing{everything}
%

View File

@ -8,6 +8,17 @@
\begin{document}
\listoftodos{}
\vspace{2cm}
\copied{}
Text copied from other articles will be in this color
\fresh{}
Text that was freshly written will be in this color
\resetstyle{}
\makeflyleaf{}
\tableofcontents
\input{plan}

View File

@ -1,3 +1,8 @@
\input{preliminary-work/main}
\resetstyle{}
\input{dash-3d/main}
\resetstyle{}
\input{system-bookmarks/main}
\resetstyle{}

View File

@ -1,3 +1,5 @@
\copied{}
\section{Impact of 3D Bookmarks on Navigation}\label{sec:3dnavigation}
We now describe an experiment that we conducted on 51 participants, with two goals in mind.
@ -109,8 +111,8 @@ Table~\ref{t:questions} shows the list of questions.
\toprule
& Questions & Answers \\
\midrule
1 & What was the difficulty level WITHOUT recommendation? & 3.04 / 5 $\pm0.31$ (99\% confidence interval) \\
2 & What was the difficulty level WITH recommendation? & 2.15 / 5 $\pm0.30$ (99\% confidence interval) \\
1 & What was the difficulty level WITHOUT recommendation? & 3.04 / 5 $\pm0.31$ \\
2 & What was the difficulty level WITH recommendation? & 2.15 / 5 $\pm0.30$ \\
3 & Did the recommendations help you to find the coins? & 42 Yes, 5 No\\
4 & Did the recommendations help you to browse the scene? & 49 Yes, 2 No\\
5 & Do you think recommendations can be helpful? & 49 Yes, 2 No\\
@ -118,7 +120,7 @@ Table~\ref{t:questions} shows the list of questions.
7 & Did you enjoy this? & 36 Yes, 3 No\\
\bottomrule
\end{tabular}
\caption{List of questions in the questionnaire and summary of answers.}\label{t:questions}
\caption{List of questions in the questionnaire and summary of answers. For questions 1 and 2, the margins are 99\% confidence intervals.}\label{t:questions}
\end{table}
\textbf{Participants}.

View File

@ -0,0 +1,23 @@
\fresh{}
\section{Conclusion}
In this chapter, we have described a basic interface that allows a user to navigate in a scene that is being streamed.
It allowed us to understand the problems linked to the dynamicity of both the user behaviour and the 3D content:
\begin{itemize}
\item Navigating in a 3D scene can be complex, due to the many degrees of freedom, and tweaking the interface can increase the user's Quality of Experience.
\item The tweaks made to the interface may negatively affect the streaming side of the system.
\item Depending on how the interface is tweaked, the behaviour of the users may change and heuristics can be determined to benefit from this.
\end{itemize}
However, the system described in this chapter has some drawbacks:
\begin{itemize}
\item \textbf{It doesn't support materials and textures}: these elements are downloaded at the beginning of the interaction, and since they can have a massive size, this solution is not satisfactory for a system streaming an NVE\@.
\item \textbf{It still requires a heavy load on the server side}: even though the server is not performing online rendering of the scene, it still has to perform frustum and backface culling to find the faces to send to the client, and it also has to keep track of what each client has already downloaded, and what remains to be downloaded.
\item \textbf{The performance of the rendering has not been taken into account}: a system for navigating in 3D scenes must have a sufficient framerate to guarantee a good Quality of Experience for users, and this chapter does not tackle at any point the difficulty of performing many tasks at the same time (downloading data, uploading the OpenGL buffers, managing the user interaction, rendering the scene, etc.).
\item \textbf{No multi-resolution techniques are used}: in modern 3D streaming, multi-resolution is a must-have. It spares users from waiting until all the data has arrived, by giving them a global, lower-resolution view of the content they are trying to access.
\end{itemize}
Having learned these lessons, we show in the next chapter what can be done to alleviate these issues.

View File

@ -2,5 +2,13 @@
\newcommand{\NoReco}{\textsf{NoBM \xspace}}
\newcommand{\Viewports}{\textsf{VP \xspace}}
\newcommand{\Arrows}{\textsf{Ar \xspace}}
\input{preliminary-work/bookmarks-impact}
\resetstyle{}
\input{preliminary-work/streaming}
\resetstyle{}
\input{preliminary-work/conclusion}
\resetstyle{}

View File

@ -1,3 +1,5 @@
\copied{}
\section{Impact of 3D Bookmarks on Streaming}\label{s:system}
\subsection{3D Model Streaming}

View File

@ -1,3 +1,4 @@
\copied{}
\section{Adding bookmarks into DASH NVE framework}\label{sec:bookmarks}
In this section, we explain how to include a new interaction in the system described in Section~\ref{sec:dash3d}.
@ -116,7 +117,7 @@ We thus render the thumbnail with the mask of already downloaded segments superi
\subsection{Loader modifications}
We build on the loader introduced in~\cite{forgione2018dash} (Algorithm 1) to implement a client adaptation logic.
We include a bookmark adaptation logic such that (i) when a bookmark is hovered for the first time, the corresponding images (see Listing~\ref{bookmark-as}) are downloaded, and (ii) when a bookmark is clicked, we switch from utility $\mathcal{U}$ to true utility $\mathcal{U}^*$ to determine which segments to download next.
We include a bookmark adaptation logic such that (i) when a bookmark is hovered for the first time, the corresponding images (see Listing~\ref{listing:bookmark-as}) are downloaded, and (ii) when a bookmark is clicked, we switch from utility $\mathcal{U}$ to true utility $\mathcal{U}^*$ to determine which segments to download next.
\begin{algorithm}[th]
\SetKwInOut{Input}{input}

View File

@ -1,3 +1,7 @@
\chapter{System bookmarks}
\input{system-bookmarks/bookmark}
\resetstyle{}
\input{system-bookmarks/user-study}
\resetstyle{}

View File

@ -1,3 +1,4 @@
\copied{}
\section{Experiments}
\subsection{Setup}