Start merging ACMMM 2018
@@ -1,3 +1,5 @@
|
||||
\newcommand{\written}[1]{{\noindent\color{Green}Things written in #1}}
|
||||
\newcommand{\unpublished}[1]{{\noindent\color{Green}Things written but not published in #1}}
|
||||
\newcommand{\missing}[1]{{\noindent\color{Red}Missing #1}}
|
||||
|
||||
\newcommand{\argmax}[1]{\underset{#1}{\mathrm{argmax}\ }}
|
||||
|
||||
@@ -1,12 +1,17 @@
|
||||
\usepackage{multirow}
|
||||
\usepackage{amssymb}
|
||||
\usepackage{xspace}
|
||||
\usepackage{url}
|
||||
\usepackage{algorithm2e}
|
||||
\usepackage{datatool}
|
||||
\usepackage{pgfplots}
|
||||
\usepackage{tikz}
|
||||
\usepackage{graphicx}
|
||||
\usepackage{subfigure}
|
||||
\usepackage{todonotes}
|
||||
\usepackage{booktabs}
|
||||
\usepackage{tikz}
|
||||
\newcommand{\tikzwidth}{0.85\columnwidth}
|
||||
\newcommand{\tikzheight}{0.65\columnwidth}
|
||||
|
||||
\usepackage[colorlinks = true,
|
||||
linkcolor = blue,
|
||||
@@ -43,6 +48,29 @@ anchorcolor = blue]{hyperref}
|
||||
\makesomeone{judge}{4}{Quatri\`eme MEMBRE}{Charg\'e de Recherche}{Membre du Jury}
|
||||
\makesomeone{judge}{5}{Cinqui\`eme MEMBRE}{Charg\'e de Recherche}{Membre du Jury}
|
||||
|
||||
\definecolor{mygreen}{RGB}{0,139,0}
|
||||
\definecolor{mycolor1}{RGB}{148,0,211}
|
||||
\definecolor{mycolor2}{RGB}{0,158,115}
|
||||
\definecolor{mycolor3}{RGB}{86,180,233}
|
||||
\definecolor{mycolor4}{RGB}{230,159,0}
|
||||
\definecolor{mycolor5}{RGB}{240,228,66}
|
||||
\definecolor{mycolor6}{RGB}{0,114,178}
|
||||
\definecolor{mycolor7}{RGB}{229,30,16}
|
||||
\definecolor{mycolor8}{RGB}{0,0,0}
|
||||
\definecolor{mycolor9}{RGB}{169,169,169}
|
||||
|
||||
\pgfplotscreateplotcyclelist{mystyle}{%
|
||||
{blue},
|
||||
{DarkGreen},
|
||||
{red},
|
||||
{mycolor4},
|
||||
{mycolor5},
|
||||
{mycolor6},
|
||||
{mycolor7},
|
||||
{mycolor8},
|
||||
{mycolor9},
|
||||
}
|
||||
|
||||
% Defines the colors from xcolor
|
||||
\preparecolorset{rgb}{}{}{%
|
||||
AliceBlue,.94,.972,1;%
|
||||
|
||||
134
src/dash-3d/client.tex
Normal file
@@ -0,0 +1,134 @@
|
||||
\section{Client}\label{sec:dashclientspec}
|
||||
|
||||
In this section, we specify a DASH NVE client that exploits the preparation of the 3D content in an NVE for streaming.
|
||||
|
||||
The generated MPD file describes the content organization so that the client gets all the necessary information to make educated decisions and query the 3D content it needs according to the available resources and current viewpoint.
|
||||
A camera path generated by a particular user is a set of viewpoints $v(t_i)$ indexed by a continuous time interval $t_i \in [t_1,t_{end}]$.
|
||||
|
||||
The DASH client first downloads the MPD file to get the material (.mtl) file containing information about all the geometry and textures available for the entire 3D model.
|
||||
At time instance $t_i$, the DASH client decides to download the appropriate segments containing the geometry and the texture to generate the viewpoint $v(t_{i+1})$ for the time instance $t_{i+1}$.
|
||||
|
||||
Starting from $t_1$, the camera continuously follows a camera path $C=\{v(t_i), t_i \in [t_1,t_{end}]\}$, along which downloading opportunities are strategically exploited to sequentially query the most useful segments.
|
||||
|
||||
\subsection{Segment Utility}\label{subsec:utility}
|
||||
|
||||
Unlike video streaming, where the bitrate of each segment correlates with the quality of the video received, for 3D content, the size (in bytes) of the content does not necessarily correlate well to its contribution to visual quality.
|
||||
A large polygon with huge visual impact takes the same number of bytes as a tiny polygon.
|
||||
Further, the visual impact is \textit{view dependent} --- a large object that is far away or out of view does not contribute to the visual quality as much as a smaller object that is closer to the user.
|
||||
As such, it is important for a DASH-based NVE client to estimate the usefulness of a given segment to download, so that it can make good decisions about what to download.
|
||||
We call this usefulness the \textit{utility} of the segment.
|
||||
|
||||
The utility is a function of a segment, either geometry or texture, and the current viewpoint (camera location, view angle, and look-at point), and is therefore dynamically computed online by the client from parameters in the MPD file.
|
||||
|
||||
\subsubsection{Offline parameters}
|
||||
Let us first detail all the parameters available from the offline/static preparation of the 3D NVE\@.
|
||||
These parameters are stored in the MPD file.
|
||||
First, for each geometry segment $s^G$ there is a predetermined 3D area $\mathcal{A}_{3D}(s^G)$, equal to the sum of all triangle areas in this segment (in 3D); it is computed as the segments are created.
|
||||
Note that the texture segments will have similar information, but computed at \textit{navigation time} $t_i$.
|
||||
The second piece of information stored in the MPD for all segments, both geometry and texture, is the size of the segment (in kB).
|
||||
Geometry segments contain a similar number of faces, so their size is almost uniform.
|
||||
For texture segments, the size is usually much smaller than the geometry segments but also varies a lot, as between two successive resolutions the number of pixels is divided by 4.
|
||||
|
||||
Finally, for each texture segment $s^{T}$, the MPD stores the \textit{MSE} (mean square error) of the image and resolution, relative to the highest resolution (by default, triangles are filled with its average color).
|
||||
Offline parameters are stored in the MPD as shown in Listing~\ref{geometry-as-example}.
|
||||
|
||||
\subsubsection{Online parameters}
|
||||
In addition to the offline parameters stored in the MPD file for each segment, view-dependent parameters are computed at navigation time.
|
||||
First, a measure of 3D area is computed for texture segments.
|
||||
As a texture maps on a set of triangles, we account for the area in 3D of all these triangles.
|
||||
We could consider such an offline measure (attached to the adaptation set containing the texture), but we prefer to only account for the triangles that have been already downloaded by the client.
|
||||
We denote by $\Delta(s^T)=\Delta(T)$ the set of triangles colored by a texture $T$ (it depends only on $T$ and is the same for any representation/segment $s^T$ in this texture adaptation set).
|
||||
At each time $t_i$, a subset of $\Delta(T)$ has been downloaded; we denote it $\Delta(T, t_i)$.
|
||||
|
||||
Moreover, each geometry segment belongs to a geometry adaptation set $AS^G$ whose bounding box coordinates are stored in the MPD\@.
|
||||
Given the coordinates of the bounding box $\mathcal{BB}(AS^G)$ and the viewpoint $v(t_i)$ at time $t_i$, the client computes the distance $\mathcal{D}(v(t_i),AS^G)$ of the bounding box $\mathcal{BB}(AS^G)$ as the distance from the center of $\mathcal{BB}(AS^G)$ to the principal point of the camera, given in $v(t_i)$.
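As a sketch of this computation, assume the bounding box is described by its minimum corner and extents $(x_{\min}, w, y_{\min}, h, z_{\min}, d)$, and write $p(t_i)$ for the principal point of the camera; the symbols $c$, $w$, $h$, $d$, and $p$ are notation introduced here for illustration only.
The distance can then be written as
\begin{equation*}
c(AS^G) = \Big(x_{\min}+\tfrac{w}{2},\ y_{\min}+\tfrac{h}{2},\ z_{\min}+\tfrac{d}{2}\Big), \qquad
\mathcal{D}\big(v(t_i),AS^G\big) = \left\| c(AS^G) - p(t_i) \right\|_2 .
\end{equation*}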
|
||||
|
||||
|
||||
\subsubsection{Utility for geometry segments}
|
||||
We now have all parameters to derive a utility measure of a geometry segment.
|
||||
Utility for texture segments will follow from the geometric utility.
|
||||
|
||||
The utility of a geometric segment $s^G$ for a viewpoint $v(t_i)$ is:
|
||||
\begin{equation*}
|
||||
\mathcal{U} \Big(s^G,v(t_i) \Big) = \frac{\mathcal{A}_{3D}(s^G)}{\mathcal{D}{\left(v{(t_i)},AS^G\right)}^2}
|
||||
\end{equation*}
|
||||
where $AS^G$ is the adaptation set containing $s^G$.
|
||||
|
||||
Basically, the utility of a segment is proportional to the area that its faces cover, and inversely proportional to the square of the distance between the camera and the center of the bounding box of the adaptation set containing the segment.
|
||||
That way, we favor segments with big faces that are close to the camera.
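To illustrate with hypothetical values (not measured on any model), consider a segment $s_a^G$ with $\mathcal{A}_{3D}(s_a^G)=100$ whose adaptation set is at distance $50$, and a segment $s_b^G$ with $\mathcal{A}_{3D}(s_b^G)=25$ whose adaptation set is at distance $10$:
\begin{equation*}
\mathcal{U}\Big(s_a^G,v(t_i)\Big) = \frac{100}{50^2} = 0.04, \qquad
\mathcal{U}\Big(s_b^G,v(t_i)\Big) = \frac{25}{10^2} = 0.25 .
\end{equation*}
Although $s_b^G$ covers four times less area, it is five times closer, so its utility is roughly six times higher.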
|
||||
|
||||
\subsubsection{Utility for texture segments}
|
||||
For a texture $T$ stored in a segment $s^T$, the triangles in $\Delta(T)$ are stored in arbitrary geometry segments, that is, they do not have spatial coherence.
|
||||
Thus, for each downloaded geometry segment $s_k^G$, $k \in K$, where $K$ indexes the geometry segments downloaded by time $t_i$, we collect the triangles of $\Delta(T, t_i)$ in $s^G_k$, and compute the ratio of $\mathcal{A}_{3D}(s_k^G)$ covered by these triangles.
|
||||
So, we define the utility:
|
||||
\begin{equation*}
|
||||
\mathcal{U}\Big( s^T,v(t_i) \Big)
|
||||
= psnr(s^T) \sum_{k\in K}\frac{\mathcal{A}_{3D}( s_k^G\cap \Delta(T,t_i))}{\mathcal{A}_{3D}(s_k^G)} \mathcal{U}\Big( s_k^G,v(t_i) \Big)
|
||||
\end{equation*}
|
||||
where we sum over all geometry segments received before time $t_i$ that intersect $\Delta(T,t_i)$ and whose adaptation set is in the frustum.
|
||||
This formula defines the utility of a texture segment by computing the linear combination of the utility of the geometry segments that use this texture, weighted by the proportion of area covered by the texture in the segment.
|
||||
We compute the PSNR by using the MSE in the MPD and denote it $psnr(s^T)$.
|
||||
We do this to acknowledge the fact that a texture at a greater resolution will have a higher utility than a lower resolution texture.
|
||||
The equivalent term for geometry is 1 (and does not appear).
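One standard way to obtain the PSNR from the stored MSE, assuming 8-bit color channels (the peak value of 255 is an assumption made here for illustration), is
\begin{equation*}
psnr(s^T) = 10 \log_{10} \left( \frac{255^2}{\mathrm{MSE}(s^T)} \right).
\end{equation*}
For instance, a hypothetical representation with an MSE of $100$ would give $10\log_{10}(650.25) \approx 28.1$~dB.
Under this definition the highest-resolution representation has zero MSE, so an implementation would need to cap or offset this value.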
|
||||
Having defined a utility on both geometry and texture segments, the client uses it next for its streaming strategy.
|
||||
|
||||
\subsection{DASH Adaptation Logic}\label{subsec:dashadaptation}
|
||||
|
||||
Along the camera path $C=\{v(t_i)\}$, viewpoints are indexed by a continuous time interval $t_i \in [t_1,t_{end}]$.
|
||||
In contrast, the DASH adaptation logic proceeds sequentially along a discrete timeline.
|
||||
The first request \texttt{(HTTP request)} made by the DASH client at time $t_1$ selects the most useful segment $s_1^*$ to download and will be followed by subsequent decisions at $t_2, t_3, \dots$.
|
||||
While selecting $s_i^*$, the $i$-th best segment to request, the adaptation logic compromises between geometry, texture, and the available \texttt{representations} given the current bandwidth, camera dynamics, and the previously described utility scores.
|
||||
The difference between $t_{i+1}$ and $t_{i}$ is the $s_i^*$ delivery delay.
|
||||
It varies with the segment size and network conditions.
|
||||
Algorithm~\ref{algorithm:nextsegment} details how our DASH client makes decisions.
|
||||
|
||||
|
||||
|
||||
\begin{algorithm}
|
||||
\SetKwInOut{Input}{input}
|
||||
\SetKwInOut{Output}{output}
|
||||
\Input{Current index $i$, time $t_i$, viewpoint $v(t_i)$, buffer of already downloaded \texttt{segments} $\mathcal{B}_i$, MPD}
|
||||
\Output{Next segment $s^{*}_i$ to request, updated buffer $\mathcal{B}_{i+1}$}
|
||||
\SetAlgoLined{}
|
||||
{- Estimate the bandwidth $\widehat{BW_i}$ and RTT $\widehat{\tau_i}$ \;}
|
||||
|
||||
{- Among all \texttt{segments} that are not already downloaded $s \in \mathcal{S} \backslash \mathcal{B}_i$, % \;}
|
||||
% {-
|
||||
keep the ones inside the upcoming viewing frustums $\mathcal{FC}=\mathbb{FC}(\widehat{v}(t)), t\in [t_i, t_i+\chi]$ thanks to a viewpoint predictor $t \rightarrow \hat{v}(t)$, a temporal horizon $\chi$ and a frustum culling operator $\mathbb{FC}$ \;}
|
||||
|
||||
|
||||
{- Optimize a criterion $\Omega$ based on $\mathcal{U}$ values and a well-chosen viewpoint $v(t_i)$ to select the next segment to query }
|
||||
{\begin{equation*} s^{*}_i= \argmax{s \in \mathcal{S} \backslash \mathcal{B}_i \cap \mathcal{FC}} \Omega_{\theta_i} \Big(\mathcal{U}\left(s,v(t_i)\right)\Big) \label{eq1}\end{equation*} \\
|
||||
given parameters $\theta_i$ that gather both online parameters $(i,t_i,v(t_i),\widehat{BW_i}, \widehat{\tau_i}, \mathcal{B}_i)$ and offline metadata\;}
|
||||
|
||||
{- Update the buffer $\mathcal{B}_{i+1}$ for the next decision: $s^{*}_i$ and lowest \texttt{representations} of $s^{*}_i$ are considered downloaded\;}
|
||||
{- \Return{segment $s^{*}_i$, buffer $\mathcal{B}_{i+1}$}\;}
|
||||
|
||||
{\caption{Algorithm to identify the next segment to query\label{algorithm:nextsegment}}}
|
||||
\end{algorithm}
|
||||
|
||||
The most naive way to sequentially optimize $\mathcal{U}$ is to limit the decision-making to the current viewpoint $v(t_i)$.
|
||||
In that case, the best segment $s$ to request would be the one maximizing $\mathcal{U}(s, v(t_i))$ to simply make a better rendering from the current viewpoint $v(t_i)$.
|
||||
However, due to the transmission delay, this segment will only be delivered at time $t_{i+1}=t_{i+1}(s)$, which depends on the segment size and network conditions: \begin{equation*} t_{i+1}(s)=t_i+\frac{\mathtt{size}(s)}{\widehat{BW_i}} + \widehat{\tau_i}\label{eq2}\end{equation*}
|
||||
|
||||
Consequently, the segment that is most useful from $v(t_i)$ at decision time $t_i$ might be less useful from $v(t_{i+1})$ at delivery time.
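As an illustration with hypothetical values, consider a segment of 500~kB (taking 1~kB as 1000 bytes, i.e., 4~Mb) requested with $\widehat{BW_i}=5$~Mbps and $\widehat{\tau_i}=38$~ms:
\begin{equation*}
t_{i+1}(s) = t_i + \frac{4\ \mathrm{Mb}}{5\ \mathrm{Mbps}} + 0.038\ \mathrm{s} = t_i + 0.838\ \mathrm{s}.
\end{equation*}
The camera keeps moving during these $0.838$~s, which is why the utility is better evaluated at a predicted viewpoint than at $v(t_i)$.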
|
||||
|
||||
A better solution is to download a segment that is expected to be the most useful in the future.
|
||||
With a temporal horizon $\chi$, we can optimize the cumulated $\mathcal{U}$ over $[t_{i+1}(s), t_i+\chi]$:
|
||||
|
||||
\begin{equation}
|
||||
s^*_i= \argmax{s \in \mathcal{S} \backslash \mathcal{B}_i \cap \mathcal{FC} } \int_{t_{i+1}(s)}^{t_i+\chi} \mathcal{U}(s,\hat{v}(t))\, dt
|
||||
\label{eq:smart}
|
||||
\end{equation}
|
||||
|
||||
In our experiments, we typically use $\chi=2$\,s and estimate the integral in (\ref{eq:smart}) by a Riemann sum where the $[t_{i+1}(s), t_i+\chi]$ interval is divided into 4 subintervals of equal size.
|
||||
For each subinterval extremity, an order 1 predictor $\hat{v}(t_i)$ linearly estimates the viewpoint based on $v(t_i)$ and speed estimation (discrete derivative at $t_i$).
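Written out, one possible form of this approximation is a left Riemann sum of the integral in (\ref{eq:smart}), with the predicted viewpoint evaluated at each sample time; the choice of evaluation points, the backward difference, and the symbols $\delta$, $u_j$, and $\dot{v}$ are assumptions made here for illustration:
\begin{equation*}
\int_{t_{i+1}(s)}^{t_i+\chi} \mathcal{U}\big(s,\hat{v}(t)\big)\, dt \approx \delta \sum_{j=0}^{3} \mathcal{U}\big(s,\hat{v}(u_j)\big),
\qquad \delta = \frac{t_i+\chi-t_{i+1}(s)}{4}, \quad u_j = t_{i+1}(s) + j\,\delta,
\end{equation*}
with the first-order predictor
\begin{equation*}
\hat{v}(t) = v(t_i) + (t - t_i)\,\dot{v}(t_i), \qquad \dot{v}(t_i) \approx \frac{v(t_i)-v(t_{i-1})}{t_i - t_{i-1}} .
\end{equation*}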
|
||||
|
||||
We also tested an alternative greedy heuristic selecting the segment that optimizes a utility variation during downloading (between $t_i$ and $t_{i+1}$):
|
||||
\begin{equation}
|
||||
s^{\texttt{GREEDY}}_i= \argmax{s \in \mathcal{S} \backslash \mathcal{B}_i \cap \mathcal{FC}} \frac{\mathcal{U}\Big(s,\hat{v}(t_{i+1}(s))\Big)}{t_{i+1}(s) - t_i}
|
||||
\label{eq:greedy}
|
||||
\end{equation}
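For example, with hypothetical values, a segment $s_a$ with predicted utility $0.25$ and delivery delay $0.8$~s, and a segment $s_b$ with predicted utility $0.1$ and delivery delay $0.2$~s, score
\begin{equation*}
\frac{0.25}{0.8} \approx 0.31 \qquad \text{and} \qquad \frac{0.1}{0.2} = 0.5
\end{equation*}
respectively, so the greedy heuristic requests $s_b$ first even though $s_a$ has the higher utility.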
|
||||
|
||||
|
||||
|
||||
95
src/dash-3d/content-preparation.tex
Normal file
@@ -0,0 +1,95 @@
|
||||
\section{Content preparation}\label{sec:dash3d}
|
||||
|
||||
In this section, we describe how we preprocess and store the 3D data of the NVE, consisting of a polygon soup, textures, and material information into a DASH-compliant Media Presentation Description (MPD) file.
|
||||
In our work, we use the \texttt{obj} file format for the polygons, \texttt{png} for textures, and \texttt{mtl} format for material information.
|
||||
The process, however, applies to other formats as well.
|
||||
|
||||
\subsection{The MPD File}
|
||||
In DASH, the information about content storage and characteristics, such as location, resolution, or size, is extracted from an MPD file by the client.
|
||||
The client relies only on this information to decide which chunk to request and at which quality level.
|
||||
The MPD file is an XML file that is organized into different sections hierarchically.
|
||||
The \texttt{period} element is a top-level element, which for the case of video, indicates the start time and length of a video chapter.
|
||||
This element does not apply to NVE, and we use a single \texttt{period} for the whole scene, as the scene is static.
|
||||
Each \texttt{period} element contains one or more adaptation sets, which describe the alternate versions, formats, and types of media.
|
||||
We utilize adaptation sets to organize a 3D scene's material, geometry, and texture.
|
||||
|
||||
\subsection{Adaptation Sets}
|
||||
When the user navigates freely within an NVE, the frustum at a given time almost always contains a limited part of the 3D scene.
|
||||
Similar to how DASH for video streaming partitions a video clip into temporal chunks, we segment the polygons into spatial chunks, such that the DASH client can request only the relevant chunks.
|
||||
|
||||
\subsubsection{Geometry Management}\label{sec:geometry}
|
||||
We use a space partitioning tree to organize the faces into cells.
|
||||
A face belongs to a cell if its barycenter falls inside the corresponding bounding box.
|
||||
Each cell corresponds to an adaptation set.
|
||||
Thus, geometry information is spread on adaptation sets based on spatial coherence, allowing the client to download the relevant faces selectively.
|
||||
A cell is relevant if it intersects the frustum of the client's current viewpoint. Figure~\ref{fig:bigpic} shows the relevant cells in blue.
|
||||
As our 3D content, a virtual environment, is biased to spread the most along the horizontal plane, we alternate the splits between the two horizontal directions.
|
||||
|
||||
We create a separate adaptation set for large faces (e.g., the sky or ground) because they are essential to the 3D model and do not fit into cells.
|
||||
We consider a face to be large if its area in 3D is more than $a+3\sigma$, where $a$ and $\sigma$ are the average and the standard deviation of the 3D area of the faces, respectively.
|
||||
In our example, it selects the 5 largest faces that represent $15\%$ of the total face area.
|
||||
We thus obtain a decomposition of the NVE into adaptation sets that partitions the geometry of the scene into a small adaptation set containing the larger faces of the model, and smaller adaptation sets containing the remaining faces.
|
||||
|
||||
We store the spatial location of each adaptation set, characterized by the coordinates of its bounding box, in the MPD file as the supplementary property of the adaptation set in the form of ``\textit{$x_{\min}$, width, $y_{\min}$, height, $z_{\min}$, depth}'' (as shown in Listing 1).
|
||||
This information is used by the client to implement view-dependent streaming (Section~\ref{sec:dashclientspec}).
|
||||
|
||||
\subsubsection{Texture Management}
|
||||
As with geometry data, we handle textures using adaptation sets, which are kept separate from those used for geometry.
|
||||
Each texture file is contained in a different adaptation set, with multiple representations providing different image resolutions (see Section~\ref{sec:representation}).
|
||||
We add an attribute to each adaptation set that contains texture, describing the average color of the texture.
|
||||
The client can use this attribute to render a face for which the corresponding texture has not been loaded yet, so that most objects appear, at least, with a uniform natural color (see Figure~\ref{fig:textures}).
|
||||
|
||||
|
||||
\subsubsection{Material Management}
|
||||
The material \texttt{.mtl} file is a text file that describes all materials used in the \texttt{.obj} files for the entire 3D model.
|
||||
A material has a name, properties such as specular parameters, and, most importantly, a path to a texture file.
|
||||
The \texttt{.mtl} file maps each face of the \texttt{.obj} to a material.
|
||||
As the \texttt{.mtl} file is a different type of media than geometry and texture, we define a particular adaptation set for this file, with a single representation.
|
||||
|
||||
\subsection{Representations}\label{sec:representation}
|
||||
Each adaptation set can contain one or more representations of the geometry or texture data, at different levels of detail (e.g., a different number of faces).
|
||||
For geometry, the resolution (i.e., 3D areas of faces) is heterogeneous, thus applying a sensible multi-resolution representation is cumbersome: the 3D area of faces varies from $0.01$ to more than $10K$, disregarding the outliers.
|
||||
For textured scenes, it is common to have such heterogeneous geometry size since information can be stored either in geometry or texture.
|
||||
Thus, handling the streaming compromise between geometry and texture is more adaptive than handling multi-resolution geometry separately.
|
||||
Moreover, as our faces are partitioned into independent cells, multi-resolution would cause difficult stitching issues such as topological gaps between the cells.
|
||||
|
||||
For an adaptation set containing texture, each representation contains a single segment where the image file is stored at the chosen resolution.
|
||||
In our example, from the full-size image, we generate successive resolutions by dividing both height and width by 2, stopping when the image size is less than or equal to $64\times 64$.
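As an illustration, a hypothetical $2048\times 2048$ texture would yield six representations under this procedure,
\begin{equation*}
2048^2 \rightarrow 1024^2 \rightarrow 512^2 \rightarrow 256^2 \rightarrow 128^2 \rightarrow 64^2,
\end{equation*}
each holding one quarter of the pixels of the previous one, so that the lowest resolution contains $1/1024$ of the pixels of the original image.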
|
||||
Figure~\ref{fig:textures} illustrates the use of the textures against the rendering using a single, average color per face.
|
||||
|
||||
\begin{figure}
|
||||
\includegraphics[width=0.472\columnwidth]{assets/dash-3d/average-color/full-res.png}
|
||||
\includegraphics[width=0.50\columnwidth]{assets/dash-3d/average-color/no-res.png}
|
||||
\caption{Rendering of the model with full resolution texture (left), and faces with average default color (right).\label{fig:textures}}
|
||||
\end{figure}
|
||||
|
||||
\subsection{Segments}
|
||||
To allow random access to the content within an adaptation set storing geometry data, we group the faces into segments.
|
||||
Each segment is then stored as a \texttt{.obj} file that can be individually requested by the client.
|
||||
For geometry, we partition the faces in an adaptation set into sets of $N_s$ faces, by first sorting the faces by their area in 3D space in descending order, and then placing each successive group of $N_s$ faces into a segment.
|
||||
Thus, the first segment contains the biggest faces and the last one the smallest.
|
||||
In addition to the selected faces, a segment stores all face vertices and attributes so that each segment is independent.
|
||||
For textures, each representation contains a single segment.
|
||||
|
||||
\begin{figure}
|
||||
\lstinputlisting[%
|
||||
language=XML,
|
||||
caption={MPD description of a geometry adaptation set, and a texture adaptation set.},
|
||||
label=geometry-as-example,
|
||||
emph={%
|
||||
MPD,
|
||||
Period,
|
||||
AdaptationSet,
|
||||
Representation,
|
||||
BaseURL,
|
||||
SegmentBase,
|
||||
Initialization,
|
||||
Role,
|
||||
SupplementalProperty,
|
||||
SegmentList,
|
||||
SegmentURL,
|
||||
Viewpoint
|
||||
}
|
||||
]{assets/dash-3d/geometry-as.xml}\label{listing:MPD}
|
||||
\end{figure}
|
||||
|
||||
242
src/dash-3d/evaluation.tex
Normal file
@@ -0,0 +1,242 @@
|
||||
\section{Evaluation}\label{sec:eval}
|
||||
|
||||
We now describe our setup and the data we use in our experiments. We present an evaluation of our system and a comparison of the impact of the design choices we introduced in the previous sections.
|
||||
|
||||
\subsection{Experimental Setup}
|
||||
|
||||
\subsubsection{Model}
|
||||
We use a city model of the Marina Bay area in Singapore in our experiments.
|
||||
The model came in 3DS Max format and has been converted into Wavefront OBJ format before the processing described in Section~\ref{sec:dash3d}.
|
||||
The converted model has 387,551 vertices and 552,118 faces.
|
||||
Table~\ref{table:size} gives some general information about the model.
|
||||
We partition the geometry into a $k$-d tree until the leaves have fewer than 10,000 faces, which gives us 64 adaptation sets, plus one containing the large faces.
|
||||
|
||||
\begin{table}
|
||||
\centering
|
||||
\begin{tabular}{ll}
|
||||
\textbf{Files} & \textbf{Size} \\ \midrule
|
||||
3DS Max & 55 MB \\
|
||||
OBJ file & 62 MB\\
|
||||
MTL file & 0.27 MB \\
|
||||
Textures (high res) & 167 MB \\
|
||||
Textures (low res) & 11 MB \\
|
||||
\bottomrule
|
||||
\end{tabular}
|
||||
\caption{Sizes of the different files of the model\label{table:size}}
|
||||
\end{table}
|
||||
|
||||
\subsubsection{User Navigations}
|
||||
To evaluate our system, we collected realistic user navigation traces that we can replay in our experiments.
|
||||
We presented six users with a web interface, on which the model was loaded progressively while the user interacted with it.
|
||||
The available interactions were inspired by traditional first-person interactions in video games, i.e., W, A, S, and D keys to translate the camera, and mouse to rotate the camera.
|
||||
We asked users to browse and explore the scene until they felt they had visited all important regions.
|
||||
We then asked them to produce camera navigation paths that would best present the 3D scene to a user discovering it for the first time.
|
||||
To record a path, the users first place their camera at their preferred starting point, then click on a button to start recording.
|
||||
Every 100~ms, the position, viewing angle, and look-at point of the camera are saved into an array that is then exported into JSON format.
|
||||
The recorded camera trace allows us to replay each camera path to perform our simulations and evaluate our system.
|
||||
We collected 13 camera paths this way.
|
||||
|
||||
\subsubsection{Network Setup}
|
||||
We tested our implementation under three network bandwidths (2.5 Mbps, 5 Mbps, and 10 Mbps) with an RTT of 38 ms, following the settings from DASH-IF~\cite{DASH_NETWORK_PROFILE}.
|
||||
The values are kept constant during the entire client session to analyze how performance changes as the bandwidth increases.
|
||||
|
||||
In our experiments, we set up a virtual camera that moves along a navigation path, and our access engine downloads segments in real time according to Algorithm~\ref{algorithm:nextsegment}.
|
||||
We log in a JSON file the times at which each segment is requested and received.
|
||||
By doing so, we avoid spending time and resources on evaluating our system while segments are being downloaded, and we store all the information necessary to plot the figures introduced in the subsequent sections.
|
||||
|
||||
\subsubsection{Hardware and Software}
|
||||
The experiments were run on an Acer Aspire V3 with an Intel Core i7 3632QM processor and an NVIDIA GeForce GT 740M graphics card.
|
||||
The DASH client is written in Rust\footnote{\url{https://www.rust-lang.org/}}, using Glium\footnote{\url{https://github.com/glium/glium}} for rendering, and reqwest\footnote{\url{https://github.com/seanmonstar/reqwest/}} to load the segments.
|
||||
|
||||
\subsubsection{Metrics}
|
||||
To objectively evaluate the quality of the resulting rendering, we use PSNR\@.
|
||||
The scene as rendered offline using the same camera path with all the geometry and texture data available is used as ground truth.
|
||||
Note that a pixel error can occur in our case only in two situations: (i) when a face is missing, in which case the color of the background object is shown, and (ii) when a texture is either missing or downsampled.
|
||||
We do not have pixel error due to compression.
|
||||
|
||||
\subsubsection{Experiments}
|
||||
We present experiments to validate our implementation choices at every step of our system.
|
||||
We replay the user-generated camera paths with various bandwidth conditions while varying key components of our system.
|
||||
|
||||
Table~\ref{table:experiments} sums up all the components we varied in our experiments.
|
||||
We compare the impact of two space-partitioning trees, a $k$-d tree and an Octree, on content preparation.
|
||||
We also try several utility metrics for geometry segments: an offline one, which assigns to each geometry segment $s^G$ the cumulated 3D area of its faces $\mathcal{A}_{3D}(s^G)$; an online one, which assigns to each geometry segment the inverse of its distance to the camera position; and finally our proposed method, as described in Section~\ref{subsec:utility} ($\mathcal{A}_{3D}(s^G)/ \mathcal{D}{(v{(t_i)},AS^G)}^2$).
|
||||
We consider two streaming policies to be applied by the client, proposed in Section~\ref{sec:dashclientspec}.
|
||||
The greedy strategy determines, at each decision time, the segment that maximizes its predicted utility at arrival divided by its predicted delivery delay, which corresponds to equation (\ref{eq:greedy}).
|
||||
The second streaming policy that we run is the one we proposed in equation (\ref{eq:smart}).
|
||||
We have also analyzed the effect of grouping the faces in geometry segments of an adaptation set based on their 3D area.
|
||||
Finally, we try several bandwidth parameters to study how our system can adapt to varying network conditions.
|
||||
|
||||
\begin{table}
|
||||
\centering
|
||||
\begin{tabular}{@{}ll@{}}
|
||||
\toprule
|
||||
\textbf{Parameters} & \textbf{Values} \\\midrule
|
||||
Content preparation & Octree, $k$-d tree \\
|
||||
Utility & Offline, Online, Proposed \\
|
||||
Streaming policy & Greedy, Proposed \\
|
||||
Grouping of Segments & Sorted based on area, Unsorted\\
|
||||
Bandwidth & 2.5 Mbps, 5 Mbps, 10 Mbps \\\bottomrule
|
||||
\end{tabular}
|
||||
\caption{Different parameters in our experiments\label{table:experiments}}
|
||||
\end{table}
|
||||
|
||||
\subsection{Experimental Results}
|
||||
\begin{figure}
|
||||
\centering
|
||||
\begin{tikzpicture}
|
||||
\begin{axis}[
|
||||
xlabel=Time (in s),
|
||||
ylabel=PSNR,
|
||||
no markers,
|
||||
cycle list name=mystyle,
|
||||
width=\tikzwidth,
|
||||
height=\tikzheight,
|
||||
legend pos=south east,
|
||||
xmin=0,
|
||||
xmax=90,
|
||||
x label style={at={(axis description cs:0.5,0.05)},anchor=north},
|
||||
y label style={at={(axis description cs:0.125,.5)},anchor=south},
|
||||
]
|
||||
\addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
|
||||
\addlegendentry{\scriptsize $k$-d tree}
|
||||
\addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/2/curve.dat};
|
||||
\addlegendentry{\scriptsize octree}
|
||||
\end{axis}
|
||||
\end{tikzpicture}
|
||||
\caption{Impact of the space-partitioning tree on the rendering quality with a 5 Mbps bandwidth.\label{fig:preparation}}
|
||||
\end{figure}
|
||||
|
||||
Figure~\ref{fig:preparation} shows how the space partition can affect the rendering quality.
|
||||
We use our proposed utility metrics (see Section~\ref{subsec:utility}) and streaming policy from Equation (\ref{eq:smart}), on content divided into adaptation sets obtained either using a $k$-d tree or an Octree and run experiments on all camera paths at 5 Mbps.
|
||||
The octree partitions content into non-homogeneous adaptation sets; as a result, some adaptation sets may contain smaller segments, which contain both important (large) and non-important polygons. For the $k$-d tree, we create cells containing the same number of faces $N_a$ (here, we take $N_a=10k$).
|
||||
Figure~\ref{fig:preparation} shows that the system seems to be slightly less efficient with an Octree than with a $k$-d tree based partition, but this result is not significant.
|
||||
For the remaining experiments, partitioning is based on a $k$-d tree.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\begin{tikzpicture}
|
||||
\begin{axis}[
|
||||
xlabel=Time (in s),
|
||||
ylabel=PSNR,
|
||||
no markers,
|
||||
cycle list name=mystyle,
|
||||
width=\tikzwidth,
|
||||
height=\tikzheight,
|
||||
legend pos=south east,
|
||||
xmin=0,
|
||||
xmax=90,
|
||||
x label style={at={(axis description cs:0.5,0.05)},anchor=north},
|
||||
y label style={at={(axis description cs:0.125,.5)},anchor=south},
|
||||
]
|
||||
\addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
|
||||
\addlegendentry{\scriptsize Proposed}
|
||||
\addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/3/curve.dat};
|
||||
\addlegendentry{\scriptsize Online only}
|
||||
\addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/4/curve.dat};
|
||||
\addlegendentry{\scriptsize Offline only}
|
||||
\end{axis}
|
||||
\end{tikzpicture}
|
||||
\caption{Impact of the segment utility metric on the rendering quality with a 5 Mbps bandwidth.\label{fig:utility}}
|
||||
\end{figure}
|
||||
|
||||
Figure~\ref{fig:utility} displays how a utility metric should take advantage of both offline and online features.
|
||||
The experiments consider $k$-d tree cells for adaptation sets and the proposed streaming policy, on all camera paths.
|
||||
We observe that a purely offline utility metric leads to poor PSNR results.
|
||||
An online-only utility improves the results, as it takes the user viewing frustum into consideration, but still, the proposed utility (in Section~\ref{subsec:utility}) performs better.
|
||||
|
||||
\begin{figure}
    \centering
    \begin{tikzpicture}
        \begin{axis}[
            xlabel=Time (in s),
            ylabel=PSNR,
            no markers,
            cycle list name=mystyle,
            width=\tikzwidth,
            height=\tikzheight,
            legend pos=south east,
            xmin=0,
            xmax=90,
            x label style={at={(axis description cs:0.5,0.05)},anchor=north},
            y label style={at={(axis description cs:0.125,.5)},anchor=south},
            ]
            \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
            \addlegendentry{\scriptsize Sorting the faces by area}
            \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/5/curve.dat};
            \addlegendentry{\scriptsize Without sorting the faces}
        \end{axis}
    \end{tikzpicture}
    \caption{Impact of creating the segments of an adaptation set based on decreasing 3D area of faces with a 5 Mbps bandwidth.}\label{fig:sorting}
\end{figure}

Figure~\ref{fig:sorting} shows the effect of grouping the faces of an adaptation set into segments based on their 3D area.
The PSNR improves significantly when the 3D area of faces is considered for creating the segments. Since all segments are of the same size, sorting the faces by area before grouping them into segments leads to a skewed distribution of how useful the segments are. This skewness means that the client's decision (to download the segments with the largest utility first) can make a bigger difference in quality.
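
As a purely illustrative sketch of this skew, consider six hypothetical faces with areas $9, 1, 8, 1, 7, 1$ (arbitrary units, not taken from our model), grouped into segments of three faces. We use the total face area of a segment as a stand-in for its usefulness; this is an assumption made for the example, the actual utility being defined in Section~\ref{subsec:utility}.
\[
\begin{array}{lll}
\mbox{Without sorting:} & \{9, 1, 8\} \rightarrow 18, & \{1, 7, 1\} \rightarrow 9 \\
\mbox{Sorted by area:} & \{9, 8, 7\} \rightarrow 24, & \{1, 1, 1\} \rightarrow 3
\end{array}
\]
Both groupings carry the same total area ($27$), but the ratio between the two segments goes from $2:1$ to $8:1$ once the faces are sorted, so choosing the right segment first matters more.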

We also compared the greedy and proposed streaming policies (Figure~\ref{fig:greedyweakness}) under limited bandwidth (5 Mbps).
The proposed scheme outperforms the greedy one during the first 30~s and achieves a higher PSNR overall.
Table~\ref{table:greedyVsproposed} shows the average PSNR for the proposed method and the greedy method for different download bandwidths.
In the first 30~s, since relatively little 3D content has been downloaded, making a better decision about what to download matters more: during that time, the proposed method improves the PSNR by 1.0 to 1.9~dB over the greedy one.
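
For reference, the gains quoted above can be recomputed from Table~\ref{table:greedyVsproposed} by subtracting the Greedy row from the Proposed row (values in dB; this is plain arithmetic on the reported averages, not an additional measurement):
\[
\begin{array}{lrrr}
\mbox{Bandwidth (Mbps)} & 2.5 & 5 & 10 \\
\mbox{First 30 s} & 16.3 - 14.4 = 1.9 & 20.4 - 19.4 = 1.0 & 23.2 - 22.1 = 1.1 \\
\mbox{Overall} & 23.8 - 19.8 = 4.0 & 28.2 - 26.9 = 1.3 & 31.1 - 29.7 = 1.4
\end{array}
\]
In both cases, the largest gain is obtained at the lowest bandwidth (2.5 Mbps).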

Table~\ref{table:perc} shows the distribution of texture resolutions downloaded by the greedy policy and by our proposed scheme at different bandwidths.
Resolution 5 is the highest and 1 is the lowest.
The table clearly shows a weakness of the greedy policy: as the bandwidth increases, the distribution of downloaded texture resolutions stays more or less the same.
In contrast, our proposed streaming policy adapts to an increasing bandwidth by downloading higher-resolution textures (resolution 5 accounts for 13.9\% of the downloaded bytes at 10 Mbps, vs.\ 0.3\% at 2.5 Mbps).
In fact, an interesting feature of our proposed streaming policy is that it adapts the geometry-texture compromise to the bandwidth. The textures represent 57.3\% of the total amount of downloaded bytes at 2.5 Mbps, and 70.2\% at 10 Mbps.
In other words, our system tends to favor geometry segments when the bandwidth is low, and favors texture segments when the bandwidth increases.
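
The texture shares quoted above correspond to the column sums of the entries for our proposed scheme in Table~\ref{table:perc}. The geometry shares below are derived under the assumption that every downloaded byte is either geometry or texture:
\[
\begin{array}{ll}
2.5~\mbox{Mbps:} & 1.4 + 8.6 + 28.6 + 18.4 + 0.3 = 57.3\% \mbox{ textures}, \quad 100 - 57.3 = 42.7\% \mbox{ geometry} \\
10~\mbox{Mbps:} & 1.4 + 8.3 + 22.5 + 24.1 + 13.9 = 70.2\% \mbox{ textures}, \quad 100 - 70.2 = 29.8\% \mbox{ geometry}
\end{array}
\]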

\begin{figure}
    \centering
    \begin{tikzpicture}
        \begin{axis}[
            xlabel=Time (in s),
            ylabel=PSNR,
            no markers,
            cycle list name=mystyle,
            width=\tikzwidth,
            height=\tikzheight,
            legend pos=south east,
            xmin=0,
            xmax=90,
            x label style={at={(axis description cs:0.5,0.05)},anchor=north},
            y label style={at={(axis description cs:0.125,.5)},anchor=south},
            ]
            \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
            \addlegendentry{\scriptsize Proposed}
            \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/6/curve.dat};
            \addlegendentry{\scriptsize Greedy}
        \end{axis}
    \end{tikzpicture}
    \caption{Impact of the streaming policy (greedy vs.\ proposed) with a 5 Mbps bandwidth.}\label{fig:greedyweakness}
\end{figure}

\begin{table}
    \centering
    \begin{tabular}{@{}p{2.5cm}p{0.7cm}p{0.7cm}p{0.7cm}p{0.3cm}p{0.7cm}p{0.7cm}p{0.7cm}@{}}
        \toprule
        \multirow{2}{1.7cm}{} & \multicolumn{3}{c}{\textbf{First 30 s}} & & \multicolumn{3}{c}{\textbf{Overall}}\\
        \cmidrule{2-4} \cmidrule{6-8}
        BW (in Mbps) & 2.5 & 5 & 10 & & 2.5 & 5 & 10 \\
        \midrule
        Greedy & 14.4 & 19.4 & 22.1 & & 19.8 & 26.9 & 29.7 \\
        Proposed & 16.3 & 20.4 & 23.2 & & 23.8 & 28.2 & 31.1 \\
        \bottomrule
    \end{tabular}
    \caption{Average PSNR (in dB), greedy vs.\ proposed\label{table:greedyVsproposed}}
\end{table}

\begin{table}
    \centering
    \renewcommand{\arraystretch}{1.2}
    \begin{tabular}{@{}cccc@{}}
        \toprule
        \textbf{Resolution} & \textbf{2.5 Mbps} & \textbf{5 Mbps} & \textbf{10 Mbps} \\
        \midrule
        1 & 5.7\% vs 1.4\% & 6.3\% vs 1.4\% & 6.17\% vs 1.4\% \\
        2 & 10.9\% vs 8.6\% & 13.3\% vs 7.8\% & 14.0\% vs 8.3\% \\
        3 & 15.3\% vs 28.6\% & 20.1\% vs 24.3\% & 20.9\% vs 22.5\% \\
        4 & 14.6\% vs 18.4\% & 14.4\% vs 25.2\% & 14.2\% vs 24.1\% \\
        5 & 11.4\% vs 0.3\% & 11.1\% vs 5.9\% & 11.5\% vs 13.9\% \\
        \bottomrule
    \end{tabular}
    \caption{Percentages of downloaded bytes for textures from each resolution, for the greedy streaming policy (left) and for our proposed scheme (right)\label{table:perc}}
\end{table}

20
src/dash-3d/main.tex
Normal file
@@ -0,0 +1,20 @@
\chapter{DASH-3D}
\input{dash-3d/content-preparation}
\input{dash-3d/client}
\input{dash-3d/evaluation}


% \section{Streaming}
%
% \subsection{Content preparation}
%
% \written{ACMMM 18}
%
% \subsection{Client}
%
% \written{ACMMM 18}
%
% \section{Rendering}
%
% \missing{everything}
%
33
src/listing-config.sty
Normal file
@@ -0,0 +1,33 @@
\definecolor{colKeys}{rgb}{0,0.5,0}
\definecolor{colIdentifier}{rgb}{0,0,0}
\definecolor{colComments}{rgb}{0,0.5,1}
\definecolor{colString}{rgb}{0.6,0.1,0.1}
\definecolor{colBackground}{rgb}{0.95,0.95,1}
\lstset{% listings configuration
float=hbp,%
basicstyle=\ttfamily\small,%
%
identifierstyle=\color{colIdentifier}, %
keywordstyle=\color{colKeys}, %
stringstyle=\color{colString}, %
commentstyle=\color{colComments}\textit, %
%
backgroundcolor=\color{colBackground},%
%
columns=flexible, %
tabsize=2, %
frame=trbl, %
%frameround=tttt,%
extendedchars=true, %
showspaces=false, %
showstringspaces=false, %
numbers=left, %
numberstyle=\tiny, %
breaklines=true, %
breakautoindent=true, %
captionpos=b,%
xrightmargin=0.2cm, %
xleftmargin=0.2cm,
emphstyle={\color{DarkGreen}},
language=XML
}
@@ -1,7 +1,9 @@
\documentclass{book}

\usepackage{commands}
\usepackage{listings}
\usepackage{config}
\usepackage{listing-config}

\begin{document}

31
src/plan.tex
@@ -34,21 +34,7 @@ Avoiding as much as possible server-side computations

\written{MMSys 16, ACMMM 18}

\chapter{DASH-3D}

\section{Streaming}

\subsection{Content preparation}

\written{ACMMM 18}

\subsection{Client}

\written{ACMMM 18}

\section{Rendering}

\missing{everything}
\input{dash-3d/main.tex}

\part{3D Interaction}

@@ -66,17 +52,4 @@ Avoiding as much as possible server-side computations

\written{MMSys 16}

\chapter{System bookmarks}

\section{Bookmark interaction}

\unpublished{MMSys 18}

\section{QoS improvement with bookmarks}

\written{MMSys 16}

\unpublished{MMSys 18}

\include{user-study}

\input{system-bookmarks/main}

13
src/system-bookmarks/main.tex
Normal file
@@ -0,0 +1,13 @@
\chapter{System bookmarks}

\section{Bookmark interaction}

\unpublished{MMSys 18}

\section{QoS improvement with bookmarks}

\written{MMSys 16}

\unpublished{MMSys 18}

\input{system-bookmarks/user-study}