More writing, citations, fix references

\copied{}
\section{Client\label{d3:dash-client}}

In this section, we specify a DASH NVE client that exploits the preparation of the 3D content in an NVE for streaming.

At time instance $t_i$, the DASH client decides to download the appropriate segments.
Starting from $t_1$, the camera continuously follows a camera path $C=\{v(t_i), t_i \in [t_1,t_{end}]\}$, along which downloading opportunities are strategically exploited to sequentially query the most useful segments.

\subsection{Segment Utility\label{d3:utility}}

Unlike video streaming, where the bitrate of each segment correlates with the quality of the video received, for 3D content, the size (in bytes) of the content does not necessarily correlate well to its contribution to visual quality.
A large polygon with huge visual impact takes the same number of bytes as a tiny polygon.
Indeed, geometry segments have a similar number of faces; their sizes are thus similar.
For texture segments, the size is usually much smaller than for geometry segments but also varies widely, since between two successive resolutions the number of pixels is divided by 4.

Finally, for each texture segment $s^{T}$, the MPD stores the \textit{MSE} (mean square error) of the image at each resolution, relative to the highest resolution (by default, triangles are filled with their average color).
Offline parameters are stored in the MPD as shown in Listing~\ref{d3:mpd}.
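This offline MSE computation can be sketched as follows, assuming RGB images held as NumPy arrays and successive resolutions obtained by halving (as in the content preparation section); upsampling the low resolution back by pixel repetition before comparison is an assumption, the text does not fix the comparison method:

```python
import numpy as np

def halve(img):
    """Downscale by 2 by averaging 2x2 pixel blocks."""
    h, w = img.shape[0] // 2, img.shape[1] // 2
    return img[:2 * h, :2 * w].reshape(h, 2, w, 2, -1).mean(axis=(1, 3))

def mse_per_resolution(full, floor=64):
    """MSE of each lower resolution relative to the full-size image."""
    errors, img = [], full.astype(float)
    while img.shape[0] > floor and img.shape[1] > floor:
        img = halve(img)
        # upsample back by pixel repetition before comparing (assumption)
        up = img.repeat(full.shape[0] // img.shape[0], axis=0) \
                .repeat(full.shape[1] // img.shape[1], axis=1)
        errors.append(float(((up - full) ** 2).mean()))
    return errors
```

The resulting list, one value per representation, is what would be written into the MPD.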

\subsubsection{Online parameters}
In addition to the offline parameters stored in the MPD file for each segment, view-dependent parameters are computed at navigation time.
The equivalent term for geometry is 1 (and does not appear).
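As a minimal sketch: using the form of the geometry utility that appears in the evaluation section, $\mathcal{A}_{3D}(s^G)/\mathcal{D}(v(t_i),AS^G)^2$, a geometry segment could be scored as below (measuring $\mathcal{D}$ to the center of the adaptation set's bounding box is an assumption):

```python
# Sketch of the geometry segment utility U(s^G, v) = A_3D / D^2, where
# A_3D is the cumulated 3D area of the segment's faces (an offline MPD
# parameter) and D is the distance from the viewpoint to the segment's
# adaptation set (here, its bounding-box center -- an assumption).
def geometry_utility(area_3d, cell_center, viewpoint):
    dist2 = sum((c - v) ** 2 for c, v in zip(cell_center, viewpoint))
    return area_3d / max(dist2, 1e-9)  # guard against a zero distance

# Example: a segment with cumulated area 120 seen from 10 units away.
print(geometry_utility(120.0, (10.0, 0.0, 0.0), (0.0, 0.0, 0.0)))  # 1.2
```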
Having defined a utility on both geometry and texture segments, the client uses it next for its streaming strategy.

\subsection{DASH Adaptation Logic\label{d3:dash-adaptation}}

Along the camera path $C=\{v(t_i)\}$, viewpoints are indexed by a continuous time interval $t_i \in [t_1,t_{end}]$.
In contrast, the DASH adaptation logic proceeds sequentially along a discrete timeline.
While selecting $s_i^*$, the $i$-th best segment to request, the adaptation logic compromises between geometry, texture, and the available \texttt{representations} given the current bandwidth, camera dynamics, and the previously described utility scores.
The difference between $t_{i+1}$ and $t_{i}$ is the $s_i^*$ delivery delay.
It varies with the segment size and network conditions.
Algorithm~\ref{d3:next-segment} details how our DASH client makes decisions.
{- Optimize a criterion $\Omega$ based on $\mathcal{U}$ values and well chosen viewpoint $v(t_i)$ to select the next segment to query }
{\begin{equation*} s^{*}_i= \argmax{s \in \mathcal{S} \backslash \mathcal{B}_i \cap \mathcal{FC}} \Omega_{\theta_i} \Big(\mathcal{U}\left(s,v(t_i)\right)\Big) \label{d3:eq1}\end{equation*} \\
given parameters $\theta_i$ that gather both online parameters $(i,t_i,v(t_i),\widehat{BW_i}, \widehat{\tau_i}, \mathcal{B}_i)$ and offline metadata\;}
{- Update the buffer $\mathcal{B}_{i+1}$ for the next decision: $s^{*}_i$ and lowest \texttt{representations} of $s^{*}_i$ are considered downloaded\;}
{- \Return{segment $s^{*}_i$, buffer $\mathcal{B}_{i+1}$}\;}
{\caption{Algorithm to identify the next segment to query\label{d3:next-segment}}}
\end{algorithm}

The most naive way to sequentially optimize $\mathcal{U}$ is to limit the decision-making to the current viewpoint $v(t_i)$.
In that case, the best segment $s$ to request would be the one maximizing $\mathcal{U}(s, v(t_i))$ to simply make a better rendering from the current viewpoint $v(t_i)$.
Due to transmission delay, however, this segment will only be delivered at time $t_{i+1}=t_{i+1}(s)$, depending on the segment size and network conditions: \begin{equation*} t_{i+1}(s)=t_i+\frac{\mathtt{size}(s)}{\widehat{BW_i}} + \widehat{\tau_i}\label{d3:eq2}\end{equation*}
In consequence, the most useful segment from $v(t_i)$ at decision time $t_i$ might be less useful at delivery time from $v(t_{i+1})$.

With a temporal horizon $\chi$, we can optimize the cumulated $\mathcal{U}$ over $[t_{i+1}(s), t_i+\chi]$:
\begin{equation}
s^*_i= \argmax{s \in \mathcal{S} \backslash \mathcal{B}_i \cap \mathcal{FC} } \int_{t_{i+1}(s)}^{t_i+\chi} \mathcal{U}(s,\hat{v}(t_i)) dt
\label{d3:smart}
\end{equation}

In our experiments, we typically use $\chi=2$\,s and estimate the integral in (\ref{d3:smart}) by a Riemann sum where the $[t_{i+1}(s), t_i+\chi]$ interval is divided into 4 subintervals of equal size.
For each subinterval endpoint, an order-1 predictor $\hat{v}(t_i)$ linearly estimates the viewpoint based on $v(t_i)$ and a speed estimate (discrete derivative at $t_i$).
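This decision rule can be sketched as follows; the segment records (a dict with a `size` field), the `utility` callback, and the linear predictor are assumptions standing in for the full implementation:

```python
def predict(v, speed, dt):
    """Order-1 (linear) viewpoint prediction from position and speed."""
    return tuple(p + s * dt for p, s in zip(v, speed))

def delivery_time(t_i, size, bandwidth, rtt):
    """t_{i+1}(s) = t_i + size(s)/BW + tau."""
    return t_i + size / bandwidth + rtt

def next_segment(candidates, t_i, v, speed, bandwidth, rtt, utility, chi=2.0):
    """Pick the segment maximizing the cumulated utility over the horizon,
    estimated by a Riemann sum over 4 equal subintervals of
    [t_{i+1}(s), t_i + chi]."""
    best, best_score = None, float("-inf")
    for seg in candidates:
        t_arr = delivery_time(t_i, seg["size"], bandwidth, rtt)
        if t_arr >= t_i + chi:
            continue  # delivered after the horizon: no utility accumulated
        width = (t_i + chi - t_arr) / 4
        # evaluate the predicted utility at the 4 subinterval endpoints
        score = sum(utility(seg, predict(v, speed, t_arr + k * width - t_i)) * width
                    for k in range(4))
        if score > best_score:
            best, best_score = seg, score
    return best
```

A segment whose predicted delivery falls beyond the horizon contributes no utility, so it is skipped.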

We also tested an alternative greedy heuristic that selects the segment optimizing a utility variation during download (between $t_i$ and $t_{i+1}$):
\begin{equation}
s^{\texttt{GREEDY}}_i= \argmax{s \in \mathcal{S} \backslash \mathcal{B}_i \cap \mathcal{FC}} \frac{\mathcal{U}\Big(s,\hat{v}(t_{i+1}(s))\Big)}{t_{i+1}(s) - t_i}
\label{d3:greedy}
\end{equation}
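A sketch of this greedy rule, under the same hypothetical data structures (segment size in a dict, a linear viewpoint predictor, and a `utility` callback):

```python
def greedy_segment(candidates, v, speed, bandwidth, rtt, utility):
    """Maximize the predicted utility at arrival divided by the
    predicted delivery delay."""
    def score(seg):
        delay = seg["size"] / bandwidth + rtt  # t_{i+1}(s) - t_i
        v_arr = tuple(p + s * delay for p, s in zip(v, speed))
        return utility(seg, v_arr) / delay
    return max(candidates, key=score)
```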

\copied{}
\section{Content preparation\label{d3:dash-3d}}

In this section, we describe how we preprocess and store the 3D data of the NVE, consisting of a polygon soup, textures, and material information into a DASH-compliant Media Presentation Description (MPD) file.
In our work, we use the \texttt{obj} file format for the polygons, \texttt{png} for textures, and \texttt{mtl} format for material information.
We utilize adaptation sets to organize a 3D scene's material, geometry, and textures.
When the user navigates freely within an NVE, the frustum at a given time almost always contains a limited part of the 3D scene.
Similar to how DASH for video streaming partitions a video clip into temporal chunks, we segment the polygons into spatial chunks, such that the DASH client can request only the relevant chunks.

\subsubsection{Geometry Management\label{d3:geometry}}
We use a space partitioning tree to organize the faces into cells.
A face belongs to a cell if its barycenter falls inside the corresponding bounding box.
Each cell corresponds to an adaptation set.
Thus, geometry information is spread over adaptation sets based on spatial coherence, allowing the client to download the relevant faces selectively.
A cell is relevant if it intersects the frustum of the client's current viewpoint. Figure~\ref{d3:big-picture} shows the relevant cells in blue.
As our 3D content, a virtual environment, mostly spreads along the horizontal plane, we alternate splits between the two horizontal directions.
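A minimal sketch of this partition over face barycenters, assuming $x$ and $z$ are the two horizontal axes (the up axis is an assumption) and using the leaf bound from the evaluation section:

```python
def kd_partition(barycenters, max_faces=10000, axis=0):
    """Split face barycenters into cells of at most max_faces points,
    alternating the split between the two horizontal axes (x and z)."""
    if len(barycenters) <= max_faces:
        return [barycenters]  # one cell = one adaptation set
    pts = sorted(barycenters, key=lambda p: p[axis])
    mid = len(pts) // 2
    nxt = 2 if axis == 0 else 0  # alternate between x (0) and z (2)
    return (kd_partition(pts[:mid], max_faces, nxt)
            + kd_partition(pts[mid:], max_faces, nxt))
```

Each returned cell would then become one adaptation set, with its bounding box stored in the MPD.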
We create a separate adaptation set for large faces (e.g., the sky or ground) because they are essential to the 3D model and do not fit into cells.
We consider a face to be large if its area in 3D is more than $a+3\sigma$, where $a$ is the average face area and $\sigma$ the standard deviation.
In our example, it selects the 5 largest faces that represent $15\%$ of the total face area.
We thus obtain a decomposition of the NVE into adaptation sets that partitions the geometry of the scene into a small adaptation set containing the larger faces of the model, and smaller adaptation sets containing the remaining faces.
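The large-face filter can be sketched as below, using the $a+3\sigma$ threshold; whether $\sigma$ is the population or sample deviation is an assumption:

```python
import statistics

def split_large_faces(face_areas):
    """Separate faces whose 3D area exceeds a + 3*sigma (mean plus three
    standard deviations) into their own adaptation set."""
    a = statistics.mean(face_areas)
    sigma = statistics.pstdev(face_areas)  # population deviation (assumption)
    threshold = a + 3 * sigma
    large = [i for i, s in enumerate(face_areas) if s > threshold]
    small = [i for i, s in enumerate(face_areas) if s <= threshold]
    return large, small
```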

We store the spatial location of each adaptation set, characterized by the coordinates of its bounding box, in the MPD file as the supplementary property of the adaptation set in the form of ``\textit{$x_{\min}$, width, $y_{\min}$, height, $z_{\min}$, depth}'' (as shown in Listing~\ref{d3:mpd}).
This information is used by the client to implement view-dependent streaming (Section~\ref{d3:dash-client}).

\subsubsection{Texture Management}
As with geometry data, we handle textures using adaptation sets, separately from geometry.
Each texture file is contained in a different adaptation set, with multiple representations providing different image resolutions (see Section~\ref{d3:representation}).
We add an attribute to each adaptation set that contains texture, describing the average color of the texture.
The client can use this attribute to render a face for which the corresponding texture has not been loaded yet, so that most objects appear, at least, with a uniform natural color (see Figure~\ref{d3:textures}).

\subsubsection{Material Management}
A material has a name, properties such as specular parameters, and, most importantly, a reference to a texture file.
The \texttt{.mtl} file maps each face of the \texttt{.obj} to a material.
As the \texttt{.mtl} file is a different type of media than geometry and texture, we define a particular adaptation set for this file, with a single representation.
\subsection{Representations}\label{d3:representation}
Each adaptation set can contain one or more representations of the geometry or texture data, at different levels of detail (e.g., a different number of faces).
For geometry, the resolution (i.e., the 3D area of faces) is heterogeneous, so applying a sensible multi-resolution representation is cumbersome: the 3D area of faces varies from $0.01$ to more than $10K$, disregarding the outliers.
For textured scenes, it is common to have such heterogeneous geometry sizes since information can be stored either in geometry or texture.

For an adaptation set containing texture, each representation contains a single segment where the image file is stored at the chosen resolution.
In our example, from the full-size image, we generate successive resolutions by dividing both height and width by 2, stopping when the image size is less or equal to $64\times 64$.
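The resolution ladder just described can be sketched as follows (halving both dimensions until the image is at most $64\times 64$):

```python
def texture_resolutions(width, height, floor=64):
    """Successive representation resolutions: halve both dimensions,
    stopping once the image is at most floor x floor."""
    ladder = [(width, height)]
    while width > floor or height > floor:
        width, height = max(1, width // 2), max(1, height // 2)
        ladder.append((width, height))
    return ladder

print(texture_resolutions(512, 512))
# [(512, 512), (256, 256), (128, 128), (64, 64)]
```

Each entry in the ladder would map to one representation of the texture's adaptation set.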
Figure~\ref{d3:textures} illustrates the use of the textures against the rendering using a single, average color per face.

\begin{figure}[th]
\centering
\includegraphics[width=1\textwidth]{assets/dash-3d/average-color/no-res.png}
\caption{With average colors}
\end{subfigure}
\caption{Rendering of the model with different styles of textures\label{d3:textures}}
\end{figure}

\subsection{Segments}
For textures, each representation contains a single segment.

\lstinputlisting[%
language=XML,
caption={MPD description of a geometry adaptation set, and a texture adaptation set.},
label=d3:mpd,
emph={%
MPD,
Period,

\copied{}
\section{Evaluation\label{d3:evaluation}}

We now describe our setup and the data we use in our experiments. We present an evaluation of our system and a comparison of the impact of the design choices we introduced in the previous sections.

\subsubsection{Model}
We use a city model of the Marina Bay area in Singapore in our experiments.
The model came in 3DS Max format and has been converted into Wavefront OBJ format before the processing described in Section~\ref{d3:dash-3d}.
The converted model has 387,551 vertices and 552,118 faces.
Table~\ref{d3:size} gives some general information about the model.
We partition the geometry into a $k$-d tree until the leaves have fewer than 10,000 faces, which gives us 64 adaptation sets, plus one containing the large faces.

\begin{table}[th]
Textures (low res) & 11 MB \\
\bottomrule
\end{tabular}
\caption{Sizes of the different files of the model\label{d3:size}}
\end{table}

\subsubsection{User Navigations}
The recorded camera trace allows us to replay each camera path to perform our simulations.
We collected 13 camera paths this way.

\subsubsection{Network Setup}
We tested our implementation under three network bandwidths of 2.5 Mbps, 5 Mbps, and 10 Mbps, with an RTT of 38 ms, following the settings from DASH-IF~\cite{dash-network-profiles}.
The values are kept constant during the entire client session so that we can analyze how the magnitude of performance changes as the bandwidth increases.

In our experiments, we set up a virtual camera that moves along a navigation path, and our access engine downloads segments in real time according to Algorithm~\ref{d3:next-segment}.
We log in a JSON file the time when a segment is requested and when it is received.
By doing so, we avoid wasting time and resources evaluating our system while downloading segments, and we store all the information necessary to plot the figures introduced in the subsequent sections.

We do not have pixel error due to compression.

We present experiments to validate our implementation choices at every step of our system.
We replay the user-generated camera paths with various bandwidth conditions while varying key components of our system.

Table~\ref{d3:experiments} sums up all the components we varied in our experiments.

We compare the impact of two space-partitioning trees, a $k$-d tree and an octree, on content preparation.
We also try several utility metrics for geometry segments: an offline one, which assigns to each geometry segment $s^G$ the cumulated 3D area of its faces, $\mathcal{A}_{3D}(s^G)$; an online one, which assigns to each geometry segment the inverse of its distance to the camera position; and finally our proposed method, as described in Section~\ref{d3:utility} ($\mathcal{A}_{3D}(s^G)/ \mathcal{D}{(v{(t_i)},AS^G)}^2$).
We consider two streaming policies to be applied by the client, proposed in Section~\ref{d3:dash-client}.
The greedy strategy determines, at each decision time, the segment that maximizes its predicted utility at arrival divided by its predicted delivery delay, which corresponds to equation (\ref{d3:greedy}).
The second streaming policy that we run is the one we proposed in equation (\ref{d3:smart}).
We have also analyzed the effect of grouping the faces in geometry segments of an adaptation set based on their 3D area.
Finally, we try several bandwidth parameters to study how our system can adapt to varying network conditions.

Grouping of Segments & Sorted based on area, Unsorted\\
Bandwidth & 2.5 Mbps, 5 Mbps, 10 Mbps \\\bottomrule
\end{tabular}
\caption{Different parameters in our experiments\label{d3:experiments}}
\end{table}

\subsection{Experimental Results}
\addlegendentry{\scriptsize octree}
\end{axis}
\end{tikzpicture}
\caption{Impact of the space-partitioning tree on the rendering quality with a 5 Mbps bandwidth.\label{d3:preparation}}
\end{figure}

Figure~\ref{d3:preparation} shows how the space partition can affect the rendering quality.
We use our proposed utility metric (see Section~\ref{d3:utility}) and the streaming policy from Equation~(\ref{d3:smart}) on content divided into adaptation sets obtained using either a $k$-d tree or an octree, and we run experiments on all camera paths at 5 Mbps.
The octree partitions content into non-homogeneous adaptation sets; as a result, some adaptation sets may contain smaller segments, which contain both important (large) and non-important polygons. For the $k$-d tree, we create cells containing the same number of faces $N_a$ (here, we take $N_a=10k$).
Figure~\ref{d3:preparation} shows that the system seems to be slightly less efficient with an octree than with a $k$-d tree based partition, but this result is not significant.
For the remaining experiments, partitioning is based on a $k$-d tree.

\begin{figure}[th]
\addlegendentry{\scriptsize Offline only}
\end{axis}
\end{tikzpicture}
\caption{Impact of the segment utility metric on the rendering quality with a 5 Mbps bandwidth.\label{d3:utility-impact}}
\end{figure}

Figure~\ref{d3:utility-impact} displays how a utility metric should take advantage of both offline and online features.
The experiments consider $k$-d tree cells for adaptation sets and the proposed streaming policy, on all camera paths.
We observe that a purely offline utility metric leads to poor PSNR results.
An online-only utility improves the results, as it takes the user viewing frustum into consideration, but still, the proposed utility (in Section~\ref{d3:utility}) performs better.

\begin{figure}[th]
\centering
\addlegendentry{\scriptsize Without sorting the faces}
\end{axis}
\end{tikzpicture}
\caption{Impact of creating the segments of an adaptation set based on decreasing 3D area of faces with a 5 Mbps bandwidth.\label{d3:sorting}}
\end{figure}

Figure~\ref{d3:sorting} shows the effect of grouping the segments in an adaptation set based on their area in 3D.
Clearly, the PSNR significantly improves when the 3D area of faces is considered for creating the segments. Since all segments are of the same size, sorting the faces by area before grouping them into segments leads to a skewed distribution of segment usefulness. This skewness means that the client's decision to download the segments with the largest utility first can make a bigger difference in quality.

We also compared the greedy vs.\ proposed streaming policy (as shown in Figure~\ref{d3:greedy-weakness}) for limited bandwidth (5 Mbps).
The proposed scheme outperforms the greedy policy during the first 30 seconds and does a better job overall.
Table~\ref{d3:greedy-vs-proposed} shows the average PSNR for the proposed method and the greedy method for different downloading bandwidths.
In the first 30 seconds, since relatively little 3D content has been downloaded, making a better decision about what to download matters more: we observe during that time that the proposed method yields 1 to 1.9 dB higher PSNR than the greedy policy.

Table~\ref{d3:percentages} shows the distribution of texture resolutions downloaded by the greedy policy and our proposed scheme, at different bandwidths.
Resolution 5 is the highest and 1 is the lowest.
The table clearly shows a weakness of the greedy policy: as the bandwidth increases, the distribution of downloaded texture resolutions stays more or less the same.
In contrast, our proposed streaming policy adapts to an increasing bandwidth by downloading higher resolution textures (13.9\% at 10 Mbps, vs. 0.3\% at 2.5 Mbps).
In other words, our system tends to favor geometry segments when the bandwidth is low.
\addlegendentry{\scriptsize Greedy}
\end{axis}
\end{tikzpicture}
\caption{Impact of the streaming policy (greedy vs.\ proposed) with a 5 Mbps bandwidth.}\label{d3:greedy-weakness}
\end{figure}

\begin{table}[th]
Proposed & 16.3 & 20.4 & 23.2 & & 23.8 & 28.2 & 31.1 \\
\bottomrule
\end{tabular}
\caption{Average PSNR, Greedy vs. Proposed\label{d3:greedy-vs-proposed}}
\end{table}

\begin{table}[th]
4 & 14.6\% vs 18.4\% & 14.4\% vs 25.2\% & 14.2\% vs 24.1\% \\
5 & 11.4\% vs 0.3\% & 11.1\% vs 5.9\% & 11.5\% vs 13.9\% \\\bottomrule
\end{tabular}
\caption{Percentages of downloaded bytes for textures from each resolution, for the greedy streaming policy (left) and for our proposed scheme (right)\label{d3:percentages}}
\end{table}

\subsection{On our way to DASH-3D}

DASH is made to be format agnostic, and even though it is almost exclusively applied to video streaming nowadays, we believe it is also suitable for 3D streaming.
Even though periods are not of much use for a scene that does not evolve over time, adaptation sets allow us to separate our content between geometry and textures, and they answer the questions raised in the conclusion of the previous chapter.

\chapter{DASH-3D}

\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{assets/dash-3d/bigpicture.png}
\caption{%
A subdivided 3D scene with a viewport, with regions delimited with
red edges. In white, the regions that are outside the field of view
of the camera; in green, the regions inside the field of view of the
camera.\label{d3:big-picture}
}
\end{figure}

\input{dash-3d/introduction}
\resetstyle{}