Merge branch 'main' of gitea.tforgione.fr:tforgione/phd-typst

Thomas Forgione 2023-06-12 17:32:56 +02:00
commit 1c0669df3c
4 changed files with 352 additions and 11 deletions


@ -219,12 +219,12 @@ We also tested an alternative greedy heuristic selecting the segment that optimi
// \label{d3:greedy}
// \end{equation}
// $
-== JavaScript client<d3:js-implementation>
+=== JavaScript client<d3:js-implementation>
In order to be able to evaluate our system, we need to collect traces and perform analyses on them.
Since our scene is large, and since the system we are describing allows navigating in a streaming scene, we developed a JavaScript web client that implements our utility metrics and policies.
-=== Media engine
+==== Media engine
Performance of our system is a key aspect in our work; as such, we can not use the default geometries described in Section~\ref{f:geometries} because of their poor performance, and we instead use buffer geometries.
However, in our system, the way changes happen to the 3D content is always the same: we only add faces and textures to the model.
@ -234,7 +234,7 @@ We therefore implemented a class that derives `BufferGeometry`, for more conveni
- It provides a method to add a new polygon to the geometry.
- It also keeps track of what part of the buffers has been transmitted to the GPU: THREE.js allows us to set the range of the buffer that we want to update, and we are able to update only what is necessary (see the sketch below).
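Such a wrapper can be sketched as follows, assuming a THREE.js version whose `BufferAttribute` exposes `setXYZ`, `setXY` and `updateRange`; the names `StreamingGeometry`, `addPolygon` and `flushToGPU` are illustrative and do not necessarily match our implementation.

```js
import * as THREE from 'three';

// Illustrative sketch: pre-allocated buffers, incremental polygon insertion,
// and partial GPU updates through updateRange and setDrawRange.
class StreamingGeometry extends THREE.BufferGeometry {
  constructor(maxVertices) {
    super();
    this.positions = new THREE.BufferAttribute(new Float32Array(maxVertices * 3), 3);
    this.uvs = new THREE.BufferAttribute(new Float32Array(maxVertices * 2), 2);
    this.setAttribute('position', this.positions);
    this.setAttribute('uv', this.uvs);
    this.vertexCount = 0; // vertices currently stored in the buffers
    this.uploaded = 0;    // vertices already transmitted to the GPU
  }

  // Append one triangle; `vertices` holds three {x, y, z, u, v} objects.
  addPolygon(vertices) {
    for (const v of vertices) {
      this.positions.setXYZ(this.vertexCount, v.x, v.y, v.z);
      this.uvs.setXY(this.vertexCount, v.u, v.v);
      this.vertexCount += 1;
    }
    this.setDrawRange(0, this.vertexCount);
  }

  // Upload only the part of the buffers that changed since the last frame.
  flushToGPU() {
    const count = this.vertexCount - this.uploaded;
    if (count === 0) return;
    this.positions.updateRange = { offset: this.uploaded * 3, count: count * 3 };
    this.uvs.updateRange = { offset: this.uploaded * 2, count: count * 2 };
    this.positions.needsUpdate = true;
    this.uvs.needsUpdate = true;
    this.uploaded = this.vertexCount;
  }
}
```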
-=== Our 3D model class<d3:model-class>
+==== Our 3D model class<d3:model-class>
As said in the previous subsections, a geometry and a material are bound together in a mesh.
This means that we are forced to have as many meshes as there are materials in our model.
To make this easy to manage, we implemented a *Model* class, that holds both geometry and textures.
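A minimal sketch of what such a class can look like is given below, building on the `StreamingGeometry` sketch above; the use of `MeshBasicMaterial` and the exact method names are assumptions for illustration only.

```js
import * as THREE from 'three';

// Illustrative sketch of a Model grouping one mesh per material.
// StreamingGeometry is the wrapper sketched above.
class Model {
  constructor(scene) {
    this.scene = scene;
    this.parts = new Map(); // material name -> { geometry, mesh }
  }

  // Lazily create the geometry and mesh for a material the first time
  // a face using that material arrives.
  partFor(materialName, texture) {
    if (!this.parts.has(materialName)) {
      const geometry = new StreamingGeometry(65536);
      const material = new THREE.MeshBasicMaterial({ map: texture });
      const mesh = new THREE.Mesh(geometry, material);
      this.scene.add(mesh);
      this.parts.set(materialName, { geometry, mesh });
    }
    return this.parts.get(materialName);
  }

  addFace(materialName, texture, vertices) {
    this.partFor(materialName, texture).geometry.addPolygon(vertices);
  }
}
```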
@ -301,7 +301,7 @@ In order to avoid having many models that share the same material (which would h
// \caption{\label{d3:render-structure}}
// \end{figure}
-=== Access client
+==== Access client
In order to be able to implement our view-dependent DASH-3D client, we need to implement the access client, which is responsible for deciding what to download and for downloading it.
To do so, we use the strategy pattern illustrated in @d3:dash-loader.
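The following sketch illustrates the pattern: the access client is generic, and the download policy is an interchangeable object; `mpd`, `segment.url` and `segment.onLoaded` are hypothetical names used only for illustration.

```js
// Illustrative sketch of the strategy pattern: the access client is generic
// and delegates the "what to download next" decision to a policy object.
class AccessClient {
  constructor(policy) {
    this.policy = policy; // e.g. a greedy or utility-based policy object
  }

  async run(mpd, camera, network) {
    while (mpd.hasRemainingSegments()) {
      // The policy decides which segment to request next...
      const segment = this.policy.nextSegment(mpd.remainingSegments(), camera, network);
      // ...and the access client downloads it and hands it to the media engine.
      const response = await fetch(segment.url);
      segment.onLoaded(await response.arrayBuffer());
    }
  }
}
```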
@ -358,7 +358,7 @@ parsed: this data can contain vertices, texture coordinates, normals, materials
// \caption{\label{d3:dash-loader}}
// \end{figure}
-=== Performance
+==== Performance
JavaScript requires the use of _web workers_ to perform parallel computing.
A web worker is a script in JavaScript that runs in the background, on a separate thread and that can communicate with the main script by sending and receiving messages.
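A minimal example of this messaging pattern is shown below; the file name, the `parseSegment` function and the `mediaEngine` entry point are placeholders, not our actual code.

```js
// main.js: offload segment parsing to a web worker to keep rendering smooth.
const worker = new Worker('parser-worker.js');
worker.onmessage = (event) => {
  // The worker sends back parsed vertices, texture coordinates, materials...
  mediaEngine.integrate(event.data); // hypothetical media-engine entry point
};
function parseInBackground(segmentBytes) {
  worker.postMessage(segmentBytes);
}

// parser-worker.js: runs on a separate thread.
self.onmessage = (event) => {
  const parsed = parseSegment(event.data); // hypothetical parsing function
  self.postMessage(parsed);
};
```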
@ -482,7 +482,7 @@ A sequence diagram of what happens when downloading, parsing and rendering conte
// \end{figure}
-== Rust client<d3i:rust-implementation>
+=== Rust client<d3i:rust-implementation>
However, a web client is not sufficient to analyse our streaming policies: many tasks are performed (such as rendering, and managing the interaction) and all this overhead pollutes the analysis of our policies.
This is why we also implemented a client in Rust, for simulation, so we can gather precise simulated data.

dash-3d/conclusion.typ (new file, 18 lines added)

@ -0,0 +1,18 @@
== Conclusion<d3:conclusion>
Our work in this chapter started with the question: can DASH be used for NVE?
The answer is _yes_.
In answering this question, we contributed by showing how to organize a polygon soup and its textures into a DASH-compliant format that (i) includes a minimal amount of metadata that is useful for the client, and (ii) organizes the data to allow the client to get the most useful content first.
We further show that this data organization and its description with metadata (precomputed offline) are sufficient to design and build an adaptive DASH client: it selectively downloads segments within its view, makes informed decisions about what to download, and balances geometry and texture while adapting to the network bandwidth.
This way, our system addresses the open problems we mentioned in @i:challenges.
- *It prepares and structures the content in a way that enables streaming*: all this preparation is precomputed, and all the content, geometry as well as materials and textures, is structured according to the DASH framework. Furthermore, textures are prepared in a multi-resolution manner, and even though multi-resolution geometry is not discussed here, integrating it into this system seems moderately difficult: we could encode levels of detail as different representations, define a utility metric for each representation, and the system should adapt naturally.
- *We are able to estimate the utility of each segment* by exploiting all the metadata given in the MPD and by analysing the camera parameters of the user.
- *We proposed a few streaming policies*, from the easiest to implement to the most complex, so that the client can exploit the utility metrics to make a best guess at the next segment to download.
- *The implementation is efficient*: the content preparation allows a client to get all the information it needs from metadata, and the server has nothing to do other than serve files. Special attention has been paid to the client's performance.
However, the work described in this chapter does not take any quality of experience metrics into account.
We designed a 3D streaming system, but we kept the interaction system the simplest possible.
Dealing with interaction on top of all the other problems we address seems hard, and we believe keeping the interaction simple was a necessary step towards building a solid 3D streaming system.
Now that we have this system, we are able to work on the interaction problem again; our work and conclusions on this topic are given in @sb.

dash-3d/evaluation.typ (new file, 326 lines added)

@ -0,0 +1,326 @@
== Evaluation<d3:evaluation>
We now describe our setup and the data we use in our experiments. We present an evaluation of our system and a comparison of the impact of the design choices we introduced in the previous sections.
=== Experimental setup
==== Model
We use a city model of the Marina Bay area in Singapore in our experiments.
The model came in 3DS Max format and was converted into Wavefront OBJ format before the processing described in @d3:dash-3d.
The converted model has 387,551 vertices and 552,118 faces.
@d3:size gives some general information about the model and @d3:heterogeneity illustrates the heterogeneity of our model (wireframe rendering is used to illustrate the heterogeneity of the geometry complexity).
We partition the geometry into a $k$-d tree until each leaf has fewer than 10000 faces, which gives us 64 adaptation sets, plus one containing the large faces.
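The partitioning step can be sketched as follows, assuming each face carries a precomputed centroid; the separate adaptation set for large faces is omitted from this sketch.

```js
// Illustrative sketch of the k-d partition used for content preparation:
// split the faces at the median centroid along alternating axes until a
// cell holds fewer than maxFaces faces; each leaf becomes an adaptation set.
function kdPartition(faces, maxFaces = 10000, axis = 0) {
  if (faces.length < maxFaces || faces.length < 2) {
    return [faces]; // one leaf of the k-d tree = one adaptation set
  }
  const key = ['x', 'y', 'z'][axis];
  const sorted = [...faces].sort((a, b) => a.centroid[key] - b.centroid[key]);
  const mid = Math.floor(sorted.length / 2);
  const next = (axis + 1) % 3;
  return [
    ...kdPartition(sorted.slice(0, mid), maxFaces, next),
    ...kdPartition(sorted.slice(mid), maxFaces, next),
  ];
}
```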
#figure(
grid(
columns: (1fr, 0.2fr, 1fr),
{
align(center + bottom,
figure(
image("../assets/dash-3d/heterogeneity/low-res-wire.png", width: 100%),
caption: [Low resolution geometry]
)
)
},
[],
{
align(center + bottom,
figure(
image("../assets/dash-3d/heterogeneity/no-textures.png", width: 100%),
caption: [Simplistic textures replicated]
)
)
},
{
align(center + bottom,
figure(
image("../assets/dash-3d/heterogeneity/high-res-wire.png", width: 100%),
caption: [High resolution geometry]
)
)
},
[],
{
align(center + bottom,
figure(
image("../assets/dash-3d/heterogeneity/high-res-textures.png", width: 100%),
caption: [Detailed textures]
)
)
},
),
caption: [Illustration of the heterogeneity of the model]
)<d3:heterogeneity>
#figure(
table(
columns: (auto, auto),
align: left,
[*Files*], [*Size*],
[3DS Max], [55 MB],
[OBJ file], [62 MB],
[MTL file], [0.27 MB],
[Textures (high res)], [167 MB],
[Textures (low res)], [11 MB],
),
caption: [Sizes of the different files of the model]
)<d3:size>
==== User navigations
To evaluate our system, we collected realistic user navigation traces which we can replay in our experiments.
We presented six users with a web interface, on which the model was loaded progressively while the user interacted with it.
The available interactions were inspired by traditional first-person interactions in video games, i.e., W, A, S, and D keys to translate the camera, and mouse to rotate the camera.
We asked users to browse and explore the scene until they felt they had visited all important regions.
We then asked them to produce camera navigation paths that would best present the 3D scene to a user that would discover it.
To record a path, the users first place their camera at their preferred starting point, then click on a button to start recording.
Every 100 ms, the camera position, viewing angle, and look-at point are saved into an array, which is then exported in JSON format.
The recorded camera trace allows us to replay each camera path to perform our simulations and evaluate our system.
We collected 13 camera paths this way.
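A simplified sketch of the recording logic of the web interface is given below; the exact fields stored in our traces may differ slightly.

```js
import * as THREE from 'three';

// Illustrative sketch of the path recorder: every 100 ms, store the camera
// state; when recording stops, the trace is exported as one JSON file.
const trace = [];
let timer = null;

function startRecording(camera) {
  timer = setInterval(() => {
    trace.push({
      position: camera.position.toArray(),
      rotation: [camera.rotation.x, camera.rotation.y, camera.rotation.z],
      direction: camera.getWorldDirection(new THREE.Vector3()).toArray(),
    });
  }, 100);
}

function stopRecording() {
  clearInterval(timer);
  return JSON.stringify(trace); // saved as one JSON file per camera path
}
```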
==== Network setup
We tested our implementation under three network bandwidths of 2.5 Mbps, 5 Mbps, and 10 Mbps, with an RTT of 38 ms, following the settings from DASH-IF @dash-network-profiles.
These values are kept constant during the entire client session, so that we can analyze how performance changes as the bandwidth increases.
In our experiments, we set up a virtual camera that moves along a navigation path, and our access engine downloads segments in real time according to the algorithm in @d3:next-segment.
We log in a JSON file the time when a segment is requested and when it is received.
By doing so, we avoid wasting time and resources evaluating our system while segments are being downloaded, and we store all the information necessary to plot the figures introduced in the subsequent sections.
==== Hardware and software
The experiments were run on an Acer Aspire V3 with an Intel Core i7 3632QM processor and an NVIDIA GeForce GT 740M graphics card.
The DASH client is written in Rust#footnote[#link("https://www.rust-lang.org/")], using Glium#footnote[#link("https://github.com/glium/glium")] for rendering, and reqwest#footnote[#link("https://github.com/seanmonstar/reqwest/")] to load the segments.
==== Metrics
To objectively evaluate the quality of the resulting rendering, we use PSNR.
The scene rendered offline along the same camera path, with all the geometry and texture data available, is used as ground truth.
Note that a pixel error can occur in our case only in two situations: (i) when a face is missing, in which case the color of the background object is shown, and (ii) when a texture is either missing or downsampled.
We do not have pixel error due to compression.
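As a reference, PSNR between a streamed frame and its ground-truth counterpart can be computed as in the following sketch, assuming 8-bit RGBA pixel buffers.

```js
// Illustrative sketch: PSNR (in dB) between a streamed frame and the
// ground-truth frame rendered with all the data, both as RGBA byte arrays.
function psnr(frame, groundTruth) {
  let squaredError = 0;
  let count = 0;
  for (let i = 0; i < frame.length; i++) {
    if ((i & 3) === 3) continue; // skip the alpha channel
    const diff = frame[i] - groundTruth[i];
    squaredError += diff * diff;
    count += 1;
  }
  const mse = squaredError / count;
  return 10 * Math.log10((255 * 255) / mse); // Infinity if the frames match
}
```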
==== Experiments
We present experiments to validate our implementation choices at every step of our system.
We replay the user-generated camera paths with various bandwidth conditions while varying key components of our system.
@d3:experiments sums up all the components we varied in our experiments.
We compare the impact of two space-partitioning trees, a $k$-d tree and an octree, on content preparation.
We also try several utility metrics for geometry segments: an offline one, which assigns to each geometry segment $s^G$ the cumulative 3D area of its faces, $cal(A)(s^G)$; an online one, which assigns to each geometry segment the inverse of its distance to the camera position; and finally our proposed metric, described in @d3:utility ($(cal(A)(s^G)) / (cal(D)(v(t_i), "AS"^G)^2)$).
We consider two streaming policies to be applied by the client, proposed in @d3:dash-client.
The greedy strategy determines, at each decision time, the segment that maximizes its predicted utility at arrival divided by its predicted delivery delay, which corresponds to @d3:greedy.
The second streaming policy that we run is the one we proposed in @d3:smart.
We have also analyzed the effect of grouping the faces in geometry segments of an adaptation set based on their 3D area.
Finally, we try several bandwidth parameters to study how our system can adapt to varying network conditions.
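To make the compared utility metrics and the greedy policy concrete, here is an illustrative sketch; `segment.area`, `segment.size`, `segment.distanceToCamera`, `network.rtt` and `network.bandwidth` are assumed fields, and the prediction of the camera position at arrival time is omitted.

```js
// Illustrative sketch of the three utility metrics compared in our
// experiments; `segment.area` stands for the precomputed cumulative 3D area
// of the segment's faces, and `distance` for the camera-to-adaptation-set
// distance.
const offlineUtility  = (segment, distance) => segment.area;
const onlineUtility   = (segment, distance) => 1 / distance;
const proposedUtility = (segment, distance) => segment.area / (distance * distance);

// Greedy policy: pick the candidate maximizing predicted utility divided by
// predicted delivery delay (delay estimated from segment size and bandwidth).
function greedyNextSegment(candidates, utility, network) {
  let best = null;
  let bestScore = -Infinity;
  for (const segment of candidates) {
    const delay = network.rtt + segment.size / network.bandwidth; // seconds
    const score = utility(segment, segment.distanceToCamera) / delay;
    if (score > bestScore) {
      bestScore = score;
      best = segment;
    }
  }
  return best;
}
```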
#figure(
table(
columns: (auto, auto),
align: left,
[*Parameters*], [*Values*],
[Content preparation], [Octree, $k$-d tree],
[Utility], [Offline, Online, Proposed],
[Streaming policy], [Greedy, Proposed ],
[Grouping of Segments], [Sorted based on area, Unsorted],
[Bandwidth], [2.5 Mbps, 5 Mbps, 10 Mbps],
),
caption: [Different parameters in our experiments],
)<d3:experiments>
=== Experimental results
#figure(
caption: [Impact of the space-partitioning tree on the rendering quality with a 5Mbps bandwidth],
[TODO],
)<d3:preparation>
// \begin{figure}[th]
// \centering
// \begin{tikzpicture}
// \begin{axis}[
// xlabel=Time (in s),
// ylabel=PSNR,
// no markers,
// cycle list name=mystyle,
// width=\tikzwidth,
// height=\tikzheight,
// legend pos=south east,
// xmin=0,
// xmax=90,
// ]
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
// \addlegendentry{\scriptsize $k$-d tree}
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/2/curve.dat};
// \addlegendentry{\scriptsize octree}
// \end{axis}
// \end{tikzpicture}
// \caption{Impact of the space-partitioning tree on the rendering quality with a 5Mbps bandwidth.\label{d3:preparation}}
// \end{figure}
@d3:preparation shows how the space partition can affect the rendering quality.
We use our proposed utility metrics (see @d3:utility) and the streaming policy from @d3:smart, on content divided into adaptation sets obtained using either a $k$-d tree or an octree, and run experiments on all camera paths at 5 Mbps.
The octree partitions content into non-homogeneous adaptation sets; as a result, some adaptation sets may contain smaller segments, which contain both important (large) and non-important polygons. For the $k$-d tree, we create cells containing the same number of faces $N_a$ (here, we take $N_a=10000$).
@d3:preparation shows that the system seems to be slightly less efficient with an octree than with a $k$-d tree based partition, but this difference is not significant.
For the remaining experiments, partitioning is based on a $k$-d tree.
#figure(
caption: [Impact of the segment utility metric on the rendering quality with a 5Mbps bandwidth],
[TODO]
)<d3:utility-impact>
// \begin{figure}[th]
// \centering
// \begin{tikzpicture}
// \begin{axis}[
// xlabel=Time (in s),
// ylabel=PSNR,
// no markers,
// cycle list name=mystyle,
// width=\tikzwidth,
// height=\tikzheight,
// legend pos=south east,
// xmin=0,
// xmax=90,
// ]
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
// \addlegendentry{\scriptsize Proposed}
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/3/curve.dat};
// \addlegendentry{\scriptsize Online only}
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/4/curve.dat};
// \addlegendentry{\scriptsize Offline only}
// \end{axis}
// \end{tikzpicture}
// \caption{Impact of the segment utility metric on the rendering quality with a 5Mbps bandwidth.\label{d3:utility-impact}}
// \end{figure}
@d3:utility-impact shows how a utility metric benefits from taking advantage of both offline and online features.
The experiments use $k$-d tree cells as adaptation sets and the proposed streaming policy, on all camera paths.
We observe that a purely offline utility metric leads to poor PSNR results.
An online-only utility improves the results, as it takes the user's viewing frustum into consideration, but the proposed utility (described in @d3:utility) still performs better.
#figure(
caption: [Impact of creating the segments of an adaptation set based on decreasing 3D area of faces with a 5Mbps bandwidth],
[TODO]
)<d3:sorting>
// \begin{figure}[th]
// \centering
// \begin{tikzpicture}
// \begin{axis}[
// xlabel=Time (in s),
// ylabel=PSNR,
// no markers,
// cycle list name=mystyle,
// width=\tikzwidth,
// height=\tikzheight,
// legend pos=south east,
// xmin=0,
// xmax=90,
// ]
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
// \addlegendentry{\scriptsize Sorting the faces by area}
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/5/curve.dat};
// \addlegendentry{\scriptsize Without sorting the faces}
// \end{axis}
// \end{tikzpicture}
// \caption{Impact of creating the segments of an adaptation set based on decreasing 3D area of faces with a 5Mbps bandwidth.\label{d3:sorting}}
// \end{figure}
@d3:sorting shows the effect of building the segments of an adaptation set based on the 3D area of their faces.
The PSNR significantly improves when the 3D area of faces is considered when creating the segments. Since all segments have the same size, sorting the faces by area before grouping them into segments leads to a skewed distribution of segment utility. This skewness means that the client's decision to download the segments with the largest utility first makes a bigger difference in quality.
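The sorting step can be sketched as follows; `face.area` is the precomputed 3D area of a face, and `facesPerSegment` is chosen so that segments have a fixed size.

```js
// Illustrative sketch: within an adaptation set, sort the faces by
// decreasing 3D area before packing them into fixed-size segments, so that
// the first segments concentrate most of the visible area.
function buildSegments(faces, facesPerSegment) {
  const sorted = [...faces].sort((a, b) => b.area - a.area);
  const segments = [];
  for (let i = 0; i < sorted.length; i += facesPerSegment) {
    segments.push(sorted.slice(i, i + facesPerSegment));
  }
  return segments;
}
```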
We also compared the greedy vs. proposed streaming policy (as shown in @d3:greedy-weakness) for limited bandwidth (5 Mbps).
The proposed scheme outperforms the greedy policy during the first 30 s and performs better overall.
@d3:greedy-vs-proposed shows the average PSNR for the proposed method and the greedy method at different download bandwidths.
In the first 30 s, since relatively little 3D content has been downloaded yet, making better decisions about what to download matters more: during that time, the proposed method yields a PSNR 1 to 1.9 dB higher than the greedy method.
@d3:percentages shows the distribution of texture resolutions that are downloaded by greedy and our proposed scheme, at different bandwidths.
Resolution 5 is the highest and 1 is the lowest.
The table shows a weakness of the greedy policy: the distribution of downloaded textures does not adapt to the bandwidth.
In contrast, our proposed streaming policy adapts to an increasing bandwidth by downloading higher resolution textures (13.9% at 10 Mbps, vs. 0.3% at 2.5 Mbps).
In fact, an interesting feature of our proposed streaming policy is that it adapts the geometry-texture compromise to the bandwidth. The textures represent 57.3% of the total amount of downloaded bytes at 2.5 Mbps, and 70.2% at 10 Mbps.
In other words, our system tends to favor geometry segments when the bandwidth is low, and favor texture segments when the bandwidth increases.
#figure(
caption: [Impact of the streaming policy (greedy vs.\ proposed) with a 5 Mbps bandwidth],
[TODO]
)<d3:greedy-weakness>
// \begin{figure}[th]
// \centering
// \begin{tikzpicture}
// \begin{axis}[
// xlabel=Time (in s),
// ylabel=PSNR,
// no markers,
// cycle list name=mystyle,
// width=\tikzwidth,
// height=\tikzheight,
// legend pos=south east,
// xmin=0,
// xmax=90,
// ]
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/1/curve.dat};
// \addlegendentry{\scriptsize Proposed}
// \addplot table [y=psnr, x=time]{assets/dash-3d/gnuplot/6/curve.dat};
// \addlegendentry{\scriptsize Greedy}
// \end{axis}
// \end{tikzpicture}
// \caption{Impact of the streaming policy (greedy vs.\ proposed) with a 5 Mbps bandwidth.}\label{d3:greedy-weakness}
// \end{figure}
#figure(
table(
columns: (auto, auto, auto, auto, auto, auto, auto),
align: left,
[*BW (in Mbps)*], [2.5], [5], [10], [2.5], [5], [10],
[*Greedy*], [14.4], [19.4], [22.1], [19.8], [26.9], [29.7],
[*Proposed*], [16.3], [20.4], [23.2], [23.8], [28.2], [31.1],
),
caption: [Average PSNR, Greedy vs. Proposed (columns 2 to 4: first 30 seconds; columns 5 to 7: overall)],
)<d3:greedy-vs-proposed>
// \begin{table}[th]
// \centering
// \begin{tabular}{@{}p{2.5cm}p{0.7cm}p{0.7cm}p{0.7cm}p{0.3cm}p{0.7cm}p{0.7cm}p{0.7cm}@{}}
// \toprule
// \multirow{2}{1.7cm}{} & \multicolumn{3}{c}{\textbf{First 30 Sec}} & & \multicolumn{3}{c}{\textbf{Overall}}\\
// \cline{2-8}
// BW (in Mbps) & 2.5 & 5 & 10 & & 2.5 & 5 & 10 \\
// \midrule
// Greedy & 14.4 & 19.4 & 22.1 & & 19.8 & 26.9 & 29.7 \\
// Proposed & 16.3 & 20.4 & 23.2 & & 23.8 & 28.2 & 31.1 \\
// \bottomrule
// \end{tabular}
// \caption{Average PSNR, Greedy vs. Proposed\label{d3:greedy-vs-proposed}}
// \end{table}
#figure(
table(
columns: (auto, auto, auto, auto),
[*Resolutions*], [*2.5 Mbps*], [*5 Mbps*], [*10 Mbps*],
[1], [ 5.7% vs 1.4%], [ 6.3% vs 1.4%], [6.17% vs 1.4%],
[2], [10.9% vs 8.6%], [13.3% vs 7.8%], [14.0% vs 8.3%],
[3], [15.3% vs 28.6%], [20.1% vs 24.3%], [20.9% vs 22.5%],
[4], [14.6% vs 18.4%], [14.4% vs 25.2%], [14.2% vs 24.1%],
[5], [11.4% vs 0.3%], [11.1% vs 5.9%], [11.5% vs 13.9%],
),
caption: [Percentages of downloaded bytes for textures from each resolution, for the greedy streaming policy (left) and for our proposed scheme (right)],
)<d3:percentages>
// \begin{table}[th]
// \centering
// \renewcommand{\arraystretch}{1.2}
// \begin{tabular}{@{}cccc@{}}
// \toprule
// \textbf{Resolutions} & \textbf{2.5 Mbps} & \textbf{5 Mbps} & \textbf{10 Mbps} \\
// \midrule
// 1 & 5.7% vs 1.4% & 6.3% vs 1.4% & 6.17% vs 1.4% \\
// 2 & 10.9% vs 8.6% & 13.3% vs 7.8% & 14.0% vs 8.3%\\
// 3 & 15.3% vs 28.6% & 20.1% vs 24.3% & 20.9% vs 22.5% \\
// 4 & 14.6% vs 18.4% & 14.4% vs 25.2% & 14.2% vs 24.1% \\
// 5 & 11.4% vs 0.3% & 11.1% vs 5.9% & 11.5% vs 13.9% \\
// \bottomrule
// \end{tabular}
// \caption{\label{d3:percentages}}
// \end{table}


@ -29,8 +29,5 @@ We finally evaluate these system parameters under different bandwidth setups and
#include("introduction.typ") #include("introduction.typ")
#include("content-preparation.typ") #include("content-preparation.typ")
#include("client.typ") #include("client.typ")
-\input{dash-3d/content-preparation}
-\input{dash-3d/client}
-\input{dash-3d/evaluation}
-\input{dash-3d/conclusion}
+#include("evaluation.typ")
+#include("conclusion.typ")