Did stuff

Thomas Forgione 2019-10-17 11:00:09 +02:00
parent 966a294538
commit 0b1383c5b6
5 changed files with 28 additions and 21 deletions

View File

@ -219,11 +219,11 @@ Since our scene is large, and since the system we are describing allows navigati
\subsubsection{Media engine}
Of course, in this work, we are concerned about the performance of our system: because of their poor performance, we cannot use the default geometries described in Section~\ref{f:geometries}, and we use buffer geometries instead.
However, in our system, the way changes happen to the 3D content is always the same: we only add faces and textures to the model.
Therefore, we write a class that derives from \texttt{BufferGeometry} for convenience; a sketch of it follows the list below.
\begin{itemize}
\item It has a constructor that takes the number of faces as a parameter: it allocates all the memory needed for our buffers so that we never have to reallocate it (which would be inefficient).
\item It keeps track of the number of faces it is currently holding: it can then avoid rendering faces that have not been filled and knows where to put new faces.
\item It provides a method that adds a face to the geometry.
\item It also keeps track of what part of the buffers has been transmitted to the GPU\@: THREE.js allows us to set the range of the buffer that we want to update, so we are able to update only what is necessary.
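A minimal JavaScript sketch of such a class is given below. The class name \texttt{BoundedBufferGeometry}, the attribute layout and the method names are hypothetical; only the use of preallocated \texttt{THREE.BufferAttribute}s, \texttt{setDrawRange} and the attribute update range reflects the mechanisms described above (API names as in the THREE.js releases contemporary with this work).
\begin{lstlisting}
// Hypothetical sketch: a geometry that preallocates its buffers.
class BoundedBufferGeometry extends THREE.BufferGeometry {
  constructor(maxFaces) {
    super();
    // Allocate everything up front: 3 vertices per face, 3 coordinates per vertex.
    this.positions = new Float32Array(maxFaces * 9);
    this.uvs = new Float32Array(maxFaces * 6);
    this.setAttribute('position', new THREE.BufferAttribute(this.positions, 3));
    this.setAttribute('uv', new THREE.BufferAttribute(this.uvs, 2));
    this.faceCount = 0;      // faces currently held
    this.transmitted = 0;    // faces already sent to the GPU
    this.setDrawRange(0, 0); // do not render unfilled faces
  }

  // Adds a face given as three vertices and their texture coordinates.
  addFace(vertices, uvs) {
    for (let i = 0; i < 3; i++) {
      this.positions.set(vertices[i], this.faceCount * 9 + 3 * i);
      this.uvs.set(uvs[i], this.faceCount * 6 + 2 * i);
    }
    this.faceCount++;
    this.setDrawRange(0, this.faceCount * 3);
  }

  // Marks only the newly added range of the buffers for upload to the GPU.
  flushToGPU() {
    for (const [name, stride] of [['position', 9], ['uv', 6]]) {
      const attribute = this.getAttribute(name);
      attribute.updateRange.offset = this.transmitted * stride;
      attribute.updateRange.count = (this.faceCount - this.transmitted) * stride;
      attribute.needsUpdate = true;
    }
    this.transmitted = this.faceCount;
  }
}
\end{lstlisting}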
@ -232,7 +232,7 @@ Therefore, we made a class that derives BufferGeometry, and that makes it more c
\paragraph{Our 3D model class.\label{d3:model-class}}
As said in the previous subsections, a geometry and a material are bound together in a mesh.
This means that we are forced to have as many meshes as there are materials in our model.
To make this easy to manage, we write a \textbf{Model} class that holds both the geometry and the textures.
We can add vertices, faces, and materials to this model, and it will internally deal with the right geometries, materials and meshes.
In order to avoid having many meshes that use the same material, which would harm performance, the model automatically merges faces sharing a material into the same buffer geometry, as shown in Figure~\ref{d3:render-structure}.
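A simplified JavaScript sketch of this class is shown below; the names (\texttt{Model}, \texttt{partFor}, \texttt{addFace}) and the exact bookkeeping are hypothetical, and the sketch assumes the preallocated geometry class sketched earlier.
\begin{lstlisting}
// Hypothetical sketch: one buffer geometry and one mesh per material.
class Model {
  constructor(maxFacesPerPart) {
    this.maxFacesPerPart = maxFacesPerPart;
    this.parts = new Map(); // material name -> { geometry, mesh }
  }

  // Returns the part bound to a material, creating it on first use.
  partFor(materialName, material) {
    if (!this.parts.has(materialName)) {
      const geometry = new BoundedBufferGeometry(this.maxFacesPerPart);
      const mesh = new THREE.Mesh(geometry, material);
      this.parts.set(materialName, { geometry, mesh });
    }
    return this.parts.get(materialName);
  }

  // Faces sharing a material are merged into the same buffer geometry.
  addFace(materialName, material, vertices, uvs) {
    this.partFor(materialName, material).geometry.addFace(vertices, uvs);
  }
}
\end{lstlisting}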
@ -293,12 +293,13 @@ In order to avoid having many models that have the same material which would har
\subsubsection{Access client}
In order to implement our view-dependent DASH-3D client, we need an access client, which is responsible for deciding what to download and for downloading it.
To do so, we use the strategy pattern, as shown in Figure~\ref{d3:dash-loader}.
We have a base class named \texttt{LoadingPolicy} that contains attributes and functions to keep track of what has been downloaded.
A derived class can use this data to make smart decisions; the base class also exposes a function named \texttt{nextSegment} that takes two arguments:
\begin{itemize}
\item the MPD, so that a strategy can know all the metadata of the segments before making its decision;
\item the camera, because the next best segment depends on the position of the camera.
\end{itemize}
The greedy, greedy predictive and proposed policies from the previous chapter are all classes that derive from \texttt{LoadingPolicy}.
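A minimal JavaScript sketch of this pattern follows; the bookkeeping attributes, \texttt{mpd.segments}, \texttt{segment.id} and the \texttt{utility} function are hypothetical placeholders, and only the \texttt{nextSegment(mpd, camera)} interface comes from the description above.
\begin{lstlisting}
// Hypothetical sketch of the strategy pattern used by the access client.
class LoadingPolicy {
  constructor() {
    this.downloaded = new Set(); // bookkeeping shared by every policy
  }

  markDownloaded(segment) {
    this.downloaded.add(segment.id);
  }

  // Overridden by derived classes: returns the next segment to request.
  nextSegment(mpd, camera) {
    throw new Error('nextSegment must be implemented by a derived class');
  }
}

class GreedyPolicy extends LoadingPolicy {
  // Picks the not-yet-downloaded segment with the highest utility.
  nextSegment(mpd, camera) {
    let best = null;
    for (const segment of mpd.segments) {
      if (this.downloaded.has(segment.id)) continue;
      const score = utility(segment, camera); // utility as defined in the previous chapter
      if (best === null || score > best.score) best = { segment, score };
    }
    return best && best.segment;
  }
}
\end{lstlisting}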
@ -351,8 +352,9 @@ In JavaScript, there is no way of doing parallel computing without using \emph{w
A web worker is a JavaScript script that runs in the background on a separate thread and that can communicate with the main script by sending and receiving messages.
Since our system has many tasks to do, it seems natural to use workers to manage the streaming without impacting the framerate of the renderer.
However, what a worker can do is very limited, since it cannot access the variables of the main script.
Because of this, we are forced to run the renderer on the main script, where it can access the HTML page, and we move all the other tasks (the access client, the control engine and the segment parsers) to the worker.
Since the main script is the only thread communicating with the GPU, it still has to update the model with the parsed content it receives from the worker.
Using a worker does not improve the framerate of the system much, but it reduces the latency that occurs when receiving a new segment, which matters a lot: in a single-thread scenario, each time a segment is received, the interface freezes for around half a second, which is very frustrating for the user.
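The division of work between the two threads can be sketched as follows; the message format, the \texttt{parseSegment} function and the \texttt{policy}, \texttt{mpd}, \texttt{materials} and \texttt{model} objects are hypothetical, and only the split (rendering and GPU updates on the main script, downloading and parsing in the worker) follows the description above.
\begin{lstlisting}
// Main script (hypothetical message format): rendering and GPU uploads stay here.
const worker = new Worker('streaming-worker.js');
worker.onmessage = (event) => {
  // The worker sends already parsed faces; only the model update happens here.
  const { materialName, faces } = event.data;
  for (const face of faces) {
    model.addFace(materialName, materials[materialName], face.vertices, face.uvs);
  }
};
// The worker needs the camera position to choose the next segment.
worker.postMessage({ type: 'camera', position: camera.position.toArray() });

// streaming-worker.js: access client, control engine and segment parsers run here.
onmessage = async (event) => {
  if (event.data.type !== 'camera') return;
  const segment = policy.nextSegment(mpd, event.data);
  const response = await fetch(segment.url);
  const faces = parseSegment(await response.arrayBuffer());
  postMessage({ materialName: segment.material, faces });
};
\end{lstlisting}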
A sequence diagram of what happens when downloading, parsing and rendering content is shown in Figure~\ref{d3:sequence}.
\begin{figure}[ht]
@ -476,6 +478,6 @@ In order to be able to run simulations, we develop the bricks of the DASH client
\item the \textbf{simulator} takes a user trace as a parameter; it then replays the trace using specific parameters of the access client and outputs a file containing the history of the simulation (which files have been downloaded, and when);
\item the \textbf{renderer} takes the user trace as well as the history generated by the simulator as parameters, and renders images that correspond to what would have been seen.
\end{itemize}
When simulating experiments, we run the simulator on many traces that we collected during user studies, and we then run the renderer program on those traces to generate the images corresponding to each simulation.
We are then able to compute the PSNR between those frames and the ground-truth frames.
Doing so guarantees that our simulation results are not affected by the performance of our renderer.
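As a reminder, for 8-bit frames the PSNR is $10\log_{10}\left(255^2 / \mathrm{MSE}\right)$, where MSE is the mean squared error between the two frames. A small JavaScript helper computing it could look like the sketch below (representing frames as flat arrays of pixel values is an assumption).
\begin{lstlisting}
// Hypothetical helper: PSNR between a simulated frame and the ground truth,
// both given as flat arrays of 8-bit pixel values of equal length.
function psnr(frame, groundTruth) {
  let sum = 0;
  for (let i = 0; i < frame.length; i++) {
    const d = frame[i] - groundTruth[i];
    sum += d * d;
  }
  const mse = sum / frame.length;
  return mse === 0 ? Infinity : 10 * Math.log10((255 * 255) / mse);
}
\end{lstlisting}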

View File

@ -9,7 +9,7 @@ We further show that the data organisation and its description with metadata (pr
This way, our system addresses the open problems we mentioned in~\ref{i:challenges}.
\begin{itemize}
\item \textbf{It prepares and structures the content in a way that enables streaming}: all this preparation is precomputed, and all the content is structured according to the DASH framework, not only geometry but also materials and textures. Furthermore, textures are prepared in a multi-resolution manner, and even though multi-resolution geometry is not discussed here, the difficulty of integrating it in this system seems moderate: we could encode levels of detail in different representations and define a utility metric for each representation, and the system should adapt naturally.
\item \textbf{We are able to estimate the utility of each segment} by exploiting all the metadata given in the MPD and by analysing the camera parameters of the user.
\item \textbf{We propose a few streaming policies}, from the easiest to implement to the most complex, so that the client exploits the utility metrics to define a best guess for the next chunk to download.
\item \textbf{The implementation is efficient}: the content preparation allows a client to get all the information it needs from the metadata, and the server has nothing to do other than serve files. Special attention has been paid to the client's performance.

View File

@ -16,14 +16,14 @@ We utilize adaptation sets to organize a 3D scene's material, geometry, and text
\fresh{}
The piece of software that preprocesses the model mostly consists of file manipulation and is written in Rust as well.
It successively preprocesses the geometry and then the textures.
The MPD is generated by a library named \href{https://github.com/netvl/xml-rs}{xml-rs} that works like a stack:
\begin{itemize}
\item a structure is created on the root of the MPD file;
\item the \texttt{start\_element} method creates a new child in the XML file;
\item the \texttt{end\_element} method ends the current child and pops the stack.
\end{itemize}
This structure is passed along to our geometry and texture preprocessors, which can add elements to the XML file as they generate the corresponding data chunks.
\copied{}
@ -114,3 +114,6 @@ For textures, each representation contains a single segment.
]{assets/dash-3d/geometry-as.xml}
\end{figure}
\fresh{}
Now that the 3D data is partitioned and the MPD file is generated, we see in the next section how the client uses the MPD to request the appropriate data chunks.

View File

@ -3,11 +3,11 @@
In this chapter, we take a step back from interaction and propose a system with simple interactions that nevertheless addresses most of the open problems mentioned in Section~\ref{i:challenges}.
We take inspiration from video streaming: building on the similarities between video streaming and 3D streaming (seen in~\ref{i:video-vs-3d}), we benefit from the efficiency of DASH (seen in~\ref{sote:dash}) for streaming 3D content.
DASH is based on content preparation and structuring, which not only helps the streaming policies but also leads to a scalable and efficient system, since it completely moves the load from the server to the clients.
A DASH client simply downloads the description of the content and then, depending on its needs and independently of the server, decides what to download.
In this chapter, we show how to mimic DASH video streaming with 3D streaming, and we develop a system that keeps the benefits of DASH.
Section~\ref{d3:dash-3d} describes our content preparation and metadata, as well as all the preprocessing that is done to our model to allow efficient streaming.
Section~\ref{d3:dash-client} gives possible implementations of clients that exploit the content structure.
Section~\ref{d3:evaluation} evaluates the impact of the different parameters that appear both in the content preparation and the client.
Finally, Section~\ref{d3:conclusion} sums up our work and explains how it tackles the challenges raised in the conclusion of the previous chapter.

View File

@ -15,15 +15,17 @@
Dynamic Adaptive Streaming over HTTP (DASH) is now a widely deployed standard for video streaming, and even though video streaming and 3D streaming are different problems, many of DASH's features can inspire us for 3D streaming.
In this chapter, we present the most important contribution of this thesis: adapting DASH to 3D streaming.
First, we show how to prepare 3D data into a format that complies with the DASH data organisation and that stores enough metadata to enable a client to perform efficient streaming.
The data preparation consists in partitioning the scene into spatially coherent cells and segmenting each cell into chunks with a fixed number of faces, which are sorted by area so that faces of different levels of detail are not grouped together.
We also export each texture at different resolutions.
We encode the metadata that describes the data organisation into a 3D version of the Media Presentation Description (MPD) that DASH uses for video.
All this prepared content is then stored on a simple static HTTP server: clients can request the content without any need for computation on the server side, allowing a server to support an arbitrary number of clients.
% Namely, we store in the metadata the coordinates of the cells of the $k$-d tree, the areas of geometry chunks, and the average colors of textures.
We then propose DASH-3D clients that are viewpoint-aware: they perform frustum culling to eliminate cells outside the viewing volume of the camera (as shown in Figure~\ref{d3:big-picture}).
We define utility metrics that give a score to each chunk of data, be it geometry or texture, based on offline information given in the MPD and online information that the client is able to compute, such as view parameters, user interaction, or bandwidth measurements.
We also define streaming policies that rely on those utilities in order for the client to determine which chunks need to be downloaded.
We finally evaluate these system parameters under different bandwidth setups and compare our streaming policies.
\newpage