Update

2019-10-01 17:34:05 +02:00
parent 9536bf1c25
commit 947a32d970
8 changed files with 68 additions and 61 deletions
@@ -22,20 +22,19 @@ Before streaming content, it needs to be prepared.
 This includes but is not limited to compression and segmentation.
 One of the questions this thesis has to answer is \emph{what is the best way to prepare 3D content so that a client can benefit from it?}

-\subsection{Chunk utility}
-Once our content is prepared and split in chunks, we need to be able to rate those chunks depending on the user's position.
-A chunk that contains data in the field of view of the user should have a higher score than a chunk outside of it; a chunk that is close to the camera should have a higher score than a chunk far away from the camera, etc\ldots.
 An open question of this thesis is \emph{how do we determine how useful a chunk of data is, depending on the user's position?}

 \subsection{Streaming policies}
-Rating the chunks is not enough, there are other contextual parameters that need to be taken into account, such as the size of a chunk, the bandwidth, the user's behaviour, etc\ldots.
-Another question that raises from this is \emph{how do we take into the context into account to decide which chunks to download?}
+Once our content is prepared and split into chunks, one needs to be able to rate those chunks depending on the user's position.
+A chunk that contains data in the field of view of the user should have a higher score than a chunk outside of it; a chunk that is close to the camera should have a higher score than a chunk far away from the camera, etc\ldots.
+This rating should also include other contextual parameters, such as the size of a chunk, the bandwidth, the user's behaviour, etc\ldots.
+The most important question we have to answer is \emph{how do we determine which chunks need to be downloaded depending on the chunks themselves and the user's interactions?}

 \subsection{Evaluation}
 In such systems, the two most important criteria for evaluation are quality of service and quality of experience.
 The quality of service is a network-centric metric, which considers values such as throughput.
 The quality of experience is a user-centric metric, and can only be measured by asking how users feel about a system.
-To be able to know which streaming policies, we need to know \emph{how can we compare streaming policies and evalute the impact of their parameters in terms of quality of service and quality of experience?}
+To know which streaming policies perform best, one needs to know \emph{how can we compare streaming policies and evaluate the impact of their parameters in terms of quality of service and quality of experience?}

 \subsection{Implementation}
 The objective of our work is to set up a client-server architecture that addresses the problems mentioned earlier (content preparation, chunk utility, streaming policies).
@@ -1,19 +1,32 @@
 \chapter{Introduction\label{i}}

-\copied{}
+\fresh{}

-With the progress in data acquisition and modeling techniques, networked virtual environments, or NVE, are increasing in scale.
-For instance,~\cite{urban-data-visualisation} reported that the 3D scene for the city of Lyon takes more than 30 GB of data.
-It has become impractical to download the whole 3D scene before the user begins to navigate in the scene.
-A more common approach is to stream the required 3D content (models and textures) on demand, as the user moves around the scene.
-Downloading the required 3D content the moment the user demands it, however, leads to ``popping effect'' where 3D objects materialize suddenly in the view of the user, due to the latency between requesting for and receiving the 3D content from the server~\cite{visibility-determination}.
-Such latency can be quite high --- Varvello et al.\ reported a median of about 30 seconds for all 3D data in an avatar's surrounding to be loaded in high density Second Life regions under their experimental network conditions, due to a bottleneck at the server~\cite{second-life}.
+In recent years, 3D acquisition and modeling techniques have progressed considerably.
+Recent software such as \href{https://alicevision.org/#meshroom}{Meshroom} uses \emph{structure from motion} and \emph{multi-view stereo} to infer a 3D model from a set of photographs.
+More and more devices are specifically built to obtain 3D data: some, such as Lidar, are more expensive and provide very precise information, while cheaper devices, such as the Kinect, obtain coarser data.
+Thanks to these techniques, more and more 3D data is becoming available.
+These models have potential for multiple purposes: for example, they can be 3D printed, which can reduce the production cost of some pieces of hardware or enable the creation of new objects, but most uses consist in visualisation.
+For example, they can be used for augmented reality, to provide users with feedback that helps workers perform complex tasks, but also for fashion (for example, \emph{Fitting Box} is a company that develops software to virtually try on glasses).
+3D acquisition and visualisation are also useful to preserve cultural heritage (software such as Google Heritage or 3DHop are examples), or to allow users to navigate in a city (as in Google Earth or Google Maps in 3D).
+\href{https://sketchfab.com}{Sketchfab} is an example of a website allowing users to share their 3D models and visualise models from other users.
+In most 3D visualisation systems, the 3D data needs to be transmitted to a terminal before the user can visualise it.
+The improvements in the acquisition setups described above increase the quality of the 3D models, and their size in bytes as well.
+Waiting until the 3D content is fully downloaded before letting the user visualise it is no longer a satisfactory solution: the content needs to be streamed.
+In this thesis, we are especially interested in the navigation and in the streaming of large 3D scenes, such as districts or whole cities.

-For a smoother user experience, NVE typically prefetch 3D content, so that a 3D object is readily available for rendering when the object falls into the view of the user.
-Efficient prefetching, however, requires the client or the server to predict where the user would navigate to in the future and retrieve the corresponding 3D content before the user reaches there.
-In a typical scenario, users navigate along a continuous path in a NVE, leading to a significant overlap between the 3D content visible from the user's known current position and possible next positions (i.e., \textit{spatial data locality}).
-Furthermore, there is a significant overlap between the 3D content visible from the current point in time to the next point in time (i.e., \textit{temporal data locality}).
-Both forms of locality lead to content overlaps, thus making a correct prediction easier and a wrong prediction less costly. 3D content overlaps are particularly common in a NVE with open space, such as a 3D archaeological site or a 3D city.
+% With the progress in data acquisition and modeling techniques, networked virtual environments, or NVE, are increasing in scale.
+% For instance,~\cite{urban-data-visualisation} reported that the 3D scene for the city of Lyon takes more than 30 GB of data.
+% It has become impractical to download the whole 3D scene before the user begins to navigate in the scene.
+% A more common approach is to stream the required 3D content (models and textures) on demand, as the user moves around the scene.
+% Downloading the required 3D content the moment the user demands it, however, leads to ``popping effect'' where 3D objects materialize suddenly in the view of the user, due to the latency between requesting for and receiving the 3D content from the server~\cite{visibility-determination}.
+% Such latency can be quite high --- Varvello et al.\ reported a median of about 30 seconds for all 3D data in an avatar's surrounding to be loaded in high density Second Life regions under their experimental network conditions, due to a bottleneck at the server~\cite{second-life}.
+%
+% For a smoother user experience, NVE typically prefetch 3D content, so that a 3D object is readily available for rendering when the object falls into the view of the user.
+% Efficient prefetching, however, requires the client or the server to predict where the user would navigate to in the future and retrieve the corresponding 3D content before the user reaches there.
+% In a typical scenario, users navigate along a continuous path in a NVE, leading to a significant overlap between the 3D content visible from the user's known current position and possible next positions (i.e., \textit{spatial data locality}).
+% Furthermore, there is a significant overlap between the 3D content visible from the current point in time to the next point in time (i.e., \textit{temporal data locality}).
+% Both forms of locality lead to content overlaps, thus making a correct prediction easier and a wrong prediction less costly. 3D content overlaps are particularly common in a NVE with open space, such as a 3D archaeological site or a 3D city.

 \resetstyle{}

@@ -3,7 +3,7 @@
 First, in Chapter~\ref{f}, we give some preliminary information required to understand the types of objects we are manipulating in this thesis.
 We then proceed to compare 3D and video content: surprisingly, video and 3D share many problems, and analysing them gives inspiration for building a 3D streaming system.

-In Chapter~\ref{sote}, we present a review of the state of the art on the fields that we are interesting in.
+In Chapter~\ref{sote}, we present a review of the state of the art in multimedia interaction and streaming.
 This chapter starts with an analysis of the video streaming standards.
 Then it reviews the different manners of performing 3D streaming.
 The last section of this chapter focuses on 3D interaction.
@@ -12,12 +12,12 @@ Then, in Chapter~\ref{bi}, we present our first contribution: an in-depth analys
 We first develop a basic interface for navigating in 3D and we introduce 3D objects called \emph{bookmarks} that help users navigate in the scene.
 We then present a user study conducted with 50 participants, which shows that bookmarks have a great impact on how easy it is for a user to perform tasks such as finding objects.
 % Then, we setup a basic 3D streaming system that allows us to replay the traces collected during the user study and simulate 3D streaming at the same time.
-Finally, we analyse how the presence of bookmarks impacts the streaming, and we propose and evaluate a few streaming policies that rely on precomputations that can be made thanks to bookmarks and that can increase the quality of experience.
+We analyse how the presence of bookmarks impacts the streaming, and we propose and evaluate a few streaming policies that rely on precomputations that can be made thanks to bookmarks and that can increase the quality of experience.

 In Chapter~\ref{d3}, we present the most important contribution of this thesis: DASH-3D.
 DASH-3D is an adaptation of the video streaming standard to 3D streaming.
 We first describe how we adapt the concepts of DASH to 3D content, including the segmentation of content.
-We then define utilty metrics that associates score to each chunk depending on the camera's position.
+We then define utility metrics that associate a score with each chunk depending on the camera's position.
 Then, we present a client and various streaming policies based on our utilities that can benefit from the DASH format.
 We finally evaluate the different parameters of our client.