From 9f0163b4a09732f97a3312843b97f1934d9b938b Mon Sep 17 00:00:00 2001 From: Thomas Forgione Date: Tue, 26 Nov 2019 16:44:08 +0100 Subject: [PATCH] Working on Simon review --- src/commands.sty | 3 ++ src/config.sty | 1 + src/foreword/3d-model.tex | 25 +++++----- src/foreword/implementation.tex | 29 +++++------ src/foreword/main.tex | 3 ++ src/foreword/video-vs-3d.tex | 88 ++++++++++++++++----------------- src/introduction/challenges.tex | 28 +++++------ src/introduction/main.tex | 4 +- src/introduction/outline.tex | 4 +- src/state-of-the-art/main.tex | 3 ++ 10 files changed, 98 insertions(+), 90 deletions(-) diff --git a/src/commands.sty b/src/commands.sty index 089ff35..550332e 100644 --- a/src/commands.sty +++ b/src/commands.sty @@ -13,3 +13,6 @@ \renewcommand{\href}[2]{\rawhref{#1}{#2}\footnote{\url{#1}}} \newcommand\notsotiny{\@setfontsize\notsotiny\@viiipt\@ixpt} + +% Commands for review +\newcommand{\update}[2]{{\color{red}\sout{#1}} {\color{DarkGreen}#2}} diff --git a/src/config.sty b/src/config.sty index f63789f..77e16d8 100644 --- a/src/config.sty +++ b/src/config.sty @@ -30,6 +30,7 @@ anchorcolor = blue]{hyperref} \usepackage{numprint} \usepackage{pdfpages} \usepackage{enumitem} +\usepackage[normalem]{ulem} \setitemize{noitemsep,topsep=4pt,parsep=4pt,partopsep=0pt} \pagestyle{scrheadings} diff --git a/src/foreword/3d-model.tex b/src/foreword/3d-model.tex index 796d45f..78d3baf 100644 --- a/src/foreword/3d-model.tex +++ b/src/foreword/3d-model.tex @@ -1,4 +1,4 @@ -A 3D streaming system is a system that dynamically collects 3D data. +A 3D streaming system is a system that \update{dynamically}{progressively} collects 3D data. The previous chapter voluntarily remained vague about what \emph{3D data} actually is. This chapter presents in detail the 3D data we consider and how it is rendered. We also give insights about interaction and streaming by comparing the 3D case to the video one. @@ -6,24 +6,24 @@ We also give insights about interaction and streaming by comparing the 3D case t \section{What is a 3D model?\label{f:3d}} \subsection{3D data} -Most classical 3D models are sets of meshes and textures, that can potentially be arranged in a scene graph. +Most classical 3D models are sets of meshes and textures, which can potentially be arranged in a scene graph. Such a model can typically contain the following: \begin{itemize} - \item \textbf{vertices}, that are simply 3D points; - \item \textbf{faces}, that are polygons defined from vertices (most of the time, they are triangles); - \item \textbf{textures}, that are images that can be used to paint faces in order to add visual richness; - \item \textbf{texture coordinates}, that are information added to a face, describing how the texture should be painted over faces; - \item \textbf{normals}, that are 3D vectors that can give information about light behaviour on a face. + \item \textbf{Vertices}, which are 3D points, + \item \textbf{Faces}, which are polygons defined from vertices (most of the time, they are triangles), + \item \textbf{Textures}, which are images that can be used to paint faces in order to add visual richness, + \item \textbf{Texture coordinates}, which are information added to a face, describing how the texture should be painted over faces, + \item \textbf{Normals}, which are 3D vectors that can give information about light behaviour on a face. \end{itemize} -The Wavefront OBJ is one of the most popular format that describes all these elements in text format. 
+The Wavefront OBJ is \update{one of the most popular}{a} format that describes all these elements in text format. A 3D model encoded in the OBJ format typically consists in two files: the materials file (\texttt{.mtl}) and the object file (\texttt{.obj}). \paragraph{} The materials file declares all the materials that the object file will reference. -A material consists in name, and other photometric properties such as ambient, diffuse and specular colors, as well as texture maps. -Each face corresponds to a material and a renderer can use the material's information to render the faces. +A material consists in name, and other photometric properties such as ambient, diffuse and specular colors, as well as texture maps, \update{}{which are images that are painted on faces}. +Each face corresponds to a material \update{and a renderer can use the material's information to render the faces}{}. A simple material file is visible on Listing~\ref{i:mtl}. \paragraph{} @@ -111,10 +111,9 @@ During the rendering loop, there are two things to consider regarding performanc The way the loop works forces objects with different materials to be rendered separately. An efficient renderer keeps the number of objects in a scene low to avoid introducing overhead. - However, an important feature of 3D engines regarding performance is frustum culling. The frustum is the viewing volume of the camera. -Frustum culling consists in skipping objects that are outside the viewing volume of the camera in the rendering loop. +Frustum culling consists in skipping the objects that are outside the viewing volume of the camera in the rendering loop. Algorithm~\ref{f:frustum-culling} is a variation of Algorithm~\ref{f:renderer} with frustum culling. \begin{algorithm}[th] @@ -152,7 +151,7 @@ Algorithm~\ref{f:frustum-culling} is a variation of Algorithm~\ref{f:renderer} w \end{algorithm} A renderer that uses a single object avoids the overhead, but fails to benefit from frustum culling. -An optimized renderer needs to find a compromise between a too fine partition of the scene, which introduces overhead, and a too coarse partition which introduces useless rendering. +An optimized renderer needs to find a compromise between a too fine partition of the scene, which introduces overhead, and a too coarse partition, which introduces useless rendering. % ensures to have objects that do not spread across the whole scene, since that would lead to a useless frustum culling, and many objects to avoid rendering the whole scene at each frame. diff --git a/src/foreword/implementation.tex b/src/foreword/implementation.tex index 25facb5..d157406 100644 --- a/src/foreword/implementation.tex +++ b/src/foreword/implementation.tex @@ -1,20 +1,21 @@ \section{Implementation details} During this thesis, a lot of software has been developed, and for this software to be successful and efficient, we chose the appropriate languages. -When it comes to 3D streaming systems, there are two kind of software that we need. +When it comes to 3D streaming systems, we need two kind of software. + \begin{itemize} - \item \textbf{Interactive applications} that can run on as many devices as possible whether it be desktop or mobile in order to conduct user studies. For this context, we chose the \textbf{JavaScript language}, since it can run on many devices and it has great support for WebGL\@. - \item \textbf{Native applications} that can run fast on desktop devices, in order to run simulations and evaluate our ideas. 
For this context, we chose the \textbf{Rust} language, which is a somewhat recent language that provides both the efficiency of C and C++ and the safety of functional languages. + \item \textbf{Interactive applications} which can run on as many devices as possible so we can easily conduct user studies. For this context, we chose the \textbf{JavaScript language}.% , since it can run on many devices and it has great support for WebGL\@. + \item \textbf{Native applications} which can run fast on desktop devices, in order to run simulations and evaluate our ideas. For this context, we chose the \textbf{Rust} language.% , which is a somewhat recent language that provides both the efficiency of C and C++ and the safety of functional languages. \end{itemize} \subsection{JavaScript} \paragraph{THREE.js.} -On the web browser, the best way to perform 3D rendering is to use WebGL\@. -However, WebGL is very low level and it can be really painful to write code, even to render a simple triangle. +On the web browser, \update{the best way to perform 3D rendering is to use WebGL}{it is now possible to perform 3D rendering by using WebGL}. +However, WebGL is very low level and it can be painful to write code, even to render a simple triangle. For example, \href{https://www.tutorialspoint.com/webgl/webgl_drawing_a_triangle.htm}{this tutorial}'s code contains 121 lines of javascript, 46 being code (not comments or empty lines) to render a simple, non-textured triangle. For this reason, it seems unreasonable to build a system like the one we are describing in raw WebGL\@. -There are many libraires that wrap WebGL code and that help people building 3D interfaces, and \href{https://threejs.org}{THREE.js} is probably one of the most popular. +There are many libraires that wrap WebGL code and that help people building 3D interfaces, and \href{https://threejs.org}{THREE.js} \update{is probably one of the most popular}{is a very popular one (56617 stars on github, making it the 35th most starred repository on GitHub as of November 26th, 2019\footnote{\url{https://web.archive.org/web/20191126151645/https://gitstar-ranking.com/mrdoob/three.js}})}. THREE.js acts as a 3D engine built on WebGL\@. It provides classes to deal with everything we need: \begin{itemize} @@ -37,7 +38,7 @@ A snippet of the basic usage of these classes is given in Listing~\ref{f:three-h \paragraph{Geometries.\label{f:geometries}} Geometries are the classes that hold the vertices, texture coordinates, normals and faces. -There are two most important geometry classes in THREE.js: +\update{There are two most important geometry classes in THREE.js:}{THREE.js proposes two classes for handling geometries:} \begin{itemize} \item the \textbf{Geometry} class, which is made to be developer friendly and allows easy editing but can suffer from performance issues; \item the \textbf{BufferGeometry} class, which is harder to use for a developer, but allows better performance since the developer controls how data is transmitted to the GPU\@. @@ -46,12 +47,12 @@ There are two most important geometry classes in THREE.js: \subsection{Rust} -In this section, we explain the specificities of Rust and why it is a great language for writing efficient native software safely. +In this section, we explain the specificities of Rust and why it is an adequate language for writing efficient native software safely. \subsubsection{Borrow checker} Rust is a system programming language focused on safety. 
-It is made to be efficient (and effectively has performances comparable to C or C++) but with some extra features. +It is made to be efficient (and effectively has performances comparable to C\footnote{\url{https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust.html}} or C++\footnote{\url{https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust-gpp.html}}) but with some extra features. C++ users might see it as a language like C++ but that forbids undefined behaviours.\footnote{in Rust, when you need to execute code that might lead to undefined behaviours, you need to put it inside an \texttt{unsafe} block. Many operations will not be available outside an \texttt{unsafe} block (e.g., dereferencing a pointer, or mutating a static variable). The idea is that you can use \texttt{unsafe} blocks when you require it, but you should avoid it as much as possible and when you do it, you must be particularly careful.} The most powerful concept from Rust is \emph{ownership}. Basically, every value has a variable that we call its \emph{owner}. @@ -78,7 +79,7 @@ Consider the piece of C++ code in Listings~\ref{f:undefined-behaviour-cpp} and~\ \end{minipage} \end{figure} -Of course, this loop should go endlessly because the vector grows in size as we add elements in the loop. +This loop should go endlessly because the vector grows in size as we add elements in the loop. But the most important thing here is that since we add elements to the vector, it will eventually need to be reallocated, and that reallocation will invalidate the iterator, meaning that the following iterator will provoke an undefined behaviour. The equivalent code in Rust is in Listings~\ref{f:undefined-behaviour-rs} and~\ref{f:undefined-behaviour-rs-it}. @@ -115,13 +116,13 @@ And effectively, the borrow checker will crash the compiler with the error in Li This example is one of the many examples of how powerful the borrow checker is: in Rust code, there can be no dangling reference, and all the segmentation faults coming from them are detected by the compiler. The borrow checker may seem like an enemy to newcomers because it often rejects code that seem correct, but once one gets used to it, they understand what is the problem with their code and either fix the problem easily, or realise that the whole architecture is wrong and understand why. -It is probably for those reasons that Rust is the \emph{most loved programming language} according to the Stack Overflow Developer Survey in~\citeyear{so-survey-2016,so-survey-2017,so-survey-2018} and~\citeyear{so-survey-2019}. +It is probably for those reasons that Rust is the \emph{most loved programming language} according to the Stack Overflow Developer Survey in~\citeyear{so-survey-2016}, \citeyear{so-survey-2017}, \citeyear{so-survey-2018} and~\citeyear{so-survey-2019}. \subsubsection{Tooling} -Even better, Rust comes with great tooling. +\update{Even better}{Moreover}, Rust comes with \update{great tooling}{many programs that help developers}. \begin{itemize} - \item \href{https://github.com/rust-lang/rust}{\textbf{\texttt{rustc}}} is the Rust compiler. It is very comfortable due to the nice error messages it displays. + \item \href{https://github.com/rust-lang/rust}{\textbf{\texttt{rustc}}} is the Rust compiler. It is comfortable due to \update{the nice error messages it displays}{the clarity and precise explanations of its error messages}. 
\item \href{https://github.com/rust-lang/cargo}{\textbf{\texttt{cargo}}} is the official Rust's project and package manager. It manages compilation, dependencies, documentation, tests, etc. \item \href{https://github.com/racer-rust/racer}{\textbf{\texttt{racer}}}, \href{https://github.com/rust-lang/rls}{\textbf{\texttt{rls}} (Rust Language Server)} and \href{https://github.com/rust-analyzer/rust-analyzer}{\textbf{\texttt{rust-analyzer}}} are software that manage automatic compilation to display errors in code editors as well as providing semantic code completion. \item \href{https://github.com/rust-lang/rustfmt}{\textbf{\texttt{rustfmt}}} auto formats code. @@ -145,6 +146,6 @@ Its objectives are: In our work, many tasks will consist in 3D content analysis, reorganising and 3D rendering evaluation. Many of these tasks require long computations, lasting from hours to entire days. -To perform them, we need a programming language that has good performances (Rust has comparable performance with C or C++). +To perform them, we need a programming language that has good performances. In addition, the extra features that Rust provides ease tremendously development, and this is why we use Rust for all tasks that do not require having a web interface. diff --git a/src/foreword/main.tex b/src/foreword/main.tex index 0e21f71..f2ad170 100644 --- a/src/foreword/main.tex +++ b/src/foreword/main.tex @@ -1,5 +1,8 @@ \chapter{Foreword\label{f}} +\minitoc{} +\newpage + \input{foreword/3d-model} \input{foreword/video-vs-3d} \input{foreword/implementation} diff --git a/src/foreword/video-vs-3d.tex b/src/foreword/video-vs-3d.tex index a947178..b2ceea0 100644 --- a/src/foreword/video-vs-3d.tex +++ b/src/foreword/video-vs-3d.tex @@ -1,6 +1,6 @@ \section{Similarities and differences between video and 3D\label{i:video-vs-3d}} -Contrary to what one might think, the video streaming setting and the 3D streaming setting share many similarities: at a higher level of abstraction, both systems allow a user to access remote content without having to wait until everything is loaded. +The video streaming setting and the 3D streaming setting share many similarities: at a higher level of abstraction, both systems allow a user to access remote content without having to wait until everything is loaded. Analysing similarities and differences between the video and the 3D scenarios as well as having knowledge about video streaming literature are the key to developing an efficient 3D streaming system. \subsection{Chunks of data} @@ -16,36 +16,36 @@ One of the main differences between video and 3D streaming is the persistence of In video streaming, only one second of video is required at a time. Of course, most video streaming services prefetch some future chunks, and keep in cache some previous ones, but a minimal system could work without latency and keep in memory only two chunks: the current one and the next one. -In 3D streaming, each chunk is part of a scene, and already a few problems appear here: +\update{In 3D streaming, each chunk is part of a scene, and already a few problems appear here:}{Already a few problems appear here regarding 3D streaming:} \begin{itemize} \item depending on the user's field of view, many chunks may be required to perform a single rendering; \item chunks do not become obsolete the way they do in video, a user navigating in a 3D scene may come back to a same spot after some time, or see the same objects but from elsewhere in the scene. 
\end{itemize} -\subsection{Multi-resolution} +\subsection{Multiple representations} All major video streaming platforms support multi-resolution streaming. -This means that a client can choose the resolution at which it requests the content. -It can be chosen directly by the user or automatically determined by analysing the available resources (size of the screen, downloading bandwidth, device performances, etc.) +This means that a client can choose the \update{resolution}{quality} at which it requests the content. +It can be chosen directly by the user or automatically determined by analysing the available resources (size of the screen, downloading bandwidth, device performances) \begin{figure}[th] \centering \includegraphics[width=\textwidth]{assets/introduction/youtube-multiresolution.png} - \caption{The different resolutions available for a Youtube video} + \caption{The different \update{resolutions}{qualities} available for a Youtube video} \end{figure} -Similarly, recent work in 3D streaming have proposed different ways to progressively stream 3D models, displaying a low resolution to the user without latency, and supporting interaction with the model while details are being downloaded. +Similarly, recent work in 3D streaming have proposed different ways to progressively stream 3D models, displaying a low \update{resolution}{quality version of the model} to the user without latency, and supporting interaction with the model while details are being downloaded. Such strategies are reviewed in Section~\ref{sote:3d-streaming}. \subsection{Media types} Just like a video, a 3D scene is composed of different types of media. -In video, those media are mostly images, sounds, and eventually subtitles, whereas in 3D, those media are geometry or textures. +In video, those media are mostly images, sounds, and subtitles, whereas in 3D, those media are geometry or textures. In both cases, an algorithm for content streaming has to acknowledge those different media types and manage them correctly. In video streaming, most of the data (in terms of bytes) is used for images. Thus, the most important thing a video streaming system should do is to optimise images streaming. -That is why, on a video on Youtube for example, there may be 6 resolutions for images (144p, 240p, 320p, 480p, 720p and 1080p) but only 2 resolutions for sound. +That is why, on a video on Youtube for example, there may be 6 \update{resolutions}{available qualities} for images (144p, 240p, 320p, 480p, 720p and 1080p) but only 2 \update{resolutions}{qualities} for sound. This is one of the main differences between video and 3D streaming: in a 3D scene, geometry and texture sizes are approximately the same, and leveraging between those two types of content is a key problem. \subsection{Interaction} @@ -53,40 +53,38 @@ This is one of the main differences between video and 3D streaming: in a 3D scen The ways of interacting with content is probably the most important difference between video and 3D. In a video interface, there is only one degree of freedom: time. The only things a user can do is letting the video play, pausing, resuming, or jumping to another time in the video. -Even though these interactions seem easy to handle, giving the best possible experience to the user is already challenging. For example, to perform these few actions, Youtube provides the user with multiple options. 
- -\begin{itemize} - - \item To pause or resume a video, the user can: - \begin{itemize} - \item click the video; - \item press the \texttt{K} key; - \item press the space key. - \end{itemize} - - \item To navigate to another time in the video, the user can: - \begin{itemize} - \item click the timeline of the video where they want; - \item press the left arrow key to move 5 seconds backwards; - \item press the right arrow key to move 5 seconds forwards; - \item press the \texttt{J} key to move 10 seconds backwards; - \item press the \texttt{L} key to move 10 seconds forwards; - \item press one of the number key (on the first row of the keyboard, below the function keys, or on the numpad) to move the corresponding tenth of the video; - \item press the home key to go the beginning of the video, or the end key to go to the end. - \end{itemize} - -\end{itemize} - -There are also controls for other options that are described \href{https://web.archive.org/web/20191014131350/https://support.google.com/youtube/answer/7631406?hl=en}{on this help page}, for example: -\begin{itemize} - \item up and down arrows change the sound volume; - \item \texttt{M} mutes the sound; - \item \texttt{C} activates the subtitles; - \item \texttt{F} puts the player in fullscreen mode; - \item \texttt{T} activates the theater mode (where the video occupies the total width of the screen, instead of occupying two thirds of the screen, the last third being advertising or recommendations); - \item \texttt{I} activates the mini-player (allowing to search for other videos while keeping the current video playing in the bottom right corner). -\end{itemize} -All the interactions are summed up in Figure~\ref{i:youtube-keyboard}. +There are also controls for other options that are described \href{https://web.archive.org/web/20191014131350/https://support.google.com/youtube/answer/7631406?hl=en}{on this help page}. +% For example, to perform these few actions, Youtube provides the user with multiple options. +% \begin{itemize} +% +% \item To pause or resume a video, the user can: +% \begin{itemize} +% \item click the video; +% \item press the \texttt{K} key; +% \item press the space key. +% \end{itemize} +% +% \item To navigate to another time in the video, the user can: +% \begin{itemize} +% \item click the timeline of the video where they want; +% \item press the left arrow key to move 5 seconds backwards; +% \item press the right arrow key to move 5 seconds forwards; +% \item press the \texttt{J} key to move 10 seconds backwards; +% \item press the \texttt{L} key to move 10 seconds forwards; +% \item press one of the number key (on the first row of the keyboard, below the function keys, or on the numpad) to move the corresponding tenth of the video; +% \item press the home key to go the beginning of the video, or the end key to go to the end. +% \end{itemize} +% +% \end{itemize} +% \begin{itemize} +% \item up and down arrows change the sound volume; +% \item \texttt{M} mutes the sound; +% \item \texttt{C} activates the subtitles; +% \item \texttt{F} puts the player in fullscreen mode; +% \item \texttt{T} activates the theater mode (where the video occupies the total width of the screen, instead of occupying two thirds of the screen, the last third being advertising or recommendations); +% \item \texttt{I} activates the mini-player (allowing to search for other videos while keeping the current video playing in the bottom right corner). +% \end{itemize} +All the keyboard shortcuts are summed up in Figure~\ref{i:youtube-keyboard}. 
\newcommand{\relativeseekcontrol}{LightBlue} \newcommand{\absoluteseekcontrol}{LemonChiffon} @@ -276,7 +274,7 @@ All the interactions are summed up in Figure~\ref{i:youtube-keyboard}. \node[right=0.3cm] at (12.5, 0.5) {\small Play or pause keys}; \keystrokebg{18}{19}{0}{1}{}{\othercontrol}; - \node[right=0.3cm] at (18.5, 0.5) {\small Other keys}; + \node[right=0.3cm] at (18.5, 0.5) {\small Other shortcuts}; \end{tikzpicture} @@ -300,7 +298,7 @@ Some interfaces mimic the video scenario, where the only variable is the time an These interfaces are not interactive, and can be frustrating to the user who might feel constrained. Some other interfaces add 2 degrees of freedom to the timeline: the user does not control the position of the camera but can control the angle. This mimics the scenario of the 360 video. -This is typically the case of the video game \emph{nolimits 2: roller coaster simulator} which works with VR devices (oculus rift, HTC vive, etc.) where the only interaction the user has is turning their head. +This is typically the case of the video game \href{http://nolimitscoaster.com/}{\emph{nolimits 2: roller coaster simulator}} which works with VR devices (oculus rift, HTC vive, etc.) where the only interaction the user has is turning their head. Finally, most of the other interfaces give at least 5 degrees of freedom to the user: 3 being the coordinates of the position of the camera, and 2 being the angle (assuming the up vector is unchangeable, some interfaces might allow that, giving a sixth degree of freedom). The most common controls are the trackball controls where the user rotate the object like a ball \href{https://threejs.org/examples/?q=controls\#misc_controls_trackball}{(live example here)} and the orbit controls, which behave like the trackball controls but preserving the up vector \href{https://threejs.org/examples/?q=controls\#misc_controls_orbit}{(live example here)}. diff --git a/src/introduction/challenges.tex b/src/introduction/challenges.tex index 9c282c2..ca824a7 100644 --- a/src/introduction/challenges.tex +++ b/src/introduction/challenges.tex @@ -1,39 +1,39 @@ \section{Open problems\label{i:challenges}} -The objective of our work is to design a system that allows a user to access remote 3D content and that guarantees both good quality of service and good quality of experience. +The objective of our work is to design a system which allows a user to access remote 3D content \update{and that guarantees both good quality of service and good quality of experience}{}. A 3D streaming client has lots of tasks to accomplish: \begin{itemize} - \item decide what part of the model to download next; - \item download the next part; - \item parse the downloaded content; - \item add the parsed result to the scene; - \item render the scene; - \item manage the interaction with the user. + \item Decide what part of the \update{model}{content} to download next, + \item Download the next part, + \item Parse the downloaded content, + \item Add the parsed result to the scene, + \item Render the scene, + \item Manage the interaction with the user. \end{itemize} -This opens multiple problems that need to be considered and will be studied in this thesis. +This opens multiple problems which need to be considered and will be studied in this thesis. \paragraph{Content preparation.} % Any preprocessing that can be done on our 3D data gives us a strategical advantage since it consists in computations that will not be needed live, neither for the server nor for the client. 
% Furthermore, for streaming, data needs to be split into chunks that are requested separately, so perparing those chunks in advance can also help the streaming. Before streaming content, it needs to be prepared. The segmentation of the content into chunks is particularly important for streaming since it allows transmitting only a portion of the data to the client. -A partial model consisting in the downloaded content, it can be rendered while downloading more chunks. +\update{A partial model consisting in the downloaded content, it}{The downloaded chunks} can be rendered while \update{downloading more chunks}{more chunks are being downloaded}. Content preparation also includes compression. One of the questions this thesis has to answer is: \emph{what is the best way to prepare 3D content so that a streaming client can progressively download and render the 3D model?} \paragraph{Streaming policies.} Once our content is prepared and split in chunks, a client needs to determine which chunks should be downloaded first. -A chunk that contains data in the field of view of the user is more relevant than a chunk that is not inside; a chunk that is close to the camera is more relevant than a chunk far away from the camera, etc. -This should also include other contextual parameters, such as the size of a chunk, the bandwidth, the user's behaviour, etc. -The most important questions we have to answer are: \emph{how to estimate a chunk utility, and how to determine which chunks need to be downloaded depending the user's interactions?} +A chunk that contains data in the field of view of the user is more relevant than a chunk that is not inside; a chunk that is close to the camera is more relevant than a chunk far away from the camera. +This should also include other contextual parameters, such as the size of a chunk, the bandwidth and the user's behaviour. +\update{The most important questions we have to answer are:}{In order to propose efficient streaming policies, we need to know} \emph{how to estimate a chunk utility, and how to determine which chunks need to be downloaded depending the user's interactions?} \paragraph{Evaluation.} -In such systems, the two most important criteria for evaluation are quality of service, and quality of experience. +In such systems, \update{the}{} two \update{most important}{commonly used} criteria for evaluation are quality of service, and quality of experience. The quality of service is a network-centric metric, which considers values such as throughput and measures how well the content is served to the client. The quality of experience is a user-centric metric: it relies on user perception and can only be measured by asking how users feel about a system. -To be able to know which streaming policies are best, one needs to know \emph{how to compare streaming policies and evaluate the impact of their parameters in terms of quality of service and quality of experience?} +To be able to know which streaming policies are best, one needs to know \emph{how to compare streaming policies and evaluate the impact of their parameters \update{in terms of}{on the} quality of service \update{}{of the streaming system} and \update{}{on the} quality of experience \update{}{of the final user}?} \paragraph{Implementation.} The objective of our work is to setup a client-server architecture that answers the above problems: content preparation, chunk utility, streaming policies. 
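To make the notion of chunk utility more concrete, the short Rust sketch below illustrates one possible shape for such a download policy. It is only an illustration of the questions raised above, not the policy used in this thesis: the names (Chunk, Camera, utility, next_chunk) are hypothetical, a chunk's bounding volume is reduced to a single representative point, and the visibility test is a crude stand-in for real frustum culling. The actual utility metrics and streaming policies are defined and evaluated in Chapters~\ref{d3} and~\ref{sb}.

    // Hypothetical sketch of a utility-based streaming policy; not the thesis's
    // actual DASH-3D implementation.
    #[derive(Clone, Copy)]
    struct Vec3 { x: f64, y: f64, z: f64 }

    impl Vec3 {
        fn sub(self, o: Vec3) -> Vec3 { Vec3 { x: self.x - o.x, y: self.y - o.y, z: self.z - o.z } }
        fn dot(self, o: Vec3) -> f64 { self.x * o.x + self.y * o.y + self.z * o.z }
        fn norm(self) -> f64 { self.dot(self).sqrt() }
    }

    struct Camera {
        position: Vec3,
        direction: Vec3, // assumed to be normalised
    }

    struct Chunk {
        center: Vec3,       // representative point of the chunk's bounding volume
        offline_score: f64, // e.g. amount of geometric detail, precomputed offline
        size_bytes: u64,    // download cost of the chunk
        downloaded: bool,
    }

    /// Scores a chunk for the current viewpoint: chunks behind the camera get a
    /// zero score, visible chunks get a score that decreases with distance.
    fn utility(chunk: &Chunk, camera: &Camera) -> f64 {
        let to_chunk = chunk.center.sub(camera.position);
        let distance = to_chunk.norm();
        if distance == 0.0 {
            return chunk.offline_score;
        }
        // Rough visibility test: keep only chunks roughly in front of the camera.
        let cosine = to_chunk.dot(camera.direction) / distance;
        if cosine <= 0.0 {
            return 0.0;
        }
        chunk.offline_score * cosine / (1.0 + distance)
    }

    /// Greedy policy: among the chunks that are not downloaded yet, pick the one
    /// with the best utility per byte, i.e. the best value for bandwidth.
    fn next_chunk<'a>(chunks: &'a [Chunk], camera: &Camera) -> Option<&'a Chunk> {
        chunks
            .iter()
            .filter(|chunk| !chunk.downloaded)
            .max_by(|a, b| {
                let score_a = utility(a, camera) / a.size_bytes as f64;
                let score_b = utility(b, camera) / b.size_bytes as f64;
                score_a.partial_cmp(&score_b).unwrap_or(std::cmp::Ordering::Equal)
            })
    }

    fn main() {
        let camera = Camera {
            position: Vec3 { x: 0.0, y: 0.0, z: 0.0 },
            direction: Vec3 { x: 0.0, y: 0.0, z: 1.0 },
        };
        let chunks = vec![
            Chunk { center: Vec3 { x: 0.0, y: 0.0, z: 5.0 }, offline_score: 1.0, size_bytes: 1_000, downloaded: false },
            Chunk { center: Vec3 { x: 0.0, y: 0.0, z: -5.0 }, offline_score: 1.0, size_bytes: 1_000, downloaded: false },
            Chunk { center: Vec3 { x: 1.0, y: 0.0, z: 2.0 }, offline_score: 0.5, size_bytes: 4_000, downloaded: false },
        ];
        if let Some(chunk) = next_chunk(&chunks, &camera) {
            println!("next chunk to download: {} bytes, utility {:.3}", chunk.size_bytes, utility(chunk, &camera));
        }
    }

The greedy criterion (utility divided by size in bytes) is one common way to trade expected visual benefit against download cost; a real client would also take the available bandwidth into account and re-rank the remaining chunks whenever the viewpoint changes.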
diff --git a/src/introduction/main.tex b/src/introduction/main.tex
index 1a99045..809ef0e 100644
--- a/src/introduction/main.tex
+++ b/src/introduction/main.tex
@@ -1,8 +1,8 @@
\chapter{Introduction\label{i}}

During the last few years, 3D acquisition and modeling techniques have made tremendous progress.
-Recent software use 2D images from cameras to reconstruct 3D data, e.g. \href{https://alicevision.org/\#meshroom}{Meshroom} is free and open source software that got almost \numprint{200000} downloads on \href{https://www.fosshub.com/Meshroom.html}{fosshub}, that use \emph{structure-from-motion} and \emph{multi-view-stereo} to infer a 3D model.
-There are more and more devices that are specifically built to harvest 3D data: some still very expensive and provide precise information such as LIDAR (Light Detection And Ranging, as in RADAR but with light instead of radio waves), while some cheaper devices can obtain coarse data such as the Kinect.
+Recent software uses 2D images from cameras to reconstruct 3D data: for example, \href{https://alicevision.org/\#meshroom}{Meshroom}, a free and open source application which got almost \numprint{200000} downloads on \href{https://www.fosshub.com/Meshroom.html}{fosshub}, uses \emph{structure-from-motion} and \emph{multi-view-stereo} to infer a 3D model.
+More and more devices are specifically built to harvest 3D data: \update{some still very expensive and provide precise information such as LIDAR (Light Detection And Ranging, as in RADAR but with light instead of radio waves), while some cheaper devices can obtain coarse data such as the Kinect.}{for example, LIDAR (Light Detection And Ranging) can compute 3D distances by measuring the time of flight of light. The recent research interest in autonomous vehicles has allowed more companies to develop cheaper LIDARs, which increases the potential for new 3D content creation.}
Thanks to these techniques, more and more 3D data becomes available.
These models have potential for multiple purposes, for example, they can be printed, which can reduce the production cost of some pieces of hardware or enable the creation of new objects, but most uses are based on visualisation.
For example, they can be used for augmented reality, to provide users with feedback that can be useful to help workers with complex tasks, but also for fashion (for example, \emph{Fittingbox} is a company that develops software to virtually try glasses, as in Figure~\ref{i:fittingbox}).
diff --git a/src/introduction/outline.tex b/src/introduction/outline.tex
index e79e4bb..f4da859 100644
--- a/src/introduction/outline.tex
+++ b/src/introduction/outline.tex
@@ -10,7 +10,7 @@ The last section of this chapter focuses on 3D interaction.

Then, in Chapter~\ref{bi}, we present our first contribution: an in-depth analysis of the impact of the UI on navigation and streaming in a 3D scene.
We first develop a basic interface for navigating in 3D and then, we introduce 3D objects called \emph{bookmarks} that help users navigate in the scene.
-We then present a user study that we conducted on 50 people that shows that bookmarks ease user navigation: they improve performance at tasks such as finding objects.
+We then present a user study, conducted on 50 people, which shows that bookmarks ease user navigation: they improve performance at tasks such as finding objects.
% Then, we setup a basic 3D streaming system that allows us to replay the traces collected during the user study and simulate 3D streaming at the same time.
We analyse how the presence of bookmarks impacts the streaming: we propose and evaluate streaming policies, based on pre-computations relying on bookmarks, that measurably increase the quality of experience.

@@ -18,7 +18,7 @@ In Chapter~\ref{d3}, we present the most important contribution of this thesis:
DASH-3D is an adaptation of DASH (Dynamic Adaptive Streaming over HTTP), the video streaming standard, to 3D streaming.
We first describe how we adapt the concepts of DASH to 3D content, including the segmentation of content.
We then define utility metrics that associate a score to each chunk depending on the user's position.
-Then, we present a client and various streaming policies based on our utilities that can benefit from DASH format.
+Then, we present a client and various streaming policies, based on our utilities, which can benefit from the DASH format.
We finally evaluate the different parameters of our client.

In Chapter~\ref{sb}, we present our last contribution: the integration of the interaction ideas that we developed in Chapter~\ref{bi} into DASH-3D.
diff --git a/src/state-of-the-art/main.tex b/src/state-of-the-art/main.tex
index 186ebbe..a0050dd 100644
--- a/src/state-of-the-art/main.tex
+++ b/src/state-of-the-art/main.tex
@@ -1,5 +1,8 @@
\chapter{Related work\label{sote}}

+\minitoc{}
+\newpage
+
\input{state-of-the-art/intro}
\input{state-of-the-art/video}
\input{state-of-the-art/3d-streaming}