From beee676839cf75b806f4c40befdaa484e890976f Mon Sep 17 00:00:00 2001 From: Thomas Forgione Date: Thu, 10 Oct 2019 17:20:29 +0200 Subject: [PATCH] Updates --- src/foreword/3d-model.tex | 8 ++++---- src/foreword/implementation.tex | 10 +++++----- src/foreword/video-vs-3d.tex | 8 ++++---- src/introduction/challenges.tex | 6 +++--- src/introduction/main.tex | 6 +++--- src/introduction/outline.tex | 4 ++-- src/system-bookmarks/main.tex | 15 +++++++++++++++ 7 files changed, 36 insertions(+), 21 deletions(-) diff --git a/src/foreword/3d-model.tex b/src/foreword/3d-model.tex index e54a09f..340d65e 100644 --- a/src/foreword/3d-model.tex +++ b/src/foreword/3d-model.tex @@ -1,6 +1,6 @@ A 3D streaming system is a system that collects 3D data and dynamically renders it. The previous chapter voluntarily remained vague about what \emph{3D data} actually is. -This chapter presents in detail what 3D data is and how it is reenderer, and give insights about interaction and streaming by comparing the 3D case to the video one. +This chapter presents in detail what 3D data is and how it is rendered, and gives insights about interaction and streaming by comparing the 3D case to the video one. \section{What is a 3D model?} @@ -10,8 +10,8 @@ A 3D model consists in a set of data. \begin{itemize} \item \textbf{Vertices} are simply 3D points; \item \textbf{Faces} are polygons defined from vertices (most of the time, they are triangles); - \item \textbf{Textures} are images that can be use to paint faces to add visual richness; - \item \textbf{Texture coordinates} are information added to a face to describe how the texture should be applied on a face; + \item \textbf{Textures} are images that can be used for painting faces, to add visual richness; + \item \textbf{Texture coordinates} are information added to a face, describing how the texture should be applied on a face; \item \textbf{Normals} are 3D vectors that can give information about light behaviour on a face.
\end{itemize} @@ -19,7 +19,7 @@ The Wavefront OBJ is one of the most popular format and describes all these elem A 3D model encoded in the OBJ format typically consists in two files: the materials file (\texttt{.mtl}) and the object file (\texttt{.obj}). \paragraph{} -The materials file declare all the materials that the object file will reference. +The materials file declares all the materials that the object file will reference. A material consists in name, and other photometric properties such as ambient, diffuse and specular colors, as well as texture maps. Each face correspond to a material and a renderer can use the material's information to render the faces in a specific way. A simple material file is visible on Listing~\ref{i:mtl}. diff --git a/src/foreword/implementation.tex b/src/foreword/implementation.tex index 1ae9b31..52a5efe 100644 --- a/src/foreword/implementation.tex +++ b/src/foreword/implementation.tex @@ -1,7 +1,7 @@ \fresh{} \section{Implementation details} -During this thesis, a lot of software has be written, and for this software to be successful and efficient, we took care of choosing the right languages. +During this thesis, a lot of software has been written, and for this software to be successful and efficient, we took care of choosing the right languages. When it comes to 3D streaming systems, there are two kind of software that we need. \begin{itemize} \item \textbf{Interactive applications} that can run on as many devices as possible whether it be desktop or mobile in order to try and to conduct user studies. For this context, we chose the \textbf{JavaScript language}, since it can run on many devices and it has great support for WebGL\@. @@ -21,7 +21,7 @@ THREE.js acts as a 3D engine built on WebGL\@. 
It provides classes to deal with everything we need: \begin{itemize} \item the \textbf{Renderer} class contains all the WebGL code needed to render a scene on the web page; - \item the \textbf{Object} class contain all the boilerplate needed to manage the tree structure of the content, it contains a transform and it can have children that are other objects; + \item the \textbf{Object} class contains all the boilerplate needed to manage the tree structure of the content, it contains a transform and it can have children that are other objects; \item the \textbf{Scene} class is the root object, it contains all of the objects we want to render and it is passed as argument to the render function; \item the \textbf{Geometry} and \textbf{BufferGeometry} classes are the classes that hold the vertices buffers, we will discuss that more in Section~\ref{f:geometries}; \item the \textbf{Material} class is the class that holds the properties used to render geometry (the most important information being the texture), there are many classes derived from Material, and the developer can choose what material he wants for its objects; @@ -41,8 +41,8 @@ A snippet of the basic usage of these classes is given in Listing~\ref{f:three-h Geometries are the classes that hold the vertices, texture coordinates, normals and faces. There are two most important geometry classes in THREE.js: \begin{itemize} - \item the \textbf{Geometry} class, which is made to be developer friendly and allows easy editing but can suffer issues of performance; - \item the \textbf{BufferGeometry} class, which is harder to use for a developer, but allows better performance since the developer controls and data is transmitted to the GPU\@. 
+ \item the \textbf{Geometry} class, which is made to be developer friendly and allows easy editing but can suffer from issues of performance; + \item the \textbf{BufferGeometry} class, which is harder to use for a developer, but allows better performance since the developer controls how data is transmitted to the GPU\@. \end{itemize} @@ -101,7 +101,7 @@ The equivalent code in Rust is in Listings~\ref{f:undefined-behaviour-rs} and~\r ]{assets/dash-3d-implementation/undefined-behaviour-it.rs} \end{minipage} \end{figure} -What happens is that the iterator needs to borrow of the vector. +What happens is that the iterator needs to borrow the vector. Since it is borrowed, it can no longer be borrowed as mutable since mutating it could invalidate the other borrowers. And effectively, the borrow checker will crash the compiler with the error in Listing~\ref{f:undefined-behaviour-rs-error}. \begin{figure}[ht] diff --git a/src/foreword/video-vs-3d.tex b/src/foreword/video-vs-3d.tex index b284a56..4b768db 100644 --- a/src/foreword/video-vs-3d.tex +++ b/src/foreword/video-vs-3d.tex @@ -77,7 +77,7 @@ Even though these interactions seem easy to handle, giving the best possible exp \end{itemize} -There are even ways of controlling the other options, for example, \texttt{F} puts the player in fullscreen mode, up and down arrows changes the sound volume, \texttt{M} mutes the sound and \texttt{C} activates the subtitles. +There are even ways of controlling the other options, for example, \texttt{F} puts the player in fullscreen mode, up and down arrows change the sound volume, \texttt{M} mutes the sound and \texttt{C} activates the subtitles. All the interactions are summed up in Figure~\ref{i:youtube-keyboard}. 
\newcommand{\relativeseekcontrol}{LightBlue} @@ -282,7 +282,7 @@ These interfaces are not interactive, and can be frustrating to the user who mig Some other interfaces add 2 degrees of freedom to the previous one: the user does not control the position of the camera but they can control the angle. This mimics the scenario of the 360 video. This is typically the case of the video game \emph{nolimits 2: roller coaster simulator} which works with VR devices (oculus rift, HTC vive, etc\ldots) where the only interaction the user has is turning the head. -Finally, most of the other interfaces give at least 5 degrees of freedom to the user: 3 being the coordinates of the position of the camera, and 2 being the angle (assuming the up vector is unchangeable, some interfaces might allow that giving a sixth degree of freedom). +Finally, most of the other interfaces give at least 5 degrees of freedom to the user: 3 being the coordinates of the position of the camera, and 2 being the angle (assuming the up vector is unchangeable, some interfaces might allow that, giving a sixth degree of freedom). The most common controls are the trackball controls where the user rotate the object like a ball \href{https://threejs.org/examples/?q=controls\#misc_controls_trackball}{(live example here)} and the orbit controls, which behave like the trackball controls but preserving the up vector \href{https://threejs.org/examples/?q=controls\#misc_controls_orbit}{(live example here)}. Another popular way of controlling a free camera in a virtual environment is the first person controls \href{https://threejs.org/examples/?q=controls\#misc_controls_pointerlock}{(live example here)}. These controls are typically used in shooting video games, the mouse rotates the camera and the keyboard is used to translate it. 
@@ -290,7 +290,7 @@ These controls are typically used in shooting video games, the mouse rotates the \subsection{Relationship between interface, interaction and streaming} In both video and 3D systems, streaming affects the interaction. -For example, in a video streaming scenario, if a user sees that the video is fully loaded, they might start moving around on the timeline, but if they sees that the streaming is just enough to not stall, they might prefer staying peaceful and just watch the video. +For example, in a video streaming scenario, if a user sees that the video is fully loaded, they might start moving around on the timeline, but if they see that the streaming is just enough to not stall, they might prefer staying peaceful and just watch the video. If the streaming stalls for too long, the user might seek somewhere else hoping for the video to resume, or get frustrated and leave the video. The same types of behaviour occur in 3D streaming: if a user is somewhere in a scene, and sees more data appearing, they might wait until enough data has arrived, but if they sees nothing happens, they might leave to look for data somewhere else. @@ -303,7 +303,7 @@ Moving slowly allows the system to collect and display data to the user, whereas Moreover, the interface and the way elements are displayed to the user also impacts his behaviour. A streaming system can use this effect to its users benefit by providing feedback on the streaming to the user via the interface. For example, on Youtube, the buffered portion of the video is displayed in light grey on the timeline, whereas the portion that remains to be downloaded is displayed in dark grey. -A user is more likely to click on the light grey part of the timeline that on the dark grey part, preventing the streaming from stalling. +A user is more likely to click on the light grey part of the timeline than on the dark grey part, preventing the streaming from stalling. 
\begin{figure}[th] \centering diff --git a/src/introduction/challenges.tex b/src/introduction/challenges.tex index 608a2c2..929a3d2 100644 --- a/src/introduction/challenges.tex +++ b/src/introduction/challenges.tex @@ -21,19 +21,19 @@ This opens multiple problems that we need to take care of. Before streaming content, it needs to be prepared. The segmentation of the content into chunks is particularly important for streaming since it allows transmitting only a portion of the data to the client, that it can render before downloading more chunks. Content preparation also includes compression. -One of the question this thesis has to answer is \emph{what is the best way to prepare 3D content so that a client can benefit from it?} +One of the questions this thesis has to answer is \emph{``what is the best way to prepare 3D content so that a client can benefit from it?''} \paragraph{Streaming policies.} Once our content is prepared and split in chunks, a client needs to determine which chunks it needs to download. A chunk that contains data in the field of view of the user is more relevant than a chunk outside of it; a chunk that is close to the camera is more relevant than a chunk far away from the camera, etc\ldots. This should also include other contextual parameters, such as the size of a chunk, the bandwidth, the user's behaviour, etc\ldots. -The most important question we have to answer is \emph{how do we determine which chunks need to be downloaded depending on the chunks themselves and the user's interactions?} +The most important question we have to answer is \emph{how to determine which chunks need to be downloaded depending on the chunks themselves and the user's interactions?} \paragraph{Evaluation.} In such systems, the two most important criteria for evaluation are quality of service, and quality of experience. The quality of service is a network-centric metric, which considers values such as throughput. 
The quality of experience is a user-centric metric, and can only be measured by asking how users feel about a system. -To be able to know which streaming policies are best, one needs to know \emph{how can we compare streaming policies and evalute the impact of their parameters in terms of quality of service and quality of experience?} +To be able to know which streaming policies are best, one needs to know \emph{how to compare streaming policies and evaluate the impact of their parameters in terms of quality of service and quality of experience?} \paragraph{Implementation.} The objective of our work is to setup a client-server architecture that answers the problems mentioned earlier (content preparation, chunk utility, streaming policies). diff --git a/src/introduction/main.tex b/src/introduction/main.tex index f35ed3a..8be9c4e 100644 --- a/src/introduction/main.tex +++ b/src/introduction/main.tex @@ -3,10 +3,10 @@ \fresh{} During the last years, 3D acquisition and modeling techniques have progressed a lot. -Recent software such as \href{https://alicevision.org/\#meshroom}{Meshroom} use \emph{structure from motion} and \emph{multi view stero} to infer a 3D model from a set of photographs. +Recent software such as \href{https://alicevision.org/\#meshroom}{Meshroom} uses \emph{structure from motion} and \emph{multi view stereo} to infer a 3D model from a set of photographs. There are more and more devices that are specifically built to obtain 3D data: some are more expensive and provide with very precise information such as Lidar, and some cheaper devices can obtain coarse data such as the Kinect. -Thanks to these techniques, more and more 3D data becomes available. -These models have potential for multiple purposes, for example, they can be 3D printed which can reduce the production cost of some pieces of hardware or enable the creation of new objects, but most uses will consist in visualisation. +Thanks to these techniques, more and more 3D data become available.
+These models have potential for multiple purposes, for example, they can be 3D printed, which can reduce the production cost of some pieces of hardware or enable the creation of new objects, but most uses will consist in visualisation. For example, they can be used for augmented reality, to provide user with feedback that can be useful to help worker with complex tasks, but also for fashion (for example, \emph{Fitting Box} is a company that develops software to virtually try glasses). 3D acquisition and visualisation is also useful to preserve cultural heritage, and software such as Google Heritage or 3DHop are such examples, or to allow users navigating in a city (as in Google Earth or Google Maps in 3D). \href{https://sketchfab.com}{Sketchfab} is an example of a website allowing users to share their 3D models and visualise models from other users. diff --git a/src/introduction/outline.tex b/src/introduction/outline.tex index c4d7ebd..b76bd07 100644 --- a/src/introduction/outline.tex +++ b/src/introduction/outline.tex @@ -9,7 +9,7 @@ Then it reviews the different manners of performing 3D streaming. The last section of this chapter focuses on 3D interaction. Then, in Chapter~\ref{bi}, we present our first contribution: an in-depth analysis of the impact of the UI on navigation and streaming in a 3D scene. -We first develop a basic interface for navigating in 3D and we introduce 3D objects called \emph{bookmarks} that help users navigate in the scene. +We first develop a basic interface for navigating in 3D and we introduce 3D objects called \emph{bookmarks} that help users navigating in the scene. We then present a user study that we conducted on 50 people that shows that bookmarks have a great impact on how easy it is for a user to perform tasks such as finding objects. % Then, we setup a basic 3D streaming system that allows us to replay the traces collected during the user study and simulate 3D streaming at the same time. 
We analyse how the presence of bookmarks impacts the streaming, and we propose and evaluate a few streaming policies that rely on pre-computations that can be made thanks to bookmarks and that can increase the quality of experience. @@ -17,7 +17,7 @@ We analyse how the presence of bookmarks impacts the streaming, and we propose a In Chapter~\ref{d3}, we present the most important contribution of this thesis: DASH-3D. DASH-3D is an adaptation of the video streaming standard to 3D streaming. We first describe how we adapt the concepts of DASH to 3D content, including the segmentation of content. -We then define utility metrics that associates score to each chunk depending on the camera's position. +We then define utility metrics that associate a score with each chunk depending on the camera's position. Then, we present a client and various streaming policies based on our utilities that can benefit from the DASH format. We finally evaluate the different parameters of our client. diff --git a/src/system-bookmarks/main.tex b/src/system-bookmarks/main.tex index 55e83ae..551c8e8 100644 --- a/src/system-bookmarks/main.tex +++ b/src/system-bookmarks/main.tex @@ -3,6 +3,21 @@ \minitoc{} \newpage +Nowadays, smartphones are more and more powerful, and people are slowly moving their applications from their computers to smartphones or tablets. +This is why we decided to port our interface to mobile devices. +Desktop devices and mobile devices are very different. +There are many differences in terms of performance: desktop devices tend to be much more powerful and have much better memory and network connection than mobile devices. +Also, the interaction is not comparable in any way: the desktop mostly uses keyboard and mouse, whereas most of the mobile devices only have a touchscreen, as well as many sensors (accelerometer, gyroscope, GPS, etc\ldots). +This is why porting our DASH-3D client to mobile is not an easy task.
+ +To do so, we add some widgets on the screen to support touch interaction: a virtual joystick is displayed on the screen and the user can touch it to translate the camera, instead of using the W, A, S and D keys on a computer. +Since most mobile devices embed a gyroscope, we allow users to rotate the camera by physically rotating the device. +This interaction is more precise and intuitive to the user, but it is also more tiring, which is why we also add a touch interaction to rotate the camera: a user can touch any place on the screen that does not correspond to the joystick to rotate the camera by moving the scene. +In order to ease navigation, we reintegrate bookmarks, and we enhance the precomputations explained in Chapter~\ref{sb}. +We then conducted a user study on 18 participants, to test both the interaction and the streaming aspects of the bookmarks. + +\newpage + \input{system-bookmarks/introduction} \resetstyle{}