From 6d1ef9fdd96274f1d43fe39cbaba82a6deeb1019 Mon Sep 17 00:00:00 2001 From: Thomas Forgione Date: Tue, 11 Feb 2020 10:26:46 +0100 Subject: [PATCH] s vs z, list of lists --- src/abstracts/abstract-en.tex | 2 +- src/abstracts/abstract-simple-en.tex | 2 +- src/acknowledgments.tex | 4 ++-- src/conclusion/contributions.tex | 2 +- src/config.sty | 10 +++++++++- src/dash-3d/conclusion.tex | 2 +- src/dash-3d/main.tex | 4 ++-- src/foreword/implementation.tex | 2 +- src/foreword/video-vs-3d.tex | 2 +- src/introduction/main.tex | 10 +++++----- src/main.tex | 13 ++++++++----- src/state-of-the-art/3d-streaming.tex | 10 +++++----- src/state-of-the-art/video.tex | 2 +- 13 files changed, 38 insertions(+), 27 deletions(-) diff --git a/src/abstracts/abstract-en.tex b/src/abstracts/abstract-en.tex index 054011d..318e329 100644 --- a/src/abstracts/abstract-en.tex +++ b/src/abstracts/abstract-en.tex @@ -1,5 +1,5 @@ With the advances in 3D model editing and 3D reconstruction techniques, more and more 3D models are available and their quality is increasing. -Furthermore, the support of 3D visualisation on the web has become standard during the last years. +Furthermore, support for 3D visualization on the web has become standard in recent years. A major challenge is thus to deliver these remote heavy models and to allow users to visualise and navigate in these virtual environments. This thesis focuses on 3D content streaming and interaction, and proposes three major contributions. 
diff --git a/src/abstracts/abstract-simple-en.tex b/src/abstracts/abstract-simple-en.tex index aa43db5..e2f42c8 100644 --- a/src/abstracts/abstract-simple-en.tex +++ b/src/abstracts/abstract-simple-en.tex @@ -1,4 +1,4 @@ -More and more 3D models are made available online, and web browsers have now full support for 3D visualisation: +More and more 3D models are made available online, and web browsers now have full support for 3D visualization: this thesis focuses on remote 3D virtual environments streaming and interaction, and describes three major contributions. First, we propose an interface for 3D navigation with bookmarks, which are small virtual objects added to the scene that the user can click to move towards a recommended location. diff --git a/src/acknowledgments.tex b/src/acknowledgments.tex index 826d7a2..f2ce260 100644 --- a/src/acknowledgments.tex +++ b/src/acknowledgments.tex @@ -10,8 +10,8 @@ Then, I want to thank Sidonie \textsc{Christophe} and Gwendal \textsc{Simon} for I also want to thank all the members of the jury, for their attention and the interesting discussions during the defense. % Potes -I want to thank all the kids of the lab that contributed to the mood (and beers\footnote{I mean, seriously, drink responsibly}): Bastien, Vincent, Julien, Sonia, Matthieu, Jean, Damien, Thibault, Clément, Arthur, Thierry, the other Matthieu, the other Julien, Paul, Maxime, Quentin, Adrian, Salomé. -I also want to thank Praveen: it was a pleasure working with him. +I want to thank all the kids of the lab and elsewhere who contributed to the mood (and beers\footnote{I mean, seriously, drink responsibly}): Bastien, Vincent, Julien, Sonia, Matthieu, Jean, Damien, Richard, Thibault, Clément, Arthur, Thierry, the other Matthieu, the other Julien, Paul, Maxime, Quentin, Adrian, Salomé. +I also want to thank Praveen: working with him was a pleasure. 
% Permanents I would also like to thank the big ones (whom I forgot to thank during the defense, \emph{oopsies}), Sylvie, Jean-Denis, Simone, Yvain, Pierre. diff --git a/src/conclusion/contributions.tex b/src/conclusion/contributions.tex index cc9ea07..40db4f1 100644 --- a/src/conclusion/contributions.tex +++ b/src/conclusion/contributions.tex @@ -16,7 +16,7 @@ This work has been published at the ACM MMSys conference in 2016~\citep{bookmark \paragraph{} After studying the interactive aspect of 3D navigation, we proposed a contribution focusing on the content preparation and the streaming policies of such a system. The objective of this contribution was to introduce a system able to perform \textbf{scalable, view-dependent 3D streaming}. -This new framework brought many improvements upon the basic system described in our first contribution: support for texture, externalisation of necessary computations from the server to the clients, support for multi-resolution textures, rendering performances considerations. +This new framework brought many improvements over the basic system described in our first contribution: support for textures, externalization of necessary computations from the server to the clients, support for multi-resolution textures, and rendering performance considerations. We drew massive inspiration from the DASH technology, a standard for video streaming used for its scalability and its adaptability. We exploited the fact that DASH is made to be content agnostic to fit 3D content into its structure. Following the path set by DASH-SRD, we proposed to tile 3D content using a tree and encode this partition into a description file (MPD) to allow view-dependent streaming, without the need for computation on the server side. 
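The tree-based tiling described in this contribution can be sketched in a few lines of Rust. This is an illustrative sketch under our own assumptions (points stand in for face barycenters, splits are at the median along alternating axes, and all names are hypothetical), not the thesis implementation:

```rust
// Illustrative k-d-style spatial partition of a 3D scene into cells.
// `max_leaf` is the maximum number of points per cell and must be >= 1.
#[derive(Debug)]
enum Cell {
    Leaf(Vec<[f64; 3]>),
    Split { axis: usize, value: f64, left: Box<Cell>, right: Box<Cell> },
}

fn partition(mut pts: Vec<[f64; 3]>, axis: usize, max_leaf: usize) -> Cell {
    if pts.len() <= max_leaf {
        return Cell::Leaf(pts);
    }
    // Split at the median along the current axis, then recurse on the next axis.
    pts.sort_by(|a, b| a[axis].partial_cmp(&b[axis]).unwrap());
    let mid = pts.len() / 2;
    let value = pts[mid][axis];
    let right = pts.split_off(mid);
    Cell::Split {
        axis,
        value,
        left: Box::new(partition(pts, (axis + 1) % 3, max_leaf)),
        right: Box::new(partition(right, (axis + 1) % 3, max_leaf)),
    }
}
```

In the spirit of DASH-SRD, the cell boundaries produced by such a partition are what would be serialized into the description file, so the server stays static.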
diff --git a/src/config.sty b/src/config.sty index 261e3db..e8ba37f 100644 --- a/src/config.sty +++ b/src/config.sty @@ -23,7 +23,6 @@ anchorcolor = blue]{hyperref} \usepackage{subcaption} \usepackage{todonotes} \usepackage{booktabs} -\usepackage{etoc, blindtext} \usepackage{fontspec,fontawesome} \usepackage{setspace} \onehalfspacing{} @@ -31,6 +30,15 @@ anchorcolor = blue]{hyperref} \usepackage{numprint} \usepackage{pdfpages} \usepackage{enumitem} +\usepackage{tocloft} +\usepackage{etoc, blindtext} +\newcommand{\listexamplename}{\hfill{}List of Lists} +\newlistof{example}{exp}{\listexamplename} +\newcommand{\example}[1]{% +\refstepcounter{example} +{} +\addcontentsline{exp}{example} +{\protect\numberline{\quad\,\,\,\theexample}\quad\quad\,\,\,\,#1}\par} \usepackage[normalem]{ulem} \setitemize{noitemsep,topsep=4pt,parsep=4pt,partopsep=0pt} diff --git a/src/dash-3d/conclusion.tex b/src/dash-3d/conclusion.tex index cb44144..429d431 100644 --- a/src/dash-3d/conclusion.tex +++ b/src/dash-3d/conclusion.tex @@ -3,7 +3,7 @@ Our work in this chapter started with the question: can DASH be used for NVE\@? The answer is \emph{yes}. In answering this question, we contributed by showing how to organize a polygon soup and its textures into a DASH-compliant format that (i) includes a minimal amount of metadata that is useful for the client, (ii) organizes the data to allow the client to get the most useful content first. -We further show that the data organisation and its description with metadata (precomputed offline) is sufficient to design and build a DASH client that is adaptive --- it selectively downloads segments within its view, makes intelligent decisions about what to download, balances between geometry and texture while adapting to network bandwidth. 
+We further show that the data organization and its description with metadata (precomputed offline) are sufficient to design and build a DASH client that is adaptive --- it selectively downloads segments within its view, makes intelligent decisions about what to download, balances between geometry and texture while adapting to network bandwidth. This way, our system addresses the open problems we mentioned in Chapter~\ref{i:challenges}. \begin{itemize} diff --git a/src/dash-3d/main.tex b/src/dash-3d/main.tex index 60dda7d..7f20a6e 100644 --- a/src/dash-3d/main.tex +++ b/src/dash-3d/main.tex @@ -15,10 +15,10 @@ Dynamic Adaptive Streaming over HTTP (DASH) is now a widely deployed standard for video streaming, and even though video streaming and 3D streaming are different problems, many of DASH's features can inspire us for 3D streaming. In this chapter, we present the most important contribution of this thesis: adapting DASH to 3D streaming. -First, we show how to prepare 3D data into a format that complies with DASH data organisation, and we store enough metadata to enable a client to perform efficient streaming. +First, we show how to prepare 3D data into a format that complies with DASH data organization, and we store enough metadata to enable a client to perform efficient streaming. The data preparation consists in partitioning the scene into spatially coherent cells and segmenting each cell into chunks with a fixed number of faces, which are sorted by area so that faces of different levels of detail are not grouped together. We also export each texture at different resolutions. -We encode the metadata that describes the data organisation into a 3D version of the Media Presentation Description (MPD) that DASH uses for video. +We encode the metadata that describes the data organization into a 3D version of the Media Presentation Description (MPD) that DASH uses for video. 
All this prepared content is then stored on a simple static HTTP server: a client can request the content without any need for computation on the server side, allowing a server to support an arbitrary number of clients. % Namely, we store in the metadata the coordinates of the cells of the $k$-d tree, the areas of geometry chunks, and the average colors of textures. diff --git a/src/foreword/implementation.tex b/src/foreword/implementation.tex index 7c75257..26e17a7 100644 --- a/src/foreword/implementation.tex +++ b/src/foreword/implementation.tex @@ -114,7 +114,7 @@ And effectively, the borrow checker will crash the compiler with the error in Sn \end{figure} This example is one of the many examples of how powerful the borrow checker is: in Rust code, there can be no dangling reference, and all the segmentation faults coming from them are detected by the compiler. -The borrow checker may seem like an enemy to newcomers because it often rejects code that seem correct, but once they get used to it, they understand what is the problem with their code and either fix the problem easily, or realise that the whole architecture is wrong and understand why. +The borrow checker may seem like an enemy to newcomers because it often rejects code that seems correct, but once they get used to it, they understand the problem with their code and either fix it easily, or realize that the whole architecture is wrong and understand why. It is probably for those reasons that Rust is the \emph{most loved programming language} according to the Stack Overflow Developer Survey in~\citeyear{so-survey-2016}, \citeyear{so-survey-2017}, \citeyear{so-survey-2018} and~\citeyear{so-survey-2019}. 
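The dangling-reference rejection discussed above can be illustrated with a minimal example of our own (this is not the snippet referenced in the manuscript):

```rust
// The commented-out function is the kind of code the borrow checker rejects:
// `v` is dropped when the function returns, so the reference would dangle,
// and the compiler refuses to build it.
//
// fn first_bad() -> &i32 {
//     let v = vec![1, 2, 3];
//     &v[0] // error: cannot return a reference to a local variable
// }
//
// The usual fix is to return owned data instead of a reference:
fn first_owned() -> i32 {
    let v = vec![1, 2, 3];
    v[0] // the value is copied out, so dropping `v` afterwards is safe
}
```

The same pattern (own the data, or tie the reference's lifetime to an input) resolves most borrow checker complaints.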
diff --git a/src/foreword/video-vs-3d.tex b/src/foreword/video-vs-3d.tex index fb85740..6f47c87 100644 --- a/src/foreword/video-vs-3d.tex +++ b/src/foreword/video-vs-3d.tex @@ -44,7 +44,7 @@ In video, those media are mostly images, sounds, and subtitles, whereas in 3D, t In both cases, an algorithm for content streaming has to acknowledge those different media types and manage them correctly. In video streaming, most of the data (in terms of bytes) is used for images. -Thus, the most important thing a video streaming system should do is to optimise images streaming. +Thus, the most important thing a video streaming system should do is to optimize image streaming. That is why, on a video on Youtube for example, there may be 6 available qualities for images (144p, 240p, 320p, 480p, 720p and 1080p) but only 2 qualities for sound. This is one of the main differences between video and 3D streaming: in a 3D setting, the ratio between geometry and texture varies from one scene to another, and leveraging between those two types of content is a key problem. diff --git a/src/introduction/main.tex b/src/introduction/main.tex index 3d6385d..c01b0f2 100644 --- a/src/introduction/main.tex +++ b/src/introduction/main.tex @@ -4,7 +4,7 @@ During the last years, 3D acquisition and modeling techniques have made tremendo Recent software uses 2D images from cameras to reconstruct 3D data, e.g. \href{https://alicevision.org/\#meshroom}{Meshroom} is a free and open source software which got almost \numprint{200000} downloads on \href{https://www.fosshub.com/Meshroom.html}{fosshub}, which uses \emph{structure-from-motion} and \emph{multi-view-stereo} to infer a 3D model. More and more devices are specifically built to harvest 3D data: for example, LIDAR (Light Detection And Ranging) can compute 3D distances by measuring time of flight of light. 
The recent research interest for autonomous vehicles allowed more companies to develop cheaper LIDARs, which increase the potential for new 3D content creation. Thanks to these techniques, more and more 3D data become available. -These models have potential for multiple purposes, for example, they can be printed, which can reduce the production cost of some pieces of hardware or enable the creation of new objects, but most uses are based on visualisation. +These models have potential for multiple purposes: for example, they can be printed, which can reduce the production cost of some pieces of hardware or enable the creation of new objects, but most uses are based on visualization. For example, they can be used for augmented reality, to provide users with feedback that can be useful to help workers with complex tasks, but also for fashion (for example, \emph{Fittingbox} is a company that develops software to virtually try glasses, as in Figure~\ref{i:fittingbox}). \begin{figure}[ht] @@ -15,8 +15,8 @@ For example, they can be used for augmented reality, to provide user with feedba \newpage -3D acquisition and visualisation is also useful to preserve cultural heritage, and software such as Google Heritage or 3DHop are such examples, or to allow users navigating in a city (as in Google Earth or Google Maps in 3D). -\href{https://sketchfab.com}{Sketchfab} (see Figure~\ref{i:sketchfab}) is an example of a website allowing users to share their 3D models and visualise models from other users. +3D acquisition and visualization are also useful to preserve cultural heritage, with software such as Google Heritage or 3DHop, or to allow users to navigate in a city (as in Google Earth or Google Maps in 3D). +\href{https://sketchfab.com}{Sketchfab} (see Figure~\ref{i:sketchfab}) is an example of a website allowing users to share their 3D models and visualize models from other users. 
\begin{figure}[ht] \centering @@ -24,9 +24,9 @@ For example, they can be used for augmented reality, to provide user with feedba \caption{Sketchfab interface\label{i:sketchfab}} \end{figure} -In most 3D visualisation systems, the 3D data is stored on a server and needs to be transmitted to a terminal before the user can visualise it. +In most 3D visualization systems, the 3D data is stored on a server and needs to be transmitted to a terminal before the user can visualize it. The improvements in the acquisition setups we described lead to an increasing quality of the 3D models, thus an increasing size in bytes as well. -Simply downloading 3D content and waiting until it is fully downloaded to let the user visualise it is no longer a satisfactory solution, so adaptive streaming is needed. +Simply downloading 3D content and waiting until it is fully downloaded to let the user visualize it is no longer a satisfactory solution, so adaptive streaming is needed. In this thesis, we propose a full framework for navigation and streaming of large 3D scenes, such as districts or whole cities. % With the progress in data acquisition and modeling techniques, networked virtual environments, or NVE, are increasing in scale. 
diff --git a/src/main.tex b/src/main.tex index ba5139f..ec7b193 100644 --- a/src/main.tex +++ b/src/main.tex @@ -20,9 +20,11 @@ \begin{document} \let\rawref\ref% +\renewcommand*{\listfigurename}{\hfill{}List of Figures} +\renewcommand*{\listtablename}{\hfill{}List of Tables} %\renewcommand{\ref}[1]{\rawref{#1} (page~\pageref{#1})} -% \includepdf[pages=-]{assets/ugly-cover.pdf} +\includepdf[pages=-]{assets/ugly-cover.pdf} \frontmatter @@ -63,10 +65,11 @@ \newpage \let\LaTeXStandardClearpage\clearpage \let\clearpage\relax -\listoffigures% -\listofalgorithms% -\listoftables% -\lstlistoflistings% +\example{List of Figures}{\listoffigures}% +\example{List of Tables}{\listoftables}% +\example{List of Algorithms}{\listofalgorithms}% +\example{List of Snippets}{\lstlistoflistings}% +\example{List of Lists}{\listofexample} \let\clearpage\LaTeXStandardClearpage% \bibliography{src/bib.bib} diff --git a/src/state-of-the-art/3d-streaming.tex b/src/state-of-the-art/3d-streaming.tex index 81f9cdd..101ab63 100644 --- a/src/state-of-the-art/3d-streaming.tex +++ b/src/state-of-the-art/3d-streaming.tex @@ -101,7 +101,7 @@ However, users are often interested in scenes that contain multiple meshes, and To answer those issues, the Khronos group proposed a generic format called glTF (GL Transmission Format,~\citep{gltf}) to handle all types of 3D content representations: point clouds, meshes, animated models, etc.\ glTF is based on a JSON file, which encodes the structure of a scene of 3D objects. It contains a scene graph with cameras, meshes, buffers, materials, textures and animations. -Although relevant for compression, transmission and in particular streaming, this standard does not yet consider view-dependent streaming, which is required for large scene remote visualisation and which we address in our work. 
+Although relevant for compression, transmission and in particular streaming, this standard does not yet consider view-dependent streaming, which is required for remote visualization of large scenes and which we address in our work. % Zampoglou @@ -138,7 +138,7 @@ Level of details have been initially used for efficient 3D rendering~\citep{lod} When the change from one level of detail to another is direct, it can create visual discomfort to the user. This is called the \emph{popping effect} and level of details have the advantage of enabling techniques, such as geomorphing \citep{hoppe-lod}, to transition smoothly from one level of detail to another. Level of details have then been used for 3D streaming. -For example, \citep{streaming-hlod} propose an out-of-core viewer for remote model visualisation based by adapting hierarchical level of details~\citep{hlod} to the context of 3D streaming. +For example, \citep{streaming-hlod} propose an out-of-core viewer for remote model visualization based on adapting hierarchical levels of detail~\citep{hlod} to the context of 3D streaming. Level of details can also be used to perform viewpoint-dependent streaming, such as \citep{view-dependent-lod}. \subsection{Texture streaming} @@ -152,7 +152,7 @@ Using these lower resolutions can be especially interesting for streaming. Since 3D data can contain many textures, \citep{simon2019streaming} propose a way to stream a set of textures by encoding them into a video. Each texture is segmented into tiles of a fixed size. -Those tiles are then ordered to minimise dissimilarities between consecutive tiles, and encoded as a video. +Those tiles are then ordered to minimize dissimilarities between consecutive tiles, and encoded as a video. By benefiting from the video compression techniques, the authors are able to reach a better rate-distortion ratio than webp, which is the new standard for texture transmission, and jpeg. 
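The tile-ordering idea above can be sketched with a greedy nearest-neighbor chain. This is our illustrative reading, with tiles reduced to hypothetical mean-color vectors; \citep{simon2019streaming} may use a different dissimilarity measure and optimization:

```rust
// Greedy ordering of texture tiles so that consecutive tiles are similar,
// which helps a video codec exploit temporal redundancy.
// Assumes at least one tile; the chain starts arbitrarily at tile 0.
fn order_tiles(tiles: &[[f64; 3]]) -> Vec<usize> {
    let mut order = vec![0usize];
    let mut remaining: Vec<usize> = (1..tiles.len()).collect();
    while !remaining.is_empty() {
        let last = tiles[*order.last().unwrap()];
        // Find the remaining tile closest to the last one in the chain.
        let mut best = 0;
        for k in 1..remaining.len() {
            if dist2(last, tiles[remaining[k]]) < dist2(last, tiles[remaining[best]]) {
                best = k;
            }
        }
        order.push(remaining.remove(best));
    }
    order
}

// Squared Euclidean distance between two mean-color vectors.
fn dist2(a: [f64; 3], b: [f64; 3]) -> f64 {
    (0..3).map(|i| (a[i] - b[i]).powi(2)).sum()
}
```

A greedy chain is only a heuristic for the underlying (TSP-like) ordering problem, but it already keeps consecutive tiles similar.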
\subsection{Geometry and textures} @@ -179,12 +179,12 @@ This is why optimized engines for video games use techniques that are reused for Some other online games, such as \href{https://secondlife.com}{Second Life}, rely on user-generated data, and thus are forced to send data from users to others. In such scenarios, 3D streaming is appropriate and this is why the idea of streaming 3D content for video games has been investigated. -For example, \citep{game-on-demand} proposes an online game engine based on geometry streaming, that addresses the challenge of streaming 3D content at the same time as synchronisation of the different players. +For example, \citep{game-on-demand} proposes an online game engine based on geometry streaming that addresses the challenge of streaming 3D content at the same time as synchronizing the different players. \subsection{NVE streaming frameworks} An example of NVE streaming framework is 3D Tiles \citep{3d-tiles}, which is a specification for visualizing massive 3D geospatial data developed by Cesium and built on top of glTF\@. -Their main goal is to display 3D objects on top of regular maps, and their visualisation consists in a top-down view, whereas we seek to let users freely navigate in our scenes, whether it be flying over the scene or moving along the roads. +Their main goal is to display 3D objects on top of regular maps, and their visualization consists of a top-down view, whereas we seek to let users freely navigate in our scenes, whether it be flying over the scene or moving along the roads. 
\begin{figure}[ht] \centering diff --git a/src/state-of-the-art/video.tex b/src/state-of-the-art/video.tex index f6acc3e..49d6660 100644 --- a/src/state-of-the-art/video.tex +++ b/src/state-of-the-art/video.tex @@ -49,7 +49,7 @@ If a user wants to seek somewhere else in the video, only one segment of data is \subsubsection{Content preparation and server} -Encoding a video in DASH format consists in partitioning the content into periods, adaptation sets, representations and segments as explained above, and generating a Media Presentation Description file (MPD) which describes this organisation. +Encoding a video in DASH format consists of partitioning the content into periods, adaptation sets, representations and segments as explained above, and generating a Media Presentation Description file (MPD) which describes this organization. Once the data is prepared, it can simply be hosted on a static HTTP server which does no computation other than serving files when it receives requests. All the intelligence and the decision making is moved to the client side. This is one of DASH's strengths: no powerful server is required, and since static HTTP servers are stable and efficient, all DASH clients can benefit from it.
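The client-side decision making described above can be sketched minimally. The function name and inputs here are hypothetical (per-representation bitrates as they would be parsed from an MPD), and a real DASH client also accounts for buffer levels and throughput history:

```rust
// Pick the highest-bitrate representation that fits the measured bandwidth.
// `bitrates_bps` is assumed sorted by increasing bitrate, so index 0 is the
// lowest-quality fallback when nothing fits.
fn pick_representation(bitrates_bps: &[u64], bandwidth_bps: u64) -> usize {
    bitrates_bps
        .iter()
        .enumerate()
        .filter(|&(_, &b)| b <= bandwidth_bps)
        .max_by_key(|&(_, &b)| b)
        .map(|(i, _)| i)
        .unwrap_or(0) // nothing fits: fall back to the lowest quality
}
```

Because this logic runs entirely in the client, the server remains a plain static file host, which is exactly the strength the paragraph above describes.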