diff --git a/src/state-of-the-art/3d-streaming.tex b/src/state-of-the-art/3d-streaming.tex index fc6b584..2f614e4 100644 --- a/src/state-of-the-art/3d-streaming.tex +++ b/src/state-of-the-art/3d-streaming.tex @@ -6,11 +6,11 @@ In the next sections, we review the 3D streaming related work, from 3D compressi \subsection{Compression and structuring} -According to \citep{maglo20153d}, mesh compression can be divided in four categories: +According to \citep{maglo20153d}, mesh compression can be divided into four categories: \begin{itemize} \item single-rate mesh compression, seeking to reduce the size of a mesh; \item progressive mesh compression, encoding meshes in many levels of resolution that can be downloaded and rendered one after the other; - \item random accessible mesh compression, where parts of the models can be decoded in any order; + \item random accessible mesh compression, where parts of the models can be decoded in an arbitrary order; \item mesh sequence compression, compressing mesh animations. \end{itemize} @@ -78,7 +78,7 @@ To do so, an algorithm, called \emph{decimation algorithm}, starts from the orig Every time two vertices are merged, a vertex and two faces are removed from the original mesh, decreasing the resolution of the model. After content preparation, the mesh consists in a base mesh and a sequence of partially ordered edge split operations. Thus, a client can start by downloading the base mesh, display it to the user, and keep downloading refinement operations (vertex splits) and display details as time goes by. -This process reduces the time a user has to wait before seeing something, thus increases the quality of experience. +This process reduces the time a user has to wait before seeing a downloaded 3D object, thus increasing the quality of experience.
\begin{figure}[ht] \centering @@ -91,16 +91,16 @@ These methods have been vastly researched \citep{bayazit20093,mamou2010shape}, b \citep{streaming-compressed-webgl} develop a dedicated progressive compression algorithm based on iterative decimation, for efficient decoding, in order to be usable on web clients. With the same objective, \citep{pop-buffer} proposes pop buffer, a progressive compression method based on quantization that allows efficient decoding. -Following this, many approaches use multi triangulation, which creates mesh fragments at different levels of resolution and encodes the dependencies between fragments in a directed acyclic graph. +Following these, many approaches use multi triangulation, which creates mesh fragments at different levels of resolution and encodes the dependencies between fragments in a directed acyclic graph. In \citep{batched-multi-triangulation}, the authors propose Nexus: a GPU optimized version of multi triangulation that pushes its performances to make real time rendering possible. -It is notably used in 3DHOP (3D Heritage Online Presenter, \citep{3dhop}), a framework to easily build web interfaces to present 3D models to users in the context of cultural heritage. +It is notably used in 3DHOP (3D Heritage Online Presenter, \citep{3dhop}), a framework to easily build web interfaces to present 3D objects to users in the context of cultural heritage. Each of these approaches define its own compression and coding for a single mesh. However, users are frequently interested in scenes that contain many meshes, and the need to structure content emerged. To answer those issues, the Khronos group proposed a generic format called glTF (GL Transmission Format,~\citep{gltf}) to handle all types of 3D content representations: point clouds, meshes, animated model, etc.\ glTF is based on a JSON file, which encodes the structure of a scene of 3D objects. 
-It contains a scene graph with cameras, meshes, buffers, materials, textures, animations an skinning information. +It contains a scene graph with cameras, meshes, buffers, materials, textures and animations. Although relevant for compression, transmission and in particular streaming, this standard does not yet consider view-dependent streaming which is required for large scene remote visualisation and that we address in our work. % Zampoglou @@ -120,7 +120,7 @@ Their approach works well for several objects, but does not handle view-dependen 3D streaming means that content is downloaded while the user is interacting with the 3D object. In terms of quality of experience, it is desirable that the downloaded content is visible to the user. -This means that the progressive compression must allow a decoder to choose what it needs to decode, and to guess what it needs to decode from the users point of view. +This means that the progressive compression must allow a decoder to choose what it needs to decode, and to guess what it needs to decode according to the user's point of view. This is typically called \emph{random accessible mesh compression}. \citep{maglo2013pomar} is such an example of random accessible progressive mesh compression. \citep{cheng2008receiver} proposes a receiver driven way of achieving viewpoint dependency with progressive mesh: the client starts by downloading the base mesh, and then is able to estimate the importance of vertex splits and choose which ones to download. @@ -130,14 +130,14 @@ In the case of streaming a large 3D scene, viewpoint dependent streaming is a mu A simple way to implement viewpoint dependency is to access the content near the user's camera. This approach, implemented in Second Life and several other NVEs (e.g.,~\citep{peer-texture-streaming}), only depends on the location of the avatar, not on its viewing direction. -It exploits spatial locality and works well for any continuous movement of the user, including turning.
+It exploits spatial coherence and works well for any continuous movement of the user, including turning. Once the set of objects that are likely to be accessed by the user is determined, the next question is in what order should these objects be retrieved. A simple approach is to retrieve the objects based on distance: the spatial distance from the user's virtual location and rotational distance from the user's view. More recently, Google integrated Google Earth 3D module into Google Maps. Users are now able to go to Google Maps, and click the 3D button which shifts the camera from the top-down view. -Even though there are no associated publications, it seems that the interface does view dependent streaming: low resolution from the center of the point of view gets downloaded right away, and then, data farther away or higher resolution data gets downloaded since it appears in a second time. +Even though there are no associated publications, it seems that the interface performs view-dependent streaming: low resolution data at the center of the point of view gets downloaded right away, and then data farther away or higher resolution data gets downloaded, appearing at a later time. -The choice of the nearby can be based based on an a priori, discretized, partitioned version of the environment; for example, \citep{3d-tiles} developed 3D Tiles, is a specification for visualizing massive 3D geospatial data developed by Cesium and built on top of glTF\@. +The choice of nearby content can be based on an a priori, discretized, partitioned version of the environment; for example, \citep{3d-tiles} developed 3D Tiles, a specification for visualizing massive 3D geospatial data developed by Cesium and built on top of glTF\@. Their main goal is to display 3D objects on top of regular maps.
\begin{figure}[ht] @@ -156,7 +156,7 @@ Their approaches combine the distortion caused by having lower resolution meshes \citep{progressive-compression-textured-meshes} also deals with the geometry / texture compromise. This work designs a cost driven framework for 3D data compression, both in terms of geometry and textures. This framework generates an atlas for textures that enables efficient compression and multiresolution scheme. -All four works considered a single, manifold textured mesh model with progressive meshes, and are not applicable in our work since we deal with large scenes with autointersections and T vertices. +All four works consider a single, manifold textured mesh model with progressive meshes, and are not applicable in our work since we deal with large and potentially non-manifold scenes. Regarding texture streaming, \citep{simon2019streaming} propose a way to stream a set of textures by encoding the textures into a video. Each texture is segmented into tiles of a fixed size. @@ -164,6 +164,8 @@ Those tiles are then ordered to minimise dissimilarities between consecutive til By benefiting from the video compression techniques, they are able to reach a better rate-distortion ratio than webp, which is the new standard for texture transmission, and jpeg. However, the geometry / texture compromise is not the point of that paper. +This thesis proposes a scalable streaming framework for large textured 3D scenes based on DASH, like~\citep{zampoglou}, but featuring a spatial partitioning of scenes in order to provide viewpoint dependent streaming. + % \subsection{Prefetching in NVE} % The general prefetching problem can be described as follows: what are the data most likely to be accessed by the user in the near future, and in what order do we download the data?
% diff --git a/src/state-of-the-art/intro.tex b/src/state-of-the-art/intro.tex index 51e9f27..b9b50d2 100644 --- a/src/state-of-the-art/intro.tex +++ b/src/state-of-the-art/intro.tex @@ -1,4 +1,4 @@ -In this chapter, we review a part of the state of the art on Multimedia streaming and interaction. +In this chapter, we review the part of the state of the art on multimedia streaming and interaction that is relevant to this thesis. As discussed in the previous chapter, video and 3D share many similarities and since there is already a very important body of work on video streaming, we start this chapter with a review of this domain with a particular focus on the DASH standard. -Then, we proceed with presenting various topics related to 3D streaming, including compression, geometry and texture compromise, and viewpoint dependent streaming. +Then, we present topics related to 3D streaming, including compression and streaming, the geometry and texture compromise, and viewpoint dependent streaming. Finally, we end this chapter by reviewing the related work regarding 3D navigation and interfaces.