\chapter{Introduction\label{i}}
In recent years, 3D acquisition and modeling techniques have made tremendous progress.
Recent software reconstructs 3D data from 2D camera images: for example, \href{https://alicevision.org/\#meshroom}{Meshroom}, a free and open-source program with almost \numprint{200000} downloads on \href{https://www.fosshub.com/Meshroom.html}{FossHub}, uses \emph{structure-from-motion} and \emph{multi-view stereo} to infer a 3D model.
More and more devices are built specifically to capture 3D data: for example, LIDAR (Light Detection And Ranging) sensors compute distances by measuring the time of flight of light. The recent research interest in autonomous vehicles has led more companies to develop cheaper LIDARs, which increases the potential for new 3D content creation.
Thanks to these techniques, more and more 3D data are becoming available.
These models can serve multiple purposes: for example, they can be printed, which can reduce the production cost of some pieces of hardware or enable the creation of new objects. Most uses, however, are based on visualization.
For example, they can be used in augmented reality to give workers feedback that helps them with complex tasks, but also in fashion (for example, \href{https://www.fittingbox.com}{Fittingbox} is a company that develops software for virtually trying on glasses, as in Figure~\ref{i:fittingbox}).
\begin{figure}[ht]
\centering
\includegraphics[width=0.45\textwidth]{assets/introduction/fittingbox.png}
\caption{My face with augmented glasses\label{i:fittingbox}}
\end{figure}
\newpage
3D acquisition and visualization are also useful for preserving cultural heritage, as in software such as Google Heritage or 3DHop, and for letting users navigate a city (as in Google Earth or Google Maps in 3D).
\href{https://sketchfab.com}{Sketchfab} (see Figure~\ref{i:sketchfab}) is an example of a website that lets users share their 3D models and view models from other users.
\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{assets/introduction/sketchfab.png}
\caption{Sketchfab interface\label{i:sketchfab}}
\end{figure}
In most 3D visualization systems, the 3D data are stored on a server and need to be transmitted to a terminal before the user can visualize them.
The improvements in acquisition setups described above increase the quality of 3D models and, consequently, their size in bytes.
Waiting until 3D content is fully downloaded before letting the user visualize it is no longer a satisfactory solution, so adaptive streaming is needed.
In this thesis, we propose a full framework for navigation and streaming of large 3D scenes, such as districts or whole cities.
% With the progress in data acquisition and modeling techniques, networked virtual environments, or NVE, are increasing in scale.
% For instance,~\cite{urban-data-visualisation} reported that the 3D scene for the city of Lyon takes more than 30 GB of data.
% It has become impractical to download the whole 3D scene before the user begins to navigate in the scene.
% A more common approach is to stream the required 3D content (models and textures) on demand, as the user moves around the scene.
% Downloading the required 3D content the moment the user demands it, however, leads to ``popping effect'' where 3D objects materialize suddenly in the view of the user, due to the latency between requesting for and receiving the 3D content from the server~\cite{visibility-determination}.
% Such latency can be quite high --- Varvello et al.\ reported a median of about 30 seconds for all 3D data in an avatar's surrounding to be loaded in high density Second Life regions under their experimental network conditions, due to a bottleneck at the server~\cite{second-life}.
%
% For a smoother user experience, NVE typically prefetch 3D content, so that a 3D object is readily available for rendering when the object falls into the view of the user.
% Efficient prefetching, however, requires the client or the server to predict where the user would navigate to in the future and retrieve the corresponding 3D content before the user reaches there.
% In a typical scenario, users navigate along a continuous path in a NVE, leading to a significant overlap between the 3D content visible from the user's known current position and possible next positions (i.e., \textit{spatial data locality}).
% Furthermore, there is a significant overlap between the 3D content visible from the current point in time to the next point in time (i.e., \textit{temporal data locality}).
% Both forms of locality lead to content overlaps, thus making a correct prediction easier and a wrong prediction less costly. 3D content overlaps are particularly common in a NVE with open space, such as a 3D archaeological site or a 3D city.
\newpage
\input{introduction/challenges}
\input{introduction/outline}