Thomas Forgione 2019-10-14 16:18:37 +02:00
parent 9029df1ec1
commit e7d9de164a
GPG Key ID: 203DAEA747F48F41
8 changed files with 77 additions and 55 deletions


@ -10,7 +10,7 @@ On the other hand, we showed that this help in 3D navigation comes at the cost o
However, we also showed that this cost is not a fatality\todo{not sure of that sentence, we could also say \emph{is not inevitable}}. However, we also showed that this cost is not a fatality\todo{not sure of that sentence, we could also say \emph{is not inevitable}}.
Due to the prior knowledge we have about bookmarks, we are able to precompute data offline that we are then able to use when users click on bookmarks to improve the quality of service. Due to the prior knowledge we have about bookmarks, we are able to precompute data offline that we are then able to use when users click on bookmarks to improve the quality of service.
We then ran simulations on the traces we collected during the user study to show how these precomputations increase the quality of service. We then ran simulations on the traces we collected during the user study to show how these precomputations increase the quality of service.
This work has been published at the conference MMSys in 2016~\cite{bookmarks-impact}. This work has been published at the conference MMSys in 2016~\citep{bookmarks-impact}.
\paragraph{} \paragraph{}
Then, we put the focus on the streaming aspect of the system. Then, we put the focus on the streaming aspect of the system.
@ -20,7 +20,7 @@ We exploited the fact that DASH is made to be content agnostic to fit 3D content
We used the DASH-SRD extension to cut our 3D content into a $k$-d tree and profit from this structure to perform view-dependent streaming, without having any computation to run on the server side at all. We used the DASH-SRD extension to cut our 3D content into a $k$-d tree and profit from this structure to perform view-dependent streaming, without having any computation to run on the server side at all.
We implemented a few loading policies based on a utility metric that gives a score for each portion of the model. We implemented a few loading policies based on a utility metric that gives a score for each portion of the model.
We compared different values for a set of parameters, as well as our different loading policies by running simulations. We compared different values for a set of parameters, as well as our different loading policies by running simulations.
This work has been published at the conference ACMMM in 2018~\cite{dash-3d}. A demo paper was also published~\cite{dash-3d-demo}. This work has been published at the conference ACMMM in 2018~\citep{dash-3d}. A demo paper was also published~\citep{dash-3d-demo}.
\paragraph{} \paragraph{}
Finally, we brought back the 3D navigation aspect in DASH-3D. Finally, we brought back the 3D navigation aspect in DASH-3D.
@ -29,4 +29,4 @@ The setup of our first contribution had some simplifications that made precomput
In DASH-3D, the data is structured and chunks are precomputed and do not depend on the client's need. In DASH-3D, the data is structured and chunks are precomputed and do not depend on the client's need.
However, this does not mean that all hope is lost: we showed that we are still able to precompute an optimal order for chunks from each bookmark, and keep using the policies from the previous contribution, switching to this optimal order when a user clicks a bookmark. However, this does not mean that all hope is lost: we showed that we are still able to precompute an optimal order for chunks from each bookmark, and keep using the policies from the previous contribution, switching to this optimal order when a user clicks a bookmark.
We then ran simulations to show how the quality of service is impacted by those techniques. We then ran simulations to show how the quality of service is impacted by those techniques.
A demo paper was published at the conference ACMMM in 2019~\cite{dash-3d-bookmarks-demo} showing the interfaces for desktop and mobile clients with bookmarks, but without any streaming aspect. A demo paper was published at the conference ACMMM in 2019~\citep{dash-3d-bookmarks-demo} showing the interfaces for desktop and mobile clients with bookmarks, but without any streaming aspect.


@ -1,14 +1,14 @@
\usepackage{geometry} \usepackage{geometry}
\usepackage{natbib} \usepackage{natbib}
\bibliographystyle{abbrvnat} \bibliographystyle{abbrvnat}
\setcitestyle{authoryear,open={[},close={]},citesep={,}} \setcitestyle{authoryear,open={[},close={]}}
\usepackage{titling} \usepackage{titling}
\usepackage{multirow} \usepackage{multirow}
\usepackage[colorlinks = true, \usepackage[colorlinks = true,
linkcolor = blue, linkcolor = blue,
urlcolor = blue, urlcolor = blue,
citecolor = blue, citecolor = blue,
anchorcolor = blue]{hyperref} anchorcolor = blue]{hyperref}
\usepackage{scrlayer-scrpage} \usepackage{scrlayer-scrpage}
\usepackage{amssymb} \usepackage{amssymb}
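
The citation setup in this preamble can be illustrated with a minimal sketch. With natbib in author-year mode and square brackets, \citep typesets a parenthetical citation and \citet a textual one. The key example-key and the file references.bib are hypothetical placeholders, not entries from this repository.

```latex
\documentclass{article}
\usepackage{natbib}
\bibliographystyle{abbrvnat}
\setcitestyle{authoryear,open={[},close={]}}
\begin{document}
% Parenthetical form: the whole citation sits inside the brackets.
This result was reported previously~\citep{example-key}.
% Textual form: the author name is part of the sentence, the year is bracketed.
\citet{example-key} reported this result previously.
\bibliography{references} % hypothetical references.bib
\end{document}
```

The textual form is usually the better fit when the citation is the grammatical subject of the sentence.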


@ -39,7 +39,7 @@ The recorded camera trace allows us to replay each camera path to perform our si
We collected 13 camera paths this way. We collected 13 camera paths this way.
\subsubsection{Network Setup} \subsubsection{Network Setup}
We tested our implementation under three network bandwidths of 2.5 Mbps, 5 Mbps, and 10 Mbps with an RTT of 38 ms, following the settings from DASH-IF~\cite{dash-network-profiles}. We tested our implementation under three network bandwidths of 2.5 Mbps, 5 Mbps, and 10 Mbps with an RTT of 38 ms, following the settings from DASH-IF~\citep{dash-network-profiles}.
The values are kept constant during the entire client session to analyze the difference in magnitude of performance by increasing the bandwidth. The values are kept constant during the entire client session to analyze the difference in magnitude of performance by increasing the bandwidth.
In our experiments, we set up a virtual camera that moves along a navigation path, and our access engine downloads segments in real time according to Algorithm~\ref{d3:next-segment}. In our experiments, we set up a virtual camera that moves along a navigation path, and our access engine downloads segments in real time according to Algorithm~\ref{d3:next-segment}.


@ -73,12 +73,21 @@ Even though these interactions seem easy to handle, giving the best possible exp
\item press the right arrow key to move 5 seconds forwards; \item press the right arrow key to move 5 seconds forwards;
\item press the \texttt{J} key to move 10 seconds backwards; \item press the \texttt{J} key to move 10 seconds backwards;
\item press the \texttt{L} key to move 10 seconds forwards; \item press the \texttt{L} key to move 10 seconds forwards;
\item press one of the number keys (on the first row of the keyboard, below the function keys, or on the numpad) to move to the corresponding decile of the video. \item press one of the number keys (on the first row of the keyboard, below the function keys, or on the numpad) to move to the corresponding tenth of the video;
\item press the home key to go to the beginning of the video, or the end key to go to the end.
\end{itemize} \end{itemize}
\end{itemize} \end{itemize}
There are also controls for other options: for example, \texttt{F} puts the player in fullscreen mode, up and down arrows change the sound volume, \texttt{M} mutes the sound and \texttt{C} activates the subtitles. There are also controls for other options that are described \href{https://web.archive.org/web/20191014131350/https://support.google.com/youtube/answer/7631406?hl=en}{on this help page}, for example:
\begin{itemize}
\item up and down arrows change the sound volume;
\item \texttt{M} mutes the sound;
\item \texttt{C} activates the subtitles;
\item \texttt{F} puts the player in fullscreen mode;
\item \texttt{T} activates the theater mode (where the video occupies the total width of the screen, instead of occupying two thirds of the screen, the last third being advertising or recommendations);
\item \texttt{I} activates the mini-player (allowing the user to search for other videos while keeping the current video playing in the bottom right corner).
\end{itemize}
All the interactions are summed up in Figure~\ref{i:youtube-keyboard}. All the interactions are summed up in Figure~\ref{i:youtube-keyboard}.
\newcommand{\relativeseekcontrol}{LightBlue} \newcommand{\relativeseekcontrol}{LightBlue}
@ -155,8 +164,11 @@ All the interactions are summed up in Figure~\ref{i:youtube-keyboard}.
% First alphabetic row % First alphabetic row
\begin{scope}[shift={(1.5,0)}] \begin{scope}[shift={(1.5,0)}]
\foreach \key/\offset in {Q/0,W/1,E/2,R/3,T/4,Y/5,U/6,I/7,O/8,P/9,[/10,]/11} \foreach \key/\offset in {Q/0,W/1,E/2,R/3,Y/5,U/6,O/8,P/9,[/10,]/11}
\keystroke{\offset}{1+\offset}{-2.5}{-1.75}{\key}; \keystroke{\offset}{1+\offset}{-2.5}{-1.75}{\key};
\keystrokebg{4}{5}{-2.5}{-1.75}{T}{\othercontrol};
\keystrokebg{7}{8}{-2.5}{-1.75}{I}{\othercontrol};
\end{scope} \end{scope}
% Caps lock % Caps lock
@ -217,6 +229,15 @@ All the interactions are summed up in Figure~\ref{i:youtube-keyboard}.
\keystrokebg{18}{19}{-4.75}{-4}{$\rightarrow$}{\relativeseekcontrol}; \keystrokebg{18}{19}{-4.75}{-4}{$\rightarrow$}{\relativeseekcontrol};
\keystrokebg{17}{18}{-4}{-3.25}{$\uparrow$}{\othercontrol}; \keystrokebg{17}{18}{-4}{-3.25}{$\uparrow$}{\othercontrol};
% Control keys
\keystroke{16}{17}{-1.75}{-1}{\tiny Inser};
\keystrokebg{17}{18}{-1.75}{-1}{\tiny Home}{\absoluteseekcontrol};
\keystroke{18}{19}{-1.75}{-1}{\tiny PgUp};
\keystroke{16}{17}{-2.5}{-1.75}{\tiny Del};
\keystrokebg{17}{18}{-2.5}{-1.75}{\tiny End}{\absoluteseekcontrol};
\keystroke{18}{19}{-2.5}{-1.75}{\tiny PgDown};
% Numpad % Numpad
\keystroke{19.5}{20.5}{-1.75}{-1}{Lock}; \keystroke{19.5}{20.5}{-1.75}{-1}{Lock};
\keystroke{20.5}{21.5}{-1.75}{-1}{/}; \keystroke{20.5}{21.5}{-1.75}{-1}{/};
@ -285,6 +306,7 @@ This is typically the case of the video game \emph{nolimits 2: roller coaster si
Finally, most of the other interfaces give at least 5 degrees of freedom to the user: 3 being the coordinates of the position of the camera, and 2 being the angle (assuming the up vector is unchangeable, some interfaces might allow that, giving a sixth degree of freedom). Finally, most of the other interfaces give at least 5 degrees of freedom to the user: 3 being the coordinates of the position of the camera, and 2 being the angle (assuming the up vector is unchangeable, some interfaces might allow that, giving a sixth degree of freedom).
The most common controls are the trackball controls where the user rotates the object like a ball \href{https://threejs.org/examples/?q=controls\#misc_controls_trackball}{(live example here)} and the orbit controls, which behave like the trackball controls but preserving the up vector \href{https://threejs.org/examples/?q=controls\#misc_controls_orbit}{(live example here)}. The most common controls are the trackball controls where the user rotates the object like a ball \href{https://threejs.org/examples/?q=controls\#misc_controls_trackball}{(live example here)} and the orbit controls, which behave like the trackball controls but preserving the up vector \href{https://threejs.org/examples/?q=controls\#misc_controls_orbit}{(live example here)}.
These types of controls are notably used on the popular mesh editor \href{http://www.meshlab.net/}{MeshLab} and \href{https://sketchfab.com/}{SketchFab}, the YouTube for 3D models.
Another popular way of controlling a free camera in a virtual environment is the first person controls \href{https://threejs.org/examples/?q=controls\#misc_controls_pointerlock}{(live example here)}. Another popular way of controlling a free camera in a virtual environment is the first person controls \href{https://threejs.org/examples/?q=controls\#misc_controls_pointerlock}{(live example here)}.
These controls are typically used in shooting video games: the mouse rotates the camera and the keyboard is used to translate it. These controls are typically used in shooting video games: the mouse rotates the camera and the keyboard is used to translate it.


@ -7,11 +7,11 @@ The content provider of the NVE may want to highlight certain interesting featur
To allow users to easily find these interesting locations within the NVE, \textit{3D bookmarks} or \textit{bookmarks} for short, can be provided. To allow users to easily find these interesting locations within the NVE, \textit{3D bookmarks} or \textit{bookmarks} for short, can be provided.
A bookmark is simply a 3D virtual camera (with position and camera parameters) predefined by the content provider, and can be presented to users in different ways, including as a text link (URL), a thumbnail image, or a 3D object embedded within the NVE itself. A bookmark is simply a 3D virtual camera (with position and camera parameters) predefined by the content provider, and can be presented to users in different ways, including as a text link (URL), a thumbnail image, or a 3D object embedded within the NVE itself.
When users click on a bookmark, NVEs commonly provide a ``fly-to'' animation to transit the camera from the current viewpoint to the destination~\cite{controlled-movement-virtual-3d,browsing-3d-bookmarks} to help orient the users within the 3D space. When users click on a bookmark, NVEs commonly provide a ``fly-to'' animation to transit the camera from the current viewpoint to the destination~\citep{controlled-movement-virtual-3d,browsing-3d-bookmarks} to help orient the users within the 3D space.
Clicking on a bookmark to fly to another viewpoint leads to reduced data locality. Clicking on a bookmark to fly to another viewpoint leads to reduced data locality.
The 3D content at the bookmarked destination viewpoint may overlap less with the current viewpoint. The 3D content at the bookmarked destination viewpoint may overlap less with the current viewpoint.
In the worst case, the 3D objects corresponding to the current and destination viewpoints can be completely disjoint. In the worst case, the 3D objects corresponding to the current and destination viewpoints can be completely disjoint.
Such movement to a bookmark may lead to a \textit{discovery latency}~\cite{second-life}, in which users have to wait for the 3D content for the new viewpoint to be loaded and displayed. Such movement to a bookmark may lead to a \textit{discovery latency}~\citep{second-life}, in which users have to wait for the 3D content for the new viewpoint to be loaded and displayed.
An analogy for this situation, in the context of video streaming, is seeking into a segment of video that has not been prefetched yet. An analogy for this situation, in the context of video streaming, is seeking into a segment of video that has not been prefetched yet.
In this chapter, we explore the impact of bookmarks on NVE navigation and streaming, and make several contributions. In this chapter, we explore the impact of bookmarks on NVE navigation and streaming, and make several contributions.


@ -3,33 +3,32 @@
\section{3D Bookmarks and Navigation Aids} \section{3D Bookmarks and Navigation Aids}
Devising an ergonomic technique for browsing 3D environments through a 2D interface is difficult. Devising an ergonomic technique for browsing 3D environments through a 2D interface is difficult.
Controlling the viewpoint in 3D (6 DOFs) with 2D devices is not only inherently challenging but also strongly task-dependent. In their recent review,~\cite{interaction-3d-environment} distinguish between several types of camera movements: general movements for exploration (e.g., navigation with no explicit target), targeted movements (e.g., searching and/or examining a model in detail), specified trajectory (e.g., a cinematographic camera path), etc. Controlling the viewpoint in 3D (6 DOFs) with 2D devices is not only inherently challenging but also strongly task-dependent. In their recent review,~\citep{interaction-3d-environment} distinguish between several types of camera movements: general movements for exploration (e.g., navigation with no explicit target), targeted movements (e.g., searching and/or examining a model in detail), specified trajectory (e.g., a cinematographic camera path), etc.
For each type of movement, specialized 3D interaction techniques can be designed. For each type of movement, specialized 3D interaction techniques can be designed.
In most cases, rotating, panning, and zooming movements are required, and users are consequently forced to switch back and forth among several navigation modes, leading to interactions that are too complicated overall for a layperson. In most cases, rotating, panning, and zooming movements are required, and users are consequently forced to switch back and forth among several navigation modes, leading to interactions that are too complicated overall for a layperson.
Navigation aids and smart widgets are required and subject to research efforts both in 3D companies (see \url{sketchfab.com}, \url{cl3ver.com} among others) and in academia, as reported below. Navigation aids and smart widgets are required and subject to research efforts both in 3D companies (see \url{sketchfab.com}, \url{cl3ver.com} among others) and in academia, as reported below.
Translating and rotating the camera can be simply specified by a \textit{lookat} point. Translating and rotating the camera can be simply specified by a \textit{lookat} point.
This is often known as point-of-interest movement (or \textit{go-to}, \textit{fly-to} interactions)~\cite{controlled-movement-virtual-3d}. This is often known as point-of-interest movement (or \textit{go-to}, \textit{fly-to} interactions)~\citep{controlled-movement-virtual-3d}.
Given such a point, the camera automatically animates from its current position to a new position that looks at the specified point. Given such a point, the camera automatically animates from its current position to a new position that looks at the specified point.
One key issue of these techniques is to correctly orient the camera at destination. One key issue of these techniques is to correctly orient the camera at destination.
In Unicam (\cite{two-pointer-input}), the so-called click-to-focus strategy automatically chooses the destination viewpoint depending on 3D orientations around the contact point. In Unicam \citep{two-pointer-input}, the so-called click-to-focus strategy automatically chooses the destination viewpoint depending on 3D orientations around the contact point.
The recent Drag'n Go interaction (\cite{drag-n-go}) also hits a destination point while offering control on speed and position along the camera path. The recent Drag'n Go interaction \citep{drag-n-go} also hits a destination point while offering control on speed and position along the camera path.
This 3D interaction is designed in the screen space (it is typically a mouse-based camera control), where cursor's movements are mapped to camera movements following the same direction as the on-screen optical-flow. This 3D interaction is designed in the screen space (it is typically a mouse-based camera control), where cursor's movements are mapped to camera movements following the same direction as the on-screen optical-flow.
Some 3D browsers provide a viewpoint menu offering a choice of viewpoints (\cite{visual-perception-3d},~\cite{showmotion}). Some 3D browsers provide a viewpoint menu offering a choice of viewpoints \citep{visual-perception-3d,showmotion}.
Authors of 3D scenes can place several viewpoints (typically for each POI) in order to allow easy navigation for users, who can then easily navigate from viewpoint to viewpoint just by selecting a menu item. Authors of 3D scenes can place several viewpoints (typically for each POI) in order to allow easy navigation for users, who can then easily navigate from viewpoint to viewpoint just by selecting a menu item.
Such viewpoints can be either static, or dynamically adapted:~\cite{dual-mode-ui} report that users clearly prefer navigating in 3D using a menu with animated viewpoints than with static ones. Such viewpoints can be either static, or dynamically adapted:~\citep{dual-mode-ui} report that users clearly prefer navigating in 3D using a menu with animated viewpoints than with static ones.
Early 3D VRML environments (\cite{browsing-3d-bookmarks}) offer 3D bookmarks with animated transitions between bookmarked views. Early 3D VRML environments \citep{browsing-3d-bookmarks} offer 3D bookmarks with animated transitions between bookmarked views.
These transitions prevent disorientation since users see how they got there. These transitions prevent disorientation since users see how they got there.
Hyperlinks can also ease rapid movements between distant viewpoints and naturally support non-linear and non-continuous access to 3D content. Hyperlinks can also ease rapid movements between distant viewpoints and naturally support non-linear and non-continuous access to 3D content.
Navigating with 3D hyperlinks is potentially faster, but is likely to cause disorientation, as shown by the work of~\cite{ve-hyperlinks}. Navigating with 3D hyperlinks is potentially faster, but is likely to cause disorientation, as shown by the work of~\citep{ve-hyperlinks}.
\cite{linking-behavior-ve} examine explicit landmark links as well as implicit avatar-chosen links in Second Life. \citep{linking-behavior-ve} examine explicit landmark links as well as implicit avatar-chosen links in Second Life.
These authors point out that linking is appreciated by users and that easing linking would likely result in a richer user experience. These authors point out that linking is appreciated by users and that easing linking would likely result in a richer user experience.
\cite{dual-mode-ui} developed the Dual-Mode User Interface (DMUI) that coordinates and links hypertext to 3D graphics in order to access information in a 3D space. \citep{dual-mode-ui} developed the Dual-Mode User Interface (DMUI) that coordinates and links hypertext to 3D graphics in order to access information in a 3D space.
Our results are consistent with the results on 3D hyperlinks, as we showed that in our NVE 3D bookmarks also improve users' performance.
The use of in-scene 3D navigation widgets can also facilitate 3D navigation tasks. The use of in-scene 3D navigation widgets can also facilitate 3D navigation tasks.
\cite{navigation-aid-multi-floor} propose and evaluate 2D and 3D maps as navigation aids for complex virtual buildings and find that the 2D navigation aid outperforms the 3D one for searching tasks. \citep{navigation-aid-multi-floor} propose and evaluate 2D and 3D maps as navigation aids for complex virtual buildings and find that the 2D navigation aid outperforms the 3D one for searching tasks.
The ViewCube widget (\cite{viewcube}) serves as a proxy for the 3D scene and offers viewpoint switching between 26 views while clearly indicating associated 3D orientations. The ViewCube widget \citep{viewcube} serves as a proxy for the 3D scene and offers viewpoint switching between 26 views while clearly indicating associated 3D orientations.
Interactive 3D arrows that point to objects of interest have also been proposed as navigation aids by Chittaro and Burigat~\cite{location-pointing-navigation-aid,location-pointing-effect}: when clicked, the arrows transfer the viewpoint to the destination through a simulated walk or a faster flight. Interactive 3D arrows that point to objects of interest have also been proposed as navigation aids by Chittaro and Burigat~\citep{location-pointing-navigation-aid,location-pointing-effect}: when clicked, the arrows transfer the viewpoint to the destination through a simulated walk or a faster flight.


@ -3,7 +3,7 @@
\subsection{Compression and structuring} \subsection{Compression and structuring}
The most popular compression model for 3D is progressive meshes: they were introduced by~\citet{progressive-meshes} and allow transmitting a mesh by sending a low resolution mesh first, called \emph{base mesh}, and then transmitting detail information that a client can use to increase the resolution. The most popular compression model for 3D is progressive meshes: they were introduced in~\citep{progressive-meshes} and allow transmitting a mesh by sending a low resolution mesh first, called \emph{base mesh}, and then transmitting detail information that a client can use to increase the resolution.
To do so, an algorithm, called \emph{decimation algorithm} removes vertices and faces by merging vertices (Figure~\ref{sote:progressive-scheme}). To do so, an algorithm, called \emph{decimation algorithm} removes vertices and faces by merging vertices (Figure~\ref{sote:progressive-scheme}).
\begin{figure}[ht] \begin{figure}[ht]
@ -66,12 +66,12 @@ When the model is light enough, it is encoded as is, and the operations needed t
Thus, a client can start by downloading the low resolution model, display it to the user, and keep downloading and displaying details as time goes by. Thus, a client can start by downloading the low resolution model, display it to the user, and keep downloading and displaying details as time goes by.
This process reduces the time a user has to wait before seeing something, and increases the quality of experience. This process reduces the time a user has to wait before seeing something, and increases the quality of experience.
More recently, to answer the need for a standard format for 3D data, the Khronos group has proposed a generic format called glTF (GL Transmission Format,~\cite{gltf}) to handle all types of 3D content representations: point clouds, meshes, animated model, etc. More recently, to answer the need for a standard format for 3D data, the Khronos group has proposed a generic format called glTF (GL Transmission Format,~\citep{gltf}) to handle all types of 3D content representations: point clouds, meshes, animated model, etc.
glTF is based on a JSON file, which encodes the structure of a scene of 3D objects. glTF is based on a JSON file, which encodes the structure of a scene of 3D objects.
It can contain a scene tree with cameras, meshes, buffers, materials, textures, animations and skinning information. It can contain a scene tree with cameras, meshes, buffers, materials, textures, animations and skinning information.
Although relevant for compression, transmission and in particular streaming, this standard does not yet consider view-dependent streaming which is required for large scene remote visualisation. Although relevant for compression, transmission and in particular streaming, this standard does not yet consider view-dependent streaming which is required for large scene remote visualisation.
3D Tiles (\cite{3d-tiles}) is a specification for visualizing massive 3D geospatial data developed by Cesium and built on glTF\@. 3D Tiles \citep{3d-tiles} is a specification for visualizing massive 3D geospatial data developed by Cesium and built on glTF\@.
Their main goal is to display 3D objects on top of regular maps, and the data they use is quite different from ours: while they have nice and regular polygons with all the semantics they need, we only work on a polygon soup with textures. Their main goal is to display 3D objects on top of regular maps, and the data they use is quite different from ours: while they have nice and regular polygons with all the semantics they need, we only work on a polygon soup with textures.
Their use case is also different from ours: while their interface gives a user a top-down view of a city, we want our users to move inside a city. Their use case is also different from ours: while their interface gives a user a top-down view of a city, we want our users to move inside a city.
@ -80,41 +80,41 @@ Their use case is also different from ours, while their interface allows a user
The general prefetching problem can be described as follows: what are the data most likely to be accessed by the user in the near future, and in what order do we download the data? The general prefetching problem can be described as follows: what are the data most likely to be accessed by the user in the near future, and in what order do we download the data?
The simplest answer to the first question assumes that the user would likely access content close to the current position, thus would retrieve the 3D content within a given radius of the user (also known as the \textit{area of interest}, or AoI). The simplest answer to the first question assumes that the user would likely access content close to the current position, thus would retrieve the 3D content within a given radius of the user (also known as the \textit{area of interest}, or AoI).
This approach, implemented in Second Life and several other NVEs (e.g.,~\cite{peer-texture-streaming}), only depends on the location of the avatar, not on its viewing direction. This approach, implemented in Second Life and several other NVEs (e.g.,~\citep{peer-texture-streaming}), only depends on the location of the avatar, not on its viewing direction.
It exploits spatial locality and works well for any continuous movement of the user, including turning. It exploits spatial locality and works well for any continuous movement of the user, including turning.
Once the set of objects that are likely to be accessed by the user is determined, the next question is in what order should these objects be retrieved. Once the set of objects that are likely to be accessed by the user is determined, the next question is in what order should these objects be retrieved.
A simple approach is to retrieve the objects based on distance: the spatial distance from the user's virtual location and rotational distance from the user's view. A simple approach is to retrieve the objects based on distance: the spatial distance from the user's virtual location and rotational distance from the user's view.
Other approaches consider the movement of the user and attempt to predict where the user will move to in the future. Other approaches consider the movement of the user and attempt to predict where the user will move to in the future.
\cite{motion-prediction} and~\cite{walkthrough-ve} predict the direction of movement from the user's mouse input pattern. \citep{motion-prediction} and~\citep{walkthrough-ve} predict the direction of movement from the user's mouse input pattern.
The predicted mouse movement direction is then mapped to the navigation path in the NVE\@. The predicted mouse movement direction is then mapped to the navigation path in the NVE\@.
Objects that fall in the predicted path are then prefetched. Objects that fall in the predicted path are then prefetched.
CyberWalk~\cite{cyberwalk} uses an exponentially weighted moving average of past movement vectors, adjusted with the residual of prediction, to predict the next location of the user. CyberWalk~\citep{cyberwalk} uses an exponentially weighted moving average of past movement vectors, adjusted with the residual of prediction, to predict the next location of the user.
\cite{prefetching-walkthrough-latency} cluster the navigation paths of users and use them to predict the future navigation paths. \citep{prefetching-walkthrough-latency} cluster the navigation paths of users and use them to predict the future navigation paths.
Objects that fall within the predicted navigation path are prefetched. Objects that fall within the predicted navigation path are prefetched.
All these approaches work well for a navigation path that is continuous --- once the user clicks on a bookmark and jumps to a new location, the path is no longer continuous and the prediction becomes wrong. All these approaches work well for a navigation path that is continuous --- once the user clicks on a bookmark and jumps to a new location, the path is no longer continuous and the prediction becomes wrong.
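
The exponentially weighted moving average mentioned above can be written down as a rough sketch. The notation is assumed for illustration and is not taken from the CyberWalk paper: $\mathbf{p}_t$ is the user's position at step $t$, $\mathbf{v}_t = \mathbf{p}_t - \mathbf{p}_{t-1}$ is the observed movement vector, $\hat{\mathbf{v}}_t$ is its smoothed estimate, $\alpha \in (0,1]$ is the weight given to the latest observation, and $\mathbf{e}_t$ is the residual of the previous prediction. The exact form of the residual correction is an assumption, and the block requires the amsmath package.

```latex
\begin{align*}
  % Smoothed movement vector: recent moves weigh more than old ones.
  \hat{\mathbf{v}}_{t} &= \alpha\,\mathbf{v}_{t} + (1-\alpha)\,\hat{\mathbf{v}}_{t-1}, \\
  % Residual: how far off the previous position prediction was.
  \mathbf{e}_{t} &= \mathbf{p}_{t} - \hat{\mathbf{p}}_{t}, \\
  % Next predicted position (assumed additive residual correction).
  \hat{\mathbf{p}}_{t+1} &= \mathbf{p}_{t} + \hat{\mathbf{v}}_{t} + \mathbf{e}_{t}.
\end{align*}
```

Under this sketch, a bookmark click produces a movement vector $\mathbf{v}_t$ that is large and unrelated to the smoothed history $\hat{\mathbf{v}}_{t-1}$, which is one way to see why such predictors fail after a jump.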
Moving beyond ordering objects to prefetch based on distance only,~\cite{caching-prefetching-dve} propose to predict the user's interest in an object as well. Moving beyond ordering objects to prefetch based on distance only,~\citep{caching-prefetching-dve} propose to predict the user's interest in an object as well.
Objects within AoI are then retrieved in decreasing order of predicted interest value to the user. Objects within AoI are then retrieved in decreasing order of predicted interest value to the user.
\cite{learning-user-access-patterns} investigates how to render large-scale 3-D scenes on a thin client. % \cite{learning-user-access-patterns} investigates how to render large-scale 3-D scenes on a thin client.
Efficient scene prefetching to provide timely data with a limited cache is one of the most critical issues for remote 3-D data scheduling in networked virtual environment applications. % Efficient scene prefetching to provide timely data with a limited cache is one of the most critical issues for remote 3-D data scheduling in networked virtual environment applications.
Existing prefetching schemes predict the future positions of each individual user based on user traces. % Existing prefetching schemes predict the future positions of each individual user based on user traces.
In this paper, we investigate scene content sequences accessed by various users instead of user viewpoint traces and propose a user access pattern-based 3-D scene prefetching scheme. % In this paper, we investigate scene content sequences accessed by various users instead of user viewpoint traces and propose a user access pattern-based 3-D scene prefetching scheme.
We make a relationship graph-based clustering to partition history user access sequences into several clusters and choose representative sequences from among these clusters as user access patterns. % We make a relationship graph-based clustering to partition history user access sequences into several clusters and choose representative sequences from among these clusters as user access patterns.
Then, these user access patterns are prioritized by their popularity and users' personal preference. % Then, these user access patterns are prioritized by their popularity and users' personal preference.
Based on these access patterns, the proposed prefetching scheme predicts the scene contents that will most likely be visited in the future and delivers them to the client in advance. % Based on these access patterns, the proposed prefetching scheme predicts the scene contents that will most likely be visited in the future and delivers them to the client in advance.
\cite{remote-rendering-streaming} investigate remote image-based rendering (IBR) as the most suitable solution for rendering complex 3D scenes on mobile devices, where the server renders the 3D scene and streams the rendered images to the client. \citep{remote-rendering-streaming} investigate remote image-based rendering (IBR) as the most suitable solution for rendering complex 3D scenes on mobile devices, where the server renders the 3D scene and streams the rendered images to the client.
However, sending a large number of images is inefficient due to the possible limitations of wireless connections. However, sending a large number of images is inefficient due to the possible limitations of wireless connections.
They propose a prefetching scheme at the server side that predicts client movements and hence prefetches the corresponding images. They propose a prefetching scheme at the server side that predicts client movements and hence prefetches the corresponding images.
Prefetching techniques easing 3D data streaming and real-time rendering for remote walkthroughs are considered in~\cite{prefetching-remote-walkthroughs}. Prefetching techniques easing 3D data streaming and real-time rendering for remote walkthroughs are considered in~\citep{prefetching-remote-walkthroughs}.
Culling methods that do not possess frame-to-frame coherence can successfully be combined with remote scene databases if the prefetching algorithm is adapted accordingly. Culling methods that do not possess frame-to-frame coherence can successfully be combined with remote scene databases if the prefetching algorithm is adapted accordingly.
The authors present a quantitative transmission policy that takes the limited bandwidth of the network and the limited memory available at the client computer into account. The authors present a quantitative transmission policy that takes the limited bandwidth of the network and the limited memory available at the client computer into account.
Also in the context of remote visualization,~\cite{cache-remote-visualization} study caching and prefetching and optimize configurations of remote visualization architectures. Also in the context of remote visualization,~\citep{cache-remote-visualization} study caching and prefetching and optimize configurations of remote visualization architectures.
They aim at minimizing the fetch time in a remote visualization system and defend a practical infrastructure software to adaptively optimize the caching architecture of such systems under varying conditions (e.g.\ when network resources vary). They aim at minimizing the fetch time in a remote visualization system and defend a practical infrastructure software to adaptively optimize the caching architecture of such systems under varying conditions (e.g.\ when network resources vary).


@ -5,8 +5,8 @@
\subsection{DASH\@: the standard for video streaming\label{sote:dash}} \subsection{DASH\@: the standard for video streaming\label{sote:dash}}
\copied{} \copied{}
Dynamic Adaptive Streaming over HTTP (DASH), or MPEG-DASH (\citet{dash-std,dash-std-2}), is now a widely deployed Dynamic Adaptive Streaming over HTTP (DASH), or MPEG-DASH \citep{dash-std,dash-std-2}, is now a widely deployed
standard for streaming adaptive video content on the Web (\citet{dash-std-full}), made to be simple and scalable. standard for streaming adaptive video content on the Web \citep{dash-std-full}, made to be simple and scalable.
\fresh{} \fresh{}
DASH is based on a clever way of structuring the content that allows a great adaptability during the streaming without requiring any server side computation. DASH is based on a clever way of structuring the content that allows a great adaptability during the streaming without requiring any server side computation.
@ -48,14 +48,14 @@ This is one of the strengths of DASH\@: no powerful server is required, and sinc
\subsection{DASH-SRD} \subsection{DASH-SRD}
DASH-SRD (Spatial Relationship Description,~\cite{dash-srd}) is a feature that extends the DASH standard to allow streaming only a spatial subpart of a video to a device. DASH-SRD (Spatial Relationship Description,~\citep{dash-srd}) is a feature that extends the DASH standard to allow streaming only a spatial subpart of a video to a device.
It works by encoding a video at multiple resolutions, and tiling the highest resolutions as shown in Figure~\ref{sota:srd-png}. It works by encoding a video at multiple resolutions, and tiling the highest resolutions as shown in Figure~\ref{sota:srd-png}.
That way, a client can choose to download either the low resolution of the whole video or higher resolutions of a subpart of the video. That way, a client can choose to download either the low resolution of the whole video or higher resolutions of a subpart of the video.
\begin{figure}[th] \begin{figure}[th]
\centering \centering
\includegraphics[width=0.6\textwidth]{assets/state-of-the-art/video/srd.png} \includegraphics[width=0.6\textwidth]{assets/state-of-the-art/video/srd.png}
\caption{DASH-SRD~\cite{dash-srd}\label{sota:srd-png}} \caption{DASH-SRD~\citep{dash-srd}\label{sota:srd-png}}
\end{figure} \end{figure}
For each tile of the video, an adaptation set is declared in the MPD, and a supplemental property is defined in order to give the client information about the tile. For each tile of the video, an adaptation set is declared in the MPD, and a supplemental property is defined in order to give the client information about the tile.
@ -92,19 +92,20 @@ This is especially interesting in the context of 3D streaming since we have this
We briefly survey other research on prefetching that focuses on non-continuous interaction in other types of media. We briefly survey other research on prefetching that focuses on non-continuous interaction in other types of media.
In the context of navigating in a video, a recent work by~\cite{video-bookmarks} prefetches video chunks located after bookmarks along the video timeline. In the context of navigating in a video, a recent work in~\citep{video-bookmarks} prefetches video chunks located after bookmarks along the video timeline.
Their work, however, focuses on changing the user behavior to improve the prefetching hit rate, by depicting the state of the prefetched buffer to the user. Their work, however, focuses on changing the user behavior to improve the prefetching hit rate, by depicting the state of the prefetched buffer to the user.
Carlier et al.\ also consider prefetching in the context of zoomable videos in an earlier work~\cite{zoomable-video}, and show that predicting which region of videos the user will zoom into or pan to by analyzing interaction traces from users is difficult. Carlier et al.\ also consider prefetching in the context of zoomable videos in an earlier work~\citep{zoomable-video}, and show that predicting which region of videos the user will zoom into or pan to by analyzing interaction traces from users is difficult.
Prefetching for navigation through a sequence of short online videos is considered by khemmarat et al in~\cite{user-generated-videos}. Prefetching for navigation through a sequence of short online videos is considered in~\citep{user-generated-videos}.
Each switch from the current video to the next can be treated as a non-continuous interaction. Each switch from the current video to the next can be treated as a non-continuous interaction.
The authors proposed recommendation-aware prefetching --- to prefetch the prefix of videos from the search result list and related video list, as these videos are likely to be of interest to the user and other users from the same community. The authors proposed recommendation-aware prefetching --- to prefetch the prefix of videos from the search result list and related video list, as these videos are likely to be of interest to the user and other users from the same community.
\cite{video-navigation-mpd} consider the problem of prefetching in the context of a hypervideo; non-continuous interaction happens when users click on a hyperlink in the video. \citep{video-navigation-mpd} consider the problem of prefetching in the context of a hypervideo; non-continuous interaction happens when users click on a hyperlink in the video.
They propose a formal framework that captures the click probability, the bandwidth, and the bit rate of videos as a Markov decision problem, and derive an optimal prefetching policy. They propose a formal framework that captures the click probability, the bandwidth, and the bit rate of videos as a Markov decision problem, and derive an optimal prefetching policy.
\cite{joserlin} propose Joserlin, a generic framework for prefetching that applies to any non-continuous media, but focuses on peer-to-peer streaming applications. \citep{joserlin} propose Joserlin, a generic framework for prefetching that applies to any non-continuous media, but focuses on peer-to-peer streaming applications.
They do not predict which item to prefetch, but rather focus on how to schedule the prefetch request and response. They do not predict which item to prefetch, but rather focus on how to schedule the prefetch request and response.
There is a huge body of work on prefetching web objects in the context of the world wide web. Interested readers can refer to numerous surveys written on this topic (e.g.,~\cite{survey-caching-prefetching}). There is a huge body of work on prefetching web objects in the context of the world wide web.
Interested readers can refer to numerous surveys written on this topic, e.g.~\citep{survey-caching-prefetching}.