- three parts of chapter 5 reviewed

acarlier 2019-10-18 15:26:48 +02:00
parent b228286316
commit 5782e6e709
4 changed files with 86 additions and 83 deletions

View File

@ -8,7 +8,7 @@
%\input{foreword/main}
\resetstyle{}
\input{state-of-the-art/main}
%\input{state-of-the-art/main}
\resetstyle{}
%\input{preliminary-work/main}
@ -17,7 +17,7 @@
%\input{dash-3d/main}
\resetstyle{}
%\input{system-bookmarks/main}
\input{system-bookmarks/main}
\resetstyle{}
\backmatter{}

View File

@ -11,88 +11,69 @@ Regarding desktop interaction, we keep the interaction we described in Section~\
\end{itemize}
A screenshot of this interface is displayed in Figure~\ref{sb:desktop}.
\subsection{Mobile interaction}
\copied{}
Mobile interactions are more complex because the user does not have neither keyboard nor mouse to interact with.
However, there are some other sensors on most mobile devices that can help interaction.
One useful sensor for 3D interaction on mobile devices is definitely the gyroscope.
We use the gyroscope to enable a user to rotate his device to rotate the virtual camera.
We also add the possibility to rotate the camera by using touch controls.
This way, the user is not forced to perform a real-world half-turn to be able to look behind or to keep its device pointing to the sky (which can quickly become tiring) to look up.
These interactions, however, do not allow the user to move the camera: he can rotate it but not translate it.
For this reason, we display a small joystick on the bottom left corner of the screen that mimics the first person video games interactions and allows the user translating the camera:
\begin{itemize}
\item moving the joystick up makes the camera move forward;
\item moving the joystick down makes the camera move backwards;
\item moving the joystick sideways makes the camera move sideways.
\end{itemize}
A screenshot of this interface is displayed in Figure~\ref{sb:mobile}.
\section{Adding bookmarks into DASH NVE framework\label{sb:bookmarks}}
\copied{}
In this section, we explain how to include a new interaction in the system described in the previous chapter.
\fresh{}
\subsection{Interaction --- Visual}
In Chapter~\ref{bi} Section~\ref{bi:3d-bookmarks}, we described two 3D widgets that we used to display bookmarks to users.
One of the conclusions of the user-study, described in Section~\ref{bi:user-study}, was that the impact of the way we display bookmark was not significant.
In this work, we chose a slightly different way of representing bookmarks due to some concerns with our original representations:
\begin{itemize}
\item viewport bookmarks are simple, but people who are not computer vision scientists are not familiar with this representation;
\item arrow bookmarks are complex, and need to be regenerated when the camera moves, which can harm the framerate of the rendering.
\end{itemize}
For these reasons, we changed the display to a vertical bar with a 2D sprite of a pictorial representation of an eye.
This 2D sprite is always facing the camera to prevent it from being invisible when the camera would be on the side of it.
Screenshots of user interfaces with bookmarks are available in Figures~\ref{sb:desktop} and~\ref{sb:mobile}.
The size of the sprite changes over time following a sine function to help the user distinguish what is part of the scene and what is extra widgets.
Since our scene is static, a user knows that a changing object is not part of the scene, but part of the UI\@.
The other bookmark parameters remain unchanged since Chapter~\ref{bi}: in order to avoid users to lose context, clicking on a bookmark triggers an automatic, smooth, camera displacement that ends up at the bookmark.
We also display a thumbnail of the bookmark's viewpoint when the mouse hovers a bookmark.
Such thumbnail is displayed in Figure~\ref{sb:desktop}.
Note that since on mobile, there is no mouse and thus no pointer, thumbnails are never downloaded nor displayed.
\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{assets/system-bookmarks/screenshots/desktop.png}
\caption{Screenshot of the desktop version, with a bookmark and its thumbnail on the bottom left corner\label{sb:desktop}}
\end{figure}
\subsection{Mobile interaction}
\copied{}
Mobile interactions are more complex because the user has neither a keyboard nor a mouse to interact with.
However, there are some other sensors on most mobile devices that can help interaction.
One sensor that is particularly useful for 3D interaction on mobile devices is the gyroscope.
We use the gyroscope so that users can rotate the virtual camera by rotating their device.
We also add the possibility to rotate the camera by using touch controls. The user can touch a part of the screen to grab the virtual camera, and drag the camera direction along the two screen axes.
This way, the user is not forced to perform a real-world half-turn to be able to look behind or to point the device towards the sky (which can quickly become tiring) to look up.
These interactions, however, only allow the user to rotate the camera, not to translate it.
For this reason, we display a small joystick on the bottom-left corner of the screen that mimics the interactions of first-person video games and allows the user to translate the camera:
\begin{itemize}
\item moving the joystick up makes the camera move forward;
\item moving the joystick down makes the camera move backwards;
\item moving the joystick on the sides makes the camera move sideways.
\end{itemize}
A screenshot of this interface is displayed in Figure~\ref{sb:mobile}. The virtual joystick is rendered as a black circle inside a larger semi-transparent white circle. The black circle can be moved up, down, and sideways to define the direction in which the camera is translated.
\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{assets/system-bookmarks/screenshots/mobile.png}
\caption{Screenshot of the mobile version, with its joystick on the bottom left corner\label{sb:mobile}}
\end{figure}
\subsection{Segments utility at bookmarked viewpoint\label{sb:utility}}
\copied{}
Introducing bookmarks is a way to make users navigation more predictable.
Indeed, since they are emphasized and, in a way, recommended viewpoints, bookmarks are more likely to be visited by a significant portion of users than any other viewpoint on the scene.
As such, bookmarks can be used as a way to optimize streaming by downloading segments in an optimal, pre-computed order.
More specifically, segment utility as introduced in Section~\ref{d3:utility} is only an approximation of the segment's true contribution to the current viewpoint rendering.
When bookmarks are defined, it is possible to obtain a better measure of segment utility by performing an offline rendering at each bookmark's viewpoint.
% Then, by simply counting the number of pixels that are rendered using each segment, we can rank the segments by order of importance in the rendering.
We define $\mathcal{U}^{*} (s,B_i)$ as being the optimized utility of a segment $s$ in a viewpoint defined at bookmark $B_i$.
\section{Adding bookmarks into DASH NVE framework\label{sb:bookmarks}}
\fresh{}
In order to know the optimized utility of a segment, we developed Algorithm~\ref{sb:algo-optimal-order}, that sorts segments according to their optimized utility.
It takes as input the considered viewpoint, the ground truth from this viewpoint and the set of segments to sort.
It starts with an empty model, individually tries adding each segment from the set of candidates, and computes the PSNR between the corresponding render and the ground truth render.
We can thus determine which segment brings the highest $\Delta\text{PSNR} / s$, $s$ being the size of the segment in bytes.
Once the best segment is found, it is registered, and a new iteration begins.
That way, it is able to generate an order of segments sorted by $\Delta\text{PSNR} / s$.
This order is then saved as a JSON file that a client can download in order to know which segments contribute the most to a certain viewpoint.
While the previously defined interactions allow users to navigate freely throughout the scene, controlling such a high number of degrees of freedom can feel overwhelming to some users. That is why we introduce bookmarks, i.e., widgets that help users reach a distant part of the scene using a single, simple interaction.
\subsection{Bookmark interaction and visual aspect}
In Chapter~\ref{bi} Section~\ref{bi:3d-bookmarks}, we described two 3D widgets that we use to display bookmarks to users.
One of the conclusions of the user study, described in Section~\ref{bi:user-study}, was that the impact of the way we display bookmarks was not significant.
In this work, we chose a slightly different way of representing bookmarks due to some concerns with our original representations:
\begin{itemize}
	\item viewport bookmarks are simple, but users who are not computer vision experts may not be familiar with this type of representation;
\item arrow bookmarks are more intuitive to most users, but need to be regenerated when the camera moves, which can harm the rendering framerate.
\end{itemize}
For these reasons, we changed the bookmark display to a vertical bar textured with a 2D sprite of a pictorial representation of an eye. The use of such a symbol is partly inspired by the cartographic pictograms used to showcase a worthwhile panorama.
This 2D sprite always faces the camera, so that it does not become invisible when the camera views it from the side.
Screenshots of user interfaces with bookmarks are available in Figures~\ref{sb:desktop} and~\ref{sb:mobile}.
The size of the sprite changes over time following a sine function to help the user distinguish what is part of the scene from what is an extra widget.
Since our scene is static, a user knows that a changing object is not part of the scene, but part of the UI\@.
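As an illustration only, such a pulsation could be written as follows, where the base size $\sigma_0$, the relative amplitude $a \in [0, 1)$ and the period $T$ are hypothetical parameters that are not specified in this work:
\[
	\sigma(t) = \sigma_0 \left(1 + a \sin\left(\frac{2 \pi t}{T}\right)\right).
\]
Under these assumptions, the sprite size stays strictly positive and oscillates between $\sigma_0 (1 - a)$ and $\sigma_0 (1 + a)$.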
The other bookmark parameters remain unchanged since Chapter~\ref{bi}: to prevent users from losing context, clicking on a bookmark triggers an automatic, smooth camera displacement that ends up at the bookmarked camera position.
We also display a thumbnail of the bookmark's viewpoint when the mouse hovers over a bookmark.
Such a thumbnail is displayed in Figure~\ref{sb:desktop}.
Note that on mobile, since there is no mouse and thus no pointer, thumbnails are neither downloaded nor displayed.
\begin{algorithm}[th]
\SetKwInOut{Input}{input}
\SetKwInOut{Output}{output}
\SetKwData{BookmarkViewpoint}{bookmark\_viewpoint}
\SetKwData{BookmarkViewpoint}{bookmarked\_viewpoint}
\SetKwData{OptimalOrder}{optimal\_order}
\SetKwData{TotalModel}{total\_model}
\SetKwData{EmptyModel}{empty\_model}
@ -112,7 +93,7 @@ This order is then saved as a JSON file that a client can download in order to k
\SetKwFunction{Render}{render}
\SetKwFunction{Psnr}{psnr}
\Input{The bookmark viewpoint, the ground truth render at this viewpoint, the candidate segments}
\Input{The bookmarked viewpoint, the ground truth render at this viewpoint, the candidate segments}
\Output{The optimal order of the segments}
\OptimalOrder\leftarrow{} []\;
@ -144,6 +125,28 @@ This order is then saved as a JSON file that a client can download in order to k
\caption{Computation of the optimal order of segments from a bookmark\label{sb:algo-optimal-order}}
\end{algorithm}
\subsection{Segments utility at bookmarked viewpoint\label{sb:utility}}
\copied{}
Introducing bookmarks is a way to make user navigation more predictable.
Indeed, since they are emphasized and, in a way, recommended viewpoints, bookmarks are more likely to be visited by a significant portion of users than any other viewpoint in the scene.
As such, bookmarks can be used as a way to optimize streaming by downloading segments in an optimal, pre-computed order.
More specifically, segment utility as introduced in Section~\ref{d3:utility} is only an approximation of the segment's true contribution to the current viewpoint rendering.
When bookmarks are defined, it is possible to obtain a better measure of segment utility by performing an offline rendering at each bookmark's viewpoint.
% Then, by simply counting the number of pixels that are rendered using each segment, we can rank the segments by order of importance in the rendering.
We define $\mathcal{U}^{*} (s,B_i)$ as being the optimized utility of a segment $s$ in a viewpoint defined at bookmark $B_i$.
\fresh{}
In order to compute the optimized utility of a segment, we developed Algorithm~\ref{sb:algo-optimal-order}, which sorts segments according to their optimized utility.
This algorithm takes as input the considered viewpoint, the ground truth rendering from this viewpoint and the set of segments (both geometry and texture) to sort.
Starting from an empty model, each segment from the set of candidates is independently added to the scene, and the PSNR between the corresponding render and the ground truth render is computed.
We can thus determine which segment brings the highest $\Delta\text{PSNR} / s$, $s$ being the size of the segment in bytes.
Once the best segment is found, it is registered, and a new iteration begins.
That way, we are able to generate an order of segments sorted by $\Delta\text{PSNR} / s$.
This order is then saved as a JSON file that a client can download in order to know which segments contribute the most to a certain viewpoint.
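One possible way to formalize an iteration of this greedy selection is sketched below; the notation is introduced here for illustration only and is an assumption, not part of Algorithm~\ref{sb:algo-optimal-order}.
Let $\mathcal{C}$ be the set of candidate segments, $M_k$ the set of segments selected after $k$ iterations (with $M_0 = \emptyset$), $R(M)$ the render of a model $M$ from the bookmarked viewpoint, $G$ the ground truth render, and $|s|$ the size of segment $s$ in bytes.
The segment selected at iteration $k + 1$ would then be
\[
	s_{k+1} = \arg\max_{s \in \mathcal{C} \setminus M_k}
	\frac{\text{PSNR}\bigl(R(M_k \cup \{s\}), G\bigr) - \text{PSNR}\bigl(R(M_k), G\bigr)}{|s|},
	\qquad M_{k+1} = M_k \cup \{s_{k+1}\}.
\]
The resulting sequence $s_1, s_2, \dots$ corresponds to the order that is stored for the bookmark.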
Sorting all the segments from the model would be an excessively time-consuming computation.
To speed up this algorithm, we only sort the 200 best segments, and we choose these segments among a filtered set of candidates.
To find those candidates, we reuse the ideas developed in Chapter~\ref{bi}.
@ -159,8 +162,8 @@ These renderings allow us to know which geometry segment and which texture corre
\end{figure}
Figure~\ref{sb:precomputation} shows how this precomputation improves the quality of rendering.
Each curve represents the PSNR one can obtain by downloading a certain amount of data, and they show that, for the same amount of data downloaded, the optimized order reaches a higher PSNR than the greedy order, which means that its utility metric is more accurate.
This curve is averaged on all the 9 bookmarks of the scene: we decided the locations of the bookmarks and each bookmark faces an interesting object in the scene.
Each curve represents the PSNR one can obtain by downloading a certain amount of data following the greedy policy introduced in Section~\ref{d3:dash-adaptation}. The blue curve, labelled ``Default order'', is obtained by optimizing the utilities as defined in Section~\ref{d3:utility}, whereas the green curve, labelled ``Proposed order'', uses the sorting computed in Algorithm~\ref{sb:algo-optimal-order}. We can observe that, for the same amount of data downloaded, the optimized order reaches a higher PSNR, which means that its utility metric is more accurate.
Note that each curve is averaged over all 9 bookmarks of the scene. These bookmarks are chosen to cover the widest area in the scene, and each one faces a particular object of interest.
\begin{figure}[th]
\centering
@ -192,7 +195,7 @@ This curve is averaged on all the 9 bookmarks of the scene: we decided the locat
\copied{}
We now present how to include bookmarks information in the Media Presentation Description (MPD) file.
Bookmarks are fully defined by a position, a direction, and the additional content needed to properly render and use a bookmark in a system consists in two files: a thumbnail of the point of view at the bookmark, along with the JSON file giving the optimal segment order for this viewpoint.
Bookmarks are fully defined by a position and a direction. The additional content needed to properly render and use a bookmark in a system consists of two files: a thumbnail of the point of view at the bookmark, and the JSON file giving the optimal segment order for this viewpoint, as computed by Algorithm~\ref{sb:algo-optimal-order}.
For this reason, for each bookmark, we create a separate adaptation set in the MPD\@.
The bookmarked viewpoint information is stored as a supplemental property.
Each bookmark adaptation set contains only one representation, composed of two segments: the thumbnail used as a preview for the desktop interface and the JSON file.
@ -225,7 +228,7 @@ The three first values in the supplemental property are the camera position coor
\subsection{Loader modifications}
We build on the loader introduced in Algorithm~\ref{d3:next-segment} to implement a client adaptation logic.
We include a bookmark adaptation logic such that (i) when a bookmark is hovered for the first time, the corresponding images (see Listing~\ref{sb:bookmark-as}) are downloaded, and (ii) when a bookmark is clicked, we switch from utility $\mathcal{U}$ to optimized utility $\mathcal{U}^*$ to determine which segments to download next.
We include a bookmark adaptation logic such that (i) when a bookmark is hovered for the first time, the corresponding thumbnail image as well as the JSON file containing the optimal order of the segments (see Listing~\ref{sb:bookmark-as}) are downloaded, and (ii) when a bookmark is clicked, we switch from utility $\mathcal{U}$ to optimized utility $\mathcal{U}^*$ to determine which segments to download next.
\begin{algorithm}[th]
\SetKwInOut{Input}{input}

View File

@ -3,8 +3,8 @@
\section{Introduction}
In Chapter~\ref{bi}, we described how it is possible to modify a user interface to ease user navigation in a 3D scene, and how the system can benefit from it.
In Chapter~\ref{d3}, we presented a streaming system that takes neither the interface nor the user interaction into account.
Hence, it is natural to study how the user interaction can impact performances of DASH-3D.
In Chapter~\ref{d3}, we presented the DASH-3D streaming system, which depends neither on the interface nor on the user interaction.
In this chapter, we analyze how the user interaction can impact the performance of DASH-3D.
In order to do so, we followed these two steps:
\begin{itemize}

View File

@ -3,18 +3,18 @@
\minitoc{}
\newpage
Nowadays, smartphones are more and more powerful, and people slowly move their applications from their computers to smartphones or tablets.
This is why we decided to port our interface to mobile devices.
Desktop devices and mobile devices are very different.
There are many differences in terms of performance: desktop devices tend to be much more powerful and have much better memory network connection than mobile devices.
Also, the interaction is not comparable in any way: the desktop mostly uses keyboard and mouse, whereas most of the mobile devices only have a touchscreen, as well as many sensors (accelerometer, gyroscope, GPS, etc.).
This is why porting our DASH-3D client to mobile is not an easy task.
The growing capabilities and usage of mobile devices, especially smartphones, are driving a progressive shift of many applications from desktop to mobile devices. In order to be available to and usable by a wider audience, 3D streaming and visualization should also be possible on mobile devices.
However, desktop devices tend to be much more powerful, have a larger memory and better network connections than mobile devices.
In addition, the interactive modalities of these two types of devices are not comparable in any way: the desktop mostly uses keyboard and mouse, whereas most of the mobile devices only have a touchscreen, as well as various additional sensors (accelerometer, gyroscope, GPS, etc.).
For these reasons, using DASH to stream 3D on mobile devices requires specific adaptations, which we describe in this chapter.
To do so, we add some widgets on the screen to support touch interaction: a virtual joystick is displayed on the screen and the user can touch it to translate the camera, instead of using the W, A, S and D keys on a computer.
We add some widgets on the screen to support touch interactions: a virtual joystick is displayed on the screen and the user can touch it to translate the camera, instead of using the W, A, S and D keys on a computer keyboard.
Since most mobile devices embed a gyroscope, we allow users to rotate the camera by physically rotating the device.
This interaction is more precise and intuitive to the user, but it is also more tiring, this is why we also added a touch interaction to rotate the screen: a user can also touch any place on the screen that does not correspond to the joystick to rotate the camera by moving the scene.
In order to ease navigation, we integrate bookmarks back, and we enhance the precomputations explained in Chapter~\ref{sb}.
We then conducted a user study on 18 participants, to test both the interaction and the streaming aspect of the bookmarks.
This interaction is more precise and intuitive to the user, but it is also more tiring. This is why we also added a touch interaction to rotate the camera: a user can ``touch and drag'' at any point on the screen that does not correspond to the joystick.
In order to ease navigation, we reintroduce bookmarks, and we include an enhanced version of the precomputations explained in Chapter~\ref{sb} in the DASH Media Presentation Description.
We then present a user study with 18 participants, which evaluates how users perceive the visual quality of the scene, and how their interactions affect it.
\newpage