Working on new chapter

This commit is contained in:
Thomas Forgione 2023-06-12 17:32:45 +02:00
parent a0608e9e83
commit b436c50bde
5 changed files with 305 additions and 0 deletions

@ -26,6 +26,8 @@
#include "dash-3d/main.typ"
#include "mobile/main.typ"
#pagebreak()
#bibliography("bib.bib", style: "chicago-author-date")

257
mobile/bookmark.typ Normal file

@ -0,0 +1,257 @@
== Desktop and mobile interactions<m:interaction>
=== Desktop interaction
On desktop, we keep the interaction described in Section~\ref{bi:our-nve}, namely:
- W, A, S and D keys to translate the camera;
- mouse motions to rotate the camera.
A screenshot of this interface is displayed in @m:desktop.
#figure(
image("../assets/system-bookmarks/screenshots/desktop.png", width: 100%),
caption: [Screenshot of the desktop version, with a bookmark and its thumbnail in the bottom-left corner, and three bookmarks],
)<m:desktop>
=== Mobile interaction
Mobile interactions are more complex because the user does not have a keyboard and mouse to interact with.
However, most mobile devices embed other sensors that can help interaction.
One sensor that is particularly useful for 3D interaction on mobile devices is the gyroscope.
We use the gyroscope to let users rotate the virtual camera by rotating their device.
We also add the possibility to rotate the camera by using touch controls. The user can touch a part of the screen to grab the virtual camera, and drag the camera direction along the two screen axes.
This way, the user is not forced to perform a real-world half-turn to be able to look behind or to point the device towards the sky (which can quickly become tiring) to look up.
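As an illustration of the touch control described above, the following minimal sketch converts a drag measured in pixels into a camera rotation. The sensitivity value, the sign convention (the scene follows the finger) and the pitch limit are assumptions made for this example, not values taken from our implementation.

```python
import math

# Hypothetical sensitivity: radians of rotation per pixel of drag.
RADIANS_PER_PIXEL = 0.005
# Keep the pitch strictly inside (-90 deg, 90 deg) so the camera never flips over.
MAX_PITCH = math.radians(89.0)


def apply_drag(yaw, pitch, dx, dy, sensitivity=RADIANS_PER_PIXEL):
    """Return the new (yaw, pitch) in radians after a drag of (dx, dy) pixels."""
    # A horizontal drag turns the camera around the vertical axis.
    yaw = (yaw - dx * sensitivity) % (2.0 * math.pi)
    # A vertical drag tilts the camera, within the clamped pitch range.
    pitch = max(-MAX_PITCH, min(MAX_PITCH, pitch - dy * sensitivity))
    return yaw, pitch
```

Clamping the pitch keeps the horizon upright, which is the usual choice for first-person cameras.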
These interactions, however, only allow the user to rotate the camera but not translate it.
For this reason, we display a small joystick in the bottom-left corner of the screen that mimics the interactions of first-person video games and allows the user to travel through the scene:
- moving the joystick up makes the camera move forward;
- moving the joystick down makes the camera move backwards;
- moving the joystick on the sides makes the camera move sideways.
A screenshot of this interface is displayed in @m:mobile. The virtual joystick is rendered as a black circle inside a larger semi-transparent white circle. The black circle can be moved up, down, and sideways to define the direction in which the camera is translated.
#figure(
image("../assets/system-bookmarks/screenshots/mobile.png", width: 100%),
caption: [Screenshot of the mobile version, with its joystick in the bottom-left corner],
)<m:mobile>
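The joystick behaviour can be sketched as follows. This is only an illustration: the function name, the convention that a yaw of zero looks along the negative z axis, and the `speed` and `dt` parameters are assumptions introduced for the example.

```python
import math


def joystick_to_translation(jx, jy, yaw, speed, dt):
    """Map a joystick offset to a camera displacement (dx, dz) on the ground plane.

    jx, jy: joystick offset in [-1, 1], right and up being positive.
    yaw: camera heading in radians (assumed: 0 looks along -z, positive turns left).
    speed: travel speed in scene units per second; dt: frame duration in seconds.
    """
    # Clamp the offset to the unit disc so that diagonals are not faster.
    norm = math.hypot(jx, jy)
    if norm > 1.0:
        jx, jy = jx / norm, jy / norm
    # Forward and right directions of the camera, projected on the ground.
    forward = (-math.sin(yaw), -math.cos(yaw))
    right = (math.cos(yaw), -math.sin(yaw))
    dx = (jy * forward[0] + jx * right[0]) * speed * dt
    dz = (jy * forward[1] + jx * right[1]) * speed * dt
    return dx, dz
```

Pushing the joystick up moves the camera along its forward direction, pushing it sideways moves the camera along its right direction, and intermediate positions blend the two.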
== Adding bookmarks to the DASH NVE framework<m:bookmarks>
While the previously defined interactions allow users to navigate freely throughout the scene, controlling such a high number of degrees of freedom can feel overwhelming to some users. That is why we introduce bookmarks, i.e. widgets that help users reach a distant part of the scene with a single, simple interaction.
=== Bookmark interaction and visual aspect
In Chapter~\ref{bi} Section~\ref{bi:3d-bookmarks}, we described two 3D widgets that we use to display bookmarks to users.
One of the conclusions of the user study, described in Section~\ref{bi:user-study}, was that the way we display bookmarks has no significant impact.
In this work, we chose a slightly different way of representing bookmarks due to some concerns with our original representations:
- viewport bookmarks are simple, but users who are not computer vision experts may not be familiar with this type of representation;
- arrow bookmarks are more intuitive to most users, but need to be regenerated when the camera moves, which can harm the rendering framerate.
For these reasons, we changed the bookmark display to a vertical bar textured with a 2D sprite of a pictorial representation of an eye. The use of such a symbol is partly inspired by the cartographic pictograms used to indicate a worthwhile panorama.
This 2D sprite always faces the camera, so that it remains visible even when the camera is to the side of it.
Screenshots of user interfaces with bookmarks are available in @m:desktop and @m:mobile.
The size of the sprite changes over time following a sine function to help the user distinguish what is part of the scene from what is an extra widget.
Since our scene is static, a user knows that a changing object is not part of the scene, but part of the UI.
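The pulsing of the sprite can be sketched as a scale factor that oscillates around a base size. The amplitude and period below are hypothetical values chosen for the example, not the ones used in our interface.

```python
import math


def sprite_scale(t, base=1.0, amplitude=0.2, period=2.0):
    """Scale factor of the bookmark sprite at time t (in seconds).

    The scale oscillates between base * (1 - amplitude) and
    base * (1 + amplitude), completing one cycle every `period` seconds.
    """
    return base * (1.0 + amplitude * math.sin(2.0 * math.pi * t / period))
```

With these values, the sprite grows and shrinks by 20% around its base size every two seconds.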
The other bookmark parameters remain unchanged from @bi. To prevent users from losing context, clicking on a bookmark triggers an automatic, smooth camera displacement that ends at the bookmarked camera position.
We also display a thumbnail of the bookmark's viewpoint when the mouse hovers over a bookmark.
Such a thumbnail is displayed in @m:desktop.
Note that, since there is no mouse and thus no pointer on mobile, thumbnails are not used in the mobile setting.
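To illustrate the smooth camera displacement triggered by a click, the sketch below interpolates the camera position with a smoothstep easing. Both the easing function and the straight-line path are assumptions made for this example; the actual motion may follow a different curve.

```python
def smoothstep(u):
    """Ease-in, ease-out weight for u clamped to [0, 1]."""
    u = max(0.0, min(1.0, u))
    return u * u * (3.0 - 2.0 * u)


def interpolate_camera(start, end, elapsed, duration):
    """Camera position `elapsed` seconds after the click.

    start, end: 3D positions as tuples; duration: length of the motion in seconds.
    """
    w = smoothstep(elapsed / duration)
    return tuple(a + (b - a) * w for a, b in zip(start, end))
```

The same weight can be applied to the camera target so that position and orientation reach the bookmarked viewpoint together.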
// \begin{algorithm}[th]
// \SetKwInOut{Input}{input}
// \SetKwInOut{Output}{output}
//
// \SetKwData{BookmarkViewpoint}{bookmarked\_viewpoint}
// \SetKwData{OptimalOrder}{optimal\_order}
// \SetKwData{TotalModel}{total\_model}
// \SetKwData{EmptyModel}{empty\_model}
// \SetKwData{I}{i}
// \SetKwData{BestSegment}{best\_segment}
// \SetKwData{BestPsnr}{best\_delta\_psnr}
// \SetKwData{PreviousPsnr}{previous\_psnr}
// \SetKwData{None}{none}
// \SetKwData{CurrentModel}{current\_model}
// \SetKwData{EmptyRender}{empty\_render}
// \SetKwData{CurrentRender}{current\_render}
// \SetKwData{GroundTruthRender}{ground\_truth\_render}
// \SetKwData{CurrentPsnr}{current\_psnr}
// \SetKwData{CurrentDeltaPsnr}{current\_delta\_psnr}
// \SetKwData{Segment}{segment}
// \SetKwData{Candidates}{candidates}
// \SetKwFunction{Render}{render}
// \SetKwFunction{Psnr}{psnr}
//
// \Input{The bookmarked viewpoint, the ground truth render at this viewpoint, the candidate segments}
// \Output{The optimal order of the segments}
//
// \OptimalOrder\leftarrow{} []\;
// \EmptyRender\leftarrow\Render{\EmptyModel,\BookmarkViewpoint}\;
// \PreviousPsnr\leftarrow\Psnr{\EmptyRender,\GroundTruthRender}\;
//
// \TotalModel\leftarrow\EmptyModel\;
//
// \For{\I\in0\ldots200}{%
//
// \BestSegment\leftarrow\None\;
// \BestPsnr\leftarrow0\;
//
// \For{\Segment\in\Candidates}{%
//
// \CurrentModel\leftarrow\TotalModel\cup\Segment\;
// \CurrentRender\leftarrow\Render{\CurrentModel,\BookmarkViewpoint}\;
// \CurrentPsnr\leftarrow\Psnr{\CurrentRender,\GroundTruthRender}\;
// \CurrentDeltaPsnr\leftarrow\CurrentPsnr$-$\PreviousPsnr\;
//
// \If{\CurrentDeltaPsnr$>$\BestPsnr}{%
// \BestSegment\leftarrow\Segment\;
// \BestPsnr\leftarrow\CurrentDeltaPsnr\;
// }
// }
// \OptimalOrder\leftarrow\OptimalOrder+\BestSegment\;
// \TotalModel\leftarrow\TotalModel\cup\BestSegment\;
// }
// \caption{Computation of the optimal order of segments from a bookmark\label{sb:algo-optimal-order}}
// \end{algorithm}
=== Segments utility at bookmarked viewpoint<m:utility>
Introducing bookmarks is a way to make user navigation more predictable.
Indeed, since they are emphasized and, in a way, recommended viewpoints, bookmarks are more likely to be visited by a significant portion of users than any other viewpoint in the scene.
As such, bookmarks can be used as a way to optimize streaming by downloading segments in an optimal, precomputed order.
More specifically, segment utility as introduced in Section~\ref{d3:utility} is only an approximation of the segment's true contribution to the current viewpoint rendering.
When bookmarks are defined, it is possible to obtain a better measure of segment utility by performing an offline rendering at each bookmark's viewpoint.
// Then, by simply counting the number of pixels that are rendered using each segment, we can rank the segments by order of importance in the rendering.
We define $cal(U)^* (s,B_i)$ as being the optimized utility of a segment $s$ in a viewpoint defined at bookmark $B_i$.
In order to compute the optimized utility of a segment, we developed Algorithm~\ref{sb:algo-optimal-order}, which sorts segments according to their optimized utility.
This algorithm takes as input the considered viewpoint, the ground truth rendering from this viewpoint and the set of segments (both geometry and texture) to sort.
Starting from an empty model, each segment from the set of candidates is independently added to the scene, and the PSNR between the corresponding render and the ground truth render is computed.
We can thus determine which segment brings the highest $Delta"PSNR" slash s$, $s$ being the size of the segment in bytes.
Once the best segment is found, it is appended to the order and added to the model, and a new iteration begins.
That way, we are able to generate an order of segments sorted by $Delta"PSNR" slash s$.
This order is then saved as a JSON file that a client can download in order to know which segments contribute the most to a certain viewpoint.
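The greedy procedure described above can be sketched as follows. This is a simplified illustration rather than our actual implementation: `render` is a hypothetical callable standing in for the offline renderer, images are flat sequences of pixel values, and the loop stops early once the rendering matches the ground truth exactly.

```python
import math


def psnr(render, reference, peak=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(render, reference)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def greedy_order(candidates, sizes, render, reference, limit=200):
    """Order segments by decreasing PSNR gain per byte.

    candidates: iterable of sortable segment identifiers.
    sizes: dict mapping a segment identifier to its size in bytes.
    render: hypothetical callable returning the pixels obtained from a
            set of segments at the bookmarked viewpoint.
    reference: ground truth pixels at the bookmarked viewpoint.
    """
    remaining = set(candidates)
    model = set()
    order = []
    previous = psnr(render(model), reference)
    # Stop when nothing is left, the limit is reached, or the render is perfect.
    while remaining and len(order) < limit and previous < math.inf:
        best, best_gain, best_psnr = None, -math.inf, previous
        # Iterate in sorted order so that ties are broken deterministically.
        for segment in sorted(remaining):
            current = psnr(render(model | {segment}), reference)
            gain = (current - previous) / sizes[segment]
            if gain > best_gain:
                best, best_gain, best_psnr = segment, gain, current
        order.append(best)
        model.add(best)
        remaining.remove(best)
        previous = best_psnr
    return order
```

Each iteration renders the model once per remaining candidate, which is why the number of candidates and the number of iterations both need to be bounded.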
Sorting all the segments of the model would be excessively time-consuming.
To speed up this algorithm, we only sort the 200 best segments, and we choose these segments among a filtered set of candidates.
To find those candidates, we reuse the ideas developed in Chapter~\ref{bi}.
We render the "pixel to geometry segment" and "pixel to texture" maps, as shown in @m:bookmarks-utility.
These renderings allow us to know which geometry segment and which texture correspond to each pixel, and filter out useless candidates.
#figure(
grid(
columns: (1fr, 0.1fr, 1fr, 0.1fr, 1fr),
image("../assets/system-bookmarks/bookmark/ground-truth.png", width: 100%),
[],
image("../assets/system-bookmarks/bookmark/geometry.png", width: 100%),
[],
image("../assets/system-bookmarks/bookmark/texture.png", width: 100%),
),
caption: [A bookmarked viewpoint (left), a pixel to geometry segment map (center), and a pixel to texture map (right)]
)<m:bookmarks-utility>
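The filtering step can be sketched as follows, assuming that each map is available as a 2D array holding one segment identifier per pixel. The function name, the `background` marker and the `min_pixels` threshold are hypothetical and only serve the illustration.

```python
from collections import Counter


def visible_candidates(id_map, background=None, min_pixels=1):
    """Return the identifiers that cover at least `min_pixels` pixels.

    id_map: 2D list holding, for each pixel, the identifier of the segment
            rendered there, or `background` where nothing is rendered.
    """
    counts = Counter(sid for row in id_map for sid in row if sid != background)
    return {sid for sid, n in counts.items() if n >= min_pixels}
```

Segments that do not appear in either map cannot change the rendering at the bookmarked viewpoint and can therefore be left out of the sorting.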
Figure~\ref{sb:precomputation} shows how this precomputation improves the quality of rendering.
Each curve represents the PSNR one can obtain by downloading a certain amount of data following a streaming policy.
The blue curve, labelled "Default order", is obtained by optimizing the utilities as defined in Section~\ref{d3:utility}, whereas the green curve labelled "Proposed order" uses the sorting computed in Algorithm~\ref{sb:algo-optimal-order}.
We can observe that, for the same amount of data downloaded, the optimized order reaches a higher PSNR, which means that its utility metric is more accurate.
Note that this curve is averaged over all 9 bookmarks of the scene. These bookmarks are chosen to cover the widest area in the scene, and each one faces a particular object of interest.
// \begin{figure}[th]
// \centering
// \begin{tikzpicture}
// \begin{axis}[
// xlabel=Data downloaded (in B),
// ylabel=PSNR,
// no markers,
// cycle list name=mystyle,
// width=\tikzwidth,
// height=\tikzheight,
// legend pos=south east,
// xmin=0,
// ]
//
// \addplot table [x=x, y=y]{assets/system-bookmarks/precomputation/greedy.dat};
// \addlegendentry{\scriptsize Default order $cal(U)$}
// \addplot table [x=x, y=y]{assets/system-bookmarks/precomputation/precomputed.dat};
// \addlegendentry{\scriptsize Proposed order $cal(U)^*$}
//
// \end{axis}
// \end{tikzpicture}
// \caption{Impact of using the precomputed information of bookmarks to select segments to download\label{sb:precomputation}}
// \end{figure}
=== MPD modification
We now present how to include bookmarks information in the Media Presentation Description (MPD) file.
Bookmarks are fully defined by a position, a direction, and the additional content needed to properly render and use a bookmark in a system.
This additional data consists of two files: a thumbnail of the point of view at the bookmark, and the JSON file giving the optimal segment order for this viewpoint, as computed by Algorithm~\ref{sb:algo-optimal-order}.
For this reason, for each bookmark, we create a separate adaptation set in the MPD.
The bookmarked viewpoint information is stored as a supplemental property.
Each bookmark adaptation set contains only one representation, composed of two segments: the thumbnail used as a preview for the desktop interface and the JSON file.
#figure(
align(left,
raw(
read("../assets/system-bookmarks/bookmark-as.xml"),
block: true,
lang: "xml"
),
),
caption: [MPD description of a bookmark adaptation set],
)<m:bookmark-as>
An example of a bookmark adaptation set is shown in @m:bookmark-as.
The first three values in the supplemental property are the camera position coordinates, and the last three values are the target point coordinates.
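As an illustration, a client could read the six values as sketched below. The XML fragment is hypothetical: the scheme URI and the comma-separated layout of the `value` attribute are assumptions made for this example, and only the ordering (position first, then target) comes from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the scheme URI and the separator are assumptions.
FRAGMENT = """
<AdaptationSet>
  <SupplementalProperty schemeIdUri="urn:example:bookmark"
                        value="1.0,2.0,3.0,4.0,5.0,6.0"/>
</AdaptationSet>
"""


def parse_bookmark(adaptation_set):
    """Return (position, target) read from a bookmark adaptation set element."""
    prop = adaptation_set.find("SupplementalProperty")
    values = [float(v) for v in prop.get("value").split(",")]
    if len(values) != 6:
        raise ValueError("expected six numbers: position, then target")
    # The first three values are the position, the last three the target.
    return tuple(values[:3]), tuple(values[3:])


position, target = parse_bookmark(ET.fromstring(FRAGMENT))
```

A real MPD declares an XML namespace, so the element lookup would need to be namespace-aware.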
=== Loader modifications
We build on the loader introduced in Algorithm~\ref{d3:next-segment} to implement a client adaptation logic.
We include a bookmark adaptation logic such that (i) when a bookmark is hovered for the first time, the corresponding thumbnail image as well as the JSON file containing the optimal order of the segments (see @m:bookmark-as) are downloaded, and (ii) when a bookmark is clicked, we switch from utility $cal(U)$ to optimized utility $cal(U)^*$ to determine which segments to download next.
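The two rules above can be sketched as a small wrapper around the existing loader. The class and the two callables are hypothetical hooks introduced for this example, the thumbnail download is omitted for brevity, and falling back to the default utility once the precomputed order is exhausted is an assumption.

```python
class BookmarkLoader:
    """Chooses the next segment, taking hovered and clicked bookmarks into account."""

    def __init__(self, fetch_order, next_by_utility):
        # fetch_order(bookmark_id): hypothetical hook that downloads the JSON
        # file of a bookmark and returns its precomputed segment order.
        # next_by_utility(downloaded): hypothetical hook for the default policy.
        self._fetch_order = fetch_order
        self._next_by_utility = next_by_utility
        self._orders = {}
        self._active = None

    def on_hover(self, bookmark_id):
        # (i) Download the bookmark data only the first time it is needed.
        if bookmark_id not in self._orders:
            self._orders[bookmark_id] = list(self._fetch_order(bookmark_id))

    def on_click(self, bookmark_id):
        # On mobile there is no hover, so the data may still have to be fetched.
        self.on_hover(bookmark_id)
        # (ii) Switch to the optimized utility of the clicked bookmark.
        self._active = bookmark_id

    def next_segment(self, downloaded):
        if self._active is not None:
            for segment in self._orders[self._active]:
                if segment not in downloaded:
                    return segment
            # Assumption: once the order is exhausted, use the default policy.
            self._active = None
        return self._next_by_utility(downloaded)
```

Caching the order per bookmark means that hovering over or clicking the same bookmark again does not trigger a new download.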
// \begin{algorithm}[th]
// \SetKwInOut{Input}{input}
// \SetKwInOut{Output}{output}
//
// \SetKw{Continue}{continue}
// \SetKwData{Bw}{bw\_estimation}
// \SetKwData{Rtt}{rtt\_estimation}
// \SetKwData{CurrentSegment}{segment}
// \SetKwData{Segment}{best\_segment}
// \SetKwData{Candidates}{candidates}
// \SetKwData{AllSegments}{all\_segments}
// \SetKwData{DownloadedSegments}{downloaded\_segments}
// \SetKwData{Frustum}{frustum}
// \SetKwFunction{Argmax}{argmax}
// \SetKwFunction{Filter}{filter}
// \SetKwFunction{EstimateNetwork}{estimate\_network\_parameters}
// \SetKwFunction{Append}{append}
//
// \Input{Current index $i$, time $t_i$, viewpoint $v(t_i)$, buffer of already downloaded \texttt{segments}, MPD, utility metric $cal(U)$, streaming policy $\Omega$}
// \Output{Next segment to request, updated buffer}
// \BlankLine{}
//
// \uIf{bookmark clicking}{%
// \uIf{not optimal order fetched}{%
// \Return{} optimal order segment\;
// }
// \Else{%
// \Return{} next segment\;
// }
// }
// \Else{%
// \tcc{Loading policy from previous chapter}
// (\Bw, \Rtt) \leftarrow{} \EstimateNetwork{}\;
//
// \BlankLine{}
// \Candidates\leftarrow{} \AllSegments\newline\makebox[1cm]{}.\Filter{$\CurrentSegment\rightarrow\CurrentSegment\notin\DownloadedSegments$}\newline\makebox[1cm]{}.\Filter{$\CurrentSegment\rightarrow\CurrentSegment\in\Frustum$}\;
// \BlankLine{}
// \Segment\leftarrow{} \Argmax{\Candidates, \CurrentSegment\rightarrow{} $\Omega\left(cal(U),\CurrentSegment\right)$}\;
// \DownloadedSegments.\Append{\Segment}\;
// \Return\Segment;
// {\caption{Algorithm to identify the next segment to query\label{sb:next-segment}}}
//
// }
// \end{algorithm}
//

9
mobile/conclusion.typ Normal file

@ -0,0 +1,9 @@
== Conclusion
In this chapter, our objective was to propose a mobile interface for DASH-3D and to reintegrate the interaction aspects that we developed in @bi.
// We have seen that doing so is not trivial, and many improvements have been made.
For aesthetic and performance reasons, the UI of the bookmarks has changed, and new interactions were proposed for free navigation in the 3D scene.
We developed an algorithm that computes, offline, a better order of segments for bookmarks than the one a greedy policy would produce.
We encoded this optimal order in a JSON file, modified our MPD to give the client metadata about bookmarks, and modified our client implementation to benefit from it.
We then conducted a user study with 18 participants, in which users had to navigate in scenes with bookmarks under various streaming policies.
The results indicate that users prefer the optimized version of the client streaming policy, which is consistent with the PSNR values that we computed. The results also show that users who enjoy an optimized policy tend to use the bookmarks more.

13
mobile/introduction.typ Normal file

@ -0,0 +1,13 @@
== Introduction
In @bi, we described how it is possible to modify a user interface to ease user navigation in a 3D scene, and how the system can benefit from it.
In @d3, we presented the DASH-3D streaming system, which does not depend on the interface nor on the user interaction.
In this chapter, we analyze how user interaction can impact the performance of DASH-3D.
To do so, we follow two steps based on our DASH framework:
- we design an interface that allows users to navigate in a 3D scene on both desktop and mobile devices;
- we improve and adapt the bookmarks described in @bi to the context of DASH-3D and to mobile interaction.
In @m:interaction, we present the different choices we made for the interfaces, and we describe the new mobile interface.
In @m:bookmarks, we describe how we embed the bookmarks into our DASH framework, and how we precompute data in order to improve the user's quality of experience.
In Section~\ref{sb:evaluation}, we describe the user study we conducted, present the data we collected, and analyse the results.

24
mobile/main.typ Normal file

@ -0,0 +1,24 @@
#import "../template.typ"
#template.beforeChapter()
= Bookmarks for DASH-3D on mobile devices<m>
#template.afterNumberedChapter()
The growing capabilities and usage of mobile devices, especially smartphones, are driving a progressive shift of many applications from desktop to mobile devices. In order to be available to and usable by a wider audience, 3D streaming and visualization should also be possible on mobile devices.
However, desktop devices tend to be much more powerful, and to have more memory and better network connections than mobile devices.
In addition, the interaction modalities of these two types of devices are very different: desktops mostly rely on a keyboard and mouse, whereas most mobile devices only have a touchscreen, as well as various additional sensors (accelerometer, gyroscope, GPS, etc.).
For these reasons, using DASH to stream 3D content on mobile devices requires specific adaptations, which we describe in this chapter.
We add some widgets on the screen to support touch interactions: a virtual joystick is displayed on the screen and the user can touch it to translate the camera, instead of using the W, A, S and D keys on a computer keyboard.
Since most mobile devices embed a gyroscope, we allow users to rotate the camera by physically rotating the device.
This interaction is more precise and intuitive, but it is also more tiring. This is why we also added a touch interaction: a user can "touch and drag" any point on the screen outside the joystick to rotate the camera.
In order to ease navigation, we reintegrate bookmarks, and we propose an enhanced version of the precomputations explained in @d3 that we encode in the DASH Media Presentation Description.
We then present a user study with 18 participants, which evaluates how users perceive the visual quality of the scene, and how their interactions affect it.
#pagebreak()
#include("introduction.typ")
#include("bookmark.typ")
#include("conclusion.typ")