
\section{Evaluation}\label{sb:evaluation}
\subsection{Preliminary user study}
Before conducting the user study on mobile devices, we designed a preliminary user study for desktop devices.
This experiment was conducted on twelve users, using the 3D model described in the previous chapter (i.e.\ the Marina Bay district in Singapore).
Bookmarks were sampled from the locations of user-uploaded panoramic pictures available on Google Maps, and the task consisted of matching real-world pictures to their virtual location in the 3D model: users were presented with an image from Google Street View and asked to find the exact same location in the 3D model.
Due to the great difficulty of the task, as well as the relative familiarity of the users with 3D navigation, user behaviour was biased towards navigating slowly in the scene. Users almost never clicked the bookmarks, far less than they did during the experiment we ran in Chapter~\ref{bi}.
For these reasons, we decided to set up a new experiment with a less complex task, and to conduct it exclusively on mobile devices, to see how bookmarks help people navigate a scene when controls are more cumbersome.
\subsection{Mobile navigation user study}
\subsubsection{Models}
In this user study, we display two successive 3D models to the users:
\begin{itemize}
\item For the tutorial phase, we use a model derived from a video game, representing a small scene, in order to prevent users from getting lost in the scene.
\item For all the other parts of the experiment, we use a larger version of the Singaporean district 3D model, which includes neighbouring districts such as the Central Business District.
\end{itemize}
\begin{figure}[ht]
\begin{subfigure}[b]{0.395\textwidth}
\includegraphics[width=\textwidth]{assets/system-bookmarks/models/before.png}
\caption{Model from the previous chapter}
\end{subfigure}
\begin{subfigure}[b]{0.605\textwidth}
\includegraphics[width=\textwidth]{assets/system-bookmarks/models/after.png}
\caption{Extended model}
\end{subfigure}
\caption{Models used in our user studies}
\end{figure}
\subsubsection{Experiment}
The experiment is organized in four phases: a tutorial, a comparison between interfaces with and without bookmarks, a comparison between two streaming policies, and a final navigation during which the user is looking for objects in the scene.
\paragraph{Tutorial}
The experiment starts with a tutorial, to get the users accustomed to the controls and the interface.
This tutorial showcases the different types of interactions available, including bookmarks, and explains how to use them.
\paragraph{Bookmark}
The second part of the experiment consists of two one-minute sessions: the first session displays a bookmark-free interface where the only available interactions are translations and rotations of the camera, and the second one augments the interface with bookmarks.
There are no special tasks other than to navigate around the model.
The part ends with a small questionnaire where users are asked whether they prefer navigating with or without bookmarks, along with a text field to explain their answer.
The main objective of this part of the experiment is not to determine whether people like using the bookmarks: we already know from our previous work and from the other parts of this experiment that they do.
This part most importantly acts as an extended tutorial: the first half trains the users with the basic controls, and the second half trains them to specifically use the bookmarks. This is why we decided not to randomize those two steps at this point.
\paragraph{Streaming}
This part of the experiment also consists of two one-minute sessions, each using a different streaming policy.
One session uses the default greedy policy described in Section~\ref{d3:dash-adaptation}, and the other uses the enhanced policy for bookmarks described in the previous section.
The order of those two sessions is randomized to avoid biases.
Since the behaviours of our streaming policies only differ when the user clicks a bookmark, we design a task where the users perform a guided tour of the scene, with each bookmark being a step of the tour.
The user starts from anywhere in the scene, and one of the bookmarks is blinking.
The user has to touch the blinking bookmark and, upon arriving at the destination, observe the recommended viewpoint for a while.
Once some data has been downloaded and the user has had a chance to judge the quality of the streaming, another bookmark starts blinking to move on with the tour.
This setup is repeated for each streaming policy, and after the two sessions, the users have to answer a questionnaire asking the question \emph{In what session did you find the streaming the smoothest?}
The questionnaire also has a text field for users to explain their answer if they wish.
\paragraph{Free navigation}
The last part of the experiment is a free navigation with an object-finding task.
Diamonds are hidden in the scene, and are invisible until the user is close enough.
The users have to find the diamonds, and they can navigate using the controls and the bookmarks interchangeably.
The loading policy is the default greedy policy for half of the users and the enhanced policy for bookmarks for the other half, with the assignment randomized.
This is the most important part of the study, as we aim at observing several aspects. First, we hope that users navigate using the bookmarks. Since no guideline has been given to them as to how to interact, we want to observe whether they naturally use the bookmarks or not.
In addition, we want to validate the superiority of our bookmark-optimized streaming policy by observing that users tend to perceive a better visual quality (as measured by the PSNR).
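As a reminder, the PSNR is the standard peak signal-to-noise ratio between a rendered frame $I$ and a reference frame $\hat{I}$ of resolution $W \times H$ (the 8-bit dynamic range and the choice of a fully-downloaded rendering as the reference are assumptions here):
\begin{equation*}
\mathit{PSNR} = 10 \log_{10} \frac{255^2}{\mathit{MSE}},
\qquad
\mathit{MSE} = \frac{1}{WH} \sum_{i=1}^{W} \sum_{j=1}^{H} \bigl(I(i,j) - \hat{I}(i,j)\bigr)^2.
\end{equation*}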
\subsubsection{Setup}
During these experiments, we need a server and a client.
The server is hosted on an Acer Aspire V3 with an Intel Core i7 3632QM processor.
The user is given a OnePlus 5 that is connected to the server via Wi-Fi.
There is no artificial bandwidth limitation, since the bandwidth is already limited by the Wi-Fi network and by the performance of the mobile device.
\subsubsection{Participants}
18 users participated in this user study, 15 males and 3 females; the average age is 20.7 with a standard deviation of 0.53.
We only proposed this user study to relatively young people to ensure they are used to mobile devices: this is a requirement for our study, since navigating a 3D scene on a mobile device is hard, and people who are not familiar with it would likely drop out of the experiment.
\subsection{Results}
\begin{figure}[th]
\centering
\begin{tikzpicture}
\begin{axis}[
ylabel=PSNR,
no markers,
width=\tikzwidth,
height=\tikzheight,
cycle list name=mystyle,
legend pos=south east,
xmin=0,
xmax=60,
ymin=0,
name=first plot,
xmajorticks=false,
]
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/second-experiment-0.dat};
\addlegendentry{Greedy policy}
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/second-experiment-1.dat};
\addlegendentry{Optimized policy}
\end{axis}
\begin{axis}[
xlabel=Time (in s),
ylabel=Ratio of clicks,
no markers,
width=\tikzwidth,
height=\tikzhalfheight,
cycle list name=mystyle,
legend pos=south east,
xmin=0,
xmax=60,
ymin=0,
ymax=1,
at=(first plot.south),
anchor=north,
yshift=-0.5cm,
]
\addplot[smooth, color=blue] table [y=y, x=x]{assets/system-bookmarks/final-results/second-experiment-2.dat};
\addplot[smooth, dashed, color=DarkGreen] table [y=y, x=x]{assets/system-bookmarks/final-results/second-experiment-3.dat};
\end{axis}
\end{tikzpicture}
\caption{Comparison of the PSNR during the second experiment: above, PSNR for greedy and greedy optimized for bookmarks; below, ratio of people clicking on a bookmark.\label{sb:psnr-second-experiment}}
\end{figure}
\subsubsection{Qualitative results and observations}
We were able to draw several qualitative observations while users were interacting. First, people tend to use and enjoy using the bookmarks, mostly because they help them navigate the scene. The few people who verbalized that they did not want to use bookmarks most often invoked one of the following reasons:
\begin{itemize}
\item they are comfortable enough with using the virtual joystick;
\item they find the virtual joystick more fun to use.
\end{itemize}
We also observe that the gyroscope-based interaction to rotate the camera tends to be either used a lot or never used: we will not focus on this particular phenomenon, as it is out of the scope of this study, but it would make an interesting Human-Computer Interaction study.
\subsubsection{Quantitative results}
Among the 18 participants of this user study, the answers given by users at the end of the \textbf{streaming} part of the experiment were as follows: 10 indicated that they preferred the optimized policy, 4 preferred the greedy policy, and 4 did not perceive the difference.
One should note that the difference between the two policies can be described in the following terms. The greedy policy tends to favour the largest geometry segments, and as a result the scene structure tends to appear a little faster with this method. On the other hand, because it explicitly uses the PSNR as an objective function, the optimized policy may download important textures (those that appear large on the screen) before some mid-size geometry segments (that, for example, are far from the camera). Some of the users managed to describe these differences precisely.
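The contrast between the two decision rules can be caricatured as follows. This is a minimal illustrative sketch, not the actual implementation: the segment fields and the per-segment PSNR-gain estimate are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    kind: str          # "geometry" or "texture"
    size: int          # size in bytes
    psnr_gain: float   # estimated PSNR improvement if downloaded

def greedy_pick(candidates):
    """Caricature of the greedy policy: favour large geometry segments."""
    geometry = [s for s in candidates if s.kind == "geometry"]
    return max(geometry or candidates, key=lambda s: s.size)

def optimized_pick(candidates):
    """Caricature of the optimized policy: maximise PSNR gain per byte."""
    return max(candidates, key=lambda s: s.psnr_gain / s.size)

candidates = [
    Segment("geometry", 900_000, 0.2),  # large, but far from the camera
    Segment("texture",  300_000, 1.5),  # large on screen, big visual impact
]
print(greedy_pick(candidates).kind)     # the greedy rule picks the geometry
print(optimized_pick(candidates).kind)  # the optimized rule picks the texture
```

On the same candidate set, the two rules thus make opposite choices, which matches the difference some users described.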
Figure~\ref{sb:psnr-second-experiment} shows the evolution of the PSNR along time during the second experiment (bookmark guided tour), averaged over all users.
Below the PSNR curve is a curve that shows how many users were moving to or staying at a bookmark position at each point in time.
As we can see, the two policies have a similar performance at the beginning, when very few users have clicked bookmarks.
This changes after 10 seconds, when most users have started clicking on bookmarks: a performance gap opens, with the optimized policy performing better than the greedy policy. Such a gap is natural to observe, as it reflects the fact that the optimized policy makes better decisions in terms of PSNR (as previously shown in Figure~\ref{sb:precomputation}). This probably explains the previous result, in which users tend to prefer the optimized policy.
Figure~\ref{sb:psnr-second-experiment-after-click} shows the PSNR evolution after a click on a bookmark, averaged over all users and all clicks on bookmarks.
To compute these curves, we isolated the ten seconds following each click on a bookmark and averaged all the corresponding PSNR curves.
These curves isolate the effect of our optimized policy and show the difference a user can perceive when clicking on a bookmark.
Figures~\ref{sb:psnr-third-experiment} and~\ref{sb:psnr-third-experiment-after-click} represent the same curves on the third experiment (free navigation).
On average, the difference in terms of PSNR is less obvious, and both strategies seem to perform the same way, at least during the first 50 seconds of the experiment. The optimized policy performs slightly better than the greedy policy in the end, which can be correlated with a peak in bookmark use occurring around the 50th second.
Figure~\ref{sb:psnr-third-experiment-after-click} also shows an interesting effect: the optimized policy still performs much better after a click on a bookmark, but the two curves converge to the same PSNR value after 9 seconds. This is largely task-dependent: users are encouraged to observe the scene in experiment 2, while they are encouraged to visit as much of the scene as possible in experiment 3. On average, users therefore spend less time at a bookmarked point of view in the third experiment than in the second.
The most interesting fact is that on the last part of the experiment (the free navigation), the average number of clicks on bookmarks is 3 for users having the greedy policy, and 5.3 for users having the optimized policy.
The p-value for the statistical significance of this observed difference is 0.06, which falls just short of the conventional 0.05 threshold: we cannot firmly conclude, but the data suggests that a policy optimized for bookmarks could lead users to click on bookmarks more.
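The thesis text does not state which statistical test produced this p-value; as an illustration only, a one-sided permutation test on the per-user click counts from Table~\ref{sb:table-bookmark-clicks} can be run with the Python standard library alone:

```python
import random

# Per-user bookmark click counts (Table~sb:table-bookmark-clicks)
greedy    = [4, 1, 1, 1, 3, 3, 1, 7, 6]    # mean 3.0
optimized = [3, 5, 2, 5, 10, 7, 6, 4, 6]   # mean 5.33

def permutation_test(a, b, n_perm=100_000, seed=0):
    """One-sided permutation test on the difference of sample means."""
    rng = random.Random(seed)
    observed = sum(b) / len(b) - sum(a) / len(a)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (sum(pooled[len(a):]) / len(b)
                - sum(pooled[:len(a)]) / len(a))
        if diff >= observed:
            hits += 1
    return hits / n_perm

obs = sum(optimized) / len(optimized) - sum(greedy) / len(greedy)
p = permutation_test(greedy, optimized)
print(f"observed difference: {obs:.2f}, p = {p:.3f}")
```

The exact p-value obtained this way may differ from the reported 0.06, since the original test is unspecified; the sketch only shows how such a significance level can be estimated from the raw counts.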
\begin{table}[th]
\centering
\begin{tabular}{ccccccccccc}
\toprule \textbf{Policy} & \multicolumn{9}{c}{\textbf{Number of clicks}} & \textbf{Average} \\
\midrule Greedy & 4 & 1 & 1 & 1 & 3 & 3 & 1 & 7 & 6 & \textbf{3}\\
Bookmark & 3 & 5 & 2 & 5 & 10 & 7 & 6 & 4 & 6 & \textbf{5.33}\\ \bottomrule
\end{tabular}
\caption{Number of clicks on bookmarks in the last experiment\label{sb:table-bookmark-clicks}}
\end{table}
Table~\ref{sb:table-bookmark-clicks} gives the number of bookmark clicks for each user (note that distinct users did this experiment with the greedy and the optimized policy). As we can see, all users clicked at least once on a bookmark in this experiment, regardless of the policy they experienced. However, in the greedy policy setup, 4 users clicked on only one bookmark, whereas in the optimized policy setup, only one user clicked on fewer than three bookmarks.
Everything happens as if users were encouraged to click on bookmarks by the optimized policy, or at least as if some users were discouraged from clicking on bookmarks by the greedy policy.
\begin{figure}[th]
\centering
\begin{tikzpicture}
\begin{axis}[
xlabel=Time (in s),
ylabel=PSNR,
no markers,
width=\tikzwidth,
height=\tikzheight,
cycle list name=mystyle,
legend pos=south east,
xmin=0,
xmax=10,
]
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/second-experiment-after-clicks-0.dat};
\addlegendentry{Greedy policy}
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/second-experiment-after-clicks-1.dat};
\addlegendentry{Optimized policy}
\end{axis}
\end{tikzpicture}
\caption{Comparison of the PSNR after a click on a bookmark during the second experiment\label{sb:psnr-second-experiment-after-click}}
\end{figure}
\begin{figure}[th]
\centering
\begin{tikzpicture}
\begin{axis}[
ylabel=PSNR,
no markers,
width=\tikzwidth,
height=\tikzheight,
cycle list name=mystyle,
legend pos=south east,
xmin=0,
xmax=100,
ymin=0,
name=first plot,
xmajorticks=false,
]
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/third-experiment-0.dat};
\addlegendentry{Greedy policy}
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/third-experiment-1.dat};
\addlegendentry{Optimized policy}
\end{axis}
\begin{axis}[
xlabel=Time (in s),
ylabel=Ratio of clicks,
no markers,
width=\tikzwidth,
height=\tikzhalfheight,
cycle list name=mystyle,
legend pos=south east,
xmin=0,
xmax=100,
ymin=0,
ymax=1,
at=(first plot.south),
anchor=north,
yshift=-0.5cm,
]
\addplot[smooth, color=blue] table [y=y, x=x]{assets/system-bookmarks/final-results/third-experiment-2.dat};
\addplot[smooth, dashed, color=DarkGreen] table [y=y, x=x]{assets/system-bookmarks/final-results/third-experiment-3.dat};
\end{axis}
\end{tikzpicture}
\caption{Comparison of the PSNR during the third experiment: above, PSNR for greedy and greedy optimized for bookmarks; below, ratio of people clicking on a bookmark.\label{sb:psnr-third-experiment}}
\end{figure}
\begin{figure}[th]
\centering
\begin{tikzpicture}
\begin{axis}[
xlabel=Time (in s),
ylabel=PSNR,
no markers,
width=\tikzwidth,
height=\tikzheight,
cycle list name=mystyle,
legend pos=south east,
xmin=0,
xmax=10,
]
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/third-experiment-after-clicks-0.dat};
\addlegendentry{Greedy policy}
\addplot table [y=y, x=x]{assets/system-bookmarks/final-results/third-experiment-after-clicks-1.dat};
\addlegendentry{Optimized policy}
\end{axis}
\end{tikzpicture}
\caption{Comparison of the PSNR after a click on a bookmark during the third experiment\label{sb:psnr-third-experiment-after-click}}
\end{figure}