\section{Implementation details}
We developed a significant amount of software during this thesis, and we chose the programming languages according to what each piece of software had to achieve.
When it comes to 3D streaming systems, we need two kinds of software:
\begin{itemize}
\item \textbf{Interactive applications} which can run on as many devices as possible so we can easily conduct user studies. For this context, we chose the \textbf{JavaScript} language.% , since it can run on many devices and it has great support for WebGL\@.
\item \textbf{Native applications} which can run fast on desktop devices, in order to prepare data, run simulations and evaluate our ideas. For this context, we chose the \textbf{Rust} language.% , which is a somewhat recent language that provides both the efficiency of C and C++ and the safety of functional languages.
\end{itemize}
\subsection{JavaScript}
\paragraph{THREE.js.}
In web browsers, it is now possible to perform 3D rendering using WebGL\@.
However, WebGL is very low level and it can be painful to write code, even to render a simple triangle.
For example, \href{https://www.tutorialspoint.com/webgl/webgl_drawing_a_triangle.htm}{this tutorial}'s code contains 121 lines of JavaScript, 46 being code (not comments or empty lines) to render a simple, non-textured triangle.
For this reason, it seems unreasonable to build a system like the one we are describing in raw WebGL\@.
There are many libraries that wrap WebGL code and help people build 3D interfaces, and \href{https://threejs.org}{THREE.js} is a very popular one (56617 stars on GitHub, making it the 35th most starred repository on GitHub as of November 26th, 2019\footnote{\url{https://web.archive.org/web/20191126151645/https://gitstar-ranking.com/mrdoob/three.js}}).
THREE.js acts as a 3D engine built on WebGL\@.
It provides classes to deal with everything we need:
\begin{itemize}
\item the \textbf{Renderer} class contains all the WebGL code needed to render a scene on the web page;
\item the \textbf{Object} class contains all the boilerplate needed to manage the tree structure of the content: it holds a transform (translation and rotation) and can have children that are other objects;
\item the \textbf{Scene} class is the root object: it contains all of the objects we want to render and is passed as an argument to the render function;
\item the \textbf{Geometry} and \textbf{BufferGeometry} classes hold the vertex buffers (we discuss them further in the next paragraph);
\item the \textbf{Material} class holds the properties used to render geometry (the most important being the texture). Many classes derive from Material, so developers can choose the material that suits their objects;
\item the \textbf{Mesh} class links the geometry and the material: it derives from the Object class and can thus be added to a scene and rendered.
\end{itemize}
The basic usage of these classes is shown in Snippet~\ref{f:three-hello-world}.
\begin{figure}[th]
\lstinputlisting[%
language=javascript,
caption={A THREE.js \emph{hello world}},
label=f:three-hello-world,
]{assets/dash-3d-implementation/base.js}
\end{figure}
\paragraph{Geometries.\label{f:geometries}}
Geometries are the classes that hold the vertices, texture coordinates, normals and faces.
THREE.js provides two classes for handling geometries:
\begin{itemize}
\item the \textbf{Geometry} class, which is made to be developer friendly and allows easy editing but can suffer from performance issues;
\item the \textbf{BufferGeometry} class, which is harder to use for a developer, but allows better performance since the developer controls how data is transmitted to the GPU\@.
\end{itemize}
\subsection{Rust}
In this section, we explain the specific features of Rust and why it is well suited to writing efficient native software safely.
\subsubsection{Borrow checker}
Rust is a systems programming language focused on safety.
It is designed to be efficient (and indeed has performance comparable to C\footnote{\url{https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust.html}} or C++\footnote{\url{https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust-gpp.html}}) but with some extra features.
C++ users might see it as a language like C++ that forbids undefined behaviour.\footnote{In Rust, code that might lead to undefined behaviour must be placed inside an \texttt{unsafe} block. Many operations are not available outside an \texttt{unsafe} block (e.g., dereferencing a raw pointer, or mutating a static variable). The idea is that you can use \texttt{unsafe} blocks when you need them, but you should avoid them as much as possible, and be particularly careful when you do use them.}
The most powerful concept from Rust is \emph{ownership}.
Basically, every value has a variable that we call its \emph{owner}.
To be able to use a value, you must either be its owner or borrow it.
There are two types of borrow: the immutable borrow and the mutable borrow (roughly equivalent to const and non-const references in C++).
The compiler comes with the \emph{borrow checker}, which makes sure you only use variables that you are allowed to use.
For example, the owner can only use the value if it is not being borrowed, and at any given time a value can either be mutably borrowed once, or immutably borrowed many times.
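
The rules above can be illustrated with a minimal sketch. The helper functions \texttt{sum} and \texttt{push\_twice} are hypothetical names introduced only for this example; they are not part of the software described in this thesis.

\begin{lstlisting}[language=rust]
// Hypothetical helper: reads the values through an immutable borrow.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Hypothetical helper: modifies the vector through a mutable borrow.
fn push_twice(values: &mut Vec<i32>, x: i32) {
    values.push(x);
    values.push(x);
}

fn main() {
    // `v` is the owner of the vector.
    let mut v = vec![1, 2, 3];

    // Several immutable borrows may coexist.
    let a = &v;
    let b = &v;
    assert_eq!(sum(a) + sum(b), 12);

    // A mutable borrow is allowed here because `a` and `b` are no longer used.
    push_twice(&mut v, 4);
    assert_eq!(v, vec![1, 2, 3, 4, 4]);

    // Ownership moves to `w`: using `v` after this line would not compile.
    let w = v;
    assert_eq!(w.len(), 5);
}
\end{lstlisting}

Moving the call to \texttt{push\_twice} before the last use of \texttt{a} or \texttt{b} would make the program fail to compile, since the vector would then be borrowed mutably and immutably at the same time.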
At first glance, the borrow checker seems mostly useful for detecting bugs in concurrent software, but it is just as valuable in non-concurrent code.
Consider the pieces of C++ code in Snippets~\ref{f:undefined-behaviour-cpp} and~\ref{f:undefined-behaviour-cpp-it}.
\begin{figure}[ht]
\centering
\begin{minipage}[c]{0.85\textwidth}
\lstinputlisting[
language=c++,
label={f:undefined-behaviour-cpp},
caption={Undefined behaviour with for each syntax}
]{assets/dash-3d-implementation/undefined-behaviour.cpp}
\lstinputlisting[
language=c++,
label={f:undefined-behaviour-cpp-it},
caption={Undefined behaviour with iterator syntax}
]{assets/dash-3d-implementation/undefined-behaviour-it.cpp}
\end{minipage}
\end{figure}
This loop should run endlessly because the vector grows as we add elements inside the loop.
More importantly, since we add elements to the vector, it will eventually need to be reallocated, and that reallocation invalidates the iterator, meaning that the following iteration triggers undefined behaviour.
The equivalent code in Rust is in Snippets~\ref{f:undefined-behaviour-rs} and~\ref{f:undefined-behaviour-rs-it}.
\begin{figure}[ht]
\centering
\begin{minipage}[c]{0.45\textwidth}
\lstinputlisting[
language=rust,
caption={Rust version of Snippet~\rawref{f:undefined-behaviour-cpp}},
label={f:undefined-behaviour-rs}
]{assets/dash-3d-implementation/undefined-behaviour.rs}
\end{minipage}
\quad\quad\quad
\begin{minipage}[c]{0.45\textwidth}
\lstinputlisting[
language=rust,
caption={Rust version of Snippet~\rawref{f:undefined-behaviour-cpp-it}},
label={f:undefined-behaviour-rs-it}
]{assets/dash-3d-implementation/undefined-behaviour-it.rs}
\end{minipage}
\end{figure}
What happens is that the iterator needs to borrow the vector.
While the vector is borrowed by the iterator, it can no longer be borrowed mutably, since mutating it could invalidate the other borrowers.
Indeed, the borrow checker rejects the program and compilation fails with the error in Snippet~\ref{f:undefined-behaviour-rs-error}.
\begin{figure}[ht]
\lstinputlisting[
language=XML,
caption={Error given by the compiler on Snippet~\ref{f:undefined-behaviour-rs}},
label={f:undefined-behaviour-rs-error}
]{assets/dash-3d-implementation/undefined-behaviour-error.txt}
\end{figure}
This example is one of many that show how powerful the borrow checker is: in safe Rust code, there can be no dangling references, and the segmentation faults they would cause are turned into compile-time errors.
The borrow checker may seem like an enemy to newcomers because it often rejects code that seems correct, but once they get used to it, they understand what the problem with their code is and either fix it easily, or realize that the whole architecture is wrong and understand why.
It is probably for those reasons that Rust is the \emph{most loved programming language} according to the Stack Overflow Developer Survey in~\citeyear{so-survey-2016}, \citeyear{so-survey-2017}, \citeyear{so-survey-2018} and~\citeyear{so-survey-2019}.
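
As an illustration of the kind of fix the borrow checker leads to, here is a minimal sketch, assuming the intent of such a loop is to append a copy of each element that was initially in the vector. The function name \texttt{duplicate\_elements} is hypothetical and introduced only for this example.

\begin{lstlisting}[language=rust]
// Appends a copy of every element that was in `values` when the call started.
fn duplicate_elements(values: &mut Vec<i32>) {
    // The length is read once, so no borrow of `values` outlives this line.
    let initial_len = values.len();
    for i in 0..initial_len {
        // `values[i]` is copied out before `push` borrows the vector mutably.
        let x = values[i];
        values.push(x);
    }
}

fn main() {
    let mut v = vec![1, 2, 3];
    duplicate_elements(&mut v);
    assert_eq!(v, vec![1, 2, 3, 1, 2, 3]);
}
\end{lstlisting}

Since the loop iterates over a range of indices instead of holding an iterator on the vector, no borrow is alive across the call to \texttt{push}, and the loop is guaranteed to terminate.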
\subsubsection{Tooling}
Moreover, Rust comes with many programs that help developers.
\begin{itemize}
\item \href{https://github.com/rust-lang/rust}{\textbf{\texttt{rustc}}} is the Rust compiler. It is pleasant to use thanks to its clear and precise error messages.
\item \href{https://github.com/rust-lang/cargo}{\textbf{\texttt{cargo}}} is Rust's official project and package manager. It manages compilation, dependencies, documentation, tests, etc.
\item \href{https://github.com/racer-rust/racer}{\textbf{\texttt{racer}}}, \href{https://github.com/rust-lang/rls}{\textbf{\texttt{rls}} (Rust Language Server)} and \href{https://github.com/rust-analyzer/rust-analyzer}{\textbf{\texttt{rust-analyzer}}} are tools that compile code in the background to display errors in code editors and provide semantic code completion.
\item \href{https://github.com/rust-lang/rustfmt}{\textbf{\texttt{rustfmt}}} automatically formats code.
\item \href{https://github.com/rust-lang/rust-clippy}{\textbf{\texttt{clippy}}} is a linter that detects unidiomatic code and suggests modifications.
\end{itemize}
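
To give an idea of how these tools fit together, the following sketch shows a documented function and its unit tests as they could appear in the \texttt{src/main.rs} file of a project managed by \texttt{cargo}. The function \texttt{mean} is a hypothetical example and is not taken from our software.

\begin{lstlisting}[language=rust]
/// Returns the arithmetic mean of `values`, or `None` if the slice is empty.
fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        None
    } else {
        Some(values.iter().sum::<f64>() / values.len() as f64)
    }
}

fn main() {
    println!("{:?}", mean(&[1.0, 2.0, 3.0]));
}

// This module is only compiled when running the tests.
#[cfg(test)]
mod tests {
    use super::mean;

    #[test]
    fn mean_of_empty_slice_is_none() {
        assert_eq!(mean(&[]), None);
    }

    #[test]
    fn mean_of_three_values() {
        assert_eq!(mean(&[1.0, 2.0, 3.0]), Some(2.0));
    }
}
\end{lstlisting}

With such a file, \texttt{cargo run} compiles and runs the program, \texttt{cargo test} runs the functions marked with \texttt{\#[test]}, and \texttt{cargo doc} builds HTML documentation from the \texttt{///} comments.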
\subsubsection{Glium}
When we need to perform rendering for 3D content analysis or for evaluation, we use the \href{https://github.com/glium/glium}{\texttt{glium}} library.
Glium has many advantages over using raw OpenGL calls.
Its objectives are:
\begin{itemize}
\item to be easy to use: it exposes functions that are higher level than raw OpenGL calls, but still low-level enough to leave the developer in control;
\item to be safe: debugging OpenGL code can be a nightmare, and glium does its best to use the borrow checker to its advantage to avoid OpenGL bugs;
\item to be fast: the produced binary uses optimized OpenGL function calls;
\item to be compatible: glium seeks to support the latest versions of OpenGL functions and falls back to older functions if the most recent ones are not supported on the device.
\end{itemize}
\subsubsection{Conclusion}
In our work, many tasks consist of 3D content analysis, reorganization, rendering and evaluation.
Many of these tasks require long computations, lasting from hours to entire days.
To perform them, we need a programming language that has good performance.
In addition, the extra features that Rust provides tremendously ease development, and this is why we use Rust for all tasks that do not require a web interface.