Thomas Forgione 2023-04-20 11:44:45 +02:00
parent 209aab16c7
commit 5e81bb61fc
9 changed files with 184 additions and 31 deletions


@ -22,4 +22,3 @@ f 6/1 5/3 7/4 8/2
f 5/1 1/3 3/4 7/2
f 4/1 8/3 7/4 3/2
f 2/1 1/3 5/4 6/2

20
chapter.typ Normal file

@ -0,0 +1,20 @@
// Chapter management
#let chapter(title, count: true) = {
  if count {
    counter("chapter").step()
  }
  align(right, {
    v(100pt)
    if count {
      text(size: 50pt)[Chapter ]
      text(size: 150pt, fill: rgb(173, 216, 230), counter("chapter").display())
      linebreak()
      v(50pt)
    }
    text(size: 40pt, title)
  })
  if count {
    pagebreak()
  }
}


@ -20,7 +20,7 @@ A 3D model encoded in the OBJ format typically consists in two files: the materi
The material file declares all the materials that the object file will reference.
A material consists of a name and photometric properties such as ambient, diffuse and specular colors, as well as texture maps, which are images that are painted onto faces.
Each face corresponds to a material.
A simple material file is visible on Snippet X. // TODO
A simple material file is shown in @cubemtl.
The object file declares the 3D content of the objects.
It declares vertices, texture coordinates and normals from coordinates (e.g. `v 1.0 2.0 3.0` for a vertex, `vt 1.0 2.0` for a texture coordinate, `vn 1.0 2.0 3.0` for a normal).
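As an illustration of these declarations, the following Rust sketch parses `v`, `vt` and `vn` lines. It is not the parser used in our work: the names `Element`, `parse_floats` and `parse_line` are hypothetical, and the sketch assumes exactly three coordinates for vertices and normals and exactly two for texture coordinates, ignoring the optional extra components that the OBJ format permits.

```rust
/// One geometric declaration of an OBJ file.
/// Faces and material statements are deliberately left out of this sketch.
#[derive(Debug, PartialEq)]
enum Element {
    Vertex([f64; 3]),
    TexCoord([f64; 2]),
    Normal([f64; 3]),
}

/// Parses exactly `N` floats from `fields`, or returns `None` if the
/// number of fields is wrong or one of them is not a valid float.
fn parse_floats<const N: usize>(fields: &[&str]) -> Option<[f64; N]> {
    if fields.len() != N {
        return None;
    }
    let mut out = [0.0; N];
    for (slot, field) in out.iter_mut().zip(fields) {
        *slot = field.parse().ok()?;
    }
    Some(out)
}

/// Parses a single line of an object file.
/// Returns `None` for anything that is not a well-formed
/// `v`, `vt` or `vn` declaration (assumption: strict arity).
fn parse_line(line: &str) -> Option<Element> {
    let mut fields = line.split_whitespace();
    let keyword = fields.next()?;
    let rest: Vec<&str> = fields.collect();
    match keyword {
        "v" => parse_floats::<3>(&rest).map(Element::Vertex),
        "vt" => parse_floats::<2>(&rest).map(Element::TexCoord),
        "vn" => parse_floats::<3>(&rest).map(Element::Normal),
        _ => None,
    }
}
```

With these definitions, `parse_line("v 1.0 2.0 3.0")` yields a `Vertex`, whereas a face line such as `f 1/1 2/2 3/3` is rejected because this sketch does not handle faces.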
@ -35,28 +35,46 @@ Faces are declared by using the indices of these elements. A face is a polygon w
An object file can include materials from a material file (`mtllib path.mtl`) and apply the materials that it declares to faces.
A material is applied by using the `usemtl` keyword, followed by the name of the material to use.
The faces declared after a `usemtl` are painted using the material in question.
An example of object file is visible on Snippet X. // TODO
An example of an object file is shown in @cube.
// \begin{figure}[th]
// \centering
// \begin{subfigure}[c]{0.4\textwidth}
// \lstinputlisting[
// language=XML,
// caption={An object file describing a cube},
// label=i:obj,
// ]{assets/introduction/cube.obj}
// \end{subfigure}\quad%
// \begin{subfigure}[c]{0.4\textwidth}
// \lstinputlisting[
// language=XML,
// caption={A material file describing a material},
// label=i:mtl,
// ]{assets/introduction/materials.mtl}
// \includegraphics[width=\textwidth]{assets/introduction/cube.png}
// \captionof{figure}{A rendering of the cube}
// \end{subfigure}
// \caption{The OBJ representation of a cube and its render\label{i:cube}}
// \end{figure}
#figure(
  grid(
    columns: (1fr, 0.2fr, 1fr),
    align(center + horizon)[
      #figure(
        align(left,
          raw(
            read("../assets/introduction/cube.obj"),
            block: true,
          )
        ),
        caption: [An object file describing a cube]
      )<cubeobj>
    ],
    [],
    align(center + horizon)[
      #figure(
        align(left,
          raw(
            read("../assets/introduction/materials.mtl"),
            block: true,
          )
        ),
        caption: [A material file describing a material]
      )<cubemtl>
      #figure(
        align(left,
          image("../assets/introduction/cube.png", width: 100%)
        ),
        caption: [A rendering of the cube]
      )
    ],
  ),
  caption: [The OBJ representation of a cube and its render]
)<cube>
== Rendering a 3D model


@ -8,7 +8,7 @@ When it comes to 3D streaming systems, we need two kind of software.
== JavaScript
=== THREE.js
#heading(level: 3, numbering: none)[THREE.js]
On the web browser, it is now possible to perform 3D rendering by using WebGL.
However, WebGL is very low level and it can be painful to write code, even to render a simple triangle.
@ -37,7 +37,7 @@ A snippet of the basic usage of these classes is given in @three-hello-world.
caption: [A THREE.js _hello world_]
)<three-hello-world>
=== Geometries
#heading(level: 3, numbering: none)[Geometries]
Geometries are the classes that hold the vertices, texture coordinates, normals and faces.
THREE.js provides two classes for handling geometries:
@ -49,7 +49,7 @@ THREE.js proposes two classes for handling geometries:
In this section, we explain the specificities of Rust and why it is an adequate language for writing efficient native software safely.
=== Borrow checker
#heading(level: 3, numbering: none)[Borrow checker]
Rust is a systems programming language focused on safety.
It is made to be efficient (and indeed has performance comparable to C // TODO \footnote{\url{https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust.html}} or C++\footnote{\url{https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust-gpp.html}})
@ -142,7 +142,7 @@ The borrow checker may seem like an enemy to newcomers because it often rejects
It is probably for those reasons that Rust is the _most loved programming language_ according to the Stack Overflow
Developer Survey // TODO in~\citeyear{so-survey-2016}, \citeyear{so-survey-2017}, \citeyear{so-survey-2018} and~\citeyear{so-survey-2019}.
=== Tooling
#heading(level: 3, numbering: none)[Tooling]
Moreover, Rust comes with many programs that help developers.
- #link("https://github.com/rust-lang/rust")[*`rustc`*] is the Rust compiler. It is comfortable to use thanks to its clear and precise error messages.
@ -152,7 +152,7 @@ Moreover, Rust comes with many programs that help developers.
- #link("https://github.com/rust-lang/rustfmt")[*`rustfmt`*] automatically formats code.
- #link("https://github.com/rust-lang/rust-clippy")[*`clippy`*] is a linter that detects unidiomatic code and suggests modifications.
=== Glium
#heading(level: 3, numbering: none)[Glium]
When we need to perform rendering for 3D content analysis or for evaluation, we use the #link("https://github.com/glium/glium")[*`glium`*] library.
Glium has many advantages over using raw OpenGL calls.
@ -163,7 +163,7 @@ Its objectives are:
- to be fast: the produced binary uses optimized OpenGL function calls;
- to be compatible: glium seeks to support the latest versions of OpenGL functions and falls back to older functions if the most recent ones are not supported on the device.
=== Conclusion
#heading(level: 3, numbering: none)[Conclusion]
In our work, many tasks will consist of 3D content analysis, reorganization, rendering and evaluation.
Many of these tasks require long computations, lasting from hours to entire days.


@ -1,3 +1,7 @@
#import "../chapter.typ"
#chapter.chapter[Foreword]
#include "3d-model.typ"
#include "video-vs-3d.typ"
#include "implementation.typ"


@ -0,0 +1,36 @@
= Open problems
The objective of our work is to design a system which allows a user to access remote 3D content.
A 3D streaming client has many tasks to accomplish:
- Decide what part of the content to download next,
- Download the next part,
- Parse the downloaded content,
- Add the parsed result to the scene,
- Render the scene,
- Manage the interaction with the user.
This raises several problems that need to be considered and that will be studied in this thesis.
#heading(level: 4, numbering: none)[Content preparation]
Before content can be streamed, it needs to be prepared.
The segmentation of the content into chunks is particularly important for streaming since it allows transmitting only a portion of the data to the client.
The downloaded chunks can be rendered while more chunks are being downloaded.
Content preparation also includes compression.
One of the questions this thesis has to answer is: _what is the best way to prepare 3D content so that a streaming client can progressively download and render the 3D model?_
#heading(level: 4, numbering: none)[Streaming policies]
Once our content is prepared and split into chunks, a client needs to determine which chunks should be downloaded first.
A chunk that contains data in the field of view of the user is more relevant than a chunk that lies outside it; a chunk that is close to the camera is more relevant than a chunk far away from the camera.
This decision should also take into account other contextual parameters, such as the size of a chunk, the bandwidth and the user's behaviour.
In order to propose efficient streaming policies, we need to know _how to estimate a chunk utility, and how to determine which chunks need to be downloaded depending on the user's interactions?_
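To make the two relevance criteria above concrete, here is a minimal Rust sketch of a possible chunk utility. It is only an illustration under stated assumptions, not the utility defined later in this thesis: the types `Chunk` and `Camera`, the functions `utility` and `rank`, the cone-shaped field of view, the inverse-square falloff and the `OUT_OF_VIEW_WEIGHT` penalty are all hypothetical choices, and chunk size and bandwidth are ignored.

```rust
/// A chunk, summarized by an identifier and the center of its bounding box.
struct Chunk {
    id: u32,
    center: [f64; 3],
}

/// A camera with a position, a unit-length view direction and a
/// half field of view in radians (modelled here as a simple cone).
struct Camera {
    position: [f64; 3],
    direction: [f64; 3],
    half_fov: f64,
}

/// Hypothetical penalty applied to chunks outside the field of view.
const OUT_OF_VIEW_WEIGHT: f64 = 0.1;

/// Rates a chunk: higher when it is inside the view cone, and higher
/// when it is closer to the camera (inverse-square falloff).
fn utility(camera: &Camera, chunk: &Chunk) -> f64 {
    let to_chunk = [
        chunk.center[0] - camera.position[0],
        chunk.center[1] - camera.position[1],
        chunk.center[2] - camera.position[2],
    ];
    let distance_sq: f64 = to_chunk.iter().map(|x| x * x).sum();
    if distance_sq == 0.0 {
        // The camera sits exactly on the chunk center: treat it as most useful.
        return f64::INFINITY;
    }
    let dot: f64 = to_chunk
        .iter()
        .zip(camera.direction.iter())
        .map(|(a, b)| a * b)
        .sum();
    // Cosine of the angle between the view direction and the chunk direction.
    let cos_angle = dot / distance_sq.sqrt();
    let weight = if cos_angle >= camera.half_fov.cos() {
        1.0
    } else {
        OUT_OF_VIEW_WEIGHT
    };
    weight / distance_sq
}

/// Orders chunks from the most useful to the least useful.
fn rank<'a>(camera: &Camera, chunks: &'a [Chunk]) -> Vec<&'a Chunk> {
    let mut ranked: Vec<&Chunk> = chunks.iter().collect();
    ranked.sort_by(|a, b| utility(camera, b).total_cmp(&utility(camera, a)));
    ranked
}
```

A greedy streaming policy could then simply request the chunks in the order returned by `rank`; accounting for chunk size or available bandwidth would require a richer utility than this sketch provides.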
#heading(level: 4, numbering: none)[Evaluation]
In such systems, two commonly used criteria for evaluation are quality of service, and quality of experience.
The quality of service is a network-centric metric, which considers values such as throughput and measures how well the content is served to the client.
The quality of experience is a user-centric metric: it relies on user perception and can only be measured by asking how users feel about a system.
To be able to know which streaming policies are best, one needs to know _how to compare streaming policies and evaluate the impact of their parameters on the quality of service of the streaming system and on the quality of experience of the final user?_
#heading(level: 4, numbering: none)[Implementation]
The objective of our work is to set up a client-server architecture that answers the above problems: content preparation, chunk utility and streaming policies.
In this regard, we have to find out _how do we build this architecture that keeps a low computational load on the server so it scales up and on the client so that it has enough resources to perform the tasks described above?_

38
introduction/main.typ Normal file

@ -0,0 +1,38 @@
#import "../chapter.typ"
#chapter.chapter(count: false)[Introduction]
Over the last few years, 3D acquisition and modeling techniques have made tremendous progress.
Recent software uses 2D images from cameras to reconstruct 3D data, e.g.
#link("https://alicevision.org/\#meshroom")[Meshroom] is free and open-source software, with almost 200,000 downloads on #link("https://www.fosshub.com/Meshroom.html")[fosshub], that uses _structure-from-motion_ and _multi-view-stereo_ to infer a 3D model.
More and more devices are specifically built to harvest 3D data: for example, LIDAR (Light Detection And Ranging) can compute 3D distances by measuring the time of flight of light. The recent research interest in autonomous vehicles has allowed more companies to develop cheaper LIDARs, which increases the potential for new 3D content creation.
Thanks to these techniques, more and more 3D data are becoming available.
These models serve multiple purposes: for example, they can be printed, which can reduce the production cost of some pieces of hardware or enable the creation of new objects. Most uses, however, are based on visualization.
For example, they can be used for augmented reality, to provide users with feedback that helps workers
with complex tasks, but also for fashion (for example, #link("https://www.fittingbox.com")[Fittingbox] is a company that develops software to virtually try on glasses, as in @fittingbox).
#v(50pt)
#figure(
image("../assets/introduction/fittingbox.png", width: 45%),
caption: [My face with augmented glasses]
)<fittingbox>
#pagebreak()
3D acquisition and visualization are also useful for preserving cultural heritage (software such as Google Heritage or 3DHop are examples) and for letting users navigate a city (as in Google Earth or Google Maps in 3D).
#link("https://sketchfab.com")[Sketchfab] (see @sketchfab) is an example of a website allowing users to share their 3D models and visualize models from other users.
#figure(
image("../assets/introduction/sketchfab.png", width: 100%),
caption: [Sketchfab interface]
)<sketchfab>
In most 3D visualization systems, the 3D data are stored on a server and need to be transmitted to a terminal before the user can visualize them.
The improvements in the acquisition setups we described lead to 3D models of increasing quality, and thus of increasing size in bytes.
Simply downloading 3D content and waiting until it is fully downloaded to let the user visualize it is no longer a satisfactory solution, so adaptive streaming is needed.
In this thesis, we propose a full framework for navigation and streaming of large 3D scenes, such as districts or whole cities.
#pagebreak()
#include("challenges.typ")
#include("outline.typ")

27
introduction/outline.typ Normal file

@ -0,0 +1,27 @@
= Thesis outline
First, in Chapter X, we give some preliminary information required to understand the types of objects we are manipulating in this thesis.
We then proceed to compare 3D and video content: video and 3D share many features, and analyzing the video setting provides inspiration for building a 3D streaming system.
In Chapter X, we present a review of the state of the art in multimedia interaction and streaming.
This chapter starts with an analysis of the video streaming standards.
Then it reviews the different 3D streaming approaches.
The last section of this chapter focuses on 3D interaction.
Then, in Chapter X, we present our first contribution: an in-depth analysis of the impact of the UI on navigation and streaming in a 3D scene.
We first develop a basic interface for navigating in 3D and then introduce 3D objects called _bookmarks_ that help users navigate the scene.
We then present a user study that we conducted with 51 participants, which shows that bookmarks ease user navigation: they improve performance at tasks such as finding objects.
// Then, we set up a basic 3D streaming system that allows us to replay the traces collected during the user study and simulate 3D streaming at the same time.
We analyze how the presence of bookmarks impacts the streaming: we propose and evaluate streaming policies based on precomputations relying on bookmarks and that measurably increase the quality of experience.
In Chapter X, we present the most important contribution of this thesis: DASH-3D.
DASH-3D is an adaptation of DASH (Dynamic Adaptive Streaming over HTTP), the video streaming standard, to 3D streaming.
We first describe how we adapt the concepts of DASH to 3D content, including the segmentation of content.
We then define utility metrics that rate each chunk depending on the user's position.
Then, we present a client and various streaming policies, based on our utilities, which can benefit from the DASH format.
We finally evaluate the different parameters of our client.
In Chapter X, we present our last contribution: the integration of the interaction ideas that we developed in Chapter X into DASH-3D.
We first develop an interface that allows desktop as well as mobile devices to navigate in streamed 3D scenes, and that introduces a new style of bookmarks.
We then explain why simply applying the ideas developed in Chapter X is not sufficient and we propose more efficient precomputations that enhance the streaming.
Finally, we present a user study that provides us with traces on which we evaluate the impact of our extension of DASH-3D on the quality of service and on the quality of experience.


@ -1,4 +1,5 @@
#set page(paper: "a4")
#show link: content => {
set text(fill: blue)
content
@ -64,13 +65,13 @@
]
]
#set text(fill: black)
#set par(first-line-indent: 1em, justify: true, leading: 1em)
// Abstracts
#pagebreak()
#set page(background: none)
#pagebreak()
#h(1em) *Titre :* Transmission Adaptative de Modèles 3D Massifs
@ -78,6 +79,7 @@
#include "abstracts/fr.typ"
#pagebreak()
#pagebreak()
#set page(background: none)
@ -92,7 +94,16 @@
#include "acknowledgments.typ"
// Content of the thesis
#set heading(numbering: "1.1 ")
#pagebreak()
#set heading(numbering: "1.1")
#include "introduction/main.typ"
#set heading(numbering: (..nums) =>
  counter("chapter").display() + "." + nums
    .pos()
    .map(str)
    .join(".")
)
#pagebreak()
#include "foreword/main.typ"