POLITECNICO DI MILANO

School of Industrial and Information Engineering

Department of Aerospace Science and Technologies

Master Thesis in Space Engineering

AUTONOMOUS ENVIRONMENT IDENTIFICATION

FOR PLANETARY EXPLORATION ROVERS

Advisor:

Prof. Mauro Massari

Thesis by:

Gerardo Marenzi

841474



If God created the world, we cannot say that his primary concern was to make its understanding easy for us. Albert Einstein


Abstract

Planetary exploration rovers are important tools for collecting scientific information on the surfaces of the Solar System's planets and their satellites. Although the robots used in current missions have reached a high level of complexity, to navigate safely in an extraterrestrial environment they must still be constantly controlled by operators located on the Earth.

The remarks above led to this work, which aims to improve rover autonomy through the development of a system able to identify and classify the examined environment, so that rovers can choose, by themselves, focused areas to reach and study.

Here, to identify means to recognize an object in the environment and compute its coordinates, while to classify means to compute the size of the identified object and match it with a type of material found in a database.

Such a system has been developed by means of the specific information obtainable from a pair of color stereoscopic images. By analyzing this pair of images it was possible to reconstruct the three-dimensional coordinates of the environment and to analyze its texture features and color. Furthermore, information regarding the infrared emissions of the environment has been taken into consideration; in this work, however, this information has been simulated by analyzing the red channel of the color images.

All this information has first been analyzed and collected, and then processed by cross-referencing it in order to classify the soil and any potential objects in the survey area.

To validate the system, an outdoor environment with features similar to the Martian ones was set up. This allowed the developed system to be tested, and it proved to be robust and precise in the identification and classification of the soil and the objects.


Sommario

Planetary exploration rovers are a very important tool for collecting scientific information on the surface of the planets of the Solar System and of their satellites. Although the robots employed in current missions have reached a high level of complexity, in order to navigate safely in an extraterrestrial environment they must be constantly managed by operators on Earth.

These considerations are the starting point of the present work, which aims to increase the autonomy of rovers through the development of a system able to identify and classify the observed environment, so that rovers can autonomously choose areas of interest to approach and analyze.

By identifying we mean distinguishing an object in the environment and computing its coordinates in space, while by classifying we mean computing the size of the identified object and assigning it a type of material present in a database.

To build a system of this kind, the information obtainable from a pair of color stereoscopic images was used. By analyzing this pair of images it was possible to reconstruct the three-dimensional coordinates of the environment and to analyze its texture and color characteristics. Information on the infrared emissions of the environment was also considered; in the present work, however, it was simulated by analyzing the red channel of the color images.

All this information was first analyzed and collected, and subsequently processed by cross-referencing it in order to identify and classify the soil and any objects present in the observed environment.

To validate the system, an environment with characteristics similar to the Martian ones was set up in an open space, so as to put the developed system to the test; it proved to have good robustness and precision in the identification and classification of the soil and of the objects present.


Contents

CHAPTER 1 INTRODUCTION
1.1 Exploration of space and planets
1.2 Proposal for an innovative system
1.3 State of the art of the presented technologies
1.4 Structure of the thesis

CHAPTER 2 COLLECTING OF OPTICAL AND INFRARED INFORMATION
2.1 Stereo vision
2.1.1 Camera calibration and images rectification
2.1.2 Algorithms for the Disparity Map computation
2.1.3 Stereo system configuration
2.2 Texture analysis
2.2.1 Segmentation-based Fractal Texture Analysis
2.3 Chromatic and infrared emission analysis
2.4 Information Vector Algorithm

CHAPTER 3 ANALYSIS OF THE COLLECTED INFORMATION AND RECONSTRUCTION OF THE ENVIRONMENT
3.1 Cross-analysis of texture and color
3.2 Identification and classification of the soil
3.3 Identification and classification of the objects
3.4 Reconstruction of the environment
3.5 Reconstruction of the navigation map

CHAPTER 4 ANALYSIS OF THE RESULTS

CHAPTER 5 CONCLUSIONS AND FUTURE DEVELOPMENTS


List of Figures

Figure 1.1 Artistic representation of a greenhouse on Mars
Figure 1.2 The first missions on the Martian surface
Figure 1.3 Three generations of NASA's Mars rovers: Sojourner, the smallest; Opportunity, in the middle; Curiosity, the largest
Figure 1.4 Arrangement of the cameras on board the rovers Spirit and Opportunity on the left (a); zoom and camera details of the "head" of the rover Curiosity on the right (b)
Figure 1.5 Block diagram of the general algorithm
Figure 2.1 Pairs of stereo images
Figure 2.2 Epipolar geometry
Figure 2.3 Example of epipolar line
Figure 2.4 Perspective projection and standard stereo system
Figure 2.5 Pixel disparity in a standard stereo system
Figure 2.6 Disparity map computed starting from the images of figure 2.1
Figure 2.7 Depth accuracy
Figure 2.8 Depth accuracy and field of view as a function of the baseline
Figure 2.9 Camera and image plane reference frames
Figure 2.10 Chessboard reference frame
Figure 2.11 Set of the images acquired for the camera calibration
Figure 2.12 Example of vertices identification in an image
Figure 2.13 Chessboard spatial arrangements in the camera-centric view
Figure 2.14 Image before and after the rectification process
Figure 2.15 Epipolar rectification
Figure 2.16 Comparison of the disparity maps obtained from the stereo pair of figure 2.15 after the rectification
Figure 2.17 Disparity computation method. Scan line strategy on the top and cost function trend on the bottom (source: [13])
Figure 2.19 Comparison of different algorithms that compute the disparity map in test 1
Figure 2.20 Comparison of different algorithms that compute the disparity map in test 2
Figure 2.21 Occlusion maps of the left disparity maps of DMAG5 and DMAG6 in test 1
Figure 2.22 Disparity map computed by DMAG6 on the left and the same disparity map smoothed by EPS9 on the right
Figure 2.23 Rover and field of view of the stereo system (top view)
Figure 2.24 Rover and field of view of the stereo system (lateral view)
Figure 2.25 Trend of d2 with respect to the baseline using ΔZ = 0.3 m
Figure 2.26 Gray-level histograms of surface samples with different textures
Figure 2.27 Comparison of images acquired with different light conditions before and after the normalization process
Figure 2.28 Binary images
Figure 2.29 Parameters computation method for a single binary image
Figure 2.30 Classification of texture using the SFTA algorithm with = 8
Figure 2.31 RGB method
Figure 2.32 Color average operation scheme
Figure 2.33 Average of the color of the original image using different window sizes
Figure 2.34 Simulated infrared image
Figure 2.35 Smoothed disparity map
Figure 2.36 Focal reference frame and stereo system reference frame
Figure 2.37 3-D reconstruction of the observed scenario using a limit of = 7
Figure 2.38 Different window sizes for the texture analysis
Figure 3.1 Samples of the materials used in the database and first step of the cross-analysis of texture and color for different values of
Figure 3.2 Second step of the cross-analysis of texture and color for different values of and = 0.5
Figure 3.3 Prediction of the increment of the depth coordinates of the soil
Figure 3.4 Binary image representing the soil of the reference images
Figure 3.6 Scheme of the assumption made for the identification of the objects in the second method
Figure 3.7 Result of the first procedure of the first method for the detection of the objects
Figure 3.8 Result of the second procedure of the first method for the detection of the objects
Figure 3.9 Binary image of the objects using the first method
Figure 3.10 Binary image of the objects using the first method
Figure 3.11 Binary image of the objects using the first method after the smoothing
Figure 3.12 Environment reconstruction
Figure 3.13 Environment reconstruction dividing the objects into smaller units
Figure 3.14 Binary image of the obstacles
Figure 3.15 Navigation map
Figure 4.1 Stereo pairs of the set-up environment
Figure 4.2 Disparity map of the stereo pairs of figure 4.1
Figure 4.3 Binary image representing the objects before the filtering process computed with the first method
Figure 4.4 Three-dimensional environment reconstruction through the first method
Figure 4.5 Comparison between the object coordinates of the true environment and the object coordinates of the reconstructed environment
Figure 4.6 Binary image representing the objects before the filtering process computed with the second method
Figure 4.7 Three-dimensional environment reconstruction through the first method
Figure 4.8 Navigation maps
Figure 4.9 Stereo pair of the second environment analyzed
Figure 4.10 Disparity map of the stereo pair depicted in figure 4.9
Figure 4.11 Binary image representing the soil obtained from the disparity map of figure 4.10
Figure 4.12 Binary image representing the objects obtained using the second method from the disparity map of figure 4.10
Figure 4.13 Environment reconstruction of the stereo pair of figure 4.9

List of Tables

Table 2.1 Computed intrinsic parameters of the camera
Table 2.2 Computational times of the algorithms that compute the disparity map
Table 3.1 Assignment of the degree of confidence to each pixel
Table 4.1 Classification of the identified objects and the soil in the first method
Table 4.2 Classification of the identified objects and the soil in the second method
Table 4.3 Classification of the second environment analyzed


CHAPTER 1

INTRODUCTION

1.1 Exploration of space and planets

More than fifty years of human activity in space have produced societal benefits that improve the quality of life on Earth. The first satellites, designed to study the space environment and test initial capabilities in Earth orbit, contributed critical knowledge and capabilities for developing satellite telecommunications, global positioning, and advances in weather forecasting. The challenges of space exploration have sparked new scientific and technological knowledge, which, coupled with ingenuity, provides people around the globe with solutions as well as useful products and services. Knowledge acquired from space exploration has also introduced new perspectives on our individual and collective place in the Universe.

Future space exploration goals call for sending humans and robots beyond Low Earth Orbit and establishing sustained access to destinations such as the Moon, asteroids and Mars. Space agencies participating in the International Space Exploration Coordination Group (ISECG)1 are discussing an international approach for achieving these goals; that approach begins with the International Space Station (ISS) and leads to human missions to the surface of Mars, the planet of the Solar System with the most features in common with the Earth. Furthermore, in recent years many Mars colonization projects have been developed by various organizations (figure 1.1). Employing the complementary capabilities of both humans and robotic systems will enable humankind to meet this most ambitious space exploration challenge and to increase benefits for society, for example in innovation, culture, inspiration and new means to address global challenges.

Figure 1.1 Artistic representation of a greenhouse on Mars

1 ISECG space agencies include, in alphabetical order: ASI (Italy), CNES (France), CNSA (China), CSA (Canada), CSIRO (Australia), DLR (Germany), ESA (Europe), ISRO (India), JAXA (Japan), KARI (Republic of Korea), NASA (USA), NSAU (Ukraine), Roscosmos (Russia), UKSA (UK).

Achieving the ambitious future exploration goals will further expand the economic relevance of space exploration which will continue to be an essential driver for opening up new domains in science and technology, triggering other sectors to partner with the space sector for joint research and development. Space exploration offers a unique and evolving perspective on humanity's place in the Universe, which is common to all. Every day, space exploration missions fulfill people's curiosity, producing fresh data about the solar system that brings us closer to answering profound questions that have been asked for millennia, for example: are we and our planet unique? To answer this question the planetary exploration and satellite observation are fundamental.

The ambitious project of bringing humans to Mars, in order to be feasible and safe, must be based on the widest possible range of information about the Red Planet. The information obtained by probes placed in Mars orbit is important but not sufficient to provide the data necessary to ensure the safety of the astronauts who will land there in the future.

Since the 1960s scientists have sought to achieve important results by exploring the Moon and the other planets and satellites of the Solar System with rovers: partially autonomous mobile robots which, once on the surface, are able to explore, acquire information and transmit it to the Earth. In 1976 NASA's Viking 1 lander (figure 1.2b) relayed the first panoramic image of the surface of Mars (figure 1.2a), but this lander was not designed to move on the Martian surface.

Figure 1.2 The first missions on the Martian surface. Above, the first panoramic view of the Martian surface taken by Viking 1, captured on July 20, 1976 (a). To the left, a model of the Viking lander (b). To the right, a model of the Mars Pathfinder lander (c).

Thanks to improvements in rover construction technology, in the summer of 1997 the NASA Mars Pathfinder lander (figure 1.2c) landed on Mars, carrying the first robotic exploration vehicle, Sojourner (figure 1.3), which explored a small area around the lander and returned more than 550 images. Thanks to its on-board instruments and its navigation capabilities, Sojourner also returned more than 15 very important chemical analyses of rocks and soil, which suggest that Mars was at one time in its past warm and wet, with water existing in its liquid state [1].

As regards the navigation system, Sojourner was equipped with a pair of cameras and laser sensors used for the identification of obstacles, but the three-dimensional reconstruction of the panorama was carried out by a pair of stereo cameras placed on the lander. The images captured by the lander were transmitted to the Earth and then assessed by operators, who proceeded to set the guide points to be reached along the way. In addition to this procedure, a collision avoidance system had been implemented in the robot's software, which independently applied small trajectory corrections to avoid obstacles not previously identified. Because of the architecture of the mission, the rover could not navigate outside the reconstruction field of the lander's stereo cameras, which was inevitably bounded to the landing area. For this reason Sojourner, although it successfully accomplished its mission, had evident limits to its independence.

New rovers have been developed on the basis of Sojourner to experiment with more and more complex and autonomous systems. The next step was represented by the prototype Rocky 7, built by JPL2. The main improvement was made in the field of the vision system: the stereo cameras were now mounted directly on the rover, so that it was able to explore wider areas, being able to move away from the landing area. However, Rocky 7 was also not able to autonomously choose the target of navigation, which had to be decided by the operators, who also had to assign waypoints between the rover and the target. Between the waypoints the rover was able to compute the trajectory, analyzing the soil and avoiding regions considered unsafe. This rover was the basis of FIDO (Field Integrated Development and Operation) [2], a new prototype used to test the systems mounted on the twin rovers Spirit and Opportunity (figure 1.3), both of which landed successfully on the Martian soil in January 2004.

Each of these twin rovers, which allowed exceptional discoveries on Mars, has 9 "eyes" (figure 1.4a): 4 engineering Hazcams (Hazard avoidance cameras), 2 engineering Navcams (Navigation Cameras) and 3 science cameras, 2 of which are Pancams (Panoramic Cameras) [3]. The Hazcams are positioned on the lower part of the rover (2 front and 2 rear), have a FOV (Field Of View) of 120° and a range of 3 meters, and are used to monitor possible nearby obstacles. The Navcams are a stereo pair of cameras mounted on the "head", each with a FOV of 45°, used to support ground navigation planning by scientists and engineers; they work in cooperation with the Hazcams by providing a complementary view of the terrain. The Pancams are the only color stereo pair mounted on the rover "head"; they deliver high-resolution 3-D panoramas of the Martian surface for scientific purposes and also perform spectral investigations.

Figure 1.3 Three generations of NASA's Mars rovers: Sojourner, the smallest; Opportunity, in the middle; Curiosity, the largest.

Figure 1.4 Arrangement of the cameras on board the rovers Spirit and Opportunity on the left (a); zoom and camera details of the "head" of the rover Curiosity on the right (b).

2 The NASA Jet Propulsion Laboratory (JPL) is an institute of technology in California that carries out robotic space and Earth science missions. Sojourner, Spirit, Opportunity and Curiosity were built by JPL.

The digital model of the surrounding area obtained via stereo techniques is analyzed independently by the rover's software. It considers as obstacles everything that exceeds 30 centimeters in height and computes the trajectory to reach the target selected by the operators. The computed trajectory guarantees safe navigation for displacements of about 30 cm, after which it is necessary to acquire new images and calculate a new trajectory, which takes about 1 minute. In this way the rovers reach the coordinates of the target. In the summer of 2012, under the Mars Science Laboratory (MSL) project, the latest generation rover Curiosity (figure 1.3) landed on Mars. Thanks to the excellent design of Spirit and Opportunity, the engineering camera system of Curiosity is the same as that of its predecessors, while the number and quality of the scientific cameras has increased: for example, the Pancams have been replaced by the Mastcams (figure 1.4b), which are very similar but have a higher resolution. Mars Science Laboratory's large size (shown in figure 1.3) gives the rover advantages in mobility. For instance, it has a ground clearance of slightly more than 60 centimeters, which enables it to climb over larger rocks than ever before. In terms of autonomy, Curiosity benefits from major improvements uploaded and tested on the Mars Exploration Rovers (Spirit and Opportunity), including global path planning and visual target tracking. This software represents a leap forward for rovers, which can now look ahead and plan a path to a spot 50 meters away, evading surface features such as large rocks that they determine to be obstacles along the way [4].

1.2 Proposal for an innovative system

The aim of this work is to implement an algorithm capable of autonomously identifying and classifying the objects that are in the survey area of the rover, generating a detailed map in which the zones accessible to the navigation system are also defined. To do this, different kinds of information regarding the photographed environment are analyzed, such as the color, the texture and the infrared emissions. The purpose of developing an autonomous system able to identify and classify objects is to allow the rover to independently choose the path to follow according to the potential identification of interesting objects. The operators on Earth provide the rover with the information about the objects of interest, and these objects can then be identified and reached autonomously by the rover.

By analyzing all the captured information, the system generates a three-dimensional map, detecting the objects and providing their main characteristics and their coordinates with respect to the rover. The analysis of the territory makes it possible to distinguish rocks and objects of various nature from the soil and from the background of the observed scenario. This innovative aspect of the system has been developed in two phases: the first analyzes the soil and identifies the ground plane, generating the map of the zones accessible to the navigation system; the second, the more innovative of the two, detects potential objects, which are analyzed through their color, texture and infrared emissions and then classified with a certain degree of confidence according to the outcome of this analysis compared with a database.

The system development requires the use of the following sensors: two color cameras for stereoscopic, color and texture analysis, and an infrared camera for infrared emissions analysis. Despite these requirements, in the present work only one color camera3 is used.


For the stereoscopic analysis the camera was appropriately moved between the two shots. This procedure, as will be explained later, makes the results less accurate but does not prevent them from being obtained. The infrared image, instead, is simulated by processing the red channel of the color camera.

The first step consists in selecting the positioning of the camera, which is crucial because the size and the position of the observable surrounding area depend on it. Once the camera configuration is chosen, the images are acquired and analyzed. Starting from the two color images, a specific algorithm computes the disparity map which, through perspective considerations, is the basis of the spatial reconstruction of the scene.

Now, for each pixel of the image, an algorithm called IVA4 computes: the x, y and z coordinates in the reference frame of the camera, using the disparity map; the texture parameters of the surrounding area of the image, using a dedicated algorithm; the average color of the surrounding pixels; and the temperature, using the infrared information. In this way a vector is obtained which contains all the information listed above for each pixel of the image, and therefore for each point of the scene observed by the camera.
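The IVA implementation itself is not listed in the text; the following Python sketch only illustrates how such a per-pixel information vector could be assembled. The array names are assumptions, and a deliberately simple stand-in texture descriptor is used in place of the SFTA parameters.

```python
import numpy as np

def patch_texture(gray_patch):
    """Placeholder texture descriptor (mean, std, gradient energy).
    The thesis uses SFTA; this simpler stand-in just keeps the sketch runnable."""
    gy, gx = np.gradient(gray_patch.astype(float))
    return np.array([gray_patch.mean(), gray_patch.std(), np.mean(gx**2 + gy**2)])

def build_information_vectors(xyz, image, infrared, window=15):
    """Assemble, for every pixel, the vector described in the text:
    [x, y, z, texture parameters, average color, temperature]."""
    h, w = infrared.shape
    half = window // 2
    gray = image.mean(axis=2)
    vectors = np.zeros((h, w, 3 + 3 + 3 + 1))
    for v in range(half, h - half):
        for u in range(half, w - half):
            patch = image[v - half:v + half + 1, u - half:u + half + 1]
            tex = patch_texture(gray[v - half:v + half + 1, u - half:u + half + 1])
            col = patch.reshape(-1, 3).mean(axis=0)          # average color of the window
            vectors[v, u] = np.concatenate([xyz[v, u], tex, col, [infrared[v, u]]])
    return vectors
```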

The computed vector is then analyzed by an algorithm called IVAA5 in order to extract the information about the soil and the objects. Based on some geometric assumptions and exploiting the coordinates of the points, first the horizontal plane representing the soil and then the objects are computed, so the area accessible for navigation is obtained as the area where no obstacles are present. Subsequently the algorithm crosses the texture, color and temperature information of those points that are considered to be part of an object and compares them with a database. According to how well all the information matches the database entries, a degree of confidence is assigned to each identified object of the scene. A list of all the objects is therefore provided with the following information: the coordinates, the height and width, the material, the temperature and the degree of confidence of the assignment. The thickness of the objects cannot be known through a single "front" view; at least one other "side" view is necessary.
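The actual matching rule and database layout are described later in the thesis and are not reproduced here; this hedged sketch only illustrates the idea of comparing an object's averaged feature vector against database entries and turning the distance into a degree of confidence. The material names and reference values are made up.

```python
import numpy as np

# Hypothetical database: material name -> reference (texture, color, temperature) features.
DATABASE = {
    "basalt":    np.array([0.62, 0.18, 0.35, 90, 85, 80, 0.40]),
    "red sand":  np.array([0.30, 0.05, 0.10, 180, 90, 60, 0.55]),
    "limestone": np.array([0.45, 0.12, 0.20, 200, 195, 185, 0.35]),
}

def classify_object(features, database=DATABASE):
    """Return (material, degree_of_confidence) for an object feature vector.
    Confidence is 1 at a perfect match and decreases with the normalized distance."""
    best_name, best_dist = None, np.inf
    for name, reference in database.items():
        dist = np.linalg.norm((features - reference) / (np.abs(reference) + 1e-9))
        if dist < best_dist:
            best_name, best_dist = name, dist
    confidence = 1.0 / (1.0 + best_dist)    # maps distance to a (0, 1] score
    return best_name, confidence
```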

Finally, the three-dimensional map of the survey environment is reconstructed. The block diagram of the general algorithm implemented is depicted in figure 1.5.

4 Information Vector Algorithm. It is an algorithm developed in this work that computes and collects all the information coming from the sensors and stores them in a vector.

5 Information Vector Analysis Algorithm. It is an algorithm developed in this work that analyzes the information contained in the vector computed in IVA.


Figure 1.5 Block diagram of the general algorithm.

The operational strategy designed for the rover is the following:

• acquisition of one or more pairs of images of the territory in order to reconstruct a digital map.

• calculation of the trajectory to reach an area of known coordinates set by the operators.


• autonomous identification and mapping of the objects during the navigation.

• autonomous deviation from the trajectory to approach an object considered interesting.

It should be specified that in the present work the trajectories for rover navigation are not computed; rather, the information about the surrounding environment that the navigation software needs in order to calculate the trajectories is provided.

1.3 State of the art of the presented technologies

In this paragraph the state of the art of the different areas covered by this project is presented. As previously mentioned, the first step involves stereo vision and the reconstruction of the disparity map. In the last decades stereo vision has been one of the most studied tasks of computer vision and many proposals have been made in the literature on this topic. Today dense stereo techniques are mainly divided into two categories: local approaches and global approaches.

Regarding local approaches, in order to increase the accuracy of disparity estimations, particularly along depth borders, state-of-the-art algorithms deploy a variable support to compute the local matching cost rather than using a fixed square window as in the traditional approach. Conversely, most global methods attempt to minimize an energy function computed on the whole image area. Since this task turns out to be an NP-hard problem, approximate but efficient strategies such as Graph Cuts (GC) [5] and Belief Propagation (BP) [6] have been proposed. In particular, the employment of segmentation information and a plane-fitting model within a BP-based framework turned out to be a very effective approach. In this project, as will be shown in paragraph 2.1.2, a global method based on BP was used.

The second step of the project is the analysis of the texture. The feature extraction method used is the Segmentation-based Fractal Texture Analysis, or SFTA [7]. This method, as will be shown in paragraph 2.2, was chosen on the basis of the results obtained in comparison with traditional methods and other algorithms such as Gabor and Haralick.


1.4 Structure of the thesis

The first part of chapter 2 explains the basic principles of stereoscopic vision, which make it possible to obtain the three-dimensional coordinates of the environment starting from two images with different viewpoints. The camera calibration and the epipolar rectification of the images are then carried out. After the rectification, some algorithms that compute the disparity map are compared with each other and one of them is selected for this work; the criteria for the selection of the stereo system configuration are then shown. In the second part of the chapter the basic principles of texture analysis and the operating principles of the SFTA algorithm, which is used for the extraction of the texture parameters, are explained. It is then shown how the color and the temperature information is treated. At the end of the chapter the strategy used in the IVA algorithm to collect the information listed above in a vector is presented.

Chapter 3 explains how the information is analyzed in the IVAA algorithm in order to recognize the objects and reconstruct the three-dimensional environment. In the first part of the chapter the horizontal plane corresponding to the soil is identified; in the second part the objects are identified and classified; in the third part the detailed environment and the navigation map are reconstructed.

In chapter 4 the results obtained are analyzed and discussed, while the conclusions and future developments are given in chapter 5.


CHAPTER 2

COLLECTING OF OPTICAL AND INFRARED INFORMATION

2.1 Stereo vision

This chapter describes the basic principles of stereoscopic vision. Based on these principles it has been possible to obtain the 3D map of the environment observed.

Stereo vision requires at least two images with different points of view. In fact, by analyzing a single image it is not possible to reconstruct the three-dimensional structure of the observed scene, due to the loss of information inherent in the perspective projection, which maps the points of a 3D space into a 2D space.

Observing a three-dimensional scene from two different viewpoints placed at a finite distance, differences in the relative positions occupied by the objects in the scene are perceived. This effect is highlighted, thanks to the overlaid grid, in figure 2.1. Human eyesight is based on this effect, which inspired the stereoscopic techniques.

Figure 2.2 Epipolar geometry.

Figure 2.3 Example of epipolar line.

In this project two color images are acquired for the stereo analysis. These images are obtained with a single digital camera which is appropriately shifted between the two shots. By investigating the differences between the two photographs it is possible to obtain information about the spatial location of the observed elements. In fact, again for perspective reasons, the closer the objects are to the cameras, the farther apart their positions are in the two pictures. Therefore, in the digital domain, in order to obtain the object displacements it is necessary to find the correspondence between the pixels of the pair of images that represent the same point of the observed three-dimensional scene. To determine this correspondence, the epipolar geometry (figure 2.2) [8,9] is considered. The epipolar geometry, given the location of a point on the first image, imposes

fundamental constraints on the search for that point in the second image. Considering, for example, two cameras with converging optical axes as shown in figure 2.2, the points PL and PR are the projections of the point P on the image planes πL and πR respectively. The plane passing through the points P, OL and OR is called the epipolar plane. Still observing figure 2.2, it can be noted that any point lying on the straight line passing through OL and P has its projection in PL on πL and on the red line on πR. This red line is called the epipolar line and it is the intersection between the epipolar plane and the image plane. Given a system of two cameras, the points on the epipolar line of an image π1 are all those which may correspond to the point selected on the other image plane π2. Therefore, the correspondence with the pixel representing the same point selected on the first image is to be sought in the second image exactly along the epipolar line.

In figure 2.3 an example of epipolar line is depicted. In this example it is easy to note that the points Pi and P all have the same projection on the right image, while they lie along the epipolar line in the left image. The correspondence between the pixel representing the point P in the right image and the pixel representing P in the left image must be sought along this line. It is also easy to see that in the right picture of figure 2.3 the points Pi are occluded by the point P; the occlusion phenomenon is one of the problems of stereo vision and will be discussed later.

Figure 2.4 Perspective projection and standard stereo system. In the left image (a) the perspective projection p of a three-dimensional point P on the two-dimensional image plane π is represented. Actually the image plane π is behind the focal point of the camera O, but it is flipped, so this representation is equivalent.


If the focal axes of the cameras are aligned as in figure 2.4b, the stereo system is called a standard stereo system. In this particular case the epipolar lines are all parallel to the baseline b represented in figure 2.4b. Thanks to this property, the projections of a point P of the three-dimensional scene on the image planes have different u coordinates but the same v coordinate. Therefore the search space of the correspondence problem is one-dimensional, and the search for matching pixels is carried out along the same row v of both images (figure 2.5).

Once the match between the pixels of coordinates (uR, vR) and (uL, vL) of the right and left images, respectively, is determined as in figure 2.5, the disparity is defined as follows:

d = uL − uR

Considering the following hypotheses of a standard stereo system (figure 2.4b):

• reference frames of the cameras aligned in the same direction;
• identical focal lengths f;
• coplanar image planes;

the coordinates of a point P in the two camera frames are related by:

xR = xL − b,  yR = yL,  zR = zL = z

and, considering the perspective projection (figure 2.4a):

u = f · x / z,  v = f · y / z

the following relations are obtained:

uL = f · xL / z,  uR = f · xR / z = f · (xL − b) / z  →  uL − uR = f · b / z

Using the definition of the disparity, this yields the equation that expresses the relation between the disparity of the projections of a point P and its z coordinate:

d = b · f / z

where the focal length f is the distance between the center of the camera O and the center of the image plane o, while b is the baseline of the stereo system (figure 2.4).

Knowing the relationship that binds the disparity and the depth, intended as the z coordinate, it is possible to find the three-dimensional coordinates of each pixel of the images starting from its disparity value. By associating a gray value to each pixel of the original image according to the value of its disparity, the disparity map is generated, from which the three-dimensional coordinates of the objects are obtained. The disparity map is therefore a grayscale image that contains the depth information about the captured scene, as can be seen in figure 2.6.
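As a concrete illustration of the relation d = b · f / z, the sketch below converts a disparity map into camera-frame coordinates; the handling of the principal point and of invalid disparities is an assumption made for the example, not the thesis implementation.

```python
import numpy as np

def disparity_to_xyz(disparity, focal_px, baseline_m, cx, cy):
    """Triangulate a dense disparity map (pixels) into x, y, z coordinates (meters)
    of a standard stereo system, using z = b*f/d and the perspective projection."""
    h, w = disparity.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disparity > 0                       # zero/negative disparities carry no depth
    z = np.full((h, w), np.nan)
    z[valid] = baseline_m * focal_px / disparity[valid]
    x = (u - cx) * z / focal_px                 # inverts u = f*x/z, offset by the principal point
    y = (v - cy) * z / focal_px
    return np.dstack([x, y, z])
```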


Figure 2.7 Depth accuracy.

It is necessary to introduce some properties of stereo vision that affect the choice of the configuration in paragraph 2.1.3.

One property is the depth accuracy, which is defined as the amplitude of the region within which different points P with the same disparity can lie. As can be seen in figure 2.7, near objects have greater accuracy compared to the farthest ones. Moreover, the depth accuracy depends on the baseline, as depicted in the left image of figure 2.8: the larger the baseline, the greater the depth accuracy in the faraway areas. On the other hand, the area covered by both cameras' fields of view is reduced (right image of figure 2.8), which is a drawback, since an object must be present in both fields of view to be identified on the disparity map.

Another property directly dependent on the baseline is the occlusion. This is defined as the possibility that a point P observed by one camera is not observable by the other because of the presence of an object in front of the point P which occludes its view. The choice of the baseline length is therefore a trade-off between the advantages and the disadvantages described above.
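This trade-off can be quantified with a short derivation from d = b · f / z (a standard result, added here for illustration rather than quoted from the text): differentiating with respect to the disparity shows how the depth uncertainty produced by a one-pixel disparity step grows with distance and shrinks with the baseline.

```latex
% Depth resolution for a disparity quantization step \Delta d (typically 1 pixel):
z = \frac{b f}{d}
\;\Rightarrow\;
\left|\frac{\partial z}{\partial d}\right| = \frac{b f}{d^{2}} = \frac{z^{2}}{b f}
\;\Rightarrow\;
\Delta z \approx \frac{z^{2}}{b f}\,\Delta d
```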


2.1.1 Camera calibration and images rectification

Once the stereo system configuration is settled, before computing the disparity map and the three-dimensional coordinates of each pixel it is necessary to perform the calibration of the camera and the rectification of the pair of stereo images. The calibration is carried out in order to obtain the following intrinsic parameters of the camera:

• Focal length [fc]: defines the distance between the image plane of the camera and the optical center of the lens.

• Principal point [cc]: defines the point where the axis of symmetry of the lens intersects the image plane of the camera.

• Skew coefficient [αc]: defines the angle between the x and y axes of the image plane.

• Lens distortion [kc]: describes the deformation due to the otherwise neglected variation of the focal length along the axes of the image plane. In fact, the focal length increases going from the center to the edges of the image plane.

and the following extrinsic parameters of the camera:

• Translation [T] : defines the translation between the reference frame of the camera and the reference frame of the chessboard used for the calibration.

• Rotation [R] : defines the rotation between the reference frame of the camera and the reference frame of the chessboard.

In defining the intrinsic parameters, the distortion model introduced by Brown in 1966 [10,11] is taken into account. Considering a point P of coordinates XC = (XC, YC, ZC) in the reference frame of the camera, its projection on the image plane, consistently with the intrinsic parameters, is the point p of coordinates xp = (xp, yp) in the image plane reference frame, as depicted in figure 2.9. The vector xn, which represents the normalized projection of the point P, is now considered:

xn = [XC / ZC ; YC / ZC] = [x ; y]

Figure 2.9 Camera and image plane reference frames.

Moreover, the squared distance of the normalized point from the optical axis is defined as:

r² = x² + y²

Introducing into the model the image distortion due to the lens of the camera, the vector of the normalized projection of the point P is redefined as follows:

xd = [xd ; yd] = (1 + kc(1) · r² + kc(2) · r⁴ + kc(3) · r⁶) · xn + dx

where dx is the tangential distortion vector:

dx = [ 2 · kc(4) · x · y + kc(5) · (r² + 2 · x²) ;  kc(4) · (r² + 2 · y²) + 2 · kc(5) · x · y ]

and kc is a vector containing the radial (first 3 values) and the tangential (last 2 values) distortion coefficients. The radial distortion is due to the curvature of the lens, while the tangential distortion is due to the imperfect alignment of the lens components.

Applying the distortion model, the coordinates of the point p on the image plane become:

[xp ; yp ; 1] = [KK] · [xd ; yd ; 1] = [ fc(1)  αc · fc(1)  cc(1) ;  0  fc(2)  cc(2) ;  0  0  1 ] · [xd ; yd ; 1]


Figure 2.10 Chessboard reference frame

This coordinate transformation, obtained through the matrix [KK], which is called the Camera Matrix, allows the rectified image, free from distortions, to be obtained.
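A small numerical sketch of the equations above is given below; fc and cc are taken from table 2.1, while the kc and αc values are illustrative placeholders, so the output is only indicative.

```python
import numpy as np

# Calibrated intrinsics (fc, cc from table 2.1; kc and alpha_c are illustrative here).
fc = np.array([580.856, 581.108])       # focal lengths [pixels]
cc = np.array([357.093, 244.026])       # principal point [pixels]
kc = np.array([-0.173, 0.051, 0.0, 0.0013, -0.0013])   # [k1, k2, k3, p1, p2]
alpha_c = 0.0                            # skew, assumed negligible for the example

def project_point(Xc):
    """Project a 3-D point Xc = (X, Y, Z) in the camera frame to pixel coordinates
    using the normalized projection, the Brown distortion model and the camera matrix KK."""
    x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]                  # normalized projection x_n
    r2 = x * x + y * y
    radial = 1 + kc[0] * r2 + kc[1] * r2**2 + kc[2] * r2**3
    dx = np.array([2 * kc[3] * x * y + kc[4] * (r2 + 2 * x * x),
                   kc[3] * (r2 + 2 * y * y) + 2 * kc[4] * x * y])
    xd, yd = radial * np.array([x, y]) + dx              # distorted normalized coordinates
    KK = np.array([[fc[0], alpha_c * fc[0], cc[0]],
                   [0.0,   fc[1],           cc[1]],
                   [0.0,   0.0,             1.0]])
    u, v, w = KK @ np.array([xd, yd, 1.0])
    return u / w, v / w

print(project_point(np.array([0.2, -0.1, 2.0])))         # example point 2 m in front of the camera
```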

As regards the extrinsic parameters, a generic point P with coordinates X = (X, Y, Z) in the chessboard reference frame (figure 2.10) and coordinates XC = (XC, YC, ZC) in the camera reference frame is considered. The two vectors X and XC are bound by the following rigid roto-translation:

XC = [R] · X + T

where the vector T moves the origin of the chessboard reference frame to that of the camera and the matrix [R] rotates the chessboard reference frame, aligning it with that of the camera.

For the camera calibration the MATLAB® Camera Calibration tool was used. This tool computes, through an iterative method, the intrinsic and extrinsic parameters of the camera. The images on which the tool works must depict a chessboard of known size, oriented in different ways. Since the calculation method of the parameters is iterative, it is better to exploit many images: the options of the tool recommend using between 10 and 20 images. For this project 20 images depicting a chessboard formed by 6 × 8 square boxes of 24 mm side are used (figure 2.11).
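The thesis uses the MATLAB Camera Calibration tool; as a freely available equivalent, the sketch below estimates the same kind of intrinsic parameters with OpenCV from a folder of chessboard images. The folder name and the inner-corner count are placeholders, since only the 24 mm square size is stated in the text.

```python
import glob
import cv2
import numpy as np

pattern = (7, 5)     # inner corners per row/column (hypothetical for this chessboard)
square = 0.024       # square side [m]

objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points, size = [], [], None
for name in glob.glob("calibration/*.jpg"):
    gray = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Returns the camera matrix KK (focal lengths, principal point) and distortion vector kc.
rms, KK, kc, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, size, None, None)
print("reprojection error:", rms)
print("camera matrix:\n", KK)
print("distortion coefficients:", kc.ravel())
```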


Figure 2.11 Set of the images acquired for the camera calibration.

The MATLAB® tool automatically identifies the chessboard vertices and defines its reference frame (figure 2.12), then computes the parameters of the camera, which are reported in table 2.1.

Intrinsic parameters:
fc = [580.856 ; 581.108]
cc = [357.093 ; 244.026]
kc = [−0.173 ; 0.051 ; 0.563 ; 0.0013 ; −0.0013]
αc = 0.091

Table 2.1 Computed intrinsic parameters of the camera

Figure 2.12 Example of vertices identification of an image.



Figure 2.13 Chessboard spatial arrangements in the camera-centric view.

In figure 2.13 the spatial arrangement of the chessboards of all 20 images in the camera reference frame is shown.

Looking at the rectified image in figure 2.14, the deformations generated by the lens distortion are clear, especially along the edges of the image. Moreover, the rectification process, by eliminating the distortion of the lens, gives the real projection of the objects on the image plane; on the contrary, if the images are not rectified, the pixel coordinates computed as described in paragraph 2.4 are not exactly the real ones. Despite these considerations, it was decided not to rectify the images used in this project through the calibration process.

Figure 2.14 Image before (on the left) and after (on the right) the rectification processes.


This decision was based on two considerations. The first is that the purpose of this work is not to create an extremely precise three-dimensional map in terms of object coordinates, but to identify and classify the objects while also providing safe navigation; these purposes can be achieved without the precision provided by image rectification. The second consideration is that, since the pairs of stereo images are acquired with a single camera moved manually, it was considered pointless to pursue precision through image rectification: on the basis of many tests, the rectification of the single images did not improve the disparity map computation.

Instead, to improve the disparity map computation, a rectification process different from the one discussed so far, which rectifies the pair of stereo images jointly, was carried out. In this rectification method, called epipolar rectification, the epipolar lines of both images are found and, if they are not horizontal, they are made so. In this way a stereo system that is not exactly standard, due to instrument inaccuracies, becomes standard, facilitating the computation of the disparity map.

Figure 2.15 Epipolar rectification. Stereo pair not rectified, above; stereo pair rectified through the epipolar rectification, below. The yellow lines represent the epipolar lines.
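ER9b itself is not reproduced here; the sketch below shows the same idea, i.e. making the epipolar lines of an almost-standard stereo pair horizontal, using OpenCV's uncalibrated rectification. The feature type, match count and thresholds are arbitrary choices for the example.

```python
import cv2
import numpy as np

def epipolar_rectify(left, right):
    """Rectify an uncalibrated stereo pair so that its epipolar lines become horizontal."""
    g1 = cv2.cvtColor(left, cv2.COLOR_BGR2GRAY) if left.ndim == 3 else left
    g2 = cv2.cvtColor(right, cv2.COLOR_BGR2GRAY) if right.ndim == 3 else right

    orb = cv2.ORB_create(4000)
    k1, d1 = orb.detectAndCompute(g1, None)
    k2, d2 = orb.detectAndCompute(g2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:500]
    pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches])

    # Fundamental matrix from the matches, then homographies that level the epipolar lines.
    F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
    pts1, pts2 = pts1[inliers.ravel() == 1], pts2[inliers.ravel() == 1]

    h, w = left.shape[:2]
    ok, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F, (w, h))
    return cv2.warpPerspective(left, H1, (w, h)), cv2.warpPerspective(right, H2, (w, h))
```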

Figure 2.16 Comparison of the disparity maps obtained from the stereo pair of figure 2.15 after the rectification. The disparity map on the left is computed using the original images, the map at the center using the rectified images, and the map on the right using the epipolar rectification.

The epipolar rectification is performed by the ER9b (Epipolar Rectification 9b) algorithm developed in [12], with the results shown in figure 2.15. During the rectification the images are slightly rotated and zoomed. Care must be taken with this effect because it further falsifies the three-dimensional reconstruction of the scene; the better the images of the stereo pair are aligned, the less they are modified by the epipolar rectification. This effect will be further commented on in the following paragraph.

In figure 2.16 the disparity maps obtained using different pairs of stereo images are compared: not rectified, rectified, and rectified with the epipolar rectification. The best results are obtained with the epipolar rectification, which is therefore used in this project.

2.1.2 Algorithms for the Disparity Map computation

The development of an algorithm that computes the disparity map starting from a pair of stereo images acquired by a stereoscopic system goes beyond the scope of this work. It was therefore decided to use an available algorithm, appropriately chosen according to its robustness and precision.

In the last decades stereo vision has been one of the most studied tasks of computer vision and many proposals have been made in the literature on this topic. The problem of stereo correspondence can be formulated as follows: given a pair of rectified stereo images, one needs to find for each point pr of the reference image its correspondent pt on the target image, which, due to the epipolar constraint, lies on the same scan line as pr and within the disparity range D = [dmin ; dmax].

Today stereo techniques are mainly divided into two categories: local approaches and global approaches [13].

Traditional local approaches search for the pixel intensity correspondence along the epipolar line, as in the Pixel-to-Pixel algorithm, or within a fixed square window centered on the considered pixel, minimizing a cost function such as:

C(x, y, d) = | Ir(x, y) − It(x + d, y) |

where Ir(x, y) is the intensity of the reference image pixel (left image) of coordinates (x, y) and It(x + d, y) is the intensity of the target image pixel (right image) of coordinates (x + d, y), as can be seen in figure 2.17.

Figure 2.17 Disparity computation method. Scan line strategy on the top and cost function trend on the bottom (source: [13]).
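Purely to make the cost function above concrete, here is a deliberately naive fixed-window matcher (windowed sum of absolute differences with winner-takes-all selection along the scan line); it is much simpler than the DMAG algorithms compared below and is meant only as a reference implementation of the traditional local approach.

```python
import cv2
import numpy as np

def sad_block_matching(left, right, d_min, d_max, half_win=4):
    """Winner-takes-all disparity from a rectified grayscale stereo pair.
    For each reference (left) pixel, the disparity d minimizing the windowed
    sum of |I_r(x, y) - I_t(x + d, y)| is selected (naive local approach)."""
    h, w = left.shape
    left_f = left.astype(np.float32)
    costs = []
    for d in range(d_min, d_max + 1):
        shifted = np.full((h, w), 255, np.float32)       # high cost where no overlap exists
        if d >= 0:
            shifted[:, :w - d] = right[:, d:]
        else:
            shifted[:, -d:] = right[:, :w + d]
        diff = np.abs(left_f - shifted)
        # averaging over the fixed square window plays the role of the window sum
        costs.append(cv2.blur(diff, (2 * half_win + 1, 2 * half_win + 1)))
    return d_min + np.argmin(np.stack(costs), axis=0)
```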

In order to increase the accuracy of disparity estimations, particularly along depth borders, state-of-the-art algorithms deploy a variable support window to compute the local matching cost. Local methods are faster than global methods, which is why they are quite popular; however, the disparity maps obtained using local matching methods typically suffer from a lack of smoothness, and that is where global methods come in [13,14].

Conversely, global approaches attempt to minimize a cost function computed on the whole stereo pair [14]. This cost function is defined as the combination of a data energy and a smoothness energy:

E(d) = Edata(d) + Esmooth(d)

For this approach, approximate but efficient strategies such as Graph Cuts (GC) and Belief Propagation (BP) have been proposed to solve the global minimization problem. In this paragraph the results obtained by three algorithms based on different approaches are compared. The pair of images used for the first test are those of figure 2.18, which have a size of 720x480 pixels. These kinds of images were chosen because they expose the problems of the algorithms. The disparity range used for all the algorithms is given by ER9b and it is D = [35 ; 70].

The first algorithm tested is DMAG5 (Depth Map Automatic Generator 5), a local method that uses a fast, edge-preserving cost-volume filter which acts as an adaptive support window and speeds up the calculation [15].

The second algorithm tested is DMAG6, a global method that relies on a very efficient implementation of multi-scale belief propagation to minimize the global cost function [6].

The last tested algorithm is DMAG7, a global method that uses a bilateral filter in the smoothness term of the cost function. The bilateral filter is an edge-preserving filter, which blurs along object edges but not across them [17].

Both DMAG5 and DMAG6 return two disparity maps and two occlusion maps, one for the left image and one for the right image, while DMAG7 returns only the left disparity map. The computed disparity maps are shown in figure 2.19.


The result of the DMAG7 algorithm is the worst: because of the edge-preserving filter, some of the computed disparities are forced by the color edges of the objects, generating errors. The results of DMAG5 and DMAG6 are very similar, but the disparity map of DMAG6 is slightly better at preserving the object edges and in its accuracy along the inclined plane. A second test, using the rectified pair of images of figure 2.15, is performed (figure 2.20). The environment captured by these images was set up with the type of environment in which the rover will work in mind: it is composed of stones of various sizes and shapes placed on a red sand soil. The disparity range used in this second test for all the algorithms, again given by ER9b, is D = [−5 ; 40]; the considerations to be made on the results are the same as those for the first test.

Figure 2.21 Occlusion maps of the left disparity maps of DMAG5 and DMAG6 in test 1.

In figure 2.21 the occlusions of the left disparity maps of DMAG5 and DMAG6 in test 1 are shown. Comparing them with the disparity maps depicted in figure 2.19, it can be seen that where occlusions are present (black points on the occlusion maps), such as between the objects and at the sides of the images, the disparity map is inaccurate, because in the absence of information it is computed based on the latest correspondence found. The computational times of the presented tests are reported in the following table:

Computational time [s]   DMAG5   DMAG6   DMAG7
Test 1                    14.06   30.06    6.48
Test 2                    12.23   28.73    7.80

Table 2.2 Computational times of the algorithms that compute the disparity map.

All tests were conducted using a PC with an Intel(R) Core(TM) i5-2410M CPU at 2.30 GHz and 4 GB of RAM.

DMAG7 is the fastest, partly because it computes only one disparity map instead of two like the other algorithms, so its computation time is roughly halved; however, it is too inaccurate and is therefore discarded. DMAG5 takes half the time of DMAG6 to compute the maps, which are, however, slightly less accurate than those of DMAG6. Since the difference between the computation times of DMAG5 and DMAG6 is not very large when compared with that of the entire final algorithm shown in chapter 4, it was decided to use DMAG6, which provides a more accurate disparity map.

Figure 2.22 Disparity map computed by DMAG6 on the left and the same disparity map smoothed by EPS9 on the right.

After choosing the algorithm, the disparity map computed is smoothed in order to make it more uniform and accurate on the objects edges through the use of another algorithm, the EPS96 (Edge Preserving Smoothing 9) which is a variant of the bilateral filter [16]. The result of this procedure is shown in the figure 2.22. This algorithm uses as input the disparity map and the original image which is used to smoothing the edges. The computational time of EPS9 in the example of figure 2.22 is 10.40 seconds. It was also considered to use EPS9 on the disparity map of DMAG5 but to get the same result of DMAG6 much more time for the smoothing was needed.

A further advantage of EPS9 is that, since the disparity map is smoothed using the original captured image rather than the rectified one, the smoothed disparity map is not affected by the shifts of the objects introduced by the rectification of the images discussed in the previous paragraph, eliminating this kind of error.

In conclusion, in this project DMAG6 was chosen to compute the disparity map, which is then smoothed by EPS9.

6 All the algorithms cited in this paragraph (DMAG5, DMAG6, DMAG7, EPS9, ER9b) are implementations of [15, 6, 17, 16, 12] developed by Ugo Capeto. They were all used with their default parameters, apart from the disparity range of the DMAG algorithms, which is reported in the text.


2.1.3 Stereo system configuration

In this paragraph the selection criteria for the positioning of the cameras on the rover are presented and the resulting field of view of the system is computed.

First of all, the optical properties of the camera are considered. The lens used is the Canon EF-S 18-55 mm, set at a focal length of 18 mm. With this focal length, the horizontal field of view of the lens is $\beta_h = 65.5°$ and the vertical field of view is $\beta_v = 45.5°$. Moreover, the focal length expressed in pixels is $f = 581$ pixels, as computed during the calibration process using an image resolution of 720x480 pixels. In the following configuration phase, three constraints are considered:

• The first constraint requires that the width $w_1$ of the stereo field of view at its near edge is greater than the width of the rover, $W_r = 1.5$ m (see figure 2.23).

• The second constraint concerns the computational cost of the disparity map: in order to reduce the computational time, a maximum search disparity of $d_{max} = 80$ pixels is imposed on the algorithm.

• The third constraint concerns the depth accuracy $\Delta z$: a minimum accuracy of 0.3 m in the definition of the z coordinate is imposed (see paragraph 2.1).


Figure 2.24 Rover and field of view of the stereo system (lateral view).

Based on a rover with sizes similar to those of the Spirit and Opportunity rovers, the following values of height and tilt of the camera with respect to the ground were imposed (see figure 2.24):

$h = 1.1 \text{ m}, \qquad \theta = 75°$

Now the only parameter that remains to be defined is the baseline $b$. Considering the first constraint, $w_1 > W_r$, the following trigonometric relation between the segment $w_1$ and the baseline $b$ holds:

$w_1 = 2\,r_1 \tan\!\left(\dfrac{\beta_h}{2}\right) - b$

where $r_1$ is the minimum distance between the soil and the camera within the image field of view, assuming the soil to be a horizontal plane (figure 2.24); its value is:

$r_1 = \dfrac{h}{\cos\!\left(\theta - \dfrac{\beta_v}{2}\right)} = 1.797 \text{ m}$

Substituting these equations into the disequation of the first constraint, the following relation is obtained:


$\dfrac{2h}{\cos\!\left(\theta - \dfrac{\beta_v}{2}\right)} \tan\!\left(\dfrac{\beta_h}{2}\right) - b > W_r$

Substituting all the known values into the above relation, the first constraint on the baseline $b$ is obtained:

$b < 0.811 \text{ m}$

Then, considering the second constraint and the disparity equation introduced in paragraph 2.1, the following disequation is obtained:

$z_1 > \dfrac{b\,f}{d_{max}}$

where $z_1$ is the projection of the segment $r_1$ onto the focal axis of the camera, i.e. the z coordinate, in the camera reference frame shown in figure 2.4, of the closest point to the camera in its field of view. Rewriting the previous disequation, it becomes:

$r_1 \cos\!\left(\dfrac{\beta_v}{2}\right) > \dfrac{b\,f}{d_{max}}$

from which, substituting all the known values, the second constraint on the baseline is evaluated:

$b < 0.228 \text{ m}$

Relying on the fact that the larger the baseline, the better the depth accuracy, the highest value allowed by the two constraints just analyzed is assigned. Considering a margin of safety, the value of the baseline is therefore set to:

$b = 0.22 \text{ m}$
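As a quick numerical check, the sketch below re-evaluates the two baseline bounds derived in this paragraph from the stated values (lengths in meters, angles converted to radians); it reproduces the limits of about 0.811 m and 0.228 m.

```python
import math

# Values stated in this paragraph (meters and degrees)
h      = 1.1                     # camera height above the ground
theta  = math.radians(75.0)      # camera tilt with respect to the ground
beta_h = math.radians(65.5)      # horizontal field of view of the lens
beta_v = math.radians(45.5)      # vertical field of view of the lens
f_px   = 581.0                   # focal length in pixels (720x480 calibration)
W_r    = 1.5                     # rover width
d_max  = 80.0                    # maximum search disparity (constraint 2)

# Minimum soil-camera distance inside the field of view (flat-ground assumption)
r1 = h / math.cos(theta - beta_v / 2.0)              # ~1.797 m

# Constraint 1: near edge of the common field of view wider than the rover
b_max_fov = 2.0 * r1 * math.tan(beta_h / 2.0) - W_r  # ~0.811 m

# Constraint 2: disparity of the closest visible point not above d_max
z1 = r1 * math.cos(beta_v / 2.0)
b_max_disp = z1 * d_max / f_px                       # ~0.228 m

b = 0.22  # chosen baseline, below both bounds with a small margin
print(round(r1, 3), round(b_max_fov, 3), round(b_max_disp, 3), b)
```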


Finally, using this value of the baseline, the limit distance at which the depth coordinate z still has the accuracy required by the third constraint is calculated. Combining the equation of the depth accuracy with the third constraint of the system configuration, the following disequation is obtained:

$\Delta z = \dfrac{b\,f}{d} - \dfrac{b\,f}{d+1} < 0.3 \text{ m}$

Solving the above disequation as a function of the disparity $d$, the minimum disparity value that satisfies it is $d_{min} = 21$, which yields:

$z_2 = \dfrac{b\,f}{d_{min}} = 6.308 \text{ m}$

where $z_2$, as in the previous case, is the projection of the segment $r_2$ onto the focal axis of the camera; in this case, however, the approximation $r_2 \cong z_2$ is made, justified by the small angle between the segment $r_2$ and the focal axis of the camera (figure 2.24). All the remaining sizes of the survey area, delimited with a red line in figure 2.23, can now be computed:

$d_1 = \sqrt{r_1^2 - h^2} = 1.42 \text{ m} \qquad\qquad d_2 = \sqrt{r_2^2 - h^2} = 6.21 \text{ m}$

$w_1 = 2\,r_1 \tan\!\left(\dfrac{\beta_h}{2}\right) = 2.31 \text{ m} \qquad\qquad w_2 = 2\,r_2 \tan\!\left(\dfrac{\beta_h}{2}\right) = 8.11 \text{ m}$
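Continuing the previous sketch, the minimum disparity required by the third constraint and the survey-area sizes can be checked in the same way; here $r_2$ is taken directly from the value reported above rather than re-derived.

```python
import math

b, f_px = 0.22, 581.0             # baseline [m] and focal length [pixels]
h       = 1.1                     # camera height [m]
beta_h  = math.radians(65.5)      # horizontal field of view
dz_req  = 0.3                     # required depth accuracy [m]

# Smallest integer disparity d such that b*f/d - b*f/(d+1) < dz_req
d_min = 1
while b * f_px / d_min - b * f_px / (d_min + 1) >= dz_req:
    d_min += 1
print(d_min)                      # -> 21

# Survey-area sizes, using r1 from the previous sketch and r2 as reported above
r1, r2 = 1.797, 6.308
d1 = math.sqrt(r1**2 - h**2)      # ~1.42 m, ground distance to the near edge
d2 = math.sqrt(r2**2 - h**2)      # ~6.21 m, ground distance to the far edge
w1 = 2.0 * r1 * math.tan(beta_h / 2.0)   # ~2.31 m, width at the near edge
w2 = 2.0 * r2 * math.tan(beta_h / 2.0)   # ~8.11 m, width at the far edge
print(round(d1, 2), round(d2, 2), round(w1, 2), round(w2, 2))
```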

It is interesting to analyze what happens if the second constraint, regarding the computational cost, is removed. Keeping the same configuration of $h$ and $\theta$, while letting the baseline increase thanks to the removal of the second constraint, the results depicted in figure 2.25 for three different image resolutions are obtained. The figure shows a quadratic behavior: to double the distance $d_2$ at which the same depth accuracy $\Delta z$ is achieved, it is necessary to quadruple the image resolution, and consequently the maximum disparity $d_{max}$, which grows linearly with the resolution at a fixed baseline, also quadruples. The saw-tooth behavior of the curves is due to the fact that the value of $d_{min}$ obtained from the depth-accuracy disequation is a discrete quantity that increases stepwise with the baseline in order to keep satisfying the disequation; each time $d_{min}$ increases, the depth accuracy first decreases and then increases again.

Figure 2.25 Trend of $d_2$ with respect to the baseline, using $\Delta z = 0.3$ m.

It is also interesting to note that doubling the baseline while keeping the same image resolution has the same effect on the depth accuracy as doubling the image resolution while keeping the same baseline. Using an image resolution of 2880x1920 pixels with a baseline of 0.811 m, a depth accuracy of 0.3 m is obtained up to about 25 meters, with a maximum disparity value of $d_{max} = 1050$, which hugely increases the computational cost of the disparity map.
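The trade-off summarized in figure 2.25 can be reproduced qualitatively with a short sweep: for each baseline and image resolution, the focal length in pixels is scaled with the horizontal resolution, the minimum disparity satisfying $\Delta z < 0.3$ m is found, and the corresponding ground distance $d_2$ is computed. This is only a sketch of the procedure under the same relations used above, not the code used to generate the figure.

```python
import math

h       = 1.1                      # camera height [m]
dz_req  = 0.3                      # required depth accuracy [m]
f_base, width_base = 581.0, 720.0  # calibrated focal length at 720x480 resolution

def far_ground_distance(b, width_px):
    """Ground distance d2 at which the depth accuracy is still within dz_req."""
    f_px = f_base * width_px / width_base     # focal length scales with resolution
    d = 1
    while b * f_px / d - b * f_px / (d + 1) >= dz_req:
        d += 1                                # discrete jumps cause the saw-tooth
    z2 = b * f_px / d                         # depth of the farthest compliant point
    return math.sqrt(max(z2**2 - h**2, 0.0))  # small-angle approximation r2 ~ z2

# Sweep the baseline for three image resolutions (720x480, 1440x960, 2880x1920)
for width in (720.0, 1440.0, 2880.0):
    curve = [(round(b, 2), round(far_ground_distance(b, width), 2))
             for b in (0.05 * k for k in range(1, 17))]   # baselines 0.05..0.80 m
    print(int(width), curve)
```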
