
5.5 Box stacking simulations

5.5.1 MM analysis

In this section, the positive effect of the action term (5.4) on the structure of the latent space is demonstrated, thus answering the questions listed in point 1). In detail, first, the influence of the dynamic minimum distance dm on the LSR performance is investigated. Then, the structure of the latent space is studied by analyzing the distance between the encodings of the 288 states. Lastly, the influence of the latent dimension on the LSR performance is examined.

Influence of dynamic dm

A key parameter in the action term (5.4) is the minimum distance dm encouraged among the action pairs. The approach proposed in Section 5.3 is validated using the hard box stacking task. In detail, the dynamic increase of dm (see Figure 5.6), which reaches dm = 2.3 at the end of the training, is compared with (i) the choice in [153], which fixes dm to the average action distance among all the training action pairs measured in the corresponding baseline VAE, resulting in dm = 11.6, and (ii) the case where dm is fixed to a constant value dm = 100, significantly higher than the one obtained in (i).
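As a rough illustration of option (i), the fixed dm can be read as the mean distance between the two latent codes of each training action pair, encoded with the baseline VAE. The following is a minimal sketch under stated assumptions: the function name `average_action_distance`, the array layout (one row per action pair), and the default L1 norm are illustrative choices and are not taken from [153] or from this work.

```python
import numpy as np


def average_action_distance(z_before, z_after, ord=1):
    """Mean latent distance over action pairs (hypothetical helper).

    z_before, z_after: arrays of shape (n_pairs, latent_dim) holding the
    encodings of the two observations of each action pair.
    ord: norm order; L1 is an assumption made for this sketch.
    """
    z_before = np.asarray(z_before, dtype=float)
    z_after = np.asarray(z_after, dtype=float)
    # One distance per action pair, then the average over all pairs.
    pair_distances = np.linalg.norm(z_before - z_after, ord=ord, axis=1)
    return float(pair_distances.mean())


# Toy usage with two action pairs in a 2-D latent space.
d_m_fixed = average_action_distance([[0, 0], [1, 1]], [[1, 2], [1, 4]])
```

A value computed this way depends on the latent scale of the baseline VAE, which is one possible reason why it need not suit the model trained with the action term.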

The performance of the LSR, evaluated using full scoring, with respect to the choice of dm is summarized in Table 5.1. The results show that the dynamic selection of dm significantly outperforms the other methods on all the scores. This approach not only eliminates the need for training the baseline VAEs as in [153] but also reaches a value of dm that yields a better separation of the covered regions Z_sys^i without compromising the optimization of the reconstruction and KL terms.

Method            dm     % All         % Any         % Trans.
Dynamic (Prop.)   -      90.9 ± 3.5    92.1 ± 2.9    95.8 ± 1.3
Baseline [153]    11.6   48.7 ± 40.7   50.7 ± 41.5   72.8 ± 26.0
High              100    0.0           0.0           14.6 ± 2.0

Table 5.1: Comparison of the LSR performance when using different methods for selecting dm. The top row is the proposed approach, where dm is determined dynamically; the middle row is the approach used in [153], where dm is calculated from the latent action pairs in the baseline VAE; in the bottom row, dm is set to a high value. Results are obtained using VAE12-hs-L1 and LSR-L1.

Separation of the states

The effect of the action loss (5.4) on the structure of the latent space is investigated by analyzing the separation of the latent points z ∈ Tz corresponding to different underlying states of the system. Recall that images in TI containing the same state will look different because of the introduced positioning noise in both tasks, and the different lighting conditions in the case of hs.

Let ¯zs be the centroid for state s defined as the mean point of the training latent samples {zs,i}i ⊂ Tz associated with the state s.

Let dintra(zs,i, ¯zs) be the intra-state distance defined as the distance between the latent sample i associated with the state s, namely zs,i, and the respective centroid ¯zs. Similarly, let dinter(¯zs, ¯zp) denote the inter-state distance between the centroids ¯zs and ¯zp of states s and p, respectively. In the following analysis, the models VAE12-ns-L1 and VAE12-hs-L1 are considered for the normal and hard stacking tasks, respectively.
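The definitions above can be sketched in a few lines of NumPy. This is only an illustrative reading: the function name `state_distances`, the array layout, and the default L1 norm are assumptions made for this example, since the distance metric is not restated in this section.

```python
import numpy as np


def state_distances(z, states, ord=1):
    """Centroids, intra-state and inter-state distances (hypothetical helper).

    z: array of shape (n_samples, latent_dim) with the latent samples.
    states: array of shape (n_samples,) with the state label of each sample.
    ord: norm order; L1 is an assumption made for this sketch.
    """
    z = np.asarray(z, dtype=float)
    states = np.asarray(states)
    labels = np.unique(states)  # sorted unique state labels
    # Centroid of a state: mean of the latent samples with that label.
    centroids = np.stack([z[states == s].mean(axis=0) for s in labels])
    # Intra-state distance: each sample to the centroid of its own state.
    own = np.searchsorted(labels, states)
    d_intra = np.linalg.norm(z - centroids[own], ord=ord, axis=1)
    # Inter-state distance: every centroid to every other centroid.
    diff = centroids[:, None, :] - centroids[None, :, :]
    d_inter = np.linalg.norm(diff, ord=ord, axis=2)
    return labels, centroids, d_intra, d_inter


# Toy usage: two states, two samples each, in a 2-D latent space.
z_toy = [[0, 0], [2, 0], [10, 0], [10, 2]]
s_toy = [0, 0, 1, 1]
labels, centroids, d_intra, d_inter = state_distances(z_toy, s_toy)
```

Per-state means and standard deviations, such as those plotted per state, would then follow by grouping `d_intra` by label and by reading the off-diagonal entries of each row of `d_inter`.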

Figure 5.9 reports the mean values (bold points) and the standard deviations (thin lines) of the inter- (in blue) and intra-state (in orange) distances for each state s ∈ {1, ..., 288} in the normal stacking task when using the baseline model VAE12-ns-b (top) and the action model VAE12-ns-L1 (bottom). In the case of the baseline VAE, similar intra-state and inter-state distances are obtained. This implies that samples of different states are encoded close together in the latent space, which can raise ambiguities when planning. On the contrary, when using VAE12-ns-L1, the inter- and intra-state distances approach the values 5 and 0, respectively. These values are imposed with the action term (5.4) as the minimum distance dm reaches 2.6. Therefore, even when there exists no direct link between two samples of different states, and thus the action term for the pair is never activated, the VAE is able to encode them such that the desired distances in the latent space are respected.

[Figure 5.9 plot: panels VAE12-ns-b (top) and VAE12-ns-L1 (bottom); horizontal axis: state s.]

Figure 5.9: Mean values (bold points) and standard deviations (thin lines) of inter- (blue) and intra- (orange) state distances for each state calculated using a VAE trained with (bottom) and without (top) action term. Results are evaluated using the VAE12-ns-L1 model on the normal stacking task.

Similar conclusions also hold in the hard stacking task, for which the inter- (in blue) and intra-state (in orange) distances are depicted in Figure 5.10. The top row again shows the distances calculated with the baseline model VAE12-hs-b, while the ones shown in the bottom row are obtained with the action model VAE12-hs-L1. Note that a lower inter-state distance is obtained than in the case of ns because the minimum distance dm reaches only 2.3 in hs.

[Figure 5.10 plot: panels VAE12-hs-b (top) and VAE12-hs-L1 (bottom); horizontal axis: state s.]

Figure 5.10: Mean values (bold points) and standard deviations (thin lines) of inter- (blue) and intra- (orange) state distances for each state calculated using a VAE trained with (bottom) and without (top) action term. Results are evaluated using the VAE12-hs-L1 model on the hard stacking task.

Finally, the difference between the minimum inter-state distance and the maximum intra-state distance is analyzed for each state. The higher the value, the better the separation of states in the latent space, since samples of the same state are in this case closer to each other than samples of different states. When the latent states are obtained using the baseline VAE12-ns-b, a non-negative difference is obtained for 0/288 states, with an average value of ≈ −1.2. This implies that only weak separation occurs in the latent space for samples of different states. On the other hand, when calculated on points encoded with VAE12-ns-L1, the difference becomes non-negative for 284/288 states and its mean value increases to ≈ 0.55, thus achieving almost perfect separation. In the hard stacking task, it is similarly obtained that VAE12-hs-b reaches an average difference of −5.86 (being non-negative for 0/288 states), while the action model VAE12-hs-L1 raises the average difference to −0.04 (being non-negative for 121/288 states). These results validate that the action term (5.4) and the dynamic setting of dm contribute to a better structured latent space Z_sys, and confirm the higher level of complexity of the hard task compared to the normal one.
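The per-state separation measure used above can be sketched as follows. This is one possible reading, in which the minimum is taken over the distances from the centroid of a state to all other centroids, and the maximum over the distances of that state's samples to its own centroid. The function name `separation_margin`, the array layout, and the default L1 norm are assumptions made for this illustration.

```python
import numpy as np


def separation_margin(z, states, ord=1):
    """Min inter-state minus max intra-state distance, per state.

    Hypothetical helper; needs at least two distinct states.
    z: array of shape (n_samples, latent_dim) with the latent samples.
    states: array of shape (n_samples,) with the state label of each sample.
    ord: norm order; L1 is an assumption made for this sketch.
    """
    z = np.asarray(z, dtype=float)
    states = np.asarray(states)
    labels = np.unique(states)
    centroids = np.stack([z[states == s].mean(axis=0) for s in labels])
    margins = np.empty(len(labels))
    for k, s in enumerate(labels):
        # Largest distance of a sample of state s to its own centroid.
        members = z[states == s]
        max_intra = np.linalg.norm(members - centroids[k], ord=ord, axis=1).max()
        # Smallest distance from this centroid to any other centroid.
        others = np.delete(centroids, k, axis=0)
        min_inter = np.linalg.norm(others - centroids[k], ord=ord, axis=1).min()
        margins[k] = min_inter - max_intra
    return margins


# Toy usage: two well separated states in a 2-D latent space.
margins = separation_margin([[0, 0], [2, 0], [10, 0], [10, 2]], [0, 0, 1, 1])
n_non_negative = int((margins >= 0).sum())
mean_margin = float(margins.mean())
```

Counting the non-negative entries and averaging `margins` gives the two kinds of summary figures quoted in the text (for example, a count out of 288 states and a mean difference).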

Latent space dimension

The choice of the latent space dimension is not a trivial task, since it should be as low as possible while still encoding all the relevant features. Table 5.2 reports the partial scoring of the LSR on both normal and hard stacking tasks using VAE models with various latent dimensions. The results demonstrate an evident drop in the performance when the latent dimension is too small, such as ld = 4. As ld increases, gradual improvements in the performance are observed, and a satisfactory level is achieved using ld ≥ 6 for ns and ld ≥ 12 for hs. Therefore, hs requires more dimensions in order to capture all the relevant and necessary features. This result not only reflects the complexity of each task version but also justifies the choice ld = 12 in the rest of the experiments.


ld    ns [%]          hs [%]
4     7.9 ± 2.2       8.8 ± 7.9
6     99.96 ± 0.08    56.2 ± 23.1
8     99.96 ± 0.08    62.7 ± 18.7
12    100.0 ± 0.0     92.1 ± 2.9
16    100.0 ± 0.0     95.9 ± 1.4
32    97.5 ± 4.33     96.4 ± 0.4
64    99.8 ± 0.5      95.3 ± 0.9

Table 5.2: Comparison of the LSR performance when using different latent dimensions for the normal (left) and hard (right) box stacking tasks.