
Academic year: 2021


Carlo Bertinetto, Degree thesis


5. Conclusions

Compared to the results of the previous work [28], in these experiments we expanded the data set with a number of cyclic polymers. A larger sample usually helps prediction, provided the added compounds are not too different from those already present. In our case the expansion was rather heterogeneous: the phenylic polymers contained groups that were absent from the acyclic ones and therefore could not be learnt from them. On the other hand, acyclic compounds share many groups with the cyclic ones, so the former can be learnt through the latter. This explains why the acyclic samples show better results in MAE and S than ref. [28] (with the sole exception of exp. 1), and the phenylic ones worse. The overall outcome is nevertheless about equal to ref. [28] and has not worsened despite the heterogeneity of the data set. The method is hence robust.
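The per-class comparison described above can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the function names and the sample values are hypothetical, and only MAE is shown because S is not defined in this section.

```python
from statistics import mean


def mean_absolute_error(targets, predictions):
    """Mean of |target - prediction| over paired values."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return mean(abs(t - p) for t, p in zip(targets, predictions))


def mae_by_class(samples):
    """Group (class_label, target, prediction) tuples by class and
    return a dict mapping each class label to its MAE."""
    grouped = {}
    for label, target, prediction in samples:
        targets, predictions = grouped.setdefault(label, ([], []))
        targets.append(target)
        predictions.append(prediction)
    return {
        label: mean_absolute_error(targets, predictions)
        for label, (targets, predictions) in grouped.items()
    }


# Hypothetical values, for illustration only (not data from the thesis).
samples = [
    ("acyclic", 250.0, 255.0),
    ("acyclic", 300.0, 290.0),
    ("phenylic", 400.0, 420.0),
    ("phenylic", 380.0, 350.0),
]
mae = mae_by_class(samples)
```

Reporting the error separately for each class, rather than only overall, is what makes it possible to see that one class improved while the other worsened even when the overall figure is unchanged.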

From what we observed while trying different adjustments to the representation, the RecNN seems to provide better predictions after simple fragment unifications that affect a very large number of samples. The unifications of ‘CH2’ + ‘CH2’ and ‘C=O’ + ‘O’, which involved many polymers in the data set, indeed improved both the results and the computing efficiency. By contrast, when we used more complex fragment representations with the aim of improving a limited number of polymers, as we did in exp. 6 and 7 for acids and amides, the RecNN seemed to incur more problems than benefits: the samples we were focusing on did not improve, and the overall outcome was poorer.
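To make the idea of fragment unification concrete, here is a small sketch. It assumes, purely for illustration, that a repeating unit is written as a flat list of fragment labels; the representations used with the RecNN are tree-structured, and the function name and the merged labels `CH2CH2` and `COO` are hypothetical.

```python
def unify_adjacent(fragments, pair, merged_label):
    """Replace each non-overlapping adjacent occurrence of `pair`
    (scanning left to right) with the single label `merged_label`."""
    first, second = pair
    result = []
    i = 0
    while i < len(fragments):
        is_pair = (
            i + 1 < len(fragments)
            and fragments[i] == first
            and fragments[i + 1] == second
        )
        if is_pair:
            result.append(merged_label)
            i += 2  # skip both members of the merged pair
        else:
            result.append(fragments[i])
            i += 1
    return result


# Hypothetical chain of fragment labels, for illustration only.
chain = ["CH2", "CH2", "C=O", "O", "CH2"]
step1 = unify_adjacent(chain, ("CH2", "CH2"), "CH2CH2")
step2 = unify_adjacent(step1, ("C=O", "O"), "COO")
```

In this toy case the chain shrinks from five fragments to three. A shorter structure means fewer units to process per sample, which is one plausible reading of why such unifications also helped computing efficiency.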

We examined the ability of the RecNN to reproduce the trend among compounds differing by only one feature (position on the phenyl ring, chain length, heteroatom substitution). The trend is correctly reproduced in most cases, but less often for compounds that differ in whether the substituent is in the ortho, meta or para position.
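One possible way to check whether a trend is reproduced is sketched below. The criterion for "correctly reproduced" is not stated in this section, so the rule used here (every consecutive step in the series moves in the same direction for experimental and predicted values) is an assumption, and the function name is hypothetical.

```python
def same_trend(experimental, predicted):
    """Return True if, for each pair of consecutive compounds in the series,
    the predicted value moves in the same direction (up, down, or flat)
    as the experimental value."""
    if len(experimental) != len(predicted):
        raise ValueError("series must have the same length")

    def sign(x):
        return (x > 0) - (x < 0)

    exp_steps = [sign(b - a) for a, b in zip(experimental, experimental[1:])]
    pred_steps = [sign(b - a) for a, b in zip(predicted, predicted[1:])]
    return exp_steps == pred_steps


# Hypothetical series, for illustration only (not data from the thesis).
monotonic_ok = same_trend([1.0, 2.0, 3.0], [10.0, 11.0, 15.0])
turning_missed = same_trend([300.0, 280.0, 310.0], [305.0, 290.0, 285.0])
```

Under this rule the first series counts as reproduced and the second does not, because the prediction keeps decreasing where the experimental value rises again.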

The cycle cutting representation gives about the same results as the block representation, although at a slightly greater computational expense. This makes us confident enough to use this method in further research. US and InChI produce more or less the same output, and only a few samples are strongly affected by the change of standard algorithm and priority rules.

The RecNN technique, despite the successes already obtained, appears to have much room for development. It has so far proved flexible enough to adapt to many problems that found no solution with existing methods, and it seems likely that it will be able to handle other situations as well. In our study, the next step will be to expand the data set with compounds containing cycles other than benzene, for which bibliographical data has already been collected.
