
Raw Data Manipulation on the Database for Time Prediction

TEvents is the only table whose size grows continuously in the AWS database. Its data can still be managed and edited directly online through the Microsoft SQL Server database manager. With this software the database can be manipulated by creating new tables or views that take their data from TEvents and filter it appropriately. In fact, many subtables are generated from TEvents on the AWS server, all created for statistical purposes. Among these there are two that will be the starting point for the time prediction part, already developed by Bottero. The second part of the thesis will take these two tables as input; after a small pre-processing step the data will be well formatted for estimating the processing time of each step.

These two tables are:

• view_StepDetail


• TLamiWinDataToRtx548

The first one is an extension of the Step event, to which information from other events (Session, Glass, TypeOfGlass) has been associated, so that the table contains a lot of information about each step performed by the machine. With this much information per step it is possible to estimate the working time of each operation and, finally, to obtain the total working time of a plate by adding up all the steps belonging to that glass plate.
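The aggregation described above can be sketched in Python with pandas; the column names below are illustrative, not the actual Bottero schema:

```python
import pandas as pd

# Hypothetical extract of view_StepDetail; columns are invented for
# illustration and do not reflect the real table layout.
steps = pd.DataFrame({
    "GlassId":  [101, 101, 101, 102, 102],
    "StepName": ["cut", "grind", "wash", "cut", "wash"],
    "StepTime": [12.5, 30.0, 8.2, 11.1, 7.9],   # seconds per step
})

# Total working time of each plate = sum of the times of its steps.
plate_time = steps.groupby("GlassId")["StepTime"].sum()
print(plate_time)
```

The same sum could be expressed in T-SQL with `GROUP BY`; here pandas is used because the tables end up in Python anyway.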

The second table contains additional information for each step. One of its columns is a pointer to the step ID, so the two tables can be merged into a single extended table with even more information, making the time prediction more precise.
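The merge on the step ID can be sketched as follows, again with invented fragments of the two tables:

```python
import pandas as pd

# Hypothetical fragments of the two tables; column names are illustrative.
step_detail = pd.DataFrame({
    "StepId":   [1, 2, 3],
    "StepName": ["cut", "grind", "wash"],
    "GlassId":  [101, 101, 102],
})
lami_data = pd.DataFrame({
    "StepId":    [1, 2, 3],          # pointer back to the step in view_StepDetail
    "ExtraInfo": [0.4, 1.7, 0.9],
})

# Joining on the step ID yields one extended row per step.
extended = step_detail.merge(lami_data, on="StepId", how="left")
print(extended.columns.tolist())
```

A left join keeps every step even if the second table has no matching row for it.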

These two tables were created by Bottero explicitly to support the prediction of machining times. To create them Bottero worked, as anticipated, on SQL Server directly connected to the AWS database. Starting from TEvents, the desired data was extracted and associated in the two tables. Programming on SQL Server is done through the T-SQL language, which is very powerful. The two tables will then be extracted and imported into a Python development environment for faster management and for the application of machine learning algorithms. This work will be discussed in the chapter on time prediction.
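The view-then-extract workflow can be sketched with the standard sqlite3 module as a stand-in for SQL Server (the real system uses T-SQL through a SQL Server connection; the schema below is invented):

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the AWS SQL Server.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE TEvents (Id INTEGER, EventType TEXT, Value REAL)")
con.executemany("INSERT INTO TEvents VALUES (?, ?, ?)",
                [(1, "Step", 12.5), (2, "Session", 0.0), (3, "Step", 8.2)])

# A view that filters TEvents, in the spirit of view_StepDetail.
con.execute("CREATE VIEW StepsOnly AS "
            "SELECT Id, Value FROM TEvents WHERE EventType = 'Step'")

# Extraction into Python: the view is queried like any table.
rows = con.execute("SELECT Id, Value FROM StepsOnly").fetchall()
print(rows)
```

The fetched rows are plain Python tuples, ready for the pre-processing and machine learning steps that follow.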

Chapter 4

Machine Learning: Main Supervised Characteristics and Algorithms

A formal definition of what machine learning is was given by Tom M. Mitchell and is widely cited and appreciated:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

This definition makes clear what is meant by learning: a program is said to learn if its performance improves after completing a task, i.e. with experience. Based on this experience, the objective of machine learning is for the program to successfully complete new tasks that it has never faced. Thanks to the many collected (and therefore known) examples, the machine is trained to produce accurate forecasts based on criteria it detects independently during the training phase. From this we can identify the steps needed to successfully develop a classical machine learning algorithm. These phases are:

• Learning

• Test and Validation

• Prediction

To these it is necessary to add the pre-processing phase of the data, which provides the algorithm with correct, significant and well-formatted examples, so as to optimize the learning phase and obtain better predictions. The first two phases are iterative until the algorithm is validated. Iterative means that parameters, features and algorithms are often changed before one is satisfied with the result. These iterations are needed to resolve problems intrinsic to machine learning projects, related to overfitting, underfitting, bias, variance, the size and characteristics of the training and test sets, the type and complexity of the hypothesis, and feature selection. In the next paragraphs these concepts will be defined precisely, because they are an integral part of this thesis; we will discuss them again when choosing the algorithms to be adopted for this work.
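The iterative learn/validate loop can be illustrated with a toy one-parameter model, a threshold classifier tuned on invented data:

```python
# Invented data: pairs of (feature, label). A real project would use the
# step tables; this only illustrates the iteration over candidate models.
train = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
valid = [(0.3, 0), (0.7, 1)]

def accuracy(threshold, data):
    # Fraction of examples where "feature >= threshold" matches the label.
    return sum((x >= threshold) == bool(y) for x, y in data) / len(data)

best_t, best_acc = None, -1.0
for t in [0.1, 0.3, 0.5, 0.7]:           # candidate hyperparameters
    if accuracy(t, train) >= 0.75:       # "learning": keep only decent fits
        acc = accuracy(t, valid)         # validation on held-out examples
        if acc > best_acc:
            best_t, best_acc = t, acc

print(best_t, best_acc)
```

Each pass through the loop corresponds to one iteration of the learning and validation phases; the loop stops when no candidate improves the validation score.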

Now it is good to define the different types of existing machine learning, which can be divided into four groups:

• Supervised Learning

• Unsupervised Learning

• Reinforcement Learning

• Recommender Systems

The most important type for this work is the first. Supervised learning works with examples that provide an input and its corresponding output; a set of these examples is known during the learning phase. The objective is to extract a general rule that associates the inputs with the correct outputs; once this rule has been found, the algorithm can use the inputs alone, making it useful for predicting the output. The goal is to be able to trust these predictions with some confidence; to measure the degree of "confidence" of the model there is the testing and validation phase, which defines measures such as accuracy, precision, recall, F1 score, etc.
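These measures can be computed from the confusion counts of a binary classifier; the counts below are invented for illustration:

```python
# Invented confusion counts: true/false positives and negatives.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy  = (tp + tn) / (tp + fp + fn + tn)   # fraction of correct predictions
precision = tp / (tp + fp)                    # how many predicted positives are real
recall    = tp / (tp + fn)                    # how many real positives were found
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean

print(accuracy, precision, recall, f1)
```

Precision and recall pull in opposite directions, which is why the F1 score, their harmonic mean, is often reported as a single summary.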

The other types of machine learning are not important for the purpose of this work. Only unsupervised learning deserves a mention: unlike the previous type, the outputs are not provided, only the inputs. These algorithms have the task of finding correlations between the inputs provided and of grouping them, doing what in jargon is called clustering.
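Clustering can be illustrated with a minimal one-dimensional k-means sketch on invented data:

```python
# Unlabeled inputs: the algorithm must discover the two groups by itself.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centers = [points[0], points[3]]            # naive initialisation, k = 2

for _ in range(10):                         # fixed number of refinement steps
    clusters = [[], []]
    for p in points:
        # Assign each point to its nearest current center.
        i = min((abs(p - c), i) for i, c in enumerate(centers))[1]
        clusters[i].append(p)
    # Move each center to the mean of its assigned points.
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(centers))
```

After a few iterations the centers settle near the means of the two natural groups.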