Spark Streaming

Academic year: 2021


Predict the variety of iris plants in real-time

Inputs:

Training data

Static input file: training.csv

New data

A stream of records about new iris plants

Output

Predicted class label (variety) for each new iris plant, using only the columns “sepallength” and “sepalwidth”

Training data schema

sepallength: double

sepalwidth: double

petallength: double – not considered

petalwidth: double – not considered

variety: integer

1 -> Setosa category

2 -> Versicolor category

3 -> Virginica category

New streaming data schema

sepallength: double

sepalwidth: double

petallength: double

petalwidth: double

variety: it is always null for the new data
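The two schemas differ only in the last field, which is an integer code in the training data and empty in the new data. A minimal sketch of that record layout in plain Python follows; `IrisRecord` and `parse_record` are hypothetical names introduced here for illustration, not part of the exercise.

```python
from dataclasses import dataclass
from typing import Optional

# Integer codes of the class label, as listed in the training data schema.
VARIETY_NAMES = {1: "Setosa", 2: "Versicolor", 3: "Virginica"}


@dataclass
class IrisRecord:
    sepallength: float
    sepalwidth: float
    petallength: float
    petalwidth: float
    variety: Optional[int]  # None for new (unlabeled) streaming records


def parse_record(line: str) -> IrisRecord:
    """Parse one comma-separated record (hypothetical helper)."""
    fields = line.strip().split(",")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    sl, sw, pl, pw = (float(value) for value in fields[:4])
    # An empty last field means the label is missing (new data).
    variety = int(fields[4]) if fields[4] else None
    return IrisRecord(sl, sw, pl, pw, variety)


labeled = parse_record("5.1,3.5,1.4,0.2,1")   # training-style record
unlabeled = parse_record("5.2,3.2,1.4,0.2,")  # streaming-style record
print(VARIETY_NAMES[labeled.variety])
print(unlabeled.variety)
```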

Example training data

sepallength,sepalwidth,petallength,petalwidth,variety
5.1,3.5,1.4,0.2,1

Example new streaming data

5.2,3.2,1.4,0.2,
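One possible way to approach the task is sketched below with PySpark: train a classifier once on the static file, then apply the fitted model to the stream. Several points are assumptions, since the slides do not specify them: the use of Structured Streaming with a file-based source, a directory of header-less CSV files as the stream, logistic regression as the algorithm, and the console as output. The function name `run_iris_streaming` and its parameters are hypothetical.

```python
# Only these two columns are used as features, as required by the exercise.
FEATURE_COLUMNS = ["sepallength", "sepalwidth"]


def run_iris_streaming(training_path, stream_dir):
    """Train on the static file, then predict the variety of streamed records.

    Assumptions (not stated in the slides): new records arrive as header-less
    CSV files dropped into `stream_dir`, and logistic regression is the model.
    """
    # Spark imports are local so that this module can be loaded
    # on a machine without a Spark installation.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler
    from pyspark.sql import SparkSession
    from pyspark.sql.types import (DoubleType, IntegerType, StructField,
                                   StructType)

    spark = SparkSession.builder.appName("IrisStreaming").getOrCreate()

    # Same schema for both inputs; variety is null in the new data.
    schema = StructType([
        StructField("sepallength", DoubleType()),
        StructField("sepalwidth", DoubleType()),
        StructField("petallength", DoubleType()),
        StructField("petalwidth", DoubleType()),
        StructField("variety", IntegerType()),
    ])

    # Static training data (the file has a header line).
    training = spark.read.csv(training_path, header=True, schema=schema)

    # Build the feature vector from the two sepal columns and fit the model.
    assembler = VectorAssembler(inputCols=FEATURE_COLUMNS, outputCol="features")
    classifier = LogisticRegression(featuresCol="features", labelCol="variety")
    model = Pipeline(stages=[assembler, classifier]).fit(training)

    # Streaming input: each new file in stream_dir is read as new records.
    new_plants = spark.readStream.csv(stream_dir, schema=schema, header=False)

    # Apply the fitted model to the stream and keep the predicted label.
    predictions = model.transform(new_plants).select(
        "sepallength", "sepalwidth", "prediction")

    query = (predictions.writeStream
             .outputMode("append")
             .format("console")
             .start())
    query.awaitTermination()
```

With Spark available, the sketch could be started as `run_iris_streaming("training.csv", "new_data/")`, where `new_data/` is a hypothetical directory that receives the new records. The model is fitted once before the stream starts, so the predictions for every batch rely on the same training data.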
