
(1)

Deep Learning Libraries

Giuseppe Attardi

(2)

General DL Libraries

Theano

TensorFlow

PyTorch

Caffe

Keras

(3)

Caffe

Deep Learning framework from Berkeley

focus on vision

coding in C++ or Python

A network consists of layers:

layer = { name = "data", ... }

layer = { name = "conv", ... }

data and derivatives flow through the net as blobs
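To illustrate the blob flow described above, a minimal pycaffe sketch (hypothetical: the model file names are placeholders, and the prototxt may need force_backward set for the backward pass to fill diffs):

import numpy as np
import caffe

# load a trained network in test mode (file names are placeholders)
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

# data flows forward through the net as blobs
net.blobs['data'].data[...] = np.random.randn(*net.blobs['data'].data.shape)
out = net.forward()

# derivatives flow backward through the same blobs as .diff arrays
net.backward()
grad = net.blobs['data'].diff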

(4)

Theano

Python linear algebra compiler that optimizes symbolic mathematical computations, generating C code

integration with NumPy

transparent use of a GPU

efficient symbolic differentiation

(5)

Example

>>> import theano
>>> import theano.tensor as T
>>> x = T.dscalar('x')            # creates a symbolic variable
>>> y = x ** 2                    # looks like a normal expression
>>> gy = T.grad(y, x)             # symbolic differentiation
>>> f = theano.function([x], gy)  # compiles a callable function
>>> f(4)
array(8.0)

(6)

2-layer Neural Network

z1 = X.dot(W1) + b1         # first layer
a1 = T.tanh(z1)             # activation
z2 = a1.dot(W2) + b2        # second layer
y_hat = T.nnet.softmax(z2)  # output probabilities

# class prediction
prediction = T.argmax(y_hat, axis=1)

# the loss function to optimize
loss = T.nnet.categorical_crossentropy(y_hat, y).mean()
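The slide reuses symbols defined earlier in the tutorial it follows; a minimal sketch of those declarations (layer sizes and the learning rate are illustrative, not from the slides):

import numpy as np
import theano
import theano.tensor as T

# symbolic inputs: a batch of examples and their integer class labels
X = T.dmatrix('X')
y = T.lvector('y')

# parameters live in shared variables so updates can modify them in place
n_in, n_hidden, n_out = 2, 100, 2
W1 = theano.shared(np.random.randn(n_in, n_hidden), name='W1')
b1 = theano.shared(np.zeros(n_hidden), name='b1')
W2 = theano.shared(np.random.randn(n_hidden, n_out), name='W2')
b2 = theano.shared(np.zeros(n_out), name='b2')
learn_rate = 0.01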

(7)

Gradient Descent

dW2 = T.grad(loss, W2)
db2 = T.grad(loss, b2)
dW1 = T.grad(loss, W1)
db1 = T.grad(loss, b1)

gradient_step = theano.function(
    [X, y],
    updates=((W2, W2 - learn_rate * dW2),
             (W1, W1 - learn_rate * dW1),
             (b2, b2 - learn_rate * db2),
             (b1, b1 - learn_rate * db1)))

(8)

Training loop

# Gradient descent. For each batch...

for i in range(epochs):

# updates parameters W2, b2, W1 and b1!

gradient_step(train_X, train_y)
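Once trained, the prediction node from slide 6 can be compiled into its own function; a minimal sketch reusing the X and prediction symbols above (test_X is a placeholder name):

# compile the class-prediction node into a standalone function
predict = theano.function([X], prediction)
test_labels = predict(test_X)   # array of predicted class indices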

(9)

TensorFlow

C++ core code using Eigen

Eigen's template-metaprogramming linear algebra generates code for either CPU or GPU

Build architecture as dataflow graphs:

- Nodes implement mathematical operations, or serve as data feeds or variables

- Nodes are assigned to computational devices and execute asynchronously and in parallel

- Edges carry tensor data between nodes
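A minimal sketch of such a dataflow graph, using the TensorFlow 1.x graph API the slides describe (names and values are illustrative):

import tensorflow as tf

a = tf.placeholder(tf.float32, name='a')   # data-feed node
W = tf.Variable(2.0, name='W')             # variable node
y = tf.multiply(W, a, name='y')            # operation node; edges carry tensors

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={a: 3.0}))  # 6.0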

(10)

Training

Gradient based machine learning algorithms

Automatic differentiation

Define the computational architecture, combine it with your objective function, and provide data; TensorFlow handles computing the derivatives.
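A sketch of this workflow under the TF 1.x API (the model and values are illustrative): define the graph and objective, feed data, and let TensorFlow derive the update step.

import tensorflow as tf

x = tf.placeholder(tf.float32)
target = tf.placeholder(tf.float32)
W = tf.Variable(0.5)
loss = tf.square(W * x - target)     # objective function

# TensorFlow differentiates the objective and builds the gradient step
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_step, feed_dict={x: 2.0, target: 4.0})
    print(sess.run(W))   # converges towards 2.0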

(11)

Performance

Support for threads, queues, and asynchronous computation

TensorFlow allows assigning parts of the graph to different devices (CPU or GPU)
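A minimal sketch of device assignment with the TF 1.x device strings (it assumes a GPU is available; soft placement falls back to CPU otherwise):

import tensorflow as tf

with tf.device('/cpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
with tf.device('/gpu:0'):   # assumption: a GPU is present
    b = tf.matmul(a, a)

# allow fallback to CPU if a GPU kernel is unavailable
config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
    print(sess.run(b))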

(12)

Keras

(13)

Keras

Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.

Guiding principles

- User friendliness, modularity, minimalism, extensibility, and Python-nativeness.

Keras has out-of-the-box implementations of common network structures (a sketch follows below):

- convolutional neural networks

- recurrent neural networks

Runs seamlessly on CPU and GPU.
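As a sketch of those out-of-the-box structures, two minimal models (layer sizes and input shapes are illustrative, not from the slides):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, LSTM

# convolutional network over 28x28 grayscale images
cnn = Sequential()
cnn.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
cnn.add(MaxPooling2D(pool_size=(2, 2)))
cnn.add(Flatten())
cnn.add(Dense(10, activation='softmax'))

# recurrent network over variable-length sequences of 100-dim vectors
rnn = Sequential()
rnn.add(LSTM(64, input_shape=(None, 100)))
rnn.add(Dense(10, activation='softmax'))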

(14)

Outline

Install Keras

Import libraries

Load image data

Preprocess input

Define model

Compile model

Fit model on training data

Evaluate model
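A minimal sketch of these steps on MNIST, the dataset used in the tutorial later; the layer sizes and hyperparameters are illustrative:

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

# load image data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# preprocess input: flatten, scale to [0, 1], one-hot encode labels
X_train = X_train.reshape(60000, 784).astype('float32') / 255
X_test = X_test.reshape(10000, 784).astype('float32') / 255
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)

# define model
model = Sequential()
model.add(Dense(128, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

# compile model
model.compile(loss='categorical_crossentropy', optimizer='sgd',
              metrics=['accuracy'])

# fit model on training data
model.fit(X_train, y_train, epochs=5, batch_size=32)

# evaluate model
loss, accuracy = model.evaluate(X_test, y_test)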

(15)

Install Keras

Full instructions

Basic instructions:

- Install a backend: TensorFlow or Theano

> pip install keras

(16)

30 sec. example

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='sgd', metrics=['accuracy'])

# or, to configure the optimizer in more detail:
import keras
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.9,
                                             nesterov=True))

model.fit(x_train, y_train, epochs=5, batch_size=32)
classes = model.predict(x_test, batch_size=128)

(17)

Keras Tutorial

Run tutorial at:

http://attardi-4.di.unipi.it:8000/user/attardi/notebooks/MNIST/MNIST%20in%20Keras.ipynb

(18)

DeepNL

Deep Learning library for NLP

C++ with Eigen

Wrappers for Python, Java, and PHP automatically generated with SWIG

Git repository
