Deep Learning Libraries


Giuseppe Attardi


General DL Libraries

• Theano

• TensorFlow

• PyTorch

• Caffe

• Keras


Caffe

• Deep Learning framework from Berkeley

• focus on vision

• coding in C++ or Python

• A network consists of layers:

layer { name: "data" … }

layer { name: "conv" ... }

• data and derivatives flow through the net as blobs
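The idea of blobs carrying both data and derivatives can be sketched in plain Python. This is only an illustration, not the Caffe API: the classes Blob, ScaleLayer and ReLULayer are hypothetical names, and only the split of a blob into values (data) and gradients (diff) mirrors Caffe.

```python
# Illustrative sketch only: plain Python, not the Caffe API.

class Blob:
    """Holds values (data) and their gradients (diff)."""
    def __init__(self, data):
        self.data = list(data)
        self.diff = [0.0] * len(self.data)


class ScaleLayer:
    """Hypothetical layer computing top = factor * bottom."""
    def __init__(self, name, factor):
        self.name = name
        self.factor = factor

    def forward(self, bottom):
        return Blob(self.factor * v for v in bottom.data)

    def backward(self, top, bottom):
        # Chain rule: d(loss)/d(bottom) = factor * d(loss)/d(top)
        bottom.diff = [self.factor * g for g in top.diff]


class ReLULayer:
    """Hypothetical layer computing top = max(bottom, 0)."""
    def __init__(self, name):
        self.name = name

    def forward(self, bottom):
        return Blob(max(v, 0.0) for v in bottom.data)

    def backward(self, top, bottom):
        # The gradient passes only where the input was positive.
        bottom.diff = [g if v > 0 else 0.0
                       for g, v in zip(top.diff, bottom.data)]


layers = [ScaleLayer("scale", 2.0), ReLULayer("relu")]

# Forward pass: data flows bottom-up through the layers.
blobs = [Blob([1.0, -3.0, 0.5])]
for layer in layers:
    blobs.append(layer.forward(blobs[-1]))

# Backward pass: derivatives flow top-down through the same blobs.
blobs[-1].diff = [1.0, 1.0, 1.0]  # assume d(loss)/d(output) is all ones
for i in reversed(range(len(layers))):
    layers[i].backward(blobs[i + 1], blobs[i])

print(blobs[-1].data)
print(blobs[0].diff)
```

Each layer only sees the blob below it and the blob above it, which is what lets a network be assembled from a list of independent layer definitions.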


Theano

• Python linear algebra compiler that optimizes symbolic mathematical computations and generates C code

• integration with NumPy

• transparent use of a GPU

• efficient symbolic differentiation


Example

>>> import theano
>>> import theano.tensor as T
>>> x = T.dscalar('x')            # creates a symbolic variable
>>> y = x ** 2                    # looks like a normal expression
>>> gy = theano.grad(y, x)        # symbolic derivative dy/dx = 2 * x
>>> f = theano.function([x], gy)
>>> f(4)
array(8.0)
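The value of the derivative at x = 4 can be sanity-checked without Theano. The sketch below uses a central finite difference, which is a numerical approximation and not what Theano does internally (Theano differentiates the expression symbolically). The helper name numerical_grad is hypothetical.

```python
def numerical_grad(f, x, eps=1e-6):
    """Central-difference approximation of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def y(x):
    # Same expression as in the symbolic example: y = x ** 2
    return x ** 2


approx = numerical_grad(y, 4.0)
print(round(approx, 4))
```

The result agrees with the symbolic derivative 2 * x evaluated at 4, up to floating-point rounding.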


2-layer Neural Network

# assumes: import theano.tensor as T; X, y are symbolic inputs;
# W1, b1, W2, b2 are theano.shared variables
z1 = X.dot(W1) + b1             # first layer
a1 = T.tanh(z1)                 # activation
z2 = a1.dot(W2) + b2            # second layer
y_hat = T.nnet.softmax(z2)      # output probabilities

# class prediction
prediction = T.argmax(y_hat, axis=1)

# the loss function to optimize
loss = T.nnet.categorical_crossentropy(y_hat, y).mean()
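To make the last two steps concrete, here is a minimal sketch of softmax and categorical cross-entropy for a single example, written in plain Python rather than Theano. The scores in z2 and the one-hot target are made-up illustration values.

```python
import math


def softmax(z):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(z)                              # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_hat, y):
    """y is a one-hot list; returns -sum(y * log(y_hat))."""
    return -sum(t * math.log(p) for t, p in zip(y, y_hat) if t > 0)


z2 = [2.0, 1.0, 0.1]                        # hypothetical second-layer scores
y_hat = softmax(z2)                         # output probabilities
prediction = max(range(len(y_hat)), key=lambda k: y_hat[k])   # argmax
loss = categorical_crossentropy(y_hat, [1, 0, 0])             # true class is 0

print(prediction, round(loss, 3))
```

The loss is small when the probability assigned to the true class is close to 1 and grows as that probability shrinks.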


Gradient Descent

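# symbolic gradients of the loss with respect to each parameter
dW2 = T.grad(loss, W2)
db2 = T.grad(loss, b2)
dW1 = T.grad(loss, W1)
db1 = T.grad(loss, b1)

# one gradient descent step; W1, b1, W2, b2 must be theano.shared variables
gradient_step = theano.function(
    [X, y],
    updates=((W2, W2 - learn_rate * dW2),
             (W1, W1 - learn_rate * dW1),
             (b2, b2 - learn_rate * db2),
             (b1, b1 - learn_rate * db1)))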


Training loop

# Gradient descent: one update on the full training set per epoch
for i in range(epochs):
    # updates parameters W2, b2, W1 and b1!
    gradient_step(train_X, train_y)
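The same update rule, parameter minus learning rate times gradient, can be seen in isolation on a one-parameter problem. This sketch minimizes the made-up loss (w - 3)^2 in plain Python; the gradient is written by hand here, whereas Theano derives it from the loss expression.

```python
# Minimize loss(w) = (w - 3) ** 2 with plain gradient descent.
learn_rate = 0.1
epochs = 100

w = 0.0                          # initial parameter value
for i in range(epochs):
    dw = 2 * (w - 3.0)           # d(loss)/dw, derived by hand
    w = w - learn_rate * dw      # same form as W - learn_rate * dW

print(round(w, 6))
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 at every step, so w ends up very close to 3.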


TensorFlow

• C++ core code using Eigen

• Eigen, a template-metaprogramming linear algebra library, generates code for CPU or GPU

• Build architecture as dataflow graphs:

  ◦ Nodes implement mathematical operations, data feeds, or variables

  ◦ Nodes are assigned to computational devices and execute asynchronously and in parallel

  ◦ Edges carry tensor data between nodes
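A dataflow graph of this kind can be sketched in a few lines of plain Python. This is not the TensorFlow API: Node and run are hypothetical names, values are plain floats instead of tensors, and evaluation is sequential rather than asynchronous.

```python
class Node:
    """A graph node: an operation plus the nodes feeding into it."""
    def __init__(self, name, op, inputs=()):
        self.name = name
        self.op = op                  # None marks a data-feed node
        self.inputs = list(inputs)    # incoming edges


def run(node, feed, cache=None):
    """Evaluate a node, computing each upstream node only once."""
    if cache is None:
        cache = {}
    if node.name in cache:
        return cache[node.name]
    if node.op is None:
        value = feed[node.name]       # value supplied from outside
    else:
        args = [run(n, feed, cache) for n in node.inputs]
        value = node.op(*args)
    cache[node.name] = value
    return value


x = Node("x", None)                                  # data feed
w = Node("w", lambda: 3.0)                           # variable-like node
b = Node("b", lambda: 1.0)                           # variable-like node
mul = Node("mul", lambda a, c: a * c, [x, w])        # x * w
out = Node("add", lambda a, c: a + c, [mul, b])      # x * w + b

result = run(out, {"x": 2.0})
print(result)
```

Building the graph and running it are separate steps: the graph is described once and can then be evaluated with different fed values.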


Training

• Gradient based machine learning algorithms

• Automatic differentiation

• Define the computational architecture, combine that with your objective function, provide data -- TensorFlow handles computing the derivatives
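How derivatives can be computed automatically from the definition of a computation is illustrated by the following minimal reverse-mode sketch. It is a toy in plain Python supporting only + and *, not TensorFlow's implementation; Var and backward are hypothetical names.

```python
class Var:
    """A value that remembers how it was computed."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents    # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value,
                   ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Propagate d(output)/d(node) to every node via the chain rule."""
    order, seen = [], set()

    def visit(v):
        # Depth-first traversal producing a topological order.
        if id(v) not in seen:
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)

    visit(output)
    output.grad = 1.0
    # Process each node after all nodes that consume it.
    for v in reversed(order):
        for parent, local in v.parents:
            parent.grad += local * v.grad


x = Var(4.0)
w = Var(3.0)
y = x * x + w * x        # dy/dx = 2 * x + w, dy/dw = x
backward(y)

print(y.value, x.grad, w.grad)
```

Only the forward computation is written by the user; the derivatives follow from the local derivative recorded by each operation.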


Performance

• Support for threads, queues, and asynchronous computation

• TensorFlow allows assigning parts of the graph to different devices (CPU or GPU)
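The role of threads and queues can be illustrated with the Python standard library: one thread prepares batches while another consumes them. This is a generic producer/consumer sketch, not TensorFlow's queue API; the batch contents and the sum standing in for a training step are made up.

```python
import queue
import threading

batches = queue.Queue(maxsize=4)   # bounded queue between the two threads
results = []


def producer(n):
    """Prepare n hypothetical batches and enqueue them."""
    for i in range(n):
        batches.put([i, i + 1])
    batches.put(None)              # sentinel: no more data


def consumer():
    """Dequeue batches and process them until the sentinel arrives."""
    while True:
        batch = batches.get()
        if batch is None:
            break
        results.append(sum(batch))  # stand-in for a training step


t1 = threading.Thread(target=producer, args=(5,))
t2 = threading.Thread(target=consumer)
t1.start()
t2.start()
t1.join()
t2.join()

print(results)
```

The bounded queue lets data preparation run ahead of the computation without growing memory use unboundedly.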


Keras


• Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.

• Guiding principles

• User-friendliness, modularity, minimalism, extensibility, and Python-nativeness.

• Keras has out-of-the-box implementations of common network structures:

• Convolutional neural networks

• Recurrent neural networks

• Runs seamlessly on CPU and GPU.


Outline

• Install Keras

• Import libraries

• Load image data

• Preprocess input

• Define model

• Compile model

• Fit model on training data

• Evaluate model


Install Keras

• Full instructions

• Basic instructions:

• Install backend: TensorFlow or Theano

> pip install keras


30 sec. example

import keras
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

# alternatively, configure the optimizer explicitly
# (recent Keras releases name the argument learning_rate instead of lr)
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.9, nesterov=True))

# x_train, y_train, x_test: NumPy arrays prepared beforehand
model.fit(x_train, y_train, epochs=5, batch_size=32)
classes = model.predict(x_test, batch_size=128)
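With the categorical_crossentropy loss, the targets passed to fit are expected as one-hot vectors, one position per class. The sketch below shows that encoding in plain Python for the 10 output units of the model above; the helper name one_hot and the label values are made up for illustration, and in practice a library utility would normally be used instead.

```python
def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0           # mark the position of the true class
        rows.append(row)
    return rows


# Three hypothetical integer labels for a 10-class problem.
y_encoded = one_hot([3, 0, 9], num_classes=10)

for row in y_encoded:
    print(row)
```

Each row has exactly one entry set to 1, matching the 10 softmax outputs of the last Dense layer.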


Keras Tutorial

• Run tutorial at:

• http://attardi-4.di.unipi.it:8000/user/attardi/notebooks/MNIST/MNIST%20in%20Keras.ipynb


DeepNL

• Deep Learning library for NLP

• C++ with Eigen

• Wrappers for Python, Java, and PHP automatically generated with SWIG

• Git repository