# Overview of Some Deep Learning Libraries

Machine learning is a broad topic. Deep learning, in particular, is a way of using neural networks for machine learning. Neural networks are arguably a concept older than machine learning itself, dating back to the 1950s. Unsurprisingly, many libraries have been created for them.

In the following, we will give an overview of some well-known libraries for neural networks and deep learning.

After finishing this tutorial, you will learn:

- Some of the available deep learning and neural network libraries
- The functional differences between two common libraries, PyTorch and TensorFlow

Let’s get started.

## Overview

This tutorial is in three parts; they are:

- The C++ Libraries
- Python Libraries
- PyTorch and TensorFlow

## The C++ Libraries

Deep learning gained attention in the last decade. Before that, we were not confident about how to train a neural network with many layers. However, the understanding of how to build a multilayer perceptron had been around for many years.

Before deep learning, probably the most famous neural network library was libann. It is a C++ library whose functionality is limited due to its age, and it is no longer under development. A newer library for C++ is OpenNN, which allows modern C++ syntax.

But that’s pretty much all for C++. The rigid syntax of C++ may be the reason there are not many deep learning libraries for it. The training phase of a deep learning project is about experimentation, so we would like tools that allow us to iterate faster. A dynamic programming language can therefore be a better fit, which is why Python comes on the scene.

## Python Libraries

One of the earliest libraries for deep learning is Caffe. It was developed at U.C. Berkeley, specifically for computer vision problems. While it is developed in C++, it serves as a library with a Python interface. Hence we can build a project in Python, with the network defined in a JSON-like syntax.
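For illustration, a layer definition in Caffe's text-based configuration format looks roughly like the following (a hypothetical fragment in the prototxt style; the layer names here are made up, not taken from a real project):

```
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 128
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
```

Each `layer` block names its inputs (`bottom`) and outputs (`top`), so the whole network can be described declaratively without writing any C++ code.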

Chainer is another library in Python. It is an influential one because its syntax makes a lot of sense. While it is less common nowadays, the APIs in Keras and PyTorch bear a resemblance to Chainer’s. The following is an example from Chainer’s documentation, which you might mistake for Keras or PyTorch:

```python
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import iterators, optimizers, training, Chain
from chainer.datasets import mnist

train, test = mnist.get_mnist()
batchsize = 128
max_epoch = 10
train_iter = iterators.SerialIterator(train, batchsize)

class MLP(Chain):
    def __init__(self, n_mid_units=100, n_out=10):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_mid_units)
            self.l2 = L.Linear(None, n_mid_units)
            self.l3 = L.Linear(None, n_out)

    def forward(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)

# create model
model = MLP()
model = L.Classifier(model)  # using softmax cross entropy

# set up optimizer
optimizer = optimizers.MomentumSGD()
optimizer.setup(model)

# connect train iterator and optimizer to an updater
updater = training.updaters.StandardUpdater(train_iter, optimizer)

# set up trainer and run
trainer = training.Trainer(updater, (max_epoch, 'epoch'), out='mnist_result')
trainer.run()
```

Another obsolete library is Theano. It has ceased development, but it was once a major library for deep learning. In fact, earlier versions of the Keras library allowed you to choose between a Theano or TensorFlow backend. Indeed, neither Theano nor TensorFlow is precisely a deep learning library. Rather, they are tensor libraries that make matrix operations and differentiation handy, upon which deep learning operations can be built. Hence these two were considered interchangeable from Keras’ perspective.
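To see why a tensor library with automatic differentiation is so handy, consider a minimal sketch of reverse-mode differentiation in plain Python. This is a toy scalar version of what Theano and TensorFlow provide; the `Value` class here is hypothetical, not part of any library:

```python
class Value:
    """A scalar that records operations so gradients can be derived automatically."""
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents    # the Values this one was computed from
        self.grad_fns = grad_fns  # local derivative w.r.t. each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a+b)/da = 1 and d(a+b)/db = 1, so the gradient passes through unchanged
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a*b)/da = b and d(a*b)/db = a
        return Value(self.data * other.data, (self, other),
                     (lambda g, o=other: g * o.data, lambda g, s=self: g * s.data))

    def backward(self):
        # propagate d(self)/d(node) to every node this Value depends on
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            for parent, fn in zip(node.parents, node.grad_fns):
                parent.grad += fn(node.grad)
                stack.append(parent)

# y = w*x + b, so dy/dw = x and dy/dx = w
x, w, b = Value(2.0), Value(3.0), Value(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, x.grad)  # 7.0 2.0 3.0
```

Real tensor libraries do the same bookkeeping over whole matrices and with far more operations, which is exactly what makes gradient-based training of deep networks practical.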

CNTK from Microsoft and Apache MXNet are two other libraries worth mentioning. They are large, with interfaces for multiple languages; Python, of course, is one of them. CNTK has C# and C++ interfaces, while MXNet provides interfaces for Java, Scala, R, Julia, C++, Clojure, and Perl. Recently, however, Microsoft decided to stop developing CNTK. MXNet does have some momentum, and it is probably the most popular library after TensorFlow and PyTorch.

Below is an example of using MXNet via its R interface. Conceptually, the syntax is similar to the Keras functional API:

```r
require(mxnet)

train <- read.csv('data/train.csv', header=TRUE)
train <- data.matrix(train)
train.x <- train[,-1]
train.y <- train[,1]
train.x <- t(train.x/255)

data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=128)
act1 <- mx.symbol.Activation(fc1, name="relu1", act_type="relu")
fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=64)
act2 <- mx.symbol.Activation(fc2, name="relu2", act_type="relu")
fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=10)
softmax <- mx.symbol.SoftmaxOutput(fc3, name="sm")

devices <- mx.cpu()
mx.set.seed(0)
model <- mx.model.FeedForward.create(softmax, X=train.x, y=train.y,
                                     ctx=devices, num.round=10, array.batch.size=100,
                                     learning.rate=0.07, momentum=0.9,
                                     eval.metric=mx.metric.accuracy,
                                     initializer=mx.init.uniform(0.07),
                                     epoch.end.callback=mx.callback.log.train.metric(100))
```

## PyTorch and TensorFlow

PyTorch and TensorFlow are the two major libraries nowadays. In the past, when TensorFlow was in version 1.x, they were vastly different. But since TensorFlow absorbed Keras as part of its library, these two libraries work similarly most of the time.

PyTorch is backed by Facebook, and its syntax has been stable over the years. There are also a lot of existing models that we can borrow. The common way of defining a deep learning model in PyTorch is to create a class:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=(5,5), stride=1, padding=2)
        self.pool1 = nn.AvgPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0)
        self.pool2 = nn.AvgPool2d(kernel_size=2, stride=2)
        self.conv3 = nn.Conv2d(16, 120, kernel_size=5, stride=1, padding=0)
        self.flatten = nn.Flatten()
        self.linear4 = nn.Linear(120, 84)
        self.linear5 = nn.Linear(84, 10)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x):
        x = F.tanh(self.conv1(x))
        x = self.pool1(x)
        x = F.tanh(self.conv2(x))
        x = self.pool2(x)
        x = F.tanh(self.conv3(x))
        x = self.flatten(x)
        x = F.tanh(self.linear4(x))
        x = self.linear5(x)
        return self.softmax(x)

model = Model()
```

But there is also a sequential syntax that makes the code more concise:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    # assume input 1x28x28
    nn.Conv2d(1, 6, kernel_size=(5,5), stride=1, padding=2),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(16, 120, kernel_size=5, stride=1, padding=0),
    nn.Tanh(),
    nn.Flatten(),
    nn.Linear(120, 84),
    nn.Tanh(),
    nn.Linear(84, 10),
    nn.LogSoftmax(dim=1)
)
```

TensorFlow in version 2.x adopted Keras as part of its libraries. In the past, the two were separate projects. In TensorFlow 1.x, we needed to build a computation graph, set up a session, and derive gradients from the session for the deep learning model, which made it a bit too verbose. Keras was designed as a library to hide all these low-level details.

The same network as above can be produced with TensorFlow’s Keras syntax as follows:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, AveragePooling2D, Flatten

model = Sequential([
    Conv2D(6, (5,5), input_shape=(28,28,1), padding="same", activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(16, (5,5), activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(120, (5,5), activation="tanh"),
    Flatten(),
    Dense(84, activation="tanh"),
    Dense(10, activation="softmax")
])
```

One major difference between PyTorch and Keras syntax is the training loop. In Keras, we just need to assign the loss function, the optimization algorithm, the dataset, and some other parameters to the model. Then the fit() function does all the training work, as follows:

```python
...
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=100, batch_size=32)
```

But in PyTorch, we need to write our own training loop code:

```python
import datetime

import numpy as np
import torch.nn as nn
import torch.optim as optim

# self-defined training loop function
def training_loop(model, optimizer, loss_fn, train_loader, val_loader=None, n_epochs=100):
    best_loss, best_epoch = np.inf, -1
    best_state = model.state_dict()

    for epoch in range(n_epochs):
        # Training
        model.train()
        train_loss = 0
        for data, target in train_loader:
            output = model(data)
            loss = loss_fn(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            train_loss += loss.item()
        # Validation
        model.eval()
        status = (f"{str(datetime.datetime.now())} End of epoch {epoch}, "
                  f"training loss={train_loss/len(train_loader)}")
        if val_loader:
            val_loss = 0
            for data, target in val_loader:
                output = model(data)
                loss = loss_fn(output, target)
                val_loss += loss.item()
            status += f", validation loss={val_loss/len(val_loader)}"
        print(status)

optimizer = optim.Adam(model.parameters())
criterion = nn.NLLLoss()
training_loop(model, optimizer, criterion, train_loader, test_loader, n_epochs=100)
```

This may not be an issue if you’re experimenting with a new network design and want more control over how the loss is calculated and how the optimizer updates the model weights. But otherwise, you will appreciate the simpler syntax of Keras.

Note that both PyTorch and TensorFlow are libraries with a Python interface, but it is possible to have interfaces for other languages too. For example, there are Torch for R and TensorFlow for R.

Also note that the libraries we mentioned above are full-featured libraries that include both training and prediction. If we consider a production environment where we only make use of a trained model, there is a wider choice. TensorFlow has a “TensorFlow Lite” counterpart that allows a trained model to run on mobile devices or on the web. Intel also has an OpenVINO library that aims at optimizing prediction performance.

## Further Reading

Below are the links to the libraries we mentioned above:

- libann
- OpenNN
- Chainer
- Caffe
- CNTK
- MXNet
- Theano
- PyTorch
- TensorFlow
- Torch for R
- TensorFlow for R

## Summary

In this post, you discovered various deep learning libraries and some of their characteristics. Specifically, you learned:

- What libraries are available for C++ and Python
- How the Chainer library influenced the syntax used for building deep learning models nowadays
- The relationship between Keras and TensorFlow 2.x
- The differences between PyTorch and TensorFlow

The post Overview of Some Deep Learning Libraries appeared first on Machine Learning Mastery.