November 29, 2016

Importing a Keras Model into Cortex

  1. Saving a Keras model
  2. Generating Layer Outputs
  3. Conversion to Cortex
  4. Using the Cortex model for Inference
  5. Wrap Up

In this post I’ll demonstrate importing a model from Keras into Cortex, ThinkTopic's Clojure library for deep learning. I expect this to be a common use of Cortex in the Clojure world going forward. There are a lot of deep learning frameworks now, and many provide training infrastructure and community models. Model portability and transfer learning have become increasingly important in deep learning, and Cortex supports importing models from Caffe for both inference and fine-tuning, and has recently added Keras support.

I personally prefer Keras with TensorFlow as its backend for fast prototyping. Simple models are straightforward to reason about using the sequential API, and complex or non-trivially connected models become easier to reason about using the functional API. I think Cortex shows a lot of promise going forward, and I’m proud of the work we’ve put into it at ThinkTopic, but at present it is less mature than other frameworks. Turnkey training and visualization for debugging, as with quiver, are planned but not yet implemented.

That said, I prefer Clojure to Python, especially when building microservices, writing applications that rely heavily on concurrency, or interacting with databases like Datomic. While I don’t feel a lot of immediate pain using modern deep learning frameworks in Python, I’d also prefer not to default to Python development everywhere just because Python has a good machine learning ecosystem.

One path that Cortex enables is to take trained models from other frameworks, like Keras or Caffe, and load those models in Clojure for inference. In Cortex, a trained neural network is just data, and it can be loaded for inference using the CPU or GPU as a backend. This gives us a lot of portability — a JVM is sufficient to run a Cortex backend, but you can also leverage a GPU with the GPU compute backend, or native libraries via the core.matrix backends. And most importantly, you can run these models from Clojure.

Since I have more familiarity with Keras I’ll demonstrate this process using the Keras import tools we provide with Cortex. This process boils down to:

  • save a model in Keras

  • use the Python script provided in Cortex to generate layer outputs

  • load the model using the Cortex Keras importer

  • validate each layer’s output

  • save/serialize the model with nippy

  • load/deserialize the model from nippy

  • instantiate the neural net model on a Cortex backend

  • call the model for inference

The accompanying code and sample models can be found here.

Saving a Keras model

For simplicity, I’ve provided a basic convolutional neural net run on the CIFAR-10 public dataset.

The meat of the model is:

# Keras 1.x imports for the layers used below; input_shape and
# nb_classes come from the CIFAR-10 data (10 classes).
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D

model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))

model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))

model.add(Convolution2D(96, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

This is a pretty basic architecture: three convolution and max-pooling blocks, followed by a single fully connected layer and a softmax (class probability) output. The network uses dropout during training for regularization.

In addition, the training data runs through Keras's ImageDataGenerator augmentation pipeline, with a few simple transforms applied randomly to each batch of model input:

from keras.preprocessing.image import ImageDataGenerator

traingen = ImageDataGenerator(
    rotation_range=15, # degrees
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)
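For completeness, here's a minimal sketch of how the generator and model might be wired together for training. The optimizer, batch size, and epoch count here are my own assumptions rather than the settings from the original training script:

from keras.datasets import cifar10
from keras.utils import np_utils

# Load CIFAR-10, scale pixels to [0, 1], and one-hot encode the labels.
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
nb_classes = 10
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Stream randomly augmented batches from the generator during training.
model.fit_generator(traingen.flow(X_train, Y_train, batch_size=32),
                    samples_per_epoch=X_train.shape[0],
                    nb_epoch=50,
                    validation_data=(X_test, Y_test))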

The code for saving the model is fairly simple: the architecture is converted to JSON and written as text, and the weights are saved in an hdf5 file (this requires the h5py dependency).

with open("simple.json", "w") as f:
    f.write(model.to_json())
model.save_weights("simple.h5")
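Before conversion, it's worth a quick sanity check that the saved pair round-trips in Keras itself; this is a small sketch using the standard model_from_json API:

from keras.models import model_from_json

# Rebuild the architecture from JSON, then attach the trained weights.
with open("simple.json") as f:
    loaded = model_from_json(f.read())
loaded.load_weights("simple.h5")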

Generating Layer Outputs

Given that Cortex is in early development and Keras support is in an experimental phase, importing a Keras model currently requires going through a validation step. The get_keras_output.py script in the Cortex repo is used to generate an hdf5 file containing the output at each layer for a sample image (for now, the script just generates dummy data).

To generate outputs, you can invoke as follows:

> python3 get_keras_output.py --json simple.json simple.h5 simple_output.h5

The --json argument is the model architecture, the first .h5 file is the weights file, and the final .h5 file is the sample output that will be used for validation.
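If you're curious about what the script produced, h5py can list the contents of the output file. This is just a generic inspection sketch; the dataset names you'll see are whatever the script wrote (typically per-layer outputs):

import h5py

# Print the path of every group and dataset in the generated file.
with h5py.File("simple_output.h5", "r") as f:
    f.visit(print)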

Conversion to Cortex

I’ve outlined a dev REPL process for loading a Keras model here.

There is one native dependency, libhdf5-dev, that needs to be sorted out. You can use the instructions in the think.hdf5 repo to install it.

The key part of the current version of the script is to use load-sidecar-and-verify in think.cortex.keras.core, pointing it at the three artifacts we generated in the previous steps:

(def model
  (keras/load-sidecar-and-verify "python/simple.json"
                                 "python/simple.h5"
                                 "python/simple_output.h5"))

If all goes well, you should see information about layer loading and reshaping printed to stdout:

loading weights/bias for :convolution2d_7
Reshaping weights for :convolution2d_7
loading weights/bias for :convolution2d_8
Reshaping weights for :convolution2d_8
loading weights/bias for :convolution2d_9
...
Reshaping output for:  :convolution2d_7 [32 32 32] 32768
Reshaping output for:  :activation_11 [32 32 32] 32768
Reshaping output for:  :maxpooling2d_7 [16 16 32] 8192
Reshaping output for:  :dropout_9 [16 16 32] 8192
Reshaping output for:  :convolution2d_8 [16 16 64] 16384
...
No reshape required for:  :dense_5
No reshape required for:  :activation_14
#'keras-to-cortex.converter/model

If this fails, you’ll see an ex-info thrown listing the layers that failed to load, along with their names and dimensions. Since the model is just data, saving and loading with nippy just works:

;; save
(with-open [w (clojure.java.io/output-stream converted-model-uri)]
  (nippy/freeze-to-out! (DataOutputStream. w) model))

;; load
(def restored-model
  (with-open [r (clojure.java.io/input-stream converted-model-uri)]
    (nippy/thaw-from-in! (DataInputStream. r))))

Using the Cortex model for Inference

Using the model for inference requires loading data (in this case an image), wrapping it in a dataset abstraction (in this case an in-memory dataset), and feeding it into the restored model. The data representation of the neural net is not quite ready for inference: it needs to be loaded onto a Cortex backend. This is accomplished with build-and-create-network, which takes the data from the network description we’ve loaded, a backend, and a batch size for the inputs that will be passed to it (in our case, just 1).

(def nn-model
  (future
    (desc/build-and-create-network
      (with-open [r (clojure.java.io/input-stream "resources/simple.cnn")]
        (nippy/thaw-from-in! (DataInputStream. r)))
      (backend/create-cpu-backend)
      1)))

In this case I’ve wrapped the model load in a future (the inference call will deref it) so that the time spent loading the model doesn’t impact application startup time. This isn’t much of an issue with a smaller network, but it can certainly make a difference as networks get larger. For the backend, I’m using the compute backend, which provides a pure Clojure implementation that follows idioms unified with the GPU compute backend.

There are a few other pieces that may not seem entirely straightforward if you’re new to Cortex:

(def image-shape
  (ds/create-image-shape 3 32 32))

This returns an image shape map (including the :width and :height keys that infer relies on below) with the structure expected by other Cortex routines.

(patch/image->patch resized)

The patch routines in think.image return a vector of red, blue, and green channel image data, in the form that Cortex datasets can wrap and run through image augmentation processes.

(ds/create-in-memory-dataset {:data {:data [rgb-vec]
                                     :shape image-shape}}
                             [0])

This is essentially nesting our data in the dataset abstraction so it can be provided as input to the model. The data in this case consists of only one kind of input, with a single image sample, whose :shape is image-shape (declared previously). Because we only have one example, we only have one index that corresponds to it, index 0. For a less trivial dataset, the nested [0] could be a range of integer values, or some other key by which the data in the dataset could be related to a specific training record, as in the sketch below.
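For illustration, a hypothetical three-image dataset (img-a, img-b, and img-c standing in for patch vectors like rgb-vec above) would look like:

;; Three samples, so three indices relating dataset entries to records.
(ds/create-in-memory-dataset {:data {:data  [img-a img-b img-c]
                                     :shape image-shape}}
                             [0 1 2])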

These details are all encapsulated in the infer function, so that we can go straight from image to prediction with less fuss:

(defn infer
  [buf-img]
  (let [model   @nn-model
        resized (img/resize buf-img (:width image-shape) (:height image-shape))
        rgb-vec (patch/image->patch resized)
        img-ds  (ds/create-in-memory-dataset {:data {:data [rgb-vec]
                                                     :shape image-shape}}
                                             [0])]
    (ffirst (train/run model img-ds [:data]))))
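Calling it at the REPL then looks something like the following sketch. The file name and the CIFAR-10 class list are my additions for illustration (this is the standard class ordering, but verify it against how your training labels were encoded), and it assumes the returned probabilities are a seqable vector:

(import 'javax.imageio.ImageIO 'java.io.File)

;; Standard CIFAR-10 class order; check against your label encoding.
(def cifar-10-classes
  ["airplane" "automobile" "bird" "cat" "deer"
   "dog" "frog" "horse" "ship" "truck"])

(let [img   (ImageIO/read (File. "zazzy.png"))
      probs (infer img)]
  ;; Pair each class name with its probability and take the argmax.
  (apply max-key second (map vector cifar-10-classes probs)))
;; => e.g. ["airplane" 0.73] (illustrative, not actual output)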

The short REPL workflow I’ve provided in the inference namespace demonstrates how to call the provided routines on new data to get predictions. For example, the simple network I trained on CIFAR-10 and converted is fairly certain my dog Zazzy (at least in her 32x32 representation) is some kind of plane:

Figure 1. Zazzy: definitely a plane.

Wrap Up

We’ve now seen a workflow for converting a model from Keras to Cortex, then loading it into a project for inference. This was all made possible by think.cortex.keras, available here. Similar functionality is provided for Caffe. This means that even if you’re stuck with Python for machine learning prototyping or training infrastructure, you can still get the fruits of your AI labor into Clojure for inference!

Support for model import is still in its early days, but both contributions and detailed reports of model import failures are welcome!

Tags: clojure deep learning Keras TensorFlow neural networks python