In this article we will take a quick look at a vitally important aspect of machine learning. It’s one that we sometimes overlook – bringing our predictive models into production. And specifically: bringing our models into production without changing them in any way!
We must give consideration to the languages used in model training and model deployment – and we should do this before any model work begins! It is better to consider the two environments (and sometimes the two teams) as a whole and then work to a common interface. Imagine if the two engineering teams responsible for the construction of the Channel Tunnel – from the English and French ends – had only considered alignment issues just prior to completion, instead of at the outset of the project!*
If we use the model in a “static” way – i.e. on data-at-rest, away from production processes – then this is not so critical. But if we want to make predictions in near-real-time on data-in-motion, we will need a scalable, enterprise-ready development environment – most likely C# or Java. This is so that we can take advantage of the mature memory-management, concurrency and security features that such languages offer.
Here, we will limit our discussion to Java as the deployment language. This is not the only approach and may not fit your use-case, but it’s a scenario that is not uncommon. It is also important to remember that callouts against our model are often “lightweight”, in the sense that we are applying weights, coefficients etc. to the small amount of data that is the subject of the prediction we are making.
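To make “lightweight” concrete, here is a minimal sketch of what such a callout can boil down to once the coefficients have been exported. It assumes a logistic-regression model; the class name `LightweightScorer` and any coefficient values you pass in are purely illustrative and do not come from a particular framework.

```java
import java.util.Arrays;

/** Hypothetical scorer: applies exported logistic-regression coefficients to one row of features. */
final class LightweightScorer {
    private final double[] weights;  // one coefficient per feature, as exported from training
    private final double intercept;

    LightweightScorer(double[] weights, double intercept) {
        this.weights = Arrays.copyOf(weights, weights.length); // defensive copy
        this.intercept = intercept;
    }

    /** Returns the predicted probability for a single feature vector. */
    double predict(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException(
                    "expected " + weights.length + " features, got " + features.length);
        }
        double z = intercept;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i]; // weighted sum of the inputs
        }
        return 1.0 / (1.0 + Math.exp(-z)); // logistic (sigmoid) link
    }
}
```

A single prediction is one small loop over the features, which is why the heavy lifting stays in training while the production call remains cheap.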
Strategies for deploying to production
Our options are not endless, as there are only a certain number of frameworks available that provide such a bridge. This is certainly not an exhaustive list, but suitable candidates would be, in no particular order:
- Dl4j
- H2O
- Weka
- Python + REST (Python Flask)
Let’s have a quick look at each of them.
Dl4j
Dl4j is a Java machine learning library which we can use to either write or import models. The import option is interesting if your data scientists are using the increasingly popular Keras library (on top of e.g. TensorFlow), as Dl4j offers a direct import harness for such models. Be careful to create the input arrays correctly, as there are some slight differences here between Keras and Dl4j! Otherwise the Dl4j team keep the compatibility with Keras versions fairly well up-to-date.
H2O
H2O is a Java library with a Python interface – so it’s very simple to export your model into production as Java classes. It is probably advisable to keep them in a distinct .jar artefact so that versioning is as transparent as possible.
Alternatively, for some types of model we can export them in the form of a zipped folder which we can reference as a Maven resource. This allows us to switch out our model implementation at runtime, without having to recompile any packages.
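The idea of switching a model at runtime can be sketched independently of any one library. The example below is an assumption-laden illustration, not H2O’s API: `Model` and `ModelHolder` are hypothetical names, and the `loader` function stands in for whatever library call actually reads the zipped artefact.

```java
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical common interface agreed between the training and deployment sides. */
interface Model {
    double predict(double[] features);
}

/** Holds the currently active model and lets it be replaced while the service is running. */
final class ModelHolder {
    private final Function<Path, Model> loader; // stand-in for the library call that reads the artefact
    private final AtomicReference<Model> current = new AtomicReference<>();

    ModelHolder(Function<Path, Model> loader, Path initialArtefact) {
        this.loader = loader;
        reload(initialArtefact);
    }

    /** Loads a new artefact and swaps it in atomically. */
    void reload(Path artefact) {
        Model next = loader.apply(artefact); // if loading throws, the old model stays active
        current.set(next);
    }

    /** Delegates to whichever model is active at the time of the call. */
    double predict(double[] features) {
        return current.get().predict(features);
    }
}
```

Callers only ever see `ModelHolder.predict`, so pointing `reload` at a new zip file changes the behaviour of the running service without touching or recompiling the calling code.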
The H2O Python environment offers some really nice extras – for instance, the ability to visualize model training results in a separate H2O server, and the built-in preprocessing options (standardization and normalization).
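If any scaling is done outside the exported model, the production side has to repeat it with the statistics computed on the training data. The following is a minimal sketch of the two transformations mentioned above; `FeatureScaler` and its parameter names are hypothetical, and whether you need this at all depends on how your framework exports its preprocessing.

```java
/** Hypothetical helper: re-applies training-time scaling to a single value at prediction time. */
final class FeatureScaler {
    private FeatureScaler() {}

    /** Standardization (z-score): centre on the training mean, divide by the training std dev. */
    static double standardize(double x, double trainMean, double trainStdDev) {
        if (trainStdDev == 0.0) {
            return 0.0; // feature was constant in training, so it carries no signal
        }
        return (x - trainMean) / trainStdDev;
    }

    /** Min-max normalization: map the training range [trainMin, trainMax] onto [0, 1]. */
    static double normalize(double x, double trainMin, double trainMax) {
        if (trainMax == trainMin) {
            return 0.0; // degenerate range, avoid division by zero
        }
        return (x - trainMin) / (trainMax - trainMin);
    }
}
```

Note that values outside the training range will fall outside [0, 1] after normalization; the sketch leaves any clipping to the caller.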
Weka
Weka has been around for some time as the data mining component of the Pentaho stack (now distributed by Hitachi). However, it is much more than “just” a data mining component: it’s also a standalone tool in its own right. Open source and offering a suite of features for rapid model development, it can easily transform and scale data, run different algorithms and select the best performing model. It’s written in Java, and like Dl4j and H2O it is possible in production to load the model directly from file as an addressable artefact. One of the minor downsides is that the data input format is fairly specific to Weka. Everything runs in the JVM so you may have a limitation on the amount of data you can process, too.
Python + REST
There may however be no easy way to make your model available to your enterprise layer. This could be due to the algorithms or the machine learning language you are using. You may decide to keep the model in e.g. Python and to expose it to the outside world via a REST service.
Setting up a clean Python Flask REST service is a little more involved than I expected it to be! – but sometimes Python has ready-made algorithms that Java does not. Bayesian optimization is a good example of this – see scikit-optimize for more details, as well as my article on this topic.
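From the Java side, calling such a service can look roughly like the sketch below. It uses the JDK’s own `java.net.http.HttpClient` (Java 11 or later). The endpoint URL, the `{"features":[...]}` payload shape and the class name `RestModelClient` are all assumptions for illustration; your service defines its own contract.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Arrays;
import java.util.stream.Collectors;

/** Hypothetical Java client for a model that stays in Python behind a REST endpoint. */
final class RestModelClient {
    private final HttpClient http = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))
            .build();
    private final URI endpoint; // assumed URL, e.g. http://localhost:5000/predict

    RestModelClient(URI endpoint) {
        this.endpoint = endpoint;
    }

    /** Serialises one feature vector as {"features":[...]}; the payload shape is an assumption. */
    static String toJson(double[] features) {
        return Arrays.stream(features)
                .mapToObj(Double::toString)
                .collect(Collectors.joining(",", "{\"features\":[", "]}"));
    }

    /** Builds the POST request; kept separate so it can be inspected without a running service. */
    HttpRequest buildRequest(double[] features) {
        return HttpRequest.newBuilder(endpoint)
                .timeout(Duration.ofSeconds(5))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(features)))
                .build();
    }

    /** Sends the features and returns the raw response body for the caller to parse. */
    String predict(double[] features) throws IOException, InterruptedException {
        HttpResponse<String> response =
                http.send(buildRequest(features), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("prediction service returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

The timeouts matter here: unlike an in-process model, a REST callout adds network latency and a new failure mode, so the caller should decide up front how long it is prepared to wait.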
This overview is of course incomplete and subjective for several reasons. I have never used R and haven’t written code in C# for several years. Our focus is specifically on models that we will deploy to micro-service environments in production. Another aspect of model training is the need not only to straddle different environments (discussed here), but also to find the right balance between training models at different levels of scale. That will be the subject of a separate article.
*As a young civil engineer I attended a surveying course near one of the main bases for the Channel Tunnel engineering teams, a year before the final breakthrough was made at the end of 1990. There had been rumours that the successful alignment of both ends of the tunnel was a close-run thing, although in the event the two tunnels were only off by 350mm horizontally and 60mm vertically over 30+km of underwater tunnelling: quite an achievement. Of the three boring machines used, one is still exhibited near the French end of the tunnel, one was driven 50m further down into the ground, where it still is, acting as an earth, and the third was auctioned off on eBay!