# Model Serving

Inference Model is a package in Analytics Zoo that provides high-level APIs to speed up development. It allows users to conveniently use pre-trained models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR). Inference Model provides Java, Scala and Python interfaces.
## Highlights
- Easy-to-use APIs for loading and predicting with deep learning models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR).
- Supports transformation of various input data types, which in turn supports subsequent prediction tasks.
- Transparently supports the OpenVINO toolkit, which delivers a significant boost in inference speed (up to 19.9x).
## Load and predict with a pre-trained model

Basic usage of Inference Model:

1. Directly use `InferenceModel`, or write a subclass that extends `InferenceModel` (`AbstractInferenceModel` in Java).
2. Load a pre-trained model with the corresponding `load` method, e.g. `doLoadBigDL` for Analytics Zoo and `doLoadTensorflow` for TensorFlow.
3. Do prediction with the `predict` method.
Supported models: Analytics Zoo (BigDL) models, Caffe models, TensorFlow models and OpenVINO IR models.
### Predict input and output

- `predictInput`: `JList[JList[JTensor]]` or `Tensor` for Scala and Java, Numpy for Python. Input data for prediction. A `JTensor` holds a 1-D data list together with an `Array[Int]` shape.
- `predictOutput`: `JList[JList[JTensor]]` or `Tensor` for Scala and Java, Numpy for Python. Prediction result.
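To make the flat-data-plus-shape convention concrete, here is a minimal Python sketch. The `make_jtensor` helper is purely illustrative (it is not part of the Analytics Zoo API); it shows how a nested input is flattened into the 1-D data list and explicit shape that a `JTensor` carries, and how inputs are batched for `predict`.

```python
def make_jtensor(data, shape):
    """Illustrative stand-in for JTensor: flatten nested data and
    validate it against the given shape. Not an Analytics Zoo API."""
    flat = []

    def _flatten(x):
        if isinstance(x, (list, tuple)):
            for item in x:
                _flatten(item)
        else:
            flat.append(float(x))

    _flatten(data)
    size = 1
    for dim in shape:
        size *= dim
    assert len(flat) == size, "data does not match shape"
    return {"data": flat, "shape": shape}

# A 2x3 input tensor stored as a flat list with shape [2, 3]
t = make_jtensor([[1, 2, 3], [4, 5, 6]], [2, 3])

# predict expects a batch (outer list) of inputs (inner list) of tensors
predict_input = [[t]]
```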
OpenVINO requirements:

- Ubuntu 18.04 LTS (64 bit)
- CentOS 7.4 (64 bit)
- macOS 10.13, 10.14 (64 bit)
Python requirements:

- tensorflow>=1.2.0
- networkx>=1.11
- numpy>=1.12.0
- protobuf==3.6.1
## Java

Write a subclass that extends `AbstractInferenceModel`, implementing or overriding methods as needed. Then load a model with the corresponding `load` method (`loadBigDL`, `loadCaffe`, `loadOpenVINO` or `loadTensorflow` for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), and do prediction with the `predict` method.
```java
import com.intel.analytics.zoo.pipeline.inference.AbstractInferenceModel;
import com.intel.analytics.zoo.pipeline.inference.JTensor;

public class ExtendedInferenceModel extends AbstractInferenceModel {
    public ExtendedInferenceModel() {
        super();
    }
}

ExtendedInferenceModel model = new ExtendedInferenceModel();
// Load Analytics Zoo model
model.loadBigDL(modelPath, weightPath);
// Predict
List<List<JTensor>> result = model.predict(inputList);
```
## Scala

Create an instance of `InferenceModel` and load a model with the corresponding `load` method (`doLoadBigDL`, `doLoadCaffe`, `doLoadOpenVINO` or `doLoadTensorflow` for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), then do prediction with the `doPredict` method.
```scala
import com.intel.analytics.zoo.pipeline.inference.InferenceModel

val model = new InferenceModel()
// Load Analytics Zoo model
model.doLoadBigDL(modelPath, weightPath)
// Predict
val result = model.doPredict(inputList)
```
In some cases, you may want to write a subclass that extends `InferenceModel` and implement or override methods. Then load a model with the corresponding `load` method, and do prediction with the `doPredict` method.
```scala
import com.intel.analytics.zoo.pipeline.inference.InferenceModel

class ExtendedInferenceModel extends InferenceModel {
}

val model = new ExtendedInferenceModel()
// Load Analytics Zoo model
model.doLoadBigDL(modelPath, weightPath)
// Predict
val result = model.doPredict(inputList)
```
## Python

Create an instance of `InferenceModel` and load a model with the corresponding `load` method (`load_bigdl`, `load_caffe`, `load_openvino` or `load_tensorflow` for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), then do prediction with the `predict` method.
```python
from zoo.pipeline.inference import InferenceModel

model = InferenceModel()
# Load Analytics Zoo model
model.load_bigdl(model_path, weight_path)
# Predict
result = model.predict(input_list)
```
In some cases, you may want to write a subclass that extends `InferenceModel` and implement or override methods. Then load a model with the corresponding `load` method, and do prediction with the `predict` method.
```python
from zoo.pipeline.inference import InferenceModel

class ExtendedInferenceModel(InferenceModel):

    def __init__(self):
        pass

model = ExtendedInferenceModel()
# Load Analytics Zoo model
model.load_bigdl(model_path, weight_path)
# Predict
result = model.predict(input_list)
```
## Examples

We provide examples based on `InferenceModel`:

- See here for the Java example.
- See here for the Scala example.