2 Using ONNX models from Clojure
(ns onnx
  (:require [tablecloth.api :as tc]
            [clojure.java.data :as j])
  (:import [ai.onnxruntime OrtEnvironment
            OnnxTensor]))
We assume that we get an ONNX file from somewhere and want to use it from Clojure. A model can, for example, be trained with Python's sklearn and exported to ONNX format. There are also model zoos from which pre-trained models can be downloaded in ONNX format.
We can then open such a file using the Java ONNX Runtime, which has the Maven coordinates com.microsoft.onnxruntime/onnxruntime {:mvn/version "1.19.0"}.
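As a minimal sketch, the coordinates above could be placed in a deps.edn file as follows. The other libraries required by the namespace declaration (tablecloth and clojure.java.data) need entries as well; their versions are not given in this text, so they are left out here.

```clojure
;; deps.edn (sketch): only the ONNX Runtime dependency named in the text.
;; Add tablecloth and clojure.java.data with versions of your choice.
{:deps {com.microsoft.onnxruntime/onnxruntime {:mvn/version "1.19.0"}}}
```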
2.1 Load and inspect ONNX file
We use here a model which was trained on the well-known iris data and can predict the species.
(def env (OrtEnvironment/getEnvironment))
(def session (.createSession env "logreg_iris.onnx"))
We can inspect the model and among other information discover which input format it needs.
(j/from-java-deep
 (.getInputInfo session) {})
{"float_input"
 {:info
  {:dimensionNames ("" ""),
   :numElements -4,
   :scalar false,
   :shape (-1 4)},
  :name "float_input"}}
This shows us that it has one input called "float_input", which needs to be a 2D tensor with dimensions (anyNumber, 4). This matches our knowledge of the iris data, which has 4 feature columns (plus the species to predict).
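The "anyNumber" reading of the reported shape can be made concrete with a small check. This is only a sketch: `shape-compatible?` is a hypothetical helper, not part of the ONNX Runtime API, and it assumes that -1 in the reported shape stands for a dimension of any size.

```clojure
(defn shape-compatible?
  "Hypothetical helper: true when the concrete shape `actual` fits the
  `declared` shape, treating -1 in `declared` as a dimension of any size."
  [declared actual]
  (and (= (count declared) (count actual))
       (every? (fn [[d a]] (or (= -1 d) (= d a)))
               (map vector declared actual))))

;; Two rows of four features fit the declared shape (-1 4).
(shape-compatible? [-1 4] [2 4])

;; Rows with three features do not, and neither does a 1D tensor.
(shape-compatible? [-1 4] [2 3])
(shape-compatible? [-1 4] [4])
```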
In a similar way we can introspect the model output returned on inference:
(j/from-java-deep
 (.getOutputInfo session) {})
{"output_label"
 {:info
  {:dimensionNames (""), :numElements -1, :scalar false, :shape (-1)},
  :name "output_label"},
 "output_probability"
 {:info {:sequenceOfMaps true}, :name "output_probability"}}
This model predicts one value for each row of the input, which also matches the iris data. Now we need to construct an instance of ai.onnxruntime.OnnxTensor of shape [anyNumber, 4]. This can be done starting from a vector of vectors.
2.2 Run inference on arrays
(def input-data
  [[7 0.5 0.5 0.5]
   [0.5 1 1 1]])
(def tensor
  (OnnxTensor/createTensor
   env
   (into-array (map float-array input-data))))
tensor
#object[ai.onnxruntime.OnnxTensor 0x166229ef "OnnxTensor(info=TensorInfo(javaType=FLOAT,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,shape=[2, 4]),closed=false)"]
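The conversion above turns each row into a float array and collects the rows into a 2D Java array. A minimal sketch of the same step with an added width check follows; `rows->float-2d` is a hypothetical helper name, and the check is an addition that is not part of the original workflow.

```clojure
(defn rows->float-2d
  "Hypothetical helper: converts a seq of numeric rows into a float[][],
  throwing if any row does not have exactly `n-features` entries."
  [rows n-features]
  (doseq [row rows]
    (when (not= n-features (count row))
      (throw (ex-info "Unexpected row width"
                      {:expected n-features
                       :actual (count row)
                       :row row}))))
  ;; float-array coerces longs and doubles to floats; into-array then
  ;; builds an array whose element type is float[].
  (into-array (map float-array rows)))

(def example-array
  (rows->float-2d [[7 0.5 0.5 0.5]
                   [0.5 1 1 1]]
                  4))
```

The resulting array can be passed to OnnxTensor/createTensor in place of the inline `into-array` expression.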
(def prediction (.run session {"float_input" tensor}))
prediction
#object[ai.onnxruntime.OrtSession$Result 0x181026b8 "ai.onnxruntime.OrtSession$Result@181026b8"]
We have two pieces of data in the prediction result:
(map key prediction)
("output_label" "output_probability")
namely the predicted labels and the probabilities. We need a bit of interop to get the numbers out of the prediction. The predicted species:
(-> prediction first val .getValue)
[0, 0]
The probability distribution over all species, for each input row:
(map
 #(.getValue %)
 (-> prediction second val .getValue))
({0 0.6436056, 1 0.35639435, 2 8.642593E-8}
 {0 0.99678445, 1 0.0032155933, 2 4.3943203E-8})
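Each probability map can be reduced to its most likely label, which should agree with the predicted labels shown earlier. The sketch below works on the printed values copied into plain Clojure maps; `most-likely` is a hypothetical helper, not something the model or the runtime provides.

```clojure
;; Probability maps copied from the output above.
(def probabilities
  [{0 0.6436056, 1 0.35639435, 2 8.642593E-8}
   {0 0.99678445, 1 0.0032155933, 2 4.3943203E-8}])

(defn most-likely
  "Hypothetical helper: returns the label with the highest probability
  in `prob-map`, together with that probability."
  [prob-map]
  (let [[label p] (apply max-key val prob-map)]
    {:label label :probability p}))

(mapv most-likely probabilities)
```

For both rows the most likely label is 0, which matches the predicted labels.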
2.3 Run inference on tech.dataset
Suppose we have our data in a tech.ml.dataset:
(def ds
  (tc/dataset [[0.5 0.5 0.5 0.5]
               [1 1 1 1]
               [1 1 2 7]
               [3 1 2 1]
               [7 8 2 10]]))
ds
:_unnamed [5 4]:
|   0 |   1 |   2 |    3 |
|-----|-----|-----|------|
| 0.5 | 0.5 | 0.5 |  0.5 |
| 1.0 | 1.0 | 1.0 |  1.0 |
| 1.0 | 1.0 | 2.0 |  7.0 |
| 3.0 | 1.0 | 2.0 |  1.0 |
| 7.0 | 8.0 | 2.0 | 10.0 |
We can just as easily convert it to an OnnxTensor:
(def tensor-2
  (OnnxTensor/createTensor
   env
   (into-array (map float-array (tc/rows ds)))))
Running predictions is then the same.
(def prediction-2 (.run session {"float_input" tensor-2}))
(.. prediction-2 (get 0) getValue)
[0, 0, 2, 0, 2]
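The predicted labels are integer class indices. To present them as species names, a lookup table can be applied. The sketch below assumes the usual sklearn iris ordering (0 = setosa, 1 = versicolor, 2 = virginica); this text does not state the encoding of this particular model, so the mapping is an assumption to verify against the training code.

```clojure
;; ASSUMPTION: class indices follow the usual sklearn iris ordering.
;; The encoding of this specific model is not stated in the text.
(def species-names
  {0 "setosa"
   1 "versicolor"
   2 "virginica"})

(defn labels->species
  "Hypothetical helper: maps class indices (a seq or a primitive array)
  to species names, yielding :unknown for indices not in the table."
  [labels]
  (mapv #(get species-names (long %) :unknown) labels))

;; Label values copied from the prediction output above.
(labels->species [0 0 2 0 2])
```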
Overall, we can use any ONNX model from Clojure. This allows polyglot scenarios where data preprocessing and model evaluation are done in Clojure, while training is done in Python with its huge ecosystem of ML models.
Hopefully, the ONNX standard will see widespread use over time. Most sklearn models and pipelines can be exported to ONNX using sklearn-onnx. Other Python ML frameworks are starting to support ONNX as well, for example PyTorch; see PyTorch ONNX export.