Example: Operationalize a pre-trained model

You work for a pharmaceutical company and you are studying data on flower formation in foxgloves to provide a better treatment for heart disease. You want to find out whether the common foxglove (Digitalis purpurea) or the woolly foxglove (Digitalis lanata) can provide a better prognosis.

To perform your research, you must classify data on the length and width of the flower sepals and petals by flower species. To classify the data, you trained a model outside of Data Integration.

To operationalize the pre-trained model, complete the following tasks:

1. Create a mapping that contains a passive Python transformation and list the pre-trained model as a resource file.

2. Write a Python script that accesses the pre-trained model.

3. Pass the data on flower sepals and petals to the Python transformation to classify the data by foxglove species.

The following table shows the input fields for the sepal and petal data that you can pass to the Python transformation:

Name	Type	Precision
sepal_length	decimal	10
sepal_width	decimal	10
petal_length	decimal	10
petal_width	decimal	10
true_class	string	50

The passive Python transformation uses the following components:

Resource File

Specify the path of the pre-trained model as the resource file.

For example, you might use a pre-trained model that is stored in the file foxgloveDataMLmodel.pkl in one of the following locations:

- A path that is relative to the <Secure Agent installation directory>/ext/python directory on the Secure Agent machine.

For example, if the resource file is <Secure Agent installation directory>/ext/python/folder1/foxgloveDataMLmodel.pkl, the relative path is /folder1/foxgloveDataMLmodel.pkl.

- The supplementary file location for a serverless runtime environment.

For example, /data/home/dtmqa/data/foxgloveDataMLmodel.pkl
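
The relative-path rule for a Secure Agent machine can be sketched in Python. This is only an illustration: the helper name resource_relative_path and the installation directory /opt/agent are hypothetical, and only the ext/python layout and the resulting /folder1/... form come from the example above.

```python
from pathlib import PurePosixPath


def resource_relative_path(agent_dir: str, model_path: str) -> str:
    """Return model_path relative to <agent_dir>/ext/python, with a leading slash.

    Hypothetical helper for illustration; it is not part of Data Integration.
    """
    base = PurePosixPath(agent_dir) / "ext" / "python"
    # relative_to raises ValueError if model_path is not located under base.
    relative = PurePosixPath(model_path).relative_to(base)
    return "/" + relative.as_posix()


# "/opt/agent" stands in for <Secure Agent installation directory>.
print(resource_relative_path(
    "/opt/agent",
    "/opt/agent/ext/python/folder1/foxgloveDataMLmodel.pkl",
))
```

With these assumed inputs, the helper prints /folder1/foxgloveDataMLmodel.pkl, which matches the relative path in the example.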

Python Code

Specify the Python code in the Pre-Partition Python Code and Main Python Code sections.

Use the Pre-Partition Python Code section to import libraries, load the resource file, and initialize variables.

For example, you might enter the following code in the Pre-Partition Python Code section:

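from sklearn import svm
import numpy as np
try:
    import joblib  # standalone package; sklearn.externals.joblib was removed in scikit-learn 0.23
except ImportError:
    from sklearn.externals import joblib  # bundled copy in older scikit-learn releases
# Load the pre-trained model from the first resource file.
clf = joblib.load(resourceFileArrays[0])
# Class names, indexed by the integer label that the model predicts.
classes = ['common', 'woolly']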
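
For context, a model file such as foxgloveDataMLmodel.pkl might be produced outside Data Integration roughly as follows. This is a minimal sketch under stated assumptions, not the method used to build the model in this example: the training rows are made-up placeholders, the choice of svm.SVC is only suggested by the svm import above, the standalone joblib package is assumed to be installed, and the file is written to a temporary directory. The labels are the integers 0 and 1 so that a prediction can index the classes list.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn import svm

# Placeholder measurements in the order sepal_length, sepal_width,
# petal_length, petal_width. Made-up values for illustration only.
X = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
])
# Integer labels: 0 -> 'common', 1 -> 'woolly', matching the classes list.
y = np.array([0, 0, 1, 1])

clf = svm.SVC(kernel="linear")
clf.fit(X, y)

# Serialize the fitted model. A file like this is what the mapping
# would list as the resource file.
model_path = os.path.join(tempfile.mkdtemp(), "foxgloveDataMLmodel.pkl")
joblib.dump(clf, model_path)

# Reload the model and classify one row, mirroring what the
# transformation does with each input row.
loaded = joblib.load(model_path)
pred = loaded.predict(np.array([5.0, 3.4, 1.5, 0.2]).reshape(1, -1))
print(pred[0])
```

A pickled scikit-learn model is generally expected to be loaded with the same scikit-learn version that wrote it, so the library version in the runtime environment should match the training environment.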

Use the Main Python Code section to define how the Python transformation uses the pre-trained model to evaluate each row of data.

For example, you might enter the following code in the Main Python Code section:

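# Collect the four measurements for the current row.
features = [sepal_length, sepal_width, petal_length, petal_width]
# Reshape to a single-sample 2-D array, as predict expects.
features = np.array(features).reshape(1, -1)
pred = clf.predict(features)
# Map the predicted integer label to the species name.
predicted_class = classes[pred[0]]
# Pass the input fields through to the output.
sepal_length_out = sepal_length
sepal_width_out = sepal_width
petal_length_out = petal_length
petal_width_out = petal_width
true_class_out = true_class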
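
To try the row-level logic outside Data Integration, the Main Python Code can be wrapped in a function and run against a stand-in model. Everything in this sketch is an assumption made for illustration: StubModel, classify_row, and the sample rows are hypothetical, and the stub only imitates the predict interface of a scikit-learn classifier. Because true_class is passed through to the output, the results also allow a simple comparison with the predicted class.

```python
class StubModel:
    """Hypothetical stand-in for the pre-trained model (not a real classifier)."""

    def predict(self, rows):
        # Return 1 ('woolly') when petal_length exceeds an arbitrary threshold.
        return [1 if row[2] > 3.0 else 0 for row in rows]


classes = ['common', 'woolly']


def classify_row(model, row):
    """Mirror the Main Python Code for one input row given as a dict."""
    # A nested list with one inner row has the same (1, 4) shape that
    # np.array(...).reshape(1, -1) produces in the transformation.
    features = [[row['sepal_length'], row['sepal_width'],
                 row['petal_length'], row['petal_width']]]
    pred = model.predict(features)
    # Pass every input field through with an _out suffix.
    out = {name + '_out': value for name, value in row.items()}
    out['predicted_class'] = classes[pred[0]]
    return out


# Made-up sample rows for illustration only.
rows = [
    {'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4,
     'petal_width': 0.2, 'true_class': 'common'},
    {'sepal_length': 6.3, 'sepal_width': 3.3, 'petal_length': 6.0,
     'petal_width': 2.5, 'true_class': 'woolly'},
    {'sepal_length': 5.8, 'sepal_width': 2.7, 'petal_length': 2.9,
     'petal_width': 1.9, 'true_class': 'woolly'},
]

model = StubModel()
results = [classify_row(model, row) for row in rows]
# Count the rows where the prediction agrees with the pass-through label.
correct = sum(r['predicted_class'] == r['true_class_out'] for r in results)
print(f"{correct} of {len(results)} rows match true_class")
```

With the stub and the made-up rows, the third row falls below the arbitrary threshold and is misclassified, so the sketch reports 2 of 3 matching rows. A real model loaded from the resource file would replace StubModel.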