You work for a pharmaceutical company and you are studying data on flower formation in foxgloves to provide a better treatment for heart diseases. You want to find out whether the common foxglove Digitalis purpurea or the woolly foxglove Digitalis lanata can provide a better prognosis.
To perform your research, you must classify data on the length and width of the flower sepals and petals by flower species. To classify the data, you developed a pre-trained model outside of Data Integration.
To operationalize the pre-trained model, complete the following tasks:
1. Create a mapping that contains a passive Python transformation, and list the pre-trained model as a resource file.
2. Write a Python script that accesses the pre-trained model.
3. Pass the data on flower sepals and petals to the Python transformation to classify the data by foxglove species.
The following table describes the input fields for the sepal and petal data that you can pass to the Python transformation:
Name            Type       Precision
sepal_length    decimal    10
sepal_width     decimal    10
petal_length    decimal    10
petal_width     decimal    10
true_class      string     50
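As a rough illustration of how one row with these fields maps to model input, the following sketch converts the four decimal measurements into a list of floats in the order of the table. It assumes that a row is available as a dictionary keyed by field name, which is only a convenience for this example. The helper to_feature_vector, the name sample_row, and the measurement values are hypothetical and are not part of Data Integration.

```python
from decimal import Decimal

# Field order taken from the table above. true_class holds the known label
# and is not used as a model feature.
FEATURE_FIELDS = ("sepal_length", "sepal_width", "petal_length", "petal_width")


def to_feature_vector(row):
    """Return the four measurements of one row as floats, in table order."""
    return [float(row[name]) for name in FEATURE_FIELDS]


# Hypothetical sample row; the measurements are made-up illustration values.
sample_row = {
    "sepal_length": Decimal("5.1"),
    "sepal_width": Decimal("3.5"),
    "petal_length": Decimal("1.4"),
    "petal_width": Decimal("0.2"),
    "true_class": "common",
}

features = to_feature_vector(sample_row)
```

Keeping the field order in one place makes it harder to pass the measurements to the model in a different order than the one used when the model was trained.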
The passive Python transformation uses the following components:
Resource File
Specify the path of the pre-trained model as the resource file.
For example, you might use a pre-trained model that is stored in the file foxgloveDataMLmodel.pkl in one of the following locations:
- Path that is relative to the location on the Secure Agent machine.
For example, if the resource file is under <Secure Agent installation directory>/ext/python/folder1/foxgloveDataMLmodel.pkl, then the relative path would be /folder1/foxgloveDataMLmodel.pkl.
- The supplementary file location for a serverless runtime environment.
For example, /data/home/dtmqa/data/foxgloveDataMLmodel.pkl
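The relative-path rule for the Secure Agent machine can be checked with a short sketch like the following. It assumes, based on the example above, that the relative path is the part of the file path that follows <Secure Agent installation directory>/ext/python. The function resource_relative_path and the installation directory /opt/infaagent are hypothetical names used only for illustration.

```python
from pathlib import PurePosixPath


def resource_relative_path(agent_dir, model_path):
    """Return the model path relative to <agent_dir>/ext/python, with a leading slash."""
    base = PurePosixPath(agent_dir) / "ext" / "python"
    # relative_to raises ValueError if model_path is not under the base directory.
    relative = PurePosixPath(model_path).relative_to(base)
    return "/" + relative.as_posix()


# Hypothetical Secure Agent installation directory, for illustration only.
agent_dir = "/opt/infaagent"
model_path = "/opt/infaagent/ext/python/folder1/foxgloveDataMLmodel.pkl"

relative_path = resource_relative_path(agent_dir, model_path)
```

With these inputs, relative_path holds /folder1/foxgloveDataMLmodel.pkl, which matches the example above.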
Python Code
Specify the Python code in the Pre-Partition Python Code and Main Python Code sections.
Use the Pre-Partition Python Code section to import libraries, load the resource file, and initialize variables.
For example, you might enter the following code in the Pre-Partition Python Code section:
from sklearn import svm
from sklearn.externals import joblib
import numpy as np
clf = joblib.load(resourceFileArrays[0])
classes = ['common', 'woolly']
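The pattern in the pre-partition code above is to load the resource file once and then reuse the loaded model for every row. The following sketch shows that pattern in a form that runs outside Data Integration with only the Python standard library. It makes several assumptions: a dictionary of per-class centroids stands in for the scikit-learn model, pickle stands in for joblib, the centroid values are made up, and resourceFileArrays is simulated as a plain list. The function predict_index is hypothetical. Note also that newer scikit-learn releases no longer include sklearn.externals.joblib, so in such environments the standalone joblib package would be imported instead.

```python
import os
import pickle
import tempfile

# Stand-in for a pre-trained model: one centroid per class.
# The numbers are made-up illustration values, not real foxglove data.
standin_model = {"centroids": [[5.0, 3.4, 1.5, 0.2], [6.3, 2.9, 5.0, 1.7]]}

# Save the stand-in model to a temporary file, as if it had been trained elsewhere.
model_path = os.path.join(tempfile.mkdtemp(), "standin_model.pkl")
with open(model_path, "wb") as handle:
    pickle.dump(standin_model, handle)

# Simulated here; in the Python transformation, resourceFileArrays is
# provided by the transformation, as in the code above.
resourceFileArrays = [model_path]

# Equivalent of the pre-partition step: load the model once and define the labels.
with open(resourceFileArrays[0], "rb") as handle:
    model = pickle.load(handle)
classes = ['common', 'woolly']


def predict_index(model, features):
    """Return the index of the centroid closest to the feature vector."""
    distances = [
        sum((a - b) ** 2 for a, b in zip(features, centroid))
        for centroid in model["centroids"]
    ]
    return distances.index(min(distances))


# Reuse the loaded model for a row and map the result to a class label.
label = classes[predict_index(model, [5.1, 3.5, 1.4, 0.2])]
```

Because the model is loaded before any rows are processed, the cost of reading the file is paid once rather than for each row.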
Use the Main Python Code section to define how the Python transformation uses the pre-trained model to evaluate each row of data.
For example, you might enter the following code in the Main Python Code section: