Model deployments

Use API resources to create a model deployment for a machine learning model, edit the deployment, and monitor deployments in your organization.

Creating a model deployment

Use the Documents resource to create a model deployment.

POST request

To create a model deployment, include the project or folder ID in the URI. Use the following URI:
/frs/v1/Projects('<project ID>')/Documents
Include the following fields in the request:
• name (String): Model deployment name.
• description (String): Optional. Description of the model deployment.
• documentType (String): Use MLOPS_DEPLOYMENT.
• documentState (String): Use an empty string.
• nativeData (Object): Blob object that defines the model deployment.
Include the following fields in the nativeData object:
• name (String): Model deployment name.
• modelId (String): ID of the machine learning model to associate with this deployment.
• computeUnits (Integer): Maximum number of compute units that you want the model deployment to use. Enter a whole number that is a multiple of 4 from 4 to 40.
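For example, the following Python sketch sends a create request with the requests library. The base URL, authentication header, and payload values are placeholders, not part of this API; substitute the values that your organization uses.

import requests

BASE_URL = "https://<Model Serve host>"          # placeholder: your API host
HEADERS = {"Authorization": "Bearer <token>"}    # placeholder: use the authentication that your organization requires

payload = {
    "name": "churn_scoring",                     # example deployment name
    "description": "Scores customer records with the churn model.",
    "documentType": "MLOPS_DEPLOYMENT",
    "documentState": "",
    "nativeData": {
        "name": "churn_scoring",
        "modelId": "<model ID>",                 # ID of the registered machine learning model
        "computeUnits": 4,                       # whole number that is a multiple of 4, from 4 to 40
    },
}

response = requests.post(
    f"{BASE_URL}/frs/v1/Projects('<project ID>')/Documents",
    json=payload,
    headers=HEADERS,
)
response.raise_for_status()
deployment_summary = response.json()             # model deployment summary, including the deployment ID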

POST response

Returns the model deployment summary.

Editing a model deployment

Use the Documents resource to edit a model deployment.

PATCH request

To edit a model deployment, include the deployment ID in the URI. Use the following URI:
/frs/v1/Documents('<deployment ID>')
Get the deployment ID from the response to a request to create a model deployment or to monitor model deployments.
Include the following fields in the request:
• name (String): Model deployment name.
• description (String): Optional. Description of the model deployment.
• documentType (String): Use MLOPS_DEPLOYMENT.
• nativeData (Object): Blob object that defines the model deployment.
Include the following fields in the nativeData object:
• name (String): Model deployment name.
• modelId (String): ID of the machine learning model to associate with this deployment.
• computeUnits (Integer): Maximum number of compute units that you want the model deployment to use. Enter a whole number that is a multiple of 4 from 4 to 40.
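For example, the following Python sketch raises the compute unit ceiling for an existing deployment. As in the create example, the base URL, authentication header, and field values are placeholders.

import requests

BASE_URL = "https://<Model Serve host>"          # placeholder: your API host
HEADERS = {"Authorization": "Bearer <token>"}    # placeholder: your authentication header

deployment_id = "<deployment ID>"                # returned when you created the deployment

payload = {
    "name": "churn_scoring",
    "documentType": "MLOPS_DEPLOYMENT",
    "nativeData": {
        "name": "churn_scoring",
        "modelId": "<model ID>",
        "computeUnits": 8,                       # raise the maximum from 4 to 8 compute units
    },
}

response = requests.patch(
    f"{BASE_URL}/frs/v1/Documents('{deployment_id}')",
    json=payload,
    headers=HEADERS,
)
response.raise_for_status()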

PATCH response

Returns the model deployment summary.

Monitoring model deployments

Use the deployment resource to monitor the status of a model deployment.

GET request

You can request the status of all deployments in the organization or the status of a particular deployment.
Status of all deployments in the organization
To get the status of all deployments in the organization, use the following URI:
/mlops/api/v1/deployment/monitor
You can include parameters in the request to sort and filter the results. Use the following syntax to specify parameters:
/mlops/api/v1/deployment/monitor?offset=<offset>&limit=<limit>&filter=<filter>&sortkey=<sort key>&sortdir=<sort direction>
The following list describes the parameters that you can use:
• offset: Number of results to offset from the beginning. For example, if you want the request to skip the top 5 results, set the offset to 5.
• limit: Number of results to return. For example, if you want only the top 10 results, set the limit to 10.
• filter: Term to use to filter results. The results only include deployments with the filter term in the deployment name.
• sortkey: Field to use to sort the results. Use one of the following values: NAME, DISPLAY_STATUS, LOCATION, STARTED_BY, START_TIME, STOP_TIME, DURATION.
• sortdir: Direction to sort the results. Use either ASC or DESC.
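For example, the following Python sketch requests the ten most recently started deployments whose names contain a filter term. The base URL, authentication header, and filter value are placeholders.

import requests

BASE_URL = "https://<Model Serve host>"          # placeholder: your API host
HEADERS = {"Authorization": "Bearer <token>"}    # placeholder: your authentication header

params = {
    "offset": 0,
    "limit": 10,
    "filter": "churn",                           # only deployments with "churn" in the deployment name
    "sortkey": "START_TIME",
    "sortdir": "DESC",
}

response = requests.get(
    f"{BASE_URL}/mlops/api/v1/deployment/monitor",
    params=params,
    headers=HEADERS,
)
response.raise_for_status()
for deployment in response.json()["deployments"]:    # one status entry per returned deployment
    print(deployment["deploymentName"], deployment["statusLabel"])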
Status of a particular deployment
To get the status of a particular deployment, include the deployment ID in the URI. Use the following URI:
/mlops/api/v1/deployment/monitor/<deployment ID>
Get the deployment ID from the response to a request to create a model deployment or from a previous response to a request to monitor model deployments.
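A minimal sketch of checking one deployment, reusing the same placeholder base URL and authentication header:

import requests

BASE_URL = "https://<Model Serve host>"          # placeholder: your API host
HEADERS = {"Authorization": "Bearer <token>"}    # placeholder: your authentication header

deployment_id = "<deployment ID>"

response = requests.get(
    f"{BASE_URL}/mlops/api/v1/deployment/monitor/{deployment_id}",
    headers=HEADERS,
)
response.raise_for_status()
status = response.json()                         # deployment status object for this deployment
print(status["statusLabel"], status.get("message", ""))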

GET response

Returns details about the results and a deployment status object for every deployment that matches the query parameters. If you requested the details for a particular deployment, the response contains only the deployment status object.
The following list describes the fields in a deployment status object:
• count: Number of deployments returned in this response.
• offset: Offset used in the response. Note: This might differ from the offset in the request if the requested value was out of bounds.
• limit: Limit used in the response. Note: This might differ from the limit in the request if the requested value was out of bounds.
• deployments: Object that describes each deployment.
• deploymentID: ID of the model deployment.
• deploymentName: Name of the model deployment.
• state: Requested state of the deployment. When you start or restart the deployment, the state is ENABLED. When you stop the deployment, the state is DISABLED.
• status: Internal status of the deployment.
• statusLabel: External status label for the current state and status.
• message: Warning or error message string. Applies only if a warning or error occurs.
• locationID: ID of the folder that stores the deployment.
• locationName: Name of the folder that stores the deployment.
• startedByUserID: ID of the user that last started the deployment.
• startedbyUserName: Name of the user that last started the deployment.
• startTime: Time that the deployment started, in milliseconds from the UNIX epoch, 00:00:00 UTC on January 1, 1970.
• stopTime: Time that the deployment stopped, in milliseconds from the UNIX epoch. Applies only when you stop the deployment.
• duration: Number of milliseconds since the deployment started running.
• updateTime: Time that the deployment status was last updated, in milliseconds from the UNIX epoch.
• monitorTime: Time that the deployment status was checked, in milliseconds from the UNIX epoch.
• agentSaasid: ID of the agent that deploys the job.
• communicationMode: Mode of communication to the REST API. The communication mode is channel.
• predictUrl: Endpoint URL to use to generate predictions from the model.
• framework: Framework used to create the machine learning model.
• frameworkType: Type of framework used to register the machine learning model.
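The timestamp fields are expressed in milliseconds from the UNIX epoch, so you typically convert them before you display them. The following sketch uses the Python standard library and illustrative values rather than real API output:

from datetime import datetime, timezone

# Illustrative values only; in practice, read these from a deployment status object.
deployment = {"startTime": 1700000000000, "duration": 5400000}

started = datetime.fromtimestamp(deployment["startTime"] / 1000, tz=timezone.utc)
running_minutes = deployment["duration"] / 1000 / 60
print(f"Started {started.isoformat()}, running for {running_minutes:.0f} minutes")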