# Deployment on Azure Machine Learning
## Pre-requisites

```bash
cd inference/triton_server
```
Set the environment variables for AML:

```bash
export RESOURCE_GROUP=Dhruva-prod
export WORKSPACE_NAME=dhruva--central-india
export DOCKER_REGISTRY=dhruvaprod
```
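These steps assume the Azure CLI is installed and signed in, with the `ml` extension available. A minimal setup sketch (the subscription ID is a placeholder):

```bash
# Sign in and select the subscription that hosts the workspace
az login
az account set --subscription "<your-subscription-id>"

# Install the Azure ML CLI v2 extension if it is not already present
az extension add --name ml
```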
Also remember to edit the `.yml` files under `azure_ml/` accordingly, so that they match your resource names and registry.
## Registering the model

```bash
az ml model create --file azure_ml/model.yml --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
```
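Optionally, you can confirm the registration by listing the workspace's models:

```bash
# List registered models in the workspace as a table
az ml model list --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME --output table
```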
## Pushing the docker image to Container Registry

```bash
az acr login --name $DOCKER_REGISTRY
docker tag indictrans2_triton $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
docker push $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
```
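Optionally, verify that the image reached the registry by listing the repository's tags:

```bash
# Show the tags available for the pushed repository
az acr repository show-tags --name $DOCKER_REGISTRY --repository nmt/triton-indictrans-v2 --output table
```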
## Creating the execution environment

```bash
az ml environment create -f azure_ml/environment.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
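Similarly, the registered environments can be listed to confirm the new one appears:

```bash
# List registered environments in the workspace
az ml environment list --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME --output table
```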
## Publishing the endpoint for online inference

```bash
az ml online-endpoint create -f azure_ml/endpoint.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
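The endpoint's provisioning state and scoring URI can then be inspected; `<endpoint-name>` is a placeholder for whatever name `azure_ml/endpoint.yml` defines:

```bash
# Check that the endpoint finished provisioning and note its scoring URI
az ml online-endpoint show --name <endpoint-name> \
  --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME \
  --query "{state: provisioning_state, scoring_uri: scoring_uri}" --output table
```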
Now from the Azure Portal, open the Container Registry, and grant the `AcrPull` role to the above endpoint's managed identity, so that it is allowed to pull the docker image.
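The same grant can also be scripted with the CLI instead of the portal. A sketch, assuming the endpoint uses a system-assigned managed identity (`<endpoint-name>` is again a placeholder):

```bash
# Look up the endpoint's managed identity and the registry's resource ID
PRINCIPAL_ID=$(az ml online-endpoint show --name <endpoint-name> \
  --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME \
  --query identity.principal_id --output tsv)
ACR_ID=$(az acr show --name $DOCKER_REGISTRY --query id --output tsv)

# Grant the endpoint permission to pull images from the registry
az role assignment create --assignee $PRINCIPAL_ID --role AcrPull --scope $ACR_ID
```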
## Attaching a deployment

```bash
az ml online-deployment create -f azure_ml/deployment.yml --all-traffic -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
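If the deployment stalls or fails, the container logs usually explain why (for example, an image-pull failure); `<deployment-name>` is a placeholder for the name in `azure_ml/deployment.yml`:

```bash
# Fetch logs from the deployment's container for troubleshooting
az ml online-deployment get-logs --name <deployment-name> \
  --endpoint-name <endpoint-name> \
  --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
```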
## Testing if inference works

- From Azure ML Studio, go to the "Consume" tab, and get the endpoint domain (without `https://` or trailing `/`) and an authentication key.
- In `client.py`, set `ENABLE_SSL = True`, and then set the `ENDPOINT_URL` variable as well as the `Authorization` value inside `HTTP_HEADERS`.
- Run `python3 client.py`.
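As a quick reachability check independent of `client.py`, you can hit Triton's standard health route through the endpoint. This assumes the endpoint forwards the raw Triton HTTP paths; `<endpoint-domain>` and `<auth-key>` are the values from the "Consume" tab:

```bash
# Expect HTTP 200 if the Triton server behind the endpoint is ready
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer <auth-key>" \
  https://<endpoint-domain>/v2/health/ready
```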