## Instructions to run end-to-end demo

## Chapters

[I. Installation of KServe & its dependencies](#installation-of-kserve--its-dependencies)

[II. Setting up local MinIO S3 storage](#setting-up-local-minio-s3-storage)

[III. Setting up your OpenShift AI workbench](#setting-up-your-openshift-ai-workbench)

[IV. Train model and evaluate](#train-model-and-evaluate)

[V. Convert model to Caikit format and save to S3 storage](#convert-model-to-caikit-format-and-save-to-s3-storage)

[VI. Deploy model onto Caikit-TGIS Serving Runtime](#deploy-model-onto-caikit-tgis-serving-runtime)

[VII. Model inference](#model-inference)

**Prerequisites**

* To support training and inference, your cluster needs a node with 4 GPUs and sufficient CPUs and memory. Instructions to add GPU support to RHOAI can be found [here](https://docs.google.com/document/d/1T2oc-KZRMboUVuUSGDZnt3VRZ5s885aDRJGYGMkn_Wo/edit#heading=h.9xmhoufikqid).
* You have cluster administrator permissions
* You have installed the OpenShift CLI (`oc`)
* You have installed the `Red Hat OpenShift Service Mesh Operator`
* You have installed the `Red Hat OpenShift Serverless Operator`
* You have installed the `Red Hat OpenShift AI Operator` and created a **DataScienceCluster** object

### Installation of KServe & its dependencies

Instructions adapted from [Manually installing KServe](https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2-latest/html/serving_models/serving-large-models_serving-large-models#manually-installing-kserve_serving-large-models)

1. Git clone this repository

    ```
    git clone https://github.com/trustyai-explainability/trustyai-detoxify-sft.git
    ```

2. Log in to your OpenShift cluster as a cluster administrator

    ```
    oc login --token=<token>
    ```

3. Create the required namespace for Red Hat OpenShift Service Mesh

    ```
    oc create ns istio-system
    ```

4. Create a `ServiceMeshControlPlane` object

    ```
    oc apply -f manifests/kserve/smcp.yaml -n istio-system
    ```

5. Sanity check to verify creation of the service mesh instance

    ```
    oc get pods -n istio-system
    ```

    Expected output:

    ```
    NAME                                       READY   STATUS    RESTARTS   AGE
    istio-egressgateway-7c46668687-fzsqj       1/1     Running   0          22h
    istio-ingressgateway-77f94d8f85-fhsp9      1/1     Running   0          22h
    istiod-data-science-smcp-cc8cfd9b8-2rkg4   1/1     Running   0          22h
    ```

6. Create the required namespace for a `KnativeServing` instance

    ```
    oc create ns knative-serving
    ```

7. Create a `ServiceMeshMember` object

    ```
    oc apply -f manifests/kserve/default-smm.yaml -n knative-serving
    ```

8. Create and define a `KnativeServing` object

    ```
    oc apply -f manifests/kserve/knativeserving-istio.yaml -n knative-serving
    ```
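    If you prefer a scriptable check over eyeballing pod lists, you can block until both instances report ready before moving on. A minimal sketch, assuming the `ServiceMeshControlPlane` is named `data-science-smcp` (as the `istiod-data-science-smcp-*` pod above suggests) and the `KnativeServing` instance is named `knative-serving`; verify the names with `oc get smcp -n istio-system` and `oc get knativeserving -n knative-serving`:

    ```
    # Wait for the Service Mesh control plane to report Ready
    # (control plane name assumed; confirm with `oc get smcp -n istio-system`)
    oc wait --for=condition=Ready smcp/data-science-smcp -n istio-system --timeout=300s

    # Wait for the Knative Serving instance to report Ready
    oc wait --for=condition=Ready knativeserving/knative-serving -n knative-serving --timeout=300s
    ```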
9. Sanity check to validate creation of the Knative Serving instance

    ```
    oc get pods -n knative-serving
    ```

    Expected output:

    ```
    NAME                                     READY   STATUS    RESTARTS   AGE
    activator-7586f6f744-nvdlb               2/2     Running   0          22h
    activator-7586f6f744-sd77w               2/2     Running   0          22h
    autoscaler-764fdf5d45-p2v98              2/2     Running   0          22h
    autoscaler-764fdf5d45-x7dc6              2/2     Running   0          22h
    autoscaler-hpa-7c7c4cd96d-2lkzg          1/1     Running   0          22h
    autoscaler-hpa-7c7c4cd96d-gks9j          1/1     Running   0          22h
    controller-5fdfc9567c-6cj9d              1/1     Running   0          22h
    controller-5fdfc9567c-bf5x7              1/1     Running   0          22h
    domain-mapping-56ccd85968-2hjvp          1/1     Running   0          22h
    domain-mapping-56ccd85968-lg6mw          1/1     Running   0          22h
    domainmapping-webhook-769b88695c-gp2hk   1/1     Running   0          22h
    domainmapping-webhook-769b88695c-npn8g   1/1     Running   0          22h
    net-istio-controller-7dfc6f668c-jb4xk    1/1     Running   0          22h
    net-istio-controller-7dfc6f668c-jxs5p    1/1     Running   0          22h
    net-istio-webhook-66d8f75d6f-bgd5r       1/1     Running   0          22h
    net-istio-webhook-66d8f75d6f-hld75       1/1     Running   0          22h
    webhook-7d49878bc4-8xjbr                 1/1     Running   0          22h
    webhook-7d49878bc4-s4xx4                 1/1     Running   0          22h
    ```

10. From the web console, install KServe by going to **Operators -> Installed Operators** and clicking on the **Red Hat OpenShift AI Operator**

11. Click on the **DSC Initialization** tab and click on the **default-dsci** object

12. Click on the **YAML** tab and in the `spec` section, change `serviceMesh.managementState` to `Unmanaged`

    ```
    spec:
      serviceMesh:
        managementState: Unmanaged
    ```

13. Click **Save**

14. Click on the **Data Science Cluster** tab and click on the **default-dsc** object

15. Click on the **YAML** tab and in the `spec` section, change `components.kserve.managementState` and `components.kserve.serving.managementState` to `Managed`

    ```
    spec:
      components:
        kserve:
          managementState: Managed
          serving:
            managementState: Managed
    ```

16. Click **Save**

### Setting up local MinIO S3 storage

1. Create a namespace for your project called "detoxify-sft"

    ```
    oc create namespace detoxify-sft
    ```

2. Set up your local MinIO S3 storage in your newly created namespace

    ```
    oc apply -f manifests/minio/setup-s3.yaml -n detoxify-sft
    ```

3. Run the following sanity checks

    ```
    oc get pods -n detoxify-sft | grep "minio"
    ```

    Expected output:

    ```
    NAME                    READY   STATUS    RESTARTS   AGE
    minio-7586f6f744-nvdl   1/1     Running   0          22h
    ```

    ```
    oc get route -n detoxify-sft | grep "minio"
    ```

    Expected output:

    ```
    NAME        STATUS     LOCATION               SERVICE
    minio-api   Accepted   https://minio-api...   minio-service
    minio-ui    Accepted   https://minio-ui...    minio-service
    ```

4. Get the MinIO UI location URL and open it in a web browser

    ```
    oc get route minio-ui -n detoxify-sft
    ```

5. Login using the credentials in `manifests/minio/setup-s3.yaml`

    **user**: `minio`
    **password**: `minio123`

6. Click on **Create a Bucket**, choose a name for your bucket, and click on **Create Bucket**

### Setting up your OpenShift AI workbench

1. Go to Red Hat OpenShift AI from the web console
2. Click on **Data Science Projects** and then click on **Create data science project**
3. Give your project a name and then click **Create**
4. Click on the **Workbenches** tab and then create a workbench with a PyTorch notebook image, set the container size to Large, and select a single NVIDIA GPU. Click on **Create Workbench**
5. Click on **Add data connection** to create a matching data connection for MinIO
6. Fill out the required fields and then click on **Add data connection**
7. Once your workbench status changes from **Starting** to **Running**, click on **Open** to open JupyterHub in a web browser
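    Optionally, you can confirm from a workbench terminal that the storage is reachable before proceeding. A minimal sketch, assuming the data connection is attached to your workbench (so its values are injected as `AWS_*` environment variables) and that MinIO serves its standard health endpoint:

    ```
    # Data connection values should appear as environment variables
    env | grep AWS_

    # MinIO's liveness endpoint returns HTTP 200 when the server is reachable
    curl -sk -o /dev/null -w "%{http_code}\n" "${AWS_S3_ENDPOINT}/minio/health/live"
    ```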
8. In your JupyterHub environment, launch a terminal and clone this project

    ```
    git clone https://github.com/trustyai-explainability/trustyai-detoxify-sft.git
    ```

9. Go into the `notebooks` directory

### Train model and evaluate

1. Open the `01-sft.ipynb` file
2. Run each cell in the notebook
3. Once the model is trained and uploaded to HuggingFace Hub, open the `02-eval.ipynb` file and run each cell to compare the model trained on raw input-output pairs vs. the one trained on detoxified prompts

### Convert model to Caikit format and save to S3 storage

1. Open the `03-save_convert_model.ipynb` file and run each cell in the notebook to convert the model to Caikit format and save it to a MinIO bucket

### Deploy model onto Caikit-TGIS Serving Runtime

1. In the OpenShift AI dashboard, navigate to the project details page and click the **Models** tab
2. In the **Single-model serving platform** tile, click on **Deploy model**. Provide the following values:

    **Model Name**: `opt-350m-caikit`
    **Serving Runtime**: `Caikit-TGIS Serving Runtime`
    **Model framework**: `caikit`
    **Existing data connection**: `My Storage`
    **Path**: `models/opt-350m-caikit`

3. Click **Deploy**
4. Increase the `initialDelaySeconds` on the runtime's readiness and liveness probes so the model server has time to start

    ```
    oc patch template caikit-tgis-serving-template --type=merge -p '{"spec":{"containers":[{"readinessProbe":{"initialDelaySeconds":300},"livenessProbe":{"initialDelaySeconds":300}}]}}'
    ```

5. Wait for the model **Status** to show a green checkmark

### Model inference

1. Return to the JupyterHub environment to test out the deployed model
2. Click on `04-inference_request.ipynb` and run each cell to make an inference request to the detoxified model
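Alternatively, you can exercise the endpoint directly over REST from a terminal. A minimal sketch, assuming the model was deployed as `opt-350m-caikit`, that `<your-project>` stands in for your data science project's namespace, and that the runtime exposes Caikit's standard text-generation task endpoint; adjust the model ID, namespace, and prompt to your deployment:

```
# Look up the external URL of the deployed InferenceService
# (<your-project> is a placeholder for your data science project's namespace)
ISVC_URL=$(oc get inferenceservice opt-350m-caikit -n <your-project> -o jsonpath='{.status.url}')

# Send a text-generation request to the Caikit REST API
curl -sk "${ISVC_URL}/api/v1/task/text-generation" \
  -H 'Content-Type: application/json' \
  -d '{"model_id": "opt-350m-caikit", "inputs": "Hello, my name is"}'
```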