Building an ML-ops pipeline on AWS — Part 2: model deployment
As companies adopt machine learning across their organizations, building, training, and deploying ML models manually becomes a bottleneck for innovation. Establishing MLOps patterns lets you create repeatable workflows for every stage of the ML lifecycle and is key to moving from manual experimentation to production. MLOps helps companies innovate faster by making data science and ML teams more productive at creating and deploying accurate models.
At a high level, I will create two pipelines using CloudFormation, in two parts:
1. Part 1: Model training pipeline, available at https://medium.com/@neha-tomar/building-an-ml-ops-pipeline-on-aws-part-1-model-training-f9c60e5d6c2b
2. Part 2: Model deployment pipeline
This is part 2 of the series.
Here, we are ready to create the CloudFormation template for the CodePipeline training pipeline. This pipeline will listen to changes to a CodeCommit repository and invoke the Step Functions workflow we created in part 1:
1. Copy and save the following code block to a file called mlpipeline.yaml. This is the template for building the training pipeline.
Parameters:
  BranchName:
    Description: CodeCommit branch name
    Type: String
    Default: master
  RepositoryName:
    Description: CodeCommit repository name
    Type: String
    Default: MLSA-repo
  ProjectName:
    Description: ML project name
    Type: String
    Default: FinanceSentiment
  MlOpsStepFunctionArn:
    Description: Step Function Arn
    Type: String
    Default: arn:aws:states:ca-central-1:300165273893:stateMachine:TrainingStateMachine2-89fJblFk0h7b
Resources:
  # Bucket where CodePipeline stores the artifacts passed between stages
  CodePipelineArtifactStoreBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Delete
  Pipeline:
    Type: 'AWS::CodePipeline::Pipeline'
    Properties:
      Name: codecommit-events-pipeline
      RoleArn: !GetAtt CodePipelineServiceRole.Arn
      ArtifactStore:
        Type: S3
        Location: !Ref CodePipelineArtifactStoreBucket
      Stages:
        # Stage 1: fetch the source from CodeCommit
        - Name: Source
          Actions:
            - Name: SourceAction
              ActionTypeId:
                Category: Source
                Owner: AWS
                Version: 1
                Provider: CodeCommit
              OutputArtifacts:
                - Name: SourceOutput
              Configuration:
                BranchName: !Ref BranchName
                RepositoryName: !Ref RepositoryName
                PollForSourceChanges: false
              RunOrder: 1
        # Stage 2: invoke the Step Functions training workflow from part 1
        - Name: ModelBuilding
          Actions:
            - Name: ExecuteSagemakerMLOpsStepFunction
              InputArtifacts:
                - Name: SourceOutput
              ActionTypeId:
                Category: Invoke
                Owner: AWS
                Version: 1
                Provider: StepFunctions
              OutputArtifacts:
                - Name: myOutputArtifact
              Configuration:
                StateMachineArn: !Ref MlOpsStepFunctionArn
                ExecutionNamePrefix: finbert
                InputType: FilePath
                Input: sf_start_params.json
              RunOrder: 1
  # Role that CodePipeline assumes to run the pipeline
  CodePipelineServiceRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - codepipeline.amazonaws.com
            Action: 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: AWS-CodePipeline-Service-3
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Resource: '*'
                Effect: Allow
                Action:
                  - 'codecommit:CancelUploadArchive'
                  - 'codecommit:GetBranch'
                  - 'codecommit:GetCommit'
                  - 'codecommit:GetUploadArchiveStatus'
                  - 'codecommit:UploadArchive'
              - Resource:
                  - !Sub arn:aws:s3:::${CodePipelineArtifactStoreBucket}/*
                Effect: Allow
                Action:
                  - s3:PutObject
                  - s3:GetObject
                  - s3:GetObjectVersion
                  - s3:GetBucketVersioning
              - Resource: "*"
                Effect: Allow
                Action:
                  - codebuild:StartBuild
                  - codebuild:BatchGetBuilds
                  - iam:PassRole
                  - states:DescribeStateMachine
                  - states:StartExecution
                  - states:DescribeExecution
Outputs:
  PipelineUrl:
    Value: !Sub https://console.aws.amazon.com/codepipeline/home?region=${AWS::Region}#/view/${Pipeline}
  ArtifactBucket:
    Value: !Ref CodePipelineArtifactStoreBucket
Similarly, let’s launch this CloudFormation template in the CloudFormation console to create the pipeline definition. Once the stack has been created, navigate to the CodePipeline management console to verify that the pipeline exists. Creating the stack also runs the new pipeline automatically, so you should see that it has already run once. You can run it again by clicking the Release change button in the CodePipeline management console.
We want to be able to kick off the CodePipeline execution when a change is made (such as a code commit) in the CodeCommit repository. To enable this, we need to create a CloudWatch event that monitors this change and kicks off the pipeline. Let’s get started:
2. Add the following code block to the mlpipeline.yaml file, just before the Outputs section, and save the file as mlpipeline_1.yaml.
  # New resources: add these under Resources, just before the Outputs section
  AmazonCloudWatchEventRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - events.amazonaws.com
            Action: 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: cwe-pipeline-execution
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: 'codepipeline:StartPipelineExecution'
                Resource: !Join
                  - ''
                  - - 'arn:aws:codepipeline:'
                    - !Ref 'AWS::Region'
                    - ':'
                    - !Ref 'AWS::AccountId'
                    - ':'
                    - !Ref Pipeline
  AmazonCloudWatchEventRule:
    Type: 'AWS::Events::Rule'
    Properties:
      EventPattern:
        source:
          - aws.codecommit
        detail-type:
          - CodeCommit Repository State Change
        resources:
          - !Join
            - ''
            - - 'arn:aws:codecommit:'
              - !Ref 'AWS::Region'
              - ':'
              - !Ref 'AWS::AccountId'
              - ':'
              - !Ref RepositoryName
        detail:
          event:
            - referenceCreated
            - referenceUpdated
          referenceType:
            - branch
          referenceName:
            - master
      Targets:
        - Arn: !Join
            - ''
            - - 'arn:aws:codepipeline:'
              - !Ref 'AWS::Region'
              - ':'
              - !Ref 'AWS::AccountId'
              - ':'
              - !Ref Pipeline
          RoleArn: !GetAtt
            - AmazonCloudWatchEventRole
            - Arn
          Id: codepipeline-AppPipeline
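To get a feel for which commits the event rule reacts to, here is a minimal Python sketch of the matching logic. It is a simplified illustration and not the actual EventBridge implementation: it handles only the exact-value matching that the pattern above relies on. The account ID and the sample events are made up for illustration.

```python
# Simplified illustration of how the rule's EventPattern selects events.
# This is NOT the EventBridge implementation; it covers only exact-value
# matching on nested fields, which is all the pattern in the template uses.

def matches(pattern, event):
    """Return True if every field in `pattern` is satisfied by `event`."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        else:
            # Leaf pattern: a list of allowed values. An event array matches
            # if any one of its elements is allowed.
            values = actual if isinstance(actual, list) else [actual]
            if not any(v in expected for v in values):
                return False
    return True


def repo_arn(region, account_id, repository_name):
    """Mirror the !Join in the template that builds the repository ARN."""
    return "".join(["arn:aws:codecommit:", region, ":", account_id, ":", repository_name])


# Hypothetical account ID, for illustration only.
REGION, ACCOUNT = "ca-central-1", "111122223333"

pattern = {
    "source": ["aws.codecommit"],
    "detail-type": ["CodeCommit Repository State Change"],
    "resources": [repo_arn(REGION, ACCOUNT, "MLSA-repo")],
    "detail": {
        "event": ["referenceCreated", "referenceUpdated"],
        "referenceType": ["branch"],
        "referenceName": ["master"],
    },
}


def sample_event(branch):
    """Build a made-up event describing a push to the given branch."""
    return {
        "source": "aws.codecommit",
        "detail-type": "CodeCommit Repository State Change",
        "resources": [repo_arn(REGION, ACCOUNT, "MLSA-repo")],
        "detail": {
            "event": "referenceUpdated",
            "referenceType": "branch",
            "referenceName": branch,
        },
    }


print(matches(pattern, sample_event("master")))     # True: push to master
print(matches(pattern, sample_event("feature-x")))  # False: other branch
```

Note that the rule hardcodes master in referenceName, so if you change the BranchName parameter you would also need to change the rule for commits on that branch to trigger the pipeline.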
3. Now, run this CloudFormation template to create a new pipeline. Because both templates give the pipeline the same name (codecommit-events-pipeline), first delete the previously created pipeline by deleting its CloudFormation stack. Creating the new stack runs the pipeline again automatically. Wait until the pipeline’s execution is complete before you start the next step.
4. Now, let’s test the automatic execution of the pipeline by committing a change to the code repository. In your cloned code repository directory, create a new file called pipelinetest.txt and commit the change to the code repository. Navigate to the CodePipeline console; you should see the codecommit-events-pipeline pipeline starting to run.
Congratulations! You have successfully used CloudFormation to build a CodePipeline-based ML training pipeline that runs automatically when a file changes in a CodeCommit repository. Next, let’s build the ML deployment pipeline for the model.
Creating a CloudFormation template for the ML deployment pipeline
To start creating a deployment, perform the following steps:
1. Copy the following code block to create a file called mldeployment.yaml. This CloudFormation template will deploy a model using the SageMaker hosting service. Make sure that you enter the correct model name for your environment:
Description: Basic Hosting of registered model
Parameters:
  ModelName:
    Description: Model Name
    Type: String
    Default: <model name>
Resources:
  Endpoint:
    Type: AWS::SageMaker::Endpoint
    Properties:
      EndpointConfigName: !GetAtt EndpointConfig.EndpointConfigName
  EndpointConfig:
    Type: AWS::SageMaker::EndpointConfig
    Properties:
      # ProductionVariants is a list; this template defines a single variant
      ProductionVariants:
        - InitialInstanceCount: 1
          InitialVariantWeight: 1.0
          InstanceType: ml.m4.xlarge
          ModelName: !Ref ModelName
          VariantName: !Ref ModelName
Outputs:
  EndpointId:
    Value: !Ref Endpoint
  EndpointName:
    Value: !GetAtt Endpoint.EndpointName
2. Create a CloudFormation stack using this file and verify that a SageMaker endpoint has been created. Now, upload the mldeployment.yaml file to the code repository directory and commit the change to CodeCommit. Note that this file will be used by the CodePipeline deployment pipeline, which we will create in the following steps.
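If you would rather verify the endpoint from code than from the console, the following is a minimal sketch. It assumes you pass in a boto3 SageMaker client (boto3 is not used elsewhere in this post, so treat it as an optional extra) and relies on the describe_endpoint call, which reports the endpoint status. The helper names are hypothetical.

```python
def endpoint_status(sagemaker_client, endpoint_name):
    """Return the EndpointStatus string that SageMaker reports."""
    response = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"]


def is_in_service(sagemaker_client, endpoint_name):
    """True once the endpoint has finished creating and can serve requests."""
    return endpoint_status(sagemaker_client, endpoint_name) == "InService"


# Usage with boto3 (requires AWS credentials and the boto3 package):
#   import boto3
#   client = boto3.client("sagemaker")
#   print(is_in_service(client, "<EndpointName from the stack outputs>"))
```

The endpoint name to pass in is the EndpointName value listed in the stack outputs.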
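3. Before we create the deployment pipeline, we need a template configuration file that passes parameters to the deployment template when it is executed. Here, we need to pass the model name. Copy the following code block, save it to a file called mldeployment.json, upload it to the code repository directory in Studio, and commit the change to CodeCommit. Note that the deployment pipeline template in step 5 defaults its ProdStackConfig parameter to mldeploymentconfig.json, so either save the file under that name or override the parameter when you launch the stack: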
{
  "Parameters": {
    "ModelName": "<name of the financial sentiment model you have trained>"
  }
}
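If you prefer to generate this file rather than edit it by hand, which avoids quoting mistakes in the JSON, a minimal Python sketch could look like this. The function name and the model name are placeholders, not values from this project.

```python
import json


def build_template_config(model_name):
    """Return the CloudFormation template configuration as a JSON string."""
    config = {"Parameters": {"ModelName": model_name}}
    return json.dumps(config, indent=2)


# "finbert-sentiment-model" is a placeholder; use your own model's name.
config_text = build_template_config("finbert-sentiment-model")
print(config_text)
```

You can then write config_text to the configuration file in your repository directory and commit it as described above.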
4. Now, we can create a CodePipeline pipeline CloudFormation template for automatic model deployment. This pipeline has two main stages:
a) The first stage fetches source code (such as the configuration file we just created and the mldeployment.yaml template) from a CodeCommit repository.
b) The second stage creates a CloudFormation change set (a change set is the difference between a new template and an existing CloudFormation stack) for the mldeployment.yaml file we created earlier. It then waits at a manual approval step and, once the change is approved, executes the change set to deploy the mldeployment.yaml template.
This CloudFormation template also creates supporting resources, including an S3 bucket for storing the CodePipeline artifacts, an IAM role for CodePipeline to run with, and another IAM role for CloudFormation to use to create the stack for mldeployment.yaml.
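To make the stage layout concrete, here is a rough Python sketch of the data that would sit under the pipeline's Stages property. This is an assumption based on the description above, not the template used in this post: the stage and action names, the helper functions, and the two ARNs are hypothetical. The action modes CHANGE_SET_REPLACE and CHANGE_SET_EXECUTE and the ArtifactName::FileName path format are standard CodePipeline CloudFormation action settings.

```python
def cfn_action(name, run_order, mode, stack, change_set, extra=None, inputs=()):
    """Build one CloudFormation deploy action for a CodePipeline stage."""
    configuration = {"ActionMode": mode, "StackName": stack,
                     "ChangeSetName": change_set}
    configuration.update(extra or {})
    return {
        "Name": name,
        "ActionTypeId": {"Category": "Deploy", "Owner": "AWS",
                         "Version": "1", "Provider": "CloudFormation"},
        "InputArtifacts": [{"Name": n} for n in inputs],
        "Configuration": configuration,
        "RunOrder": run_order,
    }


def deployment_stages(repo, branch, stack, change_set, template_file,
                      config_file, cfn_role_arn, sns_topic_arn):
    """Return the two stages: fetch source, then change set / approve / deploy."""
    source = {
        "Name": "Source",
        "Actions": [{
            "Name": "SourceAction",
            "ActionTypeId": {"Category": "Source", "Owner": "AWS",
                             "Version": "1", "Provider": "CodeCommit"},
            "OutputArtifacts": [{"Name": "SourceOutput"}],
            "Configuration": {"RepositoryName": repo, "BranchName": branch,
                              "PollForSourceChanges": False},
            "RunOrder": 1,
        }],
    }
    # Step 1: compute the difference between the template and the stack.
    create = cfn_action(
        "CreateChangeSet", 1, "CHANGE_SET_REPLACE", stack, change_set,
        extra={
            "TemplatePath": f"SourceOutput::{template_file}",
            "TemplateConfiguration": f"SourceOutput::{config_file}",
            "RoleArn": cfn_role_arn,
        },
        inputs=("SourceOutput",),
    )
    # Step 2: pause until someone approves; the SNS topic is notified.
    approve = {
        "Name": "ApproveChangeSet",
        "ActionTypeId": {"Category": "Approval", "Owner": "AWS",
                         "Version": "1", "Provider": "Manual"},
        "Configuration": {"NotificationArn": sns_topic_arn},
        "RunOrder": 2,
    }
    # Step 3: apply the approved change set to the stack.
    execute = cfn_action("ExecuteChangeSet", 3, "CHANGE_SET_EXECUTE",
                         stack, change_set)
    return [source, {"Name": "ProdStage", "Actions": [create, approve, execute]}]


stages = deployment_stages(
    repo="MLSA-repo", branch="master",
    stack="FinanceSentimentMLStack1", change_set="FinanceSentimentchangeset",
    template_file="mldeployment.yaml", config_file="mldeploymentconfig.json",
    # Hypothetical ARNs, for illustration only.
    cfn_role_arn="arn:aws:iam::111122223333:role/HypotheticalCfnRole",
    sns_topic_arn="arn:aws:sns:ca-central-1:111122223333:HypotheticalApprovalTopic",
)
for stage in stages:
    print(stage["Name"], [action["Name"] for action in stage["Actions"]])
```

The RunOrder values 1, 2, and 3 are what force the change set to be created, approved, and executed in sequence within the second stage.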
5. Copy the following code block and save the file as mldeployment-pipeline.yaml.
Parameters:
  BranchName:
    Description: CodeCommit branch name
    Type: String
    Default: master
  RepositoryName:
    Description: CodeCommit repository name
    Type: String
    Default: MLSA-repo
  ProjectName:
    Description: ML project name
    Type: String
    Default: FinanceSentiment
  CodePipelineSNSTopic:
    Description: SNS topic for NotificationArn
    Default: arn:aws:sns:ca-central-1:300165273893:CodePipelineSNSTopicApproval
    Type: String
  ProdStackConfig:
    Default: mldeploymentconfig.json
    Description: The configuration file name for the production model deployment stack
    Type: String
  ProdStackName:
    Default: FinanceSentimentMLStack1
    Description: A name for the production model deployment stack
    Type: String
  TemplateFileName:
    Default: mldeployment.yaml
    Description: The file name of the model deployment template
    Type: String
  ChangeSetName:
    Default: FinanceSentimentchangeset
    Description: A name for the production stack change set
    Type: String
Resources:
  CodePipelineArtifactStoreBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Delete
  Pipeline:
  . . . . .
Now, let’s launch the newly created mldeployment-pipeline.yaml template in the CloudFormation console to create the deployment pipeline, and then run the pipeline from the CodePipeline console.
We have now successfully created and run a CodePipeline deployment pipeline that deploys a model from the SageMaker model registry.
Summary
In this 2-part blog, we discussed the key requirements for building an enterprise ML platform to meet needs such as end-to-end ML life cycle support, process automation, and separating different environments. We also talked about architecture patterns and how to build an enterprise ML platform on AWS using AWS services. We discussed the core capabilities of different ML environments, including training, hosting, and shared services.