Aspects of MLOps on Cloud
Focus on Retail Banking
by: Rajeev Verma
I. Introduction
Over the last five years, the adoption of cloud-enabled computing in the banking industry has gained traction. Moving from on-premises model deployment to big-data clusters was a first step towards Machine Learning Operations (MLOps).
MLOps processes are defined taking into account model characteristics such as end usage, scoring frequency and model algorithm. Apart from these characteristics, it is also important to understand the regulatory requirements around model monitoring,
fair lending and model explainability before initiating the MLOps process. Strong governance, with clearly defined roles and responsibilities, is equally important. In this paper, I cover different aspects of the MLOps process along with
reference solutions.
II. Key aspects of MLOps
In the current industry scenario, modelling units and business stakeholders are increasingly focused on faster model deployment, and considerable resources are therefore being devoted to seamless cloud migration. The following sections discuss the
key components involved in each stage, the challenges, and the best practices adopted while migrating the process to the cloud.
i. Data/Model Pipelines
Data pipelines are simply a way to ensure that data is transported from source to target efficiently, in usable form and in an automated manner. An automated data pipeline becomes critical when the objective is to make real-time, data-driven
decisions. The important features of a reliable pipeline in the context of risk and regulatory predictive models are:
a) Data quality checks
b) Automated ETL
c) Disaster recovery of the data
d) Flexibility in handling stress scenarios
New-age data pipelines enable easy access to data from various sources, such as apps and APIs, and open up better analytics and insight opportunities. In the BFSI domain, when dealing with critical use cases such as AML and fraud detection, it is critical
to have a real-time data feed backed by an automated data pipeline. However, if the task is credit scoring, which is typically a batch process, a fully automated data pipeline may not be needed.
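To make this concrete, below is a minimal sketch of the kind of automated data-quality checks a batch pipeline could run before scoring. The column names, value bounds and null-rate threshold are hypothetical assumptions for illustration, not a prescribed standard.

```python
import pandas as pd

# Hypothetical schema for a credit-scoring feed; names and bounds are illustrative.
EXPECTED_COLUMNS = {"customer_id", "utilisation", "dpd_30", "bureau_score"}
VALUE_BOUNDS = {"bureau_score": (300, 900), "utilisation": (0.0, 1.5)}
MAX_NULL_RATE = 0.02  # tolerate at most 2% missing values per column

def run_quality_checks(df: pd.DataFrame) -> list:
    """Return a list of human-readable failures; an empty list means the feed passed."""
    failures = []
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        failures.append(f"missing columns: {sorted(missing)}")
    for col in EXPECTED_COLUMNS & set(df.columns):
        null_rate = df[col].isna().mean()
        if null_rate > MAX_NULL_RATE:
            failures.append(f"{col}: null rate {null_rate:.1%} exceeds {MAX_NULL_RATE:.0%}")
    for col, (lo, hi) in VALUE_BOUNDS.items():
        if col in df.columns and not df[col].dropna().between(lo, hi).all():
            failures.append(f"{col}: values outside [{lo}, {hi}]")
    return failures
```

A pipeline step would typically halt the run and alert the owning team when the returned list is non-empty, rather than scoring on a degraded feed.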
ii. CI/CD Pipelines
One question that arises here is: how complex should the CI/CD setup be? It depends on many things, such as the business context, modelling complexity and usage of the model. However, it is recommended to start with a simple CI/CD pipeline rather than a
complex one; MLOps engineers can then enhance it periodically.
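As an illustration of a simple starting point, the sketch below shows a quality gate that a CI/CD pipeline could invoke before promoting a model. The metric names, performance floors and metrics-file format are assumptions for illustration; actual thresholds would come from model governance documentation.

```python
import json
import sys

# Illustrative performance floors; real benchmarks would come from
# the model governance / validation documentation.
THRESHOLDS = {"auc": 0.70, "ks": 0.30}

def gate(metrics_path: str) -> int:
    """Return a non-zero exit code so the CI job fails on any metric breach."""
    with open(metrics_path) as fh:
        metrics = json.load(fh)  # e.g. {"auc": 0.74, "ks": 0.35}
    failed = False
    for name, floor in THRESHOLDS.items():
        value = metrics.get(name, 0.0)
        if value < floor:
            print(f"FAIL: {name}={value} is below the floor of {floor}")
            failed = True
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(gate(sys.argv[1]))
```

A gate like this keeps the initial pipeline simple: deployment is blocked automatically on a metric breach, and richer stages (drift tests, bias checks, canary scoring) can be added incrementally.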
iii. Monitoring and Observability
Monitoring refers to the process of checking whether model health checks are in place and whether model performance is within the benchmark. However, monitoring alone does not help data scientists. It is observability that answers the question
"Why did the model performance deteriorate?". Observability helps in identifying the root cause of model and data drift.
(a) Concept of Drift
It is a well-known saying that "change is the only constant", and it applies well in the domain of model development. Model monitoring is a key aspect of MLOps. Monitoring, or validation, includes two concepts: first, monitoring model performance,
which involves comparing statistical metrics against their respective thresholds; second, tracking changes in the data, which includes tracking population shift[1].
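To make population shift measurable, a common choice is the population stability index (PSI). Below is a minimal sketch; the decile bucketing and the 0.1/0.25 rule-of-thumb bands are illustrative conventions, not regulatory prescriptions.

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, n_buckets: int = 10) -> float:
    """Population stability index between a baseline sample and a current one.

    Buckets are cut on the baseline's quantiles; a small epsilon guards
    against empty buckets when taking the log-ratio.
    """
    eps = 1e-6
    edges = np.unique(np.quantile(expected, np.linspace(0, 1, n_buckets + 1)))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range values
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Rule of thumb often quoted: PSI < 0.1 stable, 0.1-0.25 watch, > 0.25 shifted.
```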
(b) Data Drift and Model Drift
Data drift is defined as any statistically significant change in the distribution of the data when compared with the training and testing environment. Model drift is defined as a breach of the model performance threshold. Generally,
credit scoring models developed on a stable portfolio perform consistently for 3-4 years. The life of the model is affected by external or internal policy changes, regulatory intervention, macro-economic factors, or pandemics such as Covid-19. The
pace and intensity of drift will depend on the characteristics of the portfolio on which the model is built[2].
When we talk about monitoring, one obvious question is how frequently the monitoring and retraining/recalibration exercise must be performed. The answer depends on factors such as the business domain, the frequency of data collection and a
cost-benefit analysis[3].
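As a toy illustration of the cost-benefit angle, the sketch below compares the expected annual value of different retraining frequencies against the cost of running them. Every figure here is a hypothetical assumption, including the diminishing-returns shape of the lift.

```python
# All figures below are hypothetical assumptions for illustration only.
PORTFOLIO_VALUE = 50_000_000   # annual revenue influenced by the model
LIFT_PER_REFRESH = 0.005       # 0.5% expected lift from one refresh
COST_PER_REFRESH = 40_000      # compute, validation and sign-off effort

def net_benefit(refreshes_per_year: int) -> float:
    """Expected annual benefit of retraining minus its running cost.

    Assumes diminishing returns: each extra refresh adds half the lift
    of the previous one.
    """
    lift = sum(LIFT_PER_REFRESH * (0.5 ** i) for i in range(refreshes_per_year))
    return PORTFOLIO_VALUE * lift - COST_PER_REFRESH * refreshes_per_year

for k in (1, 2, 4, 12):
    print(f"{k:>2} refreshes/year -> net benefit {net_benefit(k):,.0f}")
```

Under these toy numbers the marginal lift shrinks while the run cost grows linearly, which is why high-frequency retraining is only justified where fresh data carries real value, as in fraud or recommendations.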
III. What level of cloud maturity should we adopt?
As discussed so far, for any large business, the decision to migrate to cloud infrastructure is not an easy path to walk. Various cloud service providers have defined different levels of maturity for cloud adoption. The table below maps model types
to the probable level of maturity required and its key feature.
Model type                        Level (Low(1) to High(3))    Feature
Risk scoring models               Level 1                      Manual pipeline
Response/recommendation models    Level 2                      Automated pipeline
Fraud/AML models                  Level 3                      Automated pipeline and deployment
Migrating to the cloud opens up scope for innovation for any product owner. However, given the differences in book size and current infrastructure, not all migrations or cloud adoptions will follow the same path. In the referenced grid, I have shown a
personal view of an MLOps maturity grid that supports a gap assessment, guiding an organization to choose the right MLOps capabilities rather than going straight to a fully mature environment. The following table references different cloud components
and indicative levels of maturity in adopting the MLOps process.
Table: Reference grid of MLOps component maturity levels across model types (image)
IV. Conclusion
In summary, this paper provides an overview of the adoption of cloud-enabled computing in the banking sector, with a specific emphasis on MLOps processes and their various components. It highlights the importance of considering model
characteristics, regulatory requirements, and governance practices in a successful MLOps implementation. By dissecting key aspects of MLOps and cloud maturity, the paper offers insights into the challenges and strategies associated with adopting these
transformative practices in the banking industry.
[1] Population shift could be due to economic changes, changes in underwriting policies, changes in customer behaviour, etc.
[2] If the portfolio is new and the product is newly launched, model performance may decay faster compared to a model built on a stable portfolio.
[3] Areas like AML, fraud detection, recommendation engines or cyber security need more frequent model retraining. If the model monitoring pipeline takes one week to complete and the update delivers only a 0.5% expected lift, then a high frequency of
model updates may not be justified. Challenges can arise if the ground truth or actual performance is captured with a lag or delay. In practice, some models are retrained daily (e.g. recommendation engines) and others once or twice a year (e.g. the
credit scoring model).