Why MLOps is Required: The Mission Control Imperative

Authors: Ryan Oskvarek | 17-December-2024

Taking to heart the advice of business leadership expert Simon Sinek, let's start with WHY Machine Learning Operations (MLOps) is required: humans are too slow, and computers are too fast. This fundamental disparity creates a governance challenge that can only be addressed through comprehensive, well-curated MLOps systems.

The Scale of Modern AI

Picture yourself traveling in a spaceship without mission control, where you, a single astronaut, must monitor all systems, make course corrections, and ensure safety - all while traveling at 25,000 MPH. Organizations face a similar challenge when attempting to manually govern AI systems. Every second, modern AI systems are making millions of decisions, processing petabytes of data, and affecting countless users - all at speeds far beyond human comprehension.

What is MLOps?

Here's a frequently cited definition, published in IEEE Access, a journal of the Institute of Electrical and Electronics Engineers (IEEE):

“MLOps is a paradigm, including aspects like best practices, sets of concepts, as well as a development culture when it comes to the end-to-end conceptualization, implementation, monitoring, deployment, and scalability of machine learning products. Most of all, it is an engineering practice that leverages three contributing disciplines: machine learning, software engineering (especially DevOps), and data engineering. MLOps is aimed at productionizing machine learning systems by bridging the gap between development (Dev) and operations (Ops). Essentially, MLOps aims to facilitate the creation of machine learning products by leveraging these principles: CI/CD automation, workflow orchestration, reproducibility; versioning of data, model, and code; collaboration; continuous ML training and evaluation; ML metadata tracking and logging; continuous monitoring; and feedback loops.[1]”

In my opinion, MLOps is the most powerful way to de-risk any AI system. Why is that? First, you visualize your processes and map all your risks. Second, you remove bottlenecks, improve the system, and start to automate, in a way that Lean Six Sigma practitioners would champion as "kaizen" (change for the better). Third, you build pipelines that test your systems prior to deployment and monitor your systems in production, ensuring that their outputs are in line with your expectations.
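To make that third step concrete, here is a minimal sketch of an automated pre-deployment gate. The model stub, golden set, and score ranges below are illustrative assumptions, not a prescribed implementation.

```python
def predict(text: str) -> float:
    """Stand-in for a trained model that scores an input from 0 to 1."""
    return min(1.0, len(text) / 100)

# Golden set: inputs with known acceptable score ranges, checked
# automatically before any deployment, the way a DevOps pipeline
# runs unit tests before shipping code.
GOLDEN_SET = [
    ("small routine purchase", (0.0, 0.5)),
    ("x" * 95, (0.8, 1.0)),
]

def deployment_gate() -> bool:
    """Return True only if every golden input scores in its expected range."""
    return all(lo <= predict(text) <= hi for text, (lo, hi) in GOLDEN_SET)

assert deployment_gate()  # the pipeline blocks deployment if this fails
```

The same golden-set check can run on a schedule against the production model, turning the pre-deployment test into an ongoing monitor.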

A Day Without MLOps

Let's imagine a day without MLOps. Somehow, someway, your organization has built a model that automates a key part of some decision-making process. The data have been cleaned, the model trained, and now… you have to convince your risk and compliance teams that this model, which leverages complex technologies the vast majority of humans barely understand, is "safe" to deploy. Cue the meetings. This potentially valuable tool for your enterprise, which required an investment of weeks, months, or even years of effort by teams of highly skilled individuals, is effectively sitting on the shelf waiting for the gears of bureaucracy to turn.

As the model sits on a drive somewhere, and as your team sits in meeting after meeting trying to move the deployment forward, convincing the organization that the model is "safe enough," the valuable learnings about how the model was built fade into the past. Meanwhile, the market need and/or the deployment environment begin to change, resulting in concept or model drift. Deployment momentum is lost, and business value dissipates.

The speed mismatch between AI and humans becomes even more evident when we consider potential value loss. During the time spent awaiting approval, the model could have processed millions of transactions, learned from countless interactions, and adapted to changing conditions. Instead, human processes - inherently slower than AI - create a bottleneck that prevents us from realizing the full potential of AI systems.

The MLOps Solution: Building Mission Control

This is where MLOps enters the picture, not as an optional enhancement but as a necessary enabler. NASA built mission control in response to the challenges they were facing during the Mercury program, namely how to leverage new technologies (rocketry and computers) and align a wide range of individuals around dynamically changing sets of problems and solutions.  Organizations need to lean into systems, such as MLOps, to manage the deployment, curation, and monitoring of AI systems.

Luckily, there is an engineering discipline that has already been making inroads within technology enterprises, namely DevOps. I personally saw the impact of a good DevOps pipeline at the same place I got my start in Product Management, Workiva. I remember sitting there as Travis Ensley, the then leader of Quality Assurance (QA), gave a talk to the whole company about the need to "do things smaller and faster". That talk launched Workiva on a trajectory from large quarterly releases to small daily releases, even during their customers' most critical periods of using the software. By keeping changes small and building a robust automated pipeline to production, Workiva mitigated risk.

The same lessons from the space program and DevOps hold true for MLOps; we just need to extend them a bit. 

The Learning Cycle: Beyond Simple Automation

So what is different? Why can’t we just reuse the same old DevOps pipelines that we have been diligently developing? Well, the primary difference I have seen is in the type of computing involved. DevOps was engineered for deterministic systems, while MLOps has to be engineered for non-deterministic systems. Deterministic systems are “programmed”: encoded into the memory of computers, they predictably generate the same outputs when given the same inputs, time after time after time.

Non-deterministic computing still needs to be deployed, monitored, accessed, and used.  However, it must also continue to learn, and thus, any system that we build must allow for the underlying models to learn and adjust as the environment in which they exist evolves. 

Non-deterministic computing also means that the outputs may vary significantly based on the inputs. For example, changing a single word in a prompt can lead to a radically different response from an LLM. This greatly increases the difficulty of protecting our customers and our enterprises from undesirable outputs.
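The contrast can be sketched in a few lines of Python. Both functions below are toy stand-ins invented for illustration, not real model calls.

```python
import random

def deterministic_tax(amount: float) -> float:
    """A programmed rule: the same input always yields the same output."""
    return round(amount * 0.07, 2)

def stochastic_reply(prompt: str, seed=None) -> str:
    """A toy stand-in for a non-deterministic model: sampling means the
    output can vary even when the prompt does not."""
    rng = random.Random(seed)
    templates = [
        "Sure, here is a summary of {p}.",
        "Happy to help! Regarding {p}...",
        "Let me think about {p} step by step.",
    ]
    return rng.choice(templates).format(p=prompt)

# The deterministic rule is testable with exact matches, DevOps-style.
assert deterministic_tax(100.0) == deterministic_tax(100.0) == 7.0

# The stochastic function is not: identical prompts may yield different
# text across runs, so tests must check properties, not exact strings.
reply = stochastic_reply("the refund policy")
assert "refund policy" in reply
```

This is why MLOps pipelines lean on property-based evaluations and acceptable ranges rather than the exact-match assertions that serve DevOps so well.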

Building Trust in Non-Deterministic Systems

“Trust but verify.” I am a child of the eighties, but I didn’t hear this famous phrase straight from Reagan’s lips; I heard it from a manager at a major bank. Humans are non-deterministic systems too, and we need checks and balances to ensure that great customer service, great code, and great products are the norm. The same goes for AI systems.

This is where the true innovation of MLOps comes into play. A robust MLOps system will not only deploy and monitor models but also evaluate and regenerate their outputs in real time, proactively, before a customer ever experiences a harmful interaction with the model. Like a manager, auditor, or secret shopper, MLOps guardrails can take several forms:

  • Evaluation frameworks continuously assess model outputs against established criteria. Unlike traditional software testing, which can rely on exact matches, these frameworks must understand acceptable ranges and context-dependent responses.

  • Guardrails act as constant guardians, checking every output against established policies and ethical guidelines. These aren't simple yes/no filters – they're sophisticated systems that understand the nuance and context of the application.

  • Learning capture systems record not just the outputs, but the entire context of each interaction. When something goes wrong (or right!), the system preserves the relevant information for future audit, learning, and improvement.
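As a sketch of how these three pieces might fit together in code, consider the following. The policy list, field names, and thresholds are illustrative assumptions, not a standard API.

```python
import json
import time

# Hypothetical policy for a customer-facing model; real policies would
# be far richer and context-aware.
BANNED_PHRASES = {"guaranteed returns", "medical diagnosis"}
MAX_LENGTH = 500

def guardrail_check(output: str) -> list:
    """Return a list of policy violations found in a model output."""
    violations = []
    lowered = output.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            violations.append(f"banned phrase: {phrase!r}")
    if len(output) > MAX_LENGTH:
        violations.append(f"output exceeds {MAX_LENGTH} characters")
    return violations

def capture_interaction(prompt: str, output: str, violations: list) -> str:
    """Learning capture: preserve the full context of the interaction
    as a JSON record for later audit, learning, and improvement."""
    return json.dumps({
        "timestamp": time.time(),
        "prompt": prompt,
        "output": output,
        "violations": violations,
        "passed": not violations,
    })

output = "This fund offers guaranteed returns with no risk."
record = capture_interaction("Should I buy this stock?", output,
                             guardrail_check(output))
print(record)
```

Because every check is code, it runs on every output at machine speed, and every blocked output leaves an audit trail behind.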

The irony is that our careful, human-paced verification processes often slow down systems designed to operate at machine speed. The solution isn't to remove these checks—it's to automate them. When properly implemented, MLOps can perform thousands of safety checks in the time it would take a human to complete just one, maintaining both speed and safety.

This automated vigilance operates at machine speed but with human-designed intelligence. While a human team might take days to review a sampling of model outputs, an MLOps pipeline can evaluate every single output in real-time, catching potential issues before they reach customers.

Expanding the Table

One of the things I love most about DevOps is how it brings so many perspectives and stakeholders to the same table. MLOps does the same thing, because an array of internal organizations – from risk and compliance to data science to operations – now has an ongoing and vested interest in the curation and maturation of the system. A well-managed MLOps pipeline has continuous improvement as its core philosophy and evolves as the system is used.

This has several benefits for an enterprise. First, the level of risk goes down because the automated checks keep getting better. Second, the velocity of learning radically increases because the data is visible and accessible. Third, the cost of deployment drops because the time required to move a change to production also drops. With more time on their hands, the highly skilled technical and business people we pay to think can now focus on creating new products and growing the enterprise.

The Continuous Learning Loop

This brings us to perhaps the most critical aspect of MLOps: the continuous learning loop. In traditional DevOps, we might deploy a fix and move on. In MLOps, every interaction is an opportunity for both the model and the humans managing it to learn and improve.

The system must capture not just what happened, but why it happened and what we can learn from it. This learning feeds back into multiple levels:

  • The models themselves can be fine-tuned based on real-world performance.
  • The guardrails can be adjusted to better reflect real-world requirements.
  • And perhaps most importantly, the humans overseeing the system gain deeper insights into how their AI systems actually behave in production.
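A minimal sketch of closing that loop might look like the following; the record fields and violation categories are assumptions chosen for illustration.

```python
from collections import Counter

# Records of the kind a learning-capture system might store: what the
# guardrails flagged, and what a human reviewer flagged afterward.
records = [
    {"violation": None, "flagged_by_human": False},
    {"violation": "off_topic", "flagged_by_human": False},
    {"violation": None, "flagged_by_human": True},   # missed by guardrails
    {"violation": "off_topic", "flagged_by_human": False},
]

# Feed production outcomes back into guardrail tuning: tally what the
# automated checks caught, and count what only the humans caught.
caught = Counter(r["violation"] for r in records if r["violation"])
missed = sum(1 for r in records
             if r["violation"] is None and r["flagged_by_human"])

print(dict(caught))  # {'off_topic': 2}
print(missed)        # 1 -> a signal that the guardrails need adjustment
```

Each pass through this loop tightens the guardrails, sharpens the evaluations, and deepens the team's understanding of how the system behaves in production.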

From Theory to Practice

The reality is that AI systems are already making millions of decisions daily in organizations around the world. The question isn't whether to implement MLOps – it's how quickly we can establish these systems to ensure safe, effective, and valuable AI deployment.

Without robust MLOps, organizations face a stark choice: either move slowly and cautiously, missing opportunities for innovation, or move quickly and risk uncontrolled outcomes. MLOps provides a third path: controlled, systematic deployment that enables both safety and speed.

The Path Forward

As we continue to develop and deploy more sophisticated AI systems, the role of MLOps will only grow in importance. The future belongs to organizations that can effectively manage the balance between innovation and control, between speed and safety, between automation and oversight.

The good news is that we don't have to build everything from scratch. We can learn from DevOps, from systems engineering, from safety-critical systems, and from each other. The patterns and practices of MLOps are emerging and evolving rapidly, driven by real-world experience and necessity.

Conclusion

MLOps isn't just another layer of technology for building AI systems – it's a fundamental shift in how we approach the deployment and management of AI. It's the difference between hoping our models behave well and ensuring they do. Between crossing our fingers and confidently maintaining control. Between creating potential value and delivering actual value.

The time to implement MLOps isn't after we've deployed our first model, or after we've had our first customer-impacting incident. The time is now, as we build and deploy these powerful new tools. 

Because in the end, the success of AI in our organizations won't just be about the models we build – it will be about how effectively we manage, monitor, and improve them over time.

ZealStrat: A Partner in Ethical AI

As the only U.S. company authorized by IEEE to provide AI ethics assessments based on the CertifAIEd™ standard, ZealStrat has developed a comprehensive approach to identifying and mitigating risks throughout the lifecycle of decision-support systems.

ZealStrat helps organizations prioritize and manage these risks, evolving their AI systems to be more trustworthy and more valuable. If you’re looking to integrate ethical AI into your decision-support systems, we invite you to reach out to us at ZealStrat.

References: 

[1]  Kreuzberger, Dominik; Kühl, Niklas; Hirschl, Sebastian (2023). "Machine Learning Operations (MLOps): Overview, Definition, and Architecture". IEEE Access. 11: 31866–31879. arXiv:2205.02302. Bibcode:2023IEEEA..1131866K. doi:10.1109/ACCESS.2023.3262138. ISSN 2169-3536. S2CID 248524628.

Thank you to Tom Tirpak for his helpful suggestions and edits for this article.