AI & the Challenge of Scale

4 Tips for Approaching Scalability in Your AI

May 20, 2021

As you may have noticed from the last newsletter, AI has been on my mind a lot lately. That makes sense: AI is core to pretty much everything we do at Charli.

In particular, I’ve been thinking a lot about scale. As a young startup, we focus heavily on features, functionality and tech, but then you hit an inflection point where you MUST think about scale, or you’re going to have a pretty difficult time taking the company where it needs to go.

For a technology startup, scale is an obvious must (at some point). But for an AI like Charli’s that’s all about learning personal preference and automating for the individual user, designing it to be both user-centric/customizable and scalable didn’t come without many long nights and a few extra grey hairs. 🤯

The dirty little secret of the tech world is that AI has a ton of scale challenges. If you’re a founder considering an AI-centric startup, fair warning: scaling AI can be expensive if you don’t get creative. Luckily, you’re not alone. Training the models takes a lot of data, and the quality of that data has to match the goals you are trying to achieve. Data labeling (providing the data along with the metadata needed to train the models) is also strenuous, because it takes enough people to satisfy the training requirements. And then there’s the continuous training. That’s a whole other mountain to climb.

In other words, developing and implementing AI that can continuously learn across a diverse population of millions of users can be overwhelming, and general-purpose ML models won't suffice.

Having been down this path and learned a lot on the way to our solution, I wanted to share the following for those of you thinking of building AI products or adding AI to your existing portfolio...

If you’re struggling with AI-scale stress, you’re not alone. 🤗 Here are four scalability challenges many startups lose sleep over, as well as some suggestions:

1. Customization of AI models

General-purpose AI models often lack the performance and the accuracy needed to solve specific real-world problems. That means each AI model must be customized and trained to fit a specific problem, data set, and domain. In addition, maintaining each model and keeping it learning requires a ton of work. This is a huge challenge when we start talking scale.

To approach this challenge, focus on making the design and implementation of the customization process as efficient as possible. At Charli, we’ve figured this out and can work closely with users across any number of industries and apply knowledge and continuous learning on a structured basis. We can do this through scalable methods that orchestrate AI models across users, industries, locales, and other factors.

2. Data management

Data is a crucial component of each AI model. Labeling, processing, and managing the vast amount of data needed to run AI models at production scale requires tremendous effort. Moreover, AI models have to be continuously updated with new incoming data. This means we need to go through the lifecycle of data pre-processing, model training, and deployment again and again.

The first step in tackling data management complexity is to take the time to define the data strategy endgame, i.e. think about what data should be collected, processed, and incorporated, and when. At Charli, we designed a robust, scalable data management solution with the requirements of AI models in mind. Having an AI-infused solution makes it easier to implement automated data processing steps. In addition, we take extra measures to ensure the quality of data, since any AI model is only as good as its training data.
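To make the data quality point a bit more concrete, here is a minimal sketch of an automated quality gate that could sit in front of a training run. It is only an illustration and not Charli's actual pipeline: the record layout (a "text" and a "label" field), the function name split_by_quality, and the two checks (missing fields, duplicates) are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: every labeled record needs these fields.
REQUIRED_FIELDS = ("text", "label")


def split_by_quality(records: List[Dict]) -> Tuple[List[Dict], List[Tuple[Dict, str]]]:
    """Split records into clean ones and rejected ones with a reason."""
    clean: List[Dict] = []
    rejected: List[Tuple[Dict, str]] = []
    seen = set()  # normalized (text, label) pairs already accepted

    for rec in records:
        # Treat absent, None, or whitespace-only values as missing.
        values = {f: str(rec.get(f) or "").strip() for f in REQUIRED_FIELDS}
        missing = [f for f, v in values.items() if not v]
        if missing:
            rejected.append((rec, "missing " + ", ".join(missing)))
            continue

        # Case-insensitive duplicate check on the text, exact on the label.
        key = (values["text"].lower(), values["label"])
        if key in seen:
            rejected.append((rec, "duplicate"))
            continue

        seen.add(key)
        clean.append(rec)

    return clean, rejected


if __name__ == "__main__":
    sample = [
        {"text": "Pay invoice", "label": "finance"},
        {"text": "pay invoice ", "label": "finance"},
        {"text": "Team lunch"},
    ]
    kept, dropped = split_by_quality(sample)
    print(len(kept), "kept;", [reason for _, reason in dropped])
```

Keeping the rejected records together with a reason, rather than silently dropping them, gives the people doing the labeling something to act on before the next training cycle.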

3. Resource management

With ever-increasing data and new, complex AI model architectures, scalable infrastructure resources such as memory, computational power, and storage are a must, specifically resources that can easily scale without breaking the product.

In addition, we have to make sure the AI models are efficient and customized to their specific task; otherwise, they might run slowly and consume huge amounts of expensive computing resources. This is especially important for models that need to be continuously updated/trained.

One way to approach this is to invest time and resources to design and implement an optimized MLOps process. To bring AI models into production at scale, MLOps practices bring Data Scientists and DevOps engineers together to increase automation and improve the quality of various steps in the ML lifecycle. At Charli, in addition to having an optimized MLOps practice in place, we are also mindful of the performance of AI models at scale while creating them. This is necessary to avoid creating AI solutions that appear to work well during testing, but are unstable in production.

4. Unexpected behavior

To run AI models in production, we need support for situations that were not designed or planned for, such as issues that did not appear in testing but can happen in real life. One example is when the result of an AI model is rejected by business logic. Handling these cases, and learning from them automatically, is another challenge for scaling AI.

To approach this challenge, at Charli we’ve learned that having various validation steps helps to capture these unexpected behaviors and activate the contingency options in time. Moreover, we’ve invested in continuous monitoring and “guardrail” techniques that are designed to provide confidence in AI decision making. Here at Charli, we see most of these unexpected behaviors as learning opportunities to enhance the accuracy and performance of our AI solutions.
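For readers who think in code, here is a rough sketch of what "validation steps plus a contingency option" can look like. It is not how Charli implements guardrails; the names (Prediction, min_confidence, allowed_labels, guarded_decision), the 0.8 threshold, and the "needs_review" fallback are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Prediction:
    label: str
    confidence: float


# A check returns None when the prediction passes, or a reason string when it fails.
Check = Callable[[Prediction], Optional[str]]


def min_confidence(threshold: float) -> Check:
    def check(pred: Prediction) -> Optional[str]:
        if pred.confidence < threshold:
            return f"confidence {pred.confidence:.2f} below {threshold:.2f}"
        return None
    return check


def allowed_labels(labels: Set[str]) -> Check:
    # Stand-in for a business rule that can reject a model result.
    def check(pred: Prediction) -> Optional[str]:
        if pred.label not in labels:
            return f"label {pred.label!r} not allowed by business rules"
        return None
    return check


@dataclass
class Decision:
    label: str
    used_fallback: bool
    reasons: List[str] = field(default_factory=list)


def guarded_decision(
    pred: Prediction,
    checks: List[Check],
    fallback_label: str,
    review_log: List[Tuple[Prediction, List[str]]],
) -> Decision:
    """Run every check; on any failure, use the fallback and log the case."""
    reasons = []
    for check in checks:
        reason = check(pred)
        if reason is not None:
            reasons.append(reason)
    if reasons:
        # Logged cases can later be reviewed and fed back into training.
        review_log.append((pred, reasons))
        return Decision(fallback_label, True, reasons)
    return Decision(pred.label, False)


if __name__ == "__main__":
    checks = [min_confidence(0.8), allowed_labels({"invoice", "receipt"})]
    log: List[Tuple[Prediction, List[str]]] = []
    print(guarded_decision(Prediction("invoice", 0.95), checks, "needs_review", log))
    print(guarded_decision(Prediction("memo", 0.50), checks, "needs_review", log))
```

The review log is the piece that turns an unexpected behavior into a learning opportunity: every rejected prediction is kept with the reasons it failed, so it can be inspected and, where appropriate, used as new training data.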

Inside Charli AI Labs
