Google Cloud Run Vs. AI Platform: Making The Right ML Pipeline Choice
We are a customer support automation startup, IrisAgent, that processes large quantities of text data from support tickets and time-series data from engineering and product sources. Our business objective is to enable smarter customer support using real-time insights about operational, engineering, and user issues.
We evaluated Google AI Platform and Google Cloud Run for setting up a robust, production-ready ML pipeline. We hope our findings save you valuable time.
Introduction
What is Google Cloud Run?
Google Cloud Run is a fully managed, serverless platform that allows developers to deploy containerized applications quickly and easily. It automatically scales applications in response to incoming requests, providing cost-efficiency and flexibility. Developers can focus on building and deploying code while Google Cloud Run handles the underlying infrastructure, making it ideal for web services, microservices, and API deployments.
What is Google AI Platform?
Google AI Platform is a cloud-based service that simplifies the development, training, and deployment of machine learning models. It provides a collaborative environment for data scientists and ML engineers, offering tools for data preparation, training, and serving models. Google AI Platform accelerates the development of AI solutions, making them accessible to a broader range of users.
Goals for our ML Pipeline
We wanted our ML pipeline to meet the following objectives:
Easy to manage
We'd rather focus on our core business problems and those of our customers than spend time on data engineering and managing the ML pipeline. We wanted an out-of-the-box solution that just worked.
Modular and Extensible
We are a young startup iterating quickly on different ML approaches. Our ML pipeline has several distinct steps, and we wanted a pipeline tool that lets us swap components in and out easily.
Compatible with our current setup
We currently deploy all our services as containers on Google Cloud Run and store our data in MongoDB and Google Cloud Storage.
ML pipeline requirements
We started by defining our ideal setup and requirements. We wanted modular components for data preparation, processing, training, evaluation, and serving predictions on new data.
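To make the modularity requirement concrete, here is a minimal sketch of a pipeline built from named, swappable steps. The `Pipeline` class, the step names, and the toy step functions are all hypothetical illustrations, not our production code.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A step is any callable that takes the previous step's output.
Step = Callable[[Any], Any]


@dataclass
class Pipeline:
    """Hypothetical pipeline: an ordered mapping of step name -> callable."""

    steps: Dict[str, Step] = field(default_factory=dict)

    def add(self, name: str, step: Step) -> "Pipeline":
        self.steps[name] = step
        return self

    def replace(self, name: str, step: Step) -> "Pipeline":
        # Reassigning an existing key keeps its position in the dict,
        # so the step order is preserved when a component is swapped.
        if name not in self.steps:
            raise KeyError(f"no such step: {name}")
        self.steps[name] = step
        return self

    def run(self, data: Any) -> Any:
        for step in self.steps.values():
            data = step(data)
        return data


# Toy steps standing in for real preparation, processing, and evaluation.
def prepare(tickets: List[str]) -> List[str]:
    return [t.strip().lower() for t in tickets]


def tokenize(tickets: List[str]) -> List[List[str]]:
    return [t.split() for t in tickets]


def count_tokens(tokenized: List[List[str]]) -> List[int]:
    return [len(tokens) for tokens in tokenized]


pipeline = (
    Pipeline()
    .add("prepare", prepare)
    .add("process", tokenize)
    .add("evaluate", count_tokens)
)
result = pipeline.run(["  Login FAILS on mobile ", "Billing page is slow"])
```

Swapping an approach then means calling `pipeline.replace("process", new_step)` while the rest of the pipeline stays untouched.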
Findings to build an AI Pipeline
Google AI Platform
Google AI Platform was compatible with our current cloud setup, which is also on GCP. As a managed service, it was also easy to manage. However, we ran into a blocker when experimenting with it.
We had to choose between a standard container and a custom container, and unfortunately, neither worked for us.
Standard Container
We could not use GCP's standard out-of-the-box container because we use ML frameworks other than TensorFlow, scikit-learn, and XGBoost. As a customer-support AI company, we have several NLP models that often don't use standard frameworks. We also needed to experiment and deploy models quickly without getting blocked by framework limitations.
Predictions with standard frameworks run smoothly on AI Platform. A non-standard framework, however, required us to configure a custom prediction routine, which slowed us down. The custom prediction routine also had a big limitation: it could run only on a legacy (MLS1) machine type with just 2 GB of RAM. We quickly hit an out-of-memory error:
ISSUE: Bad model detected with error: Model requires more memory than allowed. Please try to decrease the model size and redeploy
Thus, standard containers became a no-go.
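For context, a custom prediction routine is a Python class that AI Platform loads and calls for each prediction request. The sketch below follows the Predictor interface as we understand it from the AI Platform documentation: a `predict` method and a `from_path` class method. The keyword-matching "model" and the `keywords.json` file name are hypothetical stand-ins for a real NLP model, which is where the memory limit would bite.

```python
import json
import os
from typing import Any, Dict, List


class KeywordPredictor:
    """Toy predictor shaped like a custom prediction routine."""

    def __init__(self, keywords: Dict[str, List[str]]):
        # In a real routine this would hold the loaded model, and the
        # model has to fit in the memory of the serving machine.
        self._keywords = keywords

    def predict(self, instances: List[str], **kwargs: Any) -> List[str]:
        """Return one label per input text; 'other' if nothing matches."""
        labels = []
        for text in instances:
            lowered = text.lower()
            label = "other"
            for candidate, words in self._keywords.items():
                if any(word in lowered for word in words):
                    label = candidate
                    break
            labels.append(label)
        return labels

    @classmethod
    def from_path(cls, model_dir: str) -> "KeywordPredictor":
        """Build the predictor from artifacts stored in model_dir."""
        # keywords.json is an assumed artifact name for this sketch.
        with open(os.path.join(model_dir, "keywords.json")) as f:
            return cls(json.load(f))
```

Writing and packaging a class like this for every non-standard model is the extra configuration work that slowed us down.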
Custom Container
Next, we tried a custom container, but it didn't meet our requirements for speed and ease of management. It also required a different deployment strategy from the rest of our services.
Google Cloud Run
We decided to stay with Cloud Run for our ML requirements. We set up a microservices-oriented architecture and used Cloud Scheduler to schedule different ML tasks periodically.
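As a rough illustration of this setup, the sketch below shows a small HTTP service that maps request paths to ML tasks, so that a Cloud Scheduler job can trigger each task with a POST request. The task names and paths are hypothetical, and the task bodies are placeholders. The only Cloud Run specific detail is that the container listens on the port given in the `PORT` environment variable.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

# Hypothetical task registry: path -> function that runs one ML task.
TASKS: Dict[str, Callable[[], dict]] = {
    "/tasks/retrain": lambda: {"task": "retrain", "status": "done"},
    "/tasks/evaluate": lambda: {"task": "evaluate", "status": "done"},
}


def dispatch(path: str) -> Tuple[int, dict]:
    """Run the task registered for path and return (HTTP status, body)."""
    task = TASKS.get(path)
    if task is None:
        return 404, {"error": f"unknown task: {path}"}
    return 200, task()


class TaskHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        status, body = dispatch(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def main() -> None:
    # Cloud Run tells the container which port to listen on via PORT.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), TaskHandler).serve_forever()

# Call main() as the container entry point to start serving.
```

Keeping `dispatch` separate from the HTTP handler makes each task easy to test locally and easy to move into its own service later.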
The most significant advantage of Cloud Run is that it handles autoscaling and container crashes gracefully, with no operational overhead for us. It is also much cheaper and has a generous free tier. The most significant limitation, at the time of our evaluation, was a maximum of 8 GB of RAM and 4 CPUs per container instance, which we will likely hit as we adopt larger ML models. We will likely migrate to AI Platform or Google Kubernetes Engine at that point.
Interested in learning how we are solving real business problems using AI? Learn more about our AI product on our website or contact us directly.
Interested in joining us and working on exciting and challenging problems in AI and machine learning? Send us a quick note with your LinkedIn profile link.