What is Vertex AI? - Model Training, Features and more
When building machine learning (ML) models, it takes a lot of input and expertise to move from data management all the way to predictions. You have to worry about data preparation, training the model based on your data, evaluating it for efficiency, and then deploying it to make predictions. Given that organizations tend to have teams at different levels of expertise, streamlining an ML workflow to build accurate and efficient ML models can be challenging. As a result, accelerating the delivery of ML models and applications to production becomes extremely difficult. Well, not anymore, thanks to Vertex AI.
What is Vertex AI?
Vertex AI is Google Cloud’s ML platform for building, evaluating, versioning, and deploying ML models into production environments. It is an end-to-end solution covering the entire machine-learning workflow from data preparation and model development to deployment and monitoring. Vertex AI combines all of these under one roof so that your data science, ML engineering, and data engineering teams can have a common set of tools throughout your ML workflow.
It has a variety of low-code tools that users can utilize to harness the power of AI for different use cases without compromising on capabilities. All of this is on fully managed infrastructure handled by Google Cloud, allowing you to seamlessly scale ML models without worrying about infrastructure provisioning. Vertex AI is one of Google Cloud’s data services, and with its features, organizations can build, deploy and scale machine learning models more easily.
Features of Vertex AI
Vertex AI has solutions for each step of the machine learning workflow including:
Data preparation consolidated in one place
You can use Managed Datasets within Vertex AI to prepare your data prior to training. In Managed Datasets, you can bring in data directly from the console or via the API, clean it, and label it. Once you have your data, you can explore it using either Vertex AI Workbench notebooks or Dataproc Serverless Spark, depending on its size.
Model training options for novices and experts
There are two options: AutoML and Custom. AutoML is the best option for novice users, as it allows them to train models on their data without writing any code. It supports different data types such as text, images, video, and tabular data. With AutoML, Vertex AI handles finding the best model for the specific task.
The Custom option is the preferred route for experts, as it gives them more control over the architecture of the model. So if you want to create the architecture and write the training code yourself, this is the option for you. Custom training allows you to use a pre-built container image (PyTorch, TensorFlow, XGBoost, or scikit-learn) or create a custom container image to package your training app. Then, you can configure the compute resources to run the training job of your custom model.
Model evaluation to improve performance
Using Explainable AI in Vertex AI, you can assess your ML models and understand the signals behind how they work. Vertex AI provides you with several metrics, such as precision and recall at a given confidence threshold, to evaluate your model’s performance. You can use this information to eliminate errors and improve your model’s performance.
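To make the relationship between confidence threshold, precision, and recall concrete, here is a minimal sketch in plain Python. It is not Vertex AI's implementation; the function name, scores, and labels are made up for illustration. It treats every prediction whose score meets the threshold as positive and computes the two metrics from the resulting counts.

```python
from typing import List, Tuple


def precision_recall_at_threshold(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Return (precision, recall), counting scores >= threshold as positive."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1  # correctly flagged positive
        elif predicted_positive and label == 0:
            fp += 1  # flagged positive, actually negative
        elif not predicted_positive and label == 1:
            fn += 1  # missed positive
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical model scores and ground-truth labels (1 = positive class).
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.5, 0.9):
    p, r = precision_recall_at_threshold(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Raising the threshold usually trades recall for precision, which is why the two are normally read together rather than in isolation.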
Model serving
To serve your models for predictions, Vertex AI allows you to create endpoints for deployment. A model can have multiple endpoints and it will scale depending on traffic, provided you configure the compute resources. The best part about model serving with Vertex AI is that you can import a model trained elsewhere and serve it on Vertex AI. An ML model deployed on Vertex AI can serve online predictions (HTTP predictions) and batch predictions – based on batch data on Cloud Storage. Batch predictions don’t require deployment to endpoints.
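As a rough illustration of the two serving modes, the sketch below builds the pieces of an online prediction request and a JSON Lines file for batch prediction. The helper names, project ID, and endpoint ID are hypothetical, and the instance format is an assumption: the exact shape of each instance depends on what your model expects. Nothing here sends a request, so authentication is left out.

```python
import json
from typing import Any, Dict, List, Optional


def predict_url(project: str, location: str, endpoint_id: str) -> str:
    """Build the REST URL for online predictions against a deployed endpoint."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/endpoints/{endpoint_id}:predict"
    )


def predict_body(
    instances: List[Any], parameters: Optional[Dict[str, Any]] = None
) -> str:
    """Serialize the JSON body: a list of instances plus optional parameters."""
    body: Dict[str, Any] = {"instances": instances}
    if parameters is not None:
        body["parameters"] = parameters
    return json.dumps(body)


def batch_input_jsonl(instances: List[Any]) -> str:
    """Serialize instances as JSON Lines, one instance per line.

    A file like this would be uploaded to Cloud Storage as batch input.
    """
    return "\n".join(json.dumps(instance) for instance in instances)


# Placeholder identifiers; replace with your own project and endpoint.
print(predict_url("my-project", "us-central1", "1234567890"))
print(predict_body([[5.1, 3.5, 1.4, 0.2]]))
print(batch_input_jsonl([[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]]))
```

The online path needs a deployed endpoint and returns results per request, while the batch path only needs the model and an input file.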
Integration with Generative AI
Vertex AI’s integration with Generative AI gives users access to Google’s Generative AI foundation models. There are a variety of foundation models available on Vertex AI such as code, image, text and chat, and text embedding models. These are grouped depending on the content they’re designed to generate.
Foundation models are built for general use cases, but you can customize them for specific tasks. The simplest way to do this is with prompts (for instance, text instructions): the models behave in different ways depending on how you structure your prompts. Once you’re satisfied, you can deploy them to production. With limited ML knowledge and a few lines of code, you can get started building with generative AI.
Vertex AI in MLOps
Vertex AI is great for organizations that want to streamline model development and deployment via MLOps. In addition to the features mentioned above, Vertex AI has several other features that help users implement and enhance MLOps throughout the ML workflow. These include:
Vertex AI Pipelines – to build machine-learning pipelines that automate the orchestration of ML workflows, monitoring, and governance. This helps to minimize errors and move ML models to production faster.
Vertex ML Metadata – to record ML metadata, artifacts, and parameters that are useful in model evaluation.
Vertex AI Experiments and Vertex AI TensorBoard – to identify the best model for a given use case by comparing factors such as architecture and how well each model performs.
Vertex AI Feature Store – to manage ML features and re-use them at scale to fast-track developing and deploying new ML models.
Vertex AI Model Monitoring – to monitor the quality of deployed models and send alerts when the predictions are wide of the mark.
With Vertex AI’s MLOps tools, AI teams (data scientists, ML engineers, and IT staff) can collaborate effectively and improve their ML models via predictive monitoring and proactive maintenance.
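To give a feel for what monitoring a deployed model can involve, here is a small sketch that compares the distribution of a categorical feature at training time with what the model sees in production. The L-infinity distance used here is one common drift score, chosen for illustration; the function names, sample data, and the 0.2 alert threshold are all assumptions rather than Vertex AI defaults.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def category_distribution(values: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return the share of each category in a non-empty sample."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def l_infinity_distance(
    baseline: Sequence[Hashable], serving: Sequence[Hashable]
) -> float:
    """Largest absolute difference in category share between two samples."""
    p = category_distribution(baseline)
    q = category_distribution(serving)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Hypothetical feature values seen in training vs. in production traffic.
training_values = ["mobile"] * 8 + ["desktop"] * 2
serving_values = ["mobile"] * 5 + ["desktop"] * 5

drift = l_infinity_distance(training_values, serving_values)
ALERT_THRESHOLD = 0.2  # illustrative value, not a recommended setting
if drift > ALERT_THRESHOLD:
    print(f"Drift score {drift:.2f} exceeds {ALERT_THRESHOLD}: raise an alert")
```

A managed monitoring service runs this kind of comparison on a schedule, so the team hears about a shift in the data before prediction quality visibly degrades.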
Training a custom model with Vertex AI
If your machine learning use cases can’t be met by what is offered in Vertex AI’s AutoML, training a custom model would be best for you. It will give you more control and flexibility throughout the model development process. And if you want to do this at scale, Vertex AI addresses the common challenges associated with scaling custom models through infrastructure management and enterprise-grade security.
NOTE: This method is more complex than using AutoML and it will require you to write your own training code.
So, to train a custom model with Vertex AI, here’s what you’ll need to do:
1: Load and prepare training data
Start by creating your dataset based on the prediction task and the type of data you want to use. All the datasets created will be available to you on the Vertex AI dashboard. To get the best performance and support, we recommend that you use Cloud Storage, BigQuery, or NFS shares on Google Cloud as your data source, or Vertex AI Managed Datasets when you want to use training pipelines.
2: Prepare the training application
Training a custom model in Vertex AI is done in Docker containers, so you’ll have to create a container for your training application. Before you do this, ensure you select the Custom Training option in the training tab on your Vertex AI dashboard.
With the custom training option, Vertex AI allows you to train models built with any framework using pre-built or custom containers. For pre-built containers, here are the supported frameworks: PyTorch, TensorFlow, XGBoost, and scikit-learn. If you’ve built your training app with any of these frameworks, you can upload your code as a Python package.
If you built your training app using any other framework, the custom container option is for you. Here, you can create your own Docker container image with your training code and dependencies installed and push it to Artifact Registry.
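To show roughly what a training application looks like before it is packaged, here is a minimal, framework-free sketch of an entry point. The model (a straight line fitted by gradient descent), the toy data, and the function names are illustrative assumptions. Vertex AI custom training can pass an output location through the AIP_MODEL_DIR environment variable; that value is typically a Cloud Storage URI, which this sketch does not handle, so it only writes to local paths.

```python
import argparse
import json
import os
from typing import List, Optional, Tuple


def fit_line(
    xs: List[float], ys: List[float], epochs: int, lr: float = 0.05
) -> Tuple[float, float]:
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def main(argv: Optional[List[str]] = None) -> str:
    """Parse arguments, train, save the model as JSON, and return its path."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=500)
    # Assumption: fall back to a local folder when AIP_MODEL_DIR is not set.
    parser.add_argument(
        "--model-dir", default=os.environ.get("AIP_MODEL_DIR", "./model")
    )
    args = parser.parse_args(argv)

    # Toy data generated from y = 2x + 1; real code would load your dataset.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    w, b = fit_line(xs, ys, args.epochs)

    os.makedirs(args.model_dir, exist_ok=True)
    path = os.path.join(args.model_dir, "model.json")
    with open(path, "w") as f:
        json.dump({"w": w, "b": b}, f)
    return path
```

In a container, the image's entry point would call main() so that the job's arguments flow through argparse. A real application would replace the toy fit with PyTorch, TensorFlow, XGBoost, or scikit-learn code.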
3: Configure the training job
First, you’ll need to select the type of training job: a custom job, a hyperparameter tuning job, or a training pipeline. Then configure the compute resources (VMs, GPUs, etc.) that will be used for the training job. After that, you’ll configure the container, for example by providing the URI of the container image you want to use.
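The settings in this step can be pictured as a small configuration object. The sketch below assembles one as a plain Python dictionary, using the field names of a custom job's worker pool spec (machine_spec, replica_count, container_spec). The helper name, machine type, accelerator, and image URI are placeholder assumptions to adapt to your own project.

```python
from typing import Any, Dict, List, Optional


def build_worker_pool_spec(
    image_uri: str,
    machine_type: str = "n1-standard-4",
    accelerator_type: Optional[str] = None,
    accelerator_count: int = 0,
    replica_count: int = 1,
    args: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Describe the compute resources and container for one worker pool."""
    machine_spec: Dict[str, Any] = {"machine_type": machine_type}
    # Only include accelerator fields when a GPU is actually requested.
    if accelerator_type and accelerator_count > 0:
        machine_spec["accelerator_type"] = accelerator_type
        machine_spec["accelerator_count"] = accelerator_count
    return {
        "machine_spec": machine_spec,
        "replica_count": replica_count,
        "container_spec": {"image_uri": image_uri, "args": list(args or [])},
    }


# Hypothetical Artifact Registry image; replace with your own.
IMAGE_URI = "us-central1-docker.pkg.dev/my-project/my-repo/trainer:latest"

spec = build_worker_pool_spec(
    IMAGE_URI,
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
    args=["--epochs", "500"],
)
print(spec)
```

Keeping the compute and container settings in one structure makes it easier to review them before a job is submitted.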
4: Create your training job
With all the settings in place, you can create your training job using either the Google Cloud console, Google Cloud CLI, Vertex AI SDK for Python, or the Vertex AI API.
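With the Vertex AI SDK for Python, creating the job could look roughly like the sketch below. It assumes the google-cloud-aiplatform package is installed, that you have authenticated with Google Cloud, and that the project, bucket, and image values (all placeholders here) point at real resources. The wrapper function and its name are illustrative; the SDK is imported inside it so the definition can be read and loaded without the package.

```python
from typing import Any


def submit_custom_container_job(
    project: str,
    location: str,
    staging_bucket: str,
    image_uri: str,
    display_name: str = "custom-training-sketch",
    machine_type: str = "n1-standard-4",
    replica_count: int = 1,
) -> Any:
    """Create and run a custom container training job, returning run()'s result."""
    # Imported lazily; requires `pip install google-cloud-aiplatform`.
    from google.cloud import aiplatform

    # Set the project, region, and staging bucket used by later SDK calls.
    aiplatform.init(
        project=project, location=location, staging_bucket=staging_bucket
    )

    # Point the job at the training image pushed to Artifact Registry.
    job = aiplatform.CustomContainerTrainingJob(
        display_name=display_name, container_uri=image_uri
    )

    # Blocks until the job finishes; compute settings are passed here.
    return job.run(replica_count=replica_count, machine_type=machine_type)
```

A call might look like submit_custom_container_job("my-project", "us-central1", "gs://my-bucket", "us-central1-docker.pkg.dev/my-project/my-repo/trainer:latest"), with every value swapped for your own. Running it starts billable compute.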
All the models you create will be accessible to you on the Vertex AI dashboard. You can now create an endpoint to serve your model. It is important to note that to use Vertex AI, you must create a billing account. Google Cloud will give you $300 in free credit to use over 90 days, and you won’t be billed automatically once the trial period is over.
Key Takeaways
Vertex AI is a machine-learning platform that provides tools for each step of the ML workflow from data preparation all the way to predictions. It is a great tool for MLOps as it comes with features that allow continuous integration and deployment of ML models to production environments.
Vertex AI is ideal for organizations that want to unify their AI and MLOps teams in one platform and provide them with a common toolset for their work. In doing so, it can help them to create secure, efficient, and accurate ML models much faster and more conveniently.