Thursday, July 18, 2024

The News from Google Cloud Next ‘24

Last year, the world was just beginning to imagine how generative AI technology could transform businesses — and today, that transformation is well underway. At Google, our north star is the same: to make AI helpful for everyone, to improve the lives of as many people as possible. 

A world of Cloud-connected AI-powered agents

With Google Cloud’s entire AI portfolio – infrastructure, Gemini, models, Vertex AI – customers and partners are building increasingly sophisticated AI agents that serve customers, support employees, help create content, and much more.

Great Customer Agents can help your customers interact with your business more seamlessly by working across channels – web, mobile, call center, and point of sale – and in multiple modalities, like text, voice, and more. For example:

  • IHG Hotels & Resorts will launch a generative AI-powered travel planning capability that can help guests easily plan their next vacation.
  • Target is optimizing offers and curbside pickup on the Target app and

Employee Agents help all your employees be more productive and work better together. For example:

  • Etsy uses Vertex AI training to optimize their search recommendations and ads models, delivering better listing suggestions to buyers and increasing sales.
  • Dasa in Brazil is helping physicians detect relevant findings in test results more quickly.

Creative Agents can serve as the best designer and production team, working across images and slides and exploring concepts with you. We provide the most powerful platform and stack to build creative agents. For example:

  • Canva is using Vertex AI to power its Magic Design for Video, helping users build engaging videos in a matter of seconds.
  • Using Vertex AI, Carrefour was able to create dynamic marketing campaigns across various social networks in weeks, not months.

Here’s a look at the product innovations announced at Cloud Next ‘24 to help organizations of all sizes pave new paths forward in the AI era.

Scale with AI-Optimized Infrastructure 

The potential for gen AI to drive rapid transformation is only as powerful as the infrastructure that underpins it. Google Cloud is making key advancements to support customers across every layer of the stack: 

  • A3 Mega: Developed with NVIDIA using H100 Tensor Core GPUs, this new GPU-based instance is generally available and delivers twice the bandwidth per GPU compared to A3 instances, to support the most demanding workloads. Google Cloud is also announcing Confidential A3, which enables customers to better protect the confidentiality and integrity of sensitive data and AI workloads during training and inferencing.
  • NVIDIA HGX B200 and NVIDIA GB200 NVL72: The latest NVIDIA Blackwell platform chips will be coming to Google Cloud in early 2025 in two variations: the HGX B200 and the GB200 NVL72. The HGX B200 is designed for mainstream training and serving, while the GB200 NVL72 powers real-time large language model inference and massive-scale training performance for trillion-parameter-scale models.

  • TPU v5p: Google Cloud is announcing the general availability of TPU v5p, our most powerful, scalable, and flexible AI accelerator for training and inference, with 4X the compute power per pod compared to our previous generation. Google Cloud is also announcing availability of Google Kubernetes Engine (GKE) support for TPU v5p. Over the past year, the use of GPUs and TPUs on GKE has grown more than 900%.

  • AI-optimized storage options: Google Cloud is accelerating training speed with new caching features in Cloud Storage FUSE and Parallelstore, which keep data closer to a customer’s TPU or GPU. Google Cloud is also introducing Hyperdisk ML (in preview), our next-generation block storage service, which accelerates model load times up to 3.7X compared to common alternatives.
  • New options for Dynamic Workload Scheduler: Calendar mode for start time assurance and flex start for optimized economics will help customers ensure efficient resource management for the distribution of complex training and inferencing jobs. 

Google Cloud is also bringing AI closer to where the data is being generated and consumed – to the edge, to air-gapped environments, to Google Sovereign Clouds, and cross-cloud. Google Cloud is enabling AI anywhere through Google Distributed Cloud (GDC), allowing you to choose the environment, configuration, and controls that best suit your organization’s specific needs. For example, leading mobile provider Orange, which operates in 26 countries where local data must be kept in each country, leverages AI on GDC to improve network performance and enhance customer experiences.

Today Google Cloud is announcing a number of new capabilities in GDC, including:

  • NVIDIA GPUs to GDC: Google Cloud is bringing NVIDIA GPUs to GDC for both connected and air-gapped configurations. Each of these will support new GPU-based instances to run AI models efficiently.
  • GKE on GDC: The same GKE technology leading AI companies are using on Google Cloud will be available in GDC. 
  • Support for AI Models: Google Cloud is enabling a variety of open AI models, including Gemma, Llama, and more, to run on GDC in air-gapped and connected edge environments.
  • Vector Search on GDC: Google Cloud is also bringing the power of Vector Search to allow search and information retrieval on GDC for your private and sensitive data with extremely low latency. 
  • Sovereign Cloud: For the most stringent regulatory requirements, Google Cloud delivers GDC in a fully air-gapped configuration with local operations and full survivability, managed by Google or through a partner of your choice. You have complete control, and when regulations change, we have the flexibility to help you respond quickly.

While not every workload is an AI workload, every workload you run in the cloud needs optimization and each application has unique technical needs. That’s why Google Cloud is introducing new, general-purpose compute options that help customers maximize performance, enable interoperability between applications, and meet sustainability goals, all while lowering costs. 

  • Google Axion: Google Cloud’s first custom Arm®-based CPU designed for the datacenter, which delivers up to 50% better performance and up to 60% better energy efficiency than comparable current-generation x86-based instances.
  • Google Cloud is also announcing N4 and C4, two new machine series in our general purpose VM portfolio; native bare-metal machine shapes in the C3 Machine Family; the general availability of Hyperdisk Advanced Storage Pools, and much more.

Google Cloud is also expanding data residency for data stored at rest for Generative AI on Vertex AI services to 11 new countries: Australia, Brazil, Finland, Hong Kong, India, Israel, Italy, Poland, Spain, Switzerland, and Taiwan.

  • Joining 10 other countries announced last year, these new regions give customers more control over where their data is stored and how it is accessed, making it easier for customers to satisfy regulatory and security requirements around the world.
  • Additionally, customers can now limit machine learning processing to the United States or European Union when using Gemini 1.0 Pro and Imagen.

Create Agents with Vertex AI

Google Cloud offers more than 130 first- and third-party models on Vertex AI, and we’re expanding access to a variety of models so customers have the most choice when it comes to model selection:

  • Gemini 1.5 Pro: Gemini 1.5 Pro offers two context window sizes – 128K tokens and 1 million tokens – and is now available in public preview. Customers can process vast amounts of information in a single stream, including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code, or over 700,000 words.
  • Claude 3: Anthropic’s new family of state-of-the-art models is now generally available for customers on Vertex AI.
  • CodeGemma: Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. CodeGemma, a new fine-tuned version of Gemma designed for coding use cases such as code generation and code assistance, is now available on Vertex AI.
  • Imagen 2: Google Cloud’s most advanced text-to-image technology boasts a variety of image generation features to help businesses create images that match their specific brand requirements. A new text-to-live-image capability allows marketing and creative teams to generate animated images such as GIFs, which are equipped with safety filters and digital watermarks. In addition, we are announcing the general availability of advanced photo editing features, including inpainting and outpainting, and much more.
  • Digital Watermarking: Powered by Google DeepMind’s SynthID, digital watermarking is generally available today for AI-generated images produced by Imagen 2.
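
To make the Gemini 1.5 Pro context window figures above more concrete, here is a minimal planning sketch. It assumes the rough ratio implied by the announcement (about 700,000 words per 1 million tokens) and treats "128K" as 128,000 tokens; both are assumptions for illustration, and the helper names are hypothetical. Real token counts depend on the model's tokenizer, so treat the result as a rough estimate rather than a guarantee.

```python
from typing import Optional

# Assumption: ratio implied by the announcement (~700,000 words per 1,000,000 tokens).
# Actual token counts depend on the tokenizer and the text itself.
WORDS_PER_MILLION_TOKENS = 700_000

# Assumption: "128K" is treated as 128,000 tokens for this estimate.
CONTEXT_WINDOWS = {"128K": 128_000, "1M": 1_000_000}


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a word count, rounding up (integer math only)."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Ceiling division without floats: -(-a // b) == ceil(a / b) for positive b.
    return -(-word_count * 1_000_000 // WORDS_PER_MILLION_TOKENS)


def smallest_window(word_count: int) -> Optional[str]:
    """Return the name of the smallest window that fits the estimate, or None."""
    tokens = estimate_tokens(word_count)
    for name, size in sorted(CONTEXT_WINDOWS.items(), key=lambda item: item[1]):
        if tokens <= size:
            return name
    return None


if __name__ == "__main__":
    for words in (50_000, 300_000, 800_000):
        print(words, "words ->", estimate_tokens(words), "tokens ->", smallest_window(words))
```

Under these assumptions, a 50,000-word document fits the 128K window, while anything much beyond roughly 89,600 words would call for the 1 million token window.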

Customers get far more from their models when they augment and ground them with enterprise data. Today, we are expanding Vertex AI grounding capabilities in two ways:

  • Google Search: Grounding models in Google Search combines the power of Google’s latest foundation models with access to fresh, high-quality information to significantly improve the completeness and accuracy of responses.
  • Your Data: Ground on data from enterprise applications, like Workday or Salesforce, and easily connect Google’s databases, like AlloyDB and BigQuery.

Once you have chosen the right model and tuned and grounded it, Vertex AI can also help you deploy, manage, and monitor your models. Today, Google Cloud is announcing additional MLOps capabilities:

  • Prompt Management tools: These tools let you collaborate on built-in prompts with notes and status, track changes over time, and compare the quality of responses from different prompts. 
  • Automatic side-by-side: Now generally available, Auto SxS provides explanations of why one response outperforms another and certainty scores, which helps users understand the accuracy of the evaluation.  
  • Rapid Evaluation feature: Now in preview, this helps customers quickly evaluate models on smaller data sets when iterating on prompt design. 

Finally, Vertex AI Agent Builder brings together foundation models, Google Search, and other developer tools to make it easy for you to build and deploy agents. It provides the convenience of a no-code agent builder console alongside powerful grounding, orchestration, and augmentation capabilities. With Vertex AI Agent Builder, you can now quickly create a range of gen AI agents, grounded with Google Search and your organization’s data.

Accelerate development

Gemini Code Assist is our enterprise-focused AI code assistance solution. To support developers, Google Cloud is announcing: 

  • Gemini 1.5 Pro in Gemini Code Assist: This upgrade brings a massive 1 million token context window, revolutionizing coding for even the largest projects. Gemini Code Assist now delivers even more accurate code suggestions, deeper insights, and streamlined workflows. 
  • Gemini Cloud Assist: This provides AI assistance across your application lifecycle, making it easier to design, secure, operate, troubleshoot, and optimize the performance and costs of your application.  

Google Cloud deployed Gemini Code Assist to a group of developers inside Google and found they had more than 40% faster completion time for common dev tasks and spent 55% less time writing new code. Gemini Code Assist also supports private codebases wherever they live – on premises, GitHub, GitLab, Bitbucket, or even multiple locations.

Unlock the Potential of AI with Data

Google Cloud lets you combine the best of AI with your grounded enterprise data, while keeping your data private and secure. Today, we’re announcing new enhancements to help organizations build great data agents:  

  • Gemini in BigQuery: Gemini in BigQuery uses AI to help your data teams with data preparation, discovery, analysis and governance. Combined with this, you can build and execute data pipelines with our new BigQuery Data Canvas, which provides a new notebook-like experience with natural language and embedded visualizations, both available in preview. 
  • Gemini in Databases: This makes it easy for you to migrate data safely and securely from legacy systems, for example, converting your database to a modern cloud database like AlloyDB. 
  • Gemini in Looker: We’re introducing new capabilities, currently in preview, that allow your data agents to easily integrate with your workflows. We have also added new gen AI capabilities to enable you to chat with your business data, and it is integrated with Google Workspace.

Improve your Cybersecurity Posture with AI-Driven Capabilities

Gen AI has the potential to tip the balance in favor of defenders, with Security Agents providing help across every stage of the security lifecycle. Innovations in Google Cloud’s security portfolio that deliver stronger security outcomes and enable every organization to make Google a part of their security team include:

  • Gemini in Threat Intelligence: Uses natural language to deliver deep insight about threat actor behavior. With Gemini, we are able to analyze much larger samples of potentially malicious code. Gemini’s larger context window allows for analysis of the interactions between modules, providing new insight into code’s true intent.
  • Gemini in Security Operations: A new assisted investigation feature converts natural language to detections, summarizes event data, recommends actions to take, and navigates users through the platform via conversational chat.

Supercharge Productivity with Google Workspace

With Gemini for Workspace, businesses have an AI-powered agent that’s built right into Gmail, Docs, Sheets and more. Today Google Cloud is announcing the next wave of innovations and enhancements to Gemini for Google Workspace, including:

  • Google Vids: This new AI-powered video creation app for work is your video writing, production, and editing assistant, all in one. It can generate a storyboard you can easily edit, and after choosing a style, it pieces together your first draft with suggested scenes from stock videos, images, and background music. It can also help you land your message with the right voiceover – either choosing one of our preset voiceovers or using your own. Vids will sit alongside other Workspace apps like Docs, Sheets, and Slides. It includes a simple, easy-to-use interface and the ability to collaborate and share projects securely from your browser. Vids is being released to Workspace Labs in June.
  • AI Meetings and Messaging add-on: With “take notes for me”, chat summarization, and real-time translation in 69 languages (equal to 4,600 language pairs), this collaboration tool will only cost $10 per user, per month. 
  • New AI Security add-on: Workspace admins can now automatically classify and protect sensitive files and data using privacy-preserving AI models and Data Loss Prevention controls trained for their organization. The AI Security add-on is available for $10 per user, per month and can be added to most Workspace plans.

Blog post: Welcome to Google Cloud Next ‘24 by Thomas Kurian, CEO of Google Cloud