Summary: In this blog post, SADA Associate CTO of AI/ML Simon Margolis tackles recent reports about Google’s release of the generative AI tool Gemini. Learn how Google gives developers the means to adjust model output, confront bias in GenAI, and prevent potentially offensive content.
Google Gemini: Causing offense or creating opportunity?
By now, you may have seen the various image generation “gaffes” from the latest generation of Google’s Gemini (formerly Bard) service. Some have speculated that Gemini has been specifically designed to confront bias and encourage diversity, equity, and inclusion in its generated images, and possibly in its text responses.
While I agree that image responses depicting dark-skinned people for a prompt of “Nazis,” or dark-skinned women for a prompt of “the pope,” represent a form of AI hallucination, I want to spend some time discussing why these hallucinations are not indicative of a larger problem with Google’s models, but rather point to an opportunity for enterprises to adapt these technologies to their own needs.
Improving at the speed of GenAI
Before we get into the weeds of how generative AI models, products, and fine-tuning work, let’s pause and consider what it means to work with technologies that are evolving so quickly in front of our eyes. In decades past, long development cycles meant that companies could not afford missteps with product launches; today, the sheer rate of iteration that’s possible means that companies that insist on perfection run the risk of missing the boat. The accelerated capacity to pivot and refine in real time means that public reaction essentially becomes part of the QA process.
Sorting through Gemini’s names
I think it’s important to discuss the differences between Gemini, Gemini, and Gemini. Naming is famously one of the hardest things to do in programming, and product names have evolved quickly in the generative AI era.
The product that most people associate with “Google Gemini” is the product formerly known as “Google Bard.” This AI assistant chatbot is a product, not a model. That distinction is critical for understanding the results circulating on the Internet. Google’s Gemini assistant is a product composed of several different components stitched together and given complex instructions. One such component is a foundation model that shares a similar name: Gemini 1.0.
Unlike Gemini the chatbot, Gemini 1.0 the foundation model is a large language model (LLM) and not a product unto itself. The Gemini model plays an important role in the Gemini product, but the Gemini chatbot that has been making the rounds on social media represents the result of significant development on top of the model itself.
Whether or not you agree with the Google Gemini chatbot’s opinions, they are the direct result of the instruction tuning, prompt engineering, and other directions given to the application by its developers. Gemini 1.0 the model does what it’s told; think of it as a mirror. These opinions, grounding data, and guidelines are not part of the model, but rather instructions given to the model that lead it to behave the way it does. This means that organizations developing solutions on top of models like Gemini 1.0 (or other models from Google and other organizations) can control the behaviors, biases, and truths for their specific use cases, an important part of developing AI solutions.
Explicitly defining GenAI behaviors
The ability to control biases, opinions, and “grounding data” is a critical part of building with LLMs. With Google’s Vertex AI Studio, developers can explicitly define these behaviors by guiding the model (“You are a customer service chatbot for Superfast Airlines. You must always be kind and helpful. You must never recommend a customer book a flight with another airline.”) and by defining sources of truth, or “grounding data.”
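To make that concrete, here’s a minimal sketch of how a developer might pass that kind of system instruction to a Gemini model through the Vertex AI SDK for Python. The project, location, and model ID are placeholders, and support for system instructions can vary by model version and SDK release, so treat this as an illustration rather than production code.

```python
# A minimal sketch: steering model behavior with a system instruction via
# the Vertex AI SDK for Python. Project, location, and model ID are
# placeholders; exact parameter support may vary by model and SDK version.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="us-central1")

model = GenerativeModel(
    "gemini-1.0-pro",  # placeholder model ID
    system_instruction=[
        "You are a customer service chatbot for Superfast Airlines.",
        "You must always be kind and helpful.",
        "You must never recommend a customer book a flight with another airline.",
    ],
)

response = model.generate_content("My flight was canceled. What are my options?")
print(response.text)
```

The same model, given a different system instruction, would behave entirely differently, which is exactly the mirror-like quality described above.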
On top of those instructions and guidelines, developers can also provide the facts for the model to use. In the example of an airline customer service chatbot, the developers can supply standard operating procedures for common issues and their resolutions. By doing this, the LLM will not try to guess at or hallucinate responses to customer complaints, but rather will use the provided facts when communicating with the end user.
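As a rough illustration of that idea, the sketch below hands the model a hypothetical standard operating procedure as grounding context and instructs it to answer only from that material. The SOP text and helper function are invented for this example; in practice the facts would typically come from a managed grounding source or document store rather than a hard-coded string.

```python
# A minimal sketch: answering from supplied facts instead of letting the
# model guess. The SOP text and helper below are hypothetical examples.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="us-central1")
model = GenerativeModel("gemini-1.0-pro")  # placeholder model ID

SOP_DELAYED_BAGGAGE = """\
Superfast Airlines SOP 4.2 - Delayed baggage:
1. Apologize and open a baggage trace using the customer's claim number.
2. Reimburse essentials up to $100 per day for the first 3 days.
3. Escalate to the baggage desk if the bag is not located within 5 days.
"""

def build_grounded_prompt(customer_message: str) -> str:
    # In a real system the relevant procedure would be retrieved from a
    # document store or grounding tool, not hard-coded.
    return (
        "Answer the customer using ONLY the procedure below. If the "
        "procedure does not cover the question, say you will escalate "
        "to a human agent.\n\n"
        f"Procedure:\n{SOP_DELAYED_BAGGAGE}\n"
        f"Customer: {customer_message}"
    )

response = model.generate_content(
    build_grounded_prompt("My bag didn't arrive. What can you do for me?")
)
print(response.text)
```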
How enterprises actually use GenAI
While Google’s recent negative publicity reflects many people’s disagreement with the grounding data and guidelines built into the Google Gemini product, the product’s ability to hold these opinions consistently and with great specificity is a fantastic example of the power of solutions customized on top of the Gemini 1.0 model.
Beyond these capabilities for customization on top of a model like Gemini 1.0, it’s also important to understand how enterprises are using AI. While chatbots are fun to play with and do a great job of exciting the imagination around the power of generative AI, the reality is that most enterprise business users are not making use of this technology for the sake of having a witty conversation with a virtual companion, nor are they asking for opinions about which dictator is better than another.
Maintaining control over GenAI outputs
Businesses use this technology to perform tasks that are poor uses of human time, such as summarizing large volumes of complex data, discovering patterns and trends in videos and images, automating processes, and handling repetitive tasks like data entry. When using generative AI models like Gemini 1.0, enterprises can complete these objectives while still maintaining tight control over their outputs.
Google’s Vertex AI platform was developed long before the generative AI boom and has always been focused on curating underlying data, controlling outputs, testing for consistency, and ensuring the maximum number of knobs and levers for developers to mold the models to their desired outcomes.
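As one illustration of those knobs and levers, the hedged sketch below sets a generation config that constrains randomness and output length, plus safety settings that tighten filtering of harmful content, again via the Vertex AI SDK for Python. Exact parameter and enum names can differ across SDK versions, and the model ID is a placeholder.

```python
# A minimal sketch: two of Vertex AI's output controls, a generation
# config and safety settings. Names may vary by SDK version; the model
# ID is a placeholder.
import vertexai
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    HarmBlockThreshold,
    HarmCategory,
)

vertexai.init(project="my-gcp-project", location="us-central1")
model = GenerativeModel("gemini-1.0-pro")  # placeholder model ID

response = model.generate_content(
    "Summarize this quarter's top three customer complaint themes.",
    generation_config=GenerationConfig(
        temperature=0.2,        # lower temperature for more consistent output
        max_output_tokens=512,  # cap response length
    ),
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
)
print(response.text)
```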
Placing control of GenAI in developers’ hands
Even amid the hype around generative AI, Google has made sure that these controls remain in developers’ hands, understanding that the stakes are even higher as these technologies reach the mainstream. Vertex AI continues to provide increasing amounts of control via tools like model data grounding in Vertex AI Studio, fine-tuning of foundation models, and citations for the source materials used to generate responses in Vertex AI Search.
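To show what that grounding and citation story can look like in code, here’s a hedged sketch that attaches a Vertex AI Search data store to a request so the response is grounded in an organization’s own documents, then inspects the grounding metadata returned with the answer. The project, data store path, and model ID are placeholders, and in some SDK versions these grounding helpers live under a preview namespace.

```python
# A minimal sketch: grounding a Gemini request in a Vertex AI Search data
# store and inspecting the grounding metadata returned with the answer.
# Project, data store path, and model ID are placeholders; in some SDK
# versions these helpers live under the 'preview' namespace.
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="my-gcp-project", location="us-central1")

DATA_STORE = (
    "projects/my-gcp-project/locations/global/"
    "collections/default_collection/dataStores/my-policy-docs"
)
retrieval_tool = Tool.from_retrieval(
    grounding.Retrieval(grounding.VertexAISearch(datastore=DATA_STORE))
)

model = GenerativeModel("gemini-1.0-pro")  # placeholder model ID
response = model.generate_content(
    "What is our rebooking policy for weather delays?",
    tools=[retrieval_tool],
)

print(response.text)
# Grounding metadata, when present, points back at the source documents
# that supported the answer.
print(response.candidates[0].grounding_metadata)
```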
Despite the recent negative publicity, I’m convinced that Google’s commitment to transparency and to putting control into the hands of the organizations using its technology is stronger than ever. As enterprise adoption of generative AI expands, these foundational tools will only grow in significance and will help ensure that businesses can rely on their generative AI-based solutions to perform as expected.
And while we continue to see social media clicks focused on the products with the biggest shock factor, whether that be Gemini’s image generation hallucinations or Grok’s unhinged, snarky replies, I believe the most widespread use cases will continue to depend on the enterprise foundations Google has continued to invest in.
Generative AI is a technological advancement of such magnitude that it requires us to rethink our relationship to software. As we come to understand certain headline-grabbing outputs as reflections of the nature of our input rather than features inherent to these platforms, we’ll get better at using these tools for positive and constructive purposes.
Get started with generative AI with SADA
For more, download the Gartner report on 5 key steps to pilot Generative AI. And when your organization is ready to develop a custom generative AI strategy, don’t hesitate to contact SADA to schedule a discovery call. SADA AI experts will work with your team to develop a strategy that’s right for your unique business, industry, and regulatory environment. We look forward to hearing about your AI vision and helping you make it happen.