I’m pretty sure most of you have already used ChatGPT. That’s great because you’ve taken your first step on a journey we’re about to embark on! You see, when it comes to mastering any new technology, the first thing you do is use it. It’s like learning to swim by jumping into the water!
You might have heard of model consumers, model tuners, and model builders. But hang on, we're about to break it down even further.
McKinsey looks at these as takers, shapers, and makers, a framing it presented in its GenAI Recognise session.
We will take a closer look at each of these layers in this article.
To dig even deeper into this, we'll turn to a real-life example that'll make everything crystal clear. In today's tech landscape, it's a given that most apps need to work on multiple platforms. However, here's the catch: each platform has its own unique interface and peculiarities. Extending an application to additional platforms, and then maintaining it across all of them, is equally challenging.
But that’s where GenAI swoops in to save the day. It empowers us to create a unified and user-friendly interface for our applications, regardless of the platforms they cater to. The magic ingredient? Large Language Models (LLMs) transform this interface into a natural and intuitive language.
To make it more specific for even better understanding, let's say we want to know the exact command to run for different scenarios on our machine, which could be running Linux, Windows, or macOS. The following diagram illustrates one scenario:
As an end user, you don’t have to learn/know commands for each of these platforms and can get your things done naturally and intuitively. As a developer of the application, you don’t have to explicitly translate each of the user-facing application interfaces into each of the underlying supported platforms.
Several LLMs, including GPT-3, GPT-3.5, and GPT-4, reside in the cloud, courtesy of various providers such as OpenAI and Azure OpenAI. They are made easily accessible through APIs such as completion, chat completion, etc.
AI orchestrators make this access even more seamless and uniform across models and providers. That is why GenAI applications these days typically interact with an AI orchestrator instead of directly with the underlying providers and models. The orchestrator then handles the routing to one or more configurable underlying providers and models, as required by the application.
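The orchestrator idea can be sketched in a few lines: the application codes against one completion interface, and concrete providers plug in behind it. All class and provider names below are illustrative stand-ins, not part of any real SDK:

```python
# Minimal sketch of the orchestrator idea: one interface for the app,
# swappable providers behind it. Names here are hypothetical.
from abc import ABC, abstractmethod


class TextCompletionService(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...


class FakeOpenAIService(TextCompletionService):
    def complete(self, prompt: str) -> str:
        return f"[openai] completion for: {prompt}"


class FakeAzureService(TextCompletionService):
    def complete(self, prompt: str) -> str:
        return f"[azure] completion for: {prompt}"


class Orchestrator:
    """Routes requests to whichever provider is configured."""

    def __init__(self):
        self._services = {}

    def register(self, name: str, service: TextCompletionService) -> None:
        self._services[name] = service

    def complete(self, provider: str, prompt: str) -> str:
        return self._services[provider].complete(prompt)


orchestrator = Orchestrator()
orchestrator.register("openai", FakeOpenAIService())
orchestrator.register("azure", FakeAzureService())
print(orchestrator.complete("azure", "Get my IP"))
```

Swapping the model or provider then becomes a configuration change in the application rather than a code change.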
You can have a plugin for each of the platforms your application wants to support for flexibility and modularity. We will deep dive into all the things we can do with these plugins and orchestrators in the sections that follow.
Finally, the application has connector(s) to interact with platforms it wants to support to execute the commands generated by GenAI.
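For a local shell, such a connector can be very thin. Here is a hedged sketch (the helper name is illustrative; a real connector should validate or sandbox the generated command before running it):

```python
# Sketch of a connector that executes a GenAI-generated command locally.
# Illustrative only: a real connector must validate/sandbox commands
# before executing anything a model produced.
import subprocess


def run_generated_command(command: str) -> str:
    """Run a shell command and return its stripped stdout."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


print(run_generated_command("echo hello"))
```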
There are numerous settings in the configuration itself that you can tune to achieve the desired results. Here is a typical config.json from a Semantic Kernel plugin:
{
  "schema": 1,
  "description": "My Application",
  "type": "completion",
  "completion": {
    "max_tokens": 300,
    "temperature": 0.0,
    "top_p": 0.0,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "stop_sequences": [
      "++++++"
    ]
  },
  "input": {
    "parameters": [
      {
        "name": "input",
        "description": "Command Execution Scenario",
        "defaultValue": ""
      }
    ]
  }
}
The "type" specifies the API you want to invoke on the underlying LLM; here we are using the "completion" API. The "temperature" controls the variability, or creativity, of the model. While chatting, for example, you may want the AI to phrase the same intent differently at different times to keep the conversation engaging; here, however, we always want the same precise answer, so we use a value of 0. If your result consists of several sections with predefined separators and you want only the first section returned as the response, such as the exact matching command in our case, you make use of "stop_sequences" as shown here. Finally, you define your input with all its parameters, only one in this case.
Now let’s dive into much talked about prompt engineering and how we can leverage it.
System messages tell the model how exactly we want it to behave. For example, the Linux bash plugin in our case might have something like the following at the beginning of its skprompt.txt:
You are a helpful assistant that generates commands for Linux bash machines based on user input. Your response should contain ONLY the command and NO explanation. For all the user input, you will only generate a response considering the Linux bash commands to find its solution.
This is its system message.
Giving the model a few examples of questions and the corresponding answers you are looking for helps it produce the exact answer you want. This is called few-shot prompting. For example, our Linux bash plugin might have something like the following in its skprompt.txt, after the system message mentioned above:
Examples
User: Get my IP
Assistant: curl ifconfig.me
++++++
User: Get the weather in San Francisco
Assistant: curl wttr.in/SanFrancisco
++++++
User:"{{$input}}"
Assistant:
You may want to tune the examples/shots you pick so that they yield your desired result.
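Under the hood, the skprompt.txt above is just a template: the system message, the example shots separated by the stop sequence, and finally the user's input. A small sketch of how such a prompt could be assembled (the helper function is illustrative, not part of Semantic Kernel):

```python
# Assemble a few-shot prompt like the skprompt.txt above.
# The build_prompt helper is illustrative, not a Semantic Kernel API.
SYSTEM_MESSAGE = (
    "You are a helpful assistant that generates commands for Linux bash "
    "machines based on user input. Your response should contain ONLY the "
    "command and NO explanation."
)

SHOTS = [
    ("Get my IP", "curl ifconfig.me"),
    ("Get the weather in San Francisco", "curl wttr.in/SanFrancisco"),
]

STOP_SEQUENCE = "++++++"


def build_prompt(user_input: str) -> str:
    parts = [SYSTEM_MESSAGE, "Examples"]
    for question, command in SHOTS:
        parts.append(f"User: {question}\nAssistant: {command}\n{STOP_SEQUENCE}")
    parts.append(f'User:"{user_input}"\nAssistant:')
    return "\n".join(parts)


prompt = build_prompt("List files in the current directory")
print(prompt)
```

Because the stop sequence also appears after each shot, configuring it in "stop_sequences" ensures the model's answer is cut off right after the first generated command.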
We will put together this configuration and prompt engineering in our simple example and see how we can manage AI orchestration in semantic kernel.
import argparse
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion

parser = argparse.ArgumentParser(description='GANC')
parser.add_argument('platform', type=str,
                    help='A platform needs to be specified')
parser.add_argument('--verbose', action='store_true',
                    help='is verbose')
args = parser.parse_args()

# Set up the kernel with an Azure OpenAI text completion service
kernel = sk.Kernel()
deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()
kernel.add_text_completion_service("dv", AzureTextCompletion(deployment, endpoint, api_key))

# Load the plugin for the specified platform and run the user's query through it
platformFunctions = kernel.import_semantic_skill_from_directory("./", "platform_commands")
platformFunction = platformFunctions[args.platform]

user_query = input()
response = platformFunction(user_query)
print(response)
This Python script takes ‘platform’ as a required argument. It picks up the right plugin from the folder ‘platform_commands’ for the specified platform. It then takes the user query, invokes the function, and returns the response.
For your first few use cases, you may want to experiment only up to this point, since LLMs already pack a lot of intelligence. This simple configuration and prompt engineering alone can get you very close to your desired behavior, and very quickly.
The following techniques are rather advanced at this time, require more effort and knowledge, and should be employed after weighing the return on investment. The technology is still evolving and maturing in this space. We will only take a cursory look at them here, for completeness and awareness of what lies ahead.
Fine-tuning involves updating the weights of a pre-trained language model on a new task and dataset. It is typically used for transfer learning, customization, and domain specialization. There are several tools and techniques available for this. One way to do this is using OpenAI’s CLI tools. You can give it your data and generate training data for fine-tuning with commands like:
openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
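The prepare_data tool works with (or converts your file into) JSONL prompt/completion pairs. A small sketch that writes such a file from our command examples (the file name and examples are illustrative):

```python
# Write fine-tuning data as JSONL prompt/completion pairs, the format
# used by the legacy fine-tunes workflow. File name is illustrative.
import json

examples = [
    {"prompt": "Get my IP", "completion": " curl ifconfig.me"},
    {"prompt": "Get the weather in San Francisco",
     "completion": " curl wttr.in/SanFrancisco"},
]

with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```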
Then you can create a custom model using Azure AI Studio, providing the fine-tuning data that you prepared earlier.
If you are brave enough to dive deeper and experiment further read on! We will look at how to build our custom models.
This is very similar to the fine-tuning that we saw earlier. Here is how we can do it using Hugging Face Transformers:
from transformers import AutoTokenizer

# Prepare your data
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# let's say my dataset is loaded into my_dataset
tokenized_datasets = my_dataset.map(tokenize_function, batched=True)

# Load your model
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

# Train
from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(output_dir="mydir")
trainer = Trainer(model=model, args=training_args, train_dataset=tokenized_datasets)
trainer.train()

# Save your model; it can be loaded later by pointing to the saved directory
trainer.save_model()
Here you can start with some known model structures and train them from scratch. It will take a lot of time, resources, and training data, though the resulting model is completely in your control.
You can define your own model structure, potentially improving on existing models, and then follow the process above. Amazon's Titan and CodeWhisperer fall into this category.
GenAI holds immense potential for diverse use cases. This article exemplified its application in multi-platform support and quick solution building. While skepticism surrounds GenAI, the path to harnessing its power is clear. However, the journey becomes intricate when delving into model tuning and training.
Key Takeaways:
A. I don't think so. You can pick your favorite use case and try it yourself, employing the steps laid out in this article, to answer this question for yourself!
A. There are end users, model consumers, model tuners, and model builders.
A. They are LLMs, providers, and AI orchestrators.
A. It is the GPUs, TPUs, and cloud hosting services like OpenAI, Azure OpenAI, etc.
I am a seasoned software engineer with over 27 years of industry experience, having worked with prominent companies in India and the United States. My educational background includes being an alumnus of the computer science departments at IIT Bombay and Georgia Tech.
Currently, I am the Principal Architect at GS Lab | GAVS. At GS Lab | GAVS, we are deeply involved in pioneering the Generative AI (GenAI) field. We’ve dedicated substantial effort to developing a structured approach to tackle the challenges in this exciting domain. I invite you to visit our website to stay updated with our latest endeavors and innovations.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.