Since the debut of its GPT family of models, OpenAI has released a stream of generative AI systems and Large Language Models, including ChatGPT, its conversational AI. Following the success of conversational language models, developers have turned to Large Language Models that can write code or assist developers in writing it. Many companies, including OpenAI, are researching such LLMs, which would help developers build applications faster by having the models understand programming languages. Google's entry is Codey, a fine-tuned version of PaLM 2 capable of performing a variety of coding tasks.
This article was published as a part of the Data Science Blogathon.
Codey is one of the foundation models Google built and released recently. It is a fine-tuned version of the PaLM 2 Large Language Model, trained further on a large corpus of high-quality code and coding documentation. Google claims that Codey can code in more than 20 programming languages, including Python, C, JavaScript, and Java, and it has been used to enhance Google products like Google Colab and Android Studio.
Codey is built for three purposes. The first is code completion: Codey analyzes the code you are writing and makes valuable, context-aware suggestions. The second is code generation: given a prompt, Codey can generate complete, working code in the requested language. The third is code chat: you can provide your code to Codey and converse with the model about it. Codey is now available to the general public through Vertex AI on the Google Cloud Platform.
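For quick reference, each of these three capabilities is served by its own published model ID on Vertex AI (the same IDs used later in this article):

```python
# Codey capability -> Vertex AI model ID serving it
CODEY_MODELS = {
    "code generation": "code-bison@001",
    "code completion": "code-gecko@001",
    "code chat": "codechat-bison@001",
}

for task, model_id in CODEY_MODELS.items():
    print(f"{task}: {model_id}")
```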
To work with Google’s Codey, we must have an account on the Google Cloud Platform. GCP hosts a service called Vertex AI, which provides all the models developed by Google as well as open-source models fine-tuned by Google. Google has recently made its newly announced foundation models available there, including PaLM 2, Codey, Chirp, and Imagen.
After creating an account on the Google Cloud Platform, we must enable the Vertex AI API before we can work with Vertex AI. Go to APIs & Services -> Library and search for the Vertex AI API, shown in the first picture below. Click on it, then click the blue “Enable API” button; the page will then look similar to the second picture.
With the API enabled, we can work with any of the AI services Google provides, including Google’s foundation models like Chirp, Imagen, and Codey.
This section looks at code generation with the Codey model. The prerequisite is enabling the Vertex AI API in GCP, which we have already done. The code walkthrough takes place in Google Colab. Before getting to the code, we must install the packages necessary to work with Vertex AI, which we do through pip.
!pip install shapely
!pip install "google-cloud-aiplatform>=1.27.0"
The shapely and google-cloud-aiplatform packages are the only two required to start working with the Codey model. Now we will import the packages and authenticate our Google account, so Colab can use our GCP credentials to run the Codey model from Vertex AI.
from google.colab import auth as google_auth
google_auth.authenticate_user()

import vertexai
from vertexai.preview.language_models import CodeGenerationModel

# Initialize Vertex AI with your GCP project ID and region
vertexai.init(project="your_project_id", location="us-west1")

# Lower temperature -> more deterministic code output
parameters = {
    "temperature": 0.3,
    "max_output_tokens": 1024,
}
We will take this imported model, i.e., the CodeGenerationModel, and test it by passing a prompt.
code_model = CodeGenerationModel.from_pretrained("code-bison@001")
response = code_model.predict(
    prefix="""Write a code in Python to count the occurrence of the
    word "rocket" from a given input sentence using Regular Expressions""",
    **parameters,
)
print(f"Response from Model: {response.text}")
The output for this code can be seen below.
We get Python code as the output for the prompt we provided. The model has written a Python script matching our query. The only way to test it is to copy the response, paste it into another cell in Colab, and run it. Here we see the output of doing exactly that.
The sentence we provided when running the code is “We have launched our first rocket. The rocket is built with 100% recycled material. We have successfully launched our rocket into space.” The output correctly states that the word “rocket” occurs three times. In this way, Codey’s CodeGenerationModel can produce quick working code from simple prompts to the Large Language Model.
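For readers who cannot run the notebook, here is a minimal sketch of what the generated script looked like (a reconstruction, not Codey's verbatim output):

```python
import re

def count_word(sentence: str, word: str) -> int:
    # \b word boundaries ensure only whole-word matches are counted,
    # so e.g. "rockets" would not match "rocket"
    return len(re.findall(rf"\b{re.escape(word)}\b", sentence))

sentence = (
    "We have launched our first rocket. The rocket is built with "
    "100% recycled material. We have successfully launched our "
    "rocket into space."
)
print(count_word(sentence, "rocket"))  # 3
```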
The Code Chat function lets us interact with Codey about our code. We provide the code to Codey and chat with the model about it: to better understand how the code works, to ask for alternative approaches (which Codey can propose by examining the current code), or, if we face an error, to supply both the code and the error message so Codey can suggest a fix. For this, we navigate to Vertex AI in GCP, then to the Language section under Generative AI Studio, which can be seen below.
Earlier we saw how to work with code generation through Python and the Vertex AI API; now we will take a no-code approach and perform this kind of task directly through the GCP console. To chat with Codey about our code, we choose the Code Chat option in the center, inside the blue box. Clicking on it takes us to the interface below.
Here, we see that the model we will use is “codechat-bison@001”. Now we will introduce an error into the Regular Expression code generated earlier, then give both the broken code and the resulting error to Code Chat to see whether the model corrects it. In the Python regex code, we replace re.findall() with re.find() and run the code, which produces the following error.
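The failure can also be sketched directly: Python's re module has no find() function, so the edited line raises an AttributeError.

```python
import re

# The deliberate bug: re.find() does not exist in Python's re module;
# the correct function is re.findall()
try:
    re.find(r"\brocket\b", "The rocket has launched.")
except AttributeError as err:
    print(f"AttributeError: {err}")
```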
In the output we see an error at the re.find() call. We now paste this modified code and the error message into the “Enter a prompt to begin a conversation” box. As soon as we hit Enter, we get the following output.
We see that the Codey model has analyzed our code, pinpointed the error, and even provided corrected code for us to work with. In this way, Code Chat can identify and fix errors, explain code, and suggest best coding practices.
In this article, we looked at one of Google’s recently announced foundation models: Codey, a fine-tuned version of PaLM 2, Google’s homegrown generative Large Language Model. Codey is fine-tuned on a large corpus of high-quality code, allowing it to write code in more than 20 programming languages, including Python, Java, and JavaScript. Codey is readily available through Vertex AI, which we can access either from the GCP console or programmatically through the Vertex AI API; we have seen both methods in this article.
Some of the key takeaways from this article include:
- Codey is a fine-tuned version of PaLM 2, trained on a large corpus of high-quality code, and supports more than 20 programming languages.
- Codey serves three tasks: code generation, code completion, and code chat.
- Codey is accessible through Vertex AI on GCP, both from the console (Generative AI Studio) and programmatically via the Vertex AI API.
- Code Chat can analyze broken code together with its error message and suggest a corrected version.
Q1. Can Codey generate code from just a prompt?
A. Absolutely. You only need to provide a prompt stating what code you want and in which language. Codey’s code generation will then use this prompt to generate code in your desired language for the application you described.
Q2. Is Codey based on PaLM 2?
A. Yes. The Codey foundation model is a fine-tuned version of PaLM 2, trained further on a vast dataset of code in different languages.
Q3. What is Codey capable of?
A. Codey is mainly capable of three things: code generation from a given prompt; code completion, where the model looks at the code you are writing and provides useful suggestions; and code chat, where you provide your code (and an error, if any) and converse with the model about it.
Q4. Is Codey the same as GitHub Copilot?
A. They are not the same but are similar in some ways. GitHub Copilot is based on an OpenAI model and offers code auto-completion and suggestions. Codey can do this as well, but it also offers Code Chat, which lets users ask the model questions about their code.
Q5. Which models does Codey include?
A. At present, Codey comprises three models: codechat-bison@001 for code chat, code-gecko@001 for code completion, and code-bison@001 for code generation.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.