AWS, or Amazon Web Services, is one of the world’s most widely used cloud service providers. It is a cloud platform that provides a wide variety of services that can be used together to create highly scalable applications. AWS operates clusters of data centers in multiple countries across the globe. These clusters are known as Regions, and each Region is further divided into Availability Zones (AZs). Because of this large global footprint, AWS is used by many companies to run their IT infrastructure.
Some of the domains for which AWS provides its services are:
So, let’s begin the learning process!
Compute services on AWS are services that allow us to develop computational applications and deploy them on the cloud. One of these services is AWS Lambda. AWS Lambda is a cost-efficient serverless computing service, which means that we do not need to provision or manage any servers.
We write our code in the form of Lambda functions, and this code can be written in multiple languages because AWS Lambda supports several runtimes.
AWS Lambda charges our account based on the number of requests, the memory allocated to our function, and the compute time that our code runs for. This makes Lambda extremely cost-efficient, as we don’t pay for resources while our code is not running.
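The billing model above can be written out as simple arithmetic. The following is a minimal sketch, not an official calculator: the function name `estimate_lambda_cost` is hypothetical, the two default rates are illustrative assumptions that vary by Region and architecture, and any free tier is ignored. Check the current AWS pricing page before relying on the numbers.

```python
def estimate_lambda_cost(requests, avg_duration_ms, memory_mb,
                         price_per_million_requests=0.20,
                         price_per_gb_second=0.0000166667):
    """Rough cost estimate; the default rates are illustrative assumptions."""
    # Compute time is billed in GB-seconds: run time (s) x configured memory (GB).
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = (requests / 1_000_000) * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


# Example: one invocation per day for 30 days, 5 seconds each, 256 MB of memory.
monthly = estimate_lambda_cost(requests=30, avg_duration_ms=5000, memory_mb=256)
print(f"Estimated monthly cost: ${monthly:.6f}")
```

A job that runs once a day for a few seconds costs a tiny fraction of a cent under these assumed rates, which is why a scheduled data-fetching task is a good fit for Lambda.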
Some common use cases where we can use lambda functions are:
Amazon EC2 is a service for creating virtual machines with a wide choice of operating systems and hardware configurations. Like AWS Lambda, it is a compute service, but here we choose and manage the machines ourselves. We can pick the configuration that works best for our application, for example a virtual machine with 4 CPU cores or one with 64.
With these things in mind, let us create our first Lambda function.
Now we will be creating our first lambda function. In this function, we will be pulling data from an API and storing it in a DynamoDB table. The API which we will be using here is the Spotify API. Using this API, we will create a lambda function that fetches data from the global top 50 charts daily and stores it in a NoSQL database. The idea is that in real-life industry projects, there will be a lot of times when you’ll have to automate data fetching tasks from APIs to store data.
Prerequisites:
AWS account: you’ll need an AWS account in order to use its services.
Spotify account: using this account, we will create an app on Spotify’s developer platform to use Spotify’s API.
We will begin by creating a developer account on Spotify; we will need its credentials (a client ID and a client secret) to connect to the API using Python’s Spotipy library.
In this step, we will create the DynamoDB table that will store our data.
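For readers who prefer to script this step instead of using the console, here is a minimal sketch with boto3. It assumes the table is named `spotify` with a string partition key `s_id`, matching what the Lambda code in this article writes; the on-demand billing mode and the helper names are my own assumptions.

```python
def spotify_table_definition(table_name="spotify"):
    """Build create_table arguments for a table keyed on the string attribute s_id."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [{"AttributeName": "s_id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "s_id", "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


def create_spotify_table(dynamodb_client, table_name="spotify"):
    """Create the table and wait until it is ready to accept writes."""
    dynamodb_client.create_table(**spotify_table_definition(table_name))
    dynamodb_client.get_waiter("table_exists").wait(TableName=table_name)


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   create_spotify_table(boto3.client("dynamodb"))
```

Only the key attribute has to be declared up front; the other fields (songs, albums, and so on) are added per item when the function writes data.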
We must now create an IAM role for the Lambda function; this IAM role will allow the Lambda function to add data to a DynamoDB table.
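An IAM role for Lambda has two parts: a trust policy that lets the Lambda service assume the role, and a permissions policy that grants access to the table. The sketch below builds both documents as plain dictionaries. The helper names and the table ARN are hypothetical, and restricting the role to `dynamodb:PutItem` is my assumption of a least-privilege setup rather than the exact policy used in this walkthrough.

```python
import json


def lambda_trust_policy():
    """Trust policy that allows the Lambda service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def dynamodb_write_policy(table_arn):
    """Permissions policy that allows writing items to a single table."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:PutItem"],
            "Resource": table_arn,
        }],
    }


# Hypothetical ARN, for illustration only.
table_arn = "arn:aws:dynamodb:us-east-1:123456789012:table/spotify"
print(json.dumps(dynamodb_write_policy(table_arn), indent=2))
```

With boto3, these documents would be passed as JSON strings to the IAM client's `create_role` (as `AssumeRolePolicyDocument`) and `put_role_policy` (as `PolicyDocument`).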
We will now create the Lambda function that will call the API and update the DynamoDB table.
Now that the lambda function has been created, we need to write the code file for this function. We will do this on our system locally and upload that code file in a zip.
To do this, you need to create a folder and name it whatever you want.
Then we need to install the dependencies inside this same folder, using the terminal. Open the terminal inside this folder and run "pip install -t ./ spotipy". Our example also uses pandas, so install it with "pip install -t ./ pandas".
After this, we must create a new file called lambda_function.py and write the following code in that file
```python
import json
from datetime import datetime

import boto3
import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials


def lambda_handler(event, context):
    Client_ID = ""      # Your Client ID here
    Client_Secret = ""  # Your Client Secret here

    # Authenticate against the Spotify API with the app credentials
    credentials_client = SpotifyClientCredentials(client_id=Client_ID, client_secret=Client_Secret)
    sp = spotipy.Spotify(client_credentials_manager=credentials_client)

    # Fetch the global top 50 playlist
    top_50_global = sp.playlist("https://open.spotify.com/playlist/37i9dQZEVXbMDoHDwVN2tF")
    items = top_50_global['tracks']['items']

    # Extract the fields we want to store
    timestamp = [item['added_at'] for item in items]
    albums = [item['track']['album']['name'] for item in items]
    artists = [[artist['name'] for artist in item['track']['album']['artists']] for item in items]
    songs = [item['track']['name'] for item in items]
    song_duration = [item['track']['duration_ms'] for item in items]
    song_popularity = [item['track']['popularity'] for item in items]
    s_id = [item['track']['id'] for item in items]

    top_50_df = pd.DataFrame({
        's_id': s_id,
        'timestamp': timestamp,
        'songs': songs,
        'albums': albums,
        'artists': artists,
        'duration': song_duration,
        'popularity': song_popularity
    })

    dynamo_client = boto3.client('dynamodb')

    for index, row in top_50_df.iterrows():
        # Parse the row as JSON (safer than eval, and handles null/true/false)
        row = json.loads(row.to_json())
        # Appending today's date to the track id keeps one item per song per day
        Item = {'s_id': {'S': row['s_id'] + str(datetime.now().date())},
                'timestamp': {'S': row['timestamp']},
                'songs': {'S': row['songs']},
                'albums': {'S': row['albums']},
                'artists': {'S': str(row['artists'])},
                'duration': {'S': str(row['duration'])},
                'popularity': {'S': str(row['popularity'])}}
        dynamo_client.put_item(TableName='spotify', Item=Item)

    return {
        'statusCode': 200,
        'body': json.dumps('Spotify Table Updated')
    }
```
In the above code, we used the Spotify API to fetch the songs in the global top 50 playlist and wrote them to the DynamoDB table we created earlier.
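Before deploying, it can help to check the item-building logic locally without calling Spotify or AWS. The sketch below isolates that step in a pure function. The name `to_dynamo_item` and the sample payload are hypothetical; the payload only mimics the fields the handler reads, and real API responses contain many more.

```python
from datetime import date


def to_dynamo_item(playlist_item, today):
    """Convert one playlist entry into a DynamoDB item with string attributes."""
    track = playlist_item["track"]
    return {
        # Track id plus the date gives one item per song per day.
        "s_id": {"S": track["id"] + str(today)},
        "timestamp": {"S": playlist_item["added_at"]},
        "songs": {"S": track["name"]},
        "albums": {"S": track["album"]["name"]},
        "artists": {"S": str([a["name"] for a in track["album"]["artists"]])},
        "duration": {"S": str(track["duration_ms"])},
        "popularity": {"S": str(track["popularity"])},
    }


# Made-up sample entry, for illustration only.
sample = {
    "added_at": "2024-01-01T00:00:00Z",
    "track": {
        "id": "abc123",
        "name": "Example Song",
        "duration_ms": 180000,
        "popularity": 90,
        "album": {"name": "Example Album", "artists": [{"name": "Example Artist"}]},
    },
}
item = to_dynamo_item(sample, date(2024, 1, 1))
print(item["s_id"])
```

Passing the date in as an argument, rather than calling `datetime.now()` inside the function, keeps the result reproducible in tests.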
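Once you’ve created the code file and added the dependencies to the folder, open the folder, select everything inside it (Ctrl + A), and compress the selection into a zip. Zip the contents rather than the folder itself, so that lambda_function.py sits at the root of the archive.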
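The same packaging step can be scripted, which avoids accidentally zipping the parent folder. This is a minimal sketch using only the standard library; the function name `build_lambda_zip` and the paths in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path


def build_lambda_zip(source_dir, zip_path):
    """Zip the contents of source_dir so its files sit at the archive root."""
    source_dir = Path(source_dir)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            # Skip directories and the archive itself if it lives in source_dir.
            if path.is_file() and path.resolve() != zip_path.resolve():
                archive.write(path, path.relative_to(source_dir).as_posix())
    return zip_path


# Usage (hypothetical paths):
#   build_lambda_zip("my_lambda_folder", "deployment.zip")
```

Storing each file under its path relative to the folder is what places lambda_function.py at the top level of the archive, where Lambda looks for the handler.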
Now upload the zip file to the Lambda function using the upload option in the function's Code section of the Lambda console.
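As an alternative to the console, the archive can be uploaded with boto3's `update_function_code`. The sketch below is written under assumptions: the helper name `upload_zip`, the function name, and the file name in the usage comment are all hypothetical, and the Lambda function is assumed to exist already.

```python
def upload_zip(lambda_client, function_name, zip_path):
    """Replace the function's code with the contents of a local zip file."""
    with open(zip_path, "rb") as archive:
        zip_bytes = archive.read()
    return lambda_client.update_function_code(
        FunctionName=function_name,
        ZipFile=zip_bytes,
    )


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   upload_zip(boto3.client("lambda"), "spotify-top-50", "deployment.zip")
```

Taking the client as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.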
The final step is to create an EventBridge trigger that will invoke our Lambda function every day at a specific time.
With this schedule, EventBridge invokes our Lambda function at 12:00 PM UTC every day; the function fetches the data and adds it to our DynamoDB table.
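A daily 12:00 PM UTC schedule corresponds to the EventBridge cron expression `cron(0 12 * * ? *)`. The sketch below builds that expression and shows how a rule could be created with boto3 instead of the console. The helper names, rule name, and target id are hypothetical.

```python
def daily_cron(hour_utc, minute=0):
    """Return an EventBridge cron expression that fires once a day (UTC)."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour_utc must be 0-23 and minute must be 0-59")
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron({minute} {hour_utc} * * ? *)"


def schedule_daily(events_client, rule_name, function_arn, hour_utc=12):
    """Create or update a daily rule and point it at the Lambda function."""
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=daily_cron(hour_utc),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],  # "1" is an arbitrary target id
    )
    return rule["RuleArn"]


print(daily_cron(12))
# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   schedule_daily(boto3.client("events"), "spotify-daily", "<your function ARN>")
```

When the trigger is added through the Lambda console, the permission for EventBridge to invoke the function is set up for you; when scripting it, that permission typically has to be granted separately with the Lambda client's `add_permission`.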
A. AWS Lambda is like having a magical computer that runs your code for you without you needing to worry about servers. You write your code and upload it to Lambda, and it takes care of running it whenever needed. It can handle lots of requests and automatically adjusts to handle more or less work. It’s great for building applications and automating tasks without the hassle of managing servers.
A. The three main components of AWS Lambda are:
1. Function: This is the core component of AWS Lambda. It consists of the code you write to perform a specific task or function. It can be written in various programming languages such as Python, Java, or Node.js.
2. Trigger: Triggers are events or conditions that initiate the execution of your Lambda function. They can be external events like changes to data in an S3 bucket, incoming messages from an Amazon SQS queue, or even scheduled events.
3. Execution Environment: This is the computing environment where your Lambda function runs. AWS Lambda provides and manages the necessary resources to execute your code, including the runtime environment, memory allocation, and network access. You don’t have to worry about provisioning or managing servers; AWS handles it for you.
In this article, we learned how to create an end-to-end data-fetching Lambda function. The function fetches data from an API and stores it in a NoSQL database. The purpose of walking through this implementation is to demonstrate how you can use Lambda functions to automate your data-fetching tasks. I encourage you to go through the AWS documentation to learn more about AWS Lambda and to practice with Lambda on your own.