Analysis of Restaurants in the United States

Subhradeep Last Updated : 27 Oct, 2022

This article was published as a part of the Data Science Blogathon.

Introduction

After working for a long time in the office, we suddenly feel a storm brewing in our stomach, saying, "Hey! I need food." We step out onto the road and start searching for a nearby restaurant, which may be part of a restaurant chain or an independent restaurant. Because people search for nearby restaurants, restaurant owners think about opening multiple outlets to grow their business. This is where the concept of the chain restaurant comes from. Other restaurants are owned by the people who operate them. Those are called independent restaurants.

In this article, I am going to analyse the chain and independent restaurants in the United States. We will find answers to questions like:

Which restaurant has the most outlets in the United States?
In which state are the most popular restaurants located?
What type of cuisine do those restaurants serve?

And many more.

So, without further ado, let’s get started.

The Data

The dataset is collected from this GitHub repository. It is split into three files (part1, part2, and part3), which need to be merged to get the full dataset. The column descriptions are given in the image below.

_{Data Dictionary}

Importing the Necessary Libraries

Before preprocessing and analysing the data, let’s import the necessary libraries so that we don’t have to face any import issues while doing the main task.

	import pandas as pd
	import matplotlib.pyplot as plt
	import seaborn as sns
	from bs4 import BeautifulSoup as soup
	import requests
	import warnings

	warnings.filterwarnings('ignore')

Now let’s dive into the data preprocessing part.

Data Preprocessing

Now let’s read the three parts of the data from GitHub and concatenate them.

	# loading the data from github
	chain_data1 = pd.read_csv("https://raw.githubusercontent.com/friendlycities-gatech/chainness/main/data/chainness_point_2021_part1.csv")
	chain_data2 = pd.read_csv("https://raw.githubusercontent.com/friendlycities-gatech/chainness/main/data/chainness_point_2021_part2.csv")
	chain_data3 = pd.read_csv("https://raw.githubusercontent.com/friendlycities-gatech/chainness/main/data/chainness_point_2021_part3.csv")

	# concatenating the data
	chain_data = pd.concat([chain_data1, chain_data2, chain_data3])
	chain_data.shape

After running the above code, you will see the below output.

So, this data contains 705622 rows and 14 columns. The image of the first 5 rows of data is given below.
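
One detail of pd.concat is worth a quick look: by default, each part keeps its own row index, so index labels repeat in the combined frame. Here is a minimal sketch on made-up toy frames (not the real CSV parts) showing the difference that ignore_index=True makes. Whether you need unique labels depends on what you do with the index later.

```python
import pandas as pd

# Toy stand-ins for the CSV parts (values are made up).
part1 = pd.DataFrame({"RestaurantName": ["Subway", "Joe's Diner"]})
part2 = pd.DataFrame({"RestaurantName": ["Taco Bell"]})

# Default concat keeps each part's own index, so labels repeat.
stacked = pd.concat([part1, part2])
print(stacked.index.tolist())      # [0, 1, 0]

# ignore_index=True renumbers the rows from 0 to n-1.
renumbered = pd.concat([part1, part2], ignore_index=True)
print(renumbered.index.tolist())   # [0, 1, 2]
```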

From the first rows of the data, we can notice that the State column stores the states of the United States as abbreviations. Not everybody knows every abbreviation of the US states, so we convert those abbreviations into their full names. For this, we scrape a table from a website that lists the full name for each abbreviation. Afterwards, we merge that table with the original data and remove the abbreviated State column.

Let’s scrape the table from the website.

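	from io import StringIO

	# downloading the page that lists the US states and their abbreviations
	r = requests.get("https://www23.statcan.gc.ca/imdb/p3VD.pl?Function=getVD&TVD=53971")

	# parsing the HTML and taking the first table on the page
	page = soup(r.content, 'html.parser')
	table = page.find_all('table')[0]
	# wrapping the HTML in StringIO, as newer pandas versions expect a file-like object
	df = pd.read_html(StringIO(str(table)), header=0)[0]
	df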

After running the above code, we can see the below output.

This table has 51 rows and 4 columns. I don’t display the whole table here. If you want to see the whole table, visit here. Now let’s merge this table into the original table.

	# taking the 'State' and 'Alpha code' columns from the scraped table
	df = df[['State', 'Alpha code']]

	# renaming the 'State' column of the original data to 'Alpha code'
	chain_data.rename(columns={'State': 'Alpha code'}, inplace=True)

	# merging the scraped table with the original data
	chain_data = pd.merge(left=chain_data, right=df, on='Alpha code')

	# dropping the 'Alpha code' column
	chain_data.drop('Alpha code', axis=1, inplace=True)

	# taking the necessary columns from the original table
	chain_data_sort = chain_data[['RestaurantName', 'Cuisine', 'State', 'CNTY_NAME', 'UA_NAME', 'Frequency']]

	# displaying the table
	chain_data_sort.head()

Here I do one more thing: as not every column of the original data is necessary for us, we only take the needed columns. The final table will look something like this.
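
A word of caution about the merge step: pd.merge does an inner join by default, so any row whose abbreviation is missing from the lookup table disappears without a warning. The sketch below uses made-up toy frames (the "PR" code is only an illustration, I have not checked whether the real data contains such codes) to show how a left join with indicator=True reveals unmatched rows.

```python
import pandas as pd

# Toy restaurant rows; "PR" is deliberately absent from the lookup below.
restaurants = pd.DataFrame({
    "RestaurantName": ["Subway", "Taco Bell", "Local Cafe"],
    "Alpha code": ["CA", "TX", "PR"],
})

# Toy lookup table in the shape of the scraped one.
lookup = pd.DataFrame({
    "State": ["California", "Texas"],
    "Alpha code": ["CA", "TX"],
})

# An inner merge (the default) silently drops the unmatched "PR" row.
inner = pd.merge(restaurants, lookup, on="Alpha code")

# A left merge keeps it; indicator=True adds a _merge column per row.
left = pd.merge(restaurants, lookup, on="Alpha code", how="left", indicator=True)
unmatched = left[left["_merge"] == "left_only"]

print(len(inner), len(left))              # 2 3
print(unmatched["Alpha code"].tolist())   # ['PR']
```

Comparing the row count before and after the merge on the real data is a cheap way to check whether anything was dropped.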

Look at the UA_NAME column. This column contains the name of the urban area and the state in which it is located, separated by a comma. Here we remove the state part, as we already have a column containing the states of the United States. For this, just run the code below.

	chain_data['UA_NAME_MOD'] = chain_data['UA_NAME'].str.split(',', expand=True)[0]

Here is the output:

See the difference between UA_NAME and UA_NAME_MOD columns. The state part is removed from the UA_NAME_MOD column. Now you can drop the UA_NAME column as there is no use for it.
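
To make the split concrete, here is a small sketch on two sample values. They are written in the style of US Census urban area names and are assumed (not verified) to match the dataset's UA_NAME format. The sketch uses .str[0] instead of expand=True, which gives the same first part here.

```python
import pandas as pd

# Sample values in the assumed "name, state codes" format.
ua = pd.DataFrame({
    "UA_NAME": [
        "Los Angeles--Long Beach--Anaheim, CA",
        "New York--Newark, NY--NJ--CT",
    ]
})

# Split once on the first comma and keep the part before it.
ua["UA_NAME_MOD"] = ua["UA_NAME"].str.split(",", n=1).str[0].str.strip()

print(ua["UA_NAME_MOD"].tolist())
# ['Los Angeles--Long Beach--Anaheim', 'New York--Newark']
```

Note that a name joined by double hyphens covers several cities at once, so a value like this should be read as one combined urban area rather than as a single city.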

Now we are ready for the main task, i.e. Data Analysis.

Data Analysis

_{Statistical Summary}

First, we see the statistical summary of the data as usual.

	print(chain_data_sort.describe())
	print(chain_data_sort.describe(include='O'))

The output of the first describe() call is shown below.

From the above result,

We can easily notice that the maximum restaurant frequency is 24333. If we look at the first 5 rows of the data, Subway seems to have the highest number of outlets. But from this result alone, we can’t tell that Subway is the restaurant with the maximum frequency. We have to look further for that.
The lowest restaurant frequency is 1, which denotes independent restaurants.

Now let’s see the second output, which is given below.

Hmm, we got a lot of information. Let’s check them one by one.

There are 356,847 unique restaurant names in this dataset. Among them, Subway has the top frequency (24333), which means Subway has the highest number of outlets. We don’t need to verify that any further.
In the Cuisine column, Restaurant is more frequent than any other cuisine. Here, Restaurant means that all types of food are available there.
Most restaurants are located in Los Angeles, California.
Newark is a village in Wayne County, New York, United States, 35 miles (56 km) southeast of Rochester and 48 miles (77 km) west of Syracuse. The above table tells us that most restaurants are located in this area.
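
If the rows of the describe() output are new to you, the following sketch on a made-up toy frame shows what they mean: the numeric summary gives min and max, while the summary of the text columns gives unique (number of distinct values), top (most frequent value), and freq (how often the top value occurs). The sketch selects the text columns with exclude=["number"], which picks the same columns as include='O' on this toy frame.

```python
import pandas as pd

# Made-up rows: three Subway outlets and one independent restaurant.
toy = pd.DataFrame({
    "RestaurantName": ["Subway", "Subway", "Subway", "Joe's Diner"],
    "Cuisine": ["Restaurant", "Restaurant", "Restaurant", "Italian"],
    "Frequency": [3, 3, 3, 1],
})

numeric_summary = toy.describe()                    # count, mean, std, min, quartiles, max
text_summary = toy.describe(exclude=["number"])     # count, unique, top, freq

print(numeric_summary.loc[["min", "max"], "Frequency"].tolist())  # [1.0, 3.0]
print(text_summary.loc["unique", "RestaurantName"])  # 2
print(text_summary.loc["top", "RestaurantName"])     # Subway
print(text_summary.loc["freq", "RestaurantName"])    # 3
```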

_{Where are Most Subway Outlets Located?}

Now we know that Subway has the most outlets. But what about the states, counties, and urban areas they are located in, and their cuisine? Let’s see. But before that, we write some functions.

	def multi_count_df(data, data_cols):
	    """Return the top-5 value counts of each column as a list of dataframes."""
	    dfs = []
	    for col in data_cols:
	        count_data = data[col].value_counts().rename_axis(col).reset_index(name='count').head(5)
	        dfs.append(count_data)
	    return dfs


	def count_df(data, data_col):
	    """Return the value counts of one column as a dataframe."""
	    count_data = data[data_col].value_counts().rename_axis(data_col).reset_index(name='count')
	    return count_data


	def multi_donut_charts(dfs, n_rows, n_cols, cols, colors, titles, figsize=(20,15)):
	    """Draw one donut chart per dataframe on a grid of subplots."""
	    fig, axes = plt.subplots(nrows=n_rows, ncols=n_cols, figsize=figsize)
	    for ax, col, df, title in zip(axes.flatten(), cols, dfs, titles):
	        # a wedge width below 1 turns the pie into a donut
	        ax.pie(x='count', labels=col, data=df, autopct='%.1f%%', textprops={"fontsize": 14},
	               wedgeprops=dict(width=0.33, linewidth=7), pctdistance=0.85, colors=colors)
	        ax.set_title(title, fontsize=18)

	    return fig

	def modded_bar_plot(data, x_col, y_col, color_palette, title, fig_size=(7,5),
	                    label_type='edge', font_size=15, font_style="italic",
	                    font_weight="heavy", box_stat=False):
	    """Draw a bar plot with value labels and without y ticks, axis labels, or box."""
	    plt.figure(figsize=fig_size)
	    # sns.barplot returns a matplotlib Axes object
	    ax = sns.barplot(x=x_col, y=y_col, data=data, palette=color_palette)

	    ax.get_yaxis().set_ticks([])
	    ax.set(xlabel=None, ylabel=None)
	    ax.bar_label(ax.containers[0], label_type=label_type)
	    ax.set_title(title, fontdict=dict(fontsize=font_size, fontstyle=font_style,
	                                      fontweight=font_weight))

	    plt.box(on=box_stat)

Now let me introduce you to those functions:

multi_count_df converts the outputs of .value_counts() on several columns of a pandas dataframe into a list of dataframes, keeping the top 5 values of each column.
count_df does the same for a single column and returns one dataframe.
multi_donut_charts plots multiple donut charts in a grid of subplots.
modded_bar_plot draws a modified version of the bar plot, which you will see later.
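
To see what the counting helpers produce, here is the same chain of pandas calls applied step by step to a made-up toy column. The data is invented for illustration only.

```python
import pandas as pd

# Made-up column with three distinct states.
toy = pd.DataFrame({
    "State": ["California", "Texas", "California", "New York", "California", "Texas"]
})

counts = (
    toy["State"]
    .value_counts()             # Series indexed by state, sorted by count (descending)
    .rename_axis("State")       # name the index so it becomes a 'State' column
    .reset_index(name="count")  # turn the Series into a two-column dataframe
)

print(counts.columns.tolist())                          # ['State', 'count']
print(counts.loc[0, "State"], counts.loc[0, "count"])   # California 3
```

A dataframe in this shape is convenient for plotting, because the category and its count can both be referred to by column name.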

Now let’s plot our desired charts.

	# selecting those rows from the dataframe where 'RestaurantName' is 'Subway'
	subway_chains = chain_data[chain_data['RestaurantName']=='Subway']

	# defining colors
	colors = ['#ff9999','#66b3ff','#99ff99','#ffcc99', '#f4bbff']

	# defining a list of columns in which we are interested
	cols = ['State', 'CNTY_NAME', 'UA_NAME_MOD', 'Cuisine']

	# list of titles which we use in the charts
	titles = ['Subway in States', 'Subway in Counties', 'Subway in Urbans', 'Subway Cuisines']

	# plotting the donut charts
	dfs = multi_count_df(subway_chains, cols)
	fig = multi_donut_charts(dfs, 2, 2, cols, colors, titles)
	fig.show()

If we run the above code, we can see the output below.

Looks nice. Always plot charts in such a way that they look attractive and are also comfortable for the eyes. Though we create these charts with a few lines of code, many customisations are happening in the background. If you want to know more about these customisations, see this article.
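
As a rough illustration of the main customisation, the sketch below draws a single donut chart with plain matplotlib on made-up numbers. The key idea is that a wedge width below 1 in wedgeprops leaves a hole in the middle of the pie. The labels, counts, and the white edge colour are my own choices for this example, and the Agg backend is used only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

labels = ["California", "Texas", "Florida"]  # made-up categories
counts = [50, 30, 20]                         # made-up counts

fig, ax = plt.subplots(figsize=(5, 5))
wedges, texts, autotexts = ax.pie(
    counts,
    labels=labels,
    autopct="%.1f%%",                                # percentage text on each wedge
    pctdistance=0.85,                                # push the text out onto the ring
    wedgeprops=dict(width=0.33, edgecolor="white"),  # width < 1 makes the donut hole
)
ax.set_title("Toy donut chart")

print([t.get_text() for t in autotexts])  # ['50.0%', '30.0%', '20.0%']
```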

Now let’s see what we got in those charts.

The first chart shows us that most of the Subway restaurants are located in California, United States.
The second chart shows that the largest share of Subway outlets is in Los Angeles, the most populated county of California (population: 9,829,544). If you don’t know what a county is, here is a definition from Wikipedia.
Though most Subway restaurants are in Los Angeles County, when it comes to urban areas, the third chart shows that Newark, an urban area of New York, has the highest number of Subway outlets. We can also see that there is little difference between Long Beach and Anaheim, an urban area of Los Angeles, and Newark, an urban area of New York.
Restaurant is the most common cuisine for Subway, as shown in the last donut chart. Here, the Restaurant cuisine means those are normal restaurants providing all types of food.

Now this result is dependent on the whole data of Subway restaurants. What happens if we are interested only in Subway restaurants in California? Let’s see.

	subway_california_counties = subway_chains[subway_chains['State']=='California']
	subway_urban_counties_of_los = subway_chains[subway_chains['CNTY_NAME']=='Los Angeles']
	subway_urban = subway_chains[subway_chains['UA_NAME_MOD']=='Los Angeles--Long Beach--Anaheim']

	titles = ['Subway in US States', 'Subway in Counties of California', 'Subway in Urbans of Los Angeles', 'Subway Cuisines in Urbans of Los Angeles']

	df1 = count_df(subway_chains, 'State').head(5)
	df2 = count_df(subway_california_counties, 'CNTY_NAME').head(5)
	df3 = count_df(subway_urban_counties_of_los, 'UA_NAME_MOD').head(3)
	df4 = count_df(subway_urban, 'Cuisine').head(5)

	dfs = [df1, df2, df3, df4]

	fig = multi_donut_charts(dfs, 2, 2, cols, colors, titles)
	fig.show()

The output will look like the one below.

The first and second charts show that California and Los Angeles have the most Subway outlets, as we saw previously. The third and fourth charts changed drastically here. As we select only the urban areas of Los Angeles, Long Beach and Anaheim come first here. And most of the Subway restaurants in this area are normal restaurants.

_{Which Restaurant has the Highest Chains after Subway?}

That’s all about Subway. Which restaurant has the most outlets after Subway? To find out, we count the different restaurants in the dataset and keep those with a frequency of more than 5, as we consider only chain restaurants here (restaurants with a frequency below 5 are not considered chains). After that, we draw the bar plot.

	# defining some colors and making a palette
	colors = ['#ff9999','#66b3ff','#99ff99','#ffcc99', '#f4bbff']
	sns.set_palette(colors)               # set_palette() returns None, so its result is not assigned
	palette = sns.color_palette(colors)

	# taking the top 5 chain restaurants (head() returns 5 rows by default)
	restaurant_count = count_df(chain_data, 'RestaurantName')
	rest_top_5 = restaurant_count[restaurant_count['count'] > 5].head()

	# plotting the bar chart
	modded_bar_plot(data=rest_top_5, x_col='RestaurantName', y_col='count', color_palette=palette,
	                title="Chainness of Restaurants")
	plt.show()

After running the above code, we will see the below output.

This is the modified bar plot I mentioned earlier. It has no box and no x or y axis labels, only the necessary parts of a bar plot. From this plot, we can see that McDonald’s has the most outlets after Subway. Now let’s find out McDonald’s whereabouts.

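	# selecting the McDonald's rows (the name must match the spelling used in the dataset)
	mcdonalds_chains = chain_data[chain_data['RestaurantName']=="McDonald's"]

	cols = ['State', 'CNTY_NAME', 'UA_NAME_MOD', 'Cuisine']
	dfs = multi_count_df(mcdonalds_chains, cols)
	titles = ["McDonald's in US States", "McDonald's in Counties", "McDonald's in Urbans", "McDonald's Cuisines"]
	fig2 = multi_donut_charts(dfs, 2, 2, cols, colors, titles)
	fig2.show()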

The output of the above code is shown below.

The results are the same as we saw for the Subway restaurants. It seems that the most popular restaurants are located in Los Angeles, California.
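
As an aside on the stripped-down bar plot used earlier in this section, here is a minimal sketch of the same idea with plain matplotlib: value labels on the bars, no y ticks, and no surrounding box. The chain names and counts are hypothetical, and the Agg backend is used only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

names = ["Chain A", "Chain B", "Chain C"]  # hypothetical restaurant names
counts = [120, 80, 45]                      # made-up outlet counts

fig, ax = plt.subplots(figsize=(7, 5))
bars = ax.bar(names, counts)

value_labels = ax.bar_label(bars, label_type="edge")  # write each count above its bar
ax.set_yticks([])                                     # the labels make the y axis redundant
ax.set_frame_on(False)                                # remove the surrounding box

print([t.get_text() for t in value_labels])  # ['120', '80', '45']
```

Since every bar already carries its value, the y axis adds nothing, which is why it can be dropped without losing information.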

_{What about Independent Restaurants?}

Enough about the chain restaurants; now we look at the stats for the independent restaurants. Independent restaurants are those whose frequency is 1. We first filter those restaurants from the data and then see if they are also located mostly in Los Angeles. Let’s do it.

	independent_rest = chain_data[chain_data['Frequency']==1]
	cols = ['State', 'CNTY_NAME', 'UA_NAME_MOD', 'Cuisine']
	dfs = multi_count_df(independent_rest, cols)
	titles = ["Independent Restaurants in US States", "Independent Restaurants in Counties", "Independent Restaurants in Urbans", "Independent Restaurants Cuisines"]
	fig = multi_donut_charts(dfs, 2, 2, cols, colors, titles)
	fig.show()

Independent restaurants are also located mostly in Los Angeles, California. Los Angeles is heaven for food lovers. If we look at the urban areas, Newark wins. The most common cuisine is again Restaurant, i.e. normal restaurants that provide different types of food.
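
The two definitions used so far (a frequency of 1 for independent restaurants and a frequency above 5 for chains) leave a gap for frequencies from 2 to 5. The sketch below labels all three groups in one step with pd.cut on made-up rows. The "in between" label and the restaurant names other than Subway are hypothetical and only serve the illustration.

```python
import pandas as pd

# Made-up rows covering the three frequency ranges.
toy = pd.DataFrame({
    "RestaurantName": ["Joe's Diner", "Twin Cafe", "Twin Cafe", "Subway"],
    "Frequency": [1, 2, 2, 24333],
})

# Bins are right-inclusive: (0, 1], (1, 5], (5, inf).
toy["kind"] = pd.cut(
    toy["Frequency"],
    bins=[0, 1, 5, float("inf")],
    labels=["independent", "in between", "chain"],
)

print(toy["kind"].astype(str).tolist())
# ['independent', 'in between', 'in between', 'chain']
```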

_{Which Restaurants have the Most Popular Cuisines in the United States?}

Until now, we haven’t got a good idea about the most popular cuisine among Americans. Now we are going to focus on this. After a bit of surfing on the internet, it turns out that most Americans like Italian food first, then Mexican, and then Chinese (see here). Now let’s see which restaurants serve those popular cuisines. First, we look at the chain restaurants.

	# selecting chain restaurants which have 'Italian', 'Mexican', and 'Chinese' cuisines.
	italian_chain = chain_data[(chain_data['Cuisine']=='Italian') & (chain_data['Frequency'] > 5)]
	mexican_chain = chain_data[(chain_data['Cuisine']=='Mexican') & (chain_data['Frequency'] > 5)]
	chinese_chain = chain_data[(chain_data['Cuisine']=='Chinese') & (chain_data['Frequency'] > 5)]

	# getting the name of the restaurants where those cuisines mostly available
	italian_rest = count_df(italian_chain, 'RestaurantName').head(5)
	mexican_rest = count_df(mexican_chain, 'RestaurantName').head(5)
	chinese_rest = count_df(chinese_chain, 'RestaurantName').head(5)

	# defining titles and colors
	titles = ["Italian Restaurants", "Mexican Restaurants", "Chinese Restaurants"]
	colors = ['#ff9999','#66b3ff','#99ff99','#ffcc99', '#f4bbff']

	# plotting the donut charts
	fig = multi_donut_charts([italian_rest, mexican_rest, chinese_rest], 1, 3,
	['RestaurantName','RestaurantName', 'RestaurantName'], colors, titles,)
	plt.tight_layout()
	fig.show()

Phew! We are getting different results after a long time. I thought I would see the name Subway in the Italian restaurant category, but we got a different restaurant. Olive Garden has the most outlets for Italian cuisine. For Mexican and Chinese cuisines, Taco Bell and Panda Express have the most outlets, respectively. Now, are those restaurant outlets located mostly in California? Let’s see.

	olive_data = chain_data[chain_data['RestaurantName']=='Olive Garden']
	taco_data = chain_data[chain_data['RestaurantName']=='Taco Bell']
	panda_data = chain_data[chain_data['RestaurantName']=='Panda Express']

	olive_states = count_df(olive_data, 'State').head(5)
	taco_states = count_df(taco_data, 'State').head(5)
	panda_states = count_df(panda_data, 'State').head(5)

	titles = ['Olive Garden in States', 'Taco Bell in States', "Panda Express in States"]

	fig = multi_donut_charts([olive_states, taco_states, panda_states], 1, 3, ['State', 'State', 'State'],
	colors, titles,)
	fig.show()

Our guess is not bad. Only Olive Garden’s outlets are mostly in Texas instead of California, while Taco Bell and Panda Express outlets are mostly in California. Now, what about the counties and urban areas? Let’s do this.

	olive_data = chain_data[(chain_data['RestaurantName']=='Olive Garden') & (chain_data['State']=='Texas')]
	taco_data = chain_data[(chain_data['RestaurantName']=='Taco Bell') & (chain_data['State']=='California')]
	panda_data = chain_data[(chain_data['RestaurantName']=='Panda Express') & (chain_data['State']=='California')]

	olive_states = count_df(olive_data, 'CNTY_NAME').head(5)
	taco_states = count_df(taco_data, 'CNTY_NAME').head(5)
	panda_states = count_df(panda_data, 'CNTY_NAME').head(5)

	titles = ['Olive Garden in Counties of Texas', 'Taco Bell in Counties of California', "Panda Express in Counties of California"]

	fig = multi_donut_charts([olive_states, taco_states, panda_states], 1, 3,
	['CNTY_NAME', 'CNTY_NAME', 'CNTY_NAME'], colors, titles,)
	plt.tight_layout()
	fig.show()

The outlets of Olive Garden are located mostly in Harris County, Texas, and both Taco Bell and Panda Express outlets are mostly in Los Angeles County, California. Now let’s look at the urban areas. Just replace CNTY_NAME with UA_NAME_MOD in the above code, and you will see the output below.

Most outlets of Olive Garden are located in Dallas, the third largest city of Texas; Fort Worth, the fifth largest city of Texas; and Arlington. Los Angeles, Long Beach, and Anaheim have the most Taco Bell and Panda Express outlets. If you like Italian cuisine, those three cities of Texas (Dallas, Fort Worth, and Arlington) should be your first choice.
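
Instead of editing the column name by hand, the county and urban-area breakdowns can share one small helper. The sketch below is only one possible way to do it: top_locations is a hypothetical function name, and the toy rows are made up, so the counts say nothing about the real data.

```python
import pandas as pd

def top_locations(data, restaurant, state, col, n=5):
    """Count the outlets of one restaurant in one state, grouped by `col`."""
    mask = (data["RestaurantName"] == restaurant) & (data["State"] == state)
    return (
        data.loc[mask, col]
        .value_counts()
        .rename_axis(col)
        .reset_index(name="count")
        .head(n)
    )

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "RestaurantName": ["Olive Garden", "Olive Garden", "Olive Garden", "Taco Bell"],
    "State": ["Texas", "Texas", "Texas", "California"],
    "CNTY_NAME": ["Harris", "Harris", "Dallas", "Los Angeles"],
    "UA_NAME_MOD": ["Houston", "Houston", "Dallas--Fort Worth--Arlington",
                    "Los Angeles--Long Beach--Anaheim"],
})

# The same call works for counties and urban areas; only `col` changes.
by_county = top_locations(toy, "Olive Garden", "Texas", "CNTY_NAME")
by_urban = top_locations(toy, "Olive Garden", "Texas", "UA_NAME_MOD")

print(by_county.loc[0, "CNTY_NAME"], by_county.loc[0, "count"])   # Harris 2
print(by_urban.loc[0, "UA_NAME_MOD"], by_urban.loc[0, "count"])   # Houston 2
```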

That is all about chain restaurants. What about independent restaurants? They also have some popularity, of course. But here is a problem: we can’t rank independent restaurants by their frequencies, as so many of them share the same frequency. What we can tell is in which state most of the independent restaurants serving Italian cuisine are located.

	# selecting independent restaurants (Frequency < 5) which have 'Italian', 'Mexican', and 'Chinese' cuisines.
	italian_chain = chain_data[(chain_data['Cuisine']=='Italian') & (chain_data['Frequency'] < 5)]
	mexican_chain = chain_data[(chain_data['Cuisine']=='Mexican') & (chain_data['Frequency'] < 5)]
	chinese_chain = chain_data[(chain_data['Cuisine']=='Chinese') & (chain_data['Frequency'] < 5)]

	# getting the name of states where those cuisines mostly available
	italian_rest = count_df(italian_chain, 'State').head(5)
	mexican_rest = count_df(mexican_chain, 'State').head(5)
	chinese_rest = count_df(chinese_chain, 'State').head(5)

	# defining titles and colors
	titles = ["Italian Restaurants(Independent)", "Mexican Restaurants(Independent)",
	"Chinese Restaurants(Independent)"]
	colors = ['#ff9999','#66b3ff','#99ff99','#ffcc99', '#f4bbff']

	# plotting the donut charts
	fig = multi_donut_charts([italian_rest, mexican_rest, chinese_rest], 1, 3,
	['State','State', 'State'], colors, titles,)
	plt.tight_layout()
	fig.show()

From the above charts, we can see that most independent Italian restaurants are located in New York. On the other hand, independent Mexican and Chinese restaurants are mostly located in California.
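
The per-cuisine filters above repeat the same steps three times. As an alternative sketch (not the approach used in this article), a single groupby can return the most frequent restaurant name for every cuisine in one pass. The toy rows are made up, and "Pizza Spot" is a hypothetical name.

```python
import pandas as pd

# Made-up rows: one row per outlet.
toy = pd.DataFrame({
    "RestaurantName": ["Olive Garden", "Olive Garden", "Pizza Spot",
                       "Taco Bell", "Taco Bell", "Panda Express"],
    "Cuisine": ["Italian", "Italian", "Italian", "Mexican", "Mexican", "Chinese"],
})

# For each cuisine, count the restaurant names and keep the most frequent one.
top_per_cuisine = (
    toy.groupby("Cuisine")["RestaurantName"]
    .agg(lambda names: names.value_counts().idxmax())
)

print(top_per_cuisine.to_dict())
# {'Chinese': 'Panda Express', 'Italian': 'Olive Garden', 'Mexican': 'Taco Bell'}
```

This only returns the single top name per cuisine, so the separate value counts are still the better choice when the top 5 are needed for a chart.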

Conclusion

So, that’s all I’ve got. I know this article is pretty long, but we learned many things. Most of us don’t know much about these types of restaurants. After reading this article, we know not only about these types but also how to deal with this kind of data. We also plotted some beautiful charts. In a nutshell, we learned:

How to modify a column when the information we need is not directly available in the data.
How to use Python functions to shorten the code and save time.
How to plot customized graphs using seaborn and matplotlib that are attractive and also comfortable for the eyes.
And, as a bonus, some facts about chain and independent restaurants in the United States.

But the analysis doesn’t end here. If you find something, let me know in the comments. If there is something wrong on my side, I am always here to listen. You can also do more with the plots, as there are many options for customization. I didn’t explain how the plots are drawn because this article mainly focuses on the analysis. Try the parameters and see what each one does; it is not hard.

If you want more articles like this, visit my AnalyticsVidhya profile.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Subhradeep

Recently pursuing M.Tech in Artificial Intelligence and love to do anything about Data Science, Machine Learning, and AI. I also like to share my knowledge through Blogs. Ask me anything about Data Science, Machine Learning, and AI at srang992@gmail.com.
