Geospatial Data Visualization Using Pydeck

Adil Mohammed 24 Jun, 2024

Introduction

Pydeck is a Python library for creating interactive maps, and it is one of the more capable options available in Python. It works with the PyData stack, including Pandas, which makes it an easy-to-use option for plotting geospatial data. Pydeck lets us create custom layers and stack several layers on one map, and it handles large geospatial datasets efficiently.

Pydeck interacts with Deck.gl, a GPU-powered open-source framework developed by Uber, to create high-performance visualizations. Users can build 3D visualizations and geographical maps in just a few lines of Python code. Data scientists and analysts can use this library to understand complex geospatial relationships and patterns in data.

Learning Outcomes

  • Comprehend the fundamentals of Pydeck and its integration with the PyData stack and Pandas for geospatial visualizations.
  • Learn how to install Pydeck using pip and conda.
  • Gain the ability to create various geospatial visualizations such as Path, Text, Great Circle, and Scatterplot visualizations using Pydeck.
  • Learn to prepare and manipulate datasets for visualization, including converting color codes and extracting necessary information from JSON data.
  • Learn to integrate layers and initial view states using the Deck object to complete visualizations.

This article was published as a part of the Data Science Blogathon.

How to Install Pydeck?

The easiest way to install Pydeck is via pip. Run the following in your terminal or command prompt:

pip install pydeck

Another way to install Pydeck is via conda, using the conda-forge channel:

conda install -c conda-forge pydeck

How to Use Pydeck?

After installation, we can import the Pydeck library into our ‘.py’ files and start building geospatial visualizations. In this section, we will build several types of visualizations: a path visualization, a text layer visualization, a great circle visualization, and a scatter plot. These visualizations help us gain geographical insights from our geospatial data.

The basic structure to build a visualization using Pydeck is to first load the dataset into a variable, create a specific layer related to the dataset and the desired visualization, define an initial viewpoint, and finally create a Deck object to integrate the layer and the initial viewpoint to complete the visualization.

BART Line Path Visualization

BART stands for Bay Area Rapid Transit. A line path visualization of this kind is used to show transit systems, routes, and other types of geographical paths. We will use the ‘PathLayer’ of the Pydeck library to display lines connecting multiple geographical locations.

In the example below, we will apply the same idea to a different dataset and visualize the South Indian national highways.

To build this visualization, we will first import the required libraries: pandas to load the data and pydeck to create the visualization. The next step is to load the highways data from a URL using the read_json() function from pandas, as the data is in JSON format. We will then convert the hex color codes to RGB values, as Pydeck requires RGB values for colors.

import pandas as pd
import pydeck as pdk

# Adjacent string literals are joined by Python, so the URL stays on short lines.
HIGHWAYS_URL = ("https://raw.githubusercontent.com/adil200/Pydeck-Datasets/main"
                "/south_indian_highways.json")
highways_df = pd.read_json(HIGHWAYS_URL)

def hex_to_rgb(hex_str):
    hex_str = hex_str.lstrip("#")
    return tuple(int(hex_str[i:i+2], 16) for i in (0, 2, 4))

highways_df["rgb_color"] = highways_df["color"].apply(hex_to_rgb)
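The conversion helper above can be checked on its own, without loading any data. The sketch below is a slightly more defensive variant: the three-digit shorthand handling and the length check are additions for illustration, and the sample color strings are made up rather than taken from the highways dataset.

```python
def hex_to_rgb(hex_str):
    """Convert '#RRGGBB' (or shorthand '#RGB') to an (R, G, B) tuple of ints."""
    hex_str = hex_str.strip().lstrip("#")
    if len(hex_str) == 3:
        # Expand shorthand by doubling each digit, e.g. '08f' -> '0088ff'.
        hex_str = "".join(ch * 2 for ch in hex_str)
    if len(hex_str) != 6:
        raise ValueError(f"expected 3 or 6 hex digits, got {hex_str!r}")
    # Read the string two characters at a time as base-16 numbers.
    return tuple(int(hex_str[i:i + 2], 16) for i in (0, 2, 4))


# Hypothetical sample values, not taken from the dataset.
sample_colors = ["#ed1c24", "#faa61a", "#08f"]
rgb_colors = [hex_to_rgb(color) for color in sample_colors]
print(rgb_colors)
```

Failing early on a malformed color string is usually easier to debug than a layer that silently renders in an unexpected color.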

Now that our data is ready for visualization, we will first define an initial view state to set the initial focus and zoom level on a specific geographical location. Next, we will create a PathLayer using the Layer() function from Pydeck, passing parameters such as the dataset, color, path, and width. We will then create a Deck object that combines this PathLayer with our initial view state to generate the visualization. Finally, to save and view the visualization, we will use the to_html() function.

initial_view_state = pdk.ViewState(latitude=12.9716, longitude=77.5946, zoom=6)

path_layer = pdk.Layer(
    type="PathLayer",
    data=highways_df,
    pickable=True,
    get_color="rgb_color",
    width_scale=20,
    width_min_pixels=2,
    get_path="path",
    get_width=5,
)

deck = pdk.Deck(layers=[path_layer], initial_view_state=initial_view_state,
                tooltip={"text": "{name}"})
deck.to_html("south_india_highways.html")

The visualization is saved as an HTML file. You can just load it in your browser to view it.

Output:

Pydeck
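The latitude and longitude passed to ViewState in this example are typed in by hand. As a rough sketch, the center could instead be derived from the path data itself. The records below are made-up stand-ins for the highways dataset, assuming each path is a list of [longitude, latitude] pairs, and path_center is a hypothetical helper, not part of Pydeck.

```python
# Made-up records in the shape the highways data is assumed to have:
# one row per route, with 'path' as a list of [longitude, latitude] pairs.
routes = [
    {"name": "Route A",
     "path": [[77.59, 12.97], [78.12, 11.66], [78.70, 10.79]]},
    {"name": "Route B",
     "path": [[77.59, 12.97], [76.64, 12.30]]},
]


def path_center(records):
    """Return (latitude, longitude) averaged over every point of every path."""
    points = [point for record in records for point in record["path"]]
    center_lon = sum(point[0] for point in points) / len(points)
    center_lat = sum(point[1] for point in points) / len(points)
    return center_lat, center_lon


center_lat, center_lon = path_center(routes)
print(round(center_lat, 3), round(center_lon, 3))
```

The two values can then be passed as the latitude and longitude arguments of pdk.ViewState. Pydeck also provides a compute_view function in pydeck.data_utils that serves a similar purpose for a list of points.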

Text Layer Visualization

In this section, we will create a TextLayer visualization where we will add text to a map. This layer is particularly useful for adding extra details about a geographical area or for mentioning the names of small areas that are not labelled on the map. 

In the example below, we will add short descriptions to areas of Bangalore city. To do this, I have created a JSON dataset with a short 1-2 word description of each area in the name variable, along with coordinates and addresses. Using these variables, we can easily create the TextLayer visualization.

To create this visualization, we will follow the same steps as before. First, we will import the Pydeck library and store the dataset link in a variable. Then, we will create our TextLayer using the Layer() function of Pydeck, passing parameters such as the layer name, dataset name, and others. Next, we will define an initial view state with the latitude and longitude of Bangalore, zoom level, and other parameters. Finally, we will create a Deck object to integrate the layer and initial view state to complete the visualization. To view the output, we will save it using the to_html() function.

import pydeck as pdk
from pydeck.types import String
import pandas as pd

# Adjacent string literals are joined by Python, so the URL stays on short lines.
BANGALORE_AREAS_URL = ("https://raw.githubusercontent.com/adil200/Pydeck-Datasets"
                       "/main/scatterplot.json")
bangalore_df = pd.read_json(BANGALORE_AREAS_URL)

text_layer = pdk.Layer(
    "TextLayer",
    bangalore_df,
    pickable=True,
    get_position="coordinates",
    get_text="name",
    get_size=16,
    get_color=[0, 0, 0],
    get_angle=0,
    get_text_anchor=String("middle"),
    get_alignment_baseline=String("center"),
)

blr_view_state = pdk.ViewState(latitude=12.9716, longitude=77.5946, zoom=10, bearing=0, pitch=45)

bangalore_deck = pdk.Deck(
    layers=[text_layer],
    initial_view_state=blr_view_state,
    tooltip={"text": "{name}\n{address}"},
    map_style=pdk.map_styles.DARK,
)
bangalore_deck.to_html("bangalore_text_layer.html")

Output:

Pydeck output
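The TextLayer example relies on every record having a name, an address, and a coordinates field. A small sketch of that record shape, with a basic sanity check, is shown below. The two records are invented for illustration, the [longitude, latitude] ordering is an assumption about the dataset, and validate_record is a hypothetical helper.

```python
import json

# Invented sample in the shape the article describes: a short name,
# an address, and a [longitude, latitude] coordinate pair.
raw_json = """[
  {"name": "Tech Hub", "address": "Whitefield, Bangalore",
   "coordinates": [77.7500, 12.9698]},
  {"name": "Old Market", "address": "Chickpet, Bangalore",
   "coordinates": [77.5770, 12.9690]}
]"""

REQUIRED_FIELDS = ("name", "address", "coordinates")


def validate_record(record):
    """Raise ValueError if a field is missing or a coordinate is out of range."""
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    lon, lat = record["coordinates"]
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError(f"coordinates out of range: {record['coordinates']}")
    return record


valid_records = [validate_record(record) for record in json.loads(raw_json)]
print(len(valid_records), "records ready for the TextLayer")
```

A range check like this catches clearly invalid values, but it will not detect a swapped pair when both numbers happen to be valid latitudes.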

Great Circle Flight Visualization

In this visualization, we will display flight routes using a GreatCircleLayer, which connects departure and arrival cities on a map of India. The green end of each line represents the departure point, and the blue end represents the arrival point. This layer is useful for visualizing flight datasets geographically. Additionally, we can hover over a particular line to view information about the flight route.

To build this visualization, we will first load the required libraries and the dataset using the read_json() function from pandas. Next, we will extract city names from the ‘from’ and ‘to’ dictionaries in the DataFrame. Then, we will create the GreatCircleLayer using the Layer() function from Pydeck, passing the dataset, coordinates of departure and arrival cities, and two colors.

import pydeck as pdk
import pandas as pd

# Adjacent string literals are joined by Python, so the URL stays on short lines.
FLIGHTS_DATA_URL = ("https://raw.githubusercontent.com/adil200/Pydeck-Datasets"
                    "/main/bangalore_flight.json")
flights_df = pd.read_json(FLIGHTS_DATA_URL)

flights_df["departure_city"] = flights_df["from"].apply(lambda departure: departure["name"])
flights_df["arrival_city"] = flights_df["to"].apply(lambda arrival: arrival["name"])

flight_layer = pdk.Layer(
    "GreatCircleLayer",
    data=flights_df,
    get_source_position="from.coordinates",
    get_target_position="to.coordinates",
    get_source_color=[64, 255, 0],
    get_target_color=[0, 128, 200],
    auto_highlight=True,
    width_min_pixels=2,
    pickable=True,
    opacity=0.8,
    strokeWidth=5,
)
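The city-name extraction in the code above assumes that every ‘from’ and ‘to’ value is a dictionary with a name key. The sketch below shows the same idea on plain Python records, with a fallback for a missing name. The flight records and the endpoint_name helper are hypothetical and only illustrate the nested structure.

```python
# Invented records mirroring the nested layout: each endpoint is a dict
# holding a city name and a [longitude, latitude] pair.
flights = [
    {"from": {"name": "Bangalore", "coordinates": [77.7061, 13.1992]},
     "to": {"name": "Delhi", "coordinates": [77.1000, 28.5562]}},
    {"from": {"name": "Bangalore", "coordinates": [77.7061, 13.1992]},
     "to": {"coordinates": [72.8679, 19.0896]}},  # name deliberately missing
]


def endpoint_name(endpoint, default="Unknown"):
    """Return the endpoint's name, or a default if it is absent."""
    if isinstance(endpoint, dict):
        return endpoint.get("name", default)
    return default


for flight in flights:
    flight["departure_city"] = endpoint_name(flight["from"])
    flight["arrival_city"] = endpoint_name(flight["to"])

# These flat fields are what a tooltip template can reference by name.
labels = [f"{f['departure_city']} to {f['arrival_city']}" for f in flights]
print(labels)
```

With a DataFrame, the same helper could be passed to apply() in place of the lambda.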

Now, let’s define the initial view of the map with specified zoom levels. Finally, we will create a Deck object where we will integrate the layer and the initial view state to complete the visualization. We will save the visualization using the to_html() function.

initial_view_state = pdk.ViewState(latitude=13.199169, longitude=77.706139,
                                   zoom=5, bearing=0, pitch=0)

flight_deck = pdk.Deck(
    layers=[flight_layer],
    initial_view_state=initial_view_state,
    tooltip={"text": "{departure_city} to {arrival_city}"},
)

flight_deck.to_html("bangalore_flights_visualization.html")

When you load the output in the browser, you can interact with the visualization. When you hover over a line, you can see the description of the route.

Output:

Pydeck output
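A great circle route is the shortest path between two points on a sphere, which is why these arcs look curved on a flat map. As a side illustration, the length of such a route can be estimated with the haversine formula. This is a sketch, not something Pydeck computes for us: the Earth radius is a mean value, the Delhi coordinates are approximate, and great_circle_km is a hypothetical helper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine distance in kilometres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Sanity check: a quarter of the way around the equator.
quarter_turn_km = great_circle_km(0, 0, 90, 0)

# Bangalore airport (the view-state coordinates used above) to Delhi (approximate).
blr_to_del_km = great_circle_km(77.706139, 13.199169, 77.10, 28.56)
print(round(quarter_turn_km), round(blr_to_del_km))
```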

Scatter Plot Visualization

In this visualization, we will create a scatter plot on the Bangalore map using the ScatterplotLayer. This scatter plot will help us visualize the locations of specific areas.

The process follows the same steps as the earlier visualizations. First, we will load the dataset using the read_json() function; it contains the names and coordinates of areas in Bangalore, and the code below also reads an exits column, whose square root sets the radius of each point. Then, we will create the ScatterplotLayer using the Layer() function of Pydeck, where we will pass the layer name, dataset, coordinates, colors, and more. Next, we will create an initial viewport of the map using the ViewState() function of Pydeck. Finally, we will integrate the layer and the initial viewport by creating a Deck object. We can then save the visualization using the to_html() function and view it.

import pydeck as pdk
import pandas as pd
import math

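# Adjacent string literals are joined by Python, so the URL stays on short lines.
BANGALORE_BART_URL = ("https://raw.githubusercontent.com/adil200/Pydeck-Datasets"
                      "/main/scatterplot.json")
# read_json() already returns a DataFrame, so no extra conversion is needed.
bangalore_bart_df = pd.read_json(BANGALORE_BART_URL)

# Use the square root of the exits count as the radius of each point.
bangalore_bart_df["exit_radius_sqrt"] = bangalore_bart_df["exits"].apply(math.sqrt)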

scatterplot_layer = pdk.Layer(
    "ScatterplotLayer",
    bangalore_bart_df,
    pickable=True,
    opacity=0.8,
    stroked=True,
    filled=True,
    radius_scale=6,
    radius_min_pixels=1,
    radius_max_pixels=100,
    line_width_min_pixels=1,
    get_position="coordinates",
    get_radius="exit_radius_sqrt",
    get_fill_color=[255, 140, 0],
    get_line_color=[0, 0, 0],
)

bangalore_view_state = pdk.ViewState(latitude=12.9716, longitude=77.5946, 
                                     zoom=10, bearing=0, pitch=0)

bangalore_bart_deck = pdk.Deck(
    layers=[scatterplot_layer],
    initial_view_state=bangalore_view_state,
    tooltip={"text": "{name}\n{address}"},
)
bangalore_bart_deck.to_html("bangalore_bart_scatterplot.html")

Output:

output
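The scatter plot code takes the square root of the exits value before using it as a radius. One common reason for this choice, sketched below with made-up counts, is that a circle's area grows with the square of its radius, so a square-root radius keeps the drawn area proportional to the underlying value.

```python
import math

# Made-up exit counts; each one is four times the previous.
exit_counts = [100, 400, 1600]

# Radius as the square root of the count, as in the scatter plot code.
radii = [math.sqrt(count) for count in exit_counts]

# Area of the circle drawn for each radius.
areas = [math.pi * radius ** 2 for radius in radii]

# The area ratio between neighbours matches the ratio of the counts.
area_ratios = [areas[i + 1] / areas[i] for i in range(len(areas) - 1)]
print(radii, area_ratios)
```

Using the raw count as the radius would instead make a point with four times the exits cover sixteen times the area.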

Types of Layers in Pydeck

Here is an overview of some of the layers Pydeck provides.

  • ArcLayer: This layer displays arc-like lines between pairs of points. It is useful for visualizing connections between two coordinates.
  • BitmapLayer: This layer renders an image, such as a scanned map or satellite photo, stretched over a given set of geographic bounds.
  • ColumnLayer: This layer adds vertical bars on a map. It is used to represent quantitative data distribution across a geographical area visually.
  • GridCellLayer: This layer displays data as gridded cells on a map.
  • IconLayer: This layer renders icons at given coordinates. It is useful when we want to display a logo or symbol at a particular location.
  • ScatterplotLayer: This layer displays individual points or markers on a map. It is used to visualize large sets of individual data points.
  • TextLayer: This layer displays text on a map, allowing us to add specific descriptions or small snippets of information.

  Learn more about other layers here: https://deckgl.readthedocs.io/en/latest/layer.html

Conclusion

I hope you now understand how to build geographical visualizations using Pydeck. As discussed, almost every visualization follows a basic structure. First, load a dataset into a variable. Then, create a layer based on the dataset and the scenario from the list of layers mentioned above. Next, create an initial viewpoint, which sets the initial zoom level of the map. Finally, create a Deck object that integrates the layer and the initial view to complete the visualization. This method allows you to create beautiful and interactive geographical visualizations. Next time you have a geographical dataset, try using Pydeck to visualize it.

Key Takeaways

  • Pydeck is a robust Python library for creating interactive and high-performance geospatial visualizations.
  • Pydeck seamlessly integrates with the PyData stack and Pandas, making it user-friendly for data scientists and analysts.
  • Utilizing Deck.gl, Pydeck leverages GPU power to render high-performance visualizations.
  • Pydeck provides various layer types like PathLayer, TextLayer, GreatCircleLayer, and ScatterplotLayer, catering to different visualization needs.
  • Converting color codes to RGB and extracting relevant data from JSON files are crucial steps in preparing data for visualization.
  • Pydeck’s documentation and resources provide further insights into its features and additional layer types.

Frequently Asked Questions

Q1. What is Pydeck?

A. Pydeck is a Python library used for creating interactive and high-performance geospatial visualizations. It integrates seamlessly with the PyData stack and Pandas.

Q2. What are the key features of Pydeck?

A. Key features of Pydeck include the ability to create custom layers, stack multiple layers, handle large geospatial datasets efficiently, and render high-performance visualizations using Deck.gl.

Q3. Can I use Pydeck for 3D visualizations?

A. Yes, Pydeck allows for the creation of 3D visualizations and geographical maps with just a few lines of Python code.

Q4. What is the purpose of the initial view state in Pydeck?

A. The initial view state sets the initial focus and zoom level on a specific geographical location, ensuring the map displays the desired area when first loaded.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
