Time Series Forecasting Made Easy Using Darts

Purnendu Last Updated : 24 Oct, 2021

6 min read

This article was published as a part of the Data Science Blogathon

Overview of Time Series Using Draft

Imagine this!

You work as a data scientist for a company that provides solutions to business. On a single day, your boss handovers two datasets with records of no air passenger(air-passenger) and milk produced by the cow(mainly milk), ask for detailed analysis of the data and predicts its future values.

By inspecting, you found no relationship between the two. As a result, you decided to build two models separately. So you processed the dataset(pandas), found seasonality(stats-model), and trained two models(Tensorflow/Pytorch). After a lot of hyper-tuning, you found a good fit and predicted the results.

A with consequences, you went to your boss, and as usual, your boss asks’s “can’t we do this using one model?“.

The question asked by the boss was indeed correct. Despite many improvements in the field, people find it challenging to work on TIME-SERIES DATA. So the company, Unit8, created a python package called DARTS, which aims to solve the problems in the scenario.

This article is a practical introduction to how to get started with the library. Precisely, we will recreate the same scenario and see what this library has to offer us. So let’s get started.

Installation of Drafts for Time Series

To start, we will install darts. Using an anaconda environment is highly recommended. Assuming you have created an environment, open the terminal and enter the following command:

conda install -c conda-forge -c pytorch u8darts-all

Note: It may take time because the downloadable size is approximately 2.98 Gb and will download all the available models!

After installation, launch a jupyter notebook and try importing the library using:

import darts

If nothing outputs, it means successfully imported, else google the error:)

Loading Dataset from Dart’s Library

For simplicity, we will use the dart’s dataset library to load the data.

from darts.datasets import AirPassengersDataset, MonthlyMilkDataset

Here we have imported two required datasets as we are mimicking the scenario.

Printing Dataset

Now let’s print the dataset. There is numerous way to do it, but we will focus on two most intuitive.

As Data Array

In general, we can load the dataset directly using the load() method, resulting in a mixture of the array and coordinate – Data Array having month datatype as date-time(represents time series).

AirPassengersDataset().load()

MonthlyMilkDataset().load()

Output:

air passenger | Time Series using Darts — **air-dataset**

milk dataset | Time Series using Darts — **milk-dataset**

As Dataframe

Alternatively, it can be loaded as a data frame using pd_dataframe().

display("Air Passanger Dataset",AirPassengersDataset().load().pd_dataframe())

display("Monthly Milk Dataset",MonthlyMilkDataset().load().pd_dataframe())

It looks like we have 144 observations for air-passenger and 168 observations for the monthly-milk dataset.

Plotting Datapoints

Dataframe is ok, but It doesn’t reveal much, so we can plot our dataset using matplotlib:

# loading library
import matplotlib.pyplot as plt
%matplotlib inline

# Loading Dataset as Data-Array
air_series = AirPassengersDataset().load() 
milk_series = MonthlyMilkDataset().load()

#plot chart
air_series.plot(label="Number Of Passengers")
milk_series.plot(label="Pounds Of Milk Produced Per Month" )
plt.legend();

In the above code, we just loaded the data and plotted it. Also, we include a label and legend so that it is easy to understand.

The result comes out as a nice looking chart:

Here x-axis represents the month, and the y-axis – data.

Note you need to downgrade to matplotlib 3.1.3 for code to work in collab.

Data Preprocessing

Carefully observing the above, we can find that the data is not scaled, i.e., it shows variability, so it’s good if we rescale it.

Standard Scaling

Luckily we have Scaler () class for this in the library itself, and we can use it by creating a scaler object and then fitting it to the dataset.

```
# import
```

from darts.dataprocessing.transformers import Scaler

```
# creating scaler object
```

scaler_air , scaler_milk = Scaler(), Scaler()

```
# perfoming the scaling
```

air_series_scaled = scaler_air.fit_transform(air_series)

milk_series_scaled = scaler_milk.fit_transform(milk_series)

In the above code, we have imported Scaler, created two scaler objects, and performed the scaling.

Now let’s look at the changes to see the difference:

# plottingair_ds.plot(label="Number Of Passengers")
air_series_scaled.plot(label = "Air Passangers Scaled")
milk_series_scaled.plot(label = "Milk In Pounds")
plt.legend();

As can be witnessed, our data is scaled and is evident in the plot’s y-axis.

Train Test Split

Now using scaled data, we will split our dataset. For this, we will take the first 36 samples as a training set and remain as a validation set for both datasets.

air_series_train,  air_series_val = air_series_scaled[:-36], air_series_scaled[-36:] 
milk_series_train, milk_series_val = milk_series_scaled[:-36], milk_series_scaled[-36:]

Note this is not a data frame rather a darts time-series object.

For confirming, you can do:

type(air_series_train)

Training the Time Series Model using Darts

Finally, we are in a state to perform the training. DART’s provide many solutions like Arima, Auto-Arima, Varima FFT, Four Theta, Prophet, and a few deep learning models like RNN, Block RNN(Uses LSTM), TCN, NBEATS, Transformer.

Loading the Darts Time Series Model

For our use case, we will go by the N-BEATS model provided as it supports multivariate time series forecasting(data having multiple features), which will allow us to perform all forecasting using a single model. So let’s load it.

# importing model
from darts.models import NBEATSModel
print('model_loaded')

>> model_loaded

NOTE: you are free to use any model of your liking, but make sure you read the documentation and see each model’s features.

Creating Model Object

Having loaded our model, let’s initiate it.

# creating a model object
model = NBEATSModel(input_chunk_length=24 , output_chunk_length=12, n_epochs = 100 , random_state = 15)

One of the quick things to note here is input, and output chunk length is 24 months and 12 months, respectively. In time series, we usually prefer a window over time instead of using real data. Also, we are doing our training for 100 epochs.

Additional Elaboration

In the first steps/epoch, we will provide 24 months of data as input data and 12 months as output data.

In the next step, we will move one step ahead and provide the next 24 months’ data as input and 12 months ‘ output, and so on till all data points of the training set are exhausted.

Based on this at each step a loss is calculated and the model learns to perform better and better over time.

Fitting Data To Model

Now finally, let’s train our model by fitting our training data. It may take time, so be patient!

# fitting the model
model.fit([air_series_train, milk_series_train], verbose = True)

Verbose = True insures logs.

Fitting Data To Model — **Training Logs**

Prediction and Evaluation of Time Series Model Using Darts

To ensure the model trained is performing well, we can check it MAPE – Mean Absolute percentage error for the predicted data.

# imports
from darts.metrics import mape

pred_air = model.predict(n = 36, series = air_series_train)
pred_milk = model.predict(n =36, series = milk_series_train)

print("Mape = {:.2f}%".format(mape(air_series_scaled , pred_air)))
print("Mape = {:.2f}%".format(mape(milk_series_scaled , pred_milk)))

Output:

>> Mape = 6.74% 
>> Mape = 16.82%

As evident, the errors are quite low, and an interesting pattern emerges that there is somewhat relation between milk produced by cows and the number of air passengers traveling, which is quite interesting. Also, note that we have used a single model to predict both the dataset! (an issue addressed by our boss to fix😁).

Visualization the Time Series using Darts Model Prediction

Finally, let’s look at how our predictions come out on a graph using the same way we checked our dataset but using our scaled dataset for better interpretability.

# plotting results
air_series_scaled .plot(label = "actual")
pred.plot(label = "forecasted") # validation data set
plt.legend()

# plotting results

milk_ds_scaled.plot(label = "actual")

pred.plot(label = "forecasted") # validation data set

plt.legend()

Visualization the Time Series using Darts Model Prediction 1 — **air_dataset-predection**

Visualization the Time Series using Darts Model Prediction 2 — **milk_dataset-predection**

The blue line represents the predicted data are very close to actuals(black line). Meaning we have succeeded in our quest and solved the issue addressed in our scenario. Congrats to you!

Conclusion

Some conclusions to derived are:

Libraries like darts can provide us with a new way to work over time-series, allowing for flexibility and efficiency.

As evident, we can load, process, and even train multiple datasets using a single library and model.

With all the charm, one can successfully throne it as a scikit-learn for time series data.

With this, we have come to an end of this miniature article. I hope you have found this interesting and will apply the learned in some of your projects.

See you later.

REFERENCES:

Code Notebook:- Collab

To Connect:- Linkedin, Twitter, Github, AnalyticsVidhya

All images are by the author.

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion.

Purnendu

A dynamic and enthusiastic individual with a proven track record of delivering high-quality content around Data Science, Machine Learning, Deep Learning, Web 3.0, and Programming in general.

Here are a few of my notable achievements👇

🏆 3X times Analytics Vidhya Blogathon Winner under guides category.

🏆 Stackathon by Winner Under Circle API Usage Category - My Detailed Guide

🏆 Google TensorFlow Developer ( for deep learning) and Contributor to Open Source

🏆 A Part Time Youtuber - Programing Related content coming every week!

Feel free to contact me if you wanna have a conversation on Data Science, AI Ethics & Web 3 / share some opportunities.

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Introduction

Common Patterns

Validation Techniques

Time Series Forecasting

Exponential Smoothing

ARIMA

Prophet

Deep Learning

Time Series Forecasting Made Easy Using Darts

Overview of Time Series Using Draft

Installation of Drafts for Time Series

Loading Dataset from Dart’s Library

Printing Dataset

Output:

Plotting Datapoints

Data Preprocessing

Standard Scaling

Train Test Split

Training the Time Series Model using Darts

Loading the Darts Time Series Model

Creating Model Object

Additional Elaboration

Fitting Data To Model

Prediction and Evaluation of Time Series Model Using Darts

Visualization the Time Series using Darts Model Prediction

Conclusion

REFERENCES:

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC

G_ENABLED_IDPS

test_cookie

Webengage (2)

_we_us

WebKlipperAuth

LinkedIn (16)

ln_or

JSESSIONID

li_rm

AnalyticsSyncHistory

lms_analytics

liap

visit

li_at

s_plt

lang

s_tp

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

s_pltp

s_tslv

li_theme