This article was published as a part of the Data Science Blogathon.
Introduction
Gentle Overview
What is Time Series Analysis?
ARIMA
Moving Average
Exponential Smoothing
Heard of DogeCoin?
Implementation of Dogecoin price prediction
Conclusion
“Machine learning will automate jobs that most people thought could only be done by people.” ~Dave Waters
You have probably come across this well-known quote about machine learning, and the deeper you go into ML and its applications, the more it rings true. Time series analysis can seem intimidating to beginners, but I’ll try to keep it simple. We’ll start with the basic concepts and then walk through a recent project that I have been involved in.
The basis of any machine learning model is to predict future data from insights gained from past data, and the complexity of the model depends on the type of data you need to predict. The output can range from simple tasks, such as predicting a house price or the salary of an employee, to complex ones such as weather prediction, solar radiation prediction or sales forecasting. When the type of input or output data changes, the approach must change too. This is where time series analysis comes in handy: it is an effective approach for predictions on data that is ordered in time.
Time series analysis is a technique for analyzing a sequence of data points collected over an interval of time. The data is recorded consistently, at regular intervals over a period, rather than sampled at random. It can also be understood as a sequence of observations of a particular variable, ordered in time.
The major components of a time series are usually listed as trend, seasonality, cyclical variation and irregular (random) variation.
Time series forecasting itself comes in several types, including ARIMA, moving average and exponential smoothing. The most important part of building your model is choosing the right type of analysis or forecasting method.
ARIMA stands for autoregressive integrated moving average. It builds on the autoregressive moving average (ARMA) model by adding an integration step, that is, differencing. ARIMA is a statistical model for analyzing time series data and forecasting future trends. It converts a non-stationary series into a stationary one by differencing and then predicts outputs from previous data.
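The differencing step mentioned above can be illustrated with a minimal sketch. This is not a full ARIMA implementation; the helper name `difference` and the toy series are assumptions made only for this example. It shows how subtracting each value from the next one removes a steady trend.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the "I" step of ARIMA)."""
    values = list(series)
    for _ in range(d):
        # Each new value is the change from the previous observation.
        values = [values[i] - values[i - 1] for i in range(1, len(values))]
    return values


# A toy series with a steady upward trend, so its mean is not constant.
trend = [3 + 2 * t for t in range(8)]  # 3, 5, 7, ...
diffed = difference(trend, d=1)
print(diffed)  # every entry equals the constant slope, 2
```

After one round of differencing the trend is gone and only the constant step remains. Real price data is noisier, so the differenced series would fluctuate around a level rather than being exactly constant.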
In its simplest form, the moving average approach takes the next observation to be the mean of the past observations. The moving average (MA) model, as used in ARIMA, instead uses previous forecast errors rather than previous values directly, as an autoregression does. A moving average model of finite order is always stationary.
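As a rough illustration of the "mean of past observations" idea, the sketch below forecasts the next value as the average of the last few points. It is a simple moving-average forecast, not the error-based MA model; the function name `moving_average_forecast`, the window size and the sample prices are all assumptions for this example.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical closing prices, used only for illustration.
prices = [10.0, 12.0, 11.0, 13.0, 14.0]
forecast = moving_average_forecast(prices, window=3)
print(forecast)  # mean of 11.0, 13.0 and 14.0
```

A larger window gives a smoother but slower-reacting forecast, while a smaller window follows recent changes more closely.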
Exponential smoothing is a forecasting method for univariate data. It can be extended to support data with a trend or seasonal component, and it is a powerful alternative to the popular ARIMA family. It gives more weight to recent observations than to older ones, which can make the forecasts more responsive to recent changes.
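To make the weighting of recent data concrete, here is a minimal sketch of simple exponential smoothing (without trend or seasonality). The function name, the smoothing factor `alpha=0.5` and the sample values are assumptions chosen for illustration. Each smoothed value blends the newest observation with the previous smoothed value.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical closing prices, used only for illustration.
prices = [10.0, 12.0, 11.0, 13.0]
levels = simple_exponential_smoothing(prices, alpha=0.5)
print(levels)      # [10.0, 11.0, 11.0, 12.0]
print(levels[-1])  # the last level serves as the one-step-ahead forecast
```

An `alpha` close to 1 makes the forecast follow the latest value almost exactly, while a small `alpha` spreads the weight over a longer history.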
A cryptocurrency is a digital currency that serves as a medium of exchange over computer networks and is not subject to any central authority such as a government. Dogecoin is an open-source cryptocurrency that is a unit of currency but has no real-world utility. It was created by software engineers Billy Markus and Jackson Palmer, who started this payment system as a “joke”, poking fun at the speculation in cryptocurrencies at that time.
As of 11/03/2022, 1 Dogecoin = 8.85 INR; values have crashed.
Dogecoin is also known as the first “meme coin”.
The project discussed below predicts future Dogecoin values with a machine learning model built on a time series analysis of the data.
Implementation
So let’s go through the implementation
For this project, we are using Dogecoin historical data, which contains the open, high, low and close prices and the trading volume of Dogecoin from November 2017 to January 2022. With this data, we can predict the future daily price of Dogecoin.
First of all, let’s import all the required libraries for the project.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.ensemble import RandomForestRegressor
import warnings

warnings.filterwarnings("ignore")
Let’s import the Dogecoin historical dataset.
data = pd.read_csv("DOGE-USD.csv")
data.head()
data.tail()
Now, check if there are any null values present in the data.
data.isnull().any()
Output
Date      False
Open      False
High      False
Low       False
Close     False
Volume    False
dtype: bool
From the above output, it is clear that there are no null values in the given dataset.
For the time series analysis, we are interested only in the daily closing price of Dogecoin, so we take the Close column as X.
X = data['Close']
X = np.array(X).reshape(-1, 1)
Now let’s plot the trend of the closing price of Dogecoin.
plt.plot(X)
You can see the fluctuations in the daily closing price of Dogecoin.
For time series analysis, we need to build a supervised data set from X. We use a sliding window: the N values from position i to i+N-1 are the input, and the value at position i+N is the output, where i ranges from 0 up to (but not including) the total length of the data minus N. Here N is 25.
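x_data = []
y_data = []
column_len = 25  # window size N: number of past days used as input
# Stop at len(X) - column_len so the last closing price is also used as a target
for i in range(len(X) - column_len):
    x_data.append(X[i:i + column_len, 0])  # closing prices for days i .. i+N-1
    y_data.append(X[i + column_len, 0])    # closing price for day i+N

x_data = np.array(x_data)
y_data = np.array(y_data)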
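To see what the sliding window produces, here is a small sketch on a toy list with a window of 3 instead of 25. The helper name `make_windows` and the toy series are assumptions for illustration, and plain Python lists are used instead of NumPy arrays so that the example runs on its own. The loop runs up to `len(values) - window` so that the final value is also used as a target.

```python
def make_windows(values, window):
    """Split a series into (input window, next value) pairs."""
    inputs, targets = [], []
    for i in range(len(values) - window):
        inputs.append(values[i:i + window])  # values at positions i .. i+window-1
        targets.append(values[i + window])   # the value right after the window
    return inputs, targets


series = [1, 2, 3, 4, 5, 6]
inputs, targets = make_windows(series, window=3)
for row, target in zip(inputs, targets):
    print(row, "->", target)
# [1, 2, 3] -> 4
# [2, 3, 4] -> 5
# [3, 4, 5] -> 6
```

Each row becomes one training sample, so a series of length L with window N yields L - N samples.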
Our input and output data are now ready for building the machine learning model. We use RandomForestRegressor, so let’s train the model on the data set.
model = RandomForestRegressor(n_estimators=200)
model.fit(x_data, y_data)
Our model is now ready to use for predicting the future Dogecoin price.
Now let’s predict the Dogecoin price for 10 days starting from the last date given in the dataset.
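c = X[len(X) - column_len:]  # last 25 closing prices, shape (25, 1)
a = 10  # number of days to forecast
for _ in range(a):
    # Predict the next day from the current 25-day window
    x = model.predict(c.reshape(1, -1)).reshape(1, -1)
    # Slide the window: drop the oldest value and append the prediction
    c = np.concatenate((c[1:], x))
    print(x)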
Output
[[0.17144551]]
[[0.17587189]]
[[0.17762981]]
[[0.17761467]]
[[0.17605098]]
[[0.17487411]]
[[0.17376281]]
[[0.17203149]]
[[0.16989914]]
[[0.16412252]]
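The prediction loop above feeds each forecast back into the window to produce the next one. The sketch below isolates that recursive pattern with a stand-in predictor (the mean of the window) in place of the trained random forest. The names `recursive_forecast` and `mean_of_window` and the toy history are assumptions for illustration only.

```python
def recursive_forecast(history, window, steps, predict_next):
    """Forecast `steps` values, feeding each prediction back into the window."""
    current = list(history[-window:])  # start from the most recent observations
    forecasts = []
    for _ in range(steps):
        next_value = predict_next(current)
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        current = current[1:] + [next_value]
    return forecasts


def mean_of_window(window_values):
    """Stand-in for a trained model: predicts the mean of the window."""
    return sum(window_values) / len(window_values)


history = [2.0, 4.0, 6.0, 8.0]
predictions = recursive_forecast(history, window=2, steps=3, predict_next=mean_of_window)
print(predictions)  # [7.0, 7.5, 7.25]
```

Because later forecasts depend on earlier ones rather than on observed prices, any error in an early step carries forward into the steps that follow.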
That’s it: you now have your predictions. Note that care must be taken in the preprocessing steps.
I hope you now have some intuition about time series analysis and an understanding of the Dogecoin prediction project discussed above. It isn’t hard to digest, but it does need focus. Go through it once more if that helps, and work through the algorithm yourself for a better understanding. I hope you liked my article on Dogecoin prediction. Share your thoughts in the comments below.
Have a nice day! :)
The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion.
Hi, a friendly piece of advice: what about autocorrelation, which plays an important part in time series forecasting? You could have introduced the concept of "autocorrelation" before the methods involved!