This article was published as a part of the Data Science Blogathon
With the rapid development of Artificial Intelligence, terms like Machine Learning and Deep Learning have become buzzwords in the technology world. Machine Learning is a branch of Artificial Intelligence that automates analytical model building for data analysis, while Deep Learning is an advanced extension of Machine Learning that uses Artificial Neural Networks to make intelligent decisions on its own. In this article, we discuss one of the hot topics in data mining, fraud detection in credit cards, and extend the discussion with an implementation based on the Gated Recurrent Unit (GRU) deep learning architecture.
A credit card is a thin rectangular plastic card with a magnetic strip, issued by a financial institution to its customers. These cards allow customers to buy items without paying in cash or by cheque. The financial institution pre-sets the limit of each card, based on the customer's monthly income, before issuing it.
Fraud is a wrongful activity carried out by an illegitimate person who misrepresents themselves for monetary or property gain. Credit card fraud, therefore, is essentially the capture of confidential information such as passwords and CVV numbers by intruders. This is why credit card fraud detection techniques are needed to protect cardholders from such activity.
India is on its way to becoming a developed country. To achieve this, the Government of India (GoI) has launched several initiatives, one of which is the Digital India Campaign. The government's main intention behind this initiative is to digitally empower the nation. One of its main tasks is the promotion of a cashless economy, encouraging transactions through debit cards, credit cards, net banking, UPI, etc. as modes of payment rather than regular cash or cheque payments.
The GoI and the Reserve Bank of India (RBI) have focused immensely on digitalizing transactions. Digital payments have come in handy in times of crisis, including the ongoing COVID-19 pandemic and the 2016 demonetization. The government and other financial institutions recommend digital transactions because of their several advantages, one of the most important being that they save customers' time.
Customers no longer have to visit ATMs and stand in queues to withdraw money; whenever they want to make a payment, they simply swipe the card and enter the PIN, or provide the OTP when shopping online. Another important reason for promoting electronic transactions is to trace the flow of black money and charge tax defaulters.
This transformative technology comes with some disadvantages too, and cybersecurity is one of the challenges it faces in the present scenario. Online transactions expose our sensitive information, and any data breach can result in huge losses for both the service provider and the customer. This is one of the major issues of the contemporary world, where intruders exploit the slightest loophole in a system to carry out fraudulent transactions. We are therefore in dire need of techniques that identify these loopholes and detect the fraudsters involved. This threat is recognized globally and can be carried out through skimming, phishing, stolen credit cards, and similar means.
Different sources can be responsible for such fraud: the customer, the bank or credit card service provider, or a third party. A customer who makes a payment using a credit card and then fails to repay the amount falls into the first category. Banks or credit card service providers can create fraudulent charges for crossing limits, late payments, or cash withdrawals. But the major threat comes from the third party: if a third party manages to obtain the sensitive data of the cardholder, the consequences can be abysmal.
As per the Reserve Bank of India, the total number of fraud cases registered during the years 2015-16, 2016-17, 2017-18, and 2018-19 was 1192, 1372, 2059, and 921 respectively. Credit card fraud detection therefore helps individuals protect themselves from such illegal acts. Credit cards are widely accepted, so the threat of their misuse is huge.
The major issue with credit card fraud detection is how to classify a transaction as fraudulent or non-fraudulent, because the transactional data sets used for fraud detection are highly unbalanced: fraudulent transactions usually make up only a tiny fraction of all records, which is a challenge that needs attention in itself.
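To make this concrete, the imbalance can be quantified with a few lines of pandas. This is only an illustrative sketch; it assumes the publicly available credit card transactions file (creditcard.csv, the same data set used in the implementation later in this article) with a binary Class column where 1 marks a fraudulent transaction.

import pandas as pd

# Assumes the Kaggle credit card data set used later in this article,
# with a 'Class' column: 0 = legitimate, 1 = fraudulent
transactions = pd.read_csv('creditcard.csv')

counts = transactions['Class'].value_counts()
fraud_share = counts[1] / counts.sum()

print(counts)
print("Fraudulent share of all transactions: {:.4%}".format(fraud_share))

In this data set the fraudulent class accounts for well under one percent of transactions, which is why plain accuracy is a misleading metric and resampling or class-sensitive evaluation is needed.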
Credit card fraud usually impacts both the issuing companies and the cardholder, and it can be carried out in many ways. Some of the main types are:
Merchant-Related Fraud: These frauds are carried out by the merchant organization itself. They include:
Merchant Collusion: This type of fraud occurs when someone from the organization itself leaks information about the card and the cardholder and passes it on to fraudsters.
Triangulation: This type of fraud occurs through websites. When shopping online, users enter the confidential details of their card; if this information reaches fraudsters, they use it to carry out fraudulent activity.
Site Cloning: Also called phishing, this is a fraud in which the fraudster creates a clone of a website that the user accesses believing it to be the genuine one; the information submitted by the user is then used illegally for fraudulent activity.
False Merchant Sites: Some websites ask users to enter personal information such as name and age, and to "verify" this information they even ask users to provide their credit card details; these websites then illegally sell the information to third parties.
Credit Card Generators: By using mathematical algorithms and number combinations, it is possible to generate card numbers in practically any format that pass basic validity checks (see the sketch after this list).
Other kinds of fraud: These include stolen or lost credit cards, card-not-present fraud, erasing the magnetic strip, etc.
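Card numbers produced by generators can look genuine because most front-end checks only verify the Luhn checksum that every valid card number must satisfy. Below is a minimal, illustrative implementation of that check; the function name luhn_valid is ours and not part of any library.

def luhn_valid(card_number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(d) for d in card_number if d.isdigit()]
    total = 0
    # Double every second digit from the right; subtract 9 if the result exceeds 9
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_valid("4539 1488 0343 6467"))  # True: a commonly used test number
print(luhn_valid("4539 1488 0343 6468"))  # False: last digit altered

Passing this check says nothing about whether an account actually exists, which is why issuers still rely on transaction-level fraud detection.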
Moreover, credit card fraud affects various parties: the merchants, the cardholders, and the banks.
The Cardholders: They are the ones least affected by credit card fraud. Whenever fraud occurs, the cardholder can inform the credit card issuer, which will then investigate whether an illegal transaction has taken place. If such fraudulent activity is confirmed, the issuer charges the lost amount back to the customer.
The Merchants: They are the ones most affected by fraud. If they fail to provide evidence to challenge the issuer, the chargeback results in huge losses for them.
The Banks: As per the norms, banks must recover the spent amount from the users either directly or indirectly, and they also need to spend large sums on developing technology that can handle fraudulent activity.
As we know, India is on the verge of becoming a digitally empowered nation, and several initiatives have been launched by the Government of India to accomplish this. Due to digitization, most people now prefer online shopping, which requires payments through credit cards, debit cards, or net banking rather than the regular modes.
In online payments, the only requirement is sensitive information such as passwords, CVV numbers, and OTPs; no physical card is needed. But if this sensitive information is compromised, it leads to huge losses for both the service provider and the customer. In such a scenario, a credit card fraud detection technique is required to tackle the challenges faced by the cardholder.
Moreover, there is no technology available to date that traces fraudulent transactions in real time. Credit card companies and banks still follow the old approach of analyzing transactions after they have happened and then applying machine learning or deep learning algorithms to predict whether a transaction is fraudulent or non-fraudulent, which is itself a time-consuming procedure. Even when they succeed in determining the class label of a transaction, it is often too late to compensate for the loss.
Also, there is a chance that the fraudster has committed many more illegitimate transactions before being recognized. So, to protect the financial bodies as well as the cardholder, such a technique is the need of the hour.
The credit card has gained popularity by paving its way into diversified fields, and in all these fields where credit cards are accepted, the chances of credit card fraud are very high. We need a solid validation structure wherever credit cards are used in order to detect any kind of fraudulent activity. In other words, every field that accepts credit cards also requires some fraud detection technique to catch fraudulent transactions.
Apart from its need and its various applications, credit card fraud detection also faces several challenges, most notably the highly imbalanced nature of transaction data and the lack of real-time detection discussed above.
We have extended our discussion by implementing a credit card fraud detection method using a deep learning architecture. All experiments were conducted in Python 3.7.7, in Jupyter Notebook 6.0.3 running on the Anaconda platform. We implemented the Gated Recurrent Unit (GRU) architecture using the Keras library with TensorFlow as the backend; other libraries used are NumPy, SciPy, pandas, Matplotlib, seaborn, scikit-learn, and imbalanced-learn. The data set was split into training and test sets in a 10:90 ratio. In our model, we have used one hidden layer, ten neurons, the sigmoid activation function, and 100 training epochs.
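Before walking through the code, it helps to recall what a single GRU cell computes. For an input x_t and previous hidden state h_{t-1}, the cell uses an update gate z_t and a reset gate r_t; the formulation below follows Cho et al. (2014), though gate conventions differ slightly across libraries:

z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)
r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)
\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)
h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t

In the implementation below, activation='sigmoid' replaces the usual tanh for the candidate state. With that background, the full code and its outputs follow.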
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

# Pandas options
pd.set_option('display.max_colwidth', 1000, 'display.max_rows', None, 'display.max_columns', None)

# Plotting options
%matplotlib inline
mpl.style.use('ggplot')
sns.set(style='whitegrid')

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.layers import Conv1D, MaxPool1D

# Load the credit card transactions data set
transactions = pd.read_csv('desktopcreditcard.csv')

transactions.shape
(284807, 31)

transactions.info()
RangeIndex: 284807 entries, 0 to 284806
Data columns (total 31 columns): Time, V1-V28 and Amount are non-null float64, Class is non-null int64
memory usage: 67.4 MB

transactions.isnull().any().any()
False

# The class distribution is highly imbalanced
transactions['Class'].value_counts()
0    284315
1       492
Name: Class, dtype: int64

transactions['Class'].value_counts(normalize=True)
0    0.998273
1    0.001727
Name: Class, dtype: float64

# Separate the features and the response variable
X = transactions.drop(labels='Class', axis=1)  # Features
y = transactions.loc[:, 'Class']               # Response

# Stratified 10:90 train/test split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.9, random_state=1, stratify=y)

X_train.shape
(28480, 30)
X_test.shape
(256327, 30)

# Keep the top 75% of features according to the default univariate score function
from sklearn.feature_selection import SelectPercentile
select = SelectPercentile(percentile=75)
select.fit(X_train, y_train)
X_train_selected = select.transform(X_train)
X_test_selected = select.transform(X_test)

print('X_train.shape is :{}'.format(X_train.shape))
X_train.shape is :(28480, 30)
print('X_train_selected.shape is :{}'.format(X_train_selected.shape))
X_train_selected.shape is :(28480, 22)

# Oversample the minority (fraud) class with SMOTE
# (note: the resampled arrays are not used by the model fitted below;
# fit_sample has been renamed to fit_resample in newer imbalanced-learn versions)
from imblearn.over_sampling import SMOTE
sm = SMOTE(random_state=2)
X_train_res, y_train_res = sm.fit_sample(X_train, y_train.ravel())

print('After OverSampling, the shape of train_y: {} \n'.format(y_train_res.shape))
After OverSampling, the shape of train_y: (56862,)
print("After OverSampling, counts of label '1': {}".format(sum(y_train_res == 1)))
After OverSampling, counts of label '1': 28431
print("After OverSampling, counts of label '0': {}".format(sum(y_train_res == 0)))
After OverSampling, counts of label '0': 28431

# Standardize the full feature matrix
# (note: the scaled array X is likewise not used by the model fitted below)
from sklearn.preprocessing import StandardScaler
stdscaler = StandardScaler()
X = stdscaler.fit_transform(X)

# Convert the targets to NumPy arrays and reshape the selected features to the
# (samples, timesteps, features) layout expected by the GRU layer
y_train = y_train.to_numpy()
y_test = y_test.to_numpy()
X_train_selected = X_train_selected.reshape(X_train_selected.shape[0], X_train_selected.shape[1], 1)
X_test_selected = X_test_selected.reshape(X_test_selected.shape[0], X_test_selected.shape[1], 1)

X_train_selected.shape
(28480, 22, 1)
X_test_selected.shape
(256327, 22, 1)
y_train.shape
(28480,)

# Build the GRU model: two GRU layers with dropout and a sigmoid output unit
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import GRU
from keras.layers import Dropout

model = Sequential()
model.add(GRU(units=22, return_sequences=True, input_shape=(X_train_selected.shape[1], 1), activation='sigmoid'))
model.add(Dropout(0.2))
model.add(GRU(units=10, return_sequences=False, activation='sigmoid'))
model.add(Dropout(0.2))
model.add(Dense(units=1, activation='sigmoid'))

model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
gru_1 (GRU)                  (None, 22, 22)            1584
_________________________________________________________________
dropout_1 (Dropout)          (None, 22, 22)            0
_________________________________________________________________
gru_2 (GRU)                  (None, 10)                990
_________________________________________________________________
dropout_2 (Dropout)          (None, 10)                0
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 11
=================================================================
Total params: 2,585
Trainable params: 2,585
Non-trainable params: 0
_________________________________________________________________

# Compile and train for 100 epochs, validating on the held-out test set
from tensorflow.keras.optimizers import Adam
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train_selected, y_train, epochs=100, validation_data=(X_test_selected, y_test), verbose=1)

Train on 28480 samples, validate on 256327 samples
Epoch 1/100
28480/28480 [==============================] - 88s 3ms/step - loss: 0.2219 - accuracy: 0.9447 - val_loss: 0.0618 - val_accuracy: 0.9983
Epoch 2/100
28480/28480 [==============================] - 73s 3ms/step - loss: 0.0548 - accuracy: 0.9983 - val_loss: 0.0264 - val_accuracy: 0.9983
Epoch 3/100
28480/28480 [==============================] - 53s 2ms/step - loss: 0.0325 - accuracy: 0.9983 - val_loss: 0.0171 - val_accuracy: 0.9983
...
Epoch 99/100
28480/28480 [==============================] - 48s 2ms/step - loss: 0.0029 - accuracy: 0.9993 - val_loss: 0.0040 - val_accuracy: 0.9992
Epoch 100/100
28480/28480 [==============================] - 48s 2ms/step - loss: 0.0026 - accuracy: 0.9996 - val_loss: 0.0039 - val_accuracy: 0.9993

# Predicting the test set results with a 0.5 decision threshold
y_pred = model.predict(X_test_selected)
y_pred = (y_pred > 0.5)

score = model.evaluate(X_test_selected, y_test)
256327/256327 [==============================] - 36s 140us/step
score
[0.0039123187033642, 0.9992704391479492]

# Let's see how our model performed
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

           0       1.00      1.00      1.00    255884
           1       0.81      0.75      0.78       443

    accuracy                           1.00    256327
   macro avg       0.91      0.88      0.89    256327
weighted avg       1.00      1.00      1.00    256327

from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred)
array([[255807,     77],
       [   110,    333]], dtype=int64)

from sklearn.metrics import matthews_corrcoef
MCC = matthews_corrcoef(y_test, y_pred)
print("Matthews correlation coefficient is {}".format(MCC))
Matthews correlation coefficient is 0.7809955296478702

# ROC curve and AUC based on the raw predicted probabilities
from sklearn.metrics import roc_curve
from sklearn.metrics import auc
y_pred_keras = model.predict(X_test_selected).ravel()
fpr_keras, tpr_keras, thresholds_keras = roc_curve(y_test, y_pred_keras, pos_label=True)
auc_keras = auc(fpr_keras, tpr_keras)

plt.figure(1)
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr_keras, tpr_keras, label='Keras (area = {:.3f})'.format(auc_keras))
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve')
plt.legend(loc='best')
plt.show()
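The fit call above also returns a History object, captured in the variable history but not used further. If desired, the learning curves can be plotted from it with a few extra lines; this is a minimal sketch assuming the same notebook session:

# Plot training vs. validation loss from the History object returned by model.fit
plt.figure(2)
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Binary cross-entropy loss')
plt.title('Learning curves')
plt.legend(loc='best')
plt.show()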
In this article, we have covered various aspects of credit card fraud detection and implemented a detection model using the Gated Recurrent Unit. We have used several evaluation metrics, but our main focus is the F1 score; after evaluation, our model achieved an F1 score of 0.78 for the fraud class.
https://github.com/saloni151/credit-card-fraud-detection-using-gru/blob/main/GRU%20european%20sigmoid%201%20hl%2010nn.ipynb
For further queries, contact us:
Saloni and Ritesh
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.