Drawdata – Draw a Customized Dataset Using Python

Mayur Last Updated : 28 Apr, 2021
This article was published as a part of the Data Science Blogathon.

Overview

  • Introduction
  • Need of data in machine learning
  • Introduction to Drawdata
  • Installation
  • Importing important modules
  • Draw varieties of plots
  • End Notes

Introduction

“Data is the new oil”

   — Clive Humby

When we say data is everything, is it really? If you work in a technical field, you will almost certainly say “yes”: where there is a river, there is always water. The world needs more and more data to perform even fundamental tasks. Let’s take a brief look at what data is:

Data can be defined as units of information, whether numeric or collected from any kind of observation. In simple words, a collection of facts, numbers, measurements, or observations can be referred to as data. Many of us confuse data with information, so let’s clear that up: data is an individual unit, whereas information is a collection or group of data.

As members of the data science community, we already know what machine learning is. But do you know why data is needed to create a dataset for training ML models? If not, we will discuss it next.

 

Need of data in machine learning

We know that machine learning uses algorithms that keep improving over time, but the quality of the data they are trained on matters just as much for a model’s accuracy.

“You just need to understand the data to truly understand how machine learning works”

To build a machine learning model we need a collection of data, so we create a group of data called a dataset. A dataset is a collection of cases that all share common attributes. Building ML models on it helps us understand the relationships between data points.
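
To make the idea of a dataset concrete, here is a minimal sketch using pandas. The column names and values are made up purely for illustration: each row is one case, and each column is an attribute shared by all cases.

```python
import pandas as pd

# Hypothetical toy dataset: every row is one case (a data point),
# every column is an attribute that all cases share.
toy_dataset = pd.DataFrame(
    {
        "height_cm": [150, 160, 170, 180],
        "weight_kg": [50, 58, 69, 80],
        "label": ["a", "a", "b", "b"],
    }
)

n_cases, n_attributes = toy_dataset.shape
print(f"{n_cases} cases, {n_attributes} attributes")
print(toy_dataset)
```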

In any data science project life cycle, you have probably noticed steps such as EDA, feature selection, and model building, but you may not have noticed how the data itself is created, because we generally use datasets that already exist. Often we do not want a complex dataset for a machine learning model; we would rather use a simple one. Do you know how such data can be generated? If not, don’t worry, we will discuss it below:

Drawdata

Drawdata is an open-source Python library that lets you generate data simply by drawing it: you draw the points, and the corresponding data is generated automatically. It is easy to use and has a user-friendly interface. It lets users create datasets of different shapes and sizes that can then be used for machine learning models.

The library is meant to be used inside a Jupyter notebook, so let’s walk through how to use Drawdata there:

To use the Drawdata library in Jupyter, you need to install it first:

Installation

We install two libraries, drawdata and pandas. You only need to run the following commands in the command prompt:

pip install drawdata
pip install pandas
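
If you want to confirm from Python that both packages were installed, a small check like the one below can help. It is only a sketch: `missing_packages` is a helper name made up for this example, and it relies on the standard-library `importlib.util.find_spec`, which reports whether a top-level module can be found without importing it.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(["drawdata", "pandas"])
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("drawdata and pandas are both available")
```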

Importing important modules

import pandas as pd
from drawdata import draw_scatter
from drawdata import draw_line
from drawdata import draw_histogram

After importing the library, let’s see how data is drawn:

Draw varieties of plots

1. Scatter draw

scatter_plot = draw_scatter()
scatter_plot

In the widget that appears, the options at the top are used to draw the scatter pattern in the drawing area.

2. Line draw

line_plot = draw_line()
line_plot

Here we draw lines in different colors to represent different groups of data in the dataset.

3. Histogram draw

hist_draw = draw_histogram()
hist_draw

Here the data takes the form of a histogram, so we can draw fake data and see its distribution directly.

This is how to create different datasets using the drawdata library. But how do we save this data into a dataframe so that we can build a machine learning model? See below:

Click one of the options above the drawing area to choose the format in which you wish to store the data:

After this, you can use pandas to read the clipboard to get your drawn data into a dataframe.

df = pd.read_clipboard(sep=",")
df

So, this is the resultant dataset that we have created using the drawdata library.
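
Once the drawn points are in a dataframe, a quick summary helps confirm that the data looks the way you drew it. The sketch below assumes the copied CSV has columns `x`, `y` and a label column `z`; check `df.columns` in your own notebook, since the exact names may depend on the drawdata version. So that the example runs without a clipboard, it reads a few made-up rows from a string instead of calling `pd.read_clipboard`.

```python
import io

import pandas as pd

# Made-up rows standing in for the text the widget copies to the clipboard.
# The column names x, y, z are an assumption, not taken from the article.
csv_text = """x,y,z
1.0,2.0,a
1.5,2.5,a
8.0,9.0,b
9.0,8.0,b
8.5,9.5,b
"""

# In the notebook you would use: df = pd.read_clipboard(sep=",")
df = pd.read_csv(io.StringIO(csv_text))

# How many points were drawn for each label, and where do they sit on average?
points_per_label = df["z"].value_counts().sort_index()
label_centers = df.groupby("z")[["x", "y"]].mean()

print(points_per_label)
print(label_centers)
```

Saving the result with `df.to_csv("drawn_data.csv", index=False)` keeps the dataset for later sessions; the file name is only an example.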

Endnote

This article was suggested by my friend; drawdata is an amazing library that I find really interesting. Now it’s time to build your own dataset by drawing.

Thank You.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
