Top 5 Data Science & Machine Learning Repositories on GitHub in Jan 2018

Pranav Dar Last Updated : 05 Jun, 2020

Introduction

Breakthroughs in data science and machine learning are happening at a breakneck pace. If you work in this field, it’s extremely important to keep up with what’s new.

Following GitHub repositories is one way to do so. You can see the latest developments, interesting projects and their applications, and there is a great deal to learn from them.

You can download the code and run it on your own machine, or simply keep it as a reference point for your own project. Whatever the application, GitHub communities are invaluable resources.

In this post, we look at 5 GitHub repositories created in January 2018 that you must follow. This is part of a series from Analytics Vidhya that will run every month.

 

Detectron

Detectron is a software system developed by Facebook’s AI Research team (FAIR) that “implements state-of-the-art object detection algorithms”. It is written in Python and built on the Caffe2 deep learning framework.

Along with the Python code, FAIR has also released performance baselines for over 70 pre-trained models. Once a model is trained, it can be deployed on the cloud and even on mobile devices.

Detectron has been covered by us here.

 

DeepReinforcementLearning

 

This is a replica of the AlphaZero methodology, written in Python. The author has written the code to train an algorithm to play Connect4. The game is not as complex as the famed Go, but with 4,531,985,219,092 possible game positions it is still large enough to be a meaningful test of the method.

The repository teaches two things:

  1. How to build a replica of the AlphaZero methodology that plays Connect4
  2. How to adapt the code to plug in other games

Run it and you will see the beauty of the AlphaZero approach!
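To make the second point above concrete, here is a minimal sketch of what a self-play agent typically needs from a game: the legal moves, a state transition, and a win check. The class and method names (`Connect4`, `legal_moves`, `play`, `winner`) are hypothetical and chosen for illustration; they are not taken from the repository.

```python
ROWS, COLS = 6, 7


class Connect4:
    """Minimal Connect4 state. Names are illustrative, not the repository's API."""

    def __init__(self, board=None, player=1):
        # board[r][c] is 0 (empty), 1, or -1; row 0 is the top row.
        if board is None:
            board = [[0] * COLS for _ in range(ROWS)]
        self.board = board
        self.player = player  # player to move: 1 or -1

    def legal_moves(self):
        # A column is playable while its top cell is still empty.
        return [c for c in range(COLS) if self.board[0][c] == 0]

    def play(self, col):
        # Return a new state with the current player's piece dropped into `col`.
        if col not in self.legal_moves():
            raise ValueError(f"column {col} is full or out of range")
        new_board = [row[:] for row in self.board]
        for r in range(ROWS - 1, -1, -1):  # find the lowest empty cell
            if new_board[r][col] == 0:
                new_board[r][col] = self.player
                break
        return Connect4(new_board, -self.player)

    def winner(self):
        # Check every cell in four directions for four equal pieces in a row.
        for r in range(ROWS):
            for c in range(COLS):
                piece = self.board[r][c]
                if piece == 0:
                    continue
                for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                    cells = [(r + i * dr, c + i * dc) for i in range(4)]
                    if all(
                        0 <= rr < ROWS and 0 <= cc < COLS
                        and self.board[rr][cc] == piece
                        for rr, cc in cells
                    ):
                        return piece
        return 0  # no winner yet
```

Under this sketch, plugging in another game would mean writing a different class that offers the same three methods, while the search and training code stays unchanged.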

 

Caire

Caire is a content-aware image resizing library. Currently, most applications only let you crop an image or change its aspect ratio. This often means that either the main parts are cut out or the image is distorted. This is where Caire comes into play.

It supports both shrinking and enlarging an image, resizing it horizontally or vertically, and does not require any third-party library. It uses edge detection to generate an energy map of the image. Based on that map, it finds seams (connected paths of low-energy pixels running across the image) and removes or duplicates them to reach the target size. The process is illustrated in the three images below:

It is based on the Seam Carving for Content-Aware Image Resizing paper. This has been covered by Analytics Vidhya here.
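The energy-map and seam steps can be sketched in a few lines of plain Python. This is a minimal illustration of seam carving on a tiny grayscale image stored as a list of rows, not Caire's implementation: the energy function is a simple neighbor difference standing in for a real edge detector, and the function names (`energy_map`, `find_vertical_seam`, `remove_vertical_seam`) are hypothetical.

```python
def energy_map(img):
    """Energy per pixel: absolute horizontal plus vertical intensity difference.

    A simple stand-in for edge detection; borders reuse the nearest pixel.
    """
    h, w = len(img), len(img[0])
    energy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            energy[y][x] = abs(right - left) + abs(down - up)
    return energy


def find_vertical_seam(energy):
    """Return one column index per row: the connected top-to-bottom path
    with the lowest total energy, found by dynamic programming."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the last row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Drop one pixel per row, shrinking the image width by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


# A flat region on the left and a high-contrast edge on the right.
image = [[5, 5, 5, 0, 9] for _ in range(3)]
seam = find_vertical_seam(energy_map(image))
print(remove_vertical_seam(image, seam))
```

On this input the seam runs through the flat left-hand region, so the image narrows by one column while the contrasting pixels on the right are left untouched. Repeating the step shrinks the image further; enlarging works by duplicating low-energy seams instead of removing them.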

 

Minigo

Covered by Analytics Vidhya here, this is an open-source Python implementation inspired by DeepMind’s AlphaGo. It is a neural-network-based AI developed using TensorFlow.

Image source: WIRED

The goals of this project, as described by the authors, are listed below:

  1. Provide a clear set of learning examples using Tensorflow, Kubernetes, and Google Cloud Platform for establishing Reinforcement Learning pipelines on various hardware accelerators.
  2. Reproduce the methods of the original DeepMind AlphaGo papers as faithfully as possible, through an open-source implementation and open-source pipeline tools.
  3. Provide our data, results, and discoveries in the open to benefit the Go, machine learning, and Kubernetes communities.

You can access the entire Python code on this GitHub repository.

 

Alpha Pose

Alpha Pose is a remarkably accurate tool for estimating the poses of multiple people (the GIFs in its GitHub repository show this). It is the first open-source system to achieve 70+ mAP on the COCO dataset and 80+ mAP on the MPII dataset. The authors have also developed ‘Pose Flow’, an online pose tracker.

 

And here are two bonus repositories for you!

 

VisualDL

VisualDL is a tool that can visualize the entire deep learning process. It is a powerful visualization tool that helps us design deep learning jobs. VisualDL supports Python: by adding a few lines of Python code to a neural network model, we can generate plenty of visualizations to understand how the model behaves. At a low level, VisualDL is written in C++.

Currently, VisualDL provides four components (more will be added soon):

  • graph
  • scalar
  • image
  • histogram

You can read more about these components, and how VisualDL works, in our post here.

 

TensorFlow Project Template

There are a ton of things to do when starting a TensorFlow project. The underlying idea behind this repository is to wrap those things up into a simple and well-defined structure. The TensorFlow Project Template combines simplicity, best practices for creating and maintaining a folder structure, and good OOP design.
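As a rough illustration of what such an OOP structure can look like, the sketch below keeps the training loop in a base class and lets each project override only the per-epoch logic. It is written in plain Python without TensorFlow so that it runs anywhere, and the names (`BaseTrainer`, `train_epoch`, `MeanModel`, `MeanTrainer`) are assumptions made for this example; the repository's actual classes may differ.

```python
class BaseTrainer:
    """Owns the outer training loop; subclasses supply the per-epoch logic."""

    def __init__(self, model, data, config):
        self.model = model
        self.data = data      # list of batches
        self.config = config  # plain dict of hyperparameters

    def train(self):
        # Template method: the loop is fixed, the epoch body is pluggable.
        return [self.train_epoch() for _ in range(self.config["num_epochs"])]

    def train_epoch(self):
        raise NotImplementedError


class MeanModel:
    """Toy model: a single parameter nudged toward the mean of each batch."""

    def __init__(self):
        self.value = 0.0

    def update(self, batch, learning_rate):
        target = sum(batch) / len(batch)
        self.value += learning_rate * (target - self.value)
        return abs(target - self.value)  # distance left after the step


class MeanTrainer(BaseTrainer):
    def train_epoch(self):
        # One pass over all batches; report the average per-batch loss.
        rate = self.config["learning_rate"]
        losses = [self.model.update(batch, rate) for batch in self.data]
        return sum(losses) / len(losses)


trainer = MeanTrainer(MeanModel(), [[1.0, 3.0]], {"num_epochs": 3, "learning_rate": 0.5})
print(trainer.train())
```

The point of the pattern is that a new project replaces only the model and the `train_epoch` body, while configuration handling and the outer loop stay in one shared place.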

 

Do you know of any other repositories created last month that we should be aware of? Feel free to let us know in the comments below.

 

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.

Responses From Readers

raymond doctor

Hi, Am looking for a simple tools for prediction using AI. I have a large trained data with the following format A=B in which A is the source and B is the target language. Am not very good in coding. I work with C, Perl. Awk. Sed. . Is there a simple tool in Windows by which data can be trained using 300,000 samples and then used to predict from a test data? Many thanks for the infor

Saad Raja

I am fond of Github such a great platform to discover amazing repositories. Today we all need to connect with latest technology advancements and Github is the main boost for us to discover what we are looking for.
