I had the pleasure of volunteering for ICLR 2020 last week. ICLR, short for International Conference on Learning Representations, is one of the most notable research conferences in the Machine Learning and Deep Learning community.
ICLR 2020 was originally planned to be held in Addis Ababa, Ethiopia. But due to the COVID-19-induced lockdowns around the world, the conference was shifted to a fully virtual format. While this took away the networking aspect of the event, the fully online format allowed budding researchers and machine learning enthusiasts to join in as well.
In this article, I will share my key takeaways from ICLR 2020. I will also share a data-based survey (that I personally undertook) to find the preferred tools among the research community, with an emphasis on tools for cutting-edge Deep Learning research, namely PyTorch and TensorFlow.
To start off, here’s the link to the ICLR 2020 website and a summary of the key numbers as shared by the organizers:
Now, let’s dig in!
ICLR 2020 was held between 26th April and 1st May as a fully virtual conference. All the interactions between participants, presenters, and organizers happened online through the conference website.
The first day consisted entirely of really interesting workshop sessions, followed by keynote talks, paper presentations/poster sessions, and socials/expos on each of the remaining days. Every day was action-packed!
In total, 687 research papers were accepted, and each paper had a 5-minute video presentation and live Q&A sessions with the authors (marked in green in the schedule). The videos were more or less highlights/summaries of the papers, where the authors presented their approach and key findings/contributions. The videos could be watched at your own pace, while the live Q&A sessions were scheduled keeping in mind the various time zones of the attendees – a really thoughtful gesture, which I am sure involved a lot of effort.
There were keynote talks too (marked in blue), featuring highly influential researchers like Yoshua Bengio and Yann LeCun, and expos from industry giants and even leading AI startups. The keynote talks were scheduled at specific times to accommodate as many people from diverse time zones as possible.
ICLR adopted a very engaging approach when it came to the informal sessions. There were numerous socials (aka video conferences) organized, and you could attend the one pertaining to your topic of interest. Additionally, depending on the demand, there was more than one Zoom call for certain topics.
For example, Reinforcement Learning, being a very popular domain, had 10 parallel sessions going on. There were also separate channels for beginners and ones for mentoring. In these sessions, whether through chat or video conference, you could interact and discuss the topic without any specific pre-defined agenda. For example, in the mentoring sessions, you could ask an experienced researcher what it’s like to do research in the XYZ domain, and they were more than happy to answer!
Another impressive, fun, and novel approach I would like to give kudos to was ICLR Town. Think of it as a virtual world session that involved discussions in the backdrop of virtual locations like beaches, the outdoors, etc. The best part? Mario avatars for the users in a meeting!
Now we are at a #beachparty #iclrparty pic.twitter.com/ODeXLYRsr8
— Shagun Sodhani (@shagunsodhani) April 28, 2020
The ICLR Town took Twitter by storm and was thus really popular amongst the conference attendees. Needless to say, many of the key learnings from this fully virtual conference will be picked up by organizers of upcoming conferences.
Being the premier research conference in AI and Deep Learning, ICLR has always attracted leading personalities and thought leaders in this domain, and this year was no different. Despite the lockdown restrictions and the entire conference being conducted online, notable speakers included:
and many more influencers.
Similarly, many sponsors were equally enthusiastic about the online ICLR conference:
This impressive list highlights the standard ICLR sets each year and the level of research that is presented in the Sponsor Booths as well.
One key aspect of ICLR 2020 that I would like to draw your attention to is Open Review. We are quite familiar with the usual review processes of research conferences. These reviews are closed, and this lack of transparency has been a bone of contention amongst researchers.
ICLR follows the Open Review process, in which the papers are openly available to read online. You can comment on a paper and check the reviews from other researchers. This peer-review process was well-received, and I hope it sets an example for future conferences to bring transparency to the research community.
Deep learning is a hot topic right now, and ICLR 2020 just cemented that notion. There was an unofficial survey of the topics most explored by researchers, and here’s what it came up with:
Understandably, the most covered topics included Deep Learning applied both in a supervised and an unsupervised manner, Reinforcement Learning, and Representation Learning. Other significant topics included attention mechanisms for supervision and Generative Adversarial Networks (GANs).
As far as the domains are concerned, Computer Vision and Natural Language Processing (NLP) were the prevalent topics, though topics like Game Playing and Numerical Optimization had many submissions as well.
There was so much to learn from the conference, which at times seemed overwhelming. This thought is perfectly summarized in this tweet:
Exhausted! Can't believe how much I learned in one day! All thanks to the organizers and this mind-blowing virtual format #ICLR2020 pic.twitter.com/oqLblIxp7q
— Servando (@vandotorres) April 26, 2020
Since this was a completely virtual and online conference, I would particularly like to highlight the great website interface for the attendees and the volunteers.
The sessions were easily accessible and the website was easy to use. I could search and explore the papers without any hassle, and there were many additional features available to fully leverage the research presented. I especially liked the single visualization of all the papers/posters on the conference webpage: a scatter plot based on topic similarity that was easy to understand and made foraging through the research papers much easier:
Even Andrew Ng had good things to say about how great the website was:
I'm really enjoying the new #ICLR2020 website, and am having fun browsing the workshop & poster talks. Congrats to the organizers and speakers on what's happening so far in this all-virtual conference! Great job! @srush_nlp @shakir_za @kchonyc @dawnsongtweets @syhw et al. pic.twitter.com/e4e5MfYh6W
— Andrew Ng (@AndrewYNg) April 27, 2020
Along with the website, the live videos and slides used were hosted via SlidesLive while Zoom was used for Q&A sessions. The chat interface was RocketChat and the Sponsor Booths were hosted via 6Connex.
To conclude, here are a few interesting resources I would like to share with you so that you can get a sense of how well-organized and comprehensive ICLR 2020 was:
Now we come to my favorite part of the article!
Just to give you a brief background, the idea came up during the discussions of an informal session on “Open source tools and practices in state-of-the-art DL research” (i.e., social_OSS). The original question was:
“Which tool is actually preferred by the researchers at ICLR – is it PyTorch or TensorFlow?”
This question can be rethought as:
“What are the open-source tools used by the researchers who shared their code at ICLR 2020 pertaining to Deep Learning research?
And from this, can we make an educated guess on which tools will be relevant in the near future and will probably be adopted in some way by the industry?”
But what is the motivation behind doing this case study?
Let’s get down to business – I’ll quickly summarize my findings here, and then I will walk you through the code I used to reach them. For the impatient folks, here’s the link to the code and the data with which you can explore the patterns yourself.
Of the 237 code repositories shared by the researchers that I could successfully parse, 154 used PyTorch for their implementation, compared to 95 for TensorFlow and 23 for Keras.
This clearly shows that PyTorch is becoming more and more accepted by the research community, although TensorFlow is not far behind. Researchers still use Keras, but in far fewer repositories. This might be because both PyTorch and TensorFlow provide more flexibility than Keras.
Note: You can start learning all about PyTorch here.
It is also worth mentioning that tools like HuggingFace’s Transformers, OpenAI’s Gym framework, and networkX, which have proven useful in their niche domains (NLP, Reinforcement Learning, and Graph Networks, respectively), are getting accepted by the community. Along with that, TensorBoard is becoming the norm for visualizing the training of Deep Neural Networks:
Here, we can see that the usual data science/machine learning tools such as NumPy, SciPy, matplotlib, Pandas, and scikit-learn are still relevant, even in the research community. Emerging tools like tqdm and torchvision are also seeing increasing adoption among researchers.
Here, we can better visualize the relevant tools or packages used by the researchers in the form of a word cloud. Looks cool, right?
For those of you interested in how I came up with these findings, I will now describe the steps I followed and explain in detail the code you can use to reproduce them. Note that I used Google Colab for the implementation, as it has most of the data science libraries preinstalled. If you don’t know how to use Colab, you can refer to this guide. If you prefer using your own machine instead, you will have to set up your system with the below libraries:
These were the steps I followed for the case study:
I will focus mainly on the data analysis part because most readers would find it helpful. If you want me to explain how the data is generated, do let me know in the comments below and I will add it to this article.
The first thing we do is install the required libraries in Colab, namely the OpenReview API client and pipreqs.
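In a Colab cell, that could look like this (openreview-py is the name of the OpenReview API client on PyPI):

# Install the OpenReview API client and pipreqs
!pip install openreview-py pipreqs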
After this, we import all the necessary libraries/packages we will be using throughout the notebook; these imports appear at the top of the code block below.
Here, as I mentioned, we follow these steps:
To summarize the numbers: 687 papers were accepted in total, and researchers shared implementations for 344 of them. Of these, 268 are GitHub repositories, which is what I primarily focused on. Out of these 268, I could successfully parse 237 repositories based on their requirements.txt files, and I summarized the tools used on the basis of these files.
After running the code, we get a file called “all_tools.csv”. If you want to follow along with me, you can download the data file from this link.
In this section, we will be asking two main questions of the data:
Let us read the data file and look at the first five rows of the all_tools dataframe; we will see that our data is very unclean:
# Import all the necessary libraries/packages used throughout the notebook
import os
import re
import sys
import requests
import openreview
import pandas as pd
import matplotlib.pyplot as plt
from random import choice
from wordcloud import WordCloud
from urllib.parse import urlparse

# Read the data file and inspect the first five rows and the shape
all_tools = pd.read_csv("all_tools.csv")
print(all_tools.head())
print(all_tools.shape)
If you run this, the shape comes out to be (237, 2), specifying that we have 237 rows, i.e. we are analyzing 237 GitHub repositories.
Now, let’s clean our data.
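Here is a minimal sketch of that cleaning, assuming each row stores its tools as a comma-separated string in a column called tools (both the column name and the exact cleaning rules are my assumptions):

# Lowercase everything so "Torch" and "torch" are counted together,
# strip version specifiers such as "==1.4.0" or ">=1.16",
# and trim stray whitespace
all_tools["tools"] = (
    all_tools["tools"]
    .str.lower()
    .str.replace(r"[=<>~!][^,]*", "", regex=True)
    .str.strip()
)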
After cleaning, if we print the head of the all_tools dataframe again, we get an output like this:
Now, let’s see how many repositories contain PyTorch. We can use the “str.contains” function of Pandas to find this:
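A one-liner along these lines does the trick (the tools column name is my assumption from the data file):

# Count the repositories whose tool list mentions "torch"
# (this also matches torchvision, which itself implies PyTorch)
print(all_tools["tools"].str.contains("torch", na=False).sum())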
This gives us a count of 154, which tells us that there are 154 repositories in total that use PyTorch in some way or other.
Let’s write a formal function which we can use to replicate this for other tools:
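A sketch of such a function could look like this (the name count_tool is my own choice):

def count_tool(df, tool, offset=0):
    """Count the repositories whose tool list mentions `tool`.

    `offset` lets us manually add repositories that could not be
    parsed but that very likely use the tool anyway.
    """
    count = df["tools"].str.contains(tool, na=False).sum() + offset
    print(f"{tool}: {count}")
    return count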
Notice that we are using an offset variable to adjust the count. We do this to make the function flexible enough to account for repositories that haven’t been parsed, but where we can assume there’s a high chance the specific tool is present.
Let’s run this function for three specific deep learning tools, namely PyTorch, TensorFlow, and Keras:
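With the count_tool helper above, the calls could look like this (the TensorFlow offset of 12 is explained below):

# "torch" also matches torchvision; "tensorflow" also matches tensorflow-gpu
count_tool(all_tools, "torch")
count_tool(all_tools, "tensorflow", offset=12)
count_tool(all_tools, "keras")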
We get an output like this:
This clearly shows that PyTorch is considerably more popular among the research community than TensorFlow or Keras.
Here, you can see that I’ve used an offset of 12 for TensorFlow. This is because there are a few instances where the GitHub repositories could not be parsed, but they belong to either the google-research GitHub account or TensorFlow’s original repository, which strongly suggests that they are built on TensorFlow.
You can follow similar steps for a different tool/package/framework. Here I’ve done this for HuggingFace’s Transformers, TensorBoard, OpenAI’s Gym, and networkX:
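The same helper can be reused as-is, for example:

# Count a few niche-domain tools with the same helper
for tool in ["transformers", "tensorboard", "gym", "networkx"]:
    count_tool(all_tools, tool)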
Next, let’s see how many unique tools are present in our data:
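One way to get at this, assuming the tools in each row are stored as a comma-separated string, is to split and flatten them first:

# Split each repository's tool string into individual tools,
# flatten the result, and collect the distinct tool names
unique_tools = all_tools["tools"].str.split(",").explode().str.strip().unique()
print(unique_tools.shape)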
This code gives us an output of (687,) indicating that there are 687 unique tools used in total.
Let’s print the top 50 most frequent tools that occur in the data:
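Calling value_counts on the same flattened series gives us exactly this:

# Frequency of each individual tool across all parsed repositories
tool_counts = all_tools["tools"].str.split(",").explode().str.strip().value_counts()
print(tool_counts.head(50))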
We get an output like this:
As expected, most of the repositories use common data science tools/libraries such as NumPy, SciPy, matplotlib, and Pandas.
Let’s plot this in a more visual way. For this, we can either use a bar plot or a word cloud. The below code is for creating a bar plot:
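Here is a sketch that reuses the tool_counts series from above; the cutoff at 20 tools is simply to keep the axis labels readable:

# Bar plot of the 20 most frequent tools
tool_counts.head(20).plot(kind="bar", figsize=(12, 6))
plt.ylabel("Number of occurrences")
plt.tight_layout()
plt.show()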
We get a plot like this:
We can instead visualize this as a word cloud:
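A sketch using the WordCloud class imported earlier, again reusing tool_counts:

# Word cloud where each tool's size reflects its frequency
cloud = WordCloud(width=800, height=400, background_color="white")
cloud.generate_from_frequencies(tool_counts.to_dict())
plt.figure(figsize=(12, 6))
plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()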
This gives us an output like this:
In the interest of time, this is the extent of the data exploration I’ve done. There is certainly more that can be done, and I’ve mentioned some ideas worth exploring in the next section.
In this article, I have summarized my key takeaways from ICLR 2020. I have also explained a case study to understand the usage of tools in the research community.
As an acknowledgment, I would like to thank the ICLR 2020 organizers for hosting this wonderful conference, and for giving me the opportunity to be a part of the event as a volunteer. In addition, I would like to especially mention Sasha Rush, Patryk Mizuila, and the Analytics Vidhya team, who helped me through the case study.
What I’ve done in this article is a simple exploration of the data. There are a few more ideas that can be explored, such as:
Alas! There’s only so much I can do with my time, right?