I’m a heavy R user. It was the first programming language I learned (thanks to my interest in data science) and it has stuck with me ever since. Even after Python’s rapid rise in recent years, I often find myself working with the wonderful ggplot2 library in RStudio.
I switch between Tableau and R for my data visualization projects and I couldn’t be happier with the progress R has made. The subtle changes every year to the traditional packages like ggplot2 and the Tidyverse have kept me coming back.
And what better place to see these changes than the best R conference on the planet? Every year, I eagerly wait for the folks behind rstudio::conf to release the recordings and resources to the community. Last year, I brought you the top highlights from rstudio::conf 2019 and the response from our community was overwhelming!
So this year, I’ve picked out my top talks from rstudio::conf 2020 in this article. There were some awesome talks this year, ranging from the current state of the Tidyverse by Hadley Wickham to how you can use R with TensorFlow to build deep learning models. Let’s check them out!
rstudio::conf 2020 had over 40 talks, so why these 11 in particular? Here’s how I made my selection:
Let’s start with the data cleaning and data preprocessing step, shall we? I’m sure you know which libraries are coming up in this section! The Tidyverse, of course.
If you’re new to the Tidyverse and the incredible number of R libraries it provides, I strongly recommend going through the below article:
Hadley Wickham is the most recognizable person in the R universe. He’s the man behind the Tidyverse – the collection of R packages for data cleaning and data manipulation. I can’t thank him enough for creating these packages – they’ve been a godsend!
In his talk at rstudio::conf 2020, Hadley spoke about the latest developments and updates to the slew of R packages under the Tidyverse umbrella. Here are the three key takeaways from this talk:
There’s no other talk you need to listen to here – Hadley Wickham’s session is enough. View the full talk here and share your thoughts in the comments section below.
Ah, my favorite step in the machine learning pipeline. I’m a huge advocate of data visualization and telling visual stories through data. I fell in love with ggplot the moment I came across it all those years ago and it remains a faithful companion in my visualization journey.
I’ve even designed and created an entire course on data visualization and storytelling here:
A very, very interesting talk. Miriah Meyer and her team work on building effective data visualizations in a research lab environment – a topic I hadn’t heard much about before.
Miriah talks about how to design these effective visualizations in R, the principles her team follows, and how you can replicate their ideas in your own work.
If you love data visualization and are looking to create innovative work, this talk is for you. And here’s a list of books Miriah recommends reading to become a better visualization expert:
Watch this excellent talk by Miriah here.
rayshader is an open-source R package for creating data visualizations. We can create both 2D and 3D visualizations in R using this superb package.
This talk by Dr. Tyler Morgan-Wall illustrates how you can use this rayshader package to create stunning 3D figures and animations in R. Dr. Tyler talks about how to use principles of cinematography (any movie buffs out there?) to take your audience on a visually appealing tour of your data.
Here’s an example of the power of rayshader:
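To give you a flavor of the workflow, here’s a minimal sketch using rayshader’s built-in montereybay elevation dataset – an illustrative example of my own rather than code from the talk:

library(rayshader)
library(magrittr)  # for the %>% pipe (also available via the tidyverse)

# montereybay ships with rayshader: an elevation matrix of Monterey Bay
montereybay %>%
  sphere_shade(texture = "desert") %>%          # map elevation to a color texture
  add_shadow(ray_shade(montereybay), 0.5) %>%   # layer in raytraced shadows
  plot_3d(montereybay, zscale = 50)             # open an interactive 3D scene

render_snapshot("monterey.png")                 # save a snapshot of the 3D scene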
rayshader definitely deserves an entire tutorial of its own (I’ll get down to writing one soon!).
Check out the full recording here.
Data engineering is the hottest role in the data science space right now (yes, you read that right). And is that really a surprise? Given the amount of data we’re generating these days, we need skilled people to collect that data from multiple sources, store it, figure out a way to get it to the data scientists, and set up the production environment.
This is by no means an easy role and the demand for data engineers is rising multi-fold at the moment. That’s a key reason why I’ve included four data engineering talks from rstudio::conf 2020 in this list.
Also, if you’re curious about what it takes to become a data engineer, I’ve put together a comprehensive list of resources for you:
Plumber is a popular R package among the data engineering community. As the developers behind Plumber put it:
“Plumber is an R package that converts your existing R code to a web API using a handful of special one-line comments.”
Plumber’s flexible approach allows R processes to be accessed by frameworks outside of the R environment. You can install the Plumber package right now with just one line of code:
install.packages("plumber")
James Blair expounds on how Plumber works in this talk and shows useful patterns for developing and working with robust APIs built in R using this package.
Interested in watching the talk? Here you go!
Azure ML is Microsoft’s flagship cloud-based machine learning platform. Data science teams can use Azure ML solutions to build end-to-end machine learning pipelines at scale. Under Satya Nadella, Microsoft has bet big on its cloud services to pull itself up in the tech industry, and the strategy has worked so far.
David Smith’s talk at rstudio::conf 2020 focused primarily on four things:
Watch David Smith’s talk on MLOps with R here.
Another Plumber talk? That’s right – and this time we’ll combine it with the awesome Shiny package from RStudio.
Shiny, for the uninitiated, is an open-source package from RStudio. We use Shiny to build interactive web applications with R. It provides a very powerful way to share your analysis interactively with the community. The best part about Shiny is that you don’t need any knowledge of HTML, CSS or JavaScript to get started.
You can read more about how to get started with Shiny in R here:
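If you’ve never seen one, here’s a minimal sketch of a Shiny app – a toy histogram with a slider, my own example rather than anything from the talk:

library(shiny)

# UI: a slider for the number of bins and a placeholder for the plot
ui <- fluidPage(
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 30),
  plotOutput("hist")
)

# Server: re-draws the histogram whenever the slider changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$waiting, breaks = input$bins,
         col = "steelblue", main = "Old Faithful waiting times (minutes)")
  })
}

shinyApp(ui, server)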
This talk by Alex Gold, a Solutions Engineer at RStudio, delves into how you can use R to bring your modeling and visualization work into the production environment. That’s a tricky task as experienced data science professionals will attest.
Alex gave us some awesome tips and tricks I’ll surely be using soon!
You should go through Alex’s full session here.
SQL remains the most popular database language in the world. Did you know that SQL has been around for nearly five decades? Yep – and it continues to be at the core of working with structured data.
SQL is a language every data scientist should know. Here’s a comprehensive course to learn it:
Learning R can be a frustrating experience if you’re coming directly from SQL. The syntax is quite different and it’s not easy to get the hang of it. As Ian Cook mentions in this talk, the popularity of the sqldf package confirms this.
Here’s the good news – we can directly query an R dataframe without having to move out of the R environment. Ian Cook introduces the tidyquery package that runs SQL queries directly on R dataframes!
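Here’s a rough sketch of what that looks like – my own example, so do check the tidyquery documentation for the exact behavior:

library(tidyquery)
library(dplyr)  # tidyquery translates SQL into dplyr operations

# Query the built-in mtcars data frame as if it were a SQL table
query("SELECT cyl, AVG(mpg) AS avg_mpg
         FROM mtcars
        GROUP BY cyl
        ORDER BY avg_mpg DESC")

# show_dplyr() prints the equivalent dplyr pipeline instead of running the query
show_dplyr("SELECT cyl, COUNT(*) AS n FROM mtcars GROUP BY cyl")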
This is a must-watch talk.
Can you do deep learning in R? This was a knock on R by Python users for a long time but the tide is turning. Popular deep learning packages like TensorFlow can now be used within R itself (and with great effect).
Here’s a good tutorial to get started with deep learning in R:
TensorFlow is the most popular deep learning framework right now (PyTorch users might have something to say about that). It has its flaws (which have been addressed to quite an extent in TensorFlow 2.0), but the TensorFlow community remains a huge one.
This talk by Daniel Falbel explores what’s new in TensorFlow 2.0 as well as how to build data preprocessing pipelines using the tfdatasets package. Daniel also shows how to use pre-trained models with tfhub.
It’s a good starting point if you’re interested in deep learning but don’t want to switch from R.
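To give a sense of what deep learning in R looks like in practice, here’s a minimal sketch using the keras package – a standard MNIST-style example of my own, not code from the talk:

library(keras)  # R interface to Keras / TensorFlow

# Load MNIST and flatten the 28x28 images into vectors of 784 pixel values
mnist   <- dataset_mnist()
x_train <- array_reshape(mnist$train$x, c(nrow(mnist$train$x), 784)) / 255
y_train <- to_categorical(mnist$train$y, 10)

# A small fully connected network
model <- keras_model_sequential() %>%
  layer_dense(units = 128, activation = "relu", input_shape = 784) %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = "softmax")

model %>% compile(
  loss      = "categorical_crossentropy",
  optimizer = optimizer_rmsprop(),
  metrics   = "accuracy"
)

# Train for a few epochs, holding out 20% of the data for validation
model %>% fit(x_train, y_train, epochs = 5, batch_size = 128, validation_split = 0.2)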
You can watch the recording here.
Paige Bailey is a familiar name among the Analytics Vidhya community. She was a guest on our DataHack Radio podcast last year and made quite an impression on our readers.
Paige is the product manager for TensorFlow and Swift for TensorFlow. In her talk, she covers what’s new in TensorFlow 2.0 and explains why this is a great time to get on the TensorFlow bandwagon if you haven’t already.
Paige walks the audience through building deep learning models using R. This is one of my favorite talks from rstudio::conf 2020!
Here’s the full video of Paige’s talk at rstudio::conf 2020.
I’ve included a couple of other talks from rstudio::conf 2020 that didn’t quite fit in the above sections. These are excellent talks in their own right and I wanted to highlight them for our community.
The Associated Press is among the top media outlets in the world. I found it quite intriguing that they primarily use R and the Tidyverse for performing data analysis. Given my own interest in data journalism, this was a much-needed talk.
Larry Fenn, a data journalist at the Associated Press, showed us the power of R for telling stories with data:
I urge you to watch Larry’s talk here and use his ideas in your daily projects.
No data science conference is ever complete without a discussion among the top minds in the industry. This year at rstudio::conf 2020, the panel discussion focused on how to build a career in data science using R. The panelists (mentioned below) discussed topics like the different stages of career growth.
Here are the panelists (hosted by Jen Hecht of RStudio):
Watch the panel discussion here.
I love rstudio::conf! It’s a paradise for R lovers and this year’s conference did not disappoint. I personally loved Paige Bailey’s talk on Deep Learning using R and the talk on data journalism as well.
You can get the code files and PPTs for the talks here.
What was your favorite talk from rstudio::conf 2020? Share your thoughts in the comments section below and let’s get the R community together!