4 Tricky R interview questions

Tavish Srivastava Last Updated : 24 Jun, 2022


The analytics industry in India is currently dominated by SAS, but it would be too optimistic to expect this to remain so in the years to come. R, on the other hand, is open source and can be used in any environment. SAS grows through the efforts of the people employed by SAS, whereas R grows through the efforts of anyone who works on the language, because anyone can contribute to it. Hence, I feel that every analyst should develop expertise in both languages.

There are some key differences between coding in R and coding in SAS. These differences make some R interview questions tricky, and handling them can be overwhelming for some candidates. I strongly feel the need for a common thread that collects the tricky R questions asked in interviews, and this article is a kick-start to such a thread. We have a similar series of articles published on SAS (Part 1 and Part 2). Please note that the content of this article is based on information I gathered from various R sources.


[stextbox id=”section”] Question 1 : Rotational multiplication [/stextbox]

You have two vectors defined as follows :

[stextbox id=”grey”]

> a <- c(2,3,4) 
> b <- c(1,2)

[/stextbox]

What is the value of the vector d, which is defined as follows :

[stextbox id=”grey”]

> d <- a*b

[/stextbox]

Answer : 2 , 6 , 4

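R performs vectorized operations. Here ‘a’ and ‘b’ are two vectors of different lengths. R multiplies the first element of a by the first element of b, then the second element of a by the second element of b, and so on. In this case, however, R hits the end of vector “b” after the second multiplication. When that happens, R goes back to the first element of the shorter vector and keeps cycling through it until every element of the longer vector has been used, so the third product is 4 * 1 = 4. The result of such an operation always has the length of the longer vector. Because the length of a (3) is not a multiple of the length of b (2), R also issues a warning that the longer object length is not a multiple of the shorter object length, but it still returns the result.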
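The recycling described above can be made explicit. The following is a minimal sketch, not part of the original question: the names `b_recycled` and `d_explicit` are introduced here purely for illustration, and `suppressWarnings()` is used only to keep the length-mismatch warning out of the way.

```r
a <- c(2, 3, 4)
b <- c(1, 2)

# R recycles b to the length of a; the warning about mismatched
# lengths is silenced here so that only the result is of interest
d <- suppressWarnings(a * b)

# The same computation with the recycling written out by hand:
# rep() repeats b until it is as long as a, giving 1, 2, 1
b_recycled <- rep(b, length.out = length(a))
d_explicit <- a * b_recycled

print(d)           # 2 6 4
print(d_explicit)  # 2 6 4
```

Writing the `rep()` call yourself is one way to avoid the warning in real code, since it states the intended recycling explicitly.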

[stextbox id=”section”] Question 2 : Scoping Rules [/stextbox]

You need to understand the following code and answer a question based on this understanding.

[stextbox id=”grey”]

> y <- 3

> f <- function(x) {
+     y <- 2
+     y ^ 2 + g(x)
+ }

> g <- function(x) {
+     x * y
+ }

[/stextbox]

What is the value of f(6)?

Answer : 22

If you answered anything other than 22, you probably need to brush up on lexical scoping in R. The function f(x) returns the value y^2 + g(x). Inside f, y has been defined as 2, so y^2 is 4, and g(x) is called from inside this function with x passed as 6. Now comes the catch: what is the value of the free variable y inside g? Under dynamic scoping, the value would be looked up in the environment from which the function was called. Lexical scoping instead looks up the value in the environment where the function was defined. The function g(x) is defined in the global environment here, and hence y is taken to be 3. Therefore g(6) returns 18, and f(6) finally returns 4 + 18 = 22.
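To make the contrast concrete, here is a small sketch that runs the original functions and then imitates dynamic scoping. The functions `g_dynamic` and `f_dynamic` are hypothetical helpers written only for this illustration. They use `parent.frame()` to look up y in the caller's environment, which is not how R normally resolves free variables.

```r
y <- 3

g <- function(x) {
  x * y                 # y is free here: found where g was defined
}

f <- function(x) {
  y <- 2                # local to f, invisible to g
  y^2 + g(x)
}

result_lexical <- f(6)  # 4 + 6 * 3

# Hypothetical variant that imitates dynamic scoping by reading y
# from the environment of whoever called it
g_dynamic <- function(x) {
  caller <- parent.frame()
  x * get("y", envir = caller)
}

f_dynamic <- function(x) {
  y <- 2
  y^2 + g_dynamic(x)
}

result_dynamic <- f_dynamic(6)  # 4 + 6 * 2

print(result_lexical)  # 22
print(result_dynamic)  # 16
```

The difference between 22 and 16 comes entirely from where y is looked up, which is the point the interviewer is usually testing.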

[stextbox id=”section”] Question 3 : Summarizing at each factor [/stextbox]

You have been assigned to check two race tracks. To complete this task you are expected to find the mean of the total time taken by cars to cross each track. In the following data assignment, “b” is the vector of total times taken by different cars and “a” is the vector of tracks on which each time was recorded. The first element of the vector “b” corresponds to the first element of vector “a” (and so on).

[stextbox id=”grey”]

> a <- c(1,1,1,1,2,2,2,2,2)

> b <- c(10,12,15,12,NA,30,42,38,40)

[/stextbox]

How do you find the mean time of each track using the split function?

Answer : Code is as follows

[stextbox id=”grey”]

> s <- split(b,a)

> lapply(s,mean)

[/stextbox]
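To see why this answer works, it can help to look at what split returns before any mean is taken. The sketch below reuses the vectors a and b from the question. The name `track_means` is introduced here only for illustration.

```r
a <- c(1, 1, 1, 1, 2, 2, 2, 2, 2)
b <- c(10, 12, 15, 12, NA, 30, 42, 38, 40)

# split() groups the values of b by the labels in a
# and returns a named list with one element per track
s <- split(b, a)

print(names(s))  # the track labels, as character strings
print(s[["1"]])  # times recorded on track 1
print(s[["2"]])  # times recorded on track 2, including the NA

# lapply() then applies mean() to each list element in turn
track_means <- lapply(s, mean)
```

Because lapply always returns a list, the result has one mean per track, stored under the same names as the elements of s.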

[stextbox id=”section”] Question 4 : Treating missing values [/stextbox]

Following is the output of the last section :

[stextbox id=”grey”]

$`1`
[1] 12.25

$`2`
[1] NA

[/stextbox]

How do you modify the code to handle the missing value in the second track's record?

Answer : The modified code is as follows :

[stextbox id=”grey”]

> lapply(s,mean,na.rm=TRUE)

$`1`
[1] 12.25

$`2`
[1] 37.5

[/stextbox]
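The split and lapply combination is not the only way to get these means. As a hedged sketch of two alternatives from base R (the variable names `means_sapply` and `means_tapply` are chosen here for illustration), sapply simplifies the result to a named vector, and tapply performs the split and the apply in a single call. In both cases the extra argument na.rm = TRUE is passed through to mean.

```r
a <- c(1, 1, 1, 1, 2, 2, 2, 2, 2)
b <- c(10, 12, 15, 12, NA, 30, 42, 38, 40)

# sapply() works like lapply() but simplifies the list
# of means into a named numeric vector
means_sapply <- sapply(split(b, a), mean, na.rm = TRUE)

# tapply() groups b by a and applies mean() in one step
means_tapply <- tapply(b, a, mean, na.rm = TRUE)

print(means_sapply)  # track 1: 12.25, track 2: 37.5
print(means_tapply)  # same values
```

Which form to prefer is mostly a matter of what the next step expects: a list, a plain named vector, or an array indexed by the grouping variable.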

[stextbox id=”section”] End Notes : [/stextbox]

Coders are lazy, and the R language is built for coders. Code in R is much more compact than code in SAS, but that compactness also makes the syntax harder to remember. You will probably need a lot of practice to get the hang of it (especially if you have been using SAS extensively). In one of our coming articles, we will compare coding in SAS and R.

Have you faced any other R problems in analytics interviews? Are you facing any specific problem with R code? Do you think this article provides a solution to any problem you face? Do you think there are other, more optimized ways to solve the problems discussed here? Do let us know your thoughts in the comments below.


Tavish Srivastava

Tavish Srivastava, co-founder and Chief Strategy Officer of Analytics Vidhya, is an IIT Madras graduate and a passionate data-science professional with 8+ years of diverse experience in markets including the US, India and Singapore, domains including Digital Acquisitions, Customer Servicing and Customer Management, and industry including Retail Banking, Credit Cards and Insurance. He is fascinated by the idea of artificial intelligence inspired by human intelligence and enjoys every discussion, theory or even movie related to this idea.


Responses From Readers

Ashish Jain

Hi Tavish, It's good you started this thread. My view on above R topics is to create separate section on R interviews categorized as: 1) R basic interview question - To test how strongly a person knows R. 2) R intermediate level questions- To test multiple ways(alternatives) of achieving same results on R and which is more efficient way of analysis. Advance level 3) How to handle or process millions of rows(Big Data) in R. what are the challenges and how to cope up with it. 4) How to develop R based predictive applications using Shiny package or integrate R predictive model results with other application 5) R statistical models interpretation -interview questions. like clustering in R, record linkage in R, multinominal Regression in R,Random forest and SVM etc. Let me know if we can collaborate to learn and write more stuffs, share thoughts and approach to analytics solution in R and SAS.


naidu

hi ashish i,m looking forward to learn r programming i would like to learn it can u pls help me out

Subhajit

This discussion is going to be really helpful for people who want to be an analyst...Thank you for the valuable inputs.

Abhinav

All, do let me know if there is any analytics assessment tests available in the market which can help the organization in taking hiring decision. thanks,


Kunal Jain

Abhinav, Nice idea! I am not aware of any standard tests available yet. However you can get a few tests on tools, if you search. Regards, Kunal

Aakash

Well, there is an informs data scientist certification. You can try that.
