Interview Questions on AWS DynamoDB

Shikha | Last Updated: 22 Apr, 2023
9 min read

Introduction

Amazon DynamoDB is a NoSQL database service from AWS that offers fast, highly predictable performance, high data reliability, and seamless scalability. With DynamoDB, we can hand over to AWS the administrative tasks associated with running and scaling distributed databases. It is an effective way to store and retrieve data, and it comes with different pricing tiers to suit different customer requirements when creating tables. Whether you are a fresher or a technical expert, if you are preparing for AWS-related positions, you have landed in the right place. This blog will help you gain an understanding of DynamoDB and prepare for tricky interview questions.

(Image source: simplior.com)

Learning Objectives

After reading this blog thoroughly, you’ll have the confidence to answer interview questions on the following:

1. An overview of AWS DynamoDB, the role it plays in today's technology landscape, and why it is needed when we already have tools like Aurora.

2. Insights into topics like primary keys in DynamoDB, global and local secondary indexes, data pipeline, and projections in DynamoDB.

3. Knowledge of AWS DynamoDB workflow along with different concepts like DynamoDB pricing tiers.

4. An understanding of the DynamoDBMapper class along with its methods, such as save, delete, query, and count.

5. An understanding of how DynamoDB prevents data loss in real time, how the Query functionality works in DynamoDB, and more.

This article was published as a part of the Data Science Blogathon.


AWS DynamoDB Interview Questions for Freshers

Q1.  Can we use DynamoDB for free?

The pricing of AWS DynamoDB is based on the resources you provision. You can start within the DynamoDB free tier limits, within which many applications operate. If you require additional resources, you pay per month, and the amount varies with the type and quantity of resources you use.

Q2. What is the maximum size of an item in Amazon DynamoDB?

In DynamoDB, the maximum item size is 400 KB, which includes both the attribute name lengths (in UTF-8 encoding) and the attribute value lengths (binary length).

Q3. Can DynamoDB support atomic updates in place?

Yes, DynamoDB supports quick, atomic in-place updates: with a single API call we can increment or decrement a numeric attribute, or add and remove elements of lists, sets, or maps.
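For illustration, here is a minimal sketch of such an atomic update using the AWS SDK for Java (v1); the PageViews table, the PageId key, and the ViewCount and Tags attributes are hypothetical names chosen for this example:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.UpdateItemRequest;

import java.util.HashMap;
import java.util.Map;

public class AtomicUpdateExample {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        Map<String, AttributeValue> key = new HashMap<>();
        key.put("PageId", new AttributeValue().withS("home"));

        Map<String, AttributeValue> values = new HashMap<>();
        values.put(":inc", new AttributeValue().withN("1"));
        values.put(":tag", new AttributeValue().withSS("featured"));

        // One API call: atomically increment a number and add an element to a string set.
        client.updateItem(new UpdateItemRequest()
                .withTableName("PageViews")
                .withKey(key)
                .withUpdateExpression("ADD ViewCount :inc, Tags :tag")
                .withExpressionAttributeValues(values));
    }
}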

Q4. Give the name of 4 use cases of AWS DynamoDB.

  • Business Intelligence(BI)
  • Everyday Web Analytics
  • Data Warehousing
  • Real-time Analytics

Q5. Explain two types of primary keys used by DynamoDB.

(Image source: aws.amazon.com)

  • Partition Key: DynamoDB uses the partition key to distribute your data across logical partitions. The partition key value is passed to an internal hash function, and the output of that function identifies the partition (the physical storage internal to DynamoDB) in which the item is stored.
  • Sort Key: Items that share the same partition key value are stored together and sorted by the sort key value, so the sort key, together with the partition key, determines where an item is stored.
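As a concrete illustration, the sketch below (AWS SDK for Java v1; the Orders table and its attribute names are hypothetical) creates a table whose composite primary key is CustomerId (partition key) plus OrderDate (sort key):

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.*;

public class CreateOrdersTable {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        CreateTableRequest request = new CreateTableRequest()
                .withTableName("Orders")
                .withAttributeDefinitions(
                        new AttributeDefinition("CustomerId", ScalarAttributeType.S),
                        new AttributeDefinition("OrderDate", ScalarAttributeType.S))
                .withKeySchema(
                        new KeySchemaElement("CustomerId", KeyType.HASH),   // partition key
                        new KeySchemaElement("OrderDate", KeyType.RANGE))   // sort key
                .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L));

        client.createTable(request);
    }
}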

Q6. Explain the two types of secondary indexes supported by Amazon’s DynamoDB.

(Image source: pinterest.com)

  • Global Secondary Index: A global secondary index has a partition key and an optional sort key that can be different from those of the base table. It is called “global” because queries on the index can span all the items in the base table, across all partitions.
  • Local Secondary Index: A local secondary index has the same partition key as the base table but a different sort key. It is called “local” because queries on the index cover only the base-table partition that shares the queried partition key value.
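A minimal sketch of how the two index types are declared with the AWS SDK for Java (v1), continuing the hypothetical Orders table from the previous answer; note that local secondary indexes can only be attached at table-creation time:

import com.amazonaws.services.dynamodbv2.model.*;

public class SecondaryIndexExample {
    public static void main(String[] args) {
        // Local secondary index: same partition key (CustomerId), alternative sort key (ShipDate).
        LocalSecondaryIndex byShipDate = new LocalSecondaryIndex()
                .withIndexName("ByShipDate")
                .withKeySchema(
                        new KeySchemaElement("CustomerId", KeyType.HASH),
                        new KeySchemaElement("ShipDate", KeyType.RANGE))
                .withProjection(new Projection().withProjectionType(ProjectionType.ALL));

        // Global secondary index: an entirely different key (Status + OrderDate).
        GlobalSecondaryIndex byStatus = new GlobalSecondaryIndex()
                .withIndexName("ByStatus")
                .withKeySchema(
                        new KeySchemaElement("Status", KeyType.HASH),
                        new KeySchemaElement("OrderDate", KeyType.RANGE))
                .withProjection(new Projection().withProjectionType(ProjectionType.ALL))
                .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L));

        // Both would be attached to a CreateTableRequest, whose AttributeDefinitions must
        // also declare ShipDate and Status:
        // request.withLocalSecondaryIndexes(byShipDate).withGlobalSecondaryIndexes(byStatus);
        System.out.println(byShipDate + "\n" + byStatus);
    }
}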

Q7. How many global secondary indexes can you create for each table?

By default, DynamoDB permits up to 20 (twenty) global secondary indexes per table; this is a default quota, and the earlier limit was 5 (five).

Q8. Can we erase the local secondary indexes?

Once local secondary indexes have been created, Amazon DynamoDB cannot remove them from the base table; the only way to get rid of them is to delete the entire table.

Q9. Explain the differences between Amazon DynamoDB and Amazon SimpleDB.

  • Amazon DynamoDB is very fast and seamlessly scalable, whereas Amazon SimpleDB has scaling limitations.
  • DynamoDB maintains high performance and stays cost-effective for large workloads, whereas SimpleDB offers query and attribute flexibility at the cost of performance.

Q10. Name different data types supported by DynamoDB.

The scalar data types offered by DynamoDB include the following:

  • String
  • Number
  • Binary
  • Boolean

The document and set data types offered by DynamoDB include the following:

  • String Set
  • Number Set
  • Binary Set
  • List
  • Map

Note: The Null type is also supported.

Q11. Is it possible to add the local supplementary indexes to a table that already exists in the database?

Local secondary indexes must be defined when the table is created: at creation time we can designate a currently unused sort key attribute as the sort key of a local secondary index, to be used whenever needed.

This means that adding a local secondary index to an already existing table is not possible at present.

Q12. Give 5 ways to get data from DynamoDB.

  • Query method
  • Scan method
  • GetItem operation (see the sketch after this list)
  • BatchGetItem operation
  • TransactGetItems operation
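Here is a minimal GetItem sketch with the AWS SDK for Java (v1), fetching one item by its full primary key from the hypothetical Orders table used earlier:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.GetItemRequest;
import com.amazonaws.services.dynamodbv2.model.GetItemResult;

import java.util.HashMap;
import java.util.Map;

public class GetItemExample {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        // The full primary key of the item to fetch.
        Map<String, AttributeValue> key = new HashMap<>();
        key.put("CustomerId", new AttributeValue().withS("C-1001"));
        key.put("OrderDate", new AttributeValue().withS("2023-04-01"));

        GetItemResult result = client.getItem(new GetItemRequest()
                .withTableName("Orders")
                .withKey(key)
                .withConsistentRead(true));   // strongly consistent read

        System.out.println(result.getItem());
    }
}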

Q13. Explain the Projections in DynamoDB.

Projections are the set of attributes that are copied from the base table into an index. The base table's partition key and sort key are always projected along with the index key.

The minimum attributes carried by each index include (see the sketch after this list):

A. The base table's partition key value

B. The base table's sort key value

C. The attribute that serves as the index sort key
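The sketch below (AWS SDK for Java v1; the index and attribute names are hypothetical) shows how a projection is declared on a global secondary index, here projecting only two non-key attributes in addition to the keys:

import com.amazonaws.services.dynamodbv2.model.*;

public class ProjectionExample {
    public static void main(String[] args) {
        GlobalSecondaryIndex statusIndex = new GlobalSecondaryIndex()
                .withIndexName("StatusIndex")
                .withKeySchema(new KeySchemaElement("Status", KeyType.HASH))
                .withProjection(new Projection()
                        .withProjectionType(ProjectionType.INCLUDE)   // KEYS_ONLY, INCLUDE, or ALL
                        .withNonKeyAttributes("Total", "ShipDate"))   // projected non-key attributes
                .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L));

        System.out.println(statusIndex);
    }
}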

Q14. Explain the role of the data pipeline in DynamoDB.

The Data Pipeline is used to export data from a DynamoDB table to an S3 bucket and to import it back, which provides backups whenever historical data is required and supports data testing.

Two IAM roles are associated with the Data Pipeline:

A. DataPipelineDefaultRole: It defines the actions that the pipeline is permitted to perform.

B. DataPipelineDefaultResourceRole: It defines the resources that the pipeline is permitted to provision and use.

Q15. How can we generate Access Key and Secret Key in DynamoDB?

We can easily generate an access key and secret key by creating a user in AWS Identity and Access Management (IAM) and generating access keys for that user.

Interview Questions for Technical Experts

Q1. Explain the working of Amazon DynamoDB.

The working of AWS DynamoDB is fairly simple to understand, as it uses B-trees and hashing to manage data. When we write data, a hashing technique distributes it across partitions, and each item is stored in a particular partition according to its key value.

(Image source: cloudacademy.com)

Each partition can handle up to 3,000 read capacity units and 1,000 write capacity units and stores roughly 10 GB of data. DynamoDB uses partition key values to store and retrieve items: to write an item to the table, it feeds the partition key value into the hash function and uses the output to determine the partition in which the item is stored.

Q2. Explain the concept of DynamoDBMapper class along with its methods.

The DynamoDBMapper class is the entry point to DynamoDB in the AWS SDK for Java: it lets us read data from various tables, scan tables, execute queries, perform CRUD (Create, Read, Update, and Delete) operations, and access a DynamoDB endpoint.

public DynamoDBMapper(AmazonDynamoDB dynamoDB,
                      DynamoDBMapperConfig config)

This constructor builds a new mapper with the given service object and configuration.

1. dynamoDB: the service object used to make all service calls.

2. config: the default configuration, used for all service calls; it can be overridden on a per-operation basis.

The methods of the DynamoDBMapper class include:

A. getS3ClientCache: S3ClientCache is a smart Map for AmazonS3Client objects and this function returns the underlying S3ClientCache for accessing Amazon S3. S3ClientCache is very useful when we have multiple clients, as it not only keeps the clients organized by AWS Region but can also create new Amazon S3 clients on demand.

public S3ClientCache getS3ClientCache()

B. save: As the name suggests, this function saves the given object to the table; no other parameter needs to be passed, although optional configuration can be supplied through a DynamoDBSaveExpression and a DynamoDBMapperConfig object.

public <T> void save(T obj,
                     DynamoDBSaveExpression saveExp,
                     DynamoDBMapperConfig configer)

Parameters:

1. obj: the object to save into DynamoDB.

2. saveExp: the options to apply to this save request.

C. delete: This function deletes the given item from the table; an object instance of the mapped class must be passed in.

public <T> void delete(T obj,
                       DynamoDBDeleteExpression deleteExp,
                       DynamoDBMapperConfig configer)

D. query: The query function queries a table or a secondary index; the table or index must have a composite primary key (partition key and sort key).

public <T> PaginatedQueryList<T> query(Class<T> cl,
                                       DynamoDBQueryExpression<T> queryExp,
                                       DynamoDBMapperConfig configer)

Parameters:

1. cl: the class annotated with the DynamoDB annotations, which tells the mapper how the object data is stored in AWS DynamoDB.

2. queryExp: the query expression, containing the conditions on the key values and the other details needed to run the query.

E. scanPage: This function scans a table or a secondary index and returns a single page of matching results; a page holds as many items as fit within 1 MB of data.

public <T> ScanResultPage<T> scanPage(Class<T> cl,
                                      DynamoDBScanExpression scanExp,
                                      DynamoDBMapperConfig configer)

F. queryPage: This function queries a table or a secondary index and returns a single page of matching results; an optional FilterExpression can be supplied to narrow the result set.

public <T> QueryResultPage<T> queryPage(Class<T> cl,
                                        DynamoDBQueryExpression<T> queryExp,
                                        DynamoDBMapperConfig configer)

G. count: This function evaluates the given query expression and returns the number of matching items rather than the items themselves.

public <T> int count(Class<T> cl,
                     DynamoDBQueryExpression<T> queryExp,
                     DynamoDBMapperConfig configer)

H. createS3Link: This function is used to create a link to an object in Amazon S3, where we have to give the bucket name and a key name to uniquely identify the object in the bucket.

public S3Link createS3Link(String s3region,
                           String b_Name,
                           String b_key)
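To tie the methods above together, here is a short end-to-end sketch (AWS SDK for Java v1) under the assumption of a hypothetical Orders table mapped to an Order class, followed by a save and a query:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.datamodeling.*;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mapped class for an "Orders" table with a composite primary key.
@DynamoDBTable(tableName = "Orders")
class Order {
    private String customerId;
    private String orderDate;
    private Double total;

    @DynamoDBHashKey(attributeName = "CustomerId")
    public String getCustomerId() { return customerId; }
    public void setCustomerId(String customerId) { this.customerId = customerId; }

    @DynamoDBRangeKey(attributeName = "OrderDate")
    public String getOrderDate() { return orderDate; }
    public void setOrderDate(String orderDate) { this.orderDate = orderDate; }

    @DynamoDBAttribute(attributeName = "Total")
    public Double getTotal() { return total; }
    public void setTotal(Double total) { this.total = total; }
}

public class MapperExample {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();
        DynamoDBMapper mapper = new DynamoDBMapper(client);

        // save: persist one item.
        Order order = new Order();
        order.setCustomerId("C-1001");
        order.setOrderDate("2023-04-01");
        order.setTotal(99.50);
        mapper.save(order);

        // query: all 2023 orders for one customer (key condition on partition + sort key).
        Map<String, AttributeValue> values = new HashMap<>();
        values.put(":cid", new AttributeValue().withS("C-1001"));
        values.put(":year", new AttributeValue().withS("2023"));

        DynamoDBQueryExpression<Order> queryExp = new DynamoDBQueryExpression<Order>()
                .withKeyConditionExpression("CustomerId = :cid AND begins_with(OrderDate, :year)")
                .withExpressionAttributeValues(values);

        List<Order> orders = mapper.query(Order.class, queryExp);
        System.out.println("Matching orders: " + orders.size());
    }
}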

Q3. Explain the working of Query functionality in DynamoDB.

AWS DynamoDB offers two ways to fetch data from a table: the Scan method and the Query method.

(Image source: dynobase)

1. In the Scan method, DynamoDB reads the entire table and then filters the records against the matching criteria.

2. In the Query method, DynamoDB performs a direct lookup for a specific set of items using key conditions. To improve read/write speed and flexibility, a query can target not only the table's primary key but also a global secondary index or a local secondary index.

For most data-fetching scenarios, the Query approach is therefore recommended: it is faster and more time- and cost-effective than a DynamoDB Scan.
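A minimal low-level sketch of the difference (AWS SDK for Java v1; the Orders table and CustomerId key are hypothetical): the Query reads only the matching partition, while the Scan reads the whole table and filters afterwards:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.*;

import java.util.Collections;
import java.util.Map;

public class QueryVsScan {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        Map<String, AttributeValue> cid =
                Collections.singletonMap(":cid", new AttributeValue().withS("C-1001"));

        // Query: direct key lookup -- only the matching partition is read.
        QueryResult queryResult = client.query(new QueryRequest()
                .withTableName("Orders")
                .withKeyConditionExpression("CustomerId = :cid")
                .withExpressionAttributeValues(cid));
        System.out.println("Query returned " + queryResult.getCount() + " items");

        // Scan: every item in the table is read, then filtered -- slower and more expensive.
        ScanResult scanResult = client.scan(new ScanRequest()
                .withTableName("Orders")
                .withFilterExpression("CustomerId = :cid")
                .withExpressionAttributeValues(cid));
        System.out.println("Scan returned " + scanResult.getCount() + " items");
    }
}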

Q4. Explain the difference between DynamoDB and Aurora.

  • AWS DynamoDB is a non-relational (NoSQL) database management system, whereas Aurora is a relational database service.
  • DynamoDB uses its own NoSQL syntax to fetch and process data, whereas Aurora uses Structured Query Language (SQL) for tasks like data manipulation and retrieval.
  • DynamoDB partitions data using sharding, whereas Aurora uses simple horizontal partitioning.
  • DynamoDB supports both key-value and document data models, whereas Aurora works only with the relational model.
  • DynamoDB does not support server-side scripting, whereas Aurora supports scripting at the server side.

Q5. Explain the concept of Auto Scaling in DynamoDB.

AWS DynamoDB needs auto scaling because it is often difficult to predict a database's workload in advance. With the auto scaling feature, DynamoDB can adjust read and write capacity in response to traffic.

(Image source: aws.amazon.com)

Scaling up allows tables and global secondary indexes to gain additional read and write capacity when traffic rises, while scaling down automatically reduces provisioned capacity when traffic drops, which keeps costs under control.
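Auto scaling for a provisioned table is configured through the separate Application Auto Scaling service. The sketch below assumes the Application Auto Scaling client from the AWS SDK for Java (v1) and the hypothetical Orders table; it registers the table's read capacity as a scalable target and attaches a target-tracking policy:

import com.amazonaws.services.applicationautoscaling.AWSApplicationAutoScaling;
import com.amazonaws.services.applicationautoscaling.AWSApplicationAutoScalingClientBuilder;
import com.amazonaws.services.applicationautoscaling.model.*;

public class EnableReadAutoScaling {
    public static void main(String[] args) {
        AWSApplicationAutoScaling scaling = AWSApplicationAutoScalingClientBuilder.defaultClient();

        // Register the table's read capacity as a scalable target (5 to 100 RCUs).
        scaling.registerScalableTarget(new RegisterScalableTargetRequest()
                .withServiceNamespace("dynamodb")
                .withResourceId("table/Orders")
                .withScalableDimension("dynamodb:table:ReadCapacityUnits")
                .withMinCapacity(5)
                .withMaxCapacity(100));

        // Target-tracking policy: keep consumed reads near 70% of provisioned capacity.
        scaling.putScalingPolicy(new PutScalingPolicyRequest()
                .withPolicyName("OrdersReadScaling")
                .withServiceNamespace("dynamodb")
                .withResourceId("table/Orders")
                .withScalableDimension("dynamodb:table:ReadCapacityUnits")
                .withPolicyType("TargetTrackingScaling")
                .withTargetTrackingScalingPolicyConfiguration(new TargetTrackingScalingPolicyConfiguration()
                        .withTargetValue(70.0)
                        .withPredefinedMetricSpecification(new PredefinedMetricSpecification()
                                .withPredefinedMetricType("DynamoDBReadCapacityUtilization"))));
    }
}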

Q6. Explain the two DynamoDB pricing tiers.

1. On-demand Capacity Mode: The on-demand capacity mode is the pricing tier best suited to unpredictable traffic. DynamoDB adjusts capacity automatically to the application's incoming traffic, and you pay per request.

2. Provisioned Capacity Mode: The provisioned capacity mode is the pricing tier best suited to consistent, predictable traffic. Customers specify the reads and writes per second they need and can optionally enable auto scaling.
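The pricing tier is chosen per table through its billing mode. Here is a minimal sketch with the AWS SDK for Java (v1); the Events table and EventId key are hypothetical:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.*;

public class OnDemandTable {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        // On-demand (PAY_PER_REQUEST): no read/write capacity is provisioned up front.
        client.createTable(new CreateTableRequest()
                .withTableName("Events")
                .withAttributeDefinitions(new AttributeDefinition("EventId", ScalarAttributeType.S))
                .withKeySchema(new KeySchemaElement("EventId", KeyType.HASH))
                .withBillingMode(BillingMode.PAY_PER_REQUEST));
        // For provisioned mode, use BillingMode.PROVISIONED plus a ProvisionedThroughput setting.
    }
}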

Q7. Explain how DynamoDB prevents data loss in real time.

DynamoDB has a two-tier backup system along with long-term storage, which ensures minimal data loss in real time. Each partition in DynamoDB is replicated across three nodes, and each node carries the same data, so a failure at one node does not cause data loss. In addition, each node holds a B-tree for locating items and a replication log that tracks changes. Snapshots of this data are also stored in a separate AWS store for roughly a month, so data can be restored whenever it is requested.

Conclusion

This blog covers most of the frequently asked DynamoDB interview questions that could be asked in data science, DynamoDB developer, Data Analyst, and big data developer interviews. Using these interview questions as a reference, you can better understand the concept of DynamoDB and start formulating effective answers for upcoming interviews. The key takeaways from this DynamoDB blog are:

  1. AWS DynamoDB is a fully managed NoSQL database service that provides effective data storage and retrieval, with different pricing tiers to suit different customer requirements.
  2. DynamoDB uses B-trees and Hashing techniques to manage complex data and supports both key-value and document data structures.
  3. It offers two ways to fetch data from a table: the Scan method and the Query method.
  4. AWS DynamoDB offers a two-tier backup system and long-term storage to ensure minimal data loss.

If you have any interview questions on this topic, share them with me in the comments below.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

I am a tech enthusiast, a student, and a learner. I am a critical reader and a lover of words who finds writing blogs interesting. I possess the capability to research and learn new technologies quickly.

