DynamoDB 101

A write-up about my learnings of DynamoDB.

Introduction

I wanted to learn more about AWS and get into cloud, so I ended up doing the Cloud Resume Challenge a while ago. You can read about my journey into cloud here: My journey into Cloud.

During that journey, I briefly touched DynamoDB. It was hard and confusing. What confused me even more was why people love it and speak so highly of it.

That made me extremely curious. I wanted to learn more about DynamoDB and see for myself what all the hype is about.

So I bought The DynamoDB Book by Alex DeBrie.

In this article, I go over what I learned about DynamoDB. Not everything, but mostly the things I really don't want to forget, or at least want to be able to look up quickly whenever I need them.

DynamoDB

DynamoDB is a popular NoSQL database that offers high performance, scalability, and reliability.

It is fully managed, meaning AWS runs the underlying infrastructure for you, and a table can scale up or down to meet the needs of your application without you having to configure and manage servers.

DynamoDB is more than just a key-value store.

Core concepts

Tables

In DynamoDB, data is organized into tables. A table is a collection of data, and each table has a unique name. Tables allow you to store and query data and can be thought of as similar to a database table in a traditional relational database.

Items

An item is a single data record in a table. An item is made up of one or more attributes, which are pieces of data that describe the item.

For example, an item in a table might represent a person and have attributes like name, age, and address.

Attributes

An attribute is a piece of data that describes an item in a table. An item can have one or more attributes, and each attribute has a name and a value.

For example, in the item representing a person, the name attribute might have a value of John Doe and the age attribute might have a value of 35.
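As a rough sketch, the person item above can be pictured as a Python dict, where each key is an attribute name and each value is that attribute's value. The `person` variable is only an illustration, and the address value is made up because the text does not give one.

```python
# One item (a single record) pictured as a plain Python dict.
# Attribute names are the keys; attribute values are the values.
person = {
    "name": "John Doe",
    "age": 35,
    "address": "123 Example Street",  # hypothetical value, for illustration only
}

# Each attribute is a name/value pair that describes the item.
for attribute_name, attribute_value in person.items():
    print(f"{attribute_name} = {attribute_value}")
```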

Primary keys

In DynamoDB, every table must have a primary key, which is used to uniquely identify each item in the table.

There are two types of primary keys in DynamoDB: simple primary keys and composite primary keys.

Partitions

Before explaining the two different keys, we need to understand what partitions are.

In DynamoDB, a partition is a physical unit of storage that holds part of a table's data. DynamoDB hashes each item's partition key to decide where the item lives, so items with the same partition key are stored on the same partition.

This allows DynamoDB to distribute data across multiple servers, which helps to provide high levels of scalability and performance.

For instance, if you query the table using the partition key, DynamoDB can quickly locate the data on the appropriate partition.
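The routing idea can be sketched in a few lines of Python. This is only a toy model: DynamoDB's real hash function and partition layout are internal, so the SHA-256 hash, the `partition_for` helper, and the partition count of 4 are all assumptions made for the example.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for the example


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key to a partition number using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_partitions


# The same key always maps to the same partition, so a lookup by
# partition key only has to visit one partition.
for key in ["Natalie Portman", "Tom Hanks", "Natalie Portman"]:
    print(key, "-> partition", partition_for(key))
```

Because the hash is deterministic, both "Natalie Portman" lines print the same partition number, which is the property that makes lookups by partition key fast.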

Database partitioning is actually a topic in system design. This is a great read if you want to dig further into it and see some visuals as well: Database Partitioning.

Simple primary keys

Simple primary keys consist of a single element called the partition key. It is used to distribute data across partitions in a table. The partition key determines which partition an item will be stored in, and all items with the same partition key are stored on the same partition.

Composite primary keys

Composite primary keys consist of two elements that form the primary key: the partition key and the sort key.

The problem with simple primary keys is that they only contain the partition key. Each item in a table must have a unique primary key, which in this context means a unique partition key.

For example, if the primary key for the table is the customer's name, then every item in the table must have a unique customer name. This means we cannot store multiple customers with the same name in the table, because each item must have a unique primary key.

In this situation, we could use a composite primary key that includes both the customer's name (the partition key) and their address (the sort key). This would allow us to store multiple items with the same customer name on the same partition, because each item would have a unique combination of the customer's name and address.
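To make the uniqueness rule concrete, here is a toy in-memory model of a table with a composite primary key. It is a sketch, not how DynamoDB is implemented: the `table` dict, the `put_item` helper, and the sample customers are all hypothetical.

```python
# Toy in-memory table: the full primary key (partition key, sort key)
# identifies an item, so it is used as the dict key.
table = {}


def put_item(item: dict) -> None:
    """Store an item under its composite key, replacing any item with the same key."""
    primary_key = (item["name"], item["address"])
    table[primary_key] = item


put_item({"name": "John Doe", "address": "1 First Street", "age": 35})
# Same name, different address: a different primary key, so both items are kept.
put_item({"name": "John Doe", "address": "2 Second Street", "age": 52})
# Same name AND same address: same primary key, so the first item is replaced.
put_item({"name": "John Doe", "address": "1 First Street", "age": 36})

print(len(table))  # 2
```

Two items survive because only the combination of name and address has to be unique, not the name on its own.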

Secondary indexes

In addition to the primary key, tables can also have one or more secondary indexes. Secondary indexes allow you to query the data using a different key than the primary one.

For example, suppose you have a table that contains information about customers, including their names, addresses, and ages. The primary key for the table is the customer's name. This means that a query has to specify the customer's name, and you cannot query the table by the customer's address or age.

But what if we want to query customers based on their addresses? This is where secondary indexes come in: they allow us to do exactly that.

Local secondary indexes

Local secondary indexes use the same partition key as the table, but a different sort key.

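  • More cost-effective than global secondary indexes in terms of throughput, because they share the base table's read and write capacity (they still use storage for the attributes projected into the index)

  • Allows you to opt into strongly consistent reads

  • Less flexible than global secondary indexes: the partition key is fixed, and the index has to be defined when the table is created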
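As a hedged sketch, this is roughly what a table definition with a local secondary index looks like when written as keyword arguments for boto3's `create_table`. The table name, index name, and attribute names are hypothetical; the snippet only builds the parameters and does not call AWS.

```python
# Keyword arguments in the shape boto3's create_table expects.
# Table, index, and attribute names are hypothetical.
create_table_params = {
    "TableName": "Customers",
    "AttributeDefinitions": [
        {"AttributeName": "CustomerName", "AttributeType": "S"},
        {"AttributeName": "Address", "AttributeType": "S"},
        {"AttributeName": "Age", "AttributeType": "N"},
    ],
    "KeySchema": [
        {"AttributeName": "CustomerName", "KeyType": "HASH"},  # partition key
        {"AttributeName": "Address", "KeyType": "RANGE"},      # sort key
    ],
    "LocalSecondaryIndexes": [
        {
            "IndexName": "CustomerAgeIndex",
            "KeySchema": [
                {"AttributeName": "CustomerName", "KeyType": "HASH"},  # same partition key
                {"AttributeName": "Age", "KeyType": "RANGE"},          # different sort key
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}


def hash_key(key_schema: list) -> str:
    """Return the partition (HASH) key attribute name from a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


table_pk = hash_key(create_table_params["KeySchema"])
index_pk = hash_key(create_table_params["LocalSecondaryIndexes"][0]["KeySchema"])
print(table_pk == index_pk)  # True
```

With credentials configured, these parameters could be passed as `boto3.client("dynamodb").create_table(**create_table_params)`.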

Global secondary indexes

Global secondary indexes can use a different partition key and sort key than the table's primary key.

  • Allows you to query the data in the table using the key attributes defined in the index

  • Independent of the table's primary key, which means that you can query the index using a different primary key than the table

  • Requires additional storage space, and reads from the index are only eventually consistent
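Querying a global secondary index looks like a normal query with an extra `IndexName` parameter. The sketch below only builds the keyword arguments for the low-level `query` call; the `build_address_query` helper, the `Customers` table, and the `AddressIndex` index (assumed to have `Address` as its partition key) are hypothetical.

```python
def build_address_query(address: str) -> dict:
    """Build keyword arguments for a low-level Query against a GSI keyed on Address."""
    return {
        "TableName": "Customers",     # hypothetical table name
        "IndexName": "AddressIndex",  # hypothetical GSI with Address as partition key
        "KeyConditionExpression": "#addr = :addr",
        "ExpressionAttributeNames": {"#addr": "Address"},
        # "S" marks the value as a string
        "ExpressionAttributeValues": {":addr": {"S": address}},
    }


params = build_address_query("1 First Street")
print(params["IndexName"])
```

With credentials configured, the result could be passed as `boto3.client("dynamodb").query(**params)`.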

Type indicators

Type indicators are necessary when defining `ExpressionAttributeValues` with the low-level client: each value is wrapped in a map whose key states the value's type.

Below is an example taken from the book.

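import boto3

# Low-level client; assumes AWS credentials and a region are already configured
dynamodb = boto3.client('dynamodb')

result = dynamodb.query(
    TableName='MovieRoles',
    # Roles by this actor whose movie title sorts before :title
    KeyConditionExpression="#a = :a AND #m < :title",
    ExpressionAttributeNames={
        "#a": "Actor",
        "#m": "Movie"
    },
    ExpressionAttributeValues={
        # "S" indicates the value is of type string
        ":a": {"S": "Natalie Portman"},
        ":title": {"S": "N"}
    }
)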

The different type indicators that exist:

  • S - string

  • N - number

  • B - binary

  • SS - string set

  • NS - number set

  • BS - binary set

  • BOOL - boolean

  • NULL - null

  • M - map

  • L - list
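To see how these indicators map onto everyday values, here is a small, simplified converter from Python values to the typed format. It is a sketch for illustration only: `to_attribute_value` is a hypothetical helper, not part of boto3 (which ships its own `TypeSerializer` in `boto3.dynamodb.types`), and it skips edge cases such as empty sets.

```python
from decimal import Decimal

NUMBER_TYPES = (int, float, Decimal)


def to_attribute_value(value):
    """Wrap a Python value in a DynamoDB-style type indicator (simplified)."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, NUMBER_TYPES):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, NUMBER_TYPES) and not isinstance(v, bool) for v in value):
            return {"NS": sorted(str(v) for v in value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"Unsupported type: {type(value).__name__}")


print(to_attribute_value("Natalie Portman"))  # {'S': 'Natalie Portman'}
print(to_attribute_value(35))                 # {'N': '35'}
```

Lists and maps are converted recursively, so nested values each carry their own type indicator.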

Conclusion

The book was phenomenal. DynamoDB is harder to learn because it is different, but it is also extremely powerful, and it is different in several good ways.

I will go back to the book and use it as a reference when I get around to working with DynamoDB.

I want to point out that the book taught me more than just DynamoDB:

  • Data modeling

  • Entity-relationship diagrams

  • System design

  • How to write concisely (Alex has a beautiful way of writing)