Introduction
I wanted to learn more about AWS and get into Cloud, so I ended up doing the Cloud Resume Challenge a while ago. You can read about my journey into Cloud here: My journey into Cloud.
During that journey, I briefly touched on DynamoDB. It was hard and confusing. What confused me even more was why people love it and speak so highly of it.
That made me extremely curious. I wanted to learn more about DynamoDB and see for myself what all the hype is about.
So I bought The DynamoDB Book by Alex DeBrie.
In this article, I go over what I learned about DynamoDB. Not everything, but mostly the things I really don't want to forget, or at least want to be able to refer back to quickly whenever I need them.
DynamoDB
DynamoDB is a popular NoSQL database that offers high performance, scalability, and reliability.
It is fully managed, meaning it can scale up or down to meet the needs of your application without you having to configure and manage the underlying infrastructure.
DynamoDB is more than just a key-value store.
Core concepts
Tables
In DynamoDB, data is organized into tables. A table is a collection of data, and each table has a unique name. Tables allow you to store and query data and can be thought of as similar to a database table in a traditional relational database.
Items
An item is a single data record in a table. An item is made up of one or more attributes, which are pieces of data that describe the item.
For example, an item in a table might represent a person and have attributes like name, age, and address.
Attributes
An attribute is a piece of data that describes an item in a table. An item can have one or more attributes, and each attribute has a name and a value.
For example, in the item representing a person, the name attribute might have a value of John Doe and the age attribute might have a value of 35.
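As a rough illustration (plain Python dictionaries, not an actual DynamoDB call), the person item above could be pictured like this. The second item and its `email` attribute are made up, to show that items in the same table do not have to share the same attributes, apart from the primary key attributes covered further down.

```python
# Two hypothetical items from a "People" table, pictured as Python dicts.
# Each dict key is an attribute name, each dict value is that attribute's value.
john = {"name": "John Doe", "age": 35, "address": "12 Example Street"}

# Items in the same table may carry different attributes;
# only the primary key attributes are required on every item.
jane = {"name": "Jane Roe", "email": "jane@example.com"}

for item in (john, jane):
    print(sorted(item.keys()))
```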
Primary keys
In DynamoDB, every table must have a primary key, which is used to uniquely identify each item in the table.
There are two types of primary keys in DynamoDB: simple primary keys and composite primary keys.
Partitions
Before explaining the two key types, we need to understand what partitions are.
In DynamoDB, a partition is a physical unit of storage for a table's data. Items with the same partition key are stored on the same partition.
This allows DynamoDB to distribute data across multiple servers, which helps to provide high levels of scalability and performance.
For instance, if you query the table using the partition key, DynamoDB can quickly locate the data on the appropriate partition.
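To make the idea concrete, here is a toy sketch of hash-based partitioning in Python. It is not how DynamoDB is implemented internally: the SHA-256 hash, the fixed `NUM_PARTITIONS`, and the `partition_for` helper are assumptions chosen only for illustration. The point is that the same partition key always maps to the same partition.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed value; DynamoDB manages the real partition count itself


def partition_for(partition_key: str) -> int:
    """Map a partition key to a partition number using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


# Group some example keys by the partition they hash to.
partitions = {n: [] for n in range(NUM_PARTITIONS)}
for key in ["Natalie Portman", "Tom Hanks", "Natalie Portman", "Toni Collette"]:
    partitions[partition_for(key)].append(key)

print(partitions)
```

Both "Natalie Portman" entries land in the same list, which is why a lookup by partition key only has to visit one partition.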
Database partitioning is actually a topic in system design. This is a great read if you want to dig further into it and see some visuals as well: Database Partitioning.
Simple primary keys
A simple primary key consists of a single element called the partition key. The partition key determines which partition an item is stored in, and because it is the whole primary key, it must be unique for every item in the table.
Composite primary keys
Composite primary keys consist of two elements that form the primary key: the partition key and the sort key.
The limitation of a simple primary key is that it contains only the partition key. Every item in a table must have a unique primary key, which in this case means a unique partition key.
For example, if the primary key for the table is the customer's name, then every item in the table must have a unique customer name. This means we cannot store multiple customers with the same name in the table.
In this situation, we could use a composite primary key that includes both the customer's name (the partition key) and their address (the sort key). This would allow us to store multiple items with the same customer name on the same partition, because each item would have a unique combination of the customer's name and address.
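A minimal sketch of how such a table could be declared with boto3's `create_table`. The table name `Customers` and the attribute names `Name` and `Address` are assumptions for this example. The dictionary below only builds the request parameters; the actual call is left commented out because it needs AWS credentials.

```python
# Request parameters for a table with a composite primary key.
create_table_params = {
    "TableName": "Customers",  # hypothetical table name
    "AttributeDefinitions": [
        {"AttributeName": "Name", "AttributeType": "S"},
        {"AttributeName": "Address", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "Name", "KeyType": "HASH"},      # partition key
        {"AttributeName": "Address", "KeyType": "RANGE"},  # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",
}

# With credentials configured, the table could be created like this:
# import boto3
# boto3.client("dynamodb").create_table(**create_table_params)

key_types = [k["KeyType"] for k in create_table_params["KeySchema"]]
print(key_types)
```

Dropping the `RANGE` entry (and the `Address` attribute definition) would turn this into a simple primary key.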
Secondary indexes
In addition to the primary key, tables can also have one or more secondary indexes. Secondary indexes allow you to query the data using a different key than the primary one.
For example, suppose you have a table that contains information about customers, including their names, addresses, and ages. The primary key for the table is the customer's name. This means that you can only query the table by the customer's name, not by the customer's address or age.
But what if we want to query customers based on their addresses? This is where secondary indexes come in: they allow us to do exactly that.
Local secondary indexes
Local secondary indexes use the same partition key as the table, but a different sort key.
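- Shares the base table's read and write capacity, so it does not need throughput provisioned separately the way a global secondary index does (it still stores its own copy of the projected attributes)
- Allows you to opt into strongly consistent reads
- Less flexible than global secondary indexes: it must be defined when the table is created and must reuse the table's partition key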
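As a sketch, a local secondary index is declared in the `LocalSecondaryIndexes` parameter of boto3's `create_table`, alongside the table definition. The `Name`, `Address`, and `Age` attributes and the index name `CustomersByAge` are made-up names for illustration. The small check at the end only confirms the rule stated above: same partition key, different sort key.

```python
# Key schema of the hypothetical base table.
table_key_schema = [
    {"AttributeName": "Name", "KeyType": "HASH"},
    {"AttributeName": "Address", "KeyType": "RANGE"},
]

# One entry of create_table's LocalSecondaryIndexes list.
local_index = {
    "IndexName": "CustomersByAge",  # hypothetical index name
    "KeySchema": [
        {"AttributeName": "Name", "KeyType": "HASH"},  # same partition key as the table
        {"AttributeName": "Age", "KeyType": "RANGE"},  # different sort key
    ],
    "Projection": {"ProjectionType": "ALL"},
}


def key_of(schema, key_type):
    """Return the attribute name used for the given key type (HASH or RANGE)."""
    return next(k["AttributeName"] for k in schema if k["KeyType"] == key_type)


same_partition_key = key_of(local_index["KeySchema"], "HASH") == key_of(table_key_schema, "HASH")
different_sort_key = key_of(local_index["KeySchema"], "RANGE") != key_of(table_key_schema, "RANGE")
print(same_partition_key, different_sort_key)
```

A query against such an index can pass `ConsistentRead=True`, which is how you opt into strong consistency.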
Global secondary indexes
Global secondary indexes can use a partition key, and optionally a sort key, that differ from the table's primary key.
- Allows you to query the table's data by the key attributes defined in the index
- Independent of the table's primary key, which means that you can query the index using a different key than the table's
- Requires additional storage space, since the index keeps its own copy of the projected attributes
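A hedged sketch of querying customers by address through a global secondary index. The table name `Customers`, the index name `CustomersByAddress`, and the attribute names are assumptions. The dictionaries only build the parameters that would be passed to boto3, so nothing here talks to AWS.

```python
# Index definition, as it would appear in create_table's GlobalSecondaryIndexes list.
global_index = {
    "IndexName": "CustomersByAddress",  # hypothetical index name
    "KeySchema": [
        # Partition key of the index; differs from the table's partition key.
        {"AttributeName": "Address", "KeyType": "HASH"},
    ],
    "Projection": {"ProjectionType": "ALL"},
}

# Query parameters that target the index instead of the table.
query_params = {
    "TableName": "Customers",
    "IndexName": global_index["IndexName"],
    "KeyConditionExpression": "#addr = :addr",
    "ExpressionAttributeNames": {"#addr": "Address"},
    "ExpressionAttributeValues": {":addr": {"S": "12 Example Street"}},
}

# With credentials configured, the query could be run like this:
# import boto3
# result = boto3.client("dynamodb").query(**query_params)

print(query_params["IndexName"])
```

The only difference from a regular table query is the `IndexName` parameter; the key condition then refers to the index's key rather than the table's.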
Type indicators
Type indicators tell DynamoDB the data type of each value. They are required when defining `ExpressionAttributeValues` with the low-level client.
Below is an example taken from the book.
import boto3

# Low-level client; assumes AWS credentials and a region are configured
dynamodb = boto3.client('dynamodb')

result = dynamodb.query(
    TableName='MovieRoles',
    KeyConditionExpression="#a = :a AND #m < :title",
    ExpressionAttributeNames={
        "#a": "Actor",
        "#m": "Movie"
    },
    ExpressionAttributeValues={
        # "S" indicates the value is of type string
        ":a": { "S": "Natalie Portman" },
        ":title": { "S": "N" }
    }
)
The different type indicators that exist:
- S: string
- N: number
- B: binary
- SS: string set
- NS: number set
- BS: binary set
- BOOL: boolean
- NULL: null
- M: map
- L: list
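To show how the indicators line up with Python values, here is a small hand-written converter. The `to_attribute_value` helper is hypothetical and covers only the common cases; for real use, boto3 ships its own `TypeSerializer` in `boto3.dynamodb.types`.

```python
def to_attribute_value(value):
    """Wrap a Python value in its DynamoDB type indicator."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, set) and value:
        # Sets must be non-empty and hold a single type.
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    raise TypeError(f"Unsupported type: {type(value).__name__}")


# Hypothetical item; the outer dict becomes a map ("M") of typed attributes.
item = {"name": "John Doe", "age": 35, "tags": {"customer", "newsletter"}}
print(to_attribute_value(item))
```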
Conclusion
The book was phenomenal. DynamoDB is harder to learn because it is different, but it is different in good ways, and extremely powerful.
I will go back to the book and use it as a reference when I get around to working with DynamoDB.
I want to point out that the book taught me more than just DynamoDB:
- Data modeling
- Entity-relationship diagrams
- System design
- How to write concisely (Alex has a beautiful way of writing)