Storing and managing data efficiently is a common challenge. When you keep large volumes of simulation results in an Amazon S3 bucket, alongside parameter data in a DynamoDB table, automated data management becomes crucial for limiting storage costs and manual labor.
In this blog post we will explore two features provided by Amazon Web Services (AWS): Lifecycle Rules in S3 and Time to Live (TTL) in DynamoDB. Both are effective ways to automate data cleanup.
S3 Lifecycle Rules for Efficient Storage Management
As data volumes increase in an S3 bucket, a structured storage plan becomes essential. This is where Lifecycle Rules in S3 come into play.
Lifecycle rules can be added via the AWS Management Console. Navigate to the S3 bucket and open the Management tab, where you can find Lifecycle Rules and create a new rule.
Each rule requires a unique name. Optionally, you can add a filter such as a prefix, a tag, or an object size. Without a filter, the rule applies to all objects in the bucket.
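To make the filter options concrete, here is a minimal sketch of how such a filter looks when a rule is written as a Python dictionary in the shape the S3 API expects. The helper name `build_filter` and the example prefix and tag are my own illustrations, not something the console generates.

```python
from typing import Optional, Tuple


def build_filter(
    prefix: Optional[str] = None,
    tag: Optional[Tuple[str, str]] = None,
    min_size_bytes: Optional[int] = None,
) -> dict:
    """Build the "Filter" part of an S3 lifecycle rule (hypothetical helper)."""
    conditions = {}
    if prefix is not None:
        conditions["Prefix"] = prefix
    if tag is not None:
        conditions["Tag"] = {"Key": tag[0], "Value": tag[1]}
    if min_size_bytes is not None:
        conditions["ObjectSizeGreaterThan"] = min_size_bytes

    if not conditions:
        # An empty prefix matches every object in the bucket.
        return {"Prefix": ""}
    if len(conditions) == 1:
        return conditions
    # Several conditions are wrapped in "And", where tags are given as a list.
    if "Tag" in conditions:
        conditions["Tags"] = [conditions.pop("Tag")]
    return {"And": conditions}


print(build_filter(prefix="results/"))
print(build_filter(prefix="results/", tag=("type", "simulation")))
```

A single condition sits directly in the filter, while a combination has to be wrapped in `And`.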
For example, here I’m creating a 3-day retention rule. I want the objects to expire after 3 days, but you could instead transition them to a different storage class such as Glacier.
The AWS Console helpfully displays how this rule will affect the objects in the bucket.
These rules let you define how objects transition between storage classes and when they expire, which removes the burden of manual tracking. Once a rule is saved, S3 manages the matching objects according to it.
Note: These rules can also be configured through Terraform when setting up the bucket.
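If you prefer code over the console, the same 3-day expiration rule can be sketched in Python with boto3. This is only an illustration under a few assumptions: the rule name, the `results/` prefix, and the bucket name are hypothetical, and boto3 needs to be installed with credentials that are allowed to change the bucket.

```python
def build_expiration_rule(rule_id: str, days: int, prefix: str = "") -> dict:
    """Build one lifecycle rule that expires matching objects after `days` days."""
    return {
        "ID": rule_id,
        "Status": "Enabled",
        # An empty prefix applies the rule to every object in the bucket.
        "Filter": {"Prefix": prefix},
        "Expiration": {"Days": days},
    }


def apply_lifecycle_rules(bucket: str, rules: list) -> None:
    """Write the rules to the bucket, replacing its existing lifecycle configuration."""
    import boto3  # imported here so the rule builder works without AWS access

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )


rule = build_expiration_rule("expire-after-3-days", days=3, prefix="results/")
print(rule)
# apply_lifecycle_rules("my-simulation-results", [rule])  # hypothetical bucket name
```

Keep in mind that `put_bucket_lifecycle_configuration` overwrites whatever lifecycle configuration the bucket already has, so pass every rule you want to keep. To move objects to Glacier instead of deleting them, the `Expiration` entry could be swapped for something like `"Transitions": [{"Days": 30, "StorageClass": "GLACIER"}]`.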
TTL in DynamoDB for Optimal Data Cleanup
As data accumulates in a DynamoDB table, time-sensitive items may need to be removed. DynamoDB’s Time to Live (TTL) feature is built for exactly this purpose.
TTL can be enabled via the DynamoDB service in the AWS Management Console. Find the table you want to enable TTL for, go to the Additional Settings tab, and turn on TTL. You’ll then be asked for the name of the TTL attribute.
The TTL attribute must be present on every item you want to expire, stored as a Number holding the Unix epoch time, in seconds, at which the item should expire. Items without the attribute are left untouched by TTL. There’s a helpful preview option where you can simulate the expiry time.
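The console steps above have a direct equivalent in boto3. The snippet below is a minimal sketch, assuming the TTL attribute is called `TimeToExist` (the name used in this post) and that the table name is a placeholder you replace with your own.

```python
def ttl_specification(attribute_name: str, enabled: bool = True) -> dict:
    """Build the TimeToLiveSpecification payload for a table."""
    return {"Enabled": enabled, "AttributeName": attribute_name}


def enable_ttl(table_name: str, attribute_name: str) -> None:
    """Turn on TTL for `table_name`, using `attribute_name` as the expiry attribute."""
    import boto3  # imported here so the payload builder works without AWS access

    dynamodb = boto3.client("dynamodb")
    dynamodb.update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification=ttl_specification(attribute_name),
    )


spec = ttl_specification("TimeToExist")
print(spec)
# enable_ttl("simulation-parameters", "TimeToExist")  # hypothetical table name
```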
After enabling TTL, all you need to do is set the TTL attribute on your table items. Below is the 3-day retention TTL attribute for a table item:
"TimeToExist": int(round(dt.datetime.now().timestamp())) + (72 * 60 * 60)
You can also see this attribute marked with a (TTL) suffix when you explore the items in the table.
And when you click the timestamp, you can see exactly when the item will expire.
Note: This can also be configured through Terraform when setting up the table.
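The same check can be done in code. Here is a small sketch that computes the 3-day TTL value for an item, converts it back into a readable UTC time, and (optionally) writes the item with boto3. The `RunId` key, the table name, and the helper names are assumptions for illustration; only `TimeToExist` comes from the example above.

```python
import datetime as dt

RETENTION_SECONDS = 72 * 60 * 60  # 3 days, matching the retention used in this post


def expiry_timestamp(now: dt.datetime, retention_seconds: int = RETENTION_SECONDS) -> int:
    """Return the Unix epoch second at which an item written at `now` should expire."""
    return int(round(now.timestamp())) + retention_seconds


def build_item(run_id: str, now: dt.datetime) -> dict:
    """Build a table item; `RunId` is a hypothetical key, `TimeToExist` is the TTL attribute."""
    return {"RunId": run_id, "TimeToExist": expiry_timestamp(now)}


def expiry_as_utc(ttl_value: int) -> dt.datetime:
    """Convert a stored TTL value back into a timezone-aware UTC datetime."""
    return dt.datetime.fromtimestamp(ttl_value, tz=dt.timezone.utc)


def save_item(table_name: str, item: dict) -> None:
    """Write the item to DynamoDB; needs boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without AWS access

    boto3.resource("dynamodb").Table(table_name).put_item(Item=item)


written_at = dt.datetime(2024, 1, 1, tzinfo=dt.timezone.utc)
item = build_item("run-001", written_at)
print(item["TimeToExist"])                 # 1704326400
print(expiry_as_utc(item["TimeToExist"]))  # 2024-01-04 00:00:00+00:00
# save_item("simulation-parameters", item)  # hypothetical table name
```

Storing the value as a plain integer matters here, because DynamoDB only treats Number attributes as TTL values.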
Incorporating S3 lifecycle rules and DynamoDB TTL is an effective strategy for managing data and reducing storage costs. Going forward, these strategies can be integrated into your applications’ architecture from the get-go, offering a level of automation that minimizes manual labor and increases efficiency.