How I solved Dynamic Task Scheduling using AWS DynamoDB TTL, Stream and Lambda

Tech 2019 / 09 / 16


Recently, I had to design a solution for dynamic task scheduling. The requirement was that an email had to be sent to each user one day after their record was inserted into the database. My initial thought was to run a cron job periodically to check for new records and send the emails. But that would be inefficient: even when there are no new records, the cron job would still run and check the whole database each time. After some digging, I found that this can be solved more efficiently using AWS DynamoDB TTL, Streams and Lambda, because the task is triggered independently for each record around the time specified. If your scheduling requires a separate execution time for each task, you can use this approach to schedule tasks individually. I will try to explain how to do it in this article.

・・・

How AWS DynamoDB TTL, Stream and Lambda Work

TTL stands for time to live. In DynamoDB, you can give each item in a table its own TTL: a timestamp after which the item expires and becomes eligible for deletion. DynamoDB Streams is a feature of DynamoDB that captures the changes made to a table. We can use a stream to trigger an action when an item is inserted, modified or deleted. Combining the two, we can use DynamoDB Streams to trigger an AWS Lambda function when an expired item is deleted from the table.

Limitations of DynamoDB TTL

Firstly, the biggest drawback is that DynamoDB TTL does not delete items exactly at the expiry time. After an item expires, when it is actually deleted depends on the nature of the workload and the size of the table. In the worst case, it may take up to 48 hours for the deletion to take place, as explained in the documentation:

DynamoDB typically deletes expired items within 48 hours of expiration. The exact duration within which an item truly gets deleted after expiration is specific to the nature of the workload and the size of the table. Items that have expired and have not been deleted still appear in reads, queries, and scans. These items can still be updated, and successful updates to change or remove the expiration attribute are honored.

So, if your task needs to be executed at exactly the time specified, this solution won’t work for you. But if your tasks only need to run some time after a given moment, with no tight constraint on how much later (sending an email or a notification, for example), then it might work for you. There is a nice article that tried to benchmark the TTL performance of AWS DynamoDB.
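
Because expired items can still show up in reads until DynamoDB actually deletes them, a reader of the table may want to filter them out itself. The following is a minimal sketch of that idea, assuming items have already been converted to plain objects with a numeric `expired_at` attribute in epoch seconds (the TTL attribute name used later in this article). The helper names `isExpired` and `filterLive` are hypothetical, not part of any AWS SDK.

```javascript
// Hypothetical helper: an item counts as expired once its TTL attribute
// (epoch seconds) is at or before the current time.
function isExpired(item, nowSeconds) {
  return typeof item.expired_at === 'number' && item.expired_at <= nowSeconds;
}

// Keep only the items that have not expired yet.
function filterLive(items, nowSeconds = Math.floor(Date.now() / 1000)) {
  return items.filter((item) => !isExpired(item, nowSeconds));
}

// Fixed timestamp so the example is deterministic (2019-09-16 00:00:00 UTC).
const now = 1568592000;
const items = [
  { email: 'a@example.com', expired_at: now - 60 },   // expired, possibly not deleted yet
  { email: 'b@example.com', expired_at: now + 3600 }, // still live
];

console.log(filterLive(items, now).map((item) => item.email));
```

Items without a numeric `expired_at` are kept, since DynamoDB never expires items that lack the TTL attribute.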

Secondly, a DynamoDB stream captures all kinds of events: insertions, modifications and deletions. Currently, there is no way to make the stream capture only a specific event type, say deletion. So we need to receive all kinds of events and decide from the event type whether to execute the task.
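
One way to make that decision stricter is to separate TTL deletions from manual ones. Stream records for items removed by the TTL process carry a `userIdentity` field with `type` set to `Service` and `principalId` set to `dynamodb.amazonaws.com`. The sketch below checks for that; the function name `isTtlDeletion` and the sample records are illustrative assumptions, and this check goes beyond what the handler later in this article does.

```javascript
// Returns true only for REMOVE records produced by DynamoDB's TTL process.
// A delete issued by a user or an application has no such userIdentity.
function isTtlDeletion(record) {
  const identity = record.userIdentity || {};
  return (
    record.eventName === 'REMOVE' &&
    identity.type === 'Service' &&
    identity.principalId === 'dynamodb.amazonaws.com'
  );
}

// Trimmed, hand-written sample records (real records carry more fields).
const ttlRemove = {
  eventName: 'REMOVE',
  userIdentity: { type: 'Service', principalId: 'dynamodb.amazonaws.com' },
};
const manualRemove = { eventName: 'REMOVE' };
const insert = { eventName: 'INSERT' };

for (const record of [ttlRemove, manualRemove, insert]) {
  console.log(record.eventName, isTtlDeletion(record));
}
```

With this check, deleting a record by hand (for example, when a user unsubscribes) would not trigger the scheduled task.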

 

Create AWS DynamoDB table

First of all, we need to create a DynamoDB table. Head over to AWS DynamoDB, create a table named ‘test-table’ and add a primary key named ‘email’ as follows. For simplicity, we keep all other settings at their defaults.
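
If you prefer to script this step instead of using the console, the same table can be described as a request for the DynamoDB CreateTable API. This is only a sketch: the function name `buildCreateTableParams` is hypothetical, and the on-demand billing mode is an assumption made to keep the request short, not necessarily the console default. The resulting object could be passed to `CreateTableCommand` in the AWS SDK for JavaScript v3.

```javascript
// Builds a CreateTable request for a table keyed on a string 'email' attribute.
function buildCreateTableParams(tableName) {
  return {
    TableName: tableName,
    // Only key attributes are declared up front; DynamoDB is schemaless otherwise.
    AttributeDefinitions: [{ AttributeName: 'email', AttributeType: 'S' }],
    KeySchema: [{ AttributeName: 'email', KeyType: 'HASH' }],
    // Assumption: on-demand billing, so no capacity numbers are needed here.
    BillingMode: 'PAY_PER_REQUEST',
  };
}

const createTableParams = buildCreateTableParams('test-table');
console.log(JSON.stringify(createTableParams, null, 2));
```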


 

Now, we need to enable TTL on the table. Select the table, go to the Overview section, and under Table details you will find Time to live attribute; click Enable TTL. A dialog box will appear. In the TTL attribute field, type the name of the attribute that holds the time at which each item should expire. In our case, it is ‘expired_at’. Then, in the DynamoDB Streams section, enable the stream with the view type New and old images by checking the box. Then click Continue. Now you are all set with the DynamoDB table and stream.
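
The two console settings above correspond to two API requests: UpdateTimeToLive for the TTL attribute and UpdateTable for the stream. The sketch below only builds the request objects; the builder function names are hypothetical, while the field names follow the DynamoDB API. With the AWS SDK for JavaScript v3 they would be sent with `UpdateTimeToLiveCommand` and `UpdateTableCommand` respectively.

```javascript
// Request for UpdateTimeToLive: tells DynamoDB which attribute holds the expiry time.
function buildTtlParams(tableName, attributeName) {
  return {
    TableName: tableName,
    TimeToLiveSpecification: { Enabled: true, AttributeName: attributeName },
  };
}

// Request for UpdateTable: enables the stream with both old and new item images,
// so a REMOVE record still contains the deleted item's attributes.
function buildStreamParams(tableName) {
  return {
    TableName: tableName,
    StreamSpecification: {
      StreamEnabled: true,
      StreamViewType: 'NEW_AND_OLD_IMAGES',
    },
  };
}

const ttlParams = buildTtlParams('test-table', 'expired_at');
const streamParams = buildStreamParams('test-table');
console.log(JSON.stringify({ ttlParams, streamParams }, null, 2));
```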

Enable TTL for AWS DynamoDB table

 

Create AWS Lambda function

Now, head over to AWS Lambda. Before creating a Lambda function, create an IAM role that has access to DynamoDB. If you are not sure how to do it, follow this article. Then, from the AWS Lambda console, create a Lambda function, give it a name like ‘test-function’ and select the runtime. I am using Node.js, but you can use any runtime you like. For the execution role, select the role created earlier and click Create function.


 

Add AWS DynamoDB Stream trigger to AWS Lambda function

After the function is created, click Add trigger, then select your DynamoDB table, in our case ‘test-table’. Set the batch size to 1, as we want to process only one record at a time. Set the batch window to 0, as we don’t want any extra delay in Lambda execution once the expired record is deleted. Set the starting position to Latest and then click Add. Now your DynamoDB Stream trigger is set.
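
For reference, the trigger settings above map onto the parameters of Lambda's CreateEventSourceMapping API. The sketch below builds such a request; `buildTriggerParams` is a hypothetical helper, and the stream ARN is a placeholder you would replace with the ARN shown for your table's stream.

```javascript
// Builds a CreateEventSourceMapping request that connects a DynamoDB stream
// to a Lambda function using the settings chosen in the console above.
function buildTriggerParams(streamArn, functionName) {
  return {
    EventSourceArn: streamArn,
    FunctionName: functionName,
    BatchSize: 1,                      // one stream record per invocation
    MaximumBatchingWindowInSeconds: 0, // do not wait to gather more records
    StartingPosition: 'LATEST',        // ignore records written before the trigger existed
  };
}

const triggerParams = buildTriggerParams('<your-stream-arn>', 'test-function');
console.log(JSON.stringify(triggerParams, null, 2));
```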

Add AWS DynamoDB trigger to Lambda function

 

Add AWS Lambda Implementation to execute the task

Now it’s time to write the code that handles the DynamoDB Stream records. Go to the AWS Lambda function and add your code. The following example is written in Node.js. Notice that we handle every event type but send the email only on a ‘REMOVE’ event. As stated in this article, you can act on ‘INSERT’ and ‘MODIFY’ events as well, depending on your requirements.

exports.handler = async (event) => {

    // Loop over the batch so the handler stays correct
    // even if the batch size is later raised above 1.
    for (const record of event.Records) {
        const eventName = record.eventName;

        if (eventName === 'REMOVE') {
            // OldImage is the item as it was before deletion; it is only
            // present with the "New and old images" stream view type.
            const email = record.dynamodb.OldImage.email.S;
            console.log("Sending email to : " + email);

            //code to send email

        }
        else {
            console.log("Event is " + eventName + ", Skipping execution");
        }
    }

    const response = {
        statusCode: 200,
        body: JSON.stringify('Success'),
    };
    return response;
};
https://gist.github.com/anik0054/01bbc413dcef15f03283cef6f1501c77
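
Since a real TTL deletion can take a long time to arrive, it can help to exercise the handler with a hand-made event first. The sketch below builds a trimmed ‘REMOVE’ event in the shape the handler reads; `makeRemoveEvent` and `readEmail` are hypothetical helpers, and real stream records carry more fields (event ID, region, sequence number and so on) than shown here.

```javascript
// Builds a minimal stream event containing a single REMOVE record.
// Attribute values use DynamoDB's typed format: S for strings, N for numbers.
function makeRemoveEvent(email, expiredAt) {
  return {
    Records: [
      {
        eventName: 'REMOVE',
        eventSource: 'aws:dynamodb',
        dynamodb: {
          Keys: { email: { S: email } },
          OldImage: {
            email: { S: email },
            expired_at: { N: String(expiredAt) },
          },
        },
      },
    ],
  };
}

// Reads the email the same way the handler above does.
function readEmail(event) {
  return event.Records[0].dynamodb.OldImage.email.S;
}

const testEvent = makeRemoveEvent('example@example.com', 1568678400);
console.log(readEmail(testEvent));
```

The same object can be pasted as JSON into a test event in the Lambda console, or passed directly to `exports.handler` in a local script.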

 

Add records to DynamoDB table

At this point, the entire workflow is in place. Now we can add records to the DynamoDB table along with an expiry time. You can use the Java code given below or insert records into the table manually. Please note that the TTL value must be a Number in Unix epoch time format, in seconds. When an expired item is deleted, the DynamoDB stream invokes the Lambda function to run the desired task. Check the Lambda function’s CloudWatch logs to make sure the system is working as intended.

import java.time.LocalDateTime;
import java.time.ZoneId;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;

public class DynamoDBRecord {

  public static void main(String[] args) {

    BasicAWSCredentials basicCredentials =
        new BasicAWSCredentials("<your-access-key>", "<your-secret-key>");
    AmazonDynamoDB amazonDynamoDB = AmazonDynamoDBClientBuilder.standard()
        .withCredentials(new AWSStaticCredentialsProvider(basicCredentials))
        .withRegion(Regions.fromName("<your-aws-region>")).build();
    DynamoDB dynamoDB = new DynamoDB(amazonDynamoDB);

    String email = "example@example.com";
    LocalDateTime now = LocalDateTime.now();
    LocalDateTime expiresAt = now.plusDays(1);
    Long expiresAtEpoch = expiresAt.atZone(ZoneId.systemDefault()).toEpochSecond();

    Table table = dynamoDB.getTable("test-table");
    // The attribute name must match the TTL attribute configured on the table ('expired_at').
    Item item = new Item().withPrimaryKey("email", email).withNumber("expired_at", expiresAtEpoch);
    table.putItem(item);

  }

}
https://gist.github.com/anik0054/2eb617aad6b31bdadcc7055cf86e356e
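
If you would rather insert records from Node.js, the part that is easy to get wrong is the timestamp: `Date.now()` returns milliseconds, while TTL expects seconds. Below is a minimal sketch of building the item in the low-level PutItem format; `expiryEpochSeconds` and `buildItem` are hypothetical helpers, and ‘expired_at’ is the TTL attribute configured on the table earlier.

```javascript
// Converts a millisecond timestamp to epoch seconds and adds the delay in days.
function expiryEpochSeconds(nowMs, delayDays) {
  const secondsPerDay = 24 * 60 * 60;
  return Math.floor(nowMs / 1000) + delayDays * secondsPerDay;
}

// Builds the Item part of a PutItem request. In the low-level format,
// numbers are sent as strings under the N type descriptor.
function buildItem(email, nowMs) {
  return {
    email: { S: email },
    expired_at: { N: String(expiryEpochSeconds(nowMs, 1)) },
  };
}

// Fixed timestamp for a deterministic example (2019-09-16 00:00:00 UTC, in ms).
const item = buildItem('example@example.com', 1568592000000);
console.log(JSON.stringify(item));
```

The item would then be sent as `{ TableName: 'test-table', Item: item }` in a PutItem request.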

 

Apart from the drawbacks stated above, this solution works pretty well. However, if you need precise execution times, you can take a look at AWS Step Functions. I will try to explain that in another article.

If you have made it this far, thanks a lot for your patience. Please let me know if there is more to add. Feel free to connect.

Check out other articles from our engineering team:
https://medium.com/monstar-lab-bangladesh-engineering

