When analyzing large datasets over time, using a time-series database such as InfluxDB or TimescaleDB is essential. AWS offers a managed service called Timestream, available in the following regions:
- us-gov-west-1
- ap-northeast-1
- ap-southeast-2
- eu-central-1
- eu-west-1
- us-east-1
- us-east-2
- us-west-2
In this article, we will cover how to:
- Deploy Timestream in serverless mode
- Populate a table with stream records
- Query the data with a simple SQL query
Project Creation
Comprehensive information about project creation can be found here, in a dedicated post.
You now have a template.yml file in the sam-project folder, which we will edit.
Project Deployment
Open the template.yml file and add the following lines:
```yaml
AWSTemplateFormatVersion: 2010-09-09
Description: >-
  Create Timestream stack

Transform:
- AWS::Serverless-2016-10-31

Conditions:
  isDev: !Equals [ !Ref Stage, 'dev' ]

Parameters:
  Stage:
    Type: String
    Description: Stage of this environment
    Default: "dev"

Resources:
  TimeStreamDatabase:
    Type: AWS::Timestream::Database
    Properties:
      DatabaseName: timestream-database

  S3Bucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Delete
    Properties:
      BucketName: s3-timestream-magnetic-store-rejected-data

  TimeStreamTable:
    Type: AWS::Timestream::Table
    Properties:
      DatabaseName: timestream-database
      TableName: timestream-table
      MagneticStoreWriteProperties:
        EnableMagneticStoreWrites: true
        MagneticStoreRejectedDataLocation:
          S3Configuration:
            BucketName: s3-timestream-magnetic-store-rejected-data
            EncryptionOption: SSE_S3
      RetentionProperties:
        MagneticStoreRetentionPeriodInDays: !If [ isDev, '365', '730' ]
        MemoryStoreRetentionPeriodInHours: !If [ isDev, '364', '72' ]
      Schema:
        CompositePartitionKey:
          - EnforcementInRecord: REQUIRED
            Name: userId
            Type: DIMENSION
    DependsOn:
      - S3Bucket
```
This template defines three AWS resources: a Timestream database, a Timestream table, and an S3 bucket that receives data rejected by the magnetic store.
IAM
Ensure your credentials and IAM permissions include the following policies:
```json
{
  "Sid": "TimestreamPolicy",
  "Effect": "Allow",
  "Action": [
    "timestream:*"
  ],
  "Resource": [
    "*"
  ]
},
{
  "Sid": "S3",
  "Effect": "Allow",
  "Action": [
    "s3:*"
  ],
  "Resource": [
    "arn:aws:s3:::*"
  ]
}
```
WARNING: These wildcard policies are not a best practice and should be tightened to match your security requirements.
Deploy
Deploy the project with the following commands:
```bash
sam build
sam deploy --stack-name timestream-stack
```
Populate table
Prerequisites:
- Install the csv-parse library
- Install tsx for local TypeScript execution
- Install the @aws-sdk Timestream client libraries
```bash
cd hello-world
npm i csv-parse -D
npm i tsx -D
npm install @aws-sdk/client-timestream-query -D
npm install @aws-sdk/client-timestream-write -D
```
Now, create the TimeStreamRecord data structure according to your needs. For this example, we will use:
Three dimensions:
- userId
- userType
- documentType
One value:
- size (document file size)
Two measure names:
- DOCUMENT_PUT
- DOCUMENT_GET
```typescript
export type TimeStreamRecord = {
  userId: string;
  userType: string;
  documentType: string;
  size: number;
  time: number;
  measureName: TimeStreamMeasureName;
};

export enum TimeStreamMeasureName {
  DOCUMENT_PUT = 'DOCUMENT_PUT',
  DOCUMENT_GET = 'DOCUMENT_GET',
}
```
You can build sample data with Mockaroo to obtain this sample file:
File: sample.csv
```csv
userId,userType,documentType,size,time,measureName
a37e3311-3c49-46ae-bdff-7cf877968b45,CUSTOMER,INVOICE,83388,1710348922000,DOCUMENT_GET
d2c76e6e-9232-41dc-820a-4762924821a2,CUSTOMER,QUOTATION,69918,1709040240000,DOCUMENT_GET
bda1f773-3c4b-4433-83ab-84f00c1506f2,SUPPLIER,QUOTATION,93095,1720069220000,DOCUMENT_PUT
...
```
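The populate.ts script itself is only run below, so here is a minimal sketch of what it could look like. It reads sample.csv with csv-parse and writes the rows in batches through @aws-sdk/client-timestream-write; the region, the CsvRow helper type, and the simple batching loop are assumptions, while the database and table names follow the template above:

```typescript
// populate.ts -- minimal sketch; region, file name, and batching are assumptions.
import { readFileSync } from 'node:fs';
import { parse } from 'csv-parse/sync';
import {
  TimestreamWriteClient,
  WriteRecordsCommand,
  type _Record,
} from '@aws-sdk/client-timestream-write';

// csv-parse returns every column as a string; this mirrors sample.csv's header.
type CsvRow = {
  userId: string;
  userType: string;
  documentType: string;
  size: string;
  time: string;
  measureName: string;
};

const client = new TimestreamWriteClient({ region: 'eu-west-1' }); // assumed region

const rows: CsvRow[] = parse(readFileSync('sample.csv'), { columns: true });

// Map each CSV row to the record shape expected by the WriteRecords API.
const records: _Record[] = rows.map((row) => ({
  Dimensions: [
    { Name: 'userId', Value: row.userId },
    { Name: 'userType', Value: row.userType },
    { Name: 'documentType', Value: row.documentType },
  ],
  MeasureName: row.measureName,
  MeasureValue: row.size,
  MeasureValueType: 'BIGINT',
  Time: row.time,
  TimeUnit: 'MILLISECONDS',
}));

async function main(): Promise<void> {
  // WriteRecords accepts at most 100 records per call, so write in batches.
  for (let i = 0; i < records.length; i += 100) {
    await client.send(
      new WriteRecordsCommand({
        DatabaseName: 'timestream-database',
        TableName: 'timestream-table',
        Records: records.slice(i, i + 100),
      }),
    );
  }
  console.log(`Ingested ${records.length} TimeStreamRecord`);
}

main().catch(console.error);
```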
Finally, run the script populate.ts:
```
npx tsx populate.ts
...
Result 1001
Ingested 1000 TimeStreamRecord
```
You have now populated your Timestream table with records and can view the data in the AWS Console using the Timestream Query Editor.
Query the table
For this tutorial, we will execute a simple aggregate query:
```sql
SELECT YEAR(time), SUM(measure_value::bigint), count(*), documentType
FROM "timestream-database"."timestream-table"
WHERE
  time BETWEEN '2023-07-01' AND current_date + 1d
GROUP BY YEAR(time), documentType
ORDER BY YEAR(time)
```
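The query.ts script is likewise only executed below; a minimal sketch of it, assuming the same region as the populate script and the query above, could use @aws-sdk/client-timestream-query like this (the result-shaping to match the output shown further down is an assumption):

```typescript
// query.ts -- minimal sketch; region and result shaping are assumptions.
import {
  TimestreamQueryClient,
  QueryCommand,
} from '@aws-sdk/client-timestream-query';

const client = new TimestreamQueryClient({ region: 'eu-west-1' }); // assumed region

const QueryString = `
  SELECT YEAR(time), SUM(measure_value::bigint), count(*), documentType
  FROM "timestream-database"."timestream-table"
  WHERE time BETWEEN '2023-07-01' AND current_date + 1d
  GROUP BY YEAR(time), documentType
  ORDER BY YEAR(time)`;

async function main(): Promise<void> {
  // One page is enough for this small aggregate; larger result sets would
  // require following NextToken across successive QueryCommand calls.
  const { Rows } = await client.send(new QueryCommand({ QueryString }));

  // Scalar values arrive positionally, in the order of the SELECT clause:
  // YEAR(time), SUM(...), count(*), documentType.
  const results = (Rows ?? []).map(({ Data }) => ({
    documentType: Data?.[3]?.ScalarValue,
    year: Data?.[0]?.ScalarValue,
    size: Data?.[1]?.ScalarValue,
    count: Data?.[2]?.ScalarValue,
  }));

  console.log(results);
}

main().catch(console.error);
```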
Script execution of query.ts:
```bash
npx tsx query.ts
```
Aggregate results:
```
[
  { documentType: 'ORDER', year: '2023', size: '6575787', count: '134' },
  { documentType: 'INVOICE', year: '2023', size: '6710194', count: '145' },
  { documentType: 'QUOTATION', year: '2023', size: '7439707', count: '146' },
  { documentType: 'QUOTATION', year: '2024', size: '9630998', count: '194' },
  { documentType: 'ORDER', year: '2024', size: '10037785', count: '196' },
  { documentType: 'INVOICE', year: '2024', size: '9329493', count: '185' },
]
```
Summary
In this post, we learned how to deploy and manage an AWS Timestream database in a serverless environment using AWS SAM, populate the database with sample data, and perform basic SQL queries to analyze the stored time-series data.
I invite you to explore the following aspects of Timestream not covered in this tutorial:
- RetentionProperties (Memory and Magnetic)
- CompositePartitionKey
- Dimension
- MagneticStoreRejectedDataLocation
- Prometheus query & Grafana
- Complex queries (interpolation & functions)
WARNING: Remember to clean up AWS resources
```bash
sam delete
```