Adding AWS Timestream to Prometheus.

In a continuation of my existing build (Prometheus in Fargate), I have been playing around with remote storage options and for the purposes of this document, I am using AWS Timestream.

Prometheus out of the box will store up to 15 days of metrics, but these are lost should the instance/server be terminated. With this in mind, it is worthwhile looking into storage options for metric data with a few options being:

I am sure there are other options available but these are the most common I found when researching. For the sake of my sanity and the many rewrites I have already carried out for this document I am not going into detail regarding any of the above options, and simply showing how I built and updated my current Prometheus build to use the Amazon option.

The Amazon Timestream option requires an adapter, which is configured in Prometheus as a remote_write target to store data. This adapter only supports writes, so as I am using Grafana for my dashboards I’ll need to install another plugin/adapter that allows Grafana to connect to my Timestream DB.

Creating the ECR repository

Before starting you need to add an additional repository to the ECR and for this I have kept it simple and descriptive by calling it timestream-adapter.

This is where the build will be stored and called by the task_definition.json file in the Prometheus module.
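If you prefer the command line to the console, the repository can be created with the AWS CLI. This is a minimal sketch, assuming the AWS CLI is installed and credentials for the target account are already configured. The function name create_adapter_repo is my own for illustration, and us-east-1 simply mirrors the region used in the push commands later in this document.

```shell
# Hypothetical helper: creates the ECR repository that will hold the adapter image.
# $1 = repository name (default: timestream-adapter)
# $2 = region (default: us-east-1, an assumption taken from the push commands)
create_adapter_repo() {
  repo_name="${1:-timestream-adapter}"
  region="${2:-us-east-1}"
  aws ecr create-repository \
    --repository-name "$repo_name" \
    --region "$region"
}
```

Calling create_adapter_repo with no arguments creates timestream-adapter in us-east-1. If you use aws-runas as I do, prefix the call accordingly.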

Cloning and building the Timestream adapter

Using the existing build the folder structure is as follows:

Prometheus
  Docker
    alertmanager
    prometheus

We will create a new folder called timestream

Prometheus
  Docker
    alertmanager
    prometheus
    timestream

Navigate to this folder and clone the adapter created by Dennis Pattmann:

git clone https://github.com/dpattmann/prometheus-timestream-adapter.git

Once cloned, you need to prepare two of the files to suit your needs: main.go and the Dockerfile. I had a few issues when first building this and actually reached out to Dennis to discuss them. The Dockerfile you clone is actually incorrect: although it builds successfully, I received x509 errors in the CloudWatch logs.

The Dockerfile when complete will look like:

FROM golang:1.15.3-alpine3.12 as build
WORKDIR /src/prometheus-timestream-adapter
ADD . /src/prometheus-timestream-adapter
RUN apk add build-base ca-certificates
RUN go test
RUN CGO_ENABLED=0 GOOS=linux go build -o /prometheus-timestream-adapter

FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /prometheus-timestream-adapter /usr/local/bin/prometheus-timestream-adapter
ENTRYPOINT ["/usr/local/bin/prometheus-timestream-adapter"]

and main.go has a few configurable values under func init():

func init() {
	prometheus.MustRegister(receivedSamples)
	prometheus.MustRegister(sentSamples)
	prometheus.MustRegister(failedSamples)
	prometheus.MustRegister(sentBatchDuration)

	flag.StringVar(&cfg.awsRegion, "awsRegion", "<enter region>", "")
	flag.StringVar(&cfg.databaseName, "databaseName", "prometheus", "")
	flag.StringVar(&cfg.listenAddr, "listenAddr", ":9201", "")
	flag.StringVar(&cfg.tableName, "tableName", "prometheus-table", "")
	flag.StringVar(&cfg.telemetryPath, "telemetryPath", "/metric", "")
	flag.StringVar(&cfg.tlsCert, "tlsCert", "tls.cert", "")
	flag.StringVar(&cfg.tlsKey, "tlsKey", "tls.key", "")
	flag.BoolVar(&cfg.tls, "tls", false, "")

	flag.Parse()
}

Once these have been updated you are ready to build.

docker build -t timestream-adapter .
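Because the values in func init() are ordinary Go flags and the image's ENTRYPOINT is the adapter binary, they can also be overridden when the container starts instead of being edited in main.go. A rough sketch for trying the image locally follows. The function name run_adapter_locally is hypothetical, and the container would still need AWS credentials from somewhere before it could actually write to Timestream.

```shell
# Hypothetical helper: runs the freshly built image and overrides the flag defaults.
# $1 = AWS region, $2 = database (default: prometheus), $3 = table (default: prometheus-table)
run_adapter_locally() {
  docker run --rm -p 9201:9201 timestream-adapter \
    -awsRegion="$1" \
    -databaseName="${2:-prometheus}" \
    -tableName="${3:-prometheus-table}"
}
```

For example, run_adapter_locally us-east-1 starts the adapter on port 9201 with the default database and table names.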

Once built you can push this to the waiting repository in AWS.

Obtain an authentication token and authenticate Docker:

aws-runas <name of account> aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <accountID>.dkr.ecr.us-east-1.amazonaws.com

Tag the image appropriately:

docker tag timestream-adapter:latest <accountID>.dkr.ecr.us-east-1.amazonaws.com/timestream-adapter:latest

Push the build to ECR:

docker push <accountID>.dkr.ecr.us-east-1.amazonaws.com/timestream-adapter:latest
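The registry host is repeated in every one of these commands, so it can be worth building the image URI once and reusing it. The sketch below is one way to do that. Both function names are my own, and it assumes the local image name matches the ECR repository name, as it does in this build.

```shell
# Hypothetical helper: prints the full ECR image URI.
# $1 = account ID, $2 = region, $3 = repository, $4 = tag (default: latest)
ecr_image_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "${4:-latest}"
}

# Hypothetical wrapper around the tag and push steps shown above.
# Takes the same arguments as ecr_image_uri; docker login must already have succeeded.
tag_and_push() {
  image_tag="${4:-latest}"
  uri="$(ecr_image_uri "$1" "$2" "$3" "$image_tag")"
  docker tag "$3:$image_tag" "$uri" && docker push "$uri"
}
```

With this in place, tag_and_push followed by the account ID, us-east-1 and timestream-adapter replaces the two separate tag and push commands.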

Amazon Timestream

At the time of writing, the version of Terraform I was using did not allow me to set up a Timestream database and table, so this has been completed manually using the AWS console.

Log in to the console and choose Amazon Timestream from the list of AWS services under Database (search if easier).

Create a database called prometheus by clicking the Create database button in the top right-hand corner.

Once created, click on the database called prometheus, open the Tables tab and then click Create table.

I called my table prometheus-table, configuring the Memory store retention to 12 hours and the Magnetic store retention to 1 year.
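The same database and table can be created without the console using the AWS CLI's timestream-write commands. The sketch below is an approximation of the console steps above rather than what I actually ran. The function name is hypothetical, 1 year is taken to mean 365 days, and the region comes from whatever the CLI is configured to use.

```shell
# Hypothetical helper: creates the Timestream database and table from the command line.
# $1 = database (default: prometheus), $2 = table (default: prometheus-table)
# Retention mirrors the console settings: 12 hours in memory, 365 days (assumed 1 year) magnetic.
create_timestream_store() {
  database="${1:-prometheus}"
  table="${2:-prometheus-table}"
  aws timestream-write create-database --database-name "$database" &&
  aws timestream-write create-table \
    --database-name "$database" \
    --table-name "$table" \
    --retention-properties "MemoryStoreRetentionPeriodInHours=12,MagneticStoreRetentionPeriodInDays=365"
}
```

Running create_timestream_store with no arguments uses the same names as the adapter defaults in main.go.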

Updating IAM permissions

Pretty much the last thing I did was add permissions allowing the ECS task to talk to Timestream. I wanted to follow the CloudWatch logs to ensure it was working correctly before altering permissions and was relieved to see the ‘Access denied’ reference.
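If you want to follow the adapter logs from a terminal rather than the console, AWS CLI v2 can tail a log group. A small sketch, assuming the log group follows the /<environment>/timestream naming created later in this document. The function name is my own.

```shell
# Hypothetical helper: follows the adapter's CloudWatch log group (requires AWS CLI v2).
# $1 = environment name used in the log group path, e.g. "dev" for /dev/timestream
tail_adapter_logs() {
  aws logs tail "/$1/timestream" --follow --since 10m
}
```

Something like tail_adapter_logs dev will stream new log lines until interrupted, which is handy for spotting permission errors as they happen.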

A reminder of the folder structure so far (including previous build):

prometheus
  infrastructure
  docker
    alertmanager
    prometheus
    timestream
  modules
    ecs
    security-groups
    app
      templates

For the purposes of this document, I am adding the permissions here. Update the ECS module file called main.tf to include the Timestream policy. In this build, I am using the AWS managed policy called AmazonTimestreamFullAccess but would suggest practising the ‘least privilege’ method for your own builds.

ecs/main.tf

resource "aws_iam_role_policy_attachment" "timestream-access" {
  role       = "${aws_iam_role.task-definition-role.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonTimestreamFullAccess"
}

I added the above under the last entry for role_policy_attachment.
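As a starting point for a least-privilege alternative, the sketch below prints a policy document limited to writing records to a single table. This is an assumption on my part rather than something I have run in this build: I believe timestream:WriteRecords on the table plus timestream:DescribeEndpoints on all resources covers a write-only client, but the adapter may need more, and the CloudWatch logs will show ‘Access denied’ if it does. The function name is hypothetical.

```shell
# Hypothetical helper: prints a write-only IAM policy for one Timestream table.
# $1 = region, $2 = account ID, $3 = database, $4 = table
# Assumption: WriteRecords plus DescribeEndpoints is enough for the adapter.
timestream_write_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "timestream:DescribeEndpoints",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "timestream:WriteRecords",
      "Resource": "arn:aws:timestream:$1:$2:database/$3/table/$4"
    }
  ]
}
EOF
}
```

The output could be saved to a file and referenced from Terraform, or passed to aws iam put-role-policy with --policy-document for a quick test against the task role.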

Adding an additional CloudWatch log group

Update main.tf with a new entry for aws_cloudwatch_log_group. This will be used by the task definition later on in this document.

ecs/main.tf

resource "aws_cloudwatch_log_group" "timestream" {
  name              = "/${var.env["environment"]}/timestream"
  retention_in_days = "7"
  tags              = "${merge(map("Name",format("%s-cloudwatch",var.tags["environment"])), var.tags)}"
}

Updating the Task Definition

Navigate to modules/app/templates and open up the task_definition.json file. For this, we are going to add a new entry after the coveo/ecs-exporter section.

{
  "essential": false,
  "cpu": 10,
  "image": "coveo/ecs-exporter",
  "memory": 64,
  "name": "ecs-exporter",
  "networkMode": "awsvpc",
  "command": ["-aws.region=<enter region>"],
  "portMappings": [
    {
      "containerPort": 9222,
      "hostPort": 9222,
      "protocol": "tcp"
    }
  ]
},

Don’t forget the comma ^^^ at the end.

{
  "essential": false,
  "cpu": 10,
  "image": "<accountID>.dkr.ecr.<enter region>.amazonaws.com/timestream-adapter:latest",
  "memoryReservation": 128,
  "name": "timestream-adapter",
  "networkMode": "awsvpc",
  "portMappings": [
    {
      "containerPort": 9201,
      "hostPort": 9201,
      "protocol": "tcp"
    }
  ],
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "/${awslogs-group}/timestream",
      "awslogs-region": "${awslogs-region}",
      "awslogs-stream-prefix": "timestream"
    }
  }
}
]

The new entry will more than likely change in the future as I convert to variables etc. For the purposes of this build, we can manage with the above.

Target Groups and Listener update

The last thing we need to do with the infrastructure is to update the target group to include port 9201. Navigate to modules/app and open main.tf.

Add the following after “aws_alb_target_group” “alertmanager”:

# Adding target group for timestream adapter
resource "aws_alb_target_group" "timestream" {
  name        = "${var.env["environment"]}-ts"
  port        = 9201
  protocol    = "HTTP"
  target_type = "ip"
  vpc_id      = "${var.vpc_id}"

  health_check {
    matcher = "200,302"
  }
}

resource "aws_alb_listener" "timestream" {
  load_balancer_arn = "${aws_alb.main.id}"
  port              = 9201
  protocol          = "HTTP"

  default_action {
    target_group_arn = "${aws_alb_target_group.timestream.id}"
    type             = "forward"
  }
}

Adding remote_write

In order for Prometheus to use the timestream-adapter you need to add a reference to it in the prometheus.yml file and rebuild the image. Navigate to prometheus/docker/prometheus and open prometheus.yml.

Add the following:

remote_write:
- url: "http://localhost:9201/write"

Rebuild the image as before:

docker build -t prometheus .

Tag the image:

docker tag prometheus:latest <accountID>.dkr.ecr.<enter region>.amazonaws.com/prometheus:latest

Push the image to ECR:

docker push <accountID>.dkr.ecr.<enter region>.amazonaws.com/prometheus:latest

And finally

At this point, you should be ready to execute a Terraform apply. In my environment, I stopped the prometheus task in order to force a new task to start with the changes, and after a few minutes I could see not only logs for the timestream task but also data in the Timestream DB I had created.
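One way to confirm that samples are arriving without opening the console is to count recent rows with the AWS CLI's timestream-query command. This is a hedged sketch: the function name is hypothetical, the 15 minute window is arbitrary, and the region is whatever the CLI is configured to use.

```shell
# Hypothetical helper: counts rows written to the table in the last 15 minutes.
# $1 = database (default: prometheus), $2 = table (default: prometheus-table)
count_recent_samples() {
  database="${1:-prometheus}"
  table="${2:-prometheus-table}"
  aws timestream-query query \
    --query-string "SELECT COUNT(*) FROM \"$database\".\"$table\" WHERE time > ago(15m)"
}
```

A count that keeps growing between runs suggests the remote_write path from Prometheus through the adapter is working.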

I have gone over this document and should have captured all the steps I took in order to build out the Timestream DB in AWS and utilize the adapter written by Dennis in Go. If, however, you do notice something I have missed or want to give me any feedback, please leave me a comment.

Senior Site Reliability Engineer