Why does this thing exist?

Amazon wants customers to be successful in building products with AWS, so it publishes lots of whitepapers on best practices. See, for example, the Serverless Applications Lens of the AWS Well-Architected Framework, which shares several guidelines around the operational excellence pillar: metrics, alerts, structured logging, distributed tracing, prototyping, testing and deploying. To make guidelines such as structured logging and tracing easier to implement, AWS offers Lambda Powertools. Without this library, working with custom metrics and X-Ray involves writing undifferentiated boilerplate code.

What problems does this thing solve?

Powertools helps to implement observability best practices:

  • Structured logging, with out-of-the-box functionality such as logging the incoming event; serialization is handled for you too
  • Distributed tracing with X-Ray using decorators
  • Metrics: an easier way to work with custom metrics; they are flushed via the CloudWatch Embedded Metric Format, so no synchronous calls to the CloudWatch API are needed

It also comes with handy extras:

  • Static typing classes for the Lambda context
  • Data parsing and deep validation using Pydantic
  • Event source data classes - data classes describing the schema of common Lambda event triggers
  • Parameters utility - retrieve and cache parameter values from Parameter Store, Secrets Manager, or DynamoDB (see the sketch after this list)
  • A rule engine for feature flags
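
As a quick taste of the event source data classes and the parameters utility, here is a minimal sketch; the /demo/greeting parameter name is made up:

from aws_lambda_powertools.utilities import parameters
from aws_lambda_powertools.utilities.data_classes import APIGatewayProxyEvent

def lambda_handler(event, context):
    event = APIGatewayProxyEvent(event)  # typed accessors instead of digging through a raw dict
    greeting = parameters.get_parameter("/demo/greeting", max_age=60)  # cached for 60 seconds
    name = (event.query_string_parameters or {}).get("name", "world")
    return {"statusCode": 200, "body": f"{greeting}, {name}!"}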

Structured Logging

In his talk Simplifying serverless best practices with AWS Lambda Powertools, Nicolas Moutschen identifies different types of logging:

  • Raw: [INFO] <timestamp> <msg> (filtering in CloudWatch is tricky with this)
  • Semi-structured: [INFO] {"message": <msg>} (using a JSON formatter)
  • Structured: {"level": "info", "message": <msg>} (distinct keys, and you can add more data that helps search through the logs)
  • Canonical: <timestamp> | <operation> | key=value (one summary line per request, with everything relevant as key=value pairs)

The problem with raw or semi-structured logging is that writing CloudWatch Logs Insights queries to pinpoint specific issues becomes difficult. With raw logging, for example, finding logs of level error comes down to a substring match:

fields @timestamp, @message
| filter @message like "ERROR"
| sort @timestamp desc

With Powertools, writing structured logs is as easy as writing raw or semi-structured ones:

from typing import Any, Mapping

from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging import correlation_paths
from aws_lambda_powertools.utilities.typing import LambdaContext

LambdaResponse = Mapping[str, Any]

LOG = Logger(utc=True)  # the service name is taken from the POWERTOOLS_SERVICE_NAME environment variable

# log_event=True also logs the incoming event; the Lambda context is attached to every entry
@LOG.inject_lambda_context(log_event=True, clear_state=True, correlation_id_path=correlation_paths.APPSYNC_RESOLVER)
def lambda_handler(event: Mapping[str, Any], context: LambdaContext) -> LambdaResponse:
    return handle_event(event)  # application-specific business logic

The following block is an example of a structured log entry generated with the above example:

{
    "level": "INFO",
    "location": "decorate:345",
    "message": {
        "job": "updateSearchIndex",
        "userId": "email",
        "arguments": {}
    },
    "timestamp": "2021-08-23 12:09:59,905+0000",
    "service": "api",
    "cold_start": true,
    "function_name": "api",
    "function_memory_size": "1024",
    "function_arn": "arn:aws:lambda:eu-west-1:1234:function:api",
    "function_request_id": "e7510919-5270-4a00-b972-4e1b172c43e8",
    "xray_trace_id": "1-61239016-2c96ab4c13db3027066cb717"
}

You can see that information from the Lambda context is automatically attached to the log entry. Note, for example, the request and trace IDs, and whether the Lambda had to initialize (cold start).

Additionally, CloudWatch Logs Insights will discover the attributes of the JSON object you're logging, making it easier to select them as columns in a query:

fields level, timestamp, cold_start, function_request_id, xray_trace_id, job
| sort timestamp desc
| filter level = "ERROR"
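
You can also attach your own keys to every entry; a minimal sketch, where order_id stands in for a value from your application:

order_id = "match-1234"  # hypothetical value from the application

LOG.append_keys(order_id=order_id)  # attached to every entry for the rest of this invocation
LOG.info("order processed")

# or attach it to a single entry only:
LOG.info("order processed", extra={"order_id": order_id})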

Tracing

Capturing subsegments and adding annotations also becomes much easier with Powertools:

import json
import time
from aws_lambda_powertools import Tracer  # <-

tracer = Tracer() # <-

@tracer.capture_lambda_handler # <- just decorate
def lambda_handler(event, context):
    wait()
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

def wait():
    time.sleep(3)
    return 'waiting for godot'

compared to doing the same by hand with the X-Ray SDK:

import json
import time
from aws_xray_sdk.core import xray_recorder

cold_start = True

def lambda_handler(event, context):
    global cold_start
    with xray_recorder.in_subsegment('dummy') as subsegment:
        response = wait()
        subsegment.put_metadata('response', response, 'my_lambda')
        if cold_start:
            subsegment.put_annotation('ColdStart', True)

    cold_start = False
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

def wait():
    time.sleep(3)
    return 'waiting for godot'
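
Note that the Powertools tracer adds the ColdStart annotation for you. Custom annotations and metadata are one call each; a sketch reusing the handler from the first example (PaymentStatus is a made-up annotation):

@tracer.capture_lambda_handler
def lambda_handler(event, context):
    # annotations are indexed, so traces can be filtered on them in the X-Ray console
    tracer.put_annotation(key="PaymentStatus", value="SUCCESS")
    # metadata is not indexed, but useful for extra context
    tracer.put_metadata(key="response", value=wait())
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }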

You can trace other functions by adding a single decorator:

from aws_lambda_powertools import Tracer
from elasticsearch import Elasticsearch

tracer = Tracer()

class ElasticsearchClient:
    def __init__(self, es_session: Elasticsearch):
        self._es_session = es_session

    @tracer.capture_method  # records this method as its own subsegment
    def index_exists(self, index: str) -> bool:
        return self._es_session.indices.exists(index=index)

Metrics

Moreover, recording custom CloudWatch metrics becomes convenient:

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit

metrics = Metrics(namespace="ExampleApplication", service="booking")

@metrics.log_metrics  # validates and flushes all metrics as Embedded Metric Format at the end of the invocation
def lambda_handler(evt, ctx):
    metrics.add_metric(name="SuccessfulBooking", unit=MetricUnit.Count, value=1)
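
Dimensions and a default cold-start metric take one line each; a sketch, where the environment dimension is a made-up example:

@metrics.log_metrics(capture_cold_start_metric=True)  # additionally emits a ColdStart metric
def lambda_handler(evt, ctx):
    metrics.add_dimension(name="environment", value="prod")  # hypothetical dimension
    metrics.add_metric(name="SuccessfulBooking", unit=MetricUnit.Count, value=1)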

Conclusion

If you're writing Lambda functions in Python or Java, there is no reason not to make Powertools part of your standard toolbox. And there is a lot more: event handlers for AppSync and API Gateway, for example, have become part of the core feature set of Powertools.
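
For instance, a minimal sketch of the API Gateway event handler, routing a REST endpoint inside a single function:

from aws_lambda_powertools.event_handler.api_gateway import ApiGatewayResolver

app = ApiGatewayResolver()

@app.get("/hello")
def get_hello():
    return {"message": "hello world"}

def lambda_handler(event, context):
    return app.resolve(event, context)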