Abel's Blog: Querying Aurora serverless database remotely using Lambda

This post is part of a series

Part 1 - VPC
Part 2 - Aurora cluster
Part 3 - Lambda function

In the previous part, we've set up the Aurora MySql cluster. At this point we can start creating the client code to allow querying.

The Lambda code

In this example I'll be using .NET Core 2.1 as Lambda runtime and C# as programming language. The code is very simple and should be easy to port to your favourite runtime/language.

Lambda Input

The input to my function consists of two main pieces of information: database connection information and the query to execute.

    public class ConnectionInfo
    {
        public string DbUser { get; set; }
        public string DbPassword { get; set; }
        public string DbName { get; set; }
        public string DbHost { get; set; }
        public int DbPort { get; set; }
    }

    public class LambdaInput
    {
        public ConnectionInfo Connection { get; set; }

        public string QueryText { get; set; }
    }

Lambda Code

The function itself returns a List of dictionary where each item of the list represents a "record" from the query result, these are in a key/value form where key is the "field" name and the value is the what comes form the query.

    public List<Dictionary<string, object>> RunQueryHandler(LambdaInput input, ILambdaContext context)
    {
        var cxnString = GetCxnString(input.Connection);
        var query = input.QueryText;

        var result = new List<Dictionary<string, object>>();
        using (var conn = new MySql.Data.MySqlClient.MySqlConnection(cxnString))
        {
            var cmd = GetCommand(conn, query);
            var reader = cmd.ExecuteReader();

            var columns = new List<string>();

            for (int i = 0; i < reader.FieldCount; i++)
            {
                columns.Add(reader.GetName(i));
            }

            while (reader.Read())
            {
                var record = new Dictionary<string, object>();
                foreach (var column in columns)
                {
                    record.Add(column, reader[column]);
                }
                result.Add(record);
            }
        }
        return result;
    }

Support methods

Here is the code of the missing methods: GetCxnString and GetCommand not really complicated.

    private static readonly string cxnStringFormat = "server={0};uid={1};pwd={2};database={3};Connection Timeout=60";

    private string GetCxnString(ConnectionInfo cxn)
    {
        return string.Format(cxnStringFormat, cxn.DbHost, cxn.DbUser, cxn.DbPassword, cxn.DbName);
    }

    private static MySqlCommand GetCommand(MySqlConnection conn, string query)
    {
        conn.Open();
        var cmd = conn.CreateCommand();
        cmd.CommandText = query;
        cmd.CommandType = CommandType.Text;
        return cmd;
    }

Project file

Before compiling and packaging the code we need a project file, assuming you don't have one already, this is how it looks like to be able to run in AWS Lambda environment.

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netcoreapp2.1</TargetFramework>
    <GenerateRuntimeConfigurationFiles>true</GenerateRuntimeConfigurationFiles>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Amazon.Lambda.Core" Version="1.0.0" />
    <PackageReference Include="Amazon.Lambda.Serialization.Json" Version="1.3.0" />
    <PackageReference Include="MySql.Data" Version="8.0.13" />
    <PackageReference Include="Newtonsoft.Json" Version="11.0.2" />
  </ItemGroup>

  <ItemGroup>
    <DotNetCliToolReference Include="Amazon.Lambda.Tools" Version="2.2.0" />
  </ItemGroup>

Preparing Lambda package

Assuming you have both the code and csproj file in the current directory, we just run dotnet lambda package command as per below, where -c sets the Configuration to release, -f sets the target framework to netcoreapp2.1 and -o sets the output zip file name.

$ dotnet lambda package -c release -f netcoreapp2.1 -o aurora-lambda.zip
Amazon Lambda Tools for .NET Core applications (2.2.0)
Project Home: https://github.com/aws/aws-extensions-for-dotnet-cli, https://github.com/aws/aws-lambda-dotnet

Executing publish command
Deleted previous publish folder
... invoking 'dotnet publish', working folder '/home/abel/Downloads/aurora_cluster_sample/bin/release/netcoreapp2.1/publish'

( ... ) --- removed code for brevity ---

... zipping:   adding: aurora.lambda.deps.json (deflated 76%)
Created publish archive (/home/abel/Downloads/aurora_cluster_sample/aurora-lambda.zip).
Lambda project successfully packaged: /home/abel/Downloads/aurora_cluster_sample/aurora-lambda.zip

Next, we upload the resulting zip file to an S3 bucket of our choice. In this example I'm using a bucket named abelperez-temp and I'm uploading the zip file to a folder named aurora-lambda so I keep some form of organisation in my file directory.

$ aws s3 cp aurora-lambda.zip s3://abelperez-temp/aurora-lambda/
upload: ./aurora-lambda.zip to s3://abelperez-temp/aurora-lambda/aurora-lambda.zip

Lambda stack

To create the Lambda function, I've put together a CloudFormation template that includes:

AWS::EC2::SecurityGroup contains outbound traffic rule to allow port 3306
AWS::IAM::Role contains an IAM role to allow the Lambda function to write to CloudWatch Logs and interact with ENIs
AWS::Lambda::Function contains the function definition

Here is the full template, the required parameters are VpcId, SubnetIds and LambdaS3Bucket which we should get from previous stacks' outputs. The template outputs the function full name, which we'll need to be able to invoke it later.

Special attention to the Lambda function definition, the property Handler, in .NET runtime is in the form of AssemblyName::Namespace.ClassName::MethodName and the property Code containing the S3 location of the zip file we uploaded earlier.

Description: Template to create a lambda function 

Parameters: 
  LambdaS3Bucket:
    Type: String
  DbClusterPort: 
    Type: Number
    Default: 3306
  VpcId: 
    Type: String
  SubnetIds: 
    Type: CommaDelimitedList

Resources:
  LambdaSg:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow outbound traffic to MySQL host
      VpcId:
        Ref: VpcId
      SecurityGroupEgress:
        - IpProtocol: tcp
          FromPort: !Ref DbClusterPort
          ToPort: !Ref DbClusterPort
          CidrIp: 0.0.0.0/0

  AWSLambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: PermitLambda
          PolicyDocument:
            Version: 2012-10-17
            Statement:
            - Effect: Allow
              Action:
              - logs:CreateLogGroup
              - logs:CreateLogStream
              - logs:PutLogEvents
              - ec2:CreateNetworkInterface
              - ec2:DescribeNetworkInterfaces
              - ec2:DeleteNetworkInterface
              Resource: 
                - "arn:aws:logs:*:*:*"
                - "*"
  HelloLambda:
    Type: AWS::Lambda::Function
    Properties:
      Handler: aurora.lambda::project.lambda.Function::RunQueryHandler
      Role: !GetAtt AWSLambdaExecutionRole.Arn
      Code:
        S3Bucket: !Ref LambdaS3Bucket
        S3Key: aurora-lambda/aurora-lambda.zip
      Runtime: dotnetcore2.1
      Timeout: 30
      VpcConfig:
        SecurityGroupIds:
          - !Ref LambdaSg
        SubnetIds: !Ref SubnetIds

Outputs:
  LambdaFunction:
    Value: !Ref HelloLambda

To deploy this stack we use the following command where we pass the parameters specific to our VPC (VpcId and SubnetIds) as well as the S3 bucket name.

$ aws cloudformation deploy --stack-name fn-stack \
--template-file aurora_lambda_template.yml \
--parameter-overrides VpcId=vpc-0b442e5d98841996c SubnetIds=subnet-013d0bbb3eca284a2,subnet-00c67cfed3ab0a791 LambdaS3Bucket=abelperez-temp \
--capabilities CAPABILITY_IAM

Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - fn-stack

Let's get the outputs as we'll need this information later. We have the Lambda function full name.

$ aws cloudformation describe-stacks --stack-name fn-stack --query Stacks[*].Outputs
[
    [
        {
            "OutputKey": "LambdaFunction",
            "OutputValue": "fn-stack-HelloLambda-C32KDMYICP5W"
        }
    ]
]

Invoking Lambda function

Now that we have deployed the function and we know its full name, we can invoke it by using dotnet lambda invoke-function command. Part of this job is to prepare the payload which is a JSON in put corresponding to the Lambda input defined above.

{
    "Connection": {
        "DbUser": "master", 
        "DbPassword": "Aurora.2019", 
        "DbName": "dbtest", 
        "DbHost": "db-stack-auroramysqlcluster-xxx.rds.amazonaws.com", 
        "DbPort": 3306
    }, 
    "QueryText":"show databases;"
}

Here is the command to invoke the Lambda function, including the payload parameter encoded to escape the quotes and all in a single line. There are better ways to do this, but for the sake of this demonstration, it's good enough.

$ dotnet lambda invoke-function \
--function-name fn-stack-HelloLambda-C32KDMYICP5W \
--payload "{ \"Connection\": {\"DbUser\": \"master\", \"DbPassword\": \"Aurora.2019\", \"DbName\": \"dbtest\", \"DbHost\": \"db-stack-auroramysqlcluster-xxx.rds.amazonaws.com\", \"DbPort\": 3306}, \"QueryText\":\"show databases;\" }" \
--region eu-west-1

Amazon Lambda Tools for .NET Core applications (2.2.0)
Project Home: https://github.com/aws/aws-extensions-for-dotnet-cli, https://github.com/aws/aws-lambda-dotnet

Payload:
[{"Database":"information_schema"},{"Database":"dbtest"},{"Database":"mysql"},{"Database":"performance_schema"}]

Log Tail:
START RequestId: 595944b5-73bb-4536-be92-a42652125ba8 Version: $LATEST
END RequestId: 595944b5-73bb-4536-be92-a42652125ba8
REPORT RequestId: 595944b5-73bb-4536-be92-a42652125ba8  Duration: 11188.62 ms   Billed Duration: 11200 ms       Memory Size: 128 MB     Max Memory Used: 37 MB

Now we can see the output in the Payload section. And that's how we can query remotely any Aurora serverless cluster without having to set up any EC2 instance. This could be extended to handle different SQL operations such as Create, Insert, Delete, etc.

Abel's Blog

Thursday, 7 February 2019

Querying Aurora serverless database remotely using Lambda - part 3