Wednesday 30 January 2019

Querying Aurora serverless database remotely using Lambda - part 1

This post is part of a series

Why aurora serverless?

For those who like experimenting with new technology, AWS feeds us with a lot of new stuff every year at re:Invent conference, and many more times. In August 2018, Amazon announced its general availability. I was intrigued by this totally new way to doing database, so I started to play with it.

The first road block I found was connectivity from my local environment. I wanted to connect to my new cluster using my traditional MySQL Workbench client. It turned out to be one of the limitations clearly explained by Amazon, and as of today:

You can't give an Aurora Serverless DB cluster a public IP address. You can access an Aurora Serverless DB cluster only from within a virtual private cloud (VPC) based on the Amazon VPC service.

Most common workarounds involve the use of an EC2 instance to either run a MySQL client from there or SSH tunnel and allow connections from outside of the VPC. In both cases we'll be charged for the use of the EC2 instace. At the moment of this writing there is a solution for that, but still in beta: Data API.

With all that said, I decided to explore my own way in the meantime by creating a serverless approach involving Lambda to query my database.

Setting up the base - VPC

First, we need a VPC. For testing only, it doesn't really matter if we just use the default VPC in the current region. But if you'd like to get started with a blue print, there is this public CloudFormation template, it contains a sample VPC with two public subnets and two private subnets as well as all the required resources to guarantee connectivity (IGW, NAT, Route tables, etc.).

I prefer to isolate my experiments from the rest of the resources, that's why I came up with a template containing the bare minimum to get a VPC up and running. It only includes:

  • AWS::EC2::VPC the VPC itself - default CIDR block 10.192.0.0/16
  • AWS::EC2::Subnet private subnet 1 - default CIDR block 10.192.20.0/24
  • AWS::EC2::Subnet private subnet 2 - default CIDR block 10.192.21.0/24

Here is the full template, the only required parameter is EnvironmentName, but feel free to override any of the CIDR blocks. The template outputs the IDs corresponding to newly created resources such as the VPC and the subnets.

Description: >-
  This template deploys a VPC, with two private subnets spread across 
  two Availability Zones. This VPC does not provide any internet 
  connectivity resources such as IGW, NAT Gw, etc.

Parameters:
  EnvironmentName:
    Description: >-
      An environment name that will be prefixed to resource names
    Type: String

  VpcCIDR:
    Description: >-
      Please enter the IP range (CIDR notation) for this VPC
    Type: String
    Default: 10.192.0.0/16

  PrivateSubnet1CIDR:
    Description: >-
      Please enter the IP range (CIDR notation) for the private subnet 
      in the first Availability Zone
    Type: String
    Default: 10.192.20.0/24

  PrivateSubnet2CIDR:
    Description: >-
      Please enter the IP range (CIDR notation) for the private subnet 
      in the second Availability Zone
    Type: String
    Default: 10.192.21.0/24

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCIDR
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Ref EnvironmentName

  PrivateSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      CidrBlock: !Ref PrivateSubnet1CIDR
      MapPublicIpOnLaunch: false
      Tags:
        - Key: Name
          Value: !Sub ${EnvironmentName} Private Subnet (AZ1)

  PrivateSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [ 1, !GetAZs '' ]
      CidrBlock: !Ref PrivateSubnet2CIDR
      MapPublicIpOnLaunch: false
      Tags:
        - Key: Name
          Value: !Sub ${EnvironmentName} Private Subnet (AZ2)

Outputs:
  VPC:
    Description: A reference to the created VPC
    Value: !Ref VPC

  PrivateSubnet1:
    Description: A reference to the private subnet in the 1st Availability Zone
    Value: !Ref PrivateSubnet1

  PrivateSubnet2:
    Description: A reference to the private subnet in the 2nd Availability Zone
    Value: !Ref PrivateSubnet2

To create a stack from this template we run the following command (or go to AWS console and upload the template)

$ aws cloudformation deploy --stack-name vpc-stack \
--template-file vpc_template.yml \
--parameter-overrides EnvironmentName=Dev

Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - vpc-stack

After successful stack creation, we can get the outputs as we'll need them for the next step.

$ aws cloudformation describe-stacks --stack-name vpc-stack --query Stacks[*].Outputs
[
    [
        {
            "Description": "A reference to the private subnet in the 1st Availability Zone",
            "OutputKey": "PrivateSubnet1",
            "OutputValue": "subnet-013d0bbb3eca284a2"
        },
        {
            "Description": "A reference to the private subnet in the 2nd Availability Zone",
            "OutputKey": "PrivateSubnet2",
            "OutputValue": "subnet-00c67cfed3ab0a791"
        },
        {
            "Description": "A reference to the created VPC",
            "OutputKey": "VPC",
            "OutputValue": "vpc-0b442e5d98841996c"
        }
    ]
]

In the next part, we'll create the Database cluster using these resources as a base layer.