Thursday 26 July 2018

How to push Windows and IIS logs to CloudWatch using unified CloudWatch Agent automatically

CloudWatch is a powerful monitoring and management tool that collects monitoring and operational data in the form of logs, metrics, and events, providing you with a unified view of AWS resources. One of the most common use cases is collecting logs from web applications.

Log files are generated locally as text files, and a running process monitors them and decides where to send them. This has usually been done by the SSM Agent; however, as per the AWS documentation:

"Important The unified CloudWatch Agent has replaced SSM Agent as the tool for sending log data to Amazon CloudWatch Logs. Support for using SSM Agent to send log data will be deprecated in the near future. We recommend that you begin using the unified CloudWatch Agent for your log collection processes as soon as possible."

Assigning permissions to EC2 instances

EC2 instances need permission to access CloudWatch Logs. If your current instances don’t have a role associated, create one with the CloudWatchAgentServerPolicy managed policy attached.

If your instances already have a role then you can add the policy to the existing role. In either case, the instance needs to perform operations such as CreateLogGroup, CreateLogStream, PutLogEvents and so on.
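
For an existing role, attaching the managed policy can also be scripted. The following is a minimal sketch using the AWS CLI from a bash shell on an admin workstation (not on the Windows instance itself). The helper name attach_agent_policy and the role name my-web-server-role are placeholders for illustration, not part of this setup.

```shell
#!/usr/bin/env bash
# ARN of the AWS managed policy mentioned above.
POLICY_ARN="arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"

# Attach the managed policy to an existing IAM role.
# $1: role name (hypothetical example: my-web-server-role)
attach_agent_policy() {
  aws iam attach-role-policy \
    --role-name "$1" \
    --policy-arn "$POLICY_ARN"
}

# Usage (needs IAM permissions; not executed here):
# attach_agent_policy my-web-server-role
```

The role still has to be associated with the instance through an instance profile, which is assumed to be in place already.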

Install the CloudWatch Agent

On Windows Server, the installation process consists of three basic steps:

  1. Download the package from https://s3.amazonaws.com/amazoncloudwatch-agent/windows/amd64/latest/AmazonCloudWatchAgent.zip
  2. Unzip to a local folder
  3. Change directory to the folder containing the unzipped package and run install.ps1

For more information about how to install the agent, see AWS documents.

Here is a PowerShell snippet to automate this process.

# Install the CloudWatch Agent
$zipfile = "AmazonCloudWatchAgent.zip"
$tempDir = Join-Path $env:TEMP "AmazonCloudWatchAgent"
Invoke-WebRequest -Uri "https://s3.amazonaws.com/amazoncloudwatch-agent/windows/amd64/latest/AmazonCloudWatchAgent.zip" -OutFile $zipfile
Expand-Archive -Path $zipfile -DestinationPath $tempDir -Force
cd $tempDir
Write-Host "Trying to uninstall any previous version of CloudWatch Agent"
.\uninstall.ps1

Write-Host "install the new version of CloudWatch Agent"
.\install.ps1

Creating configuration file

Before launching the agent, a configuration file is required. This file can seem daunting at first, especially because its format is different from the one used by the SSM Agent. It contains three sections: agent, metrics and logs.

In this case, we are interested only in the logs section, which in turn has two main parts: windows_events (system or application events we can find in Windows Event Viewer) and files (any log files, including IIS logs).

There are two common parameters required:

  • log_group_name - Used in CloudWatch to identify a log group. It should be something meaningful, such as the event type or website name.
  • log_stream_name - Used in CloudWatch to identify a log stream within a log group. Typically it’s a reference to the current EC2 instance.

Collecting Windows Events

Here is an example configuration entry for a Windows Event log:

{
    "event_levels": ["ERROR","INFORMATION"],
    "log_group_name": "/eventlog/application",
    "event_format": "text",
    "log_stream_name": "EC2AMAZ-NPQGPRK",
    "event_name": "Application"
}

Key points:

  • event_levels can be one or more of (INFORMATION, WARNING, ERROR, CRITICAL, VERBOSE).
  • event_name is typically one of (System, Security, Application)
  • event_format is text or xml.

Collecting IIS logs

Here is an example configuration entry for the logs of an IIS website:

{
    "log_group_name": "/iis/website1",
    "timezone": "UTC",
    "timestamp_format": "%Y-%m-%d %H:%M:%S",
    "encoding": "utf-8",
    "log_stream_name": "EC2AMAZ-NPQGPRK",
    "file_path": "C:\\inetpub\\logs\\LogFiles\\W3SVC2\\*.log"
}

Key points:

  • timezone and timestamp_format are optional.
  • encoding defaults to utf-8
  • file_path uses the standard Unix glob matching rules to match files. While all the examples in AWS docs display concrete log files, the example above matches all .log files within the IIS logs folder. This is important because IIS creates new files on rotation and we can’t predict their names.

These entries can be repeated for every website and for every Windows Event log we’d like to push to CloudWatch. If we have several EC2 instances as web servers, this process can be tedious and error prone, so it should be automated. Here is an example of a PowerShell snippet.

$windowsLogs = @("Application", "System", "Security")
$windowsLoglevel = @("ERROR", "INFORMATION")
$instance = hostname

$iissites = Get-Website | Where-Object {$_.Name -ne "Default Web Site"}

$iislogs = @()
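foreach ($site in $iissites) {
    # logFile.directory may contain environment variables such as %SystemDrive%, so expand them first
    $logDir = [System.Environment]::ExpandEnvironmentVariables($site.logFile.directory)
    $iislog = @{
        file_path = "$logDir\w3svc$($site.id)\*.log"
        log_group_name = "/iis/$($site.Name.ToLower())"
        log_stream_name = $instance
        timestamp_format = "%Y-%m-%d %H:%M:%S"
        timezone = "UTC"
        encoding = "utf-8"
    }
    $iislogs += $iislog
}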

$winlogs = @()
# $logName is used instead of $event, which is an automatic variable in PowerShell
foreach ($logName in $windowsLogs) {
    $winlog = @{
        event_name = $logName
        event_levels = $windowsLoglevel
        event_format = "text"
        log_group_name = "/eventlog/$($logName.ToLower())"
        log_stream_name = $instance
    }
    $winlogs += $winlog
}

$config = @{
    logs = @{
        logs_collected = @{
            files = @{
                collect_list = $iislogs
            }
            windows_events = @{
                collect_list = $winlogs
            }
        }
        log_stream_name = "generic-logs"
    }
}

# this could be any other location as long as it’s absolute
$configfile = "C:\Users\Administrator\amazon-cloudwatch-agent.json"

$json = $config | ConvertTo-Json -Depth 6 

# Encoding oem is important as the file is required without any BOM 
$json | Out-File -Force -Encoding oem $configfile

For more information on how to create this file, see AWS documents.

Starting the agent

With the configuration file in place, it’s time to start the agent. To do that, change directory to the CloudWatch Agent installation path, typically Program Files\Amazon\AmazonCloudWatchAgent, and run the following command line:

.\amazon-cloudwatch-agent-ctl.ps1 -a fetch-config -m ec2 -c file:configuration-file-path -s 

Key points:

  • -a is short for -Action; fetch-config tells the agent to load the given configuration file.
  • -m is short for -Mode, in this case ec2 as opposed to onPrem.
  • -c is short for -ConfigLocation, which points to the configuration file previously generated.
  • -s is short for -Start, which starts the service after loading the configuration.

Here is a PowerShell snippet covering this part of the process.

cd "${env:ProgramFiles}\Amazon\AmazonCloudWatchAgent"
Write-Host "Starting CloudWatch Agent"
.\amazon-cloudwatch-agent-ctl.ps1 -a fetch-config -m ec2 -c file:$configfile -s

Let’s test it.

Assuming we have 3 websites running in our test EC2 instance, let’s name them.

  • website1 - hostname: web1.local
  • website2 - hostname: web2.local
  • website3 - hostname: web3.local

After some browsing to generate some traffic, let’s inspect CloudWatch.

Some Windows Events also show up in CloudWatch Logs.
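
The same check can be done without the console. As a sketch, assuming the AWS CLI is configured with credentials and a default region on a workstation with bash, the log groups created by the agent can be listed by prefix. The helper name list_log_groups is made up for this example; the prefixes are the ones used in the configuration above.

```shell
#!/usr/bin/env bash
# List CloudWatch Logs log group names that start with a given prefix.
# $1: prefix, e.g. /iis/ or /eventlog/
list_log_groups() {
  aws logs describe-log-groups \
    --log-group-name-prefix "$1" \
    --query 'logGroups[].logGroupName' \
    --output text
}

# Usage (not executed here):
# list_log_groups /iis/
# list_log_groups /eventlog/
```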

Here is the complete PowerShell script.

Tuesday 17 July 2018

Automate SSL certificate validation in AWS Certificate Manager using DNS via Route 53

When creating SSL certificates in AWS Certificate Manager, there is a required step before getting the certificate: validating domain ownership. This seems obvious, but to get a certificate you need to prove that you have control over the requested domain(s). There are two ways to validate domain ownership: by email or by DNS.

Use Email to Validate Domain Ownership

When using this option, ACM will send an email to the three registered contact addresses in WHOIS (domain registrant, technical contact, administrative contact) and then wait up to 72 hours for confirmation before timing out.

This approach requires manual intervention which is not great for automation although there might be scenarios where this is applicable. See official AWS documentation.

Use DNS to Validate Domain Ownership

When using this option, ACM needs to know that you have control over the DNS settings of the domain. It provides a name/value pair to be created as a CNAME record, which it uses to validate the certificate and, if you wish, to renew it later.

This approach is more suitable for automation since it doesn't require manual intervention. However, as of this writing, it's not yet supported by CloudFormation, so it needs to be done with the AWS CLI or API calls. For updates, follow the official announcement and its comments. See official AWS documentation.

How to do this in the command line?

The following commands have been tested in bash on Linux 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3+deb9u1. There shouldn't be much trouble trying this on a different operating system, although it has not been tested on Windows.

Some prerequisites:

  • AWS CLI installed and configured.
  • jq package installed and available in PATH.

Set a variable to store the domain name and request the certificate with the ACM CLI command request-certificate.

$ DOMAIN_NAME=abelperez.info

$ SSL_CERT_ARN=`aws acm request-certificate \
--domain-name $DOMAIN_NAME \
--subject-alternative-names "*.$DOMAIN_NAME" \
--validation-method DNS \
--query CertificateArn \
--region us-east-1 \
--output text`

At this point we have the certificate but it's not validated yet. ACM provides values for us to create a CNAME record so they can verify domain ownership. To do that, use aws acm describe-certificate command to retrieve those values.

Now, let's store the result in a variable to prepare for extracting name and value later.

$ SSL_CERT_JSON=`aws acm describe-certificate \
--certificate-arn $SSL_CERT_ARN \
--query Certificate.DomainValidationOptions \
--region us-east-1`

Extract name and value querying the previous json using jq.

$ SSL_CERT_NAME=`echo "$SSL_CERT_JSON" \
| jq -r ".[] | select(.DomainName == \"$DOMAIN_NAME\").ResourceRecord.Name"`

$ SSL_CERT_VALUE=`echo "$SSL_CERT_JSON" \
| jq -r ".[] | select(.DomainName == \"$DOMAIN_NAME\").ResourceRecord.Value"`

Let's verify that SSL_CERT_NAME and SSL_CERT_VALUE captured the right values.

$ echo $SSL_CERT_NAME
_3f88376edb1eda680bd44991197xxxxx.abelperez.info.

$ echo $SSL_CERT_VALUE
_f528dff0e3e6cd0b637169a885xxxxxx.acm-validations.aws.
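
If either variable comes back empty or as null, the likely cause is that describe-certificate was called before ACM had populated the ResourceRecord values, which can happen in the first seconds after the request. One way to guard against that in a script is a small retry helper. This is only a sketch: retry_until_value and get_validation_name are hypothetical names, and the attempt count and delay in the usage line are arbitrary.

```shell
#!/usr/bin/env bash
# Run a command repeatedly until it prints something other than
# an empty string or "null", then print that value.
# $1: max attempts, $2: delay in seconds, rest: command to run
retry_until_value() {
  local attempts=$1 delay=$2 value i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    value=$("$@")
    if [ -n "$value" ] && [ "$value" != "null" ]; then
      printf '%s\n' "$value"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Hypothetical wrapper around the describe-certificate + jq step above.
get_validation_name() {
  aws acm describe-certificate \
    --certificate-arn "$SSL_CERT_ARN" \
    --query Certificate.DomainValidationOptions \
    --region us-east-1 \
    | jq -r ".[] | select(.DomainName == \"$DOMAIN_NAME\").ResourceRecord.Name"
}

# Usage (not executed here):
# SSL_CERT_NAME=$(retry_until_value 10 3 get_validation_name)
```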

At this point, we are ready to interact with Route 53 to create the record set using the values proposed by ACM. First we need the hosted zone id. It can be copied from the console, but we can also get it from the Route 53 command line, filtering by domain name.

$ R53_HOSTED_ZONE=`aws route53 list-hosted-zones-by-name \
--dns-name $DOMAIN_NAME \
--query HostedZones \
| jq -r ".[] | select(.Name == \"$DOMAIN_NAME.\").Id" \
| sed 's/\/hostedzone\///'`

Route 53 gives us the hosted zone id in the form "/hostedzone/Z2TXYZQWVABDCE"; the leading "/hostedzone/" part is stripped out using the sed command. Let's verify the hosted zone id is captured in the variable.

$ echo $R53_HOSTED_ZONE
Z2TXYZQWVABDCE
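
As an alternative to sed, the prefix can be removed with plain shell parameter expansion, which avoids an extra process. A small sketch, where strip_zone_prefix is just an illustrative helper name:

```shell
#!/usr/bin/env bash
# Strip the leading "/hostedzone/" from a Route 53 hosted zone id
# using shell parameter expansion instead of sed.
strip_zone_prefix() {
  printf '%s\n' "${1#/hostedzone/}"
}

strip_zone_prefix "/hostedzone/Z2TXYZQWVABDCE"   # prints Z2TXYZQWVABDCE
```

An id that does not start with the prefix is printed unchanged.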

With the hosted zone id, name and value from ACM, prepare the JSON input for the Route 53 change-resource-record-sets command. In this case, Action is CREATE and TTL can be the default 300 seconds (which is what AWS itself uses in the console).

$ read -r -d '' R53_CNAME_JSON << EOM
{
  "Comment": "DNS Validation CNAME record",
  "Changes": [
    {
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "$SSL_CERT_NAME",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [
          {
            "Value": "$SSL_CERT_VALUE"
          }
        ]
      }
    }
  ]
}
EOM

We can check all variables were expanded correctly before preparing the command line.

$ echo "$R53_CNAME_JSON"
{
  "Comment": "DNS Validation CNAME record",
  "Changes": [
    {
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "_3f88376edb1eda680bd44991197xxxxx.abelperez.info.",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [
          {
            "Value": "_f528dff0e3e6cd0b637169a885xxxxxx.acm-validations.aws."
          }
        ]
      }
    }
  ]
}
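
One caveat when moving this into a script: read -r -d '' returns a non-zero status when it reaches the end of the here-document, so it would abort a script running under set -e. A possible alternative, sketched here with a hypothetical function name, is to print the JSON from a function and capture it with command substitution. The optional third argument is an addition for illustration; UPSERT is another action Route 53 accepts, and it makes reruns safe when the record already exists.

```shell
#!/usr/bin/env bash
# Print a Route 53 change batch for a single CNAME record.
# $1: record name, $2: record value, $3: action (defaults to CREATE)
build_cname_change_batch() {
  cat <<EOM
{
  "Comment": "DNS Validation CNAME record",
  "Changes": [
    {
      "Action": "${3:-CREATE}",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [
          { "Value": "$2" }
        ]
      }
    }
  ]
}
EOM
}

# Usage with the variables from the steps above (not executed here):
# R53_CNAME_JSON=$(build_cname_change_batch "$SSL_CERT_NAME" "$SSL_CERT_VALUE")
```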

Now that we've verified everything is in place, we can finally create the record set using the Route 53 CLI.

$ R53_CNAME_ID=`aws route53 change-resource-record-sets \
--hosted-zone-id $R53_HOSTED_ZONE \
--change-batch "$R53_CNAME_JSON" \
--query ChangeInfo.Id \
--output text`

This operation returns a change id. Since Route 53 needs to propagate the change, it won't be available immediately (it's usually ready within 60 seconds). To make sure we can proceed, we can use the wait command, which blocks the console/script until the record set change is ready.

$ aws route53 wait resource-record-sets-changed --id $R53_CNAME_ID

After the wait, the record set is ready and ACM needs to validate it. As per AWS docs, this can take up to several hours, but in my experience it's not that long. By using another wait command, we'll block the console/script until the certificate is validated.

$ aws acm wait certificate-validated \
--certificate-arn $SSL_CERT_ARN \
--region us-east-1

Once this wait is done, we can verify that our certificate is in fact issued.

$ aws acm describe-certificate \
--certificate-arn $SSL_CERT_ARN \
--query Certificate.Status \
--region us-east-1
"ISSUED"
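
In an unattended script it is worth checking the exit status of the wait, since aws acm wait certificate-validated exits with a non-zero status if the certificate has not been validated by the time it stops polling. A minimal sketch, where wait_for_certificate is a hypothetical helper and the messages are arbitrary:

```shell
#!/usr/bin/env bash
# Block until the certificate is validated; report failure otherwise.
# $1: certificate ARN, $2: region (defaults to us-east-1 as used above)
wait_for_certificate() {
  if aws acm wait certificate-validated \
      --certificate-arn "$1" \
      --region "${2:-us-east-1}"; then
    echo "validated"
  else
    echo "validation did not complete" >&2
    return 1
  fi
}

# Usage (not executed here):
# wait_for_certificate "$SSL_CERT_ARN" || exit 1
```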

And this is how it's done, 100% end to end commands, no manual intervention, no console clicks, ready for automation.