AWS Blog

In the Works – VMware Cloud on AWS

by Jeff Barr | in Announcements

The long-standing trend toward on-premises virtualization has helped many enterprises to increase operational efficiency and to wring out as much value from their data center as possible. Along the way, they have built up a substantial repertoire of architectural skills and operational experience, but now find that they are struggling to match public cloud economics and the AWS pace of innovation.

Because of this, many enterprises are now looking at the AWS Cloud and like what they see. They are enticed by the fact that AWS has data centers in 35 Availability Zones across 13 different locations around the world (with construction underway in five more). They see considerable value in the rich set of AWS Services and the flexible pay-as-you-go model, and they are looking for ways to move into the future while building on an investment in virtualization that often dates back a decade or more.

VMware + AWS = Win
In order to help these organizations take advantage of the benefits that AWS has to offer while building on their existing investment in virtualization, we are working with our friends at VMware to build and deliver VMware Cloud on AWS.

This new offering is a native, fully managed VMware environment on the AWS Cloud that can be accessed on an hourly, on-demand basis or in subscription form. It includes the same core VMware technologies that customers run in their data centers today, including vSphere Hypervisor (ESXi), Virtual SAN (vSAN), and the NSX network virtualization platform, and is designed to provide a clean, seamless experience.

VMware Cloud on AWS runs directly on the physical hardware, while still taking advantage of a host of network and hardware features designed to support our security-first design model. This allows VMware to run their virtualization stack on AWS infrastructure without having to use nested virtualization.

If you find yourself in the situation that I described above (running on-premises virtualization but looking forward to the cloud), I think you’ll find a lot to like here. Your investment in packaging, tooling, and training will continue to pay dividends, as will your existing VMware licenses, agreements, and discounts. Everything that you and your team know about ESXi, vSAN, and NSX remains relevant and valuable. You will be able to manage your entire VMware environment (on-premises and AWS) using your existing copy of vCenter, along with tools and scripts that make use of the vCenter APIs.

The entire roster of AWS compute, storage, database, analytics, mobile, and IoT services can be directly accessed from your applications. Because your VMware applications will be running in the same data centers as the AWS services, you’ll be able to benefit from fast, low-latency connectivity when you use these services to enhance or extend your applications. You’ll also be able to take advantage of AWS migration tools such as AWS Database Migration Service, AWS Import/Export Snowball, and AWS Storage Gateway.

Plenty of Options
VMware Cloud on AWS will give you a lot of different options when it comes to migration, data center consolidation, modernization, and globalization:

On the migration side, you can use vSphere vMotion to live-migrate individual VMs, workloads, or entire data centers to AWS with a couple of clicks. Along the way, as you migrate individual components, you can use AWS Direct Connect to set up a dedicated network connection from your premises to AWS.

When it comes to data center consolidation, you can migrate code and data to AWS without having to alter your existing operational practices, tools, or policies.

When you are ready to modernize, you can take advantage of unique and powerful features such as Amazon Aurora (a highly scalable relational database designed to be compatible with MySQL), Amazon Redshift (a fast, fully managed, petabyte-scale data warehouse), and many other services.

When you need to globalize your business, you can spin up your existing applications in multiple AWS regions with a couple of clicks.

Stay Tuned
I will share more information on this development as it becomes available. To learn more, visit the VMware Cloud on AWS page.

Jeff;

Amazon ElastiCache for Redis Update – Sharded Clusters, Engine Improvements, and More

by Jeff Barr | in Amazon ElastiCache

Many AWS customers use Amazon ElastiCache to implement a fast, in-memory data store for their applications.

We launched Amazon ElastiCache for Redis in 2013 and have added snapshot exports to S3, a refreshed engine, scale-up capabilities, tagging, and support for Multi-AZ operation with automatic failover over the past year or so.

Today we are adding a healthy collection of new features and capabilities to ElastiCache for Redis. Here’s an overview:

Sharded Cluster Support – You can now create sharded clusters that can hold more than 3.5 TiB of in-memory data.

Improved Console – Creation and maintenance of clusters is now more straightforward and requires far fewer clicks.

Engine Update – You now have access to the features of the Redis 3.2 engine.

Geospatial Data – You can now store and process geospatial data.

Let’s dive in!

Sharded Cluster Support / New Console
Until now, ElastiCache for Redis allowed you to create a cluster containing a single primary node and up to 5 read replicas. This model limited the size of the in-memory data store to 237 GiB per cluster.

You can now create clusters with up to 15 shards, expanding the overall in-memory data store to more than 3.5 TiB. Each shard can have up to 5 read replicas, giving you the ability to handle 20 million reads and 4.5 million writes per second.

The sharded model, in conjunction with the read replicas, improves overall performance and availability. Data is spread across multiple nodes and the read replicas support rapid, automatic failover in the event that a primary node has an issue.

In order to take advantage of the sharded model, you must use a Redis client that is cluster-aware. The client will treat the cluster as a hash table with 16,384 slots spread equally across the shards, and will then map the incoming keys to the proper shard.
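To make the slot mapping a little more concrete, here is a minimal Python sketch of what a cluster-aware client does with each key. The CRC16 variant and the hash-tag rule follow the open source Redis Cluster specification; the `shard_for_slot` helper and its perfectly even split are my own assumptions for illustration, since a real client learns the slot-to-shard assignment from the cluster itself.

```python
SLOT_COUNT = 16384  # number of hash slots, as described above


def crc16_xmodem(data: bytes) -> int:
    """CRC16, XMODEM variant (polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to a slot, honoring a non-empty {hash tag} if present."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            # Only the text between the first '{' and the next '}' is hashed.
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % SLOT_COUNT


def shard_for_slot(slot: int, num_shards: int) -> int:
    """Hypothetical helper: assumes slots are split evenly, in order."""
    return slot * num_shards // SLOT_COUNT


if __name__ == "__main__":
    for key in ("user:1000", "{user1000}.following", "{user1000}.followers"):
        slot = key_slot(key)
        print(key, "-> slot", slot, "-> shard", shard_for_slot(slot, 15))
```

Keys that share a hash tag land in the same slot, which is what makes multi-key operations on related keys possible in a sharded cluster.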

ElastiCache for Redis treats the entire cluster as a unit for backup and restore purposes; you don’t have to think about or manage backups for the individual shards.

The Console has been improved and I can create my first Scale Out cluster with ease (note that I checked Cluster Mode enabled (Scale Out) after I chose Redis as my Cluster engine):

The Console helps me to choose a suitable node type with a handy new menu:

You can also create sharded clusters using the AWS Command Line Interface (CLI), the AWS Tools for Windows PowerShell, the ElastiCache API, or via an AWS CloudFormation template.

Engine Update
Amazon ElastiCache for Redis is compatible with version 3.2 of the Redis engine. The engine includes three new features that may be of interest to you:

Enforced Write Consistency – The new WAIT command blocks the caller until all previous write commands have been acknowledged by the primary node and a specified number of read replicas. This change does not make Redis into a strongly consistent data store, but it does improve the odds that a freshly promoted read replica will include the most recent writes to the previous primary.

SPOP with COUNT – The SPOP command removes and then returns a random element from a set. You can now request more than one element at a time.

Bitfields – Bitfields are a memory-efficient way to store a collection of many small integers as a bitmap, stored as a Redis string. Using the BITFIELD command, you can address (GET) and manipulate (SET, increment, or decrement) fields of varying widths without having to think about alignment to byte or word boundaries.
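If bitfields are new to you, the following pure-Python model may help. It is only a sketch of the idea behind BITFIELD and does not talk to Redis: the `BitField` class and its method names are hypothetical, it handles unsigned fields only, and it assumes wrap-around on overflow. As in Redis, bit offset 0 is the most significant bit of the first byte.

```python
class BitField:
    """Toy model of unsigned fields packed into a byte string."""

    def __init__(self):
        self.buf = bytearray()

    def set(self, width: int, offset: int, value: int) -> None:
        """Store `value` in `width` bits starting at bit `offset`."""
        value &= (1 << width) - 1
        needed = (offset + width + 7) // 8
        if len(self.buf) < needed:
            self.buf.extend(bytes(needed - len(self.buf)))
        for i in range(width):
            bit = (value >> (width - 1 - i)) & 1
            byte_index, shift = divmod(offset + i, 8)
            mask = 0x80 >> shift
            if bit:
                self.buf[byte_index] |= mask
            else:
                self.buf[byte_index] &= ~mask & 0xFF

    def get(self, width: int, offset: int) -> int:
        """Read `width` bits starting at bit `offset` (missing bits are 0)."""
        value = 0
        for i in range(width):
            byte_index, shift = divmod(offset + i, 8)
            bit = 0
            if byte_index < len(self.buf):
                bit = (self.buf[byte_index] >> (7 - shift)) & 1
            value = (value << 1) | bit
        return value

    def incrby(self, width: int, offset: int, delta: int) -> int:
        """Add `delta` to a field, wrapping around on overflow."""
        value = (self.get(width, offset) + delta) % (1 << width)
        self.set(width, offset, value)
        return value


if __name__ == "__main__":
    field = BitField()
    field.set(3, 0, 5)    # a 3-bit field at bit 0
    field.set(5, 3, 21)   # a 5-bit field right after it, not byte-aligned
    print(field.get(3, 0), field.get(5, 3), bytes(field.buf))
```

Two fields of 3 and 5 bits share a single byte here, which is the memory saving that the feature is about.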

Our implementation of Redis includes a snapshot mechanism that does not need to fork the server process into parent and child processes. Under heavy load, the standard, fork-based snapshot mechanism can lead to degraded performance due to swapping. Our alternative implementation comes into play when memory utilization is above 50% and neatly sidesteps the issue. It is a bit slower, so we use it only when necessary.

We have improved the performance of the syncing mechanism that brings a fresh read replica into sync with its primary node. We made a similar improvement to the mechanism that brings the remaining read replicas back into sync with the newly promoted primary node.

As I noted earlier, our engine is compatible with the comparable open source version and your applications do not require any changes.

Geospatial Data
You can now store and query geospatial data (a latitude and a longitude). Here are the commands:

  • GEOADD – Insert a geospatial item.
  • GEODIST – Get the distance between two geospatial items.
  • GEOHASH – Get a Geohash (geocoding) string for an item.
  • GEOPOS – Return the positions of items identified by a key.
  • GEORADIUS – Return items that are within a specified radius of a location.
  • GEORADIUSBYMEMBER – Return items that are within a specified radius of another item.
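To give a feel for what GEODIST and GEORADIUS compute, here is a small Python sketch that uses the haversine (great-circle) formula. It does not call Redis. The function names, the Earth radius constant, and the two sample cities are my own choices for illustration, so results may differ slightly from what the engine returns.

```python
import math

# Assumed Earth radius in meters, used only for this illustration.
EARTH_RADIUS_M = 6372797.560856


def geodist(a, b):
    """Great-circle distance in meters between two (longitude, latitude) pairs."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    u = math.sin((lat2 - lat1) / 2)
    v = math.sin((lon2 - lon1) / 2)
    h = u * u + math.cos(lat1) * math.cos(lat2) * v * v
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def georadius(items, center, radius_m):
    """Names of the items within radius_m meters of center, sorted by name."""
    return sorted(
        name for name, position in items.items()
        if geodist(position, center) <= radius_m
    )


if __name__ == "__main__":
    # Sample data: (longitude, latitude), the same order that GEOADD expects.
    cities = {
        "Palermo": (13.361389, 38.115556),
        "Catania": (15.087269, 37.502669),
    }
    print(round(geodist(cities["Palermo"], cities["Catania"])), "meters")
    print(georadius(cities, (15.0, 37.0), 100_000))
```

Note that positions are given as longitude first and then latitude, which is easy to get backwards.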

Available Now
Sharded cluster creation and all of the features that I mentioned are available now and you can start using them today in all AWS regions.

Jeff;

New AWS Quick Starts for Atlassian JIRA Software and Bitbucket Data Center

by Jeff Barr | in Quick Start

The AWS Quick Starts help you to rapidly deploy reference implementations of software solutions on the AWS Cloud. You can use the Quick Starts to easily test drive and consume software while taking advantage of best practices promoted by AWS and the software partner.

Today I would like to tell you about a pair of Quick Start guides that were developed in collaboration with APN Advanced Technology Partner (and DevOps competency holder) Atlassian to help you to deploy their JIRA Software Data Center and Bitbucket Data Center on AWS.

Atlassian’s Data Center offerings are designed for customers that have large development teams and a need for scalable, highly available development and project management tools. Because these tools are invariably mission-critical, robustness and resilience are baseline requirements, and production deployments are always run in a multi-node or cluster configuration.

New Quick Starts
JIRA Software Data Center is a project and issue management solution for agile teams and Bitbucket Data Center is a Git repository solution, both of which provide large teams working on multiple projects with high availability and performance at scale. With these two newly introduced Atlassian Quick Starts, you have access to a thoroughly tested, fully supported reference architecture that greatly simplifies and accelerates the deployment of these products on AWS.

The Quick Starts include AWS CloudFormation templates that allow you to deploy Bitbucket and/or JIRA Software into a new or existing Virtual Private Cloud (VPC). If you want to use a new VPC, the template will create it, along with public and private subnets and a NAT Gateway to allow EC2 instances in the private subnet to connect to the Internet (in regions where the NAT Gateway is not available, the template will create a NAT instance instead). If you are already using AWS and have a suitable VPC, you can deploy JIRA Software Data Center and Bitbucket Data Center there instead.

You will need to sign up for evaluation licenses for the Atlassian products that you intend to launch.

Bitbucket Data Center
The Bitbucket Data Center Quick Start deploys the following components:

Amazon RDS PostgreSQL – Bitbucket Data Center requires a supported external database. Amazon RDS for PostgreSQL in a Multi-AZ configuration allows failover in the event the master node fails.

NFS Server – Bitbucket Data Center uses a shared file system to store the repositories in a common location that is accessible to multiple Bitbucket nodes. The Quick Start architecture implements the shared file system in an EC2 instance with an attached Amazon Elastic Block Store (EBS) volume.

Bitbucket Auto Scaling Group – The Bitbucket Data Center product is installed on Amazon Elastic Compute Cloud (EC2) instances in an Auto Scaling group. The deployment will scale out and in, based on utilization.

Amazon Elasticsearch Service – Bitbucket Data Center uses Elasticsearch for indexing and searching. The Quick Start architecture uses Amazon Elasticsearch Service, a managed service that makes it easy to deploy, operate, and scale Elasticsearch in the AWS Cloud.

JIRA Software Data Center
The JIRA Software Data Center Quick Start deploys the following components:

Amazon RDS PostgreSQL – JIRA Data Center requires a supported external database. Amazon RDS for PostgreSQL in a Multi-AZ configuration allows failover in the event the master node fails.

Amazon Elastic File System – JIRA Software Data Center uses a shared file system to store artifacts in a common location that is accessible to multiple JIRA nodes. The Quick Start architecture implements a highly available shared file system using Amazon Elastic File System.

JIRA Auto Scaling Group – The JIRA Data Center product is installed on Amazon Elastic Compute Cloud (EC2) instances in an Auto Scaling group. The deployment will scale out and in, based on utilization.

We will continue to work with Atlassian to update and refine these two new Quick Starts. We’re also working on two additional Quick Starts for Atlassian Confluence and Atlassian JIRA Service Desk and hope to have them ready before AWS re:Invent.

To get started, please visit the Bitbucket Data Center Quick Start or the JIRA Software Data Center Quick Start. You can also head over to Atlassian’s Quick Start page. The templates are available today; give them a whirl and let us know what you think!

Jeff;

AWS Webinars – October and November 2016

by Jeff Barr | in Webinars

Are you keeping up with the latest developments in AWS-land? Do you have a good understanding of Amazon Redshift, Amazon ECS, and Amazon Cognito? Do you know what IoT is all about, and do you have a firm grasp of the best security practices for your cloud workloads? Do you understand the EC2 Spot Market, and do you know how to use it to your advantage?

In pursuit of our focus on training and education, I am pleased to share the webinars that we have on tap for October and November. These are free, but they do fill up and I suggest that you register ahead of time. All times are PT and each webinar runs for one hour:

October 25

October 26

October 27

November 8

November 9

November 10

Jeff;
 
PS – Don’t forget our upcoming re:Invent 2016 Preparation Webinars!

X1 Instance Update – X1.16xlarge + More Regions

by Jeff Barr | in Amazon EC2, Launch

Earlier this year we made the x1.32xlarge instance available. With nearly 2 TiB of memory, this instance type is a great fit for memory-intensive big data, caching, and analytics workloads. Our customers are running the SAP HANA in-memory database, large-scale Apache Spark and Hadoop jobs, and many types of high performance computing (HPC) workloads.

Today we are making two updates to the X1 instance type:

  • New Instance Size – The new x1.16xlarge instance size provides another option for running smaller-footprint workloads.
  • New Regions – The X1 instances are now available in three additional regions, bringing the total to ten.

New Instance Size
Here are the specifications for the new x1.16xlarge:

  • Processor: 2 x Intel™ Xeon E7 8880 v3 (Haswell) running at 2.3 GHz – 32 cores / 64 vCPUs.
  • Memory: 976 GiB with Single Device Data Correction (SDDC+1).
  • Instance Storage: 1,920 GB SSD.
  • Network Bandwidth: 10 Gbps.
  • Dedicated EBS Bandwidth: 5 Gbps (EBS Optimized at no additional cost).

Like the x1.32xlarge, this instance supports Turbo Boost 2.0 (up to 3.1 GHz), AVX 2.0, AES-NI, and TSX-NI. X1 instances are available as On-Demand Instances, Spot Instances, or Dedicated Instances; you can also purchase Reserved Instances and Dedicated Host Reservations.

New Regions
Both sizes of X1 instances are available in the following regions:

  • US East (Northern Virginia)
  • US West (Oregon)
  • Europe (Ireland)
  • Europe (Frankfurt)
  • Asia Pacific (Tokyo)
  • Asia Pacific (Singapore)
  • Asia Pacific (Sydney)
  • Asia Pacific (Mumbai) – New
  • AWS GovCloud (US) – New
  • Asia Pacific (Seoul) – New

Jeff;

 

AWS Blog Usability Panel (Seattle or Remote)

by Jeff Barr | in Events

In order to make sure that the AWS Blog is meeting your information and entertainment needs, we are planning to conduct some usability panels later this month. We are looking for a mix of local (Seattle) and remote participants with any level of experience reading the blog and/or using AWS. If you participate in a usability panel, you’ll receive an Amazon.com gift card as a token of our appreciation.

If you are interested in participating, sign up today.

Jeff;

 

 

AWS Week in Review – October 3, 2016

by Jeff Barr | in Week in Review

Twenty external and internal contributors worked together to create this edition of the AWS Week in Review. If you would like to join the party (with the possibility of a free lunch at re:Invent), please visit the AWS Week in Review on GitHub.

Monday

October 3

Tuesday

October 4

Wednesday

October 5

Thursday

October 6

Friday

October 7

Saturday

October 8

  • Nothing happened!

Sunday

October 9

New AWS Marketplace Products

New & Notable Open Source

  • elastic-ci-stack-for-aws is a simple, flexible, auto-scaling cluster of build agents running in your own VPC.
  • aws-lambda-ebs-backups contains Python scripts to be run using AWS’s Lambda service to Backup and Delete Snapshots of EBS Volumes.
  • load-dyno-table is a DynamoDB table loader.
  • letsencrypt-iam-lambda is an AWS Lambda function to take a received S3 event, and update a related certificate in AWS IAM.
  • sundial is a job system for Amazon EC2 Container Service that manages dependencies and scheduling.
  • jrestless is a framework to help you build JAX-RS applications with serverless architectures using AWS Lambda.
  • AWSLambda_CloudFrontMetaData is a Lambda Function to extract meta data (IP version, HTTP version, and Edge location) from customer visits to CloudFront.
  • hologram implements easy, painless AWS credentials on developer laptops.
  • og-aws is a practical guide to AWS.
  • aws-secrets helps to manage secrets on EC2 instances with KMS encryption, IAM roles, and S3 storage.

New YouTube Videos

Upcoming Events

Help Wanted

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

Snowball HDFS Import

by Jeff Barr | in Amazon Elastic MapReduce, AWS Snowball, Launch

If you are running MapReduce jobs on premises and storing data in HDFS (the Hadoop Distributed File System), you can now copy that data directly from HDFS to an AWS Import/Export Snowball without using an intermediary staging file. Because HDFS is often used for Big Data workloads, this can greatly simplify the process of importing large amounts of data to AWS for further processing.

To use this new feature, download and configure the newest version of the Snowball Client on the on-premises host that is running the desired HDFS cluster. Then use commands like this to copy files from HDFS to S3 via Snowball:

$ snowball cp -n hdfs://HOST:PORT/PATH_TO_FILE_ON_HDFS s3://BUCKET-NAME/DESTINATION-PATH

You can use the -r option to recursively copy an entire folder:

$ snowball cp -n -r hdfs://HOST:PORT/PATH_TO_FOLDER_ON_HDFS s3://BUCKET-NAME/DESTINATION-PATH
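If you plan to script a number of these copies, a small wrapper can keep the arguments consistent. The Python sketch below only assembles the two command lines shown above. The helper name, the sample host, port, paths, and bucket are hypothetical, and it uses no Snowball Client options beyond the ones in this post.

```python
import shlex


def snowball_cp_command(hdfs_uri: str, s3_uri: str, recursive: bool = False):
    """Build the argument list for a 'snowball cp' copy from HDFS to S3."""
    if not hdfs_uri.startswith("hdfs://"):
        raise ValueError("source must be an hdfs:// URI")
    if not s3_uri.startswith("s3://"):
        raise ValueError("destination must be an s3:// URI")
    args = ["snowball", "cp", "-n"]
    if recursive:
        args.append("-r")  # copy an entire folder
    args += [hdfs_uri, s3_uri]
    return args


if __name__ == "__main__":
    # Placeholder values; substitute your own NameNode, paths, and bucket.
    command = snowball_cp_command(
        "hdfs://namenode.example.com:8020/data/logs",
        "s3://my-bucket/logs",
        recursive=True,
    )
    print(" ".join(shlex.quote(arg) for arg in command))
    # To actually run it: subprocess.run(command, check=True)
```

Building a list rather than a single string avoids shell quoting problems if a path contains spaces.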

To learn more, read Using the HDFS Client.

Jeff;

IPv6 Support Update – CloudFront, WAF, and S3 Transfer Acceleration

by Jeff Barr | in Amazon CloudFront, Amazon S3, AWS Web Application Firewall

As a follow-up to our recent announcement of IPv6 support for Amazon S3, I am happy to be able to tell you that IPv6 support is now available for Amazon CloudFront, Amazon S3 Transfer Acceleration, and AWS WAF and that all 60+ CloudFront edge locations now support IPv6. We are enabling IPv6 in a phased rollout that starts today and will extend across all of the networks over the next few weeks.

CloudFront IPv6 Support
You can now enable IPv6 support for individual Amazon CloudFront distributions. Viewers and networks that connect to a CloudFront edge location over IPv6 will automatically be served content over IPv6. Those that connect over IPv4 will continue to work as before. Connections to your origin servers will be made using IPv4.

Newly created distributions are automatically enabled for IPv6; you can modify an existing distribution by checking Enable IPv6 in the console or setting it via the CloudFront API:

Here are a couple of important things to know about this new feature:

  • Alias Records – After you enable IPv6 support for a distribution, the DNS entry for the distribution will be updated to include an AAAA record. If you are using Amazon Route 53 and an alias record to map all or part of your domain to the distribution, you will need to add an AAAA alias to the domain.
  • Log Files – If you have enabled CloudFront Access Logs, IPv6 addresses will start to show up in the c-ip field; make sure that your log processing system knows what to do with them.
  • Trusted Signers – If you make use of Trusted Signers in conjunction with an IP address whitelist, we strongly recommend the use of an IPv4-only distribution for Trusted Signer URLs that have an IP whitelist and a separate, IPv4/IPv6 distribution for the actual content. This model sidesteps an issue that would arise if the signing request arrived over an IPv4 address and was signed as such, only to have the request for the content arrive via a different, IPv6 address that is not on the whitelist.
  • CloudFormation – CloudFormation support is in the works. With today’s launch, distributions that are created from a CloudFormation template will not be enabled for IPv6. If you update an existing stack, the setting will remain as-is for any distributions referenced in the stack.
  • AWS WAF – If you use AWS WAF in conjunction with CloudFront, be sure to update your WebACLs and your IP rulesets as appropriate in order to whitelist or blacklist IPv6 addresses.
  • Forwarded Headers – When you enable IPv6 for a distribution, the X-Forwarded-For header that is presented to the origin will contain an IPv6 address. You need to make sure that the origin is able to process headers of this form.
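For the Log Files and Forwarded Headers items above, it can help to check that your own tooling accepts both address families. Here is a minimal Python sketch that uses the standard `ipaddress` module. The function names and the sample addresses (taken from the ranges reserved for documentation) are my own, and the log parsing itself is left out.

```python
import ipaddress


def ip_version(value: str) -> int:
    """Return 4 or 6 for a c-ip style field; raises ValueError if invalid."""
    return ipaddress.ip_address(value.strip()).version


def parse_forwarded_for(header: str):
    """Split an X-Forwarded-For header into a list of address objects."""
    return [
        ipaddress.ip_address(part.strip())
        for part in header.split(",")
        if part.strip()
    ]


if __name__ == "__main__":
    for sample in ("192.0.2.10", "2001:db8::1"):
        print(sample, "is IPv" + str(ip_version(sample)))
    for address in parse_forwarded_for("2001:db8::1, 198.51.100.7"):
        print(address, address.version)
```

A log processor that stores addresses in a fixed-width IPv4 column, or that splits fields on colons, is the kind of thing this check is meant to catch early.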

To learn more, read IPv6 Support for Amazon CloudFront.

AWS WAF IPv6 Support
AWS WAF helps you to protect your applications from application-layer attacks (read New – AWS WAF to learn more).

AWS WAF can now inspect requests that arrive via IPv4 or IPv6 addresses. You can create web ACLs that match IPv6 addresses, as described in Working with IP Match Conditions:

All existing WAF features will work with IPv6 and there will be no visible change in performance. IPv6 addresses will appear in the Sampled Requests collected and displayed by WAF.
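Conceptually, an IP match condition checks whether the source address of a request falls inside one of a set of CIDR ranges. The Python sketch below illustrates that idea with the standard `ipaddress` module. It is not the WAF API, and the function name and sample ranges are hypothetical.

```python
import ipaddress


def matches_any(address: str, cidr_blocks) -> bool:
    """True if the address falls inside any of the given CIDR ranges."""
    ip = ipaddress.ip_address(address)
    for block in cidr_blocks:
        network = ipaddress.ip_network(block)
        # An IPv4 address can never match an IPv6 range, and vice versa.
        if ip.version == network.version and ip in network:
            return True
    return False


if __name__ == "__main__":
    # Documentation-only ranges, one per address family.
    blocked = ["192.0.2.0/24", "2001:db8::/32"]
    for source in ("192.0.2.44", "2001:db8:1234::1", "2001:db9::1"):
        print(source, "blocked" if matches_any(source, blocked) else "allowed")
```

The practical takeaway is that a rule set written only in terms of IPv4 ranges will not match the same client when it arrives over IPv6.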

S3 Transfer Acceleration IPv6 Support
This important new S3 feature (read AWS Storage Update – Amazon S3 Transfer Acceleration + Larger Snowballs in More Regions for more info) now has IPv6 support. To use it, switch to the new dual-stack endpoint for your uploads by changing:

https://BUCKET.s3-accelerate.amazonaws.com

to

https://BUCKET.s3-accelerate.dualstack.amazonaws.com
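If you have accelerate endpoint URLs stored in configuration files, the same substitution is easy to apply programmatically. Here is a minimal Python sketch; the function name is my own, and it assumes that your URLs use exactly the two host name patterns shown above.

```python
from urllib.parse import urlsplit, urlunsplit

ACCELERATE_SUFFIX = ".s3-accelerate.amazonaws.com"
DUALSTACK_SUFFIX = ".s3-accelerate.dualstack.amazonaws.com"


def to_dualstack(url: str) -> str:
    """Rewrite an S3 Transfer Acceleration URL to its dual-stack form."""
    parts = urlsplit(url)
    host = parts.netloc
    if host.endswith(DUALSTACK_SUFFIX):
        return url  # already dual-stack
    if not host.endswith(ACCELERATE_SUFFIX):
        raise ValueError("not an S3 Transfer Acceleration endpoint: " + host)
    bucket = host[: -len(ACCELERATE_SUFFIX)]
    return urlunsplit(parts._replace(netloc=bucket + DUALSTACK_SUFFIX))


if __name__ == "__main__":
    print(to_dualstack("https://BUCKET.s3-accelerate.amazonaws.com/uploads/file.bin"))
```

The path and query string are left untouched, so only the host name changes.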

Here’s some code that uses the AWS SDK for Java to create a client object and enable dual-stack transfer:

import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.S3ClientOptions;
AmazonS3Client s3 = new AmazonS3Client();
s3.setS3ClientOptions(S3ClientOptions.builder().enableDualstack().setAccelerateModeEnabled(true).build());

Most applications and network stacks will prefer IPv6 automatically, and no further configuration should be required. You should plan to take a look at the IAM policies for your buckets in order to make sure that they will work as expected in conjunction with IPv6 addresses.

To learn more, read about Making Requests to Amazon S3 over IPv6.

Don’t Forget to Test
As a reminder, if IPv6 connectivity to any AWS region is limited or non-existent, IPv4 will be used instead. Also, as I noted in my earlier post, the client system can be configured to support IPv6 but connected to a network that is not configured to route IPv6 packets to the Internet. Therefore, we recommend some application-level testing of end-to-end connectivity before you switch to IPv6.
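As a starting point for that testing, here is a small Python sketch that checks whether a host name resolves to any IPv6 addresses and whether a TCP connection over IPv6 actually succeeds from the machine it runs on. The function names, the default port, and the `resolver` parameter (there so that the lookup can be stubbed out) are my own choices.

```python
import socket


def ipv6_addresses(host, port=443, resolver=socket.getaddrinfo):
    """Return the sorted IPv6 addresses for host, or [] if there are none."""
    try:
        infos = resolver(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect_ipv6(host, port=443, timeout=3.0, resolver=socket.getaddrinfo):
    """True if a TCP connection over IPv6 succeeds to any resolved address."""
    for address in ipv6_addresses(host, port, resolver):
        try:
            with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                sock.connect((address, port))
                return True
        except OSError:
            continue  # try the next address, if any
    return False


# Example use (performs real network calls):
#   can_connect_ipv6("BUCKET.s3-accelerate.dualstack.amazonaws.com")
```

A host that resolves to IPv6 addresses but fails the connection test points to the situation described above, where the client supports IPv6 but the network does not route it.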

Jeff;

 

New – CloudWatch Plugin for collectd

by Jeff Barr | in Amazon CloudWatch, Amazon EC2

You have had the power to store your own business, application, and system metrics in Amazon CloudWatch for quite some time (see New – Custom Metrics for Amazon CloudWatch to learn more). As I wrote way back in 2011 when I introduced this feature, “You can view graphs, set alarms, and initiate automated actions based on these metrics, just as you can for the metrics that CloudWatch already stores for your AWS resources.”

Today we are simplifying the process of collecting statistics from your system and getting them into CloudWatch with the introduction of a new CloudWatch plugin for collectd. By combining collectd’s ability to gather many different types of statistics with the CloudWatch features for storage, display, alerting, and alarming, you can become better informed about the state and performance of your EC2 instances and your on-premises hardware and the applications running on them. The plugin is being released as an open source project and we are looking forward to your pull requests.

The collectd daemon is written in C for performance and portability. It supports over one hundred plugins, allowing you to collect statistics on Apache and Nginx web server performance, memory usage, uptime, and much more.

Installation and Configuration
I installed and configured collectd and the new plugin on an EC2 instance in order to see it in action.

To get started I created an IAM Policy with permission to write metrics data to CloudWatch:

Then I created an IAM Role that allows EC2 (and hence the collectd code running on my instance) to use my Policy:
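For reference, here is a Python sketch that prints the kind of policy documents these two steps involve. It is a minimal guess rather than the exact documents that I used: it assumes that `cloudwatch:PutMetricData` is the only permission needed, and the function names are hypothetical. Check the plugin documentation for the authoritative list of permissions.

```python
import json


def put_metric_data_policy() -> dict:
    """Permissions policy: allow writing metric data to CloudWatch."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["cloudwatch:PutMetricData"],
                # PutMetricData is not scoped to individual resources.
                "Resource": "*",
            }
        ],
    }


def ec2_trust_policy() -> dict:
    """Trust policy: allow EC2 instances to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(put_metric_data_policy(), indent=2))
    print(json.dumps(ec2_trust_policy(), indent=2))
```

The first document is attached to the role as its permissions; the second is the role's trust relationship, which is what lets the instance pick up credentials without any keys stored on disk.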

If I was planning to use the plugin to collect statistics from my on-premises servers or if my EC2 instances were already running, I could have skipped these steps, and created an IAM user with the appropriate permissions instead. Had I done this, I would have had to put the user’s credentials on the servers or instances.

With the Policy and the Role in place, I launched an EC2 instance and selected the Role:

I logged in and installed collectd:

$ sudo yum -y install collectd

Then I fetched the plugin and the install script, made the script executable, and ran it:

$ chmod a+x setup.py
$ sudo ./setup.py

I answered a few questions and the setup ran without incident, starting up collectd after configuring it:

Installing dependencies ... OK
Installing python dependencies ... OK
Copying plugin tar file ... OK
Extracting plugin ... OK
Moving to collectd plugins directory ... OK
Copying CloudWatch plugin include file ... OK

Choose AWS region for published metrics:
  1. Automatic [us-east-1]
  2. Custom
Enter choice [1]: 1

Choose hostname for published metrics:
  1. EC2 instance id [i-057d2ed2260c3e251]
  2. Custom
Enter choice [1]: 1

Choose authentication method:
  1. IAM Role [Collectd_PutMetricData]
  2. IAM User
Enter choice [1]: 1

Choose how to install CloudWatch plugin in collectd:
  1. Do not modify existing collectd configuration
  2. Add plugin to the existing configuration
Enter choice [2]: 2
Plugin configuration written successfully.
Stopping collectd process ... NOT OK
Starting collectd process ... OK
$

With collectd running and the plugin installed and configured, the next step was to decide on the statistics of interest and configure the plugin to publish them to CloudWatch (note that there is a per-metric cost so this is an important step).

The file /opt/collectd-plugins/cloudwatch/config/blocked_metrics contains a list of metrics that have been collected but not published to CloudWatch:

$ cat /opt/collectd-plugins/cloudwatch/config/blocked_metrics
# This file is automatically generated - do not modify this file.
# Use this file to find metrics to be added to the whitelist file instead.
cpu-0-cpu-user
cpu-0-cpu-nice
cpu-0-cpu-system
cpu-0-cpu-idle
cpu-0-cpu-wait
cpu-0-cpu-interrupt
cpu-0-cpu-softirq
cpu-0-cpu-steal
interface-lo-if_octets-
interface-lo-if_packets-
interface-lo-if_errors-
interface-eth0-if_octets-
interface-eth0-if_packets-
interface-eth0-if_errors-
memory--memory-used
load--load-
memory--memory-buffered
memory--memory-cached

I was interested in memory consumption so I added one line to /opt/collectd-plugins/cloudwatch/config/whitelist.conf:

memory--memory-.*
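Before restarting collectd, you can preview which of the blocked metrics a whitelist pattern would select. The Python sketch below assumes that each whitelist line is a regular expression that must match the entire metric name; that matching rule and the function name are assumptions on my part, so confirm the behavior against the plugin documentation.

```python
import re


def published_metrics(metric_names, whitelist_patterns):
    """Return the metric names that fully match any whitelist pattern."""
    compiled = [re.compile(pattern) for pattern in whitelist_patterns]
    return [
        name for name in metric_names
        if any(regex.fullmatch(name) for regex in compiled)
    ]


if __name__ == "__main__":
    # A few names taken from the blocked_metrics file shown earlier.
    blocked = [
        "cpu-0-cpu-user",
        "interface-eth0-if_octets-",
        "memory--memory-used",
        "load--load-",
        "memory--memory-buffered",
        "memory--memory-cached",
    ]
    for name in published_metrics(blocked, ["memory--memory-.*"]):
        print(name)
```

Since there is a per-metric cost, a quick preview like this is a cheap way to avoid publishing more than you intended.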

The collectd configuration file (/etc/collectd.conf) contains additional settings for collectd and the plugins. I did not need to make any changes to it.

I restarted collectd so that it would pick up the change:

$ sudo service collectd restart

I exercised my instance a bit in order to consume some memory, and then opened up the CloudWatch Console to locate and display my metrics:

This screenshot includes a preview of an upcoming enhancement to the CloudWatch Console; don’t worry if yours doesn’t look as cool (stay tuned for more information on this).

If I had been monitoring a production instance, I could have installed one or more of the collectd plugins. Here’s a list of what’s available on the Amazon Linux AMI:

$ sudo yum list | grep collectd
collectd.x86_64                        5.4.1-1.11.amzn1               @amzn-main
collectd-amqp.x86_64                   5.4.1-1.11.amzn1               amzn-main
collectd-apache.x86_64                 5.4.1-1.11.amzn1               amzn-main
collectd-bind.x86_64                   5.4.1-1.11.amzn1               amzn-main
collectd-curl.x86_64                   5.4.1-1.11.amzn1               amzn-main
collectd-curl_xml.x86_64               5.4.1-1.11.amzn1               amzn-main
collectd-dbi.x86_64                    5.4.1-1.11.amzn1               amzn-main
collectd-dns.x86_64                    5.4.1-1.11.amzn1               amzn-main
collectd-email.x86_64                  5.4.1-1.11.amzn1               amzn-main
collectd-generic-jmx.x86_64            5.4.1-1.11.amzn1               amzn-main
collectd-gmond.x86_64                  5.4.1-1.11.amzn1               amzn-main
collectd-ipmi.x86_64                   5.4.1-1.11.amzn1               amzn-main
collectd-iptables.x86_64               5.4.1-1.11.amzn1               amzn-main
collectd-ipvs.x86_64                   5.4.1-1.11.amzn1               amzn-main
collectd-java.x86_64                   5.4.1-1.11.amzn1               amzn-main
collectd-lvm.x86_64                    5.4.1-1.11.amzn1               amzn-main
collectd-memcachec.x86_64              5.4.1-1.11.amzn1               amzn-main
collectd-mysql.x86_64                  5.4.1-1.11.amzn1               amzn-main
collectd-netlink.x86_64                5.4.1-1.11.amzn1               amzn-main
collectd-nginx.x86_64                  5.4.1-1.11.amzn1               amzn-main
collectd-notify_email.x86_64           5.4.1-1.11.amzn1               amzn-main
collectd-postgresql.x86_64             5.4.1-1.11.amzn1               amzn-main
collectd-rrdcached.x86_64              5.4.1-1.11.amzn1               amzn-main
collectd-rrdtool.x86_64                5.4.1-1.11.amzn1               amzn-main
collectd-snmp.x86_64                   5.4.1-1.11.amzn1               amzn-main
collectd-varnish.x86_64                5.4.1-1.11.amzn1               amzn-main
collectd-web.x86_64                    5.4.1-1.11.amzn1               amzn-main

Things to Know
If you are using version 5.5 or newer of collectd, four metrics are now published by default:

  • df-root-percent_bytes-used – disk utilization
  • memory--percent-used – memory utilization
  • swap--percent-used – swap utilization
  • cpu--percent-active – cpu utilization

You can remove these from the whitelist.conf file if you don’t want them to be published.

The primary repositories for the Amazon Linux AMI, Ubuntu, RHEL, and CentOS currently provide older versions of collectd; please be aware of this change in the default behavior if you install from a custom repo or build from source.

Lots More
There’s quite a bit more than I had time to show you. You can install more plugins and then configure whitelist.conf to publish even more metrics to CloudWatch. You can create CloudWatch Alarms, set up Custom Dashboards, and more.

To get started, visit AWS Labs on GitHub and download the CloudWatch plugin for collectd.

Jeff;