The Power of Coding Standards

While preparing for another AWS Pro cert, I came across an interesting article by Martin Fowler that highlights the cost of cruft. Cruft is badly designed, unnecessarily complicated, or unwanted code.

Having run a lot of software projects in my time and established and managed app development teams, Martin’s articles around software architecture and code quality really resonated with me. 

Just the other day I came across some code that had more comment lines than actual code, and the code that was there wasn’t architecturally sound. It had a fairly obvious bug caused by an oversight about how a user might interact with the application, which made remediation difficult and time-consuming.

I feel that 4 of the most powerful lessons I’ve learnt in my IT career are:

1. Always understand the root cause of an error. Don’t just “fix” stuff without fully identifying and understanding the root cause.

2. Architect everything modularly. Learning OOP with C++ was probably one of the best things I did very early in my career.

3. Always ask yourself “Will this scale?” What will this look like scaled out 10x, 1,000x, 100,000x?

4. Introduce standards fast and early with well documented examples of “why” for others to follow.

I’m going to focus on the “why” in point 4. 

Far too many times I’ve been involved in projects where a rough proof of concept has been developed, the idea catches on and before you know it, badly written developer code is in production.

Martin Fowler correctly points out that the time impact of this becomes realised in weeks, not months. More often than not the code won’t scale. More features create more cruft. Finding the root cause of errors becomes more cumbersome and time consuming. Before long, all hands are busy bailing out a leaky boat rather than finding ways to make it sail faster and leaner.

I feel that point 2 and point 4 go hand in hand. Point 2 is reflected in the 12 Factor app. OOP encourages abstraction, so I’ve always created applications this way. 

The main benefit of standards in my view, is having a whole team code consistently at speed. Everyone can understand everyone else’s code quickly, and there’s a certain amount of quality baked into the code immediately.

It’s likely that some people might cringe at the idea of coding standards, but that might be because they’ve had standards forced upon them with no rhyme or reason.

In my experience, it’s best if the product development team come up with standards together, and agree why a standard is important.

I think another point to emphasise is that this should be a set of guidelines rather than rigid rules that are enforced blindly.

These days many standards exist that you can just pick up and use. Python’s PEP 8 is a good example. Further to this, most languages now have linters that check that developers are adhering to code style recommendations.
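As a minimal sketch of what this can look like for a JavaScript codebase (the file and the rule choices are purely illustrative, not a recommendation from this post), a team might commit a shared ESLint configuration so the tooling, rather than code review, polices the basics:

.eslintrc.json

{
  "extends": "eslint:recommended",
  "env": {
    "node": true,
    "es6": true
  },
  "rules": {
    "camelcase": "error",
    "no-unused-vars": "error",
    "max-len": ["warn", { "code": 100 }]
  }
}

Running the linter in CI then fails the build whenever someone drifts from the agreed style, so the standard is enforced without relying on memory or goodwill.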

Agreeing on meaningful names for modules, functions and variables so their purpose is self-evident is a worthwhile investment of time. One example is deciding that all repos for front-end code should have -frontend- in their name, and similarly for backend code. You won’t need to look in the repo to figure out what part of the application the code deals with, and it’s easy to search for packages or modules by filtering on these naming conventions.

I’ve worked with coders that thought the height of genius was writing code nobody else but them could understand. Single letter variable names and aliases. Unnecessary LEFT and RIGHT join statements in SQL. All it does is make the code near impossible to understand, let alone maintain.

Whatever standards you come up with, there should be a sound reason for them. That reason should relate back to quality at scale. Although good standards might mean a 10-15% increase in development time initially, when you’re even more time poor later on, that investment will pay huge dividends when it really counts.

Understanding the outcome you’re trying to accomplish with a standard is more important than the means by which you’re trying to accomplish it. I see this a lot in the security, governance and privacy space.

In many large organisations the why, the desired outcome, is completely lost in the complications of the process. The process is so cumbersome, lengthy and unworkable that everyone avoids it whenever they can. This defeats the purpose of it existing in the first place.

When coming up with standards with your team, always frame them with “we want to use this standard so that we accomplish this outcome.” This opens the floor for a better way to accomplish the outcome if the standard is too tight or rigid. 

Ask yourself why the approach might not work rather than validating why it will. What are the costs and consequences compared with those of the other options?

I feel it’s just as important to review and revise standards as things change. Are the standards you’ve established still fit for purpose? Are they accomplishing the objectives you intended? Sadly there are far too many times when a development team is stuck on a set way of doing things but have forgotten why they do it that way. There’s a real risk of locking yourself in the dark ages if you’re not reviewing and incrementally improving the effectiveness of your approach.

In summary, establishing some sound standards that encourage common patterns so that new problems can be solved with quality code is a worthwhile investment.

The cost of not doing this is poorly written code that typically doesn’t scale and is difficult and time consuming to maintain. It will almost certainly need to be completely rewritten at some time.

Use Lambda@Edge to handle complex redirect rules with CloudFront

Problem

Most mature CDNs on the market today let you define URL forwarding / redirect rules based on the request path that are executed at the edge, minimising the wait time before users are sent to their final destination.

CloudFront, Amazon Web Services’ CDN offering, provides out-of-the-box support for redirection from HTTP to HTTPS and will cache 3xx responses from its origins, but it doesn’t allow you to configure path-based redirects. In the past, this meant we had to configure our redirects close to our origins in specific AWS regions, which had an impact on how fast we could serve content to our users.

Solution

Luckily, AWS has anticipated this requirement and provides other services at edge locations that can complement CloudFront to enable this functionality.

AWS Lambda@Edge is exactly that: a lambda function that runs at the edge instead of in a particular region. AWS has the most comprehensive Global Edge Network with, at the time of writing, 169 edge locations around the world. With Lambda@Edge, your lambda function runs in the location that is geographically closest to the user making the request.

You define, write and deploy them exactly the same way as normal lambdas, with an extra step to associate them with a CloudFront distribution which then copies them to the edge locations where they’ll be executed.

Lambda@Edge can intercept requests at four points in the request life-cycle: viewer request, origin request, origin response and viewer response. The documentation has the details:

https://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html

For our use case, we want to intercept the viewer request and redirect the user based on a set of path-based rules.

The following section provides instructions on how to implement redirects at the edge using the Serverless Application Model, CloudFront and Lambda@Edge.

How to

Assumptions

This guide assumes that you already have an AWS account, with the AWS CLI and the AWS SAM CLI installed and configured.

Since we’ll be using the Serverless Application Model to define and deploy our lambda, we’ll need to set up an S3 bucket for sam package, so we have a prerequisites CloudFormation template.

Note: everything in this guide is deployed into us-east-1. I have included the region explicitly in the CLI commands, but you can use your AWS CLI config if you want (or any of the other valid ways to define region).

1) Create the following file:

lambda-edge-prerequisites.yaml

AWSTemplateFormatVersion: '2010-09-09'

Resources:
  RedirectLambdaBucket:
    Type: AWS::S3::Bucket

Outputs:
  RedirectLambdaBucketName:
    Description: Redirect lambda package S3 bucket name
    Value: !Ref RedirectLambdaBucket

We define the bucket name as an output so we can refer to it later.

2) Deploy the prerequisites CloudFormation stack with:

$ aws --region us-east-1 cloudformation create-stack --stack-name redirect-lambda-prerequisites --template-body file://`pwd`/lambda-edge-prerequisites.yaml

This should give you an S3 bucket we can point sam package at. Let’s save the bucket name into an environment variable so it’s easy to use in future commands (you can also just get this from the AWS Console):

3) Run the following command:

$ export BUCKET_NAME=$(aws --region us-east-1 cloudformation describe-stacks --stack-name redirect-lambda-prerequisites --query "Stacks[0].Outputs[?OutputKey=='RedirectLambdaBucketName'].OutputValue" --output text)

Now we’ve got our bucket name ready to use with $BUCKET_NAME, we’re ready to start defining our lambda using the Serverless Application Model.

The first thing we need to define is a lambda execution role. This is the role that our edge lambda will assume when it gets executed.

4) Create the following file:

lambda-edge.yaml

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Full stack to demo Lambda@Edge for CloudFront redirects

Parameters:
  RedirectLambdaName:
    Type: String
    Default: redirect-lambda

Resources:
  RedirectLambdaFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - 'lambda.amazonaws.com'
                - 'edgelambda.amazonaws.com'
            Action:
              - 'sts:AssumeRole'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'

Notice that we allow both lambda.amazonaws.com and edgelambda.amazonaws.com to assume this role, and we grant the role the AWSLambdaBasicExecutionRole managed policy, which grants it privileges to publish its logs to CloudWatch.

Next, we need to define our actual lambda function using the Serverless Application Model.

5) Add the following in the Resources: section of lambda-edge.yaml:

lambda-edge.yaml

  RedirectLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: lambdas/
      FunctionName: !Ref RedirectLambdaName
      Handler: RedirectLambda.handler
      Role: !GetAtt RedirectLambdaFunctionRole.Arn 
      Runtime: nodejs10.x
      AutoPublishAlias: live

Note: we define AutoPublishAlias: live here which tells SAM to publish both an alias and a version of the lambda and link the two. CloudFront requires a specific version of the lambda and doesn’t allow us to use $LATEST.

We also define CodeUri: lambdas/ which tells SAM where to look for the Node.js code that will be the brains of the lambda itself. This doesn’t exist yet, so we’d better create it:

6) Make a new directory called lambdas:

$ mkdir lambdas

7) Inside that directory, create the following file:

lambdas/RedirectLambda.js

'use strict';

exports.handler = async (event) => {
    console.log('Event: ', JSON.stringify(event, null, 2));
    let request = event.Records[0].cf.request;

    const redirects = {
        '/path-1':    'https://consegna.cloud/',
        '/path-2':    'https://www.amazon.com/',
    };

    if (redirects[request.uri]) {
        return {
            status: '302',
            statusDescription: 'Found',
            headers: {
                'location': [{ value: redirects[request.uri] }]
            }
        };
    }
    return request;
};

The key parts of this lambda are:

a) we can inspect the viewer request as it gets passed in via the event context,
b) we can return a 302 redirect if the request path meets some criteria we set, and
c) we can return the request as-is if it doesn’t meet our redirect criteria.

You can make the redirect rules as simple or as complex as you like.
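If you need something richer than exact matches, here is a rough sketch of what prefix-based rules that also preserve the query string could look like (the prefixes and target URLs below are made up for illustration):

'use strict';

// Hypothetical prefix rules: the first matching prefix wins.
const prefixRedirects = [
    { prefix: '/docs/', target: 'https://example.com/documentation/' },
    { prefix: '/blog/', target: 'https://example.com/articles/' },
];

exports.handler = async (event) => {
    const request = event.Records[0].cf.request;
    const match = prefixRedirects.find(rule => request.uri.startsWith(rule.prefix));

    if (match) {
        // Keep the rest of the path and any query string on the redirect target
        const rest = request.uri.slice(match.prefix.length);
        const query = request.querystring ? '?' + request.querystring : '';
        return {
            status: '302',
            statusDescription: 'Found',
            headers: {
                'location': [{ value: match.target + rest + query }]
            }
        };
    }
    return request;
};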

You may have noticed that we hard-code our redirect rules in the lambda. We do this deliberately, but you may decide you’d rather keep your rules somewhere else, like DynamoDB or S3 (there’s a sketch of the S3 approach after the list below). The three main reasons we keep our redirect rules directly in the lambda are:

a) the quicker we can inspect the request and return a response to the user the better, and having to hit DynamoDB or S3 would slow us down,
b) because this lambda is executed on every request, hitting DynamoDB or S3 every time has cost implications, and
c) defining our redirects in code means we can have robust peer reviews using things like GitHub’s pull requests.
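If you do decide to keep the rules outside the lambda, a minimal sketch of the S3 approach might look like the following (the bucket and key are placeholders, the rules file is assumed to be a small JSON object mapping paths to URLs, and the execution role would also need s3:GetObject on that bucket). Caching the parsed rules in a module-level variable keeps the per-request overhead down:

'use strict';

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Cached across invocations while the execution environment stays warm.
let cachedRedirects = null;

async function loadRedirects() {
    if (!cachedRedirects) {
        // Lambda@Edge doesn't support environment variables, so the location is hard-coded here.
        const object = await s3.getObject({
            Bucket: 'my-redirect-rules-bucket',   // placeholder bucket
            Key: 'redirects.json'                 // placeholder key
        }).promise();
        cachedRedirects = JSON.parse(object.Body.toString('utf-8'));
    }
    return cachedRedirects;
}

exports.handler = async (event) => {
    const request = event.Records[0].cf.request;
    const redirects = await loadRedirects();

    if (redirects[request.uri]) {
        return {
            status: '302',
            statusDescription: 'Found',
            headers: {
                'location': [{ value: redirects[request.uri] }]
            }
        };
    }
    return request;
};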

Because this is a Node.js lambda, SAM requires us to define a package.json file, so we can just define a vanilla one:

8) Create the file package.json:

lambdas/package.json

{
  "name": "lambda-redirect",
  "version": "1.0.1",
  "description": "Redirect lambda using Lambda@Edge and CloudFront",
  "author": "Chris McKinnel",
  "license": "MIT"
}

The last piece of the puzzle is to define our CloudFront distribution and hook up the lambda to it.

9) Add the following to your lambda-edge.yaml:

lambda-edge.yaml

  CloudFront: 
    Type: AWS::CloudFront::Distribution 
    Properties: 
      DistributionConfig: 
        DefaultCacheBehavior: 
          Compress: true 
          ForwardedValues: 
            QueryString: true 
          TargetOriginId: google-origin
          ViewerProtocolPolicy: redirect-to-https 
          DefaultTTL: 0 
          MaxTTL: 0 
          MinTTL: 0 
          LambdaFunctionAssociations:
            - EventType: viewer-request
              LambdaFunctionARN: !Ref RedirectLambdaFunction.Version
        Enabled: true 
        HttpVersion: http2 
        PriceClass: PriceClass_All 
        Origins: 
          - DomainName: www.google.com
            Id: google-origin
            CustomOriginConfig: 
              OriginProtocolPolicy: https-only 

In this CloudFront definition, we use Google as the origin so we can create a default cache behaviour that attaches our lambda to the viewer-request event. Notice that when we associate the lambda function with our CloudFront behaviour, we refer to a specific lambda version.

SAM / CloudFormation template

Your SAM template should look like the following:

lambda-edge.yaml

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Full stack to demo Lambda@Edge for CloudFront redirects

Parameters:
  RedirectLambdaName:
    Type: String
    Default: redirect-lambda

Resources:
  RedirectLambdaFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - 'lambda.amazonaws.com'
                - 'edgelambda.amazonaws.com'
            Action:
              - 'sts:AssumeRole'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'

  RedirectLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: lambdas/
      FunctionName: !Ref RedirectLambdaName
      Handler: RedirectLambda.handler
      Role: !GetAtt RedirectLambdaFunctionRole.Arn 
      Runtime: nodejs10.x
      AutoPublishAlias: live

  CloudFront: 
    Type: AWS::CloudFront::Distribution 
    Properties: 
      DistributionConfig: 
        DefaultCacheBehavior: 
          Compress: true 
          ForwardedValues: 
            QueryString: true 
          TargetOriginId: google-origin
          ViewerProtocolPolicy: redirect-to-https 
          DefaultTTL: 0 
          MaxTTL: 0 
          MinTTL: 0 
          LambdaFunctionAssociations:
            - EventType: viewer-request
              LambdaFunctionARN: !Ref RedirectLambdaFunction.Version
        Enabled: true 
        HttpVersion: http2 
        PriceClass: PriceClass_All 
        Origins: 
          - DomainName: www.google.com
            Id: google-origin
            CustomOriginConfig: 
              OriginProtocolPolicy: https-only 

And your directory structure should look like:

├── lambda-edge-prerequisites.yaml
├── lambda-edge.yaml
├── lambdas
│   ├── RedirectLambda.js
│   └── package.json
└── packaged
    └── lambda-edge.yaml

Now we’ve got everything defined, we need to package it and deploy it. AWS SAM makes this easy.

10) First, create a new directory called packaged:

$ mkdir packaged

11) Using our $BUCKET_NAME variable from earlier, we can now run:

$ sam package --template-file lambda-edge.yaml --s3-bucket $BUCKET_NAME > packaged/lambda-edge.yaml

The AWS SAM CLI uploads your local lambda code to the S3 bucket and rewrites the template into a format that CloudFormation understands, with CodeUri pointing at the uploaded artifact. After running this command, you should have a directory structure like this:

├── .aws-sam
│   └── build
│       ├── RedirectLambda
│       │ ├── RedirectLambda.js
│       │ └── package.json
│       └── template.yaml
├── lambda-edge-prerequisites.yaml
├── lambda-edge.yaml
├── lambdas
│   ├── RedirectLambda.js
│   └── package.json
└── packaged
    └── lambda-edge.yaml 

Notice the new .aws-sam directory – this contains your lambda code and a copy of your SAM template. You can use the AWS SAM CLI to run your lambda locally, but that is out of scope for this guide. Also notice the new file under the packaged directory – it contains direct references to your S3 bucket, and it’s what we’ll use to deploy the template to AWS.

You can find the full demo, downloadable in zip format, here: lambda-edge.zip

Finally we’re ready to deploy our template:

12) Deploy your template by running:

$ sam deploy --region us-east-1 --template-file packaged/lambda-edge.yaml --stack-name lambda-redirect --capabilities CAPABILITY_IAM

Note the --capabilities CAPABILITY_IAM flag: this tells CloudFormation that we acknowledge this stack may create IAM resources that grant privileges in the AWS account. We need it because we’re creating an IAM execution role for the lambda.

This should give you a CloudFormation stack with a lambda deployed on the edge that is configured with a couple of redirects.

When you hit your distribution domain name with a redirect path appended (/path-2 – look for this in the lambda code), you should get redirected.
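A quick way to check from the command line is to request one of the redirect paths and look at the response status and location header (the distribution domain below is a placeholder – use the domain name reported for your own distribution):

$ curl -I https://d1234abcd5678.cloudfront.net/path-2

You should see a 302 response with a location header pointing at https://www.amazon.com/.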

Summary

AWS gives you building blocks that you can use together to build complete solutions, often these solutions are much more powerful than what’s available out-of-the-box in the market. Consegna has a wealth of experience designing and building solutions for their clients, helping them accelerate their adoption of the cloud.

AWS SAM local – debugging Lambda and API Gateway with breakpoints demo

Overview

This new serverless world is great, but if you dive into it too fast – sometimes you end up getting caught up trying to get it all working, and forget to focus on having your local development environment running efficiently. This often costs developers time, and as a consequence it also costs the business money.

One of the things I see fairly often is developers adopting AWS Serverless technologies because they know that’s what they should be doing (and everyone else is doing it), but they end up sacrificing their local development flows to do so – and the sacrifice that’s most obvious is running lambdas locally with breakpoints.

This post covers how to get local debugging working using breakpoints and an IDE from a fresh AWS SAM project using the Python3.6 runtime.

I’m using the Windows Subsystem for Linux and Docker on Windows.

Prerequisites

To follow along you’ll need the AWS SAM CLI, Docker, Python 3.6 and an IDE that supports remote Python debugging (the examples below use Visual Studio Code).

Video Demo

There is a high level step-by-step below, but the video contains exact steps and a demo of this working using WSL and Docker for Windows.

Step by step

Assuming you’ve got the prerequisites above, the process of getting a new project set up and hooked up to your IDE is relatively straightforward.

1. Initialise a new python runtime project with:

$ sam init --runtime python3.6

2. Test that we can run our API Gateway mock, backed by our lambda, locally using:

$ sam local start-api

3. Hit our app in a browser at:

http://127.0.0.1:3000/hello


4. Add the debugging library to requirements.txt

hello_world/requirements.txt
requests==2.20.1
ptvsd==4.2.10

5. Import the debug library and have it listen on a debug port

hello_world/app.py

import ptvsd

# Listen on all interfaces on port 5890 and block until a debugger attaches
ptvsd.enable_attach(address=('0.0.0.0', 5890), redirect_output=True)
ptvsd.wait_for_attach()
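As an optional tweak, and purely a sketch (the SAM_DEBUG variable below is made up, not something SAM or ptvsd define), you can guard the attach call behind an environment variable so the function still runs normally when you’re not debugging:

import os

import ptvsd

# Hypothetical flag: only block for a debugger when SAM_DEBUG=1 is set
if os.environ.get('SAM_DEBUG') == '1':
    ptvsd.enable_attach(address=('0.0.0.0', 5890), redirect_output=True)
    ptvsd.wait_for_attach()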

6. Build the changes to the app in a container using:

$ sam build --use-container

7. Set a breakpoint in your IDE

8. Configure the debugger in your IDE (Visual Studio Code uses launch.json)

{
   "version": "0.2.0",
   "configurations": [
       {
           "name": "SAM CLI Python Hello World",
           "type": "python",
           "request": "attach",
           "port": 5890,
           "host": "127.0.0.1",
           "pathMappings": [
               {
                   "localRoot": "${workspaceFolder}/hello_world",
                   "remoteRoot": "/var/task"
               }
           ]
       }
   ]
}

9. Start your app and expose the debug port using:

$ sam local start-api --debug-port 5890

10. Important: hit your endpoint in the browser

This will fire up a docker container that has the debug port exposed. If you attempt to start the debugger in your IDE before doing this, you will get a connection refused error.

11. Start the debugger in your IDE:

This should drop you into an interactive debugging session in your IDE. Great!

Summary

While adopting new technologies can be challenging and fun, it’s important that you keep your local development as efficient as possible so you can spend your time on what you do best: developing.

Consegna helps its partners transition from traditional development processes, to cloud development processes with minimal disruption to workflow. By demonstrating to developers that their day-to-day lives won’t change as much as they think, we find that widespread adoption and enthusiasm for the cloud is the norm, not the exception for developers in our customer engagements.

Cloud Migration War Stories: 10 Lessons learnt from the Lift-and-Shift migration of 100s of client servers from Traditional Data Centres into AWS.

Experience is a hard teacher because you get the test first, and the lesson afterwards.

I’ve always felt the best lessons are the ones learnt yourself, but to be honest, sometimes I would be more than happy to learn some lessons from others first. I hope the following can help you before you embark on your lift-and-shift migration journey.

Beware the incumbent

“Ok, so he won’t shake my hand or even look me in the eye. Oh no, this is not a good sign.” These were my initial observations when I first met the representative of one of our client’s Managed Service Providers (MSP). Little did I know how challenging, yet important, this relationship would become.

This is how I saw it: all of a sudden, after years of this MSP giving your client pretty average service, they see you as a threat on their radar. Sometimes the incumbent MSP is also getting mixed messages from the client. What’s wrong? Why the change? What does it mean for them?

I found it best to get the existing MSP on side early. If it’s an exit, then an exit strategy is needed between the client and the MSP. The best results happen when the MSP is engaged and ideally a Project Manager is put in place to assist the client with that exit strategy.

Most importantly, tell your MSP to heed the wise words of Stephen Orban.  “Stop fighting gravity. The cloud is here, the benefits to your clients are transformational, and these companies need your help to take full advantage of what the cloud offers them. Eventually, if you don’t help them, they’ll find someone who will.”

Partner with your client

“Do cloud with your client, not to them”. Your client is going to have a certain way they work, and are comfortable with. Your client will also have a number of Subject Matter Experts (SMEs) and in order to bring these SMEs on the journey also, having someone from your team on-site and full time paired-up next to the SME to learn from them can be invaluable.

There will be things they know that you don’t. A lot actually. I found it best to get your client involved and more importantly their input and buy-in. The outcome will be much better, as will your ability to overcome challenges when they come up.

Lay a good foundation

We spent a significant amount of time working with our client to understand what we were in for. We created an extensive list of every server (and there were 100s) in the Managed Service Provider’s data centre and then put strategies in place to migrate groups of servers.

We also set up our own version of an AWS Landing Zone as a foundational building block so best practices were set up for account management and security in AWS.

It’s important to lay this foundation and do some good analysis up front. Things will change along the way but a good period of discovery at the start of a project is essential.

But, don’t over analyse!

Do you need a plan? Absolutely. There are a number of good reasons why you need one: it sets a guideline and a basis for communication within the team and outside it. But I think you can spend too much time planning and not enough time doing.

We started with a high-level plan with groups of servers supporting different services for our client. We estimated some rough timelines and then got into it. And we learnt a lot along the way and then adapted our plan to show value to our client.

Pivot

Mike Tyson once said “Everybody has a plan until they get punched in the mouth”

When things go wrong you need to adapt and change. When we started migrating one particular set of servers out of the incumbent data centre, we discovered their network was slow and things ground to a halt during the migration. So, like being punched in the mouth, you take the hit and focus on a different approach. We did get back to those servers and got them into the cloud, but we didn’t let them derail our plans.

Challenge the status quo

When I started working on a client migration project recently, the team had just finished migrating one of the key databases up into AWS, but the backup process was failing, as the backup window was no longer large enough.

After digging a little deeper, it was found that the backup process itself was very slow and cumbersome, but it had been working (mostly) for years, so ‘why change, right?’! The solution we put in place was to switch to a more lightweight process, which completed in a fraction of the time.

What’s my role, what’s your role?

It’s a really good idea to get an understanding of what everyone’s role is when working with multiple partners. We found that taking a useful idea from ITIL and creating a RACI matrix (https://en.it-processmaps.com/products/itil-raci-matrix.html) was a really good way to communicate who was responsible for what during the migration, and also for the support of services after the migration.

Not one size fits all.

There are a number of different ways to migrate applications out of data centres and into the cloud. We follow the “6 Rs”, the different strategies for moving servers into AWS (https://aws.amazon.com/blogs/enterprise-strategy/6-strategies-for-migrating-applications-to-the-cloud/).

Although we used a “Lift and Shift” (Rehosting) approach for most servers, in a number of cases we were also “Refactoring”, “Re-architecting” and “Retiring” where this made sense.

Go “Agile”.

In short, “Agile” can mean a lot of different things to different people. It can also depend on the maturity of your client and their previous experiences.

We borrowed ideas from the Kanban methodology such as using sticky notes and tools like Trello to visualise the servers being migrated and to help us limit tasks in progress to make the team more productive.

We found we could take a lot of helpful parts of Agile methodologies like Scrum including stand-ups which allowed daily communication within the team.

And finally but probably most important – Manage Up and Around!

My old boss once told me “perception is reality” and it has always stuck.

It’s critical senior stakeholders are kept well informed of progress in a concise manner and project governance is put in place. This way key stakeholders from around the business can assist when help is needed and are involved in the process.

So, how does this work in an Agile world? Communication is key. You can still run your project using an Agile methodology, but it’s still important to provide reporting on risks, timelines and financials to senior stakeholders. This reporting, along with regular governance meetings that reinforce the written reports, will keep your client in the loop and the project on track.

NZTA All-Access Hackathon

Consegna and AWS are proud to be sponsoring the NZTA Hackathon again this year. The event will be held the weekend of 21 – 23 September.

Last year’s event, Save One More Life, was a huge success and the winner’s concept has been used to help shape legislation in order to support its adoption nationally.

The information session held in the Auckland NZTA Innovation space last night provided great insight into this year’s event, which focuses on accessible transport options – in other words, making transport more accessible to everyone, especially those without access to their own car, the disabled, and others in the community who are isolated due to limited transport options.

The importance of diversity among the teams was a strong theme during the evening. For this event in particular, diverse teams are going to have an edge, as Luke Krieg, the Senior Manager of Innovation NZTA, pointed out, “Data can only tell you so much about a situation. It’s not until you talk to people, that the real insights appear – and what the data alone doesn’t reveal also becomes evident.”

Jane Strange, CX Improvement Lead NZTA, illustrated this point nicely with a bell curve that shows the relationship between users at each extreme of the transport accessibility graph.

Those on the right, with high income, an urban location, and proximity to and choice of transport options, invariably define transport policy for those to the left of the curve: people with low income, located in suburban or rural areas, who are typically more isolated and have fewer transport options.

Luke also stressed how much more successful diverse teams participating in Hackathons usually are. As these are time-boxed events that require a broad spectrum of skills, technology in and of itself often doesn’t win out. Diverse skills are essential to a winning team.

For more information and registration to the event, please visit https://nzta.govt.nz/all-access

 

Cognito User Pool Migration

At Consegna, we like AWS and its services, which are backed by a solid body of documentation, blog posts and best practices. Because it is easy to find open source, production-ready code on GitHub, it is straightforward to deploy new applications quickly and at scale. However, sometimes moving too fast can lead to some painful problems over time!

Deploying the AWS Serverless Developer Portal from GitHub straight to production works perfectly fine. Nevertheless, hardcoded values within the templates make it complicated to deploy multiple similar environments within the same AWS account. Introducing some parameterisation is usually the way to solve that problem, but that leads to a production stack that is not aligned with the staging environments, which is, of course, not a best practice…

This blog post describes the solution we implemented to solve the challenge of migrating Cognito users from one pool to another at scale. The extra step of migrating the API keys associated with those users is also covered in this post.

The Technology Stack

The deployed stack involves AWS serverless technologies such as Amazon API Gateway, AWS Lambda, and Amazon Cognito. It is assumed in this blog post that you are familiar with those AWS services but we encourage you to check out the AWS documentation or to contact Consegna for more details.

The Challenge

The main challenge is to migrate Cognito users and their API keys at scale without any downtime or requiring any password resets from the end users.

The official AWS documentation describes two ways of migrating users from one user pool to another:

1. Migrate users when they sign in using Amazon Cognito for the first time, with a user migration Lambda trigger. With this approach, users can continue using their existing passwords and will not have to reset them after the migration to your user pool.
2. Migrate users in bulk by uploading a CSV file containing the user profile attributes for all users. With this approach, users will be required to reset their passwords.

We discarded the second option as we did not want our users to “pay” for this backend migration. So we used the following AWS blog article as a starting point, while keeping in mind that it does not cover the entire migration we needed to implement. Indeed, by default, an API key is created for every user registering on the portal. The key is stored in API Gateway and is named based on the user’s CognitoIdentityId attribute, which is specific to each user within a particular Cognito user pool.

The Solution

The Migration Flow

The following picture represents our migration flow with the extra API key migration step.


Migration Flow

Notes

  1. The version of our application currently deployed in production does not support the “Forgot my password” flow, so we did not implement it in our migration flow (but we should and will).
  2. When a user registers, they must submit a verification code to gain access to their API key. In the very unlikely situation where a user has registered against the current production environment without confirming their email address, the user will be migrated automatically, with automatic confirmation of their email address by the migration microservice. Based on the number of users and the low probability of this particular scenario, we considered this an acceptable risk. However, it might be different for your application.

The Prerequisites

In order to successfully implement the migration microservice, you first need to grant some IAM permissions and to modify the Cognito user pool configuration.

  1. You must grant your migration Lambda function the following permissions (feel free to restrict those permissions to specific Cognito pools using arn:${Partition}:cognito-idp:${Region}:${Account}:userpool/${UserPoolId}):

- Action:
    - apigateway:GetApiKeys
    - apigateway:UpdateApiKey
    - cognito-identity:GetId
    - cognito-idp:AdminInitiateAuth
    - cognito-idp:AdminCreateUser
    - cognito-idp:AdminGetUser
    - cognito-idp:AdminRespondToAuthChallenge
    - cognito-idp:ListUsers
  Effect: Allow
  Resource: "*"
  2. On both Cognito pools (the one you are migrating from and the one you are migrating to), enable the Admin Authentication Flow (ADMIN_NO_SRP_AUTH) to allow server-based authentication by the Lambda function executing the migration. You can do this via the Management Console or the AWS CLI with the following command:
aws cognito-idp update-user-pool-client \
    --user-pool-id <value> \
    --client-id <value> \
    --explicit-auth-flows ADMIN_NO_SRP_AUTH

More details about the Admin Authentication Flow are available here.

You are all set. Let’s get our hands dirty!

The Implementation (in JS)

At the Application Layer

To allow a smooth migration for our users, the OnFailure handler of the login method should call our migration microservice instead of returning the original error back to the user. An unauthenticated API Gateway client is initialised to call the migrate_user method on our API Gateway. The result returned by the backend is straightforward: RETRY indicates a successful migration, so the application must log the user in again automatically; otherwise it must handle the authentication error (user does not exist, username or password incorrect, and so on).

onFailure: (err) => {
  // Save the original error to make sure to return appropriate error if required...
  var original_err = err;

  // Attempt migration only if old Cognito pool exists and if the original error is 'User does not exist.'
  if (err.message === 'User does not exist.' && oldCognitoUserPoolId !== '') {
    initApiGatewayClient()  // Initialize an unauthenticated API Gateway client
    
    var body = {
      // Prepare the body for the request for all required information such as
      // username, password, old and new Cognito pool information
    }
    
    // Let's migrate your user!
    apiGatewayClient.post("/migrate_user", {}, body, {}).then((result) => {
      resolve(result);
      if (result.data.status === "RETRY") {  // Successful migration!
        // user can now login!
      } else {
          // Oh no, status is not RETRY...
          // Check the error code and display appropriate error message to the user
      } 
    }).catch((err) => {
      // Handle err returned by migrate_user or return original error
    });
  } else {
    // Reject original error
  }
}

The Migration microservice

API Gateway is used in conjunction with Cognito to authenticate callers, but a few methods, such as our migrate_user, must remain unauthenticated. So here is the configuration of the migrate_user POST method on our API Gateway:

/migrate_user:
    post:
      produces:
      - application/json
      responses: {}
      x-amazon-apigateway-integration:
        uri: arn:aws:apigateway:<AWS_REGION>:lambda:path/2015-03-31/functions/arn:aws:lambda:<AWS_REGION>:<ACCOUNT_ID>:function:${stageVariables.FunctionName}/invocations
        httpMethod: POST
        type: aws_proxy
    options:
      consumes:
      - application/json
      produces:
      - application/json
      responses:
        200:
          description: 200 response
          schema:
            $ref: "#/definitions/Empty"
          headers:
            Access-Control-Allow-Origin:
              type: string
            Access-Control-Allow-Methods:
              type: string
            Access-Control-Allow-Headers:
              type: string
      x-amazon-apigateway-integration:
        responses:
          default:
            statusCode: 200
            responseParameters:
              method.response.header.Access-Control-Allow-Methods: "'DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT'"
              method.response.header.Access-Control-Allow-Headers: "'Content-Type,Authorization,X-Amz-Date,X-Api-Key,X-Amz-Security-Token'"
              method.response.header.Access-Control-Allow-Origin: "'*'"
        passthroughBehavior: when_no_match
        requestTemplates:
          application/json: "{\"statusCode\": 200}"
        type: mock

The implementation of migrate_user is simply added to our express-server.js, so there is no separate Lambda to manage so to speak. The function is available below, and we are going to dive into each step in detail:

app.post('/migrate_user', (req, res) => {
    // 1 -- Extract parameters from the body
    var username = req.body.username;
    var password = req.body.password;
    // etc ...

    var oldCognitoIdentityId = null;
    var cognitoIdentityId = null;
    var answer = { "status": "NO_RETRY" };

    const migrate_task = async () => {

        // 2 -- Check if migration is required
        let result = await isMigrationRequired(username, cognitoUserPoolId);
        if (result === false) return "NO_RETRY";

        // 3 -- Resolve the CognitoIdentityId of the user within the old pool
        result = await getCognitoIdentityId(username, password, oldCognitoUserPoolId, oldCognitoIdentityPoolId, oldCognitoClientId, oldCognitoRegion);
        if (result.error != null) {
            // Analyse error and return appropriate error code
            if (result.error.code === "PasswordResetRequiredException") return "NO_RETRY_PASSWORD_RESET_REQUIRED";
            else return "NO_RETRY";
        } else oldCognitoIdentityId = result.cognitoIdentityId;

        // 4 -- Extract the user's attributes to migrate from the old to the new pool
        var attributesToMigrate = await getUserAttributes(username, oldCognitoUserPoolId);

        // 5 -- Migrate user from old to new pool
        result = await migrateUser(username, password, cognitoUserPoolId, cognitoClientId, attributesToMigrate);
        if (result.error !== null) {
            // Something went wrong during the migration!
            return "NO_RETRY";
        }

        // 6 -- Resolve the CognitoIdentityId of the user within the new pool
        result = await getCognitoIdentityId(username, password, cognitoUserPoolId, cognitoIdentityPoolId, cognitoClientId, cognitoRegion);
        if (result.error !== null) {
            // Analyse error and return appropriate error code
            if (result.error.code === "PasswordResetRequiredException") return "NO_RETRY_PASSWORD_RESET_REQUIRED";
            else return "NO_RETRY";
        } else cognitoIdentityId = result.cognitoIdentityId;

        // 7 -- Migrate the user's API key
        result = await migrateApiKey(username, cognitoIdentityId, oldCognitoIdentityId);

        // 8 -- Migration complete!
        return "RETRY";
    }

    migrate_task()
        .then((value) => {
            answer.status = value;
            if (value === "RETRY") {
                res.status(200).json(answer);
            } else res.status(500).json(answer);
        })
        .catch((error) => {
            // Anything unexpected means the migration didn't happen, so report a generic failure
            answer.status = "NO_RETRY";
            res.status(500).json(answer);
        })
});

1 – Extract parameters from the body

All the data required for the migration has been passed by the application to our function via req, so we just extract it. Of course, do not log the password, or it will appear in clear text in the execution logs of your Lambda.

Note: you might wish to inject the Cognito pool information directly into the Lambda via environment variables instead of passing it in the body of the request.
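As a rough sketch of that alternative (the variable names below are invented for illustration – define them on the Lambda in your CloudFormation/SAM template), the pool configuration could be read once at module load, with only the credentials coming from the request body:

// Hypothetical environment variables set on the Lambda via its template
const cognitoConfig = {
    oldUserPoolId:     process.env.OLD_COGNITO_USER_POOL_ID,
    oldClientId:       process.env.OLD_COGNITO_CLIENT_ID,
    oldIdentityPoolId: process.env.OLD_COGNITO_IDENTITY_POOL_ID,
    oldRegion:         process.env.OLD_COGNITO_REGION,
    userPoolId:        process.env.COGNITO_USER_POOL_ID,
    clientId:          process.env.COGNITO_CLIENT_ID,
    identityPoolId:    process.env.COGNITO_IDENTITY_POOL_ID,
    region:            process.env.COGNITO_REGION
};

// Inside the /migrate_user handler, only username and password then need to come
// from req.body; everything else is read from cognitoConfig.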

2 – Check if migration is required

A migration is required only if the user does not already exist in the new pool. However, be aware that this function does not verify the existence of the user in the old pool (that check is made during step 3):

function isMigrationRequired(username, cognitoUserPoolId) {
  return new Promise((resolve, reject) => {
    var params = {
      Username: username,
      UserPoolId: cognitoUserPoolId
    };
    
    cognitoidentityserviceprovider.adminGetUser(params, function(lookup_err, data) {
      if (lookup_err) {
        if (lookup_err.code === "UserNotFoundException") {
          // User not found so migration should be attempted!
          resolve(true);
        } else {
          reject(lookup_err)  // reject any other error
        }
      } else {
        resolve(false);  // User does exist in the pool so no migration required
      }
    });
  })
};

3 – Resolve the CognitoIdentityId of the user within the old pool

Authenticate the user against the old pool using adminInitiateAuth and get their CognitoIdentityId via the getId method. This is required for the migration of the user’s API key. Of course, if the user cannot be authenticated against the old pool, they cannot be migrated, so the function returns the error straight away.

function getCognitoIdentityId(username, password, cognitoUserPoolId, cognitoIdentityPoolId, cognitoClientId, cognitoRegion) {

  var params = {
    AuthFlow: 'ADMIN_NO_SRP_AUTH',
    ClientId: cognitoClientId,
    UserPoolId: cognitoUserPoolId,
    AuthParameters: {
      USERNAME: username,
      PASSWORD: password
    }
  };

  var result = {
    "cognitoIdentityId": null,
    "error": null
  }

  return new Promise((resolve, reject) => {
    cognitoidentityserviceprovider.adminInitiateAuth(params, function(initiate_auth_err, data) {
      if (initiate_auth_err) {
        // Error during authentication of the user against the old pool so this user cannot be migrated!
        result.error = initiate_auth_err;
        resolve(result);
      } else {
        // User exists in the old pool so let's get his CognitoIdentityId
        var Logins = {};
        Logins["cognito-idp." + cognitoRegion + ".amazonaws.com/" + cognitoUserPoolId] = data.AuthenticationResult.IdToken;
        params = {
          IdentityPoolId: cognitoIdentityPoolId,
          Logins: Logins
        };
        cognitoidentity.getId(params, function(get_id_err, data) {
          // Pass any getId error back to the caller instead of failing on an undefined response
          if (get_id_err) result.error = get_id_err;
          else result.cognitoIdentityId = data.IdentityId;
          resolve(result);
        });
      }
    });
  });
}

4 – Extract the user’s attributes to migrate from the old to the new pool

Resolve the user’s attributes to migrate and force email_verified to true to avoid post-migration issues.

Note: all the attributes must be migrated except sub because this attribute is Cognito pool specific and will be created by the new pool.

function getUserAttributes(username, cognitoUserPoolId) {
  var user = null;
  var params = {
    UserPoolId: cognitoUserPoolId,
    Filter: "username = \"" + username + "\""
  };
  
  var result = [];

  return new Promise((resolve, reject) => {
    cognitoidentityserviceprovider.listUsers(params, function(list_err, data) {
      if (list_err) console.log("Error while listing users using " + params + ": " + list_err.stack);
      else {
        data.Users[0].Attributes.map(function(attribute) {
          if (attribute.Name === 'email_verified') {
            attribute.Value = 'true';
          }
          if (attribute.Name !== 'sub') result.push(attribute);
        });
      }

      resolve(result);
    });
  });
}

5 – Migrate user from old to new pool

Our user is now ready to be migrated! So let’s use the admin features of Cognito (adminCreateUser, adminInitiateAuth and adminRespondToAuthChallenge) to create the user, authenticate them, and set their password.

function migrateUser(username, password, cognitoUserPoolId, cognitoClientId, attributesToMigrate) {
  var params = {
    UserPoolId: cognitoUserPoolId,
    Username: username,
    MessageAction: 'SUPPRESS', //suppress the sending of an invitation to the user
    TemporaryPassword: password,
    UserAttributes: attributesToMigrate
  };
  
  var result = {
    "error": null
  }
  
  return new Promise((resolve, reject) => {
    cognitoidentityserviceprovider.adminCreateUser(params, function(create_err, data) {
      if (create_err) {
        result.error = create_err;
        resolve(result);
      } else {
        // Now sign in the migrated user to set the permanent password and confirm the user
        params = {
          AuthFlow: 'ADMIN_NO_SRP_AUTH',
          ClientId: cognitoClientId,
          UserPoolId: cognitoUserPoolId,
          AuthParameters: {
            USERNAME: username,
            PASSWORD: password
          }
        };
        cognitoidentityserviceprovider.adminInitiateAuth(params, function(initiate_auth_err, data) {
          if (initiate_auth_err) {
            result.error = initiate_auth_err;
            resolve(result);
          } else {
            // Handle the response to set the password (confirm the challenge name is NEW_PASSWORD_REQUIRED)
            if (data.ChallengeName !== "NEW_PASSWORD_REQUIRED") {
              result.error = new Error("Unexpected challenge name after adminInitiateAuth [" + data.ChallengeName + "], migrating user created, but password not set")
              resolve(result)
              return  // don't attempt the challenge response with an unexpected challenge
            }

            params = {
              ChallengeName: "NEW_PASSWORD_REQUIRED",
              ClientId: cognitoClientId,
              UserPoolId: cognitoUserPoolId,
              ChallengeResponses: {
                "NEW_PASSWORD": password,
                "USERNAME": data.ChallengeParameters.USER_ID_FOR_SRP
              },
              Session: data.Session
            };
            cognitoidentityserviceprovider.adminRespondToAuthChallenge(params, function(respond_err, data) {
              if (respond_err) {
                result.error = respond_err;
              }

              resolve(result)
            });
          }
        });
      }
    });
  });
}

6 – Resolve the CognitoIdentityId of the user within the new pool

Our user is now created within the new pool, so let’s resolve their CognitoIdentityId, which is required for migrating their API key.

7 – Migrate user’s API key

Migrate the user’s API key by renaming it to the user’s new CognitoIdentityId resolved during step 6.

function migrateApiKey(username, cognitoIdentityId, oldCognitoIdentityId) {
  var params = {
    nameQuery: oldCognitoIdentityId
  };

  return new Promise((resolve, reject) => {
    apigateway.getApiKeys(params, function(get_key_err, data) {
      if (get_key_err || data.items.length === 0) {
        // No API key found under the old name, so there is nothing to migrate
        resolve(false);
        return;
      }
      // The id of the existing key named after the old CognitoIdentityId
      var apiKeyId = data.items[0].id;
      params = {
        apiKey: apiKeyId,
        patchOperations: [{
          op: "replace",
          path: "/name",
          value: cognitoIdentityId
        }, {
          op: "replace",
          path: "/description",
          value: "Dev Portal API Key for " + cognitoIdentityId
        }]
      };
      // Update API key name and description to reflect the new CognitoIdentityId
      apigateway.updateApiKey(params, function(update_err, data) {
        console.log("API key (id: [" + apiKeyId + "]) updated successfully");
        resolve(true)
      });
    });
  })
}

8 – Migration complete, so return RETRY to indicate success

The migration is now complete, so we return the RETRY status, indicating to the application that the user should be logged in again automatically.

Conclusion

By leveraging AWS serverless technologies we were able to fully handle the migration of our client’s application users at the backend level. The customer was happy with this solution as it avoided asking users to reset their passwords, and it realigned production with staging.

It’s implementing solutions like this that helps set Consegna apart from other cloud consultancies — we are a true technology partner and care deeply about getting outcomes for customers that align with their business goals, not just looking after our bottom line.

What is your digital waste footprint?

How many times have you walked into your garage and taken stock of all the things you haven’t used in years? Those bikes that you bought for you and your partner that you haven’t used since the summer of ‘09, the fishing rods, the mitre saw, the boat (if you’re lucky) – the list goes on and on. Imagine if you didn’t have to pay for them all up front – and better yet, imagine if you could stop paying for them the moment you stopped using them!

Amazingly, that is the world we live in with the public cloud. If you’re not using something, then you shouldn’t be paying for it – and if you are, then you need to ask yourself some hard questions. The problem we’re seeing in customer-land is twofold:

  1. Technical staff are too far removed from whoever pays the bills, and
  2. It’s easier than ever to start new resources that cost money

Technical staff don’t care about the bill

Many technical staff that provision resources and use services on AWS have no idea what they cost and have never seen an invoice or the billing dashboard. They don’t pay the bills, so why would they worry about what it costs?

Working with technical staff and raising awareness around the consequences of their choices in the public cloud goes a long way to arresting the free-fall into an unmanageable hosting bill. By bringing the technical staff along on the optimisation journey, you’re enabling them to align themselves with business goals and feel the choices they make are contributing in a positive way.

It’s so easy to create new resources

One of the biggest strengths of the public cloud is how easy it is to provision resources or enable services, however this appears to be one of its weaknesses as well. It’s because of this ease of use that time and time again we see serious account sprawl: unused, underutilised and over-sized resources litter the landscape, nobody knows how much Project A costs compared to Project B and there isn’t a clear plan to remediate the wastage and disarray.

Getting a handle on your hosting costs is an important step to take early on and implementing a solid strategy to a) avoid common cost related mistakes and b) be able to identify and report on project costs is crucial to being successful in your cloud journey.

Success stories

Consegna has recently engaged two medium-to-large sized customers and challenged them to review the usage of their existing AWS services and resources with a view to decreasing their monthly cloud hosting fees. By working with Consegna as an AWS partner and focusing on the following areas, one customer decreased their annual bill by NZD$500,000 and the other by NZD$100,000. By carefully analysing the following areas of your cloud footprint, you should also be able to significantly reduce your digital waste footprint.

Right-sizing and right-typing

Right-sizing your resources is generally the first step you’ll take in your optimisation strategy. This is because you can make other optimisation decisions that are directly related to the size of your existing resources, and if they aren’t the right size to begin with then those decisions will be made in error.

Right-typing can also help reduce costs if you’re relying on capacity in one area of your existing resource type that can be found in a more suitable resource type. It’s important to have a good idea of what each workload does in the cloud, and to make your decisions based on this instead of having a one-size-fits all approach.

Compute

Right-sizing compute can be challenging if you don’t have appropriate monitoring in place. When making right-sizing decisions there are a few key metrics to consider, but the main two are CPU and RAM. Because of the shared responsibility model that AWS adheres to, it doesn’t have access to RAM metrics on your instances out-of-the-box, so to get a view on this you need to use third-party software.

Consegna has developed a cross-platform custom RAM metric collector that ships to CloudWatch and has configured a third-party integration to allow CloudCheckr to consume the metrics to provide utilisation recommendations. Leveraging the two key metrics, CPU and RAM, allows for very accurate recommendations and deep savings.
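As a rough illustration of the underlying mechanism (the namespace, metric name and instance id below are placeholders, not the actual collector Consegna ships), publishing a custom memory metric to CloudWatch can be as simple as:

$ aws cloudwatch put-metric-data --namespace "Custom/Memory" --metric-name MemoryUtilization --unit Percent --value 62.5 --dimensions InstanceId=i-0123456789abcdef0

Once metrics like this are flowing, any tooling that reads CloudWatch can combine them with the built-in CPU metrics to make utilisation recommendations.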

Storage

Storage is an area that gets overlooked regularly which can be a costly mistake. It’s important to analyse the type of data you’re storing, how and how often you’re accessing it, where it’s being stored and how important it is to you. AWS provides a myriad of storage options and without careful consideration of each, you can miss out on substantial decreases of your bill.

Database

Right-sizing your database is just as important as right-sizing your compute – for the same reasons there are plenty of savings to be had here as well.

Right-typing your database can also be an interesting option to look at as well. Traditional relational databases appear to be becoming less and less popular with new serverless technologies like DynamoDB – but it’s important to define your use case and provision resources appropriately.

It’s also worth noting that AWS have recently introduced serverless technologies to their RDS offering which is an exciting new prospect for optimisation aficionados.

Instance run schedules

Taking advantage of not paying for resources when they’re not running can make a huge difference to how much your bill is, especially if you have production workloads that don’t need to be running 24/7. Implementing a day / night schedule can reduce your bill by 50% for your dev / test workloads.

Consegna takes this concept to the next level by deploying a portal that lets non-technical users control when the instances they deal with day-to-day are running or stopped. By pushing this responsibility out to the end users, instances that would have been running 12 hours a day on a rigid schedule now only run for as long as they’re needed – an hour or two, usually – supercharging the savings.

Identify and terminate unused and idle resources

If you’re not using something then you should ask yourself if you really need it running, or whether or not you could convert it to an on-demand type model.

This seems like an obvious one, but the challenge can actually be around identification – there are plenty of places resources can hide in AWS so being vigilant and using the help of third party software can be key to aid you in this process.

Review object storage policies

Because object storage in AWS (S3) is so affordable, it’s easy to ignore it and assume there aren’t many optimisations to be made in this area. That can be a costly oversight: not only is the type of storage you’re using important, but so is how frequently you need to access the data.

Lifecycle policies on your object storage are a great way to automate rolling infrequently used data into cold storage, and they’re low-hanging fruit you can nab early in your optimisation journey.
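
A lifecycle rule is only a few lines of configuration. The sketch below, with a placeholder bucket name, prefix and day thresholds, transitions objects to Standard-IA after 30 days, to Glacier after 90, and expires them after a year.

    # Minimal lifecycle policy sketch; bucket name, prefix and thresholds are placeholders.
    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="example-archive-bucket",
        LifecycleConfiguration={
            "Rules": [{
                "ID": "roll-logs-to-cold-storage",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }],
        },
    )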

Right-type pricing tiers

AWS offers a robust range of pricing tiers for a number of its services, and by identifying and leveraging the correct tiers for your usage patterns you can make some substantial savings. In particular, consider Reserved Instances for production resources you know will be around for the long haul, and potentially Spot Instances for dev / test workloads that can tolerate interruption.
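
If you’re unsure where to start, Cost Explorer will generate Reserved Instance recommendations from your own usage history. A hedged sketch, with an illustrative term, payment option and lookback window:

    # Sketch: pull RI purchase recommendations from Cost Explorer (parameters illustrative).
    import boto3

    ce = boto3.client("ce")

    response = ce.get_reservation_purchase_recommendation(
        Service="Amazon Elastic Compute Cloud - Compute",
        LookbackPeriodInDays="SIXTY_DAYS",
        TermInYears="ONE_YEAR",
        PaymentOption="NO_UPFRONT",
    )

    for recommendation in response.get("Recommendations", []):
        for detail in recommendation.get("RecommendationDetails", []):
            # Field layout varies by service, so print the whole detail rather than
            # assuming a particular shape.
            print(detail)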

Of course, there are other pricing tiers in other services that are worth considering.

Going Cloud Native

AWS offers many platform-as-a-service offerings which take care of a lot of the time-consuming, day-to-day operational management. Using these offerings as a default, instead of managing your own infrastructure, can provide some less tangible optimisation benefits.

Your operations staff won’t be bogged down with patching and keeping the lights on – they’ll be freed up to innovate and explore the new and exciting technologies that AWS are constantly developing and releasing to the public for consumption.

Consegna consistently works with its technology and business partners to bake this optimisation process into all cloud activities. By thinking of ways to optimise and be efficient first, both hosting-related savings and operational savings are achieved proactively rather than reactively.

Slow DoS attack mitigation — a Consegna approach.

Recently we discovered that a customer’s website was being attacked in what is best described as a “slow DoS”. The attacker was running a script that scraped each page of the site to find possible PDF files to download, then was initiating many downloads of each file.

Because the site was fronted by a Content Delivery Network (CDN), the site itself was fine and experienced no increase in load or service disruption, but it did cause a large spike in bandwidth usage between the CDN and the clients. The increase in bandwidth was significant enough to increase the monthly charge from around NZ$1,500 to over NZ$5,000. Every time the customer banned the IP address that was sending all the requests, a new IP would appear to replace it. It seems the point of the attack was to waste bandwidth and cost our customer money — and it was succeeding.

The site itself was hosted in AWS on an EC2 instance; however, the CDN service the site was using was a third party, Fastly. After some investigation, it seemed that Fastly didn’t have any automated mitigation features that would stop this attack. Knowing that AWS Web Application Firewall (WAF) has built-in rate-based rules, we decided to investigate whether we could migrate the CDN to CloudFront and make use of these rules.

All we needed to do was create a CloudFront distribution with the same behaviour as the Fastly one, then point the DNS records to CloudFront; easy, right? Fastly has a neat feature that allows you to redirect at the edge, which was being used to redirect the apex domain to the www subdomain. If we were to replicate this behaviour in CloudFront we would need some extra help, but first we needed to make sure we could make the required DNS changes.

Pointing a domain that is managed by Route 53 at CloudFront is easy: you just set an ALIAS record on the apex domain and a CNAME on the www subdomain. However, this customer’s DNS was managed by a third-party provider who they were committed to sticking with (that’s a blog post for another day). The third-party provider did not support ALIAS or ANAME records and insisted that apex domains could only have A records, which meant we could only use IP addresses.

Because CloudFront has so many edge locations (108 at the time of writing), it wasn’t practical to get a list of all of them and set 108 A records, plus this would require activating the “static IP” feature of CloudFront, which gives you a dedicated IP for each edge location and costs around NZ$1,000 a month.

And to top all that off, whatever solution we decided to use would only be in place for 2 months as the site was being migrated to a fully managed service. We needed a solution that would be quick and easy to implement — AWS to the rescue!

So, we had three choices:

  1. Stay with Fastly and figure out how to ban the bad actors
  2. Move to CloudFront and figure out the redirect (bearing in mind we only had A records to work with)
  3. Do nothing and incur the NZ$5,000 cost each month — high risk if the move to a managed service ended up being delayed. We decided this wasn’t really an option.

We considered spinning up a reverse proxy and pointing the apex domain at it to redirect to the www subdomain (remember, we couldn’t use an S3 bucket because we could only set A records) but decided against this approach because we’d need to make the reverse proxy scalable given we’d be introducing it in front of the CDN during an ongoing DoS attack. Even though the current attack was slow, it could have easily been changed into something more serious.

We decided to stay with Fastly and figure out how to automatically ban IP addresses that were doing too many requests. Aside from the DNS limitation, one of the main drivers for this decision was inspecting the current rate of the DoS — it was so slow that it was below the minimum rate-based rule configuration that the AWS WAF allows (2,000 requests in 5 minutes). We needed to write our own rate-based rules anyway, so using CloudFront and WAF didn’t solve our problems straight away.

Thankfully, Fastly had an API that we could hit with a list of bad IPs, so all we needed to figure out was how to:

  1. Get access to the Fastly logs,
  2. Parse the logs and count the number of requests,
  3. Auto-ban the bad IPs.

Because Fastly allows log shipping to S3 buckets, we configured it to ship to our AWS account in a log format that could be easily consumed by Athena, and wrote a set of AWS Lambda functions (there’s a simplified sketch of the core loop after this list) that:

  1. Queried the Fastly logs S3 bucket using Athena,
  2. Inspected the logs and banned bad actors by hitting the Fastly API, maintaining state in DynamoDB,
  3. Built a report of bad IPs and ISPs and generated a complaint email.
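
The heart of it is a single query-and-ban loop. The sketch below is a simplified illustration rather than the production code: the Athena database, table and column names, the request threshold, and the exact Fastly ACL endpoint and payload are all assumptions (check Fastly’s API docs before reuse), and the DynamoDB state tracking and reporting steps are omitted.

    # Simplified sketch of the query-and-ban loop; names, IDs and endpoints are assumptions.
    import time

    import boto3
    import requests

    athena = boto3.client("athena")

    FASTLY_API_KEY = "REPLACE_ME"      # placeholder credentials / IDs
    FASTLY_SERVICE_ID = "REPLACE_ME"
    FASTLY_ACL_ID = "REPLACE_ME"
    THRESHOLD = 500                    # requests per hour, illustrative

    QUERY = """
    SELECT client_ip, COUNT(*) AS hits
    FROM fastly_logs
    WHERE request_time > now() - interval '1' hour
    GROUP BY client_ip
    HAVING COUNT(*) > {threshold}
    """.format(threshold=THRESHOLD)

    def run_athena_query():
        execution = athena.start_query_execution(
            QueryString=QUERY,
            QueryExecutionContext={"Database": "cdn_logs"},  # assumed database name
            ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
        )
        query_id = execution["QueryExecutionId"]
        while True:
            state = athena.get_query_execution(
                QueryExecutionId=query_id
            )["QueryExecution"]["Status"]["State"]
            if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
                break
            time.sleep(2)
        if state != "SUCCEEDED":
            raise RuntimeError("Athena query did not succeed: " + state)
        rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
        # The first row is the header; each data row holds [client_ip, hits].
        return [row["Data"][0]["VarCharValue"] for row in rows[1:]]

    def ban_ip(ip_address):
        # Assumed shape of the Fastly ACL entry call.
        response = requests.post(
            f"https://api.fastly.com/service/{FASTLY_SERVICE_ID}/acl/{FASTLY_ACL_ID}/entry",
            headers={"Fastly-Key": FASTLY_API_KEY},
            json={"ip": ip_address, "comment": "auto-banned: request rate exceeded"},
        )
        response.raise_for_status()

    def handler(event, context):
        for ip_address in run_athena_query():
            ban_ip(ip_address)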

The deployed solution looked something like this: Fastly ships its access logs to an S3 bucket, a scheduled Lambda queries them with Athena, offending IPs are banned via the Fastly API, state is tracked in DynamoDB, and a summary report feeds the complaint emails.

By leveraging S3, Athena, Lambda and DynamoDB we were able to deploy a fully serverless, rate-based auto-banner for bad actors with a very short turnaround. The customer was happy with this solution as it avoided the NZ$5,000 / month cost, avoided changes to the existing brittle DNS setup, and provided some valuable exposure to how powerful serverless technology on AWS is.

It’s solutions like this that help set Consegna apart from other cloud consultancies: we are a true technology partner and care deeply about getting outcomes for customers that align with their business goals, not just looking after our own bottom line.

Busy-ness And Innovation. How Hackathons Can Successfully Promote New Ideas And Thinking

At Consegna we pride ourselves on our experience and knowledge. Our recently appointed National Client Manager has a knack for knowing virtually everybody in Wellington. This can be amusing if you’re with him, as a 10-minute walk down Lambton Quay can take half an hour; we now tend to leave gaps that long between meetings. One of the questions he asks people is one we probably all ask: “how’s business?” The answer is always the same: “busy!”

Now sometimes that can just be a stock answer, but if it’s true and we’re all so busy, then what are we busy doing? We would like to think we’re doing super innovative work, but how much time do we actually dedicate to innovation? Most of the time we’re busy just trying to complete projects. Google famously scrapped their “20% time” dedicated to innovation because managers were judged on the productivity of their teams and were concerned about falling behind on projects where only 80% of capacity was used. Yet the flipside is that “20% time” formed the genesis of Gmail, AdSense and Google Talk.

So this leads to the question: how can we continue to create when we’re all so busy? We have to be innovative about being innovative. One solution organisations have landed on is the hackathon: a short period of time, usually around 48 hours, in which an organisation stops what it’s doing and works on exciting new things that would traditionally be out of scope or “nice to haves”.

Consegna was selected as the University of Auckland’s cloud enablement partner in 2017 and was recently involved with the University’s cloud centre of excellence team, who wanted to promote and facilitate innovation within their internal teams. One answer was to host a hackathon with the University’s internal teams. In total there were 17 teams of 2-6 people, tasked with trying to solve operational issues within the University, and there were a number of reasons we jumped at the chance to help.

First and foremost, we liked the idea of two days being used to build, break and create. We were there as the AWS experts to help get the boat moving, but deep down we’re all engineers and developers who like to roll up our sleeves and get on the tools.

Secondly, after watching larger companies like Facebook explain why they use hackathons, the idea resonated with the team at Consegna: “Prototypes don’t need to be polished, they just need to work at the beginning”. Facebook Chat came out of a 2007 hackathon, which evolved into Web Chat, then Mobile Chat, which then became the Facebook Messenger we know today. The end result of a hackathon doesn’t need to be pretty, but its potential can know no bounds. Consegna was able to help the University of Auckland build something whose success was judged not on immediate outcomes, but on its potential to do amazing things.

Hackathons can also be incredibly motivating for staff. For those of you familiar with Daniel Pink’s book “Drive”, you’ll know money doesn’t always motivate staff the way we traditionally thought it did. Autonomy, Mastery and Purpose are key parts of Pink’s thinking, and also the key to a successful hackathon. The business is not telling you how to do something, and in a lot of cases the guidelines are so fluid that they’re not telling you what to build either. A guiding question for a successful hackathon can be succinctly put as “what do you think?”

We applaud Auckland University for inviting us along to let their staff pick our brains and learn new things. Want to know how to program Alexa? No problem! How can I collect, process and analyse streaming data in real time? Talk to one of our lead DevOps Architects and they’ll introduce you to Kinesis. There was as much learning and teaching going on at the hackathon as there was building of some amazing applications.

It’s important to be aware of the potential pitfalls of hosting a hackathon and to be mindful you don’t fall into the traps. You want to make sure you get the most out of your hackathon, that everybody enjoys it and that there are plenty of takeaways; otherwise, bluntly, what’s the point?

Hackathons, as a general rule, don’t allow for access to customers. If you’re dedicating just 48 hours to solving a problem, how can you build understanding and empathy for customers who aren’t around? If they’re not there to give feedback so you can iterate, can you even build it? Auckland University got around that problem by largely building for themselves; they were the customer, and they could interview each other for feedback. So if you think a hackathon will deliver a solution for customer-facing applications, you might want to either make customers available for further enhancements outside of the hackathon, or consider a different innovation approach altogether.

Earlier I mentioned how Facebook use hackathons to inspire innovation, but there is one downside: theirs are unpaid and run on weekends. Employees can feel obligated to participate to please the hierarchy, which amounts to forcing innovation out of staff. Generally, this is not going to go well. The driving factors Pink mentions are far less prevalent if staff are doing it in their own time, usually without pay, to build something the business owns. From what I’ve seen Facebook put on some amazing food for their hackathons, but so what? Value your staff, value the work they do and don’t force them to do it in their own time. Auckland University’s hackathon was scheduled over two working days, in work time, and made sure staff felt valued for building what was ultimately a tool for the University.

Over the two days the teams presented their projects, and there were some amazing outcomes. MiA, the winner, displayed a QR code on a big screen so students could snap it and register their attendance at each lecture, with an eye on Rekognition as the next iteration, using image recognition to measure attendance. Generally speaking there’s not a whole lot wrong with how the University currently measures attendance with good old-fashioned pen and paper, but how accurate is it when humans are involved, and how cool is it for an AWS service to take the whole problem away?

In all, it was an amazing event to be a part of and we must thank Auckland University for inviting us. Also thanks to AWS for providing a wealth of credits so there was no cost barrier to building and creating.

If you’re thinking of hosting your own hackathon, I wish you well. It’s a great way to take a break from being busy and focus on creating new tools with huge potential. It will let your staff scratch the learning and creative itch they’ve been aching to get at.

Most importantly, always take time and make strategies to keep innovating. Don’t be too busy.

How to Save Big Dollars on Direct Connect Costs

Our COO Michael Butler recently published an entertaining and insightful look into how you can save yourself time and money on Direct Connects.

“Having spent many years in the instant gratification world of AWS, I find myself getting (unreasonably) frustrated when the IT services I want to use don’t align with the commercial models of public cloud.”

Find out what Mike discovered in “Software Defined Networking & the confessions of a network engineer in the cloud” and how you can benefit from his insights.