NZTA All-Access Hackathon

Consegna and AWS are proud to be sponsoring the NZTA Hackathon again this year. The event will be held the weekend of 21 – 23 September.

Last year’s event, Save One More Life, was a huge success and the winner’s concept has been used to help shape legislation in order to support its adoption nationally.
The information session held in the Auckland NZTA Innovation space last night provided great insight into this year’s event, which focuses on accessible transport options: making transport more accessible to everyone, especially those without access to their own car, people with disabilities, and others in the community who are isolated by limited transport options.

The importance of diversity among the teams was a strong theme during the evening. For this event in particular, diverse teams are going to have an edge, as Luke Krieg, Senior Manager of Innovation at NZTA, pointed out: “Data can only tell you so much about a situation. It’s not until you talk to people that the real insights appear – and what the data alone doesn’t reveal also becomes evident.”

Jane Strange, CX Improvement Lead at NZTA, illustrated this point nicely with a bell curve showing users at each extreme of the transport accessibility spectrum.

Those on the right, with high incomes, urban locations, and proximity to a choice of transport options, invariably define transport policy for those on the left of the curve: people with lower incomes in suburban or rural areas who are typically more isolated and have fewer transport options.

Luke also stressed how much more successful diverse teams usually are at hackathons. As these are time-boxed events that require a broad spectrum of skills, technology in and of itself often doesn’t win out; diverse skills are essential to a winning team.

For more information and to register for the event, please visit https://nzta.govt.nz/all-access

 

Cognito User Pool Migration

At Consegna, we like AWS and its services, which are backed by a solid body of documentation, blog posts and best practices. Because it is easy to find open-source, production-ready code on GitHub, it is straightforward to deploy new applications quickly and at scale. However, moving too fast can sometimes lead to painful problems over time!

Deploying the AWS Serverless Developer Portal from GitHub straight to production works perfectly well. However, hardcoded values within the templates make it complicated to deploy multiple similar environments within the same AWS account. Introducing some parameterisation is usually the way to solve that problem, but it leaves you with a production stack that is not aligned with the staging environments, which is, of course, not a best practice…

This blog post describes the solution we have implemented to solve the challenge of migrating Cognito users from one pool to another at scale. The extra step of migrating the API keys associated with those users is also covered in this post.

The Technology Stack

The deployed stack involves AWS serverless technologies such as Amazon API Gateway, AWS Lambda, and Amazon Cognito. It is assumed in this blog post that you are familiar with those AWS services but we encourage you to check out the AWS documentation or to contact Consegna for more details.

The Challenge

The main challenge is to migrate Cognito users and their API keys at scale without any downtime or requiring any password resets from the end users.

The official AWS documentation describes two ways of migrating users from one user pool to another:

  1. Migrate users when they sign in to Amazon Cognito for the first time, using a user migration Lambda trigger (a minimal sketch of such a trigger follows this list). With this approach, users can continue using their existing passwords and will not have to reset them after the migration to your user pool.
  2. Migrate users in bulk by uploading a CSV file containing the user profile attributes for all users. With this approach, users will be required to reset their passwords.
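
For reference, the first approach relies on a user migration Lambda trigger attached to the new user pool. The sketch below is a minimal illustration only; the environment variable names are our own placeholders, and a real implementation would also copy the user’s attributes across from the old pool:

// Sketch of a Cognito user migration Lambda trigger (approach 1 above).
// OLD_USER_POOL_ID and OLD_CLIENT_ID are placeholder environment variables.
const AWS = require('aws-sdk');
const cognitoIdp = new AWS.CognitoIdentityServiceProvider();

exports.handler = async (event) => {
  if (event.triggerSource === 'UserMigration_Authentication') {
    // Validate the submitted credentials against the old pool; if this call throws,
    // Cognito rejects the sign-in and no migration happens.
    await cognitoIdp.adminInitiateAuth({
      AuthFlow: 'ADMIN_NO_SRP_AUTH',
      UserPoolId: process.env.OLD_USER_POOL_ID,
      ClientId: process.env.OLD_CLIENT_ID,
      AuthParameters: { USERNAME: event.userName, PASSWORD: event.request.password }
    }).promise();

    // Copy any attributes you need from the old pool here (e.g. via adminGetUser),
    // then tell Cognito to create the user silently in the new pool.
    event.response.userAttributes = { email_verified: 'true' };
    event.response.finalUserStatus = 'CONFIRMED';
    event.response.messageAction = 'SUPPRESS';
  }
  return event;
};

As explained below, this trigger alone does not cover the API key migration our portal also needs.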

We discarded the second option as we did not want our users to “pay” for this backend migration. We therefore used the following AWS blog article as a starting point, while keeping in mind that it does not cover the entire migration we need to implement. Indeed, by default, an API key is created for every user who registers on the portal. The key is stored in API Gateway and is named after the user’s CognitoIdentityId attribute, which is specific to each user within a particular Cognito user pool.

The Solution

The Migration Flow

The following picture represents our migration flow with the extra API key migration step.


Migration Flow

Notes

  1. The version of our application currently deployed in production does not support the Forgot my password flow, so we did not implement it in our migration flow (but we should and will).
  2. When a user registers, they must submit a verification code to gain access to their API key. In the very unlikely situation where a user has registered against the current production environment without confirming their email address, the migration microservice will migrate the user and automatically confirm their email address. Based on the number of users and the low probability of this scenario, we considered it an acceptable risk; however, it might be different for your application.

The Prerequisites

In order to successfully implement the migration microservice, you first need to grant some IAM permissions and to modify the Cognito user pool configuration.

  1. You must grant your migration Lambda function the following permissions (feel free to restrict those permissions to specific Cognito pools using
    arn:${Partition}:cognito-idp:${Region}:${Account}:userpool/${UserPoolId}):
- Action:
  - apigateway:GetApiKeys
  - apigateway:UpdateApiKey
  - cognito-identity:GetId
  - cognito-idp:AdminInitiateAuth
  - cognito-idp:AdminCreateUser
  - cognito-idp:AdminGetUser
  - cognito-idp:AdminRespondToAuthChallenge
  - cognito-idp:ListUsers
Effect: Allow
Resource: "*"
  2. On both Cognito pools (the one you are migrating from and the one you are migrating to), enable the Admin Authentication Flow (ADMIN_NO_SRP_AUTH) to allow server-based authentication by the Lambda function executing the migration. You can do this via the Management Console or with the following AWS CLI command:
aws cognito-idp update-user-pool-client \
    --user-pool-id <value> \
    --client-id <value> \
    --explicit-auth-flows ADMIN_NO_SRP_AUTH

More details about the Admin Authentication Flow are available here.

You are all set. Let’s get our hands dirty!

The Implementation (in JS)

At the Application Layer

To allow a smooth migration for our users, the onFailure handler of the login method should call our migration microservice instead of returning the original error to the user. An unauthenticated API Gateway client is initialised to call the migrate_user method on our API Gateway. The result returned by the backend is straightforward: RETRY indicates a successful migration, so the application must log the user in again automatically; otherwise it must handle the authentication error (user does not exist, username or password incorrect, and so on).

onFailure: (err) => {
  // Save the original error to make sure to return appropriate error if required...
  var original_err = err;

  // Attempt migration only if old Cognito pool exists and if the original error is 'User does not exist.'
  if (err.message === 'User does not exist.' && oldCognitoUserPoolId !== '') {
    initApiGatewayClient()  // Initialize an unauthenticated API Gateway client
    
    var body = {
      // Prepare the body for the request for all required information such as
      // username, password, old and new Cognito pool information
    }
    
    // Let's migrate your user!
    apiGatewayClient.post("/migrate_user", {}, body, {}).then((result) => {
      resolve(result);
      if (result.data.status === "RETRY") {  // Successful migration!
        // user can now login!
      } else {
          // Oh no, status is not RETRY...
          // Check the error code and display appropriate error message to the user
      } 
    }).catch((err) => {
      // Handle err returned by migrate_user or return original error
    });
  } else {
    // Reject original error
  }
}

The Migration microservice

API Gateway is used in conjunction with Cognito to authenticate callers, but a few methods, such as our migrate_user, must remain unauthenticated. Here is the configuration of the migrate_user POST method on our API Gateway:

/migrate_user:
    post:
      produces:
      - application/json
      responses: {}
      x-amazon-apigateway-integration:
        uri: arn:aws:apigateway:<AWS_REGION>:lambda:path/2015-03-31/functions/arn:aws:lambda:<AWS_REGION>:<ACCOUNT_ID>:function:${stageVariables.FunctionName}/invocations
        httpMethod: POST
        type: aws_proxy
    options:
      consumes:
      - application/json
      produces:
      - application/json
      responses:
        200:
          description: 200 response
          schema:
            $ref: "#/definitions/Empty"
          headers:
            Access-Control-Allow-Origin:
              type: string
            Access-Control-Allow-Methods:
              type: string
            Access-Control-Allow-Headers:
              type: string
      x-amazon-apigateway-integration:
        responses:
          default:
            statusCode: 200
            responseParameters:
              method.response.header.Access-Control-Allow-Methods: "'DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT'"
              method.response.header.Access-Control-Allow-Headers: "'Content-Type,Authorization,X-Amz-Date,X-Api-Key,X-Amz-Security-Token'"
              method.response.header.Access-Control-Allow-Origin: "'*'"
        passthroughBehavior: when_no_match
        requestTemplates:
          application/json: "{\"statusCode\": 200}"
        type: mock

The implementation of migrate_user is simply added to our express-server.js, so there is no separate Lambda function to manage, so to speak. The function is shown below, and we will dive into each step in detail:

app.post('/migrate_user', (req, res) => {
    // 1 -- Extract parameters from the body
    var username = req.body.username;
    var password = req.body.password;
    // etc ...

    var oldCognitoIdentityId = null;
    var cognitoIdentityId = null;
    var answer = { "status": "NO_RETRY" };

    const migrate_task = async () => {

        // 2 -- Check if migration is required
        let result = await isMigrationRequired(username, cognitoUserPoolId);
        if (result === false) return "NO_RETRY";

        // 3 -- Resolve the CognitoIdentityId of the user within the old pool
        result = await getCognitoIdentityId(username, password, oldCognitoUserPoolId, oldCognitoIdentityPoolId, oldCognitoClientId, oldCognitoRegion);
        if (result.error != null) {
            // Analyse error and return appropriate error code
            if (result.error.code === "PasswordResetRequiredException") return "NO_RETRY_PASSWORD_RESET_REQUIRED";
            else return "NO_RETRY";
        } else oldCognitoIdentityId = result.cognitoIdentityId;

        // 4 -- Extract the user's attributes to migrate from the old to the new pool
        var attributesToMigrate = await getUserAttributes(username, oldCognitoUserPoolId);

        // 5 -- Migrate user from old to new pool
        result = await migrateUser(username, password, cognitoUserPoolId, cognitoClientId, attributesToMigrate);
        if (result.error !== null) {
            // Something went wrong during the migration!
            return "NO_RETRY";
        }

        // 6 -- Resolve the CognitoIdentityId of the user within the new pool
        result = await getCognitoIdentityId(username, password, cognitoUserPoolId, cognitoIdentityPoolId, cognitoClientId, cognitoRegion);
        if (result.error !== null) {
            // Analyse error and return appropriate error code
            if (result.error.code === "PasswordResetRequiredException") return "NO_RETRY_PASSWORD_RESET_REQUIRED";
            else return "NO_RETRY";
        } else cognitoIdentityId = result.cognitoIdentityId;

        // 7 -- Migrate the user's API key
        result = await migrateApiKey(username, cognitoIdentityId, oldCognitoIdentityId);

        // 8 -- Migration complete!
        return "RETRY";
    }

    migrate_task()
        .then((value) => {
            answer.status = value;
            if (value === "RETRY") {
                res.status(200).json(answer);
            } else res.status(500).json(answer);
        })
        .catch((error) => {
            console.error("Migration failed:", error);  // answer.status remains "NO_RETRY"
            res.status(500).json(answer);
        })
});
1 – Extract parameters from the body

All the data required for the migration is passed by the application to our function via req, so we just extract it. Of course, do not log the password or it will appear in clear text in the execution logs of your Lambda.

Note: you might wish to inject the Cognito pool information directly into the Lambda via environment variables instead of passing it in the body of the request.
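
A minimal sketch of that alternative, with purely illustrative variable names, would be:

// Read the pool configuration from the Lambda environment instead of the request body.
const cognitoUserPoolId    = process.env.COGNITO_USER_POOL_ID;
const cognitoClientId      = process.env.COGNITO_CLIENT_ID;
const oldCognitoUserPoolId = process.env.OLD_COGNITO_USER_POOL_ID;
const oldCognitoClientId   = process.env.OLD_COGNITO_CLIENT_ID;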

2 – Check if migration is required

A migration is indicated as required only if the user does not already exist in the new pool. However, be aware that this function does not verify that the user exists in the old pool (that check is made during step 3):

function isMigrationRequired(username, cognitoUserPoolId) {
  return new Promise((resolve, reject) => {
    var params = {
      Username: username,
      UserPoolId: cognitoUserPoolId
    };
    
    cognitoidentityserviceprovider.adminGetUser(params, function(lookup_err, data) {
      if (lookup_err) {
        if (lookup_err.code === "UserNotFoundException") {
          // User not found so migration should be attempted!
          resolve(true);
        } else {
          reject(lookup_err)  // reject any other error
        }
      } else {
        resolve(false);  // User does exist in the pool so no migration required
      }
    });
  })
};
3 – Resolve the CognitoIdentityId of the user within the old pool

Authenticate the user against the old pool using adminInitiateAuth and get their CognitoIdentityId via the getId method; this is required for the migration of the user’s API key. Of course, if the user cannot be authenticated against the old pool, they cannot be migrated, so the function returns the error straight away.

function getCognitoIdentityId(username, password, cognitoUserPoolId, cognitoIdentityPoolId, cognitoClientId, cognitoRegion) {

  var params = {
    AuthFlow: 'ADMIN_NO_SRP_AUTH',
    ClientId: cognitoClientId,
    UserPoolId: cognitoUserPoolId,
    AuthParameters: {
      USERNAME: username,
      PASSWORD: password
    }
  };

  var result = {
    "cognitoIdentityId": null,
    "error": null
  }

  return new Promise((resolve, reject) => {
    cognitoidentityserviceprovider.adminInitiateAuth(params, function(initiate_auth_err, data) {
      if (initiate_auth_err) {
        // Error during authentication of the user against the old pool so this user cannot be migrated!
        result.error = initiate_auth_err;
        resolve(result);
      } else {
        // User exists in the old pool so let's get his CognitoIdentityId
        var Logins = {};
        Logins["cognito-idp." + cognitoRegion + ".amazonaws.com/" + cognitoUserPoolId] = data.AuthenticationResult.IdToken;
        params = {
          IdentityPoolId: cognitoIdentityPoolId,
          Logins: Logins
        };
        cognitoidentity.getId(params, function(get_id_err, data) {
          if (get_id_err) result.error = get_id_err;  // Could not resolve the CognitoIdentityId
          else result.cognitoIdentityId = data.IdentityId;
          resolve(result);
        });
      }
    });
  });
}
4 – Extract the user’s attributes to migrate from the old to the new pool

Resolve the user’s attributes to migrate and force email_verified to true to avoid post-migration issues.

Note: all the attributes must be migrated except sub because this attribute is Cognito pool specific and will be created by the new pool.

function getUserAttributes(username, cognitoUserPoolId) {
  var params = {
    UserPoolId: cognitoUserPoolId,
    Filter: "username = \"" + username + "\""
  };
  
  var result = [];

  return new Promise((resolve, reject) => {
    cognitoidentityserviceprovider.listUsers(params, function(list_err, data) {
      if (list_err) console.log("Error while listing users using " + JSON.stringify(params) + ": " + list_err.stack);
      else {
        data.Users[0].Attributes.forEach(function(attribute) {
          if (attribute.Name === 'email_verified') {
            attribute.Value = 'true';
          }
          if (attribute.Name !== 'sub') result.push(attribute);
        });
      }

      resolve(result);
    });
  });
}
5 – Migrate user from old to new pool

Our user is now ready to be migrated! Let’s use the admin features of Cognito (adminCreateUser, adminInitiateAuth, and adminRespondToAuthChallenge) to create the user, authenticate them, and set their password.

function migrateUser(username, password, cognitoUserPoolId, cognitoClientId, attributesToMigrate) {
  var params = {
    UserPoolId: cognitoUserPoolId,
    Username: username,
    MessageAction: 'SUPPRESS', //suppress the sending of an invitation to the user
    TemporaryPassword: password,
    UserAttributes: attributesToMigrate
  };
  
  var result = {
    "error": null
  }
  
  return new Promise((resolve, reject) => {
    cognitoidentityserviceprovider.adminCreateUser(params, function(create_err, data) {
      if (create_err) {
        result.error = create_err;
        resolve(result);
      } else {
        // Now sign in the migrated user to set the permanent password and confirm the user
        params = {
          AuthFlow: 'ADMIN_NO_SRP_AUTH',
          ClientId: cognitoClientId,
          UserPoolId: cognitoUserPoolId,
          AuthParameters: {
            USERNAME: username,
            PASSWORD: password
          }
        };
        cognitoidentityserviceprovider.adminInitiateAuth(params, function(initiate_auth_err, data) {
          if (initiate_auth_err) {
            result.error = initiate_auth_err;
            resolve(result);
          } else {
            // Handle the response to set the password (confirm the challenge name is NEW_PASSWORD_REQUIRED)
            if (data.ChallengeName !== "NEW_PASSWORD_REQUIRED") {
              result.error = new Error("Unexpected challenge name after adminInitiateAuth [" + data.ChallengeName + "], user created but password not set");
              resolve(result);
              return;  // Do not attempt to respond to an unexpected challenge
            }

            params = {
              ChallengeName: "NEW_PASSWORD_REQUIRED",
              ClientId: cognitoClientId,
              UserPoolId: cognitoUserPoolId,
              ChallengeResponses: {
                "NEW_PASSWORD": password,
                "USERNAME": data.ChallengeParameters.USER_ID_FOR_SRP
              },
              Session: data.Session
            };
            cognitoidentityserviceprovider.adminRespondToAuthChallenge(params, function(respond_err, data) {
              if (respond_err) {
                result.error = respond_err;
              }

              resolve(result)
            });
          }
        });
      }
    });
  });
}
6 – Resolve the CognitoIdentityId of the user within the new pool

Our user is now created within the new pool, so let’s resolve their CognitoIdentityId, which is required for migrating their API key.

7 – Migrate user’s API key

Migrate the user’s API key by renaming it to point to the user’s CognitoIdentityId resolved during step 6.

function migrateApiKey(username, cognitoIdentityId, oldCognitoIdentityId) {
  var params = {
    nameQuery: oldCognitoIdentityId
  };

  return new Promise((resolve, reject) => {
    apigateway.getApiKeys(params, function(get_key_err, data) {
      if (get_key_err || data.items.length === 0) {
        // No key found under the old CognitoIdentityId, so there is nothing to migrate
        console.log("Could not find an API key named [" + oldCognitoIdentityId + "]: " + get_key_err);
        return resolve(false);
      }
      var apiKeyId = data.items[0].id;  // The key is named after the old CognitoIdentityId
      params = {
        apiKey: apiKeyId,
        patchOperations: [{
          op: "replace",
          path: "/name",
          value: cognitoIdentityId
        }, {
          op: "replace",
          path: "/description",
          value: "Dev Portal API Key for " + cognitoIdentityId
        }]
      };
      // Update API key name and description to reflect the new CognitoIdentityId
      apigateway.updateApiKey(params, function(update_err, data) {
        if (update_err) console.log("Error while updating API key (id: [" + apiKeyId + "]): " + update_err.stack);
        else console.log("API key (id: [" + apiKeyId + "]) updated successfully");
        resolve(!update_err);
      });
    });
  })
}
8 – Migration complete, so return RETRY to indicate success

The migration is now complete, so we return the RETRY status, indicating to the application that the user should be logged in again automatically.

Conclusion

By leveraging AWS serverless technologies we were able to fully handle the migration of our client’s application users at the backend level. The customer was happy with this solution as it avoided asking users to reset their passwords and it realigned production with staging.

It’s implementing solutions like this that helps set Consegna apart from other cloud consultancies — we are a true technology partner and care deeply about getting outcomes for customers that align with their business goals, not just looking after our bottom line.

What is your digital waste footprint?

How many times have you walked into your garage and taken stock of all the things you haven’t used in years? Those bikes that you bought for yourself and your partner that you haven’t used since the summer of ‘09, the fishing rods, the mitre saw, the boat (if you’re lucky) and the list goes on and on. Imagine if you didn’t have to pay for them all up front – and better yet, imagine if you could stop paying for them the moment you stopped using them!

Amazingly, that is the world we live in with the public cloud. If you’re not using something, then you shouldn’t be paying for it – and if you are, then you need to ask yourself some hard questions. The problem we’re seeing in customer-land is twofold:

  1. Technical staff are too far removed from whoever pays the bills, and
  2. It’s easier than ever to start new resources that cost money

Technical staff don’t care about the bill

Many technical staff that provision resources and use services on AWS have no idea what they cost and have never seen an invoice or the billing dashboard. They don’t pay the bills, so why would they worry about what it costs?

Working with technical staff and raising awareness around the consequences of their choices in the public cloud goes a long way to arresting the free-fall into an unmanageable hosting bill. By bringing the technical staff along on the optimisation journey, you’re enabling them to align themselves with business goals and feel the choices they make are contributing in a positive way.

It’s so easy to create new resources

One of the biggest strengths of the public cloud is how easy it is to provision resources or enable services, however this appears to be one of its weaknesses as well. It’s because of this ease of use that time and time again we see serious account sprawl: unused, underutilised and over-sized resources litter the landscape, nobody knows how much Project A costs compared to Project B and there isn’t a clear plan to remediate the wastage and disarray.

Getting a handle on your hosting costs is an important step to take early on and implementing a solid strategy to a) avoid common cost related mistakes and b) be able to identify and report on project costs is crucial to being successful in your cloud journey.

Success stories

Consegna has recently engaged two medium-to-large sized customers and challenged them to review the usage of their existing AWS services and resources with a view to decreasing their monthly cloud hosting fees. By working with Consegna as an AWS partner and focusing on the following areas, one customer decreased their annual bill by NZD$500,000 and the other by NZD$100,000. By carefully analysing the following areas of your cloud footprint, you should also be able to significantly reduce your digital waste footprint.

Right-sizing and right-typing

Right-sizing your resources is generally the first step you’ll take in your optimisation strategy. This is because you can make other optimisation decisions that are directly related to the size of your existing resources, and if they aren’t the right size to begin with then those decisions will be made in error.

Right-typing can also help reduce costs if you’re relying on capacity in one area of your existing resource type that can be found in a more suitable resource type. It’s important to have a good idea of what each workload does in the cloud, and to make your decisions based on this instead of having a one-size-fits all approach.

Compute

Right-sizing compute can be challenging if you don’t have appropriate monitoring in place. When making right-sizing decisions there are a few key metrics to consider, but the main two are CPU and RAM. Because of the shared responsibility model that AWS adheres to, it doesn’t have access to RAM metrics on your instances out of the box, so to get a view of this you need to use third-party software.

Consegna has developed a cross-platform custom RAM metric collector that ships to CloudWatch and has configured a third-party integration to allow CloudCheckr to consume the metrics to provide utilisation recommendations. Leveraging the two key metrics, CPU and RAM, allows for very accurate recommendations and deep savings.
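
As an illustration only (this is not Consegna’s actual collector), a bare-bones Node.js snippet publishing a memory utilisation metric to CloudWatch might look like the following; the namespace, metric name and dimension are assumptions:

// Publish current memory utilisation as a custom CloudWatch metric.
const os = require('os');
const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

const memoryUsedPercent = 100 * (1 - os.freemem() / os.totalmem());

cloudwatch.putMetricData({
  Namespace: 'Custom/System',
  MetricData: [{
    MetricName: 'MemoryUtilization',
    Dimensions: [{ Name: 'InstanceId', Value: process.env.INSTANCE_ID }],
    Unit: 'Percent',
    Value: memoryUsedPercent
  }]
}, (err) => {
  if (err) console.error('Failed to publish memory metric', err);
});

Run on a schedule, this gives CloudWatch (and any tool consuming it) the second half of the utilisation picture.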

Storage

Storage is an area that gets overlooked regularly which can be a costly mistake. It’s important to analyse the type of data you’re storing, how and how often you’re accessing it, where it’s being stored and how important it is to you. AWS provides a myriad of storage options and without careful consideration of each, you can miss out on substantial decreases of your bill.

Database

Right-sizing your database is just as important as right-sizing your compute – for the same reasons there are plenty of savings to be had here as well.

Right-typing your database can also be an interesting option to look at as well. Traditional relational databases appear to be becoming less and less popular with new serverless technologies like DynamoDB – but it’s important to define your use case and provision resources appropriately.

It’s also worth noting that AWS have recently introduced serverless technologies to their RDS offering which is an exciting new prospect for optimisation aficionados.

Instance run schedules

Taking advantage of not paying for resources when they’re not running can make a huge difference to your bill, especially for workloads that don’t need to be running 24/7. Implementing a day / night schedule can reduce your bill for dev / test workloads by 50%.

Consegna takes this concept to the next level by deploying a portal that lets non-technical users control when the instances they deal with day-to-day are running or stopped. By pushing this responsibility out to the end users, instances that would have been running 12 hours a day on a rigid schedule now only run for as long as they’re needed – usually an hour or two – supercharging the savings.
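
A simple version of the underlying start/stop mechanism is a scheduled Lambda function; the sketch below assumes a Schedule tag on the instances, which is purely illustrative:

// Stop any running instance tagged Schedule=office-hours (e.g. triggered by an
// evening CloudWatch Events cron rule); a matching function starts them again.
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2();

exports.handler = async () => {
  const running = await ec2.describeInstances({
    Filters: [
      { Name: 'tag:Schedule', Values: ['office-hours'] },
      { Name: 'instance-state-name', Values: ['running'] }
    ]
  }).promise();

  const instanceIds = [];
  running.Reservations.forEach(r =>
    r.Instances.forEach(i => instanceIds.push(i.InstanceId)));

  if (instanceIds.length > 0) {
    await ec2.stopInstances({ InstanceIds: instanceIds }).promise();
  }
};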

Identify and terminate unused and idle resources

If you’re not using something then you should ask yourself if you really need it running, or whether or not you could convert it to an on-demand type model.

This seems like an obvious one, but the challenge is often identification – there are plenty of places resources can hide in AWS, so being vigilant and enlisting the help of third-party software can be key to this process.
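
As one hedged example of what that identification can look like, average CPU over a trailing window is a reasonable first-pass signal; the 5% threshold and two-week window below are arbitrary choices:

// Flag an instance as a candidate for review if its average CPU over the last
// two weeks is below 5%.
const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

async function isLikelyIdle(instanceId) {
  const stats = await cloudwatch.getMetricStatistics({
    Namespace: 'AWS/EC2',
    MetricName: 'CPUUtilization',
    Dimensions: [{ Name: 'InstanceId', Value: instanceId }],
    StartTime: new Date(Date.now() - 14 * 24 * 3600 * 1000),
    EndTime: new Date(),
    Period: 3600,
    Statistics: ['Average']
  }).promise();

  const points = stats.Datapoints;
  if (points.length === 0) return true;  // no recorded activity at all
  const average = points.reduce((sum, p) => sum + p.Average, 0) / points.length;
  return average < 5;
}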

Review object storage policies

Because object storage in AWS (S3) is so affordable, it’s easy to just ignore it and assume there aren’t many optimisations to be made in this area. This can be a costly oversight, as not only is the type of storage you’re using important, but so is how frequently you need to access the data.

Lifecycle policies on your object storage are a great way to automate rolling infrequently accessed data into cold storage, and they are low-hanging fruit you can nab early in your optimisation journey.
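
For example, a single lifecycle rule can transition objects to Glacier and later expire them; the bucket name, prefix and timings below are placeholders:

// Archive objects under logs/ to Glacier after 90 days, delete them after a year.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

s3.putBucketLifecycleConfiguration({
  Bucket: 'example-log-archive-bucket',
  LifecycleConfiguration: {
    Rules: [{
      ID: 'archive-then-expire',
      Status: 'Enabled',
      Filter: { Prefix: 'logs/' },
      Transitions: [{ Days: 90, StorageClass: 'GLACIER' }],
      Expiration: { Days: 365 }
    }]
  }
}, (err) => {
  if (err) console.error('Failed to apply lifecycle configuration', err);
});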

Right-type pricing tiers

AWS offers a robust range of pricing tiers for a number of their services and by identifying and leveraging the correct tiers for your usage patterns, you can make some substantial savings. In particular you should be considering Reserved Instances for your production resources that you know are going to be around forever, and potentially Spot Instances for your dev / test workloads that you don’t care so much about.

Of course, there are other pricing tiers in other services that are worth considering.

Going Cloud Native

AWS offers many platform-as-a-service offerings which take care of a lot of the day to day operational management that is so time consuming. Using these offerings as a default instead of managing your own infrastructure can provide some not so tangible optimisation benefits.

Your operations staff won’t be bogged down with patching and keeping the lights on – they’ll be freed up to innovate and explore the new and exciting technologies that AWS are constantly developing and releasing to the public for consumption.

Consegna consistently works with its technology and business partners to bake this optimisation process into all cloud activities. By thinking of ways to optimise and be efficient first, both hosting related savings and operational savings are achieved proactively as opposed to reactively.

Slow DoS attack mitigation — a Consegna approach.

Recently we discovered that a customer’s website was being attacked in what is best described as a “slow DoS”. The attacker was running a script that scraped each page of the site to find possible PDF files to download, then was initiating many downloads of each file.

Because the site was fronted by a Content Delivery Network (CDN), the site itself was fine and experienced no increase in load or service disruption, but it did cause a large spike in bandwidth usage between the CDN and the clients. The increase in bandwidth was significant enough to increase the monthly charge from around NZ$1,500 to over NZ$5,000. Every time the customer banned the IP address that was sending all the requests, a new IP would appear to replace it. It seems the point of the attack was to waste bandwidth and cost our customer money — and it was succeeding.

The site itself was hosted in AWS on an EC2 instance, however the CDN service the site was using was a third party — Fastly. After some investigation, it seemed that Fastly didn’t have any automated mitigation features that would stop this attack. Knowing that AWS Web Application Firewall (WAF) has built in rate-based rules we decided to investigate whether we could migrate the CDN to CloudFront and make use of these rules.

All we needed to do was create a CloudFront distribution with the same behaviour as the Fastly one, then point the DNS records to CloudFront — easy right? Fastly has a neat feature that allows you to redirect at the edge which was being used to redirect the apex domain to the www subdomain — if we were to replicate this behaviour in CloudFront we would need some extra help, but first we needed to make sure we could make the required DNS changes.

Pointing a domain managed by Route 53 at CloudFront is easy: you can just set an ALIAS record on the apex domain and a CNAME on the www subdomain. However, this customer’s DNS was managed by a third-party provider who they were committed to sticking with (this is a blog post for another day). The third-party provider did not support ALIAS or ANAME records and insisted that apex domains could only have A records – that meant we could only use IP addresses!

Because CloudFront has so many edge locations (108 at the time of writing), it wasn’t practical to get a list of all of them and set 108 A records — plus this would require activating the “static IP” feature of CloudFront which gives you a dedicated IP for each edge location, which costs around NZ$1,000 a month.

And to top all that off, whatever solution we decided to use would only be in place for 2 months as the site was being migrated to a fully managed service. We needed a solution that would be quick and easy to implement — AWS to the rescue!

So, we had three choices:

  1. Stay with Fastly and figure out how to ban the bad actors
  2. Move to CloudFront and figure out the redirect (bearing in mind we only had A records to work with)
  3. Do nothing and incur the NZ$5,000 cost each month — high risk if the move to a managed service ended up being delayed. We decided this wasn’t really an option.

We considered spinning up a reverse proxy and pointing the apex domain at it to redirect to the www subdomain (remember, we couldn’t use an S3 bucket because we could only set A records) but decided against this approach because we’d need to make the reverse proxy scalable given we’d be introducing it in front of the CDN during an ongoing DoS attack. Even though the current attack was slow, it could have easily been changed into something more serious.

We decided to stay with Fastly and figure out how to automatically ban IP addresses that were making too many requests. Aside from the DNS limitation, one of the main drivers for this decision was inspecting the current rate of the DoS – it was so slow that it was below the minimum rate-based rule configuration that AWS WAF allows (2,000 requests in 5 minutes). We needed to write our own rate-based rules anyway, so using CloudFront and WAF didn’t solve our problems straight away.

Thankfully, Fastly had an API that we could hit with a list of bad IPs — so all we needed to figure out was:

  1. Get access to the Fastly logs,
  2. Parse the logs and count the number of requests,
  3. Auto-ban the bad IPs.

Because Fastly allows log shipping to S3 buckets, we configured it to ship to our AWS account in a log format that could be easily consumed by Athena, and wrote a couple of AWS Lambda functions (a sketch of the Athena query step follows this list) that:

  1. Queried the Fastly logs S3 bucket using Athena,
  2. Inspected the logs and banned bad actors by hitting the Fastly API, maintaining state in DynamoDB,
  3. Built a report of bad IPs and ISPs and generated a complaint email.
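
To give a feel for the first step, a trimmed-down sketch of the Athena query is shown below; the database, table and column names are placeholders rather than the real Fastly log schema:

// Ask Athena which client IPs exceeded the request threshold in the last hour.
const AWS = require('aws-sdk');
const athena = new AWS.Athena();

async function findNoisyIps(threshold) {
  const query = `
    SELECT client_ip, COUNT(*) AS requests
    FROM fastly_logs.access_logs
    WHERE request_time > now() - interval '1' hour
    GROUP BY client_ip
    HAVING COUNT(*) > ${threshold}`;

  const { QueryExecutionId } = await athena.startQueryExecution({
    QueryString: query,
    QueryExecutionContext: { Database: 'fastly_logs' },
    ResultConfiguration: { OutputLocation: 's3://example-athena-results/' }
  }).promise();

  // Poll getQueryExecution until the state is SUCCEEDED, then page through
  // getQueryResults to build the ban list (omitted for brevity).
  return QueryExecutionId;
}

The second Lambda then compares that list against what has already been banned (state kept in DynamoDB) and calls the Fastly API for any new offenders.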

The deployed solution looked something like this:

By leveraging S3, Athena, Lambda and DynamoDB we were able to deploy a fully serverless, rate-based auto-banner for bad actors with a very short turnaround. The customer was happy with this solution as it avoided incurring the NZ$5,000 per month cost, avoided changes to the existing brittle DNS setup, and provided some valuable exposure to how powerful serverless technology on AWS is.

It’s implementing solutions like this that helps set Consegna apart from other cloud consultancies — we are a true technology partner and care deeply about getting outcomes for customers that align with their business goals, not just looking after our bottom line.

Busy-ness And Innovation. How Hackathons Can Successfully Promote New Ideas And Thinking

At Consegna we pride ourselves on our experience and knowledge.  Our recently appointed National Client Manager has a knack for knowing virtually everybody in Wellington. This can be amusing if you’re with him, as a 10-minute walk down Lambton Quay can take half an hour. We now tend to leave gaps that long between meetings. One of the questions he asks people is one we probably all ask: “how’s business?”  The answer is always the same: “busy!”

 

Now sometimes that can just be a standard answer, but if it’s true and we’re all so busy, then what are we busy doing? We would like to think we’re doing super innovative work, but how much time do we actually dedicate to innovation? Most of the time we’re busy just trying to complete projects. Google famously scrapped their “20% time” dedicated to innovation because managers were judged on the productivity of their teams and were concerned about falling behind on projects where only 80% of capacity was used. Yet the flipside was that “20% time” formed the genesis of Gmail, AdSense and Google Talk.

 

So this leads to the question: how can we continue to create when we’re all so busy? We have to be innovative about being innovative.  One solution organisations have been working with is the hackathon: a short period of time, usually around 48 hours, to stop what they’re doing and work on exciting new things that would traditionally be out of scope or “nice to haves”.

 

Consegna was selected as the University of Auckland’s cloud enablement partner in 2017 and was recently involved with the University’s cloud centre of excellence team, who wanted to promote and facilitate innovation within their internal teams. One answer was to host a hackathon with the University’s internal teams. In total there were 17 teams of 2–6 people, tasked with trying to solve operational issues within the University – and there were a number of reasons we jumped at the chance to help them with their hackathon.

 

First and foremost we liked the idea of two days being used to build, break and create. We were there as the AWS experts to help get the boat moving, but deep down we’re all engineers and developers who like to roll up our sleeves and get on the tools.

 

Secondly, after watching larger companies like Facebook explain why they use hackathons, it resonated with the team at Consegna: “Prototypes don’t need to be polished, they just need to work at the beginning”. Facebook Chat came out of a 2007 hackathon, evolving into Web Chat, then Mobile Chat, and then Facebook Messenger as we know it today. The end result of a hackathon doesn’t need to be pretty, but the potential can know no bounds. Consegna was able to help Auckland University build something whose success was judged not on immediate outcomes, but on its potential to do amazing things.

 

Hackathons can also be incredibly motivating for staff. For those of you familiar with Daniel Pink’s book “Drive”, you’ll know money doesn’t always motivate staff the way we traditionally thought it did. Autonomy, mastery and purpose are key parts of Pink’s thinking, and also the key to a successful hackathon. The business is not telling you how to do something, and in a lot of cases the guidelines are very fluid in that they’re not telling you what to build either. A guiding question for a successful hackathon can be succinctly put as “what do you think?”

 

We applaud Auckland University for inviting us along to let their staff pick our brains and learn new things. Want to know how to program Alexa? No problem! How can I collect, process and analyse streaming data in real time? Talk to one of our lead DevOps Architects and they’ll introduce you to Kinesis. There was as much learning and teaching going on at the hackathon as there was building of some amazing applications.

 

It’s important to be aware of the potential pitfalls of hosting a hackathon and to be mindful that you don’t fall into the traps. You want to make sure you get the most out of your hackathon, that everybody enjoys it, and that there are plenty of takeaways; otherwise, bluntly, what’s the point?

 

Hackathons, as a general rule, don’t allow for access to customers. If you’re dedicating just 48 hours to solving a problem, how can you build understanding and empathy for customers if they’re not around? If they’re not there to give feedback so you can iterate, can you even build it? Auckland University got around that problem by largely building for themselves; they were the customer and could interview each other for feedback, which was a tidy solution. If you think a hackathon will deliver customer-facing applications, you might want to consider either making customers available for further enhancements outside the hackathon or choosing a different approach to innovation altogether.

 

Earlier I mentioned how Facebook uses hackathons to inspire innovation, but there is one downside – theirs are unpaid and held on weekends. Employees can feel obligated to participate to please the hierarchy, which forces innovation out of staff rather than drawing it out. Generally, this is not going to go well. The driving factors Pink mentions are far less prevalent if staff are doing it in their own time, usually without pay, to build something the business owns. From what I’ve seen, Facebook puts on some amazing food for its hackathons, but so what? Value your staff, value the work they do, and don’t force them to do it in their own time. Auckland University’s hackathon was scheduled over two working days, in work time, and made sure staff felt valued for building what was ultimately a tool for the University.

 

Over the two days, the 18 teams presented their projects and there were some amazing outcomes. MiA, the winner, displayed a QR code on a big screen so students could snap it and register their attendance at each lecture, with an eye on Amazon Rekognition as the next iteration, using image recognition to measure attendance. Generally speaking, there’s not a whole lot wrong with how the University currently measures attendance with good old-fashioned pen and paper, but how accurate is it when humans are involved, and how cool is it for an AWS product to take the whole problem away?

 

In all, it was an amazing event to be a part of and we must thank Auckland University for inviting us. Also thanks to AWS for providing a wealth of credits so there was no cost barrier to building and creating.

 

If you’re thinking of hosting your own hackathon, I wish you well. It’s a great way to take a break from being busy and focus on creating new tools whose potential can be unlimited. It will let your staff scratch a learning and creative itch they’ve been aching to get at.

 

Most importantly, always take time and make strategies to keep innovating. Don’t be too busy.

Why Cloud is Relevant to Your Business Today

As the newly appointed ICT Manager for a large government agency, Jeff was keen to make his mark quickly and decisively in his new role. Looking at the IT spend over the last three years, he could see that in spite of the market shifting considerably, the agency had been paying exorbitant amounts for application hosting. The market had trended downwards with pressure from large Cloud providers like AWS. Looking at their hosting arrangements more closely, Jeff could not only see their costs remained largely unchanged but also, service reliability had been steadily declining. This project looked like an ideal candidate to reduce cost, increase service levels, and make the mark he wanted.

After looking at the current environment carefully, a rebuild of the agency’s primary public website, already underway, appeared to be a good choice. The site was being developed using AWS services. The development team had chosen AWS for development due to its low cost, well within their budget. Far more compelling to them was the speed with which the developers could provision and utilise the necessary resources. What would typically take internal IT weeks to provide for the developers could be accomplished inside a day using AWS management tools and company best practice.

Using managed services such as AWS Elastic Beanstalk, the development team not only had ready access to a low-cost development environment they could provision on demand, they could also run parallel application stacks for testing. That just wasn’t possible on their existing infrastructure, which was difficult to access and manage. As such, AWS allowed new configurations to be tested quickly at a fraction of the cost of traditional infrastructure: cents per hour, versus a few hundred dollars a month.

With the application completed, launch day and migration commenced. Fortunately, Jeff had identified that the team needed an AWS partner to assist with the migration, which led to the realisation that a scalable architecture was required to support the fluctuating demand on the website. With the right design and the right AWS partner, the site was migrated and delivered a 75% saving on the former hosting costs. With the automated scaling and monitoring AWS services provided as part of the production environment, site outages dropped to less than 1% over the first few months of operation, improving even further over time. The site went from 2–3 outages per month on the old hosting, due to network and other issues, to no unscheduled outages from one month to the next.

At this point, one would think this would be the end of the story. The primary production site was on AWS at a fraction of the former cost. Service levels were higher than they had ever been. The new problems beginning to emerge were cost control and regulation of the environment.

With the success of this project, Jeff’s teams started moving more and more projects to AWS. As more teams in the organisation adopted this approach, keeping an eye on resource usage became more challenging. Managing which teams and individuals had access to which resources was also emerging as a challenge. The situation Jeff found himself in after an initial easy win is quite common. Many companies who discovered server virtualisation during the mid-2000s also learned that technology on its own can create entirely new challenges nobody had anticipated.

The simple answer to why Cloud is relevant to your business today is the agility it provides and the transfer of CapEx to OpEx, not to mention the tangible cost savings you can make. What’s important for ICT managers to understand, however, is the importance of a structured approach to Cloud adoption. Not every workload is going to be a suitable candidate for migration. Ongoing success requires the implementation of a Cloud Adoption Framework (CAF) and the establishment of a Cloud Centre of Excellence (CCoE), neither of which needs to be as daunting as it sounds. The CAF examines your current environment and assesses which workloads would work in the Cloud. It highlights six perspectives: Business, People, Governance, Platform, Security, and Operations. In so doing, it ensures thought is given to each of these within the context of Cloud. What training do your people need, for example, when a particular application is migrated to the Cloud? How would their roles change? What operational support would they need?

A CCoE should be seen as the thought leadership and delivery enablement team that will help plan and execute on the CAF. It usually consists of SMEs from each principal area the CAF examines. By choosing an AWS Consulting Partner like Consegna, who understand this pragmatic, structured approach to cloud adoption and digital transformation, ongoing long-term success starts from a solid foundation.

The discoveries Jeff made during his journey to Cloud are being made by ICT managers on a near-daily basis. An increasing number of Jeff’s peers understand that Cloud is a timely and necessary step to reduce cost and increase agile productivity. Those with that extra slice of understanding and knowledge are working with partners like Consegna to do digital transformation the smart way: putting CAF and CCoE at the forefront of their journey, and seeing great success in doing so.

Why is CAF Critical to Cloud Adoption?

An all too familiar scenario playing out in numerous organisations today is a rapid push onto the Cloud.  Typically this entails a technical division getting approval to migrate a workload onto the Cloud, or to establish a new project on AWS EC2 instances.

Fast forward a few months and the organisation’s cloud adoption seems to be progressing fairly well.  More and more teams and applications are running on AWS, but it’s around this time that cost and management start spiralling out of control.  A quick survey of the organisation’s AWS account reveals users, instances and various other resources running unchecked “on the cloud”.  The typical issues many organisations have wrestled with in their traditional infrastructure space have been translated directly onto the Cloud.  Now the financial controller and other executives are starting to question the whole Cloud strategy.  What exactly has gone wrong here?

Whether you’re prototyping a small proof of concept (POC) or looking to migrate your entire organisation into the Cloud, if you fail to adhere to a Cloud Adoption Framework (CAF) and form a Cloud Centre of Excellence (CCoE), it won’t be a case of if the wheels fall off the wagon, but when.

Think of the CAF as a structured roadmap that outlines how and when you’re going to migrate onto the Cloud, and the CCoE as the project leadership team who will be able to execute the CAF in a manner that will guarantee results.

The CAF looks at Cloud Adoption from six different perspectives: Business, People, Governance, Platform, Security and Operations.

 

It ensures that no matter what project you have in mind, the broader implications of Cloud adoption are highlighted and worked through.  If you’ve ever been privy to the disaster that an ad-hoc, technology-led adoption results in, you’ll truly appreciate how critical the CAF is to successful Cloud transformation.

Consegna specialises in running CAF workshops for organisations to ensure success with end-to-end Cloud transformation, from both a business and a technical perspective.  For a coffee and a chat about CAF and how it can ensure the success of your Cloud migration and digital transformation, contact sales@consegna.cloud for a catch-up.

How Do You Assess A Trusted Cloud Partner?

When it comes to digital transformation and cloud adoption, a trusted advisor with the right experience is a crucial component of your success.  How do you evaluate your options?

Case studies such as the one recently published by Consegna.Cloud highlight the challenges and solutions provided, giving key decision-makers essential insights into the scope and scale of other projects in relation to their own.

 

Case Study

 

Key points of interest in a case study include the complexity of the project.  For example, in the QV case study Consegna recently published, the existing systems were almost twenty years old – a mature system with a lot of legacy components and thus a high level of complexity.

Case studies also highlight the key benefits the project delivered – in the case of QV, over 50% savings on infrastructure costs alone.  Look for these key metrics as you read through the case study.

Such artefacts are useful in establishing the evidence-based capabilities that Consegna, as a trusted Cloud advisor, possesses.  Feel free to contact the Consegna team to see how your project compares to the QV project.  Simply email sales@consegna.cloud to start the conversation.