CloudTwin Hands-on (#2) Troubleshoot AWS Lambda invocation issues with Lightlytics Discovery

Tal Shladovsky

In the previous hands-on we went over how you can predict the impact of proposed changes made with Terraform and prevent critical mistakes before deploying them with Lightlytics Simulation.
In this hands-on, we'll go over troubleshooting issues in one of the most widely used AWS services: AWS Lambda.

Overview – What is AWS Lambda?

AWS Lambda is an event-driven, serverless computing platform provided by Amazon.
It's a fully managed service that runs your code on high-availability compute infrastructure and handles all administration of the compute resources, including server and operating system maintenance, capacity provisioning, automatic scaling, and logging. Developers can write and run code without worrying about administrative tasks and focus on what they are meant to do: writing application code. Lambda runs code in response to events and automatically manages the computing resources that code requires.
Lambda functions and triggers are the core components of building applications on AWS Lambda. A Lambda function is the code and runtime that processes events, while a trigger is the AWS service or application that invokes the function.
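To make the function/trigger split concrete, here is a minimal sketch of a handler for an S3 trigger, written in Python. The event layout follows the S3 notification format (Records[].s3.bucket.name and Records[].s3.object.key); the handler name and the sample event are illustrative assumptions, not code from the tutorial discussed below.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Return the (bucket, key) pairs named in an S3 notification event."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        targets.append((bucket, key))
    return targets


# Hypothetical event, trimmed to the fields the handler actually reads.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "cloudtwin-s3"},
                "object": {"key": "CloudTwin.jpeg"}}}
    ]
}

print(handler(sample_event))
```

A real function would go on to read the object with an AWS SDK, which is the step that needs the execution role's permissions.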

Troubleshooting issues in Lambda

When using the Lambda API, console, or tools, you'll probably encounter issues and errors.
Serverless applications are distributed, and debugging a distributed application differs from debugging a single-server or monolithic one. You have to understand the state of your workload when the error occurred and consider multiple services, because most production serverless applications combine Lambda functions with other services, which adds complexity.

Some Lambda errors come with a documented status code and are relatively easy to troubleshoot and fix. Others are generic and offer no lead, which makes troubleshooting harder and more time-consuming, even for Lambda experts.

Recently, I was following a tutorial on using an Amazon S3 trigger to invoke a Lambda function, and I ran into the following error:

{
  "errorType": "Error",
  "errorMessage": "Error getting object CloudTwin.jpeg from bucket cloudtwin-s3. Make sure they exist and your bucket is in the same region as this function.",
  "trace": [
    "Error: Error getting object CloudTwin.jpeg from bucket cloudtwin-s3. Make sure they exist and your bucket is in the same region as this function.",
    "  at Runtime.exports.handler (/var/task/index.js:26:15)",
    "  at processTicksAndRejections (internal/process/task_queues.js:95:5)"
  ]
}

As you can see, the error message reports a problem getting the object (the image) from the bucket and suggests making sure that both the object and the bucket exist and that the bucket is in the same region as the Lambda function.
I had already verified that these resources were configured correctly, so the cause had to be something else.
Without the Lightlytics solution, I would probably have gone back over the resource configuration to see whether I had missed a setting or misconfigured something.
That would have taken time, with a chance of not finding the issue at all.
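One likely reason the message gives so little to go on: a handler that catches every failure and raises a fixed message hides the underlying S3 error code. The sketch below is a hedged illustration in Python, not the tutorial's code. S3Error, get_object_or_explain, and the hint table are hypothetical names, while AccessDenied, NoSuchKey, and NoSuchBucket are real S3 error codes.

```python
class S3Error(Exception):
    """Hypothetical stand-in for an SDK error that carries an S3 error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


# Illustrative hints only; extend or replace to suit your environment.
HINTS = {
    "AccessDenied": "check the execution role's IAM policies and the bucket policy",
    "NoSuchKey": "check the object key",
    "NoSuchBucket": "check the bucket name and region",
}


def get_object_or_explain(fetch_object, bucket, key):
    """Call fetch_object and, on failure, re-raise with the S3 error code kept."""
    try:
        return fetch_object(bucket, key)
    except S3Error as err:
        hint = HINTS.get(err.code, "inspect the logged error")
        raise RuntimeError(
            f"Error getting object {key} from bucket {bucket} "
            f"({err.code}): {hint}"
        ) from err


def denied(bucket, key):
    # Simulates a role that is not permitted to read the object.
    raise S3Error("AccessDenied")


try:
    get_object_or_explain(denied, "cloudtwin-s3", "CloudTwin.jpeg")
except RuntimeError as err:
    print(err)
```

Keeping the error code in the message (or at least in the logs) narrows the search to permissions, naming, or region before any configuration review starts.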

Lightlytics Discovery can help you troubleshoot such issues and more in seconds, not hours.
You can query your cloud posture and relationships across layers and accounts, to get all the information you need to take action and fix an issue.

Let's see how it works

Lightlytics enables you to understand the relationships and dependencies between resources and services. We look across the configuration layers of all the resources in your infrastructure and use sophisticated proprietary algorithms to determine which network and permission flows are feasible.

There are two ways to examine resource relationships within your cloud environment:

1. Reachability between end-points (Search Path/s)
The path search lets you check and validate reachability, and view the configuration that created it, between two specific end-points, types of end-point, specific tags, or even any resources across VPCs, Regions, Availability Zones, and Accounts. You can find the search path mode at the top of your inventory, next to the search box.

2. Resource Intersections
When you intersect a resource, Lightlytics returns all of the upstream and downstream reachable paths from the intersected entity, so you can easily understand cross-layer dependencies. A typical use case is viewing all end-points that can communicate via a specific Security Group, NACL, Internet Gateway, IAM Role/Policy, or even VPC Peering.
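As a rough mental model of a path search (a toy sketch, not Lightlytics' proprietary algorithm), the environment can be treated as a directed graph whose edges are either open or blocked by a configuration. The node names, the EDGES table, and find_path below are all hypothetical, chosen only to illustrate the idea.

```python
from collections import deque

# Each edge maps (source, destination) to a blocker description, or None when open.
EDGES = {
    ("lambda:cloudtwin-fn", "role:execution-role"): None,
    ("role:execution-role", "s3:cloudtwin-s3"): "No permissive IAM Policy",
}


def find_path(source, destination, edges):
    """Breadth-first search returning (path, blockers), or None if no path exists."""
    queue = deque([(source, [source], [])])
    seen = {source}
    while queue:
        node, path, blockers = queue.popleft()
        if node == destination:
            return path, blockers
        for (src, dst), blocker in edges.items():
            if src == node and dst not in seen:
                seen.add(dst)
                # Carry along any blocker met on the way to the next node.
                found = blockers + [blocker] if blocker else blockers
                queue.append((dst, path + [dst], found))
    return None


path, blockers = find_path("lambda:cloudtwin-fn", "s3:cloudtwin-s3", EDGES)
print(" -> ".join(path))
print("blockers:", blockers)
```

The useful property is that the search reports a structurally possible path together with the configuration that blocks it, rather than simply answering "unreachable".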

In this example, we'll focus on Search Paths, as it is the better fit for our use case.

1. On the Lightlytics Discovery page, go to the Paths tab.

2. Enter your Lambda function name as the Source and your S3 bucket name as the Destination.

3. Click Find Paths.

4. The result shows the entire path from the source to the destination. As you can see, the Lambda's execution role has a configuration blocker ("No permissive IAM Policy") that prevents the Lambda function from communicating with the S3 bucket.

5. Now that we have a lead for the invocation issue, let's dig deeper.

6. A look at the Lambda execution role's details shows that it has two policies attached:

   a. AWSLambdaS3ExecutionRole

   b. AWSLambdaBasicExecutionRole

7. Reviewing the AWSLambdaS3ExecutionRole IAM policy, we can see a policy statement with a "Deny" effect on the "s3:GetObject" action for the target S3 bucket. We have found our blocker configuration.

8. Let's fix the issue by changing the effect in the AWSLambdaS3ExecutionRole IAM policy to "Allow":

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::cloudtwin-s3/*"
    }
  ]
}

9. We can track the configuration change that fixed the issue in Lightlytics Events, with the complete context of who, what, where, and when this activity took place within the cloud environment. You can also get notified of changes in real time and review them with complete context and impact analysis.

10. Let's run the path search again to make sure the blocker configuration is fixed:

The blocker is gone now.

11. Let's re-test by invoking the Lambda function again.

12. The execution now succeeds.
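Why changing the effect fixes the function: in IAM policy evaluation, an explicit Deny overrides any Allow, and a request with no matching Allow is denied by default. The sketch below is a deliberately simplified model of that rule in Python (identity policies only, case-sensitive wildcard matching, no conditions or bucket policies). The is_allowed function and the statement dictionaries are illustrative assumptions, and the resource uses the object form of the ARN (ending in /*) because s3:GetObject is evaluated against objects rather than the bucket itself.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """True if value matches any pattern; accepts a single string or a list."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_allowed(statements, action, resource):
    """Simplified IAM decision: explicit Deny wins, otherwise an Allow is required."""
    allowed = False
    for stmt in statements:
        if _matches(stmt["Action"], action) and _matches(stmt["Resource"], resource):
            if stmt["Effect"] == "Deny":
                return False  # an explicit Deny overrides every Allow
            allowed = True
    return allowed  # stays False (implicit deny) when nothing matched


OBJECT_ARN = "arn:aws:s3:::cloudtwin-s3/CloudTwin.jpeg"

before = [{"Effect": "Deny", "Action": ["s3:GetObject"],
           "Resource": "arn:aws:s3:::cloudtwin-s3/*"}]
after = [{"Effect": "Allow", "Action": ["s3:GetObject"],
          "Resource": "arn:aws:s3:::cloudtwin-s3/*"}]

print("before fix:", is_allowed(before, "s3:GetObject", OBJECT_ARN))
print("after fix:", is_allowed(after, "s3:GetObject", OBJECT_ARN))
```

The same model explains why attaching a second, permissive policy would not have helped while the Deny statement was in place.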

The Lightlytics solution helps you focus on writing code instead of constantly reviewing it.
We save valuable time and resources and help cloud practitioners get more out of their cloud.
To try it out for free, check out our Treemium model: 21 days free.

Deploy cloud infrastructure changes with confidence. Troubleshoot faster with the complete context of your cloud environment.