You can’t blame them. There have been numerous cases in recent years where malicious actors created bogus open source packages that looked authentic, or infected existing packages with code that leaks sensitive data such as credentials, environment variables, session tokens and so forth.
Here are a few recent examples:
- Reported malicious module: getcookies
- Research Found Backdoor In Python Library That Steals SSH Credentials
- Code packages available in PyPI contained modified installation scripts
Protecting your applications against malicious 3rd party libraries isn’t simple. You could start by reviewing the source code of each and every dependency you import. If you think this is a tedious task, just wait until you discover that in most cases, your own dependencies will also depend on other dependencies, so good luck with that :-)
NPM 'Express' package Dependencies Visualized By https://npm.anvaka.com
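To get a feel for how quickly the dependency tree grows, you can compare the number of packages you asked for with the number that actually got installed. The following is a minimal sketch, assuming a `package-lock.json` with `lockfileVersion` 2 or 3 (which carries a top-level `packages` map); the function name `countDependencies` and the sample data are ours, for illustration only.

```javascript
// Count direct runtime dependencies vs. all installed packages, given a
// parsed package-lock.json (lockfileVersion 2 or 3).
function countDependencies(lock) {
  const packages = lock.packages || {};
  // The root project is stored under the empty-string key.
  const root = packages[''] || {};
  const direct = Object.keys(root.dependencies || {}).length;
  // Every other key is one installed package, whether nested or hoisted.
  const total = Object.keys(packages).filter((key) => key !== '').length;
  return { direct, total };
}

// Tiny hand-written lockfile fragment, for illustration only.
const sampleLock = {
  lockfileVersion: 3,
  packages: {
    '': { name: 'my-function', dependencies: { express: '^4.18.0' } },
    'node_modules/express': { version: '4.18.2' },
    'node_modules/body-parser': { version: '1.20.1' },
    'node_modules/debug': { version: '2.6.9' },
  },
};

console.log(countDependencies(sampleLock));
```

In a real project you would pass in `JSON.parse(fs.readFileSync('package-lock.json', 'utf8'))` instead of the sample object. The gap between the two numbers is the code you would have to review but never chose directly.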
If you’re thinking of using your favorite open source package security scanner, which checks for known vulnerabilities such as CVEs - think again. These scanners rely on databases of known vulnerable packages, so they cannot detect poisoned or malicious packages that aren't registered in public databases.
If code review is not your thing, your next option is to monitor, block or filter unwanted outbound network connectivity.
In a traditional deployment model - on-premises, hosted, in an IaaS environment, and even when using containers - you have the required access to the runtime environment, which means you can use a sniffer to monitor network traffic, or use iptables to limit or block outbound network connectivity. However, in public-cloud serverless environments, you can't do any of this, because you don’t have access to the underlying infrastructure.
Some organizations have responded to this threat by isolating their sensitive AWS Lambda functions inside a Virtual Private Cloud (VPC) and then restricting outbound traffic from that VPC. However, this VPC-based solution presents its own technical challenges - this article includes a lot of useful information about this topic.
Open Source Dependencies: A Double-Edged Sword
In order to emphasize the challenges and risks of using untrusted 3rd party dependencies in your AWS Lambda functions, we’ve created an example function, which receives an email address via an API Gateway request, and stores it inside a DynamoDB table.
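For readers who want something concrete to picture, here is a rough sketch of what a function of this kind could look like. It is not our original code: the helper names (`normalizeEmail`, `makeHandler`, `saveEmail`) are hypothetical, the parsing step is a simplified stand-in for a 3rd party parsing library, and the storage call is injected so the sketch runs without AWS credentials.

```javascript
// Hypothetical stand-in for a 3rd party parsing library.
function normalizeEmail(input) {
  const email = String(input || '').trim().toLowerCase();
  // Deliberately loose check: one "@", no whitespace, a dot in the domain.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email) ? email : null;
}

// Build a Lambda handler around an injected `saveEmail(email)` function,
// so the storage backend (e.g. a DynamoDB table) can be swapped or stubbed.
function makeHandler(saveEmail) {
  return async function handler(event) {
    let payload;
    try {
      // API Gateway proxy integration delivers the request body as a string.
      payload = JSON.parse(event.body || '{}');
    } catch (err) {
      return { statusCode: 400, body: JSON.stringify({ error: 'invalid JSON' }) };
    }
    const email = normalizeEmail(payload && payload.email);
    if (!email) {
      return { statusCode: 400, body: JSON.stringify({ error: 'invalid email' }) };
    }
    await saveEmail(email);
    return { statusCode: 200, body: JSON.stringify({ stored: email }) };
  };
}
```

With the AWS SDK for JavaScript v2, `saveEmail` could wrap a `DocumentClient` `put({ TableName, Item: { email } }).promise()` call; the table name is up to you. Note that any library imported by this handler runs with the same environment variables and permissions as the handler itself.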
Our function imports a (malicious) 3rd party open source library, which we created by forking a popular NPM package ('addressparser'). The library is called ‘addressparser-malicious’. Needless to say, real-world malicious libraries are likely to use less conspicuous names.
Here is the offensive code in our malicious/forked package:
The original and legitimate ‘addressparser’ library takes an email address and parses it into its relevant components while normalizing the input. To spice things up, our own malicious version will grab all of your AWS Lambda runtime environment variables and post them to a host owned by the attacker, where they will be collected. In our case, it posts the data to 'http://devnull-as-a-service.com/dev/null'.
Can You Spot The Leak?
When data is leaked through an HTTP(S) API call from the AWS Lambda runtime environment, you're not likely to spot any traces of the leakage in the (vanilla) cloud logging facilities such as AWS CloudWatch or CloudTrail - that's not what they were designed for. Moreover, since the outbound connection will most likely use HTTPS (TLS), an external security solution that inspects network traffic will not be able to view the data and spot the leakage.
When executing our own function, which leaks our environment variables, this is what we see in CloudWatch:
FunctionShield to the Rescue
FunctionShield is a 100% free security library that lets developers easily enforce strict security controls on the AWS Lambda function runtime by addressing four common use cases:
- Disable outbound internet connectivity (except for AWS resources) from the serverless runtime environment, if such connections are not required
- Disable read/write on the /tmp/ directory, if such operations are not required
- Disable child process execution, if such execution is not required by the function
- Disable read access to the function's handler and prevent source code leakage
In addition to the security protections provided, developers also gain tremendous security visibility when using FunctionShield, even if it's just set to "alert". With FunctionShield deployed in your functions, you can quickly get an idea of what your function is executing, who it is communicating with, and whether or not it is writing to disk.
FunctionShield currently supports Node.js, Python and Java, and deploying it couldn’t be simpler. Here’s how to do it:
- In your project’s directory, run:
npm i @puresec/function-shield
- Then add the library to each function you’d like to protect:
const FunctionShield = require('@puresec/function-shield');
- Call the configure() method, with the desired settings.
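The three steps above can be sketched as follows. The policy keys and the 'block' / 'alert' / 'allow' modes are written as we recall them from the FunctionShield README, so treat them as an assumption and check them against the version you install. The `protect` helper and the `STRICT_POLICY` constant are our own names, and the library object is passed in as a parameter so the sketch can also be exercised with a stub.

```javascript
// One key per use case listed above. Modes (as recalled from the README):
// 'block' = actively block, 'alert' = log only, 'allow' = permit.
const STRICT_POLICY = {
  outbound_connectivity: 'block', // outbound internet access
  read_write_tmp: 'block',        // read/write on /tmp/
  create_child_process: 'block',  // child process execution
  read_handler: 'block',          // reading the function's own source
};

// `shield` is the object returned by require('@puresec/function-shield').
// `overrides` lets a single function relax individual settings.
function protect(shield, token, overrides = {}) {
  const policy = { ...STRICT_POLICY, ...overrides };
  shield.configure({ policy, token });
  return policy;
}
```

In a Lambda function this would typically run once, at the top of the handler module, for example `protect(require('@puresec/function-shield'), process.env.FUNCTION_SHIELD_TOKEN)`. The environment variable name is only a suggestion. Starting with `{ outbound_connectivity: 'alert' }` as an override is a low-risk way to see what a function does before you start blocking.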
We've also posted a 5-minute getting-started guide video here.
Prior to starting the process, make sure you sign up for a token.
Here’s the same AWS Lambda function we used earlier, this time, with FunctionShield deployed:
When we deploy and execute our AWS Lambda function, we will receive an exception from FunctionShield, and in CloudWatch, we will be able to see the following information:
FunctionShield successfully blocked the outbound connectivity attempt to ‘devnull-as-a-service.com’, and saved the day!
Some functions might need outbound connectivity for legitimate reasons. If that’s the case, you can call the FunctionShield configure() method multiple times, so you can set the configuration to “allow” just before the relevant code section, and then switch it back to “block” right after. Easy.
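The allow-then-block pattern could be wrapped in a small helper like the one below. This is a sketch under assumptions: `withOutboundAllowed` is a hypothetical name, the `outbound_connectivity` key and its modes are taken from our recollection of the FunctionShield README, and the library object is injected so the helper can be tried with a stub.

```javascript
// Temporarily allow outbound connectivity while `task` runs, then switch
// back to 'block' even if `task` throws.
async function withOutboundAllowed(shield, basePolicy, token, task) {
  shield.configure({
    policy: { ...basePolicy, outbound_connectivity: 'allow' },
    token,
  });
  try {
    return await task();
  } finally {
    shield.configure({
      policy: { ...basePolicy, outbound_connectivity: 'block' },
      token,
    });
  }
}
```

A call might look like `await withOutboundAllowed(FunctionShield, policy, token, () => callPartnerApi(payload))`, where `callPartnerApi` stands for your own code. Keep the wrapped section as small as possible: any other asynchronous work that happens to run during that window is also allowed out.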
If you need reassurance and an unbiased 3rd party opinion about FunctionShield, check out Jeremy Daly's extremely thorough tutorial/review of FunctionShield.