Working on a recent project, we looked at various ways to ensure that our system was as secure as possible.
It's great that today, we can easily use cloud storage services like AWS S3, Azure Storage Accounts and Google Cloud Storage without having to worry about managing all of the underlying hardware, operating system services and things like geo-redundant backups. But at the same time it's also easy to believe that using Platform as a Service components will handle all of the security and management "stuff" for you. Unfortunately, ensuring that the files you're storing are protected against malware and viruses isn't handled automatically*.
The product we built for our client allows users to upload files that are stored in AWS S3 Buckets. We wanted to protect the system and other users from the threat of malicious files being uploaded (inadvertently or intentionally). This post describes the approach we took and lessons learned along the way that you might find helpful if you're in a similar situation.
* OK, Azure does include Microsoft Defender for Storage, but it's not built in: it must be enabled separately and incurs extra costs.
Choosing a service to scan AWS S3 Buckets
Several services can scan files in S3 Buckets. We considered:
- Cloud Storage Security
- Superna Defender
- Serverless ClamScan
- Trend Cloud One File Storage Security
Our priorities for selecting a service:
- We preferred a Software as a Service solution over self-hosted solutions
- We wanted to keep operating costs low given our client is a startup
- We wanted something that could be integrated in our deployment processes that use AWS CDK
- We wanted a service that provides support and is backed by a Service Level Agreement (SLA) to support our client if something goes wrong
- We wanted file scanning to be a "black box": files are sent to the service and results are returned without us having to know about its internals
In the end, we settled on Trend Cloud One File Storage Security as it ticked all the boxes for us.
Serverless ClamScan gets an honourable mention. As an open source CDK construct it could plug easily into our environment, but we were concerned about ongoing support and the lack of an SLA (see this post for a good description).
Integrating with File Storage Security
Trend Micro provides good documentation describing their platform and integrating it into a solution. Briefly, the solution includes the following components:
- Scanner stack - the component that performs the file scan to detect malicious content. You only need one Scanner stack in your environment.
- Storage stack - this component is responsible for detecting new files uploaded to your S3 Bucket, generating a pre-signed URL and passing the URL to the scanner stack. You need one Storage stack for each S3 Bucket you want to scan, however your Storage stacks can share a single Scanner stack.
- Post scan plug-ins - one or more AWS Lambdas that process scan results. Trend Micro provides some examples and ready-to-use plugins and you can also create your own.
The following diagram is a simplified high-level architecture of the file scanning solution we implemented using File Storage Security.
When files are added to the user-files S3 Bucket, they are picked up by the Storage stack (which listens for the S3:ObjectCreated event). The Storage stack includes an AWS SNS topic that you can subscribe to for receiving results of a file scan.
In our solution, we created a simple AWS Lambda, malware-scan-lambda, to process file scan results. If a scan result indicates that the file is malicious, we delete it from its source Bucket and send a notification to Slack through a WebHook. Naturally, you can take whatever action you want with the scan result, and Trend also provides some example Lambda implementations that you can use as a starting point or plug directly into your solution.
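As a sketch of the two follow-up steps described above, the helpers below turn a scanned file's URL into a Bucket and key (what a delete call needs) and build the JSON body for a Slack incoming WebHook. The function names, the virtual-hosted-style URL format and the message wording are assumptions for illustration; the actual S3 delete and the HTTP POST are left out so the example has no dependencies.

```typescript
interface S3Location {
  bucket: string;
  key: string;
}

// ASSUMPTION: the file URL is virtual-hosted-style, i.e.
// https://<bucket>.s3[.<region>].amazonaws.com/<key>
function parseS3Url(fileUrl: string): S3Location {
  const match =
    /^https:\/\/(.+?)\.s3(?:[.-][a-z0-9-]+)?\.amazonaws\.com\/([^?#]+)/.exec(fileUrl);
  if (!match) {
    throw new Error("Not a recognised S3 URL: " + fileUrl);
  }
  // Object keys are URL-encoded in the path, so decode them before use.
  return { bucket: match[1], key: decodeURIComponent(match[2]) };
}

// Slack incoming WebHooks accept a JSON body with a "text" property.
function buildSlackMessage(location: S3Location, findings: string[]): { text: string } {
  return {
    text:
      "Malware detected in s3://" + location.bucket + "/" + location.key +
      " (" + findings.join(", ") + "). The file has been deleted.",
  };
}
```

The returned Bucket and key would be passed to an S3 delete call, and the message object serialised with JSON.stringify and posted to the WebHook URL.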
Trend Micro already have excellent documentation for File Storage Security that covers their architecture, deployment options and configuration so I don't want to repeat that information here. Instead I think it is helpful to talk about some of the challenges we encountered and how we resolved them.
Deployment using AWS CDK
Our team uses CDK for deploying our infrastructure to AWS. We wanted to incorporate File Storage Security into our existing pipelines and code with minimal fuss.
Trend hosts a GitHub repository that includes CloudFormation templates defining the resources in their Scanner and Storage stacks.
Instead of re-creating the CloudFormation templates using CDK, you can simply import a CloudFormation template into a CDK project.
To do this, we copied Trend's Scanner and Storage Stack CloudFormation templates to a templates directory in our project and created two classes: ScannerStack and StorageStack.
The following code shows our ScannerStack class and an example of how we import the CloudFormation template using CDK.
The Scanner stack provides the ScannerQueueUrl that we need to pass into the StorageStack. This is exposed as a public field.
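CDK's cloudformation-include module provides a `CfnInclude` construct for this kind of import. What follows is a sketch of what a ScannerStack class along these lines might look like, not the project's actual code: the template file name, the construct ID, the field name and the assumption that the template declares an output called `ScannerQueueUrl` are all illustrative.

```typescript
import * as path from 'path';
import { Stack, StackProps } from 'aws-cdk-lib';
import { CfnInclude } from 'aws-cdk-lib/cloudformation-include';
import { Construct } from 'constructs';

export class ScannerStack extends Stack {
  // Queue URL that the Storage stack needs in order to reach the scanner.
  public readonly scannerQueueUrl: string;

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Import the vendor's CloudFormation template as-is rather than
    // re-creating its resources in CDK.
    // ASSUMPTION: 'scanner-stack.yaml' is a hypothetical file name.
    const template = new CfnInclude(this, 'ScannerTemplate', {
      templateFile: path.join(__dirname, 'templates', 'scanner-stack.yaml'),
    });

    // ASSUMPTION: the template declares an output with this logical ID.
    // The value is a token that resolves at deploy time.
    this.scannerQueueUrl = template.getOutput('ScannerQueueUrl').value;
  }
}
```

Any parameters the template declares are kept as parameters of the resulting stack, so they still need values at deploy time unless they are replaced through the `parameters` property of `CfnInclude`.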
For the Storage Stack template, we followed the same approach, creating a class called StorageStack that imports the template.
Deploying the CDK components is one part of the process. The other part is configuring File Storage Security so that it can work with your AWS infrastructure. For our implementation, we manually configured StorageStackManagementRoleARN (as described in the Trend documentation).
Trend Micro publish an API that can be used to automate these manual configuration steps. In a future version we intend to use this to fully automate the deployment process end-to-end.
Working with scan results
Trend's File Storage Security allows you to process scan results through a custom AWS Lambda by subscribing to an AWS SNS Topic.
The Storage stack provides a ScanResultTopicARN value that references the Topic to subscribe to.
Our solution included deployment of a Lambda stack that subscribes to the topic passed into the stack via a parameter.
The Lambda handler you create should be of type aws-lambda.SNSHandler and take a parameter of type aws-lambda.SNSEvent. The following example is a minimal implementation of a Lambda to process a scan result:
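A sketch of such a handler, written with small local stand-ins for the aws-lambda types so that it runs on its own. The message fields (`file_url`, `scanning_result.Findings`) and the function names are assumptions for illustration; check Trend's documentation for the actual scan result format.

```typescript
// Local stand-ins for the SNSEvent shape from the aws-lambda typings,
// trimmed to the fields this sketch reads.
interface SnsRecord {
  Sns: { Message: string };
}
interface SnsEvent {
  Records: SnsRecord[];
}

// ASSUMPTION: these field names are illustrative; confirm them against the
// scan result format in Trend's documentation.
interface ScanResult {
  file_url: string;
  scanning_result: { Findings?: { malware: string; type: string }[] };
}

// A file counts as malicious when the scanner reports at least one finding.
function isMalicious(result: ScanResult): boolean {
  return (result.scanning_result.Findings ?? []).length > 0;
}

// Parses every SNS record and returns the URLs of the files flagged as malicious.
function collectMaliciousFiles(event: SnsEvent): string[] {
  const flagged: string[] = [];
  for (const record of event.Records) {
    const result = JSON.parse(record.Sns.Message) as ScanResult;
    if (isMalicious(result)) {
      flagged.push(result.file_url);
    }
  }
  return flagged;
}

// Lambda entry point: logs each flagged file. A real handler would be
// exported and would delete the file and notify someone at this point.
const handler = async (event: SnsEvent): Promise<void> => {
  for (const url of collectMaliciousFiles(event)) {
    console.warn("Malware detected in " + url);
  }
};
```

Keeping the parsing in a plain function, separate from the handler, makes it easy to test with hand-written events before anything is deployed.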
Scanning existing files (not just newly added ones)
The solution we created only scans files that are added to an S3 Bucket. In our project, we were building a new, "green fields" application so this wasn't a concern. But what do you do if you're trying to protect an existing Bucket that already contains files?
Trend documents some approaches on how to do this, including the ability to run scheduled scans over your files.
How do you go about testing that malware scanning is working correctly? I didn't really want to get my machine infected with anything nor did I want to spread any malware in our company or around our clients.
Fortunately, there's a solution that is perfectly safe. The European Institute for Computer Antivirus Research (EICAR) has developed an anti-virus test file that you can use to trigger a "safe positive" (a term I've just made up). You can safely pass this file around and upload it to an S3 Bucket to cause File Storage Security to detect a file with a "virus".
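If you would rather generate the test file than download it, the snippet below builds the standard 68-character EICAR string. The constant and function names are mine, and the string is split in two only so that the source file containing it is less likely to be flagged itself.

```typescript
// The two halves of the standard EICAR test string. The backslash is
// escaped, so the first part is 28 characters long and the second is 40.
const EICAR_PARTS = [
  "X5O!P%@AP[4\\PZX54(P^)7CC)7}$",
  "EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*",
];

// Returns the complete test string; save it to a file with no extra
// characters and upload that file to the Bucket being scanned.
function eicarTestString(): string {
  return EICAR_PARTS.join("");
}
```

Be aware that antivirus software on your own machine may quarantine the file as soon as it is written to disk, which is the expected behaviour.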
In our solution, we trigger a notification in Slack. When uploading an instance of the EICAR test file, the following notification is generated:
I found it interesting to put this solution together and it was the first time I'd dived deeper into virus and malware scanning beyond using antivirus software on my machine.
I hope this is useful to you and helps you protect the files in your AWS S3 solutions.