Replicate emails to S3 in real-time from Google Workspace, Gmail, or any other mailbox

I will walk you through the steps of setting up real-time email replication from your Google Workspace (formerly known as G Suite) account to an Amazon Web Services (AWS) Simple Storage Service (S3) bucket. By the end of this tutorial, you will have a ~free, secure, and reliable method for backing up your emails to S3.

Info
The setup we will be doing in AWS is fairly generic, and you can use it to back up any emails. This includes ProtonMail, iCloud, personal Gmail accounts, and many other providers. The only two requirements are (a) the ability to forward emails and (b) ownership of a domain name.

Solution overview

We will be setting up Google Workspace to silently forward all incoming emails to Amazon Web Services (AWS) Simple Email Service (SES). In turn, AWS SES will store them in S3.

Solution design diagram

Use cases

There are several reasons why you might want to replicate emails to S3:

  • Data backup: Replicating emails to S3 can provide an additional layer of protection against data loss. By keeping a copy of emails in S3, you can ensure that you have a backup in case something happens to your primary email system.

  • Data archiving: You may want to keep a record of all emails for a certain period of time. Replicating emails to S3 can provide an easy way to store and access these emails over the long term.

  • Data analysis: Some may want to replicate emails to S3 to perform analysis on them using tools like Amazon Athena or Amazon EMR. This can be useful for understanding communication patterns, identifying trends, and more.

  • Data migration: If you are planning to switch to a different email provider, replicating emails to S3 can provide a convenient way to keep all data in the same format.

Cost

Note
These costs are based on my usage (2000 emails per month), so they might be different for your use case. Also, check for pricing changes.
Product          | Usage type            | Cost          | Comment
-----------------|-----------------------|---------------|--------
Google Workspace | Advanced routing      | n/a           | No separate billing for this
AWS SES          | Incoming emails       | $0.1/month    | $0.1 per 1000 emails after first 1000
AWS SES          | Incoming email chunks | ~$0/month     | $0.09 per 1000 complete 256 KiB email chunks, but most emails are <256 KiB
AWS S3           | Storage               | +$0.005/month | ~0.1 MiB per message * 2000 emails per month * $0.023 per GiB Standard
AWS S3           | Requests              | $0.01/month   | 2000 emails * 1 PUT request * $0.005 per 1000 PUT requests
TCO (5 years)    |                       | ~$16          | $6 SES Incoming emails + $9.15 S3 Storage + $0.6 S3 Requests
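
To make the 5-year figure easier to adapt to your own volume, here is a rough sketch of the arithmetic behind it. It assumes the prices and the rounded $0.005 monthly storage increment from the table, assumes that the storage bill grows every month as emails accumulate, and ignores the Standard-IA transition and the per-chunk SES fee. The function and constant names are made up for this example.

```python
# Assumed inputs, copied from the cost table (check current AWS pricing).
EMAILS_PER_MONTH = 2000
MONTHS = 5 * 12

SES_FREE_EMAILS = 1000          # first 1000 incoming emails per month are free
SES_PRICE_PER_1000 = 0.10       # USD per 1000 incoming emails after that
S3_PUT_PRICE_PER_1000 = 0.005   # USD per 1000 PUT requests
S3_STORAGE_STEP = 0.005         # USD added to the monthly storage bill by one month of email


def five_year_tco() -> dict:
    """Return the cost components in USD for the assumed 5-year period."""
    ses = (EMAILS_PER_MONTH - SES_FREE_EMAILS) / 1000 * SES_PRICE_PER_1000 * MONTHS
    requests = EMAILS_PER_MONTH / 1000 * S3_PUT_PRICE_PER_1000 * MONTHS
    # Storage accumulates: in month n you pay for n months' worth of email.
    storage = sum(S3_STORAGE_STEP * n for n in range(1, MONTHS + 1))
    return {
        "ses": ses,
        "storage": storage,
        "requests": requests,
        "total": ses + storage + requests,
    }


if __name__ == "__main__":
    for name, usd in five_year_tco().items():
        print(f"{name}: ${usd:.2f}")
```

With these inputs the total comes out at about $15.75, which the table rounds to ~$16. Storage dominates because it is the only component that compounds over time.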

Setup

Note
This tutorial assumes that you already have a Google Workspace account and an AWS account set up. If you do not have these accounts, please follow the instructions provided by Google and AWS to set them up before proceeding.

AWS S3

First, you need to choose a single region where all your resources will be located. At the time of writing, Email Receiving is supported only in the following regions: us-east-1, us-west-2 and eu-west-1. Therefore, you will need to choose one of them.

The next step is to create an S3 bucket. Feel free to set it up as you wish, but make sure you allow SES to write into it via a bucket policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ses.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::{{{ bucket-name }}}/*",
            "Condition": {
                "StringEquals": {
                    "AWS:SourceAccount": "{{{ aws-account-id }}}"
                },
                "StringLike": {
                    "AWS:SourceArn": "arn:aws:ses:*"
                }
            }
        }
    ]
}
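
If you prefer to generate this policy instead of editing the placeholders by hand, a minimal sketch could look like the following. The function name and the example values are hypothetical. The resulting string could then be passed to boto3's `put_bucket_policy`, assuming you have boto3 installed and credentials configured.

```python
import json


def ses_bucket_policy(bucket_name: str, aws_account_id: str) -> str:
    """Render the bucket policy above with the placeholders filled in."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ses.amazonaws.com"},
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringEquals": {"AWS:SourceAccount": aws_account_id},
                    "StringLike": {"AWS:SourceArn": "arn:aws:ses:*"},
                },
            }
        ],
    }
    return json.dumps(policy, indent=4)


if __name__ == "__main__":
    # Hypothetical bucket name and account ID, for illustration only.
    print(ses_bucket_policy("my-email-backup", "123456789012"))
    # To apply it (assumes boto3 is installed and configured):
    #   boto3.client("s3").put_bucket_policy(Bucket="my-email-backup", Policy=policy_json)
```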

I used the following configuration for the bucket:

  • Versioning: disabled. There will be no overwrites or manual access to files, so I do not see a benefit in having it enabled;
  • Encryption: enabled. No comments here 😄
  • ACL: disabled, per AWS best practices;
  • Enforce bucket owner: enabled. I don’t want the files to be owned by AWS’s internal account;
  • Lifecycle rules: move into Standard-IA after 30 days.
Tip
Avoid using Glacier here. AWS SES stores each email as a separate file, making Glacier not cost-effective.
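
For reference, the lifecycle rule from the list above could be expressed roughly as follows. This is a sketch of the structure that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`. The rule ID and function name are made up, and the empty prefix assumes you want the rule to cover the whole bucket.

```python
def standard_ia_lifecycle(days: int = 30) -> dict:
    """Build a lifecycle configuration that moves objects to Standard-IA."""
    if days < 30:
        # S3 does not allow transitions to Standard-IA earlier than 30 days.
        raise ValueError("Standard-IA transitions require at least 30 days")
    return {
        "Rules": [
            {
                "ID": "move-emails-to-standard-ia",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},            # empty prefix: whole bucket
                "Transitions": [
                    {"Days": days, "StorageClass": "STANDARD_IA"},
                ],
            }
        ]
    }


# To apply it (assumes boto3 is installed and configured):
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-email-backup",
#       LifecycleConfiguration=standard_ia_lifecycle(),
#   )
```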

AWS SES

First, you will need to verify a domain identity in the same region where your newly created bucket is. You must verify a domain (not a single address) due to the way Email Receiving is built.

After verifying your domain you can go to the “Email Receiving” section. Create a new ruleset and add a rule to it. Set it up as follows:

  • Enable Require TLS;
  • Disable Check for Spam and Viruses. You don’t need that because Gmail will do that for you;
  • Choose an unused email in the domain you verified, e.g. backup@example.com, and add it to the receipt condition;
  • Add a Deliver to S3 action:
    • Select the bucket you created and, optionally, a key prefix (e.g. v1);
    • Disable encryption as all the files will be encrypted by S3 anyway;
  • Make sure to enable both the rule and the ruleset you just created.
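
If you would rather script this than click through the console, the rule settings above map approximately onto the `Rule` argument of boto3's SES `create_receipt_rule` call. The sketch below only builds that dictionary. The rule name, function name, and example values are hypothetical.

```python
def backup_receipt_rule(recipient: str, bucket_name: str, key_prefix: str = "") -> dict:
    """Build a receipt rule matching the settings listed above."""
    s3_action = {"BucketName": bucket_name}
    if key_prefix:
        s3_action["ObjectKeyPrefix"] = key_prefix
    return {
        "Name": "backup-to-s3",          # hypothetical rule name
        "Enabled": True,
        "TlsPolicy": "Require",          # "Require TLS"
        "ScanEnabled": False,            # "Check for Spam and Viruses" disabled
        "Recipients": [recipient],       # the receipt condition
        "Actions": [{"S3Action": s3_action}],
    }


# To apply it (assumes boto3 is installed and the ruleset already exists):
#   ses = boto3.client("ses", region_name="eu-west-1")
#   ses.create_receipt_rule(
#       RuleSetName="backup",
#       Rule=backup_receipt_rule("backup@example.com", "my-email-backup", "v1"),
#   )
#   ses.set_active_receipt_rule_set(RuleSetName="backup")
```

Leaving out a KMS key in the S3 action corresponds to the "disable encryption" choice above, since the bucket's own encryption still applies.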

Google Workspace

Log in to Google Admin and navigate to Apps -> Workspace -> Gmail -> Hosts. Here you will need to add an entry for the SES Email Receiving SMTP endpoint:

  • Type: Single host;
  • Host: inbound-smtp.{{{ aws-region }}}.amazonaws.com;
  • Port: 25;
  • Enable Perform MX lookup, Require TLS, Require CA-signed certificate, and Validate certificate hostname to enhance security;
  • Click on Test and ensure the connection succeeds.
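
If the Test button fails and you want to rule out Google-side issues, you could try a similar check from your own machine. The sketch below builds the endpoint name from the region and attempts a STARTTLS handshake with certificate and hostname validation, using only the Python standard library. The function names are made up, the port is whatever you entered in the host entry, and your network may block outbound SMTP entirely.

```python
import smtplib
import ssl


def inbound_endpoint(aws_region: str) -> str:
    """Return the SES Email Receiving SMTP host for a region."""
    return f"inbound-smtp.{aws_region}.amazonaws.com"


def supports_starttls(host: str, port: int, timeout: float = 10.0) -> bool:
    """Connect, upgrade to TLS, and report whether the handshake succeeded.

    Raises an exception (e.g. ssl.SSLError or OSError) if the connection
    or the certificate validation fails.
    """
    context = ssl.create_default_context()  # validates CA chain and hostname
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.ehlo()
        if not smtp.has_extn("starttls"):
            return False
        smtp.starttls(context=context)
        smtp.ehlo()
        return True


# Example (performs a real network connection):
#   supports_starttls(inbound_endpoint("eu-west-1"), port=<port from the host entry>)
```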

Now is the final step: routing.

Warning
Be very careful with the next section. Mistakes here could result in your emails being dropped, bounced, and/or sent to a 3rd party.

Navigate to the Gmail Routing settings and add a new route in the Routing section:

  • Name: backup (or whatever you’d like to call it);
  • Conditions: Inbound, Internal receiving;
  • Modify message -> Also deliver to -> Add more recipients:
    • Select the host you created in the previous step as the route;
    • Change the recipient to the email you chose in AWS configuration (e.g. backup@example.com);
    • Enable Suppress bounces;
    • Optionally, you may want to enable adding of X-Gm-* headers. This could be useful if you have many users in your Google Workspace;
  • Enable Require TLS;
  • Set Account types to affect to Users. You don’t need to include Groups because emails to your groups will be sent to a user anyway.

Verification

  • Send an email to one of your Google Workspace users from a different mailbox;
  • Wait for it to arrive;
  • Navigate to Google Admin -> Reporting -> Email log search;
  • Enter the email you just used in the “Sender” field and click “Search”;
  • Verify that the email was delivered to S3.
Expected state in Google Admin Email Log
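
To double-check the S3 side, you can download the new object and confirm it is the message you sent. As far as I can tell, SES stores each message as a raw MIME file, so Python's standard `email` package should be able to read it. The sketch below assumes you already have the object's bytes (for example from boto3's `get_object(...)["Body"].read()`). The function name and the sample message are made up.

```python
from email import message_from_bytes, policy


def summarize_raw_email(raw: bytes) -> dict:
    """Extract a few headers and a short body preview from a raw MIME message."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "preview": body.get_content().strip()[:80] if body else "",
    }


if __name__ == "__main__":
    # Hypothetical stand-in for an object downloaded from the bucket.
    sample = (
        b"From: alice@example.org\n"
        b"To: user@example.com\n"
        b"Subject: Backup test\n"
        b"\n"
        b"Hello from the test mailbox.\n"
    )
    print(summarize_raw_email(sample))
```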

Troubleshooting

Delivery to SES bounced with the “mailbox unavailable” error

That likely means you have not verified your domain in AWS SES or verification hasn’t been completed yet. Note that you need to verify the whole domain and not a single email address for SES Email Receiving to work.
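
One way to confirm the verification state is to ask SES directly. The sketch below interprets the response of boto3's `get_identity_verification_attributes` call. The helper name is made up, and the sample response only mimics the shape I expect from that API.

```python
def domain_is_verified(response: dict, domain: str) -> bool:
    """Return True if the SES response reports the domain as verified."""
    attributes = response.get("VerificationAttributes", {})
    return attributes.get(domain, {}).get("VerificationStatus") == "Success"


if __name__ == "__main__":
    # With boto3 installed, the real response would come from:
    #   boto3.client("ses", region_name="eu-west-1") \
    #        .get_identity_verification_attributes(Identities=["example.com"])
    sample_response = {
        "VerificationAttributes": {
            "example.com": {"VerificationStatus": "Pending"},
        }
    }
    print(domain_is_verified(sample_response, "example.com"))
```

A status other than "Success", or a domain missing from the response altogether, would be consistent with the bounce described above.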