Authenticating S3 access using non-anonymous request URLs

If you run a small data center with capped bandwidth, you don't want to be delivering bulk data to customers yourself. It is better to place the data in the cloud and redirect your customers to fetch it from there. Amazon's S3 is a good place for that, as creating a public URL is trivial. If the data is not public, S3 has a simple mechanism that lets you authenticate access. You run your own authentication service, and this service prepares a signed, time-limited URL that you give to the client to download the data from S3. The whole network interaction takes place over SSL, so you don't need to worry much about the URL escaping into the wild, and even if it did, the exposure is time limited.

The AWS S3 service calls this a non-anonymous request URL. For example, if your data is in the "2019-Q4.tsv" object in the "com.andrewgilmartin.bucket1" bucket, then its plain URL is

https://s3.amazonaws.com/com.andrewgilmartin.bucket1/2019-Q4.tsv

Your authentication service will (after authenticating the user) redirect the user's HTTP client to the URL

https://s3.amazonaws.com/com.andrewgilmartin.bucket1/2019-Q4.tsv
    ?AWSAccessKeyId=<<AWS_ACCESS_KEY>>
    &Expires=<<EXPIRES>>
    &Signature=<<SIGNATURE>>

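This is the non-anonymous request URL. The <<EXPIRES>> value is the expiration time in seconds since the Unix epoch. The <<SIGNATURE>> is the URL-encoded base64 encoding of an HMAC-SHA1 digest computed over a newline-separated string containing the HTTP method ("GET"), two empty lines for the unused Content-MD5 and Content-Type fields, the expiration time (<<EXPIRES>>), and the path ("/com.andrewgilmartin.bucket1/2019-Q4.tsv"). The secret key corresponding to <<AWS_ACCESS_KEY>> is used as the HMAC key, so only someone holding that key can produce a valid signature. An example Java implementation is at S3RestAuthenticationUrlFactory.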
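
To make the signing step concrete, here is a minimal sketch in Java using only the JDK. It is not the S3RestAuthenticationUrlFactory code mentioned above. The class name SignedS3Url, the method name sign, and the credentials in main are placeholders invented for illustration. It assumes the older Signature Version 2 query string scheme described here, a plain GET with no extra headers, and an object key that needs no URL escaping.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, named for this example only.
class SignedS3Url {

    // Returns a time-limited GET URL for /bucket/key.
    // expiresEpochSeconds is the expiry time in seconds since the Unix epoch.
    static String sign(String accessKeyId, String secretKey,
                       String bucket, String key, long expiresEpochSeconds) {
        // Assumption: bucket and key contain no characters that need escaping.
        String resource = "/" + bucket + "/" + key;

        // String to sign: method, Content-MD5 (empty), Content-Type (empty),
        // expiry, and resource path, separated by newlines.
        String stringToSign = "GET\n\n\n" + expiresEpochSeconds + "\n" + resource;

        try {
            // HMAC-SHA1 keyed with the secret key, then base64 encoded.
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(
                secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
            String signature = Base64.getEncoder().encodeToString(digest);

            // Base64 output may contain '+', '/' and '=', so it must be URL-encoded.
            return "https://s3.amazonaws.com" + resource
                + "?AWSAccessKeyId=" + URLEncoder.encode(accessKeyId, "UTF-8")
                + "&Expires=" + expiresEpochSeconds
                + "&Signature=" + URLEncoder.encode(signature, "UTF-8");
        } catch (GeneralSecurityException | UnsupportedEncodingException e) {
            throw new IllegalStateException("unable to sign URL", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder credentials; substitute the IAM user's real key pair.
        long fiveMinutesFromNow = System.currentTimeMillis() / 1000 + 300;
        System.out.println(sign("AKIDEXAMPLE", "not-a-real-secret",
            "com.andrewgilmartin.bucket1", "2019-Q4.tsv", fiveMinutesFromNow));
    }
}
```

The authentication service would call something like sign(...) after it has authenticated the user and then answer with an HTTP redirect to the returned URL. A short expiry, such as the five minutes used in main, keeps the window small should the URL leak.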

For any of this to work you will need an AWS access key id and secret key that are associated with an IAM user whose policy allows access to the S3 bucket. If you have not done this before, the video "AWS S3 Bucket Security, Restrict Privileges to User using IAM Policy" is a good tutorial. If you only want to allow read access, remove the "s3:PutObject" and "s3:DeleteObject" actions from the example policy.