Authenticating S3 access using non-anonymous request URLs

If you run a small data center with capped bandwidth, you don't want to be delivering bulk data to customers. It is better to place the data in the cloud and redirect your customers there. Amazon's S3 is a good place for that, as creating a public URL is trivial. If the data is not public, S3 has a simple mechanism that lets you authenticate access. You run your own authentication service, which prepares a signed, time-limited URL that you give to the client to download the data from S3. The whole network interaction happens over SSL, so you don't need to worry much about the URL escaping into the wild, and even if it did, the exposure is time limited.

The AWS S3 service calls this a non-anonymous request URL. For example, if your data is in the "2019-Q4.tsv" item in the "com.andrewgilmartin.bucket1" bucket the URL is

https://s3.amazonaws.com/com.andrewgilmartin.bucket1/2019-Q4.tsv

Your authentication service will (after authenticating the user) redirect the user's HTTP client to the URL

https://s3.amazonaws.com/com.andrewgilmartin.bucket1/2019-Q4.tsv
    ?AWSAccessKeyId=<<AWS_ACCESS_KEY>>
    &Expires=<<EXPIRES>>
    &Signature=<<SIGNATURE>>

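This is the non-anonymous request URL. The <<SIGNATURE>> is the base64 encoding of an HMAC-SHA1 signature (a keyed hash, not encryption) computed over the HTTP method ("GET"), the expiration time (<<EXPIRES>>, in seconds since the Unix epoch), and the path ("/com.andrewgilmartin.bucket1/2019-Q4.tsv"). The secret key corresponding to <<AWS_ACCESS_KEY>> is the signing key. Because base64 output can contain "+", "/", and "=", the signature must be URL-encoded before it is placed in the query string. An example Java implementation is at S3RestAuthenticationUrlFactory.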
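The signing step can be sketched in a few lines of Java. This is a minimal sketch, not the S3RestAuthenticationUrlFactory implementation mentioned above. It assumes the older AWS Signature Version 2 query-string scheme, in which the string to sign for a plain GET is the method, two empty lines (for the unused Content-MD5 and Content-Type), the expiration time in seconds since the epoch, and the path, and in which the signing function is HMAC-SHA1 keyed with the secret key. The class name, method names, and credential values are placeholders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SignedUrlSketch {

    // Base64 of HMAC-SHA1 over the (assumed) Signature Version 2 string to sign.
    static String sign(String secretKey, String method, String path, long expires) {
        // The two empty lines stand for Content-MD5 and Content-Type, unused for a plain GET.
        String stringToSign = method + "\n\n\n" + expires + "\n" + path;
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the redirect URL; the signature is URL-encoded because base64 may contain + / =.
    static String signedUrl(String accessKey, String secretKey, String path, long expires) {
        return "https://s3.amazonaws.com" + path
                + "?AWSAccessKeyId=" + urlEncode(accessKey)
                + "&Expires=" + expires
                + "&Signature=" + urlEncode(sign(secretKey, "GET", path, expires));
    }

    private static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        long expires = System.currentTimeMillis() / 1000 + 300; // valid for five minutes
        System.out.println(signedUrl("EXAMPLE_ACCESS_KEY", "EXAMPLE_SECRET_KEY",
                "/com.andrewgilmartin.bucket1/2019-Q4.tsv", expires));
    }
}
```

The authentication service would send the returned URL as the Location of its redirect. Some newer AWS regions accept only Signature Version 4, so check which scheme your bucket's region supports before relying on this form.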

For any of this to work you will need an AWS access key id and secret key that are associated with an IAM user with a policy to access the S3 bucket. If you have not done this before, the video AWS S3 Bucket Security, Restrict Privileges to User using IAM Policy is a good tutorial. If you only want to allow read access then remove the "s3:PutObject" and "s3:DeleteObject" actions from the example policy.

Creating a Maven project for your web application

This posting continues the series on moving from an Ant to a Maven build. If you are interested in my help with your Ant to Maven transition contact me at andrew@andrewgilmartin.com.

The last stage is to actually move the Ant build to Maven. Your source tree is now quite spartan. It contains the web application, lots of configuration files, servlets or controllers, and non-core supporting classes. As before, you will create a new Maven project, establish dependencies, copy your files, and build and test until complete.

This project is a combination of webapp and Java, but Maven does not have an automated way of creating this. Instead, you need to first create the webapp project and then create the Java tree. Create the webapp project

mvn archetype:generate \
  -DarchetypeGroupId=org.apache.maven.archetypes \
  -DarchetypeArtifactId=maven-archetype-webapp \
  -DarchetypeVersion=1.4 \
  -DinteractiveMode=false \
  -DgroupId=com.andrewgilmartin \
  -DartifactId=system-application \
  -Dversion=1.0-SNAPSHOT

Now create the Java tree

cd system-application
mkdir -p \
  src/main/java \
  src/main/resources \
  src/test/java \
  src/test/resources

The result is

.
├── pom.xml
└── src
    ├── main
    │   ├── java
    │   ├── resources
    │   └── webapp
    │       ├── WEB-INF
    │       │   └── web.xml
    │       └── index.jsp
    └── test
        ├── java
        └── resources

The pom.xml file is little different from those created before. The significant change is the <packaging/> element

<packaging>war</packaging>

The "war" value directs Maven to create the war instead of a jar (the default). Now add to the pom.xml the common and system-core dependencies, and any other dependencies specific to the application.

Your web application runs within a servlet container and that container provides some of your dependencies. You need these dependencies for compilation, but they should not be bundled into your war. Maven calls these "provided" dependencies. For these dependencies add a <scope/> element to your <dependency/> element, eg

<dependency>
    <groupId>org.apache.tomcat</groupId>
    <artifactId>tomcat-servlet-api</artifactId>
    <version>8.0.15</version>
    <scope>provided</scope>
</dependency>

Copy the application's code, configuration, and webapp from the system source tree to this project. Build and test as normal until you have clean results.

Creating Maven projects for the core packages and command line tools

This posting continues the series on moving from an Ant to a Maven build. If you are interested in my help with your Ant to Maven transition contact me at andrew@andrewgilmartin.com.

With your common packages now having their own Maven build you can move on to the system itself. For this series I am assuming that your system is composed of a web application with several command line tools. The web application is likely a large set of servlets or Spring controllers. It's a monolith and it is going to stay that way for the near future. The command line tools are used for nightly batch operations, ad hoc reports, etc. What they have in common is that they require some of the system's packages to function. Eg, they depend on its data access packages, protocol facilitation packages, billing logic packages, etc. The next stage is to separate the system's core code, the command line tools, and the application code and its configuration.

System Core Project

Create a new Maven project for the system core code

mvn archetype:generate \
  -DgroupId=com.andrewgilmartin \
  -DartifactId=system-core \
  -DarchetypeArtifactId=maven-archetype-quickstart \
  -DarchetypeVersion=1.4 \
  -DinteractiveMode=false

Replace the groupId and artifactId as appropriate.

Copy all the system core code to this project much like you did when extracting the common code. You will likely again find that the core code has entanglements with non-core code that you are going to have to work out. That can be very difficult and require some refactoring; hopefully not significant enough to abandon the whole effort.

As you are assembling the system-core project you may discover that it tries to come to life. You have the Java equivalent of archaea and bacteria, ie a self configuring class or sets of classes. These are classes with static blocks, eg

public class Archaea {
    static { /* do some configuration */ }
}

That static block is executed once, when the class is first initialized, which happens on its first use. Normally this has not been an issue as the classes were always used in the context of the whole system. Now they are isolated. If they depended on external resources or files that are no longer available, then their initialization either fails outright (an uncaught exception in a static block surfaces as an ExceptionInInitializerError, and later uses of the class fail with NoClassDefFoundError) or leaves them in undefined states. You will need to work this out. Can the static block be eliminated or replaced with initialization upon first instance use? Maybe a refactoring to a lazy-initialization or factory pattern is needed.
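One way to replace a static block with initialization on first use is sketched below. It is only an illustration under assumed names: the LazyArchaea class, the "archaea.mode" system property, and the configurationRuns counter are hypothetical and exist only to make the behavior observable. The point is that loading or instantiating the class no longer touches any external resource; configuration runs the first time it is actually needed, and a failure reaches the caller as an ordinary exception rather than poisoning the class.

```java
import java.util.Properties;

final class LazyArchaea {

    // Hypothetical counter, present only so the sketch's behavior can be observed.
    static int configurationRuns = 0;

    // Stays null until some instance actually needs the configuration.
    private static Properties configuration;

    // Replaces the static block: runs at most once, on first use, not at class load.
    private static synchronized Properties configuration() {
        if (configuration == null) {
            Properties loaded = new Properties();
            // Stand-in for reading a file or other external resource.
            loaded.setProperty("mode", System.getProperty("archaea.mode", "default"));
            configurationRuns++;
            configuration = loaded; // assigned only after loading succeeded
        }
        return configuration;
    }

    String mode() {
        return configuration().getProperty("mode");
    }
}
```

Because the field is assigned only after loading succeeds, a failed attempt leaves it null and the next call simply tries again.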

Build and test as normal until you have clean results.

Once your system-core project is complete remove its code from the system's source tree, remove unneeded dependencies from the Ant build.xml, and add the new dependency to the <mvn-dependencies/> element in build.xml. Build and test the system as normal until you have clean results.

Command Line Tool Projects

Now extract the command line tools from the system into their own Maven projects. These projects will depend on the system-common and system-core projects. The Maven build will also need to create an "uberjar", that is a single jar that bundles all the classes and jars needed to run the tool.

Pick a command line tool and create a new Maven project for it as you would normally. Eg, for the gizmo command line tool use

mvn archetype:generate \
  -DgroupId=com.andrewgilmartin \
  -DartifactId=gizmo \
  -DarchetypeArtifactId=maven-archetype-quickstart \
  -DarchetypeVersion=1.4 \
  -DinteractiveMode=false

Replace the groupId and artifactId as appropriate. Add to the pom.xml the system-common and system-core dependencies, and any other dependencies specific to the tool. Copy the tool's code from the system source tree to this project. Build and test as normal until you have clean results.

To create the "uberjar" update pom.xml and replace the whole <plugins/> with

<plugins>
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.1.0</version>
        <configuration>
            <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
            </descriptorRefs>
            <archive>
                <manifest>
                    <addClasspath>true</addClasspath>
                    <mainClass>com.andrewgilmartin.gizmo.App</mainClass>
                </manifest>
            </archive>
        </configuration>
        <executions>
            <execution>
                <id>assemble-all</id>
                <phase>package</phase>
                <goals>
                    <goal>single</goal>
                </goals>
            </execution>
        </executions>
    </plugin>
</plugins>

Replace "com.andrewgilmartin.gizmo.App" with the fully qualified name of the tool's main class. When you now build the Maven project you will see the "maven-assembly-plugin" log

--- maven-assembly-plugin:3.1.0:single (assemble-all) @ gizmo ---
Building jar: /home/ajg/src/gizmo/target/gizmo-1.0-SNAPSHOT-jar-with-dependencies.jar

The file "gizmo-1.0-SNAPSHOT-jar-with-dependencies.jar" is the uberjar. To trial run your command line tool use

java -jar target/gizmo-1.0-SNAPSHOT-jar-with-dependencies.jar

Don't forget to add whatever command line options prevent the tool from doing any actual work!
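If the tool has no such option yet, a dry-run switch is cheap to add. The sketch below is hypothetical: the GizmoApp class name, the "--dry-run" flag, and the messages are all assumptions for illustration, not part of any existing tool. It shows the general shape, which is to parse the flag first and return before any real work starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GizmoApp {

    // Returns a description of what was done, or what would be done in a dry run.
    static String run(List<String> args) {
        boolean dryRun = args.contains("--dry-run");
        List<String> inputs = new ArrayList<>(args);
        inputs.remove("--dry-run"); // everything else is treated as input
        if (dryRun) {
            return "dry run: would process " + inputs;
        }
        // The tool's real work would go here.
        return "processed " + inputs;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(args)));
    }
}
```

With a main class like this, the trial run becomes the same java -jar command with "--dry-run" and the usual inputs appended.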

Once your tool is complete remove its code from the system's source tree and remove unneeded dependencies from the Ant build.xml.

Continue this procedure for each of your command line tools.

Where are we

At this point you have

  1. System common code Maven project
  2. System core code Maven project
  3. Command line tools Maven projects
  4. Remaining system Ant project

The remaining system is just the web application with its configuration, servlets or controllers, and the oddball classes that don't fit in system-common or system-core. The next stage is to refactor the system Ant project itself.