Running Apache in a docker container

When working with web applications that look at the whole of the requested URL, you’ll generally want to use hostnames other than localhost when debugging the app locally in your IDE. In my other blog post I explain my usual setup for dealing with this. Although it works quite well, one drawback is that I have to install and configure Apache every time I get myself a new development machine or decide to try some other flavour of linux distro.

One of my objectives for running Apache in a Docker container is to have a uniform and repeatable process for getting Apache up and running on any development machine (and configured for the project at hand). Enter Docker: you build, you run, it works. Or at least in theory.

The big picture

The general idea for the Docker Apache container is as follows:

  • Set up DNS on my development machine so that browser requests to the various URLs of interest end up at localhost. This is accomplished by adding entries for my web app hostnames to my hosts file.
  • Have the Docker container intercept traffic at ports 80 and 443.
  • Have Apache forward the requests back to port 8080 on my development machine from within the container.
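
The first step can be sketched in a few lines of shell. This is only an illustration: the helper name hosts_entries is made up for this example, and the hostname is the example one used further down in this post, so substitute your own.

```shell
# Hypothetical helper: print one hosts-file line per hostname,
# each pointing at the loopback address.
hosts_entries() {
  for name in "$@"; do
    printf '127.0.0.1\t%s\n' "$name"
  done
}

# Example hostname from this post; replace it with your own.
hosts_entries local-scarlet.acme.eu

# To append the lines for real (needs root on linux and Mac OS):
#   hosts_entries local-scarlet.acme.eu | sudo tee -a /etc/hosts
```

Printing the lines first and appending them in a separate step makes it easy to review what goes into the hosts file.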

Source code for this project: Dockerfile

In the following, I will assume that you have downloaded the source code for this blog post. Check out the master branch and look in the apache-docker directory, which is where I put all the files relevant to this post. I will also assume that you are somewhat familiar with Docker. Here is my Dockerfile together with some explanations:

FROM alpine:3.7

RUN  apk update && apk upgrade && \
     apk add apache2 && \
     apk add apache2-proxy && \
     apk add apache2-ssl && \
     rm -rf /var/cache/apk/*

COPY conf.d/* /etc/apache2/

CMD  [ "/usr/sbin/httpd", "-D", "FOREGROUND"]

  • The FROM statement sets the base image upon which my own image is built from the statements in the rest of the Dockerfile. The alpine image is a very bare and stripped down linux distro. It contains a minimal userland with some utilities (e.g. vi) but no kernel of its own, because containers share the kernel of the host. It also has nothing in the way of managing what is started at system startup. So no systemd or other daemons.
  • The RUN apk statement installs the apache2 packages that I need for running Apache with mods for SSL, proxying and some other basic stuff. At the end of the install I remove the downloaded package files in order to save space. apk is the package manager of the Alpine distro, comparable to Ubuntu's apt. As is good Docker practice, all commands for running an update of the system, installing stuff and cleaning up afterwards are bundled into a single RUN statement. This reduces the number of layers, which is good. See here for some more best practices.
  • The COPY statement takes all files from the conf.d directory in the Docker project directory and places them in the /etc/apache2/ directory in the target image.
  • The CMD statement finally tells Docker what to do once the container starts running. More on this below.

You can build an image from this Dockerfile with the following command:

   docker build -t dimario/apache:1.0 .

Note there is a dot at the end of this command. It names the current directory as the build context, which is where Docker looks for the Dockerfile by default and for the files mentioned in COPY statements. And instead of dimario/apache:1.0 you can use your own tag, as long as you adhere to the Docker naming conventions.

A closer look at the Apache configuration

If you look in the conf.d directory you’ll see that I opted to put all necessary files in a single flat directory structure. In this way I need only a single Docker COPY and it results in one single image layer. If you want to place the files where they normally would go, i.e. the certificate and key in /etc/ssl and the virtualhosts in a subdir of the Apache configuration, you can mimic the correct /etc directory structure in conf.d and copy that to the image's /etc. I will leave this exercise to the reader.

As you can see, I set up all the necessary modules with LoadModule in the httpd.conf file. In that main config file I also set up a “default website” that simply serves http://localhost/index.html by which I can tell if at least Apache is up and running (the files at /var/www/ in the container are installed by apk). For some more debugging aid I also set up the info module which reports all kinds of interesting information when you browse to http://localhost/server-info

Finally, I kept the configuration of my https virtualhosts in separate files which are Included at the end of httpd.conf. One important detail to note in both virtualhost configuration files is the use of a ${DEVHOST} environment variable in the actual redirection statements. I will explain this in the section about the Docker run command line.
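
To give an idea of what such a virtualhost file could look like, here is a rough sketch and not a copy of the files in the repository. The helper name vhost_conf, the certificate and key file names and the mapping of / to /site/ are assumptions made for this example. Only the ${DEVHOST} variable and the http://${DEVHOST}:8080/site/ target come from my actual setup.

```shell
# Hypothetical helper: print an HTTPS virtualhost for the given hostname.
# The certificate and key file names are illustrative assumptions.
vhost_conf() {
  server_name="$1"
  cat <<EOF
<VirtualHost *:443>
    ServerName ${server_name}
    SSLEngine on
    SSLCertificateFile /etc/apache2/server.crt
    SSLCertificateKeyFile /etc/apache2/server.key
    # DEVHOST is expanded by Apache from its environment at startup
    ProxyPass / http://\${DEVHOST}:8080/site/
    ProxyPassReverse / http://\${DEVHOST}:8080/site/
</VirtualHost>
EOF
}

# Example hostname from this post; redirect the output into conf.d to use it.
vhost_conf local-scarlet.acme.eu
```

The backslash in front of ${DEVHOST} keeps the shell from expanding the variable, so the literal text ends up in the configuration file and Apache fills in the value when it starts inside the container.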

CMD: keep Apache running in the container

The actual executable file that is the Apache webserver is named httpd in this distro. So once the container has been initialized by Docker, we want to run the /usr/sbin/httpd program. The way to tell Docker what to run once the container is initialized is by way of the CMD statement.

Here was a tricky piece of the puzzle that took me some time to get right. One Docker quirk is that whatever you start with CMD gets run with process ID 1 (like init in a real linux system) and when that process stops running, so does the container it was running in.

And … the Apache executable normally spawns a couple of child processes and then exits. This behaviour comes in handy during a normal system startup when the init scripting needs to regain control after starting Apache. But in this case it causes the container to stop running as soon as /usr/sbin/httpd exits, which is to say immediately.

Fortunately, there is a way to tell the Apache executable to not exit after spawning its child processes, which is by using the -D FOREGROUND flag. So, the Docker statement

   CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]

causes Docker to start /usr/sbin/httpd -D FOREGROUND as the main process (process ID 1) of the container once the container has been set up. Apache then keeps running in the foreground, so the container stays alive.

Running the container, command line

After building the image with the Apache installation, configuration and startup command in place it is time to actually run it. Here is the rather intimidating command line which I will explain piece by piece:

docker run -it --name rundfunk --hostname rundfunk -e DEVHOST=192.168.1.15 -p:80:80 -p:443:443 dimario/apache:1.0

  • docker run: This is the actual Docker command to run an image, meaning that a container is prepared using the image and then started. The rest of the arguments on the docker run command line are applied partly to the creation of the container and partly to the way it will be started.
  • -it: The -i flag means you want an interactive session, and the -t flag means you want the container to assign a tty (terminal) to that session. Flags or switches are usually combined: instead of writing -i -t, you write -it.
  • --name rundfunk: This is the name I want to use for this container in subsequent interactions with it. If you leave out this optional argument, Docker will assign a randomly generated name. I like to keep track of things so I prefer to name my container myself.
  • --hostname rundfunk: This is the hostname that Docker will patch into the network configuration of the Docker container that is being created. See below for more information. Note that although I use the same value for both --name and --hostname they have a different meaning and are used in a different way. There is no Docker magic that relies on both these arguments having the same value.
  • -e DEVHOST=192.168.1.15: This sets up an environment variable in the Docker container named DEVHOST. The value is the IP address of my development machine. You need to fill in your own IP address here. See below for more information.
  • -p:80:80 and -p:443:443: These two arguments tell Docker to intercept any network traffic directed at the host ports 80 (http) and 443 (https) and send it to the Docker container's corresponding ports instead. See below for more information.
  • dimario/apache:1.0: The final argument is the name (or tag, if we want to be precise) of the Docker image that you want to use for creating the container. As described above, the Dockerfile for this image determines what stuff is installed and configured, and what program should run inside the container after it has started.
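
Since the IP address is the one part of this command line that differs per machine, it can be handy to wrap the command in a small helper. The following is a minimal sketch: the function name apache_run_cmd is made up for this example, and it only prints the command so you can review it before running it.

```shell
# Hypothetical helper: print the docker run command for a given
# development host IP address.
# Usage: apache_run_cmd DEVHOST [IMAGE]
apache_run_cmd() {
  devhost="$1"
  image="${2:-dimario/apache:1.0}"
  printf 'docker run -it --name rundfunk --hostname rundfunk -e DEVHOST=%s -p:80:80 -p:443:443 %s\n' \
    "$devhost" "$image"
}

# Prints the command line shown above.
apache_run_cmd 192.168.1.15
```

Once the printed command looks right, eval "$(apache_run_cmd 192.168.1.15)" runs it in the current terminal.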

The travels of my request

I will explain how all of the above fits together in my particular setup. The overall idea is that I type in a URL (for example https://local-scarlet.acme.eu/some/page?show-all=true) in my browser on my development machine. This request then gets routed to Apache / Docker for decrypting the HTTPS and then routed back to my development machine to be processed by my webapplication which is running in Tomcat controlled by my IDE. Here is what happens to my request:

  1. The hostname local-scarlet.acme.eu is resolved to an IP address. Because I have added this hostname to the hosts file of my development machine, the resolver finds it there and does not need to ask a DNS server. The address turns out to be 127.0.0.1, none other than my very own development machine.
  2. The browser sends the request to IP address 127.0.0.1 port 443 because the URL uses the https protocol.
  3. The Docker container intercepts the request on port 443 because I told it to do so with my -p:443:443 command line argument.
  4. The Docker container has a process listening on port 443 because I told Apache Listen 443 in the configuration. The request is thus picked up by Apache in the container.
  5. Apache uses the certificate and private key (provided by me in the container configuration) to establish an encrypted connection with the browser and decrypt the message.
  6. The ProxyPass directive in the Apache configuration tells Apache that the decrypted message must be forwarded to http://${DEVHOST}:8080/site/. Here is where I once again provide a piece of information: by way of the -e DEVHOST=192.168.1.15 argument on the command line that ran the image and started the container, there is an environment variable set inside the container with the name DEVHOST which holds - by no coincidence - the IP address of my development machine. In fact, when Apache starts the environment variable is evaluated then and there so the forward actually goes to http://192.168.1.15:8080/site/.
  7. Note that port 8080 is not intercepted by the Docker container because I did not add an argument with such instructions to the docker run command line. So the decrypted request is sent to my development machine, and not to the Docker container.
  8. Tomcat is up and running on my development machine and listening on port 8080 so it gracefully receives the request and sends it to my web application, where finally my breakpoint is triggered in the debugger.
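
When a request does not arrive at the breakpoint, it helps to check the hops one at a time. Below is a small sketch for checking step 1 by reading the hosts file directly. The function name hosts_lookup is made up for this example, and it assumes the hosts file lives at /etc/hosts, as it does on linux and Mac OS.

```shell
# Hypothetical helper: print the first IP address that a hosts file maps
# to the given hostname. Prints nothing if the hostname is not listed.
# Usage: hosts_lookup HOSTNAME [HOSTS_FILE]
hosts_lookup() {
  awk -v host="$1" '
    { sub(/#.*/, "") }   # drop comments
    { for (i = 2; i <= NF; i++) if ($i == host) { print $1; exit } }
  ' "${2:-/etc/hosts}"
}

# Example hostname from this post.
hosts_lookup local-scarlet.acme.eu

# The remaining steps can then be exercised in one go. The -k flag skips
# certificate verification, which helps if your certificate is self-signed:
#   curl -k 'https://local-scarlet.acme.eu/some/page?show-all=true'
```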

More considerations

  • Docker controls the content of the /etc/hosts file inside the container. I tried various schemes for adding entries to that file and although I succeeded in one case this was a rather hacky solution that raises more questions than it answers.
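  • With Docker Desktop on Windows and Mac OS there exists a magical hostname host.docker.internal which resolves to the IP address of the host system. On linux this name is not defined by default, but with Docker 20.10 or later you can add it yourself with --add-host=host.docker.internal:host-gateway on the docker run command line.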
  • In linux, but not on other host OSes, you can use --network=host on the docker run command line to make the container use the same networking configuration as the host system. This includes sharing the same hosts file and the same IP address.
  • The network connection from the Docker container back to your development host system apparently is blocked when you use Strongswan VPN on linux. In that case you can try using the Docker IP address of your development host instead of its regular IP address.
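
The difference between host operating systems can be hidden in a small helper that picks the value for DEVHOST. This is a sketch under assumptions: the function name devhost_for is made up, the script is assumed to run in a POSIX shell on the host, and anything that does not report itself as Linux is assumed to have host.docker.internal available.

```shell
# Hypothetical helper: print the value to pass as DEVHOST.
# Usage: devhost_for OS_NAME LINUX_IP
#   OS_NAME   the output of `uname -s` on the host
#   LINUX_IP  the IP address to use when the host runs linux
devhost_for() {
  os="$1"
  linux_ip="$2"
  case "$os" in
    Linux) printf '%s\n' "$linux_ip" ;;
    *)     printf '%s\n' 'host.docker.internal' ;;
  esac
}

# Example IP address from this post; replace it with your own.
devhost_for "$(uname -s)" 192.168.1.15
```

The result can be passed to docker run as -e DEVHOST="$(devhost_for "$(uname -s)" 192.168.1.15)".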

Troubleshooting

A very useful thing to do is log on to an active Docker container in order to have a looksee at what is going on inside. For this you can use the docker exec command line like this:

  docker exec -it rundfunk /bin/sh

The active container is identified by the --name you gave it when you created the container with the docker run command, in this case “rundfunk”. For alpine linux the command shell is sh. If you try this for an Ubuntu derived container you will probably have bash instead of sh.

You will be user root (comparable to Windows admin) inside the container. You can install additional programs, for instance mc (midnight commander) or curl. Remember however that the changes you make to a container are not propagated back to the image from which the container was created. Next time you run the image you get a fresh container without those changes.

Once you have logged on to your running container, you can control Apache:

  • httpd -k stop stops Apache
  • httpd -t tests the validity of the configuration
  • httpd -k start -e debug starts Apache again with the log level set to debug.

The Apache log files are located at /var/log/apache2 in the container.

Mario Pinkster

Java Developer at Sentia Consultancy