When working with web applications that look at the whole of the requested URL, you’ll generally want to use hostnames other than localhost when debugging the app locally in your IDE. In my other blog post I explain my usual setup for dealing with this. Although it works quite well, one drawback is that I have to install and configure Apache every time I get myself a new development machine or decide to try some other flavour of Linux distro.
One of my objectives for running Apache in a Docker container is to have a uniform and repeatable process for getting Apache up and running on any development machine (and configured for the project at hand). Enter Docker: you build, you run, it works. Or at least in theory.
The big picture
The general idea for the Docker Apache container is as follows:
- Set up DNS on my development machine so that browser requests to various URLs of interest end up at localhost. This is accomplished by adding entries for my web app hostnames to my hosts file.
- Have the Docker container intercept traffic at ports 80 and 443.
- Have Apache redirect back to my development machine port 8080 from within the container.
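The first step boils down to one line per hostname in the hosts file. As a minimal sketch, the hypothetical helper add_host_entry below (a name made up for illustration, not part of the project) appends such a line to a hosts-style file unless the hostname is already in there. The hostname local-scarlet.acme.eu is the example used later in this post.

```shell
# Hypothetical helper: add "IP hostname" to a hosts-style file, once.
# Usage: add_host_entry FILE IP HOSTNAME
add_host_entry() {
    hosts_file="$1"
    ip="$2"
    name="$3"
    # -F: match the hostname literally, -w: as a whole word, -q: quietly
    if ! grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# On the development machine (needs root to write /etc/hosts):
#   add_host_entry /etc/hosts 127.0.0.1 local-scarlet.acme.eu
```

Calling it twice with the same hostname leaves a single entry, so it should be safe to rerun on a machine that is already set up.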
Source code for this project: Dockerfile
In the following, I will assume that you have downloaded the source code for this blog post. Check out the master branch and look in the apache-docker directory, which is where I put all the files relevant to this post. I will also assume that you are somewhat familiar with Docker. Here is my Dockerfile together with some explanations:
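FROM alpine:3.7
# Update the system, install Apache with the proxy and SSL modules,
# and clean up the package cache, all in a single layer
RUN apk update && apk upgrade && \
    apk add apache2 && \
    apk add apache2-proxy && \
    apk add apache2-ssl && \
    rm -rf /var/cache/apk/*
# Copy everything from conf.d into the Apache configuration directory
COPY conf.d/* /etc/apache2/
# Run Apache in the foreground so that the container keeps running
CMD [ "/usr/sbin/httpd", "-D", "FOREGROUND"]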
- The FROM statement sets the base image upon which my own image is built from the statements in the rest of the Dockerfile. The alpine image is a very bare and stripped down Linux distro. It has some basic utilities (e.g. vi) and runs on the kernel of the host, but it has nothing in the way of managing what is started at system startup. So no systemd or other daemons.
- The RUN apk statement installs the apache2 packages that I need for running Apache with mods for SSL, redirection and some other basic stuff. At the end of the install I remove the downloaded package files in order to save space. apk is the package manager of the Alpine distro, comparable to apt on Ubuntu. As is good Docker practice, all commands for running an update of the system, installing stuff and cleaning up afterwards are bundled into a single RUN statement. This reduces the number of layers, which is good. See here for some more best practices.
- The COPY statement takes all files from the conf.d directory in the Docker project directory and places them in the /etc/apache2/ directory in the target image.
- The CMD statement finally tells Docker what to do once the container starts running. More on this below.
You can build an image from this Dockerfile with the following command:
docker build -t dimario/apache:1.0 .
Note there is a dot at the end of this command. It tells Docker to look in the current directory for the Dockerfile. And instead of dimario/apache:1.0 you can use your own tag, as long as you adhere to the Docker naming conventions.
A closer look at the Apache configuration
If you look in the conf.d directory you’ll see that I opted to put all necessary files in a single flat directory structure. In this way I need only a single Docker COPY and it results in one single image layer. If you want to place the files where they normally would go, i.e. the certificate and key in /etc/ssl and the virtualhosts in a subdir of the Apache configuration, you can mimic the correct /etc directory structure in conf.d and copy that to the image’s /etc. I will leave this exercise to the reader.
As you can see, I set up all the necessary modules with LoadModule in the httpd.conf file. In that main config file I also set up a “default website” that simply serves http://localhost/index.html, by which I can tell if at least Apache is up and running (the files at /var/www/ in the container are installed by apk). For some more debugging aid I also set up the info module, which reports all kinds of interesting information when you browse to http://localhost/server-info.
Finally, I kept the configuration of my https virtualhosts in separate files which are Included at the end of httpd.conf. One important detail to note in both virtualhost configuration files is the use of a ${DEVHOST} environment variable in the actual redirection statements. I will explain this in the section about the Docker run command line.
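To make the role of ${DEVHOST} concrete, here is a sketch of what one of those virtualhost files could look like, written out with a shell here-document. It is a guess at the shape of such a file, not a copy of the one in the repository: the file name local-scarlet.conf, the certificate and key names, and the choice to proxy the root path / are all assumptions for illustration. Only the hostname and the target http://${DEVHOST}:8080/site/ come from this post.

```shell
# Sketch only: file name, certificate paths and the "/" mapping are assumptions.
conf_dir="${CONF_DIR:-conf.d}"
mkdir -p "$conf_dir"

# The quoted 'EOF' keeps the shell from expanding ${DEVHOST};
# Apache substitutes the environment variable when it reads the file.
cat > "$conf_dir/local-scarlet.conf" <<'EOF'
<VirtualHost *:443>
    ServerName local-scarlet.acme.eu

    # Terminate HTTPS in the container
    SSLEngine on
    SSLCertificateFile /etc/apache2/local-scarlet.crt
    SSLCertificateKeyFile /etc/apache2/local-scarlet.key

    # Forward the decrypted request to the development machine
    ProxyPass / http://${DEVHOST}:8080/site/
    ProxyPassReverse / http://${DEVHOST}:8080/site/
</VirtualHost>
EOF
```

With the COPY statement from the Dockerfile, a file like this would end up at /etc/apache2/local-scarlet.conf in the image, where httpd.conf would still have to Include it.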
CMD: keep Apache running in the container
The actual executable file that is the Apache webserver is named httpd in this distro. So once the container has been initialized by Docker, we want to run the /usr/sbin/httpd program. The way to tell Docker what to run once the container is initialized is by way of the CMD statement.
Here was a tricky piece of the puzzle that took me some time to get right. One Docker quirk is that whatever you start with CMD gets run with process ID 1 (like init in a real Linux system) and when that process stops running, so does the container it was running in.
And … the Apache executable normally spawns a couple of child processes and then exits. This behaviour comes in handy during a normal system startup when the init scripting needs to regain control after starting Apache. But in this case it causes the container to stop running as soon as /usr/sbin/httpd exits, which is to say immediately.
Fortunately, there is a way to tell the Apache executable to not exit after spawning its child processes, which is by using the -D FOREGROUND flag. So, the Docker statement
CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]
causes Docker to execute /usr/sbin/httpd -D FOREGROUND as the main process of the container once the container has been initialized, which runs Apache without exiting immediately.
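The difference between the two startup styles can be felt without Apache or Docker at all. The following is an illustration only, in which plain sh and sleep stand in for httpd: the first command hands its work to a background child and returns at once (the default daemon behaviour), the second stays alive for as long as the work runs (the -D FOREGROUND behaviour). In a container, the moment the command returns is the moment the container stops.

```shell
# Illustration only: sh and sleep stand in for httpd here.

# "Daemon" style: the parent starts a background child and returns at once.
start=$(date +%s)
sh -c 'sleep 2 >/dev/null 2>&1 &'
daemon_elapsed=$(( $(date +%s) - start ))

# "Foreground" style: the parent stays alive as long as the work runs.
start=$(date +%s)
sh -c 'sleep 2'
foreground_elapsed=$(( $(date +%s) - start ))

echo "daemon-style parent returned after ${daemon_elapsed}s"
echo "foreground-style parent returned after ${foreground_elapsed}s"
```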
Running the container, command line
After building the image with the Apache installation, configuration and startup command in place it is time to actually run it. Here is the rather intimidating command line which I will explain piece by piece:
docker run -it --name rundfunk --hostname rundfunk -e DEVHOST=192.168.1.15 -p:80:80 -p:443:443 dimario/apache:1.0
- docker run: This is the actual Docker command to run an image, meaning that a container is prepared using the image and then started. The rest of the arguments on the docker run command line are applied partly to the creation of the container and partly to the way it will be started.
- -it: The -i flag means you want an interactive session, and the -t flag means you want the container to assign a tty (terminal) to that session. Flags or switches are usually combined: instead of writing -i -t, you write -it.
- --name rundfunk: This is the name I want to use for this container in subsequent interactions with it. If you leave out this optional argument, Docker will assign a randomly generated name. I like to keep track of things so I prefer to name my container myself.
- --hostname rundfunk: This is the hostname that Docker will patch into the network configuration of the Docker container that is being created. See below for more information. Note that although I use the same value for both --name and --hostname, they have a different meaning and are used in a different way. There is no Docker magic that relies on both these arguments having the same value.
- -e DEVHOST=192.168.1.15: This sets up an environment variable in the Docker container named DEVHOST. The value is the IP address of my development machine. You need to fill in your own IP address here. See below for more information.
- -p:80:80 and -p:443:443: These two arguments tell Docker to intercept any network traffic directed at the host ports 80 (http) and 443 (https) and send it to the Docker container’s corresponding ports instead. See below for more information.
- dimario/apache:1.0: The final argument is the name (or tag, if we want to be precise) of the Docker image that you want to use for creating the container. As described above, the Dockerfile for this image determines what stuff is installed and configured, and what program should run inside the container after it has started.
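Since the IP address differs per machine, it can be convenient to wrap the command in a small shell function. The sketch below is a convenience under assumptions, not part of the project: run_apache and the DRY_RUN switch are hypothetical names, and the port mappings are spelled -p 80:80, the form used in the Docker documentation, which should be equivalent to the -p:80:80 spelling above.

```shell
# Hypothetical wrapper around the docker run command from this post.
# Usage: run_apache DEVHOST_IP     (set DRY_RUN=1 to only print the command)
run_apache() {
    devhost="$1"
    if [ -z "$devhost" ]; then
        echo "usage: run_apache DEVHOST_IP" >&2
        return 1
    fi
    # Build the command as positional parameters so quoting stays intact
    set -- docker run -it --name rundfunk --hostname rundfunk \
        -e "DEVHOST=$devhost" -p 80:80 -p 443:443 dimario/apache:1.0
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Running DRY_RUN=1 run_apache 192.168.1.15 prints the full command without starting anything, which is handy for checking the arguments before the container is actually created.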
The travels of my request
I will explain how all of the above fits together in my particular setup. The overall idea is that I type in a URL (for example https://local-scarlet.acme.eu/some/page?show-all=true) in my browser on my development machine. This request then gets routed to Apache in the Docker container, which decrypts the HTTPS, and is then routed back to my development machine to be processed by my web application, which is running in Tomcat controlled by my IDE. Here is what happens to my request:
- The hostname local-scarlet.acme.eu is looked up. Because I have added this hostname to the hosts file of my development machine, the lookup finds an IP address for this hostname. It turns out to be 127.0.0.1, none other than my very own development machine.
- The browser sends the request to IP address 127.0.0.1 port 443 because the URL uses the https protocol.
- The Docker container intercepts the request on port 443 because I told it to do so with my -p:443:443 command line argument.
- The Docker container has a process listening on port 443 because I told Apache Listen 443 in the configuration. The request is thus picked up by Apache in the container.
- Apache uses the certificate and private key (provided by me in the container configuration) to establish an encrypted connection with the browser and decrypt the message.
- The ProxyPass directive in the Apache configuration tells Apache that the decrypted message must be forwarded to http://${DEVHOST}:8080/site/. Here is where I once again provide a piece of information: by way of the -e DEVHOST=192.168.1.15 argument on the command line that ran the image and started the container, there is an environment variable set inside the container with the name DEVHOST which holds, by no coincidence, the IP address of my development machine. In fact, the environment variable is evaluated when Apache starts, so the forward actually goes to http://192.168.1.15:8080/site/.
- Note that port 8080 is not intercepted by the Docker container because I did not add an argument with such instructions to the docker run command line. So the decrypted request is sent to my development machine, and not to the Docker container.
- Tomcat is up and running on my development machine and listening on port 8080, so it gracefully receives the request and sends it to my web application, where finally my breakpoint is triggered in the debugger.
More considerations
- Docker controls the content of the /etc/hosts file inside the container. I tried various schemes for adding entries to that file, and although I succeeded in one case, it was a rather hacky solution that raises more questions than it answers.
- On Windows and macOS, but not on Linux, there exists a magical hostname host.docker.internal which resolves to the IP address of the host system. On Linux with Docker 20.10 or later, adding --add-host=host.docker.internal:host-gateway to the docker run command line should provide the same hostname.
- On Linux, but not on other host OSes, you can use --network=host on the docker run command line to make the container use the same networking configuration as the host system. This includes sharing the same hosts file and the same IP address.
- The network connection from the Docker container back to your development host system apparently is blocked when you use strongSwan VPN on Linux. In that case you can try using the Docker IP address of your development host instead of its regular IP address.
Troubleshooting
A very useful thing to do is log on to an active Docker container in order to have a looksee at what is going on inside. For this you can use the docker exec command line like this:
docker exec -it rundfunk /bin/sh
The active container is identified by the --name you gave it when you created the container with the docker run command, in this case “rundfunk”. For alpine Linux the command shell is sh. If you try this for an Ubuntu derived container you will probably have bash instead of sh.
You will be user root (comparable to Windows admin) inside the container. You can install additional programs, for instance mc (midnight commander) or curl. Remember however that the changes you make to a container are not propagated back to the image from which the container was created. Next time you run the image you get a fresh container.
Once you have logged on to your running container, you can control Apache:
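- httpd -k stop stops Apache. Keep in mind that httpd is the main process of the container (see the CMD section above), so stopping it may well stop the container too; httpd -k graceful reloads the configuration without stopping.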
- httpd -t tests the validity of the configuration
- httpd -k start -e debug starts Apache again with the log level set to debug.
The Apache log files are located at /var/log/apache2 in the container.