Discovering addresses of all instances that form a service

If you browse through the official Docker documentation, you will not find any reference to addresses of individual instances that form a service.

The previous sentence might not be true at the time you're reading this. Someone might have updated the documentation. However, at the time I'm writing this chapter, there is not a trace of such information.

The fact that something is not documented does not mean that it does not exist. Indeed, there is a special DNS name that returns the IPs of all the instances.

To see it in action, we'll create a global service called util and attach it to the proxy network:

docker service create --name util \
--network proxy --mode global \
alpine sleep 1000000000

docker service ps util

Before proceeding, please wait until the current state is set to running.
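The waiting can also be scripted. What follows is only a minimal sketch and not part of the original walkthrough. It assumes a Docker version in which docker service ps supports --format with the .CurrentState placeholder, and the helper names all_running and wait_for_service are made up for this illustration.

```shell
# Succeeds only if at least one non-empty line was read from stdin
# and the first word of every such line is "Running".
all_running() {
  awk 'BEGIN { n = 0; bad = 0 }
       NF { n++; if ($1 != "Running") bad = 1 }
       END { if (n > 0 && bad == 0) exit 0; exit 1 }'
}

# Polls the tasks of service $1 once per second until all of them report Running.
# Assumption: only tasks whose desired state is "running" are of interest.
# Note: this loops forever if the service does not exist or never starts.
wait_for_service() {
  until docker service ps "$1" \
      --filter desired-state=running \
      --format '{{.CurrentState}}' | all_running; do
    sleep 1
  done
}

# Example (requires a Swarm cluster): wait_for_service util
```

Splitting the check from the polling loop keeps the state test easy to try out with sample text, without a running cluster.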

Next, we'll find the ID of the util container running on the current node and install drill, a tool that shows information about DNS entries:

ID=$(docker ps -q --filter label=com.docker.swarm.service.name=util)

docker exec -it $ID apk add --update drill

Let's start by drilling the DNS name proxy:

docker exec -it $ID drill proxy

The output is as follows:

;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 31878
;; flags: qr rd ra ; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; proxy. IN A

;; ANSWER SECTION:
proxy. 600 IN A 10.0.0.2

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 0 msec
;; SERVER: 127.0.0.11
;; WHEN: Fri Sep 9 16:43:23 2016
;; MSG SIZE rcvd: 44

As you can see, even though we are running three instances of the service, only one IP is returned (10.0.0.2). That is the IP of the service, not of an individual instance. To be more concrete, it is the IP of the proxy service's network endpoint. When a request reaches that endpoint, Docker networking load-balances it across all the instances.

In most cases, we do not need anything else. All we have to know is the name of the service and Docker will do the rest of the work for us. However, in a few cases, we might need more. We might need to know the IPs of every single instance of a service. That is the problem Docker Flow Proxy faced.

To find the IPs of all the instances of a service, we can use the "undocumented" feature. We need to add the tasks. prefix to the service name, as in tasks.<SERVICE_NAME>.

Let's drill again:

docker exec -it $ID drill tasks.proxy

This time, the output is different:

;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 54408
;; flags: qr rd ra ; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; tasks.proxy. IN A

;; ANSWER SECTION:
tasks.proxy. 600 IN A 10.0.0.4
tasks.proxy. 600 IN A 10.0.0.3
tasks.proxy. 600 IN A 10.0.0.5

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 0 msec
;; SERVER: 127.0.0.11
;; WHEN: Fri Sep 9 16:48:46 2016
;; MSG SIZE rcvd: 110

We got three answers, each with a different IP: 10.0.0.4, 10.0.0.3, and 10.0.0.5.
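When the addresses are needed by a script rather than by a person, the answer section can be reduced to a plain list of IPs. The sketch below is one possible way to do that with awk. The function name extract_ips is hypothetical, and the code assumes the drill output format shown above, in which the answer records are the only lines that do not begin with a semicolon.

```shell
# Prints the address of every A record found in drill output read from stdin.
# Lines starting with ";" are drill comments and are skipped.
# Carriage returns are stripped in case the output came through a pseudo-TTY.
extract_ips() {
  tr -d '\r' | awk '$1 !~ /^;/ && $3 == "IN" && $4 == "A" { print $5 }'
}

# Example (requires the util container from this section):
#   docker exec $ID drill tasks.proxy | extract_ips
```

The example drops the -it flags because the output is piped into another command rather than read on a terminal.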

Knowing the IPs of all the instances solves the problem of synchronizing data between them. With tasks.<SERVICE_NAME> we have all the info we need. The rest is only a bit of coding that will utilize those IPs. It is similar to the mechanism used when synchronizing databases (more on that later).
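As an illustration of what that bit of coding might look like, the sketch below runs a command once for every instance address. It is not how Docker Flow Proxy is implemented. The helper for_each_instance is hypothetical, and the command passed to it stands in for whatever synchronization request a real service would send.

```shell
# Reads one IP per line from stdin and runs the given command once per IP,
# passing the IP as the last argument. Empty lines are ignored.
# Returns non-zero if the command failed for at least one IP.
for_each_instance() {
  rc=0
  while IFS= read -r ip; do
    [ -n "$ip" ] || continue
    # stdin is redirected so the command cannot swallow the remaining IPs
    "$@" "$ip" </dev/null || rc=1
  done
  return "$rc"
}

# Example with addresses typed in by hand; in practice they would come
# from a DNS query for tasks.<SERVICE_NAME>:
#   printf '10.0.0.4\n10.0.0.3\n10.0.0.5\n' | for_each_instance echo "sync with"
```

Continuing after a failed call, instead of stopping at the first error, lets the remaining instances still be reached while the non-zero return code reports that something went wrong.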

We are not done yet. The fact that we can synchronize data on demand (or in response to events) does not mean that the service is fault tolerant. What should we do if we need to create a new instance? What happens if an instance fails and Swarm reschedules it somewhere else?