Docker Networking Model (4) — External Network Access

HungWei Chiu
12 min read · Apr 22, 2023

This article is the fourth in a series. For those who are unfamiliar, please refer to the earlier articles in the series:

Preface

In the previous article, we built a Bridge network model step by step using commands and finally succeeded in letting two containers reach each other with the ping command. However, neither container could access the Internet, and the iptables commands we ran remained unexplained. In this article, we clarify both points and bring the topic to a conclusion.

This series explores container networking from three perspectives, each examined along three main axes. The perspectives are:

  • How containers access each other
  • How containers actively access external services
  • How external networks actively access containers

The third point is what the docker run -p feature provides, so the example will also explain in detail how to use it.

Each perspective has three main axes to explore, which are:

  • Who the packet is sent to
  • Who processes the packet
  • Who filters the packet

In technical terms, the above three concepts are roughly the following items, but this article will not go into too much detail on each component:

  • routing table + forwarding table
  • kernel + iptables
  • conntrack
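Each of these can be inspected from the shell. As a quick reference (assuming the iproute2, iptables, and conntrack-tools packages are installed; all of the commands are read-only):

```shell
# Who the packet is sent to: the kernel routing table
ip route show

# Who filters the packet: iptables rules in the filter and nat tables
sudo iptables -t filter -S
sudo iptables -t nat -S

# Who tracks connection state: the conntrack table
sudo conntrack -L
```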

Environment

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic
$ uname -a
Linux k8s-dev 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ docker --version
Docker version 19.03.13, build 4484c46d9d

How containers actively access external services

After covering how containers reach each other, we can dig deeper into how a container accesses an external service, such as Google's public DNS at 8.8.8.8. This is one of the most common needs for containerized services; without Internet access, many things simply cannot be done.

This example uses a container as our client and reaches the external host 8.8.8.8 with the ping command.

The architecture is shown below; the ultimate goal is for container c1 to reach 8.8.8.8 through ping.

Who the packet is sent to

The first decision is where to send the packet: who does it go to? Let's review the routing table inside container c1.

$ docker exec -it c1 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.55.66.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0

If we try to connect to 8.8.8.8, we get an error because the system has no rule that matches 8.8.8.8 and therefore does not know how to route the packet.

$ docker exec -it c1 ping 8.8.8.8
connect: Network is unreachable

Network debugging requires extra caution. I suggest starting every network debugging session with a diagram of the architecture, then reasoning about how you expect the packet to travel. At each key point, ask: can you prove that step is right or wrong? This process steadily narrows the scope of possible errors.

To solve this problem, we could explicitly tell the system how to send packets for 8.8.8.8. The trouble with this approach is that if you want to access 1.1.1.1 tomorrow, you have to write yet another rule. So we take a different approach.

If no rule matches, fall back to a default! This is common practice in system design, because no one can predict which site you will want to access, and writing a rule per destination is unreasonable.

With this idea in mind, the next question is: who do we send the packet to for processing? This touches on L3 routing concepts that are relatively complex, but the conclusion is that we want the host to handle it for us. If the host itself can reach the external network, we can rely on it: we just need a way to deliver the packet to the host so that it knows there is a packet to process.

To achieve this, we need to do the following configurations:

  • Give the Linux Bridge (hwchiu0) an IP address (10.55.66.1)
  • Tell the container that the default gateway is to send the packet to Linux Bridge (hwchiu0)

The above two concepts are translated into system commands as follows:

$ sudo ifconfig hwchiu0 10.55.66.1 netmask 255.255.255.0
$ sudo docker exec -it c1 ip route add default via 10.55.66.1
$ sudo docker exec -it c1 ip route show
default via 10.55.66.1 dev eth0
10.55.66.0/24 dev eth0 proto kernel scope link src 10.55.66.2

After this configuration, the container has one additional rule: by default, packets are sent out through eth0, with the gateway set to 10.55.66.1.

This section does not explain the Gateway concept in depth; just think of it as asking the Linux Bridge to forward on our behalf: send the packets to it and that's it!
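Before worrying about the Internet at all, a quick sanity check is worthwhile at this step: the container should now at least be able to reach its gateway. A hedged sketch, using the article's names:

```shell
# The bridge IP 10.55.66.1 is the container's gateway; if this ping fails,
# the problem is inside the host itself, not on the path to the Internet.
sudo docker exec -it c1 ping -c 3 10.55.66.1
```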

Next, we will monitor the packets on eth0 to see what happens when container c1 sends packets to 8.8.8.8.

To keep iptables' filtering from interfering, we first change its default policy so that packets are allowed.

Open two terminals in the host and run

Terminal 1

$ sudo iptables -P FORWARD ACCEPT
$ sudo docker exec -it c1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.

Terminal 2

$ sudo tcpdump -vvvnn -i eth0 icmp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:01:20.425931 IP (tos 0x0, ttl 63, id 60779, offset 0, flags [DF], proto ICMP (1), length 84)
10.55.66.2 > 8.8.8.8: ICMP echo request, id 16700, seq 1, length 64
19:01:21.445051 IP (tos 0x0, ttl 63, id 60899, offset 0, flags [DF], proto ICMP (1), length 84)
10.55.66.2 > 8.8.8.8: ICMP echo request, id 16700, seq 2, length 64

Running the above commands shows that the packets still do not get through: no ICMP reply comes back. However, the packets can be observed on the host's eth0, marked as 10.55.66.2 trying to reach 8.8.8.8. So far only the ICMP request is visible, with no response.

The flow above can be further explained by the following flow chart.

How to read the diagram:

  • The white boxes represent different components; an IP above a component is that component's IP address.
  • Arrows between components describe the flow of packets and indicate the source and destination IP addresses the packets carry.
  • The description below each packet flow shows which components are involved at that point.

Recap of the Current Flow and Ideas

  • When a packet is sent from the container to 8.8.8.8, the routing rule sends it out through eth0.
  • After the packet leaves eth0, it reaches the veth0 virtual network card on the host, thanks to the properties of veth pairs.
  • From veth0 the packet enters the world of the Linux Bridge and, via the forwarding table, finally reaches the Linux Bridge (hwchiu0) itself.
  • Once hwchiu0 receives the packet, processing becomes the kernel's job, which is too complex to cover here.
  • The kernel consults its own routing table to decide how to forward the packet to 8.8.8.8, and sends it out through the host's eth0 network card accordingly.
  • The packet is sent out, but we never observe a packet coming back.

Who processes the packet

The reason we never receive a reply is simple: the source of the packet we sent is 10.55.66.2, a private address that countless hosts around the world may also be using. 8.8.8.8 therefore has no way to route a reply back to 10.55.66.2.
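The "private address" claim can be checked mechanically. The `is_private` helper below is our own illustrative sketch (not from the article); it matches the RFC 1918 ranges with simple shell pattern matching rather than real subnet math:

```shell
#!/bin/sh
# Classify an IPv4 address as RFC 1918 private or not (illustration only;
# textual prefix matching, not proper CIDR arithmetic).
is_private() {
  case "$1" in
    10.*)                                  echo private ;;  # 10.0.0.0/8
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) echo private ;;  # 172.16.0.0/12
    192.168.*)                             echo private ;;  # 192.168.0.0/16
    *)                                     echo public ;;
  esac
}

is_private 10.55.66.2   # the container's address: private
is_private 10.0.2.15    # the host's address: also private
is_private 8.8.8.8      # public
```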

At this point, think about it: our host can go online, and its packets do come back. Can we ask the host to do us a favor and rewrite the packet's source to its own IP? The host can then find a way to deliver the returning packet back to the container behind it.

This concept is called Network Address Translation (NAT) and in this example, we want to modify the source IP address of the packet, so this behavior is called Source NAT (SNAT).

To achieve this goal, we use iptables rules. iptables offers several ways to meet this requirement; we choose the simplest and most common one, MASQUERADE, which handles SNAT dynamically.

We use the following rule to tell iptables: from now on, whenever you see a packet from 10.55.66.2/32 leaving through the host's eth0 interface, rewrite the packet's source address to your own.

$ sudo iptables -t nat -I POSTROUTING -s 10.55.66.2/32 -o eth0 -j MASQUERADE
$ sudo docker exec -it c1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=18.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=61 time=15.0 ms
^C
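MASQUERADE looks up the outgoing interface's current address for each connection, which is what makes it work even when the host's IP is dynamic. As a hedged aside, if the host's address is static, an explicit SNAT target achieves the same result (here 10.0.2.15 stands in for this environment's eth0 address):

```shell
# Equivalent rule with a fixed source address instead of MASQUERADE.
# Use only when the host IP never changes; otherwise the rule goes stale.
sudo iptables -t nat -I POSTROUTING -s 10.55.66.2/32 -o eth0 -j SNAT --to-source 10.0.2.15
```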

Now, when we listen to packets using tcpdump, we will see that everything is different!

$ sudo tcpdump -vvvnn -i eth0 icmp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:40:52.893673 IP (tos 0x0, ttl 63, id 39804, offset 0, flags [DF], proto ICMP (1), length 84)
10.0.2.15 > 8.8.8.8: ICMP echo request, id 19069, seq 3, length 64
19:40:52.906130 IP (tos 0x0, ttl 62, id 11543, offset 0, flags [DF], proto ICMP (1), length 84)
8.8.8.8 > 10.0.2.15: ICMP echo reply, id 19069, seq 3, length 64
19:40:54.962322 IP (tos 0x0, ttl 63, id 40179, offset 0, flags [DF], proto ICMP (1), length 84)
10.0.2.15 > 8.8.8.8: ICMP echo request, id 19079, seq 1, length 64
19:40:54.977470 IP (tos 0x0, ttl 62, id 11559, offset 0, flags [DF], proto ICMP (1), length 84)
8.8.8.8 > 10.0.2.15: ICMP echo reply, id 19079, seq 1, length 64

Let's look at the output in two parts.

19:40:52.893673 IP (tos 0x0, ttl 63, id 39804, offset 0, flags [DF], proto ICMP (1), length 84)
10.0.2.15 > 8.8.8.8: ICMP echo request, id 19069, seq 3, length 64

We can see that the outgoing packet's source IP is no longer 10.55.66.2 but the host's own 10.0.2.15. You might wonder why 8.8.8.8 can respond even though 10.0.2.15 is also a private IP: there is another layer of SNAT outside my test environment. It is normal and reasonable for a packet to undergo multiple SNATs, and the operating logic is the same at every layer.

    8.8.8.8 > 10.0.2.15: ICMP echo reply, id 19069, seq 3, length 64

Here we can also see the packet returning successfully, and the ICMP reply is visible from within the container.

As a supplement, if we monitor packets on hwchiu0, the Linux Bridge interface, we get the following.

$ sudo tcpdump -vvvnn -i hwchiu0 icmp
tcpdump: listening on hwchiu0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:47:55.128471 IP (tos 0x0, ttl 64, id 18861, offset 0, flags [DF], proto ICMP (1), length 84)
10.55.66.2 > 8.8.8.8: ICMP echo request, id 19515, seq 1, length 64
19:47:55.146004 IP (tos 0x0, ttl 61, id 11872, offset 0, flags [DF], proto ICMP (1), length 84)
8.8.8.8 > 10.55.66.2: ICMP echo reply, id 19515, seq 1, length 64

In this situation, from the perspective of the bridge hwchiu0, every packet still carries 10.55.66.2; the host's IP never appears.

So, with the same flowchart, what would the situation look like now?

In this diagram, packets are received successfully, so there is an additional path for incoming packets; the rewritten IP is marked in red to show the packet has been altered.

  • The packet travels through the Linux Kernel as described before.
  • The Linux Kernel checks the routing table and confirms that the packet should be sent out through eth0.
  • iptables intervenes at this point and rewrites the source IP address of the packet via the MASQUERADE target.
  • The source is changed to 10.0.2.15; 8.8.8.8 receives the packet and replies. When the response reaches eth0, iptables intervenes again.
  • Whoever rewrites a packet on the way out must also rewrite it on the way back, changing the destination from 10.0.2.15 back to the original 10.55.66.2.

In fact, conntrack is also involved in tracking these translations, but it is too complex to cover here; the general concept is enough.

  • The packet finally returns to the container through the Linux Bridge, the forwarding table, veth, and the other components.
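For the curious, the conntrack entry that makes the reverse translation possible can be listed directly. A sketch, assuming the conntrack-tools package is installed (output fields vary by kernel version):

```shell
# List tracked ICMP flows whose original source is the container. The entry
# records both the original tuple (src 10.55.66.2) and the reply tuple
# (dst 10.0.2.15), which is how the kernel knows to rewrite the reply.
sudo conntrack -L -p icmp --orig-src 10.55.66.2
```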

Who filters the packet?

In this section, we will see how to allow our packets to pass through iptables.

According to our previous process and the initial operation, there are actually two places where iptables operates.

  • The Linux Bridge quietly hands the packet to iptables once.
  • The Linux Kernel consults iptables again when sending the packet out through the host's eth0.

These two points are handled quite differently here. For the Linux Bridge (hwchiu0), the container's packets are addressed to it; since the packets are "reaching" it, the relevant iptables concept is not FORWARD (forwarding) but INPUT (local delivery). We therefore need no special handling there.

The INPUT chain's default policy has not been changed to DROP, so those packets all pass.

The second point, on the other hand, is the Linux kernel forwarding packets out through eth0; here the FORWARD chain of iptables applies, and this is the part we must handle.
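You can confirm which chain matters by checking each chain's default policy. A quick read-only check (root is needed to query iptables):

```shell
# Show default policies; after a stock Docker install, FORWARD is DROP
# while INPUT and OUTPUT usually remain ACCEPT.
sudo iptables -t filter -L -n | grep '^Chain'
```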

First, we change the FORWARD chain's default policy back to dropping packets, restoring the configuration a fresh Docker install leaves behind.

$ sudo iptables -P FORWARD DROP
$ sudo docker exec -it c1 ping 8.8.8.8

At this point, you will find that the packets will not pass, and there will be no messages when monitoring the packets of eth0 through tcpdump, but there will be packets if monitoring hwchiu0.

$ sudo tcpdump -vvvnn -i eth0 icmp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

$ sudo tcpdump -vvvnn -i hwchiu0 icmp
tcpdump: listening on hwchiu0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:16:28.313865 IP (tos 0x0, ttl 64, id 60164, offset 0, flags [DF], proto ICMP (1), length 84)
10.55.66.2 > 8.8.8.8: ICMP echo request, id 21202, seq 33, length 64

So, we need to tell the system to allow our packets to pass through using iptables. There are many ways to do this:

  • Process based on source and destination IPs
  • Process based on source and destination NICs

Neither approach is inherently better; it depends on your design. Docker matches on NICs, which keeps things simple: the number of rules does not grow as you create more and more containers.

The rules below tell iptables:

  • Packets from hwchiu0 to eth0, let them pass!
  • Packets from eth0 to hwchiu0, let them pass!

$ sudo iptables -t filter -I FORWARD -i hwchiu0 -o eth0 -j ACCEPT
$ sudo iptables -t filter -I FORWARD -i eth0 -o hwchiu0 -j ACCEPT
$ sudo docker exec -it c1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=16.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=61 time=15.8 ms

Looking at the architecture diagram again, iptables intervenes in two places in this example: at the entry to hwchiu0 and at the host's eth0. Because the two places involve different concepts, different iptables chains apply; we handle the FORWARD chain because its default policy is set to drop packets.
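As an aside, Docker's own FORWARD rules are slightly stricter than our blanket eth0-to-hwchiu0 accept: for the return direction, they match only packets that belong to an existing connection. A hedged equivalent for our bridge:

```shell
# Accept return traffic into hwchiu0 only when conntrack already knows the
# connection; brand-new inbound connections toward containers stay blocked.
sudo iptables -t filter -I FORWARD -o hwchiu0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```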

Conclusion

Based on the information in this and the previous article, here is a summary of how Docker containers can access the Internet under the Bridge network model.

  • Have a Linux Bridge and set an IP
  • Create a container and connect it to the host’s Linux Bridge through veth
  • Set the IP of the container’s network interface and set a default routing rule to allow the Linux Bridge to help forward the outgoing packets
  • Set the relevant iptables rules so that the system will not drop the packets when forwarding
  • Set the iptables SNAT rules so that our outgoing packets have a chance to come back and eventually return to the container
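Collected into one sketch, the steps above look like this (the names hwchiu0, c1, and the 10.55.66.0/24 subnet follow this article's example; run as root, and note that Docker normally performs all of this for you):

```shell
#!/bin/sh
# Recap: outbound Internet access for a container on a hand-made bridge.
# Assumes the bridge hwchiu0 and container c1 from the previous article exist.

# Make sure the kernel forwards packets at all (Docker usually enables this).
sysctl -w net.ipv4.ip_forward=1

# 1. Give the bridge an IP so it can act as the container's gateway.
ifconfig hwchiu0 10.55.66.1 netmask 255.255.255.0

# 2. Point the container's default route at the bridge.
docker exec c1 ip route add default via 10.55.66.1

# 3. Allow the kernel to forward packets between the bridge and eth0.
iptables -t filter -I FORWARD -i hwchiu0 -o eth0 -j ACCEPT
iptables -t filter -I FORWARD -i eth0 -o hwchiu0 -j ACCEPT

# 4. SNAT outgoing packets so replies can find their way back.
iptables -t nat -I POSTROUTING -s 10.55.66.2/32 -o eth0 -j MASQUERADE

# Verify.
docker exec c1 ping -c 3 8.8.8.8
```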

These rules may seem numerous, but they all follow basic TCP/IP networking principles: simply put, where should the packet go, who processes it, and does anyone intercept it along the way? By organizing these three questions, we can clearly analyze a packet's path and debug it.
