WORK IN PROGRESS
When you hoist the sails on your docker ship, it sets course for some iptables adventures. If ye be a seasoned sailor who already knows the ins and outs of iptables, ye can weigh anchor and sail on ahead! But if deciphering the following snippet leaves you feeling like a landlubber, don’t fret! This post will chart a course to give you the full rundown and context of all the salty details.
A fresh installation of Docker adds these iptables rules.
# stitched together using: https://orgmode.org/manual/Noweb-Reference-Syntax.html
$ sudo iptables-save -t nat
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
COMMIT
$ sudo iptables-save -t filter
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
iptables primer
iptables in a nutshell is a user-space CLI utility to manage incoming, forwarded and outgoing traffic through the Linux kernel. Although I won’t dive into the nitty-gritty details of iptables, this post will provide the bare essentials to understand how Docker utilizes it.
iptables learning resources
- man iptables, man iptables-extensions
- iptables tutorial by Oskar Andreasson : Somewhat outdated but good resource. Also needs some styling.
- IPTables - CentOS Wiki, iptables - ArchWiki
- A Deep Dive into Iptables and Netfilter Architecture | DigitalOcean
- My personal wiki entry (not really intended for public reading)
- Real-time Iptables Monitor : Old perl script, excellent for understanding how packets flow through iptables.
Isn’t iptables obsolete?
Yes, it is true that the original iptables is made obsolete by the newer nftables. But there’s an in-between, iptables-nft, which uses the newer nftables kernel API but reuses the legacy packet-matching code of iptables. At the moment, you get the best of both worlds using iptables-nft.
If you have a recent system and it has iptables installed, it’s most likely iptables-nft. Check by running the following; if the output contains nf_tables, it’s using iptables-nft.
λ iptables -V
iptables v1.8.7 (nf_tables)
General consensus on learning: learn iptables syntax first with iptables-nft (plenty of learning materials), then learn nftables, then maybe try out config managers like firewalld. Later, check out BPF / XDP.
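If you'd rather script the iptables -V check above, here is a minimal sketch. The function name is made up for this post, and it assumes the backend shows up in parentheses as in the output above; the `(legacy)` suffix for the older variant and the "no suffix on older releases" case are assumptions on my part.

```python
import re


def iptables_backend(version_output: str) -> str:
    """Return 'nf_tables', 'legacy' or 'unknown' from `iptables -V` output."""
    # Example input: "iptables v1.8.7 (nf_tables)"
    match = re.search(r"\((nf_tables|legacy)\)", version_output)
    # Assumption: output without a parenthesised backend is reported as unknown.
    return match.group(1) if match else "unknown"


print(iptables_backend("iptables v1.8.7 (nf_tables)"))
```

To feed it real output you could pass in `subprocess.run(["iptables", "-V"], capture_output=True, text=True).stdout`.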
Commands
- iptables -nvL --line-number -t [table_name]: CLI view for a table. --line-number is important because it tells you the rule ordering. -n shows numeric IPs, and -v helps us see the interface.
- iptables-save -t [table_name]: File view for a table. Despite its name, this command does not save anything; it simply dumps the table data in the format in which it would be saved to disk. (We’re using this in our examples as it’s better for explaining.)
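Since the rest of the post reads iptables-save dumps, it may help to see how little structure the format has. The parser below is a rough sketch of my own (not anything shipped with iptables) and only handles the three line types that appear in this post: `*table`, `:CHAIN POLICY [packets:bytes]` and `-A ...` rules.

```python
def parse_iptables_save(dump: str) -> dict:
    """Parse iptables-save text into {table: {"policies": {...}, "rules": [...]}}."""
    tables = {}
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "COMMIT":
            continue
        if line.startswith("*"):        # "*nat" opens a table
            current = tables.setdefault(line[1:], {"policies": {}, "rules": []})
        elif current is None:
            continue                    # ignore anything before the first table
        elif line.startswith(":"):      # ":DOCKER - [0:0]" declares a chain
            name, policy, _counters = line[1:].split(None, 2)
            # "-" means the chain is user-defined and has no default policy
            current["policies"][name] = None if policy == "-" else policy
        elif line.startswith("-"):      # "-A CHAIN ..." is a rule, kept in order
            current["rules"].append(line)
    return tables


SAMPLE = """\
*nat
:PREROUTING ACCEPT [0:0]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A DOCKER -i docker0 -j RETURN
COMMIT
"""

nat = parse_iptables_save(SAMPLE)["nat"]
print(nat["policies"])
print(nat["rules"])
```

Rules are kept as a list on purpose, because their order within a chain is what decides which one wins.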
Semantics
--->PRE------>[ROUTE]--->FWD---------->POST------>
    Conntrack    |       Mangle   ^    Mangle
    Mangle       |       Filter   |    NAT (Src)
    NAT (Dst)    |                |    Conntrack
    (QDisc)      |             [ROUTE]
                 v                |
                 IN Filter       OUT Conntrack
                 |  Conntrack     ^  Mangle
                 |  Mangle        |  NAT (Dst)
                 v                |  Filter
The diagram on top shows how tables and chains are related. Eg. we can see that nat:PREROUTING is checked before nat:INPUT.
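The same diagram can be written down as data. This is only a sketch restricted to the nat and filter tables; the path labels are my own wording, and the order of tables within a single hook is deliberately not modelled (they are just sorted alphabetically).

```python
# Hooks visited, in order, for the three ways a packet can cross the host.
PATHS = {
    "incoming, for this host": ["PREROUTING", "INPUT"],
    "forwarded through this host": ["PREROUTING", "FORWARD", "POSTROUTING"],
    "generated by this host": ["OUTPUT", "POSTROUTING"],
}

# Built-in chains of the two tables this post cares about.
BUILTIN_CHAINS = {
    "nat": {"PREROUTING", "INPUT", "OUTPUT", "POSTROUTING"},
    "filter": {"INPUT", "FORWARD", "OUTPUT"},
}


def chains_on_path(path: str) -> list:
    """List (hook, tables that have a chain on that hook) for one packet path."""
    return [
        (hook, sorted(t for t, chains in BUILTIN_CHAINS.items() if hook in chains))
        for hook in PATHS[path]
    ]


for hook, tables in chains_on_path("forwarded through this host"):
    print(hook, tables)
```

The forwarded path is the one that matters most for containers, since traffic between a container and the outside world is forwarded by the host.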
Tables
- Tables are an organizational structure for iptables. Chains are stored in tables, which in turn store rules, which in turn store matches and targets.
- 5 tables: filter (default), nat, mangle, raw, security
- Docker makes changes to just the nat & filter tables, so we’ll be focusing on those.
Chains
- Chains are simply lists of rules which are followed in order. Eg. ChainXYZ = [rule1, rule2, rule3, rule4]
- Chains are per table, i.e. chain OUTPUT on the nat table is different from chain OUTPUT on the filter table.
- Two types of chains: built-in and user-defined.
  - Built-in chains represent the netfilter hooks which trigger them.
    - filter table: INPUT, FORWARD, OUTPUT
    - nat table: PREROUTING, INPUT, OUTPUT, POSTROUTING
  - Custom user-defined chains represent targets which can be jumped to from the built-in chains.
    - Eg. Docker adds custom chains
    - Custom chains cannot have a default policy; hence, you’ll see a - in the custom Docker-added chains in the iptables-save output.
- Understanding how chains are traversed is important to make sense of iptables rules but we won’t be delving into that territory in this post.
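Without charting the whole traversal, the three behaviours mentioned above (rules followed in order, jumps into user-defined chains, default policy only on built-in chains) fit in a few lines. This is a toy model of my own, not how the kernel does it; the DROP policy and the two-rule FORWARD chain are made up for the demo.

```python
def run_chain(chains, policies, name, packet):
    """Walk one chain; return a verdict, or None when a user-defined chain returns."""
    for matches, target in chains[name]:
        if not matches(packet):
            continue                      # rule does not match: try the next one
        if target in ("ACCEPT", "DROP"):
            return target                 # terminating target: stop everything
        if target == "RETURN":
            break                         # leave this chain early
        verdict = run_chain(chains, policies, target, packet)  # jump to a user chain
        if verdict is not None:
            return verdict
        # The user chain returned without a verdict: carry on with the next rule.
    # End of chain or RETURN: built-in chains apply their policy, user chains
    # have none (hence the "-" in iptables-save) and hand control back.
    return policies.get(name)


always = lambda pkt: True
chains = {
    "FORWARD": [
        (always, "DOCKER-USER"),
        (lambda pkt: pkt["in"] == "docker0", "ACCEPT"),
    ],
    "DOCKER-USER": [(always, "RETURN")],
}
policies = {"FORWARD": "DROP"}  # hypothetical policy, chosen so both outcomes show up

print(run_chain(chains, policies, "FORWARD", {"in": "docker0"}))
print(run_chain(chains, policies, "FORWARD", {"in": "eth0"}))
```

The first packet visits DOCKER-USER, comes back and is accepted by the second rule; the second one matches nothing and falls through to the policy.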
Rule
- Rule = Match(s) + Target/Action
Match
- A match specifies a condition within the packet that must be true (or false); if a match is true, the rule can jump to a target.
- Types (Not official, just classifying)
  - Generic: A generic match is a kind of match that is always available, whatever kind of protocol we are working on, or whatever match extensions we have loaded. (eg. -s)
  - Implicit: Implicit matches are implied, taken for granted, automatic. These are protocol specific. (eg. -p tcp --dport)
  - Explicit: Explicit matches are those that have to be specifically loaded with the -m or --match option.
Target
- These are basically actions to perform when a match occurs in a rule. Specified by the -j flag.
- 3 types: user-defined (another chain), built-in targets (ACCEPT, DROP, QUEUE, RETURN), target extensions (man iptables-extensions).
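To see "Rule = Match(s) + Target" on a real line, here is a small sketch that pulls a dumped rule apart. The function and the dictionary keys are names I made up; it only understands `-A` lines and ignores any options that follow the target.

```python
import shlex


def split_rule(rule: str) -> dict:
    """Split an iptables-save rule line into chain, match tokens and target."""
    tokens = shlex.split(rule)
    if len(tokens) < 2 or tokens[0] != "-A":
        raise ValueError(f"not an appended rule: {rule!r}")
    chain, rest = tokens[1], tokens[2:]
    target = None
    if "-j" in rest[:-1]:
        j = rest.index("-j")
        # Everything before -j is the match part; target options are dropped.
        target, rest = rest[j + 1], rest[:j]
    # Explicit matches are the extensions loaded with -m / --match.
    extensions = [
        rest[i + 1] for i, tok in enumerate(rest[:-1]) if tok in ("-m", "--match")
    ]
    return {"chain": chain, "matches": rest, "extensions": extensions, "target": target}


print(split_rule("-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER"))
```

In that rule `-d` is a generic match, `addrtype` is an explicit match loaded with `-m`, and DOCKER is a user-defined chain used as the target.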
Relevant notes
- [some_number:some_other_number]: You’ll see this in iptables-save output. These are the [Packets:Bytes] counters. The default policies also have counters. It’s useful to watch how these counters change when you run something. You can use the -Z option to clear the counters. The -c option of iptables-save shows the counters for each rule.
- -A: Append rule to a chain
- -I: Insert rule on top of a chain
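Because a chain is just an ordered list, -A and -I can be pictured as list operations. A tiny sketch, with helper names of my own choosing:

```python
def append_rule(chain: list, rule: str) -> None:
    """Model of `iptables -A`: the rule goes to the end of the chain."""
    chain.append(rule)


def insert_rule(chain: list, rule: str, rulenum: int = 1) -> None:
    """Model of `iptables -I`: rule numbers are 1-based and default to 1 (the top)."""
    chain.insert(rulenum - 1, rule)


demo = ["rule1", "rule2"]
append_rule(demo, "appended")
insert_rule(demo, "inserted")
print(demo)
```

Since rules are evaluated top to bottom, where a rule lands decides whether it is ever reached.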
nat table
Here’s what the nat table looks like in its default state (Docker not installed yet).
$ sudo iptables-save -t nat
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
filter table
Here’s what the filter table looks like in its default state (Docker not installed yet).
$ sudo iptables-save -t filter
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
Docker networking primer
Different container managers (docker, podman, lxd etc.) provide a number of ways networking can be done with containers. An interesting one is the bridged networking approach (which this post is based on); it essentially boils down to 3 things.
- Creating a veth pair from the host to net namespace-X. Every new container adds a new veth interface and removes it once the container is stopped. veth is a virtual device that acts as a tunnel between network namespaces. These devices create interconnected peering between the two connected links and pass direct traffic between them.
- Adding a bridge for the veth pair to talk through. When you install docker, it automatically creates the docker0 bridge (a virtual bridge interface) for containers to communicate with each other and with the outside world.
- Adding iptables rules to access the outside network (this is what we’re focusing on here)
Usually docker/podman/lxd does all this for you, so you won’t have to worry about it.
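To make the three steps concrete, here is roughly what they would look like by hand. The function below only builds the command strings and runs nothing; every name and address in it (demo-ns, demo-br0, veth-host, veth-ctr, 10.200.0.0/24) is made up for illustration, and it is a sketch of the idea rather than what Docker actually executes.

```python
import ipaddress


def bridge_setup_commands(netns="demo-ns", bridge="demo-br0",
                          host_if="veth-host", peer_if="veth-ctr",
                          bridge_addr="10.200.0.1/24", peer_addr="10.200.0.2/24"):
    """Return the commands for a hand-rolled bridge + veth + NAT setup."""
    gateway = ipaddress.ip_interface(bridge_addr)
    subnet = gateway.network  # 10.200.0.0/24 with the defaults
    return [
        # 1. veth pair from the host into the network namespace
        f"ip netns add {netns}",
        f"ip link add {host_if} type veth peer name {peer_if}",
        f"ip link set {peer_if} netns {netns}",
        # 2. a bridge for the host end of the pair to plug into
        f"ip link add {bridge} type bridge",
        f"ip addr add {bridge_addr} dev {bridge}",
        f"ip link set {host_if} master {bridge}",
        f"ip link set {bridge} up",
        f"ip link set {host_if} up",
        f"ip netns exec {netns} ip addr add {peer_addr} dev {peer_if}",
        f"ip netns exec {netns} ip link set {peer_if} up",
        f"ip netns exec {netns} ip route add default via {gateway.ip}",
        # 3. iptables rule so the namespace can reach the outside network
        f"iptables -t nat -A POSTROUTING -s {subnet} ! -o {bridge} -j MASQUERADE",
    ]


for command in bridge_setup_commands():
    print(command)
```

The last command has the same shape as the MASQUERADE rule Docker installs for docker0. Running these for real needs root, and most likely IP forwarding enabled on the host as well.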
Docker networking/container networking learning resources
- Official docker docs: Docker and iptables, Use bridge networks, Networking overview
- Container Networking | In the mood for programming
- Why Docker exposes my private services to the world?
- Introduction to Linux interfaces for virtual networking
- Deep Dive into Linux Networking and Docker - Bridge, vETH and IPTables
- 8.2.5 About Veth and Macvlan
Docker’s game with iptables
Docker makes changes to 2 tables: nat (to resolve packets to and from containers, and more?) and filter (for isolation purposes, and more?).
Changes to nat table
$ sudo iptables-save -t nat
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
COMMIT
:DOCKER - [0:0]
- Adds the DOCKER chain. As it’s a custom chain, it cannot have a default policy, hence -.
Intended for
- Used to handle network address translation (NAT) for Docker containers.
- I am not sure what it is intended for. Please help.
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
- Packets coming from any {interface, protocol, source} to a LOCAL address jump to the DOCKER chain.
- See man iptables-extensions for details on what LOCAL is
Intended for
- Traffic that’s coming from outside (external/host) to the container.
- I am not sure what it is intended for. Please help.
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
- This rule is a combination of generic matching (-d) and explicit matching (-m).
- Locally generated packets being sent to a LOCAL address, but not to the loopback range, jump to the DOCKER chain.
Intended for
- I am not sure why docker is interested in host traffic here.
- I am not sure what it is intended for. Please help.
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
- Packets coming from any {interface, protocol, 172.17.0.0/16 network} to be sent out via anything but the docker0 interface should MASQUERADE
Intended for
- This is needed for internet access
- Specifies that the source IP when going out via anything but docker0 needs to be masqueraded/SNAT’d.
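The two conditions of the MASQUERADE rule are easy to mirror in code. A minimal sketch, where the function name and the eth0 interface are assumptions for the example:

```python
import ipaddress

DOCKER_SUBNET = ipaddress.ip_network("172.17.0.0/16")


def should_masquerade(src_ip: str, out_iface: str) -> bool:
    """Mirror `-s 172.17.0.0/16 ! -o docker0 -j MASQUERADE`."""
    from_container = ipaddress.ip_address(src_ip) in DOCKER_SUBNET  # -s 172.17.0.0/16
    leaves_bridge = out_iface != "docker0"                          # ! -o docker0
    return from_container and leaves_bridge


print(should_masquerade("172.17.0.2", "eth0"))     # container -> outside world
print(should_masquerade("172.17.0.2", "docker0"))  # container -> container
print(should_masquerade("192.168.1.10", "eth0"))   # not from the Docker subnet
```

Only the first case gets its source address rewritten; traffic that stays on docker0 keeps the container's own address.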
-A DOCKER -i docker0 -j RETURN
- In the default state, the DOCKER chain isn’t doing much but simply returning to the calling chain. But it’ll return only if the packet is incoming from the docker0 interface.
Intended for
- This becomes more useful as we add containers and open ports in them
- I am not sure what it is intended for. Please help.
Changes to filter table
$ sudo iptables-save -t filter
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
- Adds custom chains. Since these are custom chains, they cannot have a default policy, hence -.
Intended for
- DOCKER: It contains rules that control incoming traffic to Docker containers
- DOCKER-USER: user-defined iptables rules
- DOCKER-ISOLATION-STAGE-1: Restrict traffic (not a lot of info available about this chain)
- DOCKER-ISOLATION-STAGE-2: Restrict traffic (not a lot of info available about this chain)
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
- Anything that’s forwarded first jumps to the DOCKER-USER chain and then to the DOCKER-ISOLATION-STAGE-1 chain.
Intended for
- Making sure docker filters get applied
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
- Accept packets coming from any {interface, protocol, source} to be sent via the docker0 interface if they are ESTABLISHED or RELATED
- To be sent via docker0 in this case means packets going to the docker containers.
- See man iptables-extensions for conntrack for details on ESTABLISHED or RELATED
Intended for
- This rule only accepts packets which are already ESTABLISHED or RELATED; NEW packets do not match it and fall through to the rules below.
-A FORWARD -o docker0 -j DOCKER
- Packets coming from any {interface, protocol, source} to be sent via the docker0 interface jump to the DOCKER chain.
- To be sent via docker0 in this case means packets going to the docker containers.
Intended for
- In a fresh installation of Docker, there’s no rule defined in the filter:DOCKER chain. But it will become useful as containers start exposing ports etc.
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
- Accepts packets coming from any {protocol, source, docker0 interface} to be sent via anything but the docker0 interface (container to outside world)
- Accepts packets coming from any {protocol, source, docker0 interface} to be sent via the docker0 interface (container to container)
Intended for
- Letting containers on docker0 reach the outside world and each other.
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
- Packets coming from any {protocol, source, docker0 interface} to be sent via anything but the docker0 interface jump to DOCKER-ISOLATION-STAGE-2 (from container via outside interface)
- Everything else returns to the FORWARD chain.
Intended for
- In its default state, DOCKER-ISOLATION-STAGE-1 doesn’t seem to do much: it only hands traffic leaving docker0 over to DOCKER-ISOLATION-STAGE-2. Docker adds rules here as more networks are created, to keep Docker networks isolated from each other.
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
- Drop packets coming from any {protocol, source, interface} to be sent via the docker0 interface (from anywhere via container, drop ingress traffic)
- Everything else returns to the calling chain.
Intended for
- In its default state, DOCKER-ISOLATION-STAGE-2 doesn’t seem to do much: packets only get here when they are leaving via something other than docker0, so the DROP rule cannot match them. It becomes relevant once there are more Docker networks.
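The two isolation stages are easier to follow with more than one bridge. The sketch below assumes, beyond what the dump above shows, that Docker adds one STAGE-1 rule and one STAGE-2 rule per bridge network, shaped like the docker0 ones; br-demo and eth0 are hypothetical interface names.

```python
def isolation_verdict(in_iface: str, out_iface: str, bridges: list) -> str:
    """Return 'DROP' or 'RETURN' for the two isolation chains taken together."""
    # STAGE-1, one rule per bridge: -i <bridge> ! -o <bridge> -j STAGE-2
    if in_iface in bridges and out_iface != in_iface:
        # STAGE-2, one rule per bridge: -o <bridge> -j DROP
        if out_iface in bridges:
            return "DROP"
    # Both chains end with -j RETURN, handing the packet back to FORWARD.
    return "RETURN"


print(isolation_verdict("docker0", "eth0", ["docker0"]))
print(isolation_verdict("docker0", "br-demo", ["docker0", "br-demo"]))
```

With only docker0 nothing is ever dropped, which matches the "doesn't seem to do much" observation; with a second bridge, traffic from one Docker network into the other is dropped.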
-A DOCKER-USER -j RETURN
- This adds a placeholder rule in the DOCKER-USER chain
Intended for
- When using DOCKER-USER, you’d want to use -I instead of -A so that it inserts the rule on top. A rule appended with -A lands after the -j RETURN placeholder and is never reached.
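Pulling the filter table walkthrough together, the function below mirrors the FORWARD chain from the dump rule by rule for a single docker0 bridge. It is a hand-written sketch, not generated from the ruleset; eth0 is a hypothetical outside interface, and the policy is a parameter so you can see what changes if FORWARD were set to DROP instead of the ACCEPT shown in the dump.

```python
def forward_verdict(in_iface: str, out_iface: str, ctstate: str,
                    policy: str = "ACCEPT") -> str:
    """Verdict of the filter FORWARD chain shown above (single docker0 bridge)."""
    # -A FORWARD -j DOCKER-USER: only "-j RETURN" inside, so no verdict.
    # -A FORWARD -j DOCKER-ISOLATION-STAGE-1: with one bridge, STAGE-2's
    #   "-o docker0 -j DROP" can never match a packet that is leaving via
    #   something other than docker0, so no verdict either.
    # -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    if out_iface == "docker0" and ctstate in ("RELATED", "ESTABLISHED"):
        return "ACCEPT"
    # -A FORWARD -o docker0 -j DOCKER: the DOCKER chain is empty, no verdict.
    # -A FORWARD -i docker0 ! -o docker0 -j ACCEPT   (container -> outside)
    if in_iface == "docker0" and out_iface != "docker0":
        return "ACCEPT"
    # -A FORWARD -i docker0 -o docker0 -j ACCEPT     (container -> container)
    if in_iface == "docker0" and out_iface == "docker0":
        return "ACCEPT"
    # Nothing matched: the built-in chain falls back to its default policy.
    return policy


print(forward_verdict("docker0", "eth0", "NEW"))
print(forward_verdict("eth0", "docker0", "ESTABLISHED", policy="DROP"))
print(forward_verdict("eth0", "docker0", "NEW", policy="DROP"))
```

The only packets left to the policy are NEW connections from outside towards a container, which is exactly the traffic that published ports would later need rules for in the DOCKER chain.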
Other curious cases
What happens when ports open
- This is a TODO
- Understanding iptables rules added by docker
- To keep a published port bound only to localhost, use 127.0.0.1:8080:8080 instead of 8080:8080
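The difference between the two port specs is just the optional bind address in front. A rough sketch of how such a spec splits up; it is my own simplified parser (IPv4 only, no port ranges, no protocol suffix), and it assumes that a spec without an address binds to all interfaces, written here as 0.0.0.0.

```python
def parse_publish(spec: str) -> tuple:
    """Split '8080:8080' or '127.0.0.1:8080:8080' into (host_ip, host_port, container_port)."""
    parts = spec.split(":")
    if len(parts) == 2:
        # Assumption: no address given means "bind on every host interface".
        host_ip, host_port, container_port = "0.0.0.0", parts[0], parts[1]
    elif len(parts) == 3:
        host_ip, host_port, container_port = parts
    else:
        raise ValueError(f"unsupported publish spec: {spec!r}")
    return host_ip, int(host_port), int(container_port)


def localhost_only(spec: str) -> bool:
    """True when the published port is bound to 127.0.0.1 only."""
    return parse_publish(spec)[0] == "127.0.0.1"


print(parse_publish("8080:8080"), localhost_only("8080:8080"))
print(parse_publish("127.0.0.1:8080:8080"), localhost_only("127.0.0.1:8080:8080"))
```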
What happens to iptables when you start a container
Nothing
What happens if you delete all the rules and restart docker
- TODO
- https://github.com/moby/moby/issues/43896
- When we reconfigure or reload iptables, all these rules are lost.
- Current solution: restart docker, but that is bad, because restarting the docker service restarts all containers.
How to prevent docker from making changes to iptables
- You can set "iptables": false in /etc/docker/daemon.json (not recommended)
- Use the DOCKER-USER chain
- You can use lxd like me
Resources
- Issues related to docker and iptables
- Tools related to docker and iptables