Multipath link aggregation (e.g. bonding) allows the simultaneous use of multiple physical links to provide increased throughput, load balancing, redundancy, and fault tolerance. A variety of standard policies are available that can be used out of the box with no configuration. These policies are directly inspired by those offered by the Linux kernel but are implemented in user space and are therefore available on all platforms that ZeroTier supports (including Windows!).
Terminology
Link: A cable, wifi connection, pigeon, etc.
Bonding policy: The rules used to choose the link that a given packet is sent out on.
Failover: When a physical link goes down and the bond responds in some way.
Standard policies
...
Policy name | Fault tolerance | Min. failover (sec.) | Default Failover (sec.) | Balancing | Aggregation efficiency | Redundancy | Sequence Reordering |
---|---|---|---|---|---|---|---|
none | None | | | none | | 1 | No |
active-backup | Brief interruption | | | none | | 1 | Only during failover |
broadcast | Fully tolerant | | | none | | N | Often |
balance-rr | Self-healing | | | packet-based | | 1 | Often |
balance-xor | Self-healing | | | flow-based | | 1 | Only during failover |
balance-aware | Self-healing | | | adaptive flow-based | | 1 | Only during failover and re-balance |
...
It is possible to direct ZeroTier to form a certain type of bond with specific peers of your choice. For instance, if one wanted active-backup by default, but wanted certain peers to be bonded with a custom load-balanced bond such as my-custom-balance-aware, one could do the following:
...
Code Block |
---|
{
  "settings": {
    "defaultBondingPolicy": "active-backup",
    "peerSpecificBonds": {
      "f6203a2db3": "my-custom-balance-aware",
      "45b0301da2": "my-custom-balance-aware",
      "a92cb526fa": "my-custom-balance-aware"
    }
  }
} |
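The intent of this configuration, a default policy with per-peer overrides, can be sketched as a simple lookup. This is only an illustration of the fallback behaviour the text describes; the function name `policy_for_peer` is hypothetical, and how ZeroTier resolves these settings internally is not covered here.

```python
import json

# Example settings mirroring the structure shown above. Only one peer
# override is listed to keep the sketch short.
CONFIG = json.loads("""
{
  "settings": {
    "defaultBondingPolicy": "active-backup",
    "peerSpecificBonds": {
      "f6203a2db3": "my-custom-balance-aware"
    }
  }
}
""")


def policy_for_peer(config, peer_id):
    """Return the peer-specific policy if one is listed, else the default.

    Hypothetical helper: it models the documented intent, not ZeroTier's code.
    """
    settings = config.get("settings", {})
    overrides = settings.get("peerSpecificBonds", {})
    return overrides.get(peer_id, settings.get("defaultBondingPolicy"))


if __name__ == "__main__":
    for peer in ("f6203a2db3", "0000000000"):
        print(peer, policy_for_peer(CONFIG, peer))
```

A peer that appears under peerSpecificBonds gets its own policy; every other peer falls back to defaultBondingPolicy.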
Active backup (active-backup)
...
Traffic is sent on only one path at any given time. A different path becomes active if the current path fails. This mode provides fault tolerance with nearly immediate failover. This mode does not increase total throughput.
mode: primary, spare
Link option which specifies which link is the primary device. The specified device is intended to always be the active link while it is available. There are exceptions to this behavior when using different linkSelectMethod modes. There can only be one primary link in this bonding policy.
linkSelectMethod: Specifies the selection policy for the active link during failure and/or recovery events. This is similar to the Linux kernel's primary_reselect option but with a minor extension:
optimize: (default if user provides no failover guidance) The primary link can change periodically if a superior path is detected.
always: (default when links are explicitly specified) Primary link regains status as active link whenever it comes back up.
better: Primary link regains status as active link when it comes back up and it is better than the currently active link.
failure: Primary link regains status as active link only if the currently active link fails.
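The three reselection rules that concern a recovered primary link (always, better, failure) can be summarized as a small decision function. This is a minimal sketch of the rules as worded above, not ZeroTier's implementation; the function name and its boolean inputs are assumptions made for illustration, and the optimize mode is deliberately left out because it changes which link is considered primary rather than deciding when the primary returns.

```python
def primary_should_become_active(method, primary_up, primary_is_better, active_failed):
    """Decide whether the primary link should take over as the active link.

    method            -- "always", "better", or "failure"
    primary_up        -- the primary link is currently usable
    primary_is_better -- the primary currently outperforms the active link
    active_failed     -- the currently active link has just failed
    All parameters are hypothetical inputs chosen for this sketch.
    """
    if not primary_up:
        # A primary that is down can never be selected.
        return False
    if method == "always":
        return True
    if method == "better":
        return primary_is_better
    if method == "failure":
        return active_failed
    raise ValueError(f"unsupported linkSelectMethod in this sketch: {method!r}")


if __name__ == "__main__":
    for method in ("always", "better", "failure"):
        decision = primary_should_become_active(
            method, primary_up=True, primary_is_better=False, active_failed=False
        )
        print(method, decision)
```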
The zerotier-cli bond <peerId> rotate command will forcibly switch to the next link available from the failover queue in an active-backup bond.
...
Code Block |
---|
{
  "settings": {
    "defaultBondingPolicy": "active-backup",
    "active-backup": {
      "linkSelectMethod": "always",
      "links": {
        "eth0": { "failoverTo": "eth1", "mode": "primary" },
        "eth1": { "mode": "spare" },
        "eth2": { "mode": "spare" },
        "eth3": { "mode": "spare" }
      }
    }
  }
} |
Broadcast (broadcast)
...
Traffic is sent on all available paths simultaneously. This mode provides fault tolerance and effectively immediate failover due to transmission redundancy. This mode uses throughput resources poorly and will not increase throughput, but it can prevent packet loss during a link failure. The only option available is dedup, which de-duplicates all packets on the receiving end if set to true.
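To illustrate what receiver-side de-duplication means when every packet arrives once per link, here is a minimal sketch. It assumes each packet carries a unique identifier and that remembering a bounded window of recent identifiers is sufficient; the class name, the `window` parameter, and the eviction strategy are all assumptions for illustration and do not describe how ZeroTier implements dedup.

```python
from collections import OrderedDict


class Deduplicator:
    """Drop packets whose ID was already seen within a bounded window."""

    def __init__(self, window=1024):
        self.window = window          # hypothetical limit on remembered IDs
        self.seen = OrderedDict()     # insertion-ordered set of recent IDs

    def accept(self, packet_id):
        """Return True for the first copy of a packet, False for duplicates."""
        if packet_id in self.seen:
            return False
        self.seen[packet_id] = None
        if len(self.seen) > self.window:
            # Forget the oldest ID so memory use stays bounded.
            self.seen.popitem(last=False)
        return True


if __name__ == "__main__":
    dedup = Deduplicator()
    # The same packet ID arriving over three links is delivered only once.
    arrivals = [("eth0", 7), ("eth1", 7), ("eth2", 7), ("eth0", 8)]
    for link, packet_id in arrivals:
        print(link, packet_id, dedup.accept(packet_id))
```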
...
Balance round robin (balance-rr)
...
...
Traffic is striped across multiple paths. Offers partial fault tolerance immediately, full fault tolerance eventually. This policy is unaware of protocols and is primarily intended for use with protocols that are not sensitive to reordering delays. The only option available for this policy is packetsPerLink, which specifies the number of packets to transmit via a path before moving to the next in the round-robin sequence. When set to 0, a path is chosen at random for each outgoing packet. The default value is 8; low values can begin to add overhead to packet processing.
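The effect of packetsPerLink can be sketched with a small selector: it stays on one link for the configured number of packets, then advances to the next, and picks a link at random for every packet when the value is 0. The class and its attribute names are hypothetical; only the behaviour of the option and its default of 8 come from the description above.

```python
import random


class RoundRobinSelector:
    """Pick an outgoing link for each packet, round-robin style."""

    def __init__(self, links, packets_per_link=8, rng=None):
        if not links:
            raise ValueError("at least one link is required")
        self.links = list(links)
        self.packets_per_link = packets_per_link
        self.rng = rng or random.Random()
        self.index = 0              # position in the round-robin sequence
        self.sent_on_current = 0    # packets already sent on the current link

    def next_link(self):
        if self.packets_per_link == 0:
            # 0 means: choose a path at random for each outgoing packet.
            return self.rng.choice(self.links)
        if self.sent_on_current >= self.packets_per_link:
            # Quota used up, move on to the next link in the sequence.
            self.index = (self.index + 1) % len(self.links)
            self.sent_on_current = 0
        self.sent_on_current += 1
        return self.links[self.index]


if __name__ == "__main__":
    selector = RoundRobinSelector(["eth0", "eth1"], packets_per_link=2)
    print([selector.next_link() for _ in range(5)])
```

With two links and packetsPerLink set to 2, the selector sends two packets on the first link, two on the second, and then wraps around.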
...
Balance XOR (balance-xor, similar to the Linux kernel's balance-xor with xmit_hash_policy=layer3+4)
...
Traffic is categorized into flows based on source port, destination port, and protocol type; these flows are then hashed onto available links. Each flow will persist on its assigned link for its entire life cycle. Traffic that does not have an assigned port (such as ICMP pings) will be randomly distributed across links. The hash function is simply: src_port ^ dst_port ^ proto.
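The hash above is easy to reproduce. The sketch below computes src_port ^ dst_port ^ proto and then maps the result onto a link by taking it modulo the number of links. The modulo step is an assumption made for illustration, since the text does not say how the hash value is turned into a link choice.

```python
def flow_hash(src_port, dst_port, proto):
    """XOR hash of source port, destination port, and protocol number."""
    return src_port ^ dst_port ^ proto


def link_for_flow(src_port, dst_port, proto, links):
    """Map a flow onto one of the available links.

    Using the hash modulo the link count is an assumption for this sketch.
    """
    return links[flow_hash(src_port, dst_port, proto) % len(links)]


if __name__ == "__main__":
    links = ["eth0", "eth1", "eth2"]
    # A TCP flow (protocol number 6) from an ephemeral port to port 443.
    print(flow_hash(50000, 443, 6))
    print(link_for_flow(50000, 443, 6, links))
    # XOR is symmetric, so both directions of a flow hash identically.
    print(flow_hash(443, 50000, 6) == flow_hash(50000, 443, 6))
```

Because XOR is commutative, swapping source and destination ports yields the same hash, so both directions of a flow produce the same value on either side.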
...
Balance aware (balance-aware, similar to the Linux kernel's balance-*lb modes)
...
Traffic is dynamically allocated and balanced across multiple links simultaneously according to the target allocation. Options allow for packet-based or flow-based processing and active-flow reassignment. Flows mediated over a recently failed link will be reassigned in a manner that respects the target allocation of the bond. An optional balancePolicy can be specified with the following effects:
flow-dynamic: (default) Hashes flows onto links according to target allocation and may perform periodic re-assignments in order to preserve balance.
flow-static: Hashes flows onto links according to target allocation but will not re-assign flows unless a failure occurs or the link is no longer operating within acceptable parameters.
packet: Load balances packets across links according to target allocation with no concern for sequence reordering.
...
Code Block |
---|
{
  "settings": {
    "defaultBondingPolicy": "balance-aware",
    "balance-aware": {
      "allowFlowHashing": true|false,
      "rebalanceStrategy": "passive"|"opportunistic"|"aggressive"
    }
  }
} |
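One way to picture "hashing flows onto links according to target allocation" is a weighted bucket lookup: each link owns a share of the hash space proportional to its allocation, and a link with an allocation of zero (for example a dead link) never receives flows. This is a sketch under that assumption only. The integer weights and the function name are hypothetical, and the document does not specify how balance-aware actually computes or applies its allocation.

```python
def link_for_flow_hash(flow_hash, weights):
    """Choose a link for a flow hash, proportionally to integer weights.

    weights -- list of (link_name, weight) pairs; a weight of 0 excludes
               the link. Both the representation and the bucket scheme are
               assumptions made for this illustration.
    """
    total = sum(weight for _, weight in weights)
    if total <= 0:
        raise ValueError("no link has a positive allocation")
    point = flow_hash % total       # position inside the weighted hash space
    cumulative = 0
    for link, weight in weights:
        cumulative += weight
        if point < cumulative:
            return link
    raise AssertionError("unreachable: point is always below the total")


if __name__ == "__main__":
    # Four live links with equal shares and one dead link with no share.
    weights = [("wan0", 1), ("wan1", 1), ("dead0", 0), ("wan2", 1), ("wan3", 1)]
    for h in range(8):
        print(h, link_for_flow_hash(h, weights))
```

When an allocation changes, the buckets shift and some flows map to a different link, which loosely corresponds to the re-assignment behaviour that flow-dynamic permits and flow-static avoids.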
Available commands
...
The zerotier-cli bond list (or listbonds) command shows the current type and health status of the bonds with all peers:
Code Block |
---|
$ zerotier-cli listbonds
<peer> <bondtype> <status> <links>
16a03a3d03 active-backup HEALTHY 4/4
a92cb526fa balance-xor DEGRADED 2/3
45b0301da2 balance-xor HEALTHY 6/6
f6203a2db3 balance-rr HEALTHY 3/3 |
The link command provides a facility to add and remove links in real time and nominate them for use in particular bonds:
Code Block |
---|
$ zerotier-cli link add ppp0
$ zerotier-cli link nominate ppp0 a92cb526fa
$ zerotier-cli link remove ppp0 |
...
The zerotier-cli bond <peerId> show command prints the parameters of a bond and real-time quality metrics of its constituent links:
Code Block |
---|
$ zerotier-cli bond a92cb526fa show
PEER  BOND TYPE  STATUS  LINKS
4ac8a21e06  balance-xor  Degraded  4/7

BOND PARAMETERS
Failover Interval  Up Delay  Down Delay  Packets Per Link  Link Selection Method
500  40000  0  N/A  N/A

LINK QUALITY
STATUS  LAT  LTM  PDV  PLR  PER  SPD  ALLOC
1.2.3.4/40000  Dead  46  11  11  1.000  0.000  100  0.00
34.65.23.78/3000  Alive  3  11  11  0.000  0.000  10  0.25
5.23.21.100/9000  Dead  17  11  11  0.970  0.220  1000  0.00
12.165.23.199/10000  Alive  7  11  11  0.000  0.000  10  0.25
88.11.22.78/8000  Alive  1  11  11  0.001  0.000  100  0.25
6.123.91.100/9000  Dead  22  11  11  1.000  0.000  100  0.00
188.11.22.78/8000  Alive  11  11  0.000  0.000  1000  0.25 |
The zerotier-cli bond <peerId> link <action> command provides a facility to add and remove links in real time and nominate them for use in particular bonds:
Code Block |
---|
$ zerotier-cli bond a92cb526fa link add ppp0
$ zerotier-cli bond a92cb526fa link nominate ppp0
$ zerotier-cli bond a92cb526fa link remove ppp0 |
The zerotier-cli bond <peerId> enable
will forcibly enable bonding on
Logging
...
If you compile with ZT_TRACE=1, ZeroTier will output important bond events to stderr, allowing you to see the decisions being made by the bonding layer in real time. For instance, below is a fairly straightforward case of the creation of an active-backup bond and the subsequent failure of its primary link and instantaneous recovery on a backup link:
...
Only packets with internal IDs divisible by 16 are included in measurements; this amounts to about 6.25% of all traffic.
failoverInterval specifies how quickly failover should occur during a link failure. To accomplish this, a combination of active and passive measurement techniques is employed, which may result in VERB_HELLO probes being sent every failoverInterval / 4 time units. As a mitigation, monitorStrategy may be set to dynamic so that probe frequency directly correlates with native application traffic.
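The two figures above follow from simple arithmetic, which the sketch below reproduces: one packet in every 16 is sampled (1/16 = 6.25%), and probes may be sent at a quarter of the failover interval. The function names are hypothetical, the interval's time unit is left as given in the text, and the value 500 is used only as an example input.

```python
def is_sampled(packet_id, divisor=16):
    """A packet is measured only if its internal ID is divisible by 16."""
    return packet_id % divisor == 0


def probe_interval(failover_interval):
    """Probes may be sent every failoverInterval / 4 time units."""
    return failover_interval / 4


if __name__ == "__main__":
    total = 1600
    sampled = sum(1 for packet_id in range(total) if is_sampled(packet_id))
    # Fraction of sequential packet IDs that are included in measurements.
    print(f"{sampled / total:.2%} of packets sampled")
    # Example failover interval of 500 time units.
    print(probe_interval(500))
```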
Areas of future development:
Protocol inspection (for example: diagnosing TCP issues and attempting to remedy via path-reassignment.)