Gateway

Gateway is a VM running on NFS (ID 100). It provides internet access to our compute cluster, and replaces a previous external service owned by iBug.

Server name: gateway.acsalab.com

Because an InfiniBand interface cannot be added to a bridge, we use a headless bridge, vmbr8, to connect the VM to the cluster.

We now use VXLAN to bridge InfiniBand and vmbr8, instead of having NFS route the traffic. This requires a VXLAN interface on every server.

Forwarding

/etc/sysctl.d/10.conf
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
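
After the file is written, the settings can be loaded with `sysctl --system` (or a reboot). As a quick check, the sketch below reads the two values back from `/proc/sys`. The function name `check_forwarding` and its optional root-directory argument are illustrative additions, not part of the gateway's setup.

```shell
#!/bin/bash
# Report whether IPv4 and IPv6 forwarding are enabled.
# $1 (optional): root of the sysctl tree; defaults to the live /proc/sys.
check_forwarding() {
    local root="${1:-/proc/sys}" key
    for key in net/ipv4/ip_forward net/ipv6/conf/all/forwarding; do
        if [ "$(cat "$root/$key" 2>/dev/null)" != "1" ]; then
            echo "disabled: $key"
            return 1
        fi
    done
    echo "forwarding enabled"
}
```

Calling `check_forwarding` with no argument inspects the running kernel.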

Network configuration

We use systemd-networkd to configure the network.

We get rid of the incomprehensible systemd-networkd-wait-online.service by replacing it with a sleep command.

systemctl edit systemd-networkd-wait-online.service
[Service]
ExecStart=
ExecStart=/bin/sleep 1

Interfaces

The VM has two interfaces:

  • ens18 connects to USTCnet and provides external access.

  • ens19 connects to vmbr8 and is used for internal communication.

VXLAN

To add the VXLAN interface, add the following configuration to /etc/network/interfaces on each server.

/etc/network/interfaces
auto vxlan0
iface vxlan0
    pre-up ip link add $IFACE type vxlan id 1 group 239.1.1.1 dev ibp175s0 || true
    post-down ip link delete $IFACE || true
    mtu 1500

Then add vxlan0 to the bridge ports of vmbr8.
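
With ifupdown-style configuration this usually means listing vxlan0 on the bridge-ports line of the vmbr8 stanza. Whether the attachment took effect can be checked through sysfs, where an enslaved interface has a `master` symlink pointing at its bridge. A minimal sketch follows; `bridge_of` and its optional second argument are illustrative names, not existing tooling.

```shell
#!/bin/bash
# Print the bridge an interface is attached to, or fail if it has none.
# $1: interface name; $2 (optional): sysfs net directory.
bridge_of() {
    local iface="$1" root="${2:-/sys/class/net}"
    [ -L "$root/$iface/master" ] || return 1
    basename "$(readlink "$root/$iface/master")"
}
```

On a correctly configured server, `bridge_of vxlan0` should print `vmbr8`.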

Routing

Routing rules:

  • 2: table main suppress_prefixlength 1
  • 3: from <addr> and oif <iface> rules
  • 9: fwmark rules
  • 19: USTCnet routes
  • 20: China IP routes
  • 32766: The default table main rule. It's slightly complicated to remove it, so we might as well keep it.
  • 32767: table default

Note that the rule with priority 2 is not associated with any interface, so we define it in 00-lo.network with [Match] Name=lo.

Similarly, the rule with priority 32767 doesn't exist by default for IPv6, so we also define it in 00-lo.network.
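
To confirm that the expected rules are installed, the output of `ip rule show` (or `ip -6 rule show`) can be compared against the priorities listed above. The helper below is only a sketch: `missing_rule_priorities` is a made-up name, and the priority list is copied from this page rather than read from the actual configuration.

```shell
#!/bin/bash
# Read `ip rule show` output on stdin and print every expected
# priority that does not appear in it, one per line.
missing_rule_priorities() {
    local expected="2 3 9 19 20 32766 32767" seen prio
    # Each rule line starts with "<priority>:"; keep the unique numbers.
    seen="$(awk -F: '{print $1}' | tr -d ' \t' | sort -un | tr '\n' ' ')"
    for prio in $expected; do
        case " $seen " in
            *" $prio "*) ;;          # present
            *) echo "$prio" ;;       # absent
        esac
    done
}
```

Typical use would be `ip rule show | missing_rule_priorities`; empty output means every listed priority has at least one rule.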

There is one extra route: 10.1.13.0/24 via 192.0.0.1 dev ens19, so that the gateway VM can reach the compute nodes.

USTCnet and China routes

We fetch the latest China IP list from https://github.com/gaoyifan/china-operator-ip and produce systemd-networkd configuration files for them. Then we restart systemd-networkd to load the lists.

Crontab entry:

7 7 * * * /etc/routes/cron.sh

See the scripts under /etc/routes.
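
The scripts themselves are not reproduced here. As a rough illustration of the conversion step, the function below turns a list of CIDR prefixes into systemd-networkd `[Route]` sections. It is a sketch under assumptions: the function name, the gateway address, and the table number are hypothetical, and the real scripts may emit different sections (for example `[RoutingPolicyRule]`) or use other options.

```shell
#!/bin/bash
# Convert a CIDR list on stdin into systemd-networkd [Route] sections.
# $1: gateway address, $2: routing table (both hypothetical values).
cidrs_to_routes() {
    local gateway="$1" table="$2" cidr
    # The `|| [ -n ... ]` part handles a final line without a newline.
    while read -r cidr || [ -n "$cidr" ]; do
        # Skip blank lines and comments.
        case "$cidr" in ''|'#'*) continue ;; esac
        printf '[Route]\nDestination=%s\nGateway=%s\nTable=%s\n\n' \
            "$cidr" "$gateway" "$table"
    done
}
```

The output could be appended to a `.network` file or written to a drop-in, after which systemd-networkd is restarted as described above.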

External access

See config related to the warp interface, as well as the following files:

Not much can be documented publicly, sorry.

Firewall

We maintain iptables rules manually. The authoritative copy of the rules is located under /root/iptables. After manually editing the rules.v4 and rules.v6 files, run the provided apply.sh script to apply them.
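
The contents of apply.sh are not shown on this page. Purely as an illustration, a script of this kind could validate both files with the `--test` option of `iptables-restore` and `ip6tables-restore` first, so that a syntax error in either file is caught before anything is changed. The function name, the `RUN` dry-run variable, and the ordering are assumptions.

```shell
#!/bin/bash
# Validate, then load, the IPv4 and IPv6 rule files.
# $1 (optional): directory holding rules.v4 and rules.v6.
# Set RUN=echo to print the commands instead of executing them.
apply_rules() {
    local dir="${1:-/root/iptables}"
    ${RUN:-} iptables-restore --test "$dir/rules.v4" &&
    ${RUN:-} ip6tables-restore --test "$dir/rules.v6" &&
    ${RUN:-} iptables-restore "$dir/rules.v4" &&
    ${RUN:-} ip6tables-restore "$dir/rules.v6"
}
```

Running `RUN=echo apply_rules` shows the four commands without touching the firewall.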

DNS

We use AdGuard Home as the DNS server. It is installed under /etc/AdGuardHome.

We use https://github.com/fernvenue/adguard-home-upstream for DNS routing. A custom script at /etc/AdGuardHome/update-upstream.sh updates the upstream list daily.

Crontab entry:

53 6 * * * /etc/AdGuardHome/update-upstream.sh

/etc/AdGuardHome/update-upstream.sh
#!/bin/bash

set -e

WGET="wget --bind-address=172.16.0.2 -q"
OUTFILE="/etc/AdGuardHome/upstream.txt"

$WGET -O '/var/tmp/default.txt' https://cdn.jsdelivr.net/gh/fernvenue/adguard-home-upstream/v4.conf
$WGET -O '/var/tmp/chinalist.txt' https://cdn.jsdelivr.net/gh/felixonmars/dnsmasq-china-list/accelerated-domains.china.conf
$WGET -O '/var/tmp/applechina.txt' https://cdn.jsdelivr.net/gh/felixonmars/dnsmasq-china-list/apple.china.conf

sed -i 's|server=|[|g' '/var/tmp/chinalist.txt'
sed -i 's|114.114.114.114|]tls://223.5.5.5|g' '/var/tmp/chinalist.txt'
sed -i 's|server=|[|g' '/var/tmp/applechina.txt'
sed -i 's|114.114.114.114|]tls://223.5.5.5|g' '/var/tmp/applechina.txt'

# Temporary workaround: `upstream_dns_file` does not support domains containing Chinese characters, so drop those lines.
cat '/var/tmp/applechina.txt' '/var/tmp/chinalist.txt' | perl -CIOED -p -e 's/^.*\p{Script_Extensions=Han}.*$//g' > /var/tmp/upstream.txt
# WARP often fails on UDP, so use HTTPS over TCP instead of HTTP/3 (h3)
sed -i 's|h3:|https:|g' /var/tmp/default.txt
sed -i '/^$/d' /var/tmp/upstream.txt

# When the upstream solves this problem in the future, changes need to be made here.
sed 's|\<tls://223\.5\.5\.5\>|202.38.64.1|g' '/var/tmp/default.txt' '/var/tmp/upstream.txt' > "$OUTFILE"
rm -rf '/var/tmp/default.txt' '/var/tmp/applechina.txt' '/var/tmp/chinalist.txt' '/var/tmp/upstream.txt'
systemctl restart AdGuardHome.service
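
To make the rewriting concrete, the three substitutions that the script applies to each dnsmasq-style line can be run on a single sample. The function name `to_adguard_upstream` and the domain `example.cn` are illustrative; the sed expressions are the ones used in the script, and the `\<` and `\>` word boundaries assume GNU sed.

```shell
#!/bin/bash
# Apply the script's substitutions to dnsmasq-style lines on stdin.
to_adguard_upstream() {
    sed -e 's|server=|[|g' \
        -e 's|114.114.114.114|]tls://223.5.5.5|g' \
        -e 's|\<tls://223\.5\.5\.5\>|202.38.64.1|g'
}

echo 'server=/example.cn/114.114.114.114' | to_adguard_upstream
# prints: [/example.cn/]202.38.64.1
```

The result is AdGuard Home's per-domain upstream syntax: queries for the bracketed domain go to 202.38.64.1, while everything else follows the default upstreams.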

AdGuard Home supports multiple users: add each account manually to the file /etc/AdGuardHome.yaml with a username and a hashed password. The password hash can be generated with htpasswd.

htpasswd -B -C 10 -n <Username>
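
With `-n`, htpasswd prints a `username:hash` line instead of writing a file. AdGuard Home's configuration keeps accounts in a `users` list whose entries have `name` and `password` fields, so the line can be reshaped into a YAML entry ready to paste. The helper below is a small sketch; `htpasswd_to_yaml` is an illustrative name and the indentation may need adjusting to match the existing file.

```shell
#!/bin/bash
# Turn one "username:hash" line on stdin into a YAML list entry.
htpasswd_to_yaml() {
    local name hash
    # Split at the first colon; the hash keeps everything after it.
    IFS=: read -r name hash
    printf '  - name: %s\n    password: %s\n' "$name" "$hash"
}
```

For example, `htpasswd -B -C 10 -n <Username> | htpasswd_to_yaml` prompts for the password and prints the entry to add under `users:`.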