Connecting Kubernetes To Your Network Using BGP


Having a Kubernetes cluster to run your network services is incredibly valuable for many organizations. However, the default configuration of a basic cluster using the Calico CNI isn’t ideal for simple, automated deployment of services, since ingress traffic requires some form of proxy running in the cluster. Many examples rely on the kube-proxy component to provide ingress, but in my opinion that approach still requires a fair amount of manual configuration to expose cluster services to an external network. By default, the Calico CNI uses BGP peering between nodes, which is great for achieving high performance and simplicity in your cluster networking. This approach does not encapsulate layer 3 traffic on the wire, so you can easily inspect your traffic for debugging purposes.

Enter BGP peering for Kubernetes using the Calico CNI. With this method, all of your Kubernetes nodes communicate directly with an external network router via BGP in order to distribute routes to your cluster services and pods. Once this has been set up, any properly configured service in your cluster becomes immediately reachable from your network, as its routes are injected the moment it activates.

In order to apply the YAML-formatted configuration files needed to set up BGP peering, you will need to have the Calico CNI control script installed in an environment that has the proper permissions to manage the cluster. If you used the tutorial I provided on this blog for deploying Kubernetes, then this control script has already been installed on each of your control plane nodes under the names “kubectl-calico” and “calicoctl”, so you’re good to get started.

If you set up your Kubernetes cluster some other way and do not currently have the Calico CNI control script installed, execute the following commands from an environment you have set up with kubectl for the cluster:

export calico_version=$(curl --silent "https://api.github.com/repos/projectcalico/calico/releases/latest" | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')

curl -A "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/81.0" -o kubectl-calico -L https://github.com/projectcalico/calico/releases/download/$calico_version/calicoctl-linux-amd64

sudo chown root:root kubectl-calico
sudo chmod +x kubectl-calico
sudo mv kubectl-calico /usr/local/bin/
sudo ln -s /usr/local/bin/kubectl-calico /usr/local/bin/calicoctl
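
You can quickly confirm that the tool installed correctly and can reach the cluster:

calicoctl version

This prints the client version, and also the version of Calico running in the cluster if calicoctl can reach the datastore.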

At this point, you should be ready to start building the basic configuration file required to update the Calico CNI in order to establish BGP peering both with your network router and between nodes (route reflection). The example in this tutorial assumes a very basic setup: a single availability zone with a single top-of-rack (ToR) router. If you are building for greater redundancy, it’s not complicated to pivot from this configuration to one that supports multiple availability zones and/or multiple ToR routers, as the sketch below illustrates.
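
For reference, a multi-ToR layout typically just replaces the single global peer with one “BGPPeer” resource per rack, scoped with a node selector. The snippet below is only a sketch with hypothetical values, assuming your nodes carry a “rack” label:

apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: as-rack1-rtr
spec:
  # Hypothetical: assumes nodes are labeled rack=rack-1 and this is rack 1's ToR
  nodeSelector: rack == 'rack-1'
  peerIP: 172.22.101.254
  asNumber: 64512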

Let’s start by applying an additional label to each of your cluster nodes. This label will be used to selectively enable route reflection via BGP. Execute the following command for each node in your cluster, updating the “HOSTNAME-HERE” string to match the hostname configured on each node. If you’re unsure of a node’s hostname, just execute the “hostname” command on that node.

kubectl label node HOSTNAME-HERE route-reflector=true
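
If you would rather label every node in one pass, a short loop works as well; this is just a convenience sketch that assumes your current kubectl context points at the cluster:

for node in $(kubectl get nodes -o name); do kubectl label "$node" route-reflector=true; done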

After this has been completed, you can verify that everything looks correct by executing the following command:

kubectl get nodes --show-labels
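
You can also filter on the new label directly; since every node was labeled, every node should appear in this output:

kubectl get nodes -l route-reflector=true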

Now let’s create a YAML-formatted configuration file. The following command uses a quoted heredoc so the single quotes in the peer selector don’t require any shell escaping:

tee /tmp/calico-bgp-configuration.yaml << 'EOF'
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false
  serviceClusterIPs:
  - cidr: 10.10.10.0/23
  serviceExternalIPs:
  - cidr: 10.10.0.0/23
  - cidr: 172.21.0.0/24
  listenPort: 179
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: as-rtr1
spec:
  peerIP: 172.22.100.254
  asNumber: 64512
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: as-node
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'
EOF

Before you apply this configuration file to the cluster, some of the values will probably need to be modified to match your environment.

The first item to update is the “serviceClusterIPs” entry. This should contain the service network CIDR that you configured when initializing the cluster. If you followed my tutorial for deploying the Kubernetes cluster initially, this value should already be familiar.

  serviceClusterIPs:
  - cidr: 10.10.10.0/23
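
If you don’t recall the value, one way to recover it on a kubeadm-built cluster is to look for the API server’s service-cluster-ip-range flag, which shows up in the cluster-info dump:

kubectl cluster-info dump | grep -m 1 service-cluster-ip-range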

The next item to update is the “serviceExternalIPs” entry. This should contain one or more CIDR entries covering any IP ranges that you intend to assign external service IP addresses from. For example, if you were going to deploy a DNS resolver for your network with an IP address of “10.10.0.1”, then you would add a CIDR entry that includes that particular IP address, such as “- cidr: 10.10.0.0/24”. IP addresses assigned to services from the ranges configured in this entry will be announced as individual /32 routes via BGP. This can be very useful if you need to cherry-pick addresses in between existing network allocations, or if you need to temporarily “hijack” the traffic of another service on your network. A good example of the latter would be brief testing of new services that will replace existing deployments elsewhere on your network.

  serviceExternalIPs:
  - cidr: 10.10.0.0/23
  - cidr: 172.21.0.0/24
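
To make this concrete, here is a minimal sketch of a Service claiming an external IP from one of these ranges; the name, selector, and address are hypothetical placeholders:

apiVersion: v1
kind: Service
metadata:
  name: dns-resolver     # hypothetical service name
spec:
  selector:
    app: dns-resolver    # hypothetical pod label
  ports:
  - name: dns
    protocol: UDP
    port: 53
  externalIPs:
  - 10.10.0.1

Once a service like this is applied, Calico will announce 10.10.0.1/32 to your ToR router.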

Lastly, you may need to update the “peerIP” and “asNumber” entries to reflect the appropriate IP address and ASN of the ToR router that your cluster nodes should peer with.

spec:
  peerIP: 172.22.100.254
  asNumber: 64512
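
If you’re unsure what ASN the nodes themselves will use for their side of the session, you can check it with calicoctl; a stock Calico install defaults to AS 64512 unless you have changed it:

calicoctl get nodes -o wide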

Now that the configuration file is ready, it’s time to apply it to the cluster. Don’t forget that you will need to update your ToR router with the appropriate configuration to support BGP peering from each of your cluster nodes. The remote ASN used by the cluster nodes in this configuration will be the same as the ASN assigned in the previous section. To apply the configuration file to the cluster, execute the following command:

calicoctl apply -f /tmp/calico-bgp-configuration.yaml
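
You can confirm the resources were accepted by reading them back:

calicoctl get bgpConfiguration
calicoctl get bgpPeer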

At this point, if your ToR router has already been properly configured for BGP peering to your cluster nodes, you should see the peering relationships begin to establish within a minute or two, and often sooner, depending on your ToR router’s BGP configuration.
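
To watch the sessions come up from the cluster side, run the following on any node (root is required because it queries the node’s local BIRD daemon):

sudo calicoctl node status

Each configured peer should eventually show a state of “Established”.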

That’s it! Wow, wasn’t that simple? Your Kubernetes cluster is now ready to announce routes for any services and pods that you deploy. Once the peering relationships have been established, you should see announcements covering what is already deployed in the cluster. Note that you will only see /32 announcements for services that have been configured with one or more external IP addresses. None of the default services in a fresh cluster define this setting, so don’t expect to see any out of the gate if you have not already configured something this way.

There are some important details to note about this particular configuration (which can be altered to work slightly differently). With this configuration, node-to-node mesh networking is disabled, which means that traffic for a given service IP will be routed only to the cluster nodes that are actively hosting an instance of the service. This is really handy if any of your services need to see the source IP of ingress traffic (such as for security purposes). Depending on your use case, this might not be the most ideal configuration though, since additional configuration would be required on your router to properly balance ingress traffic across each node hosting a service instance.

When you use the node-to-node mesh networking feature, you can direct ingress traffic for your services to any mesh node in the cluster even if the service doesn’t have an instance running on a particular node. If the service isn’t running locally, the node will forward the traffic to a different node that is running the service. This approach is great if you want ingress traffic to be automatically balanced across all available nodes running a particular service. The caveat here is that when using mesh networking, your service may not always see the source IP address of the ingress traffic since the packets will be rewritten with a source IP address that reflects the cluster node that forwarded the traffic via the mesh network.
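
One related knob worth knowing about: Kubernetes Services expose an “externalTrafficPolicy” field, and setting it to “Local” keeps ingress traffic on nodes that have a local endpoint, which preserves the client source IP. This is a standard Kubernetes setting rather than anything specific to this tutorial’s configuration; the fragment below would go in a Service spec:

spec:
  externalTrafficPolicy: Local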

I recommend you check out this post next to take your Kubernetes deployment to the next level.
