Kubernetes on the MPCDF HPC Cloud
This guide sets up a production Kubernetes cluster. For a testing/development version, check out the dev branch.
The [Heat orchestration template](https://docs.openstack.org/heat/ussuri/template_guide/hot_spec.html) "Magnum ohne Magnum" (MOM) described below automates the deployment of a production-ready Kubernetes cluster on the MPCDF HPC Cloud, including "out-of-the-box" [support](https://github.com/kubernetes/cloud-provider-openstack) for persistent storage and load balancers.
For an equivalent, non-templatized procedure, see the step-by-step version.
Deployment
Dashboard
- Create an application credential with default settings. Record the secret somewhere safe.
openstack application credential create $APP_CRED_NAME
- Launch a new orchestration stack.
  - Select the template mom-template.yaml as a local file or URL.
  - Provide (at least) the application credential ID and secret, as well as the key pair you want to use to log in to the SSH gateway node.
edit mom-env.yaml # fill in (at least) the required parameters
openstack stack create $STACK_NAME -t mom-template.yaml -e mom-env.yaml
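Stack creation runs asynchronously, so it can be useful to wait until Heat reports a terminal state before logging in. The following is a minimal sketch and not part of the template: the function name `wait_for_stack` and the `POLL_INTERVAL` variable are illustrative, and it assumes the `openstack` client is installed and authenticated.

```shell
# Poll a Heat stack until it reaches a terminal state.
# Prints the final status; returns 0 on *_COMPLETE and 1 on *_FAILED.
# POLL_INTERVAL (seconds) is an illustrative knob, defaulting to 15.
wait_for_stack() {
    stack_name=$1
    while :; do
        status=$(openstack stack show "$stack_name" -f value -c stack_status) || return 1
        case $status in
            *_COMPLETE) echo "$status"; return 0 ;;
            *_FAILED)   echo "$status" >&2; return 1 ;;
        esac
        sleep "${POLL_INTERVAL:-15}"
    done
}
```

For example, `wait_for_stack "$STACK_NAME" && openstack stack output list "$STACK_NAME"` lists the stack outputs once creation has finished. Passing `--wait` to `openstack stack create` should achieve a similar effect in a single command.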
Scaling
The number and/or size of the worker nodes may be changed after the initial deployment, as well as the size of the controller. The command-line client makes this easy, for example:
openstack stack update $STACK_NAME --existing --parameter worker_count=$COUNT
Only the changed parameters need to be specified. When the worker flavor is changed, the nodes are rebooted in a rolling fashion, one node every 90 seconds. Scaling is also possible via the dashboard through the "Change Stack Template" action. Be sure to provide the exact same version of the template.
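The update command above can be wrapped in a small helper that rejects an invalid worker count before anything is sent to Heat. This is a sketch under assumptions: `scale_workers` is a hypothetical name, and only the `worker_count` parameter shown above is taken from the template.

```shell
# Scale the worker pool of an existing stack.
# Usage: scale_workers STACK_NAME COUNT
scale_workers() {
    stack_name=$1
    count=$2
    # Refuse anything that is not a plain non-negative integer.
    case $count in
        ''|*[!0-9]*)
            echo "worker count must be a non-negative integer" >&2
            return 2
            ;;
    esac
    # --existing keeps all other parameters; --wait blocks until the update ends.
    openstack stack update "$stack_name" --existing \
        --parameter "worker_count=$count" --wait
}
```

If your client version supports it, running `openstack stack update` with `--dry-run` first should show which resources would change without modifying the stack.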
Administration
You can log in to the gateway via its external IP, found on the dashboard in the "Output" section of the "Overview" tab or with:
openstack stack output show $STACK_NAME gateway_ip -f value -c output_value
ssh $GATEWAY_IP -l root
If you are not in the Garching campus network, you will need to use one of the SSH gateways to reach the gateway machine. For more information, see the connecting documentation.
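From outside the campus network, one way to do this is OpenSSH's ProxyJump option (`-J`, available since OpenSSH 7.3). The sketch below is only an illustration: `gateway_ssh` is a hypothetical helper, and the default value of `JUMP_HOST` is a placeholder that must be replaced with your user name and one of the SSH gateway hosts from the connecting documentation.

```shell
# JUMP_HOST is a placeholder (assumption): set it to user@host for one of the
# SSH gateways named in the connecting documentation.
JUMP_HOST=${JUMP_HOST:-user@ssh-gateway.example.org}

# Open a root shell (or run a command) on the cluster gateway via the jump host.
# Usage: gateway_ssh GATEWAY_IP [COMMAND...]
gateway_ssh() {
    gateway_ip=$1
    shift
    ssh -J "$JUMP_HOST" -l root "$gateway_ip" "$@"
}
```

For example, `gateway_ssh "$GATEWAY_IP" kubectl get node -o wide` runs a single command on the gateway and returns.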
The tools kubectl and helm, as well as the administrative credentials for your Kubernetes cluster, are installed on the SSH gateway. Try:
kubectl get node -o wide
The control plane and worker nodes can be reached via the SSH gateway:
ssh -i ~/.ssh/id_rsa root@IP
Remote Clients
- Download /root/.kube/config from the gateway to your local machine.
- Run export KUBECONFIG=config, or add the contents of the config to your existing environment with kubectl config set-cluster, etc.
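The two steps above might be scripted as follows. This is a sketch, not an official procedure: the helper names and the default destination `./mom-config` are hypothetical, and merging through a colon-separated `KUBECONFIG` with `kubectl config view --flatten` is used here as an alternative to calling `kubectl config set-cluster` by hand.

```shell
# Copy the admin kubeconfig from the gateway and restrict its permissions.
# Usage: fetch_kubeconfig GATEWAY_IP [DEST]
fetch_kubeconfig() {
    gateway_ip=$1
    dest=${2:-./mom-config}   # hypothetical default file name
    scp "root@$gateway_ip:/root/.kube/config" "$dest" && chmod 600 "$dest"
}

# Print a single kubeconfig that combines an existing one with the new one.
# Usage: merge_kubeconfig NEW [EXISTING]
merge_kubeconfig() {
    new=$1
    existing=${2:-$HOME/.kube/config}
    KUBECONFIG="$existing:$new" kubectl config view --flatten
}
```

Write the merged output to a new file, for example `merge_kubeconfig ./mom-config > merged-config`, and inspect it before replacing your existing configuration. Redirecting straight into the existing file would truncate it before it is read.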
Tools such as kubectl should now work out-of-the-box, provided the connections
originate from the specified API client network. This parameter may be updated
as necessary, for example to support off-site administrators. In this case it
is recommended to choose the smallest possible range.
Example Usage
- Externally-accessible service
kubectl apply -f examples/svc-demo.yaml
kubectl get svc svc-demo # note external ip
curl http://$SERVICE_IP
- Ingress-managed endpoints
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/cloud/deploy.yaml
kubectl get svc ingress-nginx-controller -n ingress-nginx # note external ip
kubectl apply -f examples/ingress-demo.yaml
curl http://$INGRESS_IP/demo/
- Pod with persistent storage
kubectl apply -f examples/pvc-demo.yaml
kubectl exec pvc-demo -- /bin/sh -c "echo Hallo > /data/file.txt"
kubectl delete pod pvc-demo
kubectl apply -f examples/pvc-demo.yaml
kubectl exec pvc-demo -- cat /data/file.txt
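Instead of copying the external IP by hand in the first two examples, it can be read with a JSONPath query. The helper below is a sketch: `lb_ip` is a hypothetical name, and it assumes the service lives in the `default` namespace unless one is given.

```shell
# Print the external IP assigned to a LoadBalancer service.
# Usage: lb_ip SERVICE [NAMESPACE]
# Assumption: the namespace falls back to "default" when omitted.
lb_ip() {
    svc=$1
    namespace=${2:-default}
    kubectl get svc "$svc" -n "$namespace" \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

For example, `SERVICE_IP=$(lb_ip svc-demo)` or `INGRESS_IP=$(lb_ip ingress-nginx-controller ingress-nginx)`. Empty output usually means the load balancer has not been assigned an address yet, so retry after a short wait.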
Limitations
- The external network, application credential, and key pair cannot be changed after the initial deployment
- Load balancers are not automatically removed prior to stack deletion, which blocks stack deletion. If possible, delete these resources from Kubernetes beforehand
- Volumes are also not removed automatically but do not block stack deletion
- Kubernetes upgrades and certificate renewal must be performed manually
- containerd is the only supported CRI
- Calico with VXLAN overlay is the only supported CNI
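Because leftover load balancers block stack deletion, it may help to remove all Services of type LoadBalancer from Kubernetes first. The following is a sketch under assumptions, not a tested cleanup procedure: the function names are hypothetical, and it relies on the cloud provider integration deleting each load balancer when its Service is deleted.

```shell
# List "namespace name" pairs for every Service of type LoadBalancer.
list_lb_services() {
    kubectl get svc --all-namespaces \
        -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}'
}

# Delete each of those Services so their load balancers are released.
delete_lb_services() {
    list_lb_services | while read -r ns name; do
        [ -n "$name" ] || continue
        # Keep kubectl from consuming the loop's input.
        kubectl delete svc "$name" -n "$ns" </dev/null
    done
}
```

Review the output of `list_lb_services` before running `delete_lb_services`, then delete the stack, for example with `openstack stack delete $STACK_NAME --yes --wait`. Volumes are not covered by this sketch and still need to be removed separately.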