Anthos clusters on VMware
Issues
- When deploying Anthos clusters on VMware releases with a version number of 1.9.0 or higher, that have the Seesaw bundled load balancer in an environment that uses NSX-T stateful distributed firewall rules,
stackdriver-operator
might fail to creategke-metrics-agent-conf
ConfigMap and causegke-connect-agent
Pods to be in a crash loop. The underlying issue is that stateful NSX-T distributed firewall rules terminate the connection from a client to the user cluster API server through the Seesaw load balancer because Seesaw uses asymmetric connection flows. The integration issue with NSX-T distributed firewall rules affect all Anthos clusters on VMWare releases that use Seesaw. You might see similar connection problems on your own applications when they create large Kubernetes objects whose sizes are bigger than 32K. Follow these instructions to disable NSX-T distributed firewall rules, or to use stateless distributed firewall rules for Seesaw VMs. - If your clusters use a manual load balancer, follow these instructions to configure your load balancer to reset client connections when it detects a backend node failure. Without this configuration, clients of the Kubernetes API server might stop responding for several minutes when a server instance goes down.