From f3c41dd13a0a86382b80d564e9de0d6b06fb1dbf Mon Sep 17 00:00:00 2001
From: "Suren A. Chilingaryan"
Date: Sun, 11 Mar 2018 19:56:38 +0100
Subject: Various fixes before moving to hardware installation

---
 docs/network.txt | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)
 create mode 100644 docs/network.txt

(limited to 'docs/network.txt')

diff --git a/docs/network.txt b/docs/network.txt
new file mode 100644
index 0000000..a164d36
--- /dev/null
+++ b/docs/network.txt
@@ -0,0 +1,58 @@
+Configuration
+=============
+openshift_ip                              InfiniBand IPs for fast communication (also used for the ADEI/MySQL bridge,
+                                          so they should reside on the fast network).
+openshift_hostname                        The 'cluster' host name. Should match the real host name for certificate validation,
+                                          so it should be set if the default IP does not resolve to the host name.
+openshift_public_ip                       We may either skip this or set it to our 192.168.26.xxx network. Usage is unclear.
+openshift_public_hostname                 I guess it is also for certificates, but when communicating with external systems.
+openshift_master_cluster_hostname         Internal cluster load balancer, or just a pointer to the master host.
+openshift_public_master_cluster_hostname  The main cluster gateway.
+
+
+Complex Network
+===============
+Some parts of the OpenShift Ansible scripts are still implemented with the assumption that we
+have a simple network configuration with a single interface communicating with the outside
+world. There are several variables to change this:
+  openshift_set_node_ip - This variable configures nodeIP in the node configuration. This
+    variable is needed in cases where it is desired for node traffic to go over an interface
+    other than the default network interface.
+  openshift_ip - This variable overrides the cluster internal IP address for the system.
+    Use this when using an interface that is not configured with the default route.
+  openshift_hostname - This variable overrides the internal cluster host name for the system.
+    Use this when the system’s default IP address does not resolve to the system host name.
+Furthermore, if we use InfiniBand, which is not accessible from the outside world, we need to set
+  openshift_public_ip - Use this for cloud installations, or for hosts on networks using
+    network address translation (NAT).
+  openshift_public_hostname - Use this for cloud installations, or for hosts on networks
+    using network address translation (NAT).
+
+However, these settings are not used by all system components. Some provisioning code and installed
+scripts still detect a kind of 'main system IP' to look for services. This IP is identified either as
+'ansible_default_ip' or by code that looks up the IP used to send packets over the default route.
+In the end, Ansible does the same thing. This works badly for several reasons:
+  - We have keepalived IPs moving between systems. The scripts actually pick up these
+    moving IPs instead of the fixed IP bound to the system.
+  - There could be several default routes. While this is not a problem in itself, the scripts
+    do not expect it and may fail.
+
+For instance, the script '99-origin-dns.sh' in /etc/NetworkManager/dispatcher.d:
+ * def_route=$(/sbin/ip route list match 0.0.0.0/0 | awk '{print $3 }')
+   1) does not expect multiple default routes and will just pick an arbitrary one. Then the
+ * if [[ ${DEVICE_IFACE} == ${def_route_int} ]]; then
+   check may fail, and resolv.conf will not be updated, because the interface that just came
+   up is considered not to be on the default route although it actually is. Furthermore,
+ * def_route_ip=$(/sbin/ip route get to ${def_route} | awk '{print $5}')
+   2) is ignorant of keepalived and may bind to a keepalived IP.
+
+ But I am not sure the problems are limited to this script. There could be other places with
+ the same logic.
+ Some details are here:
+  https://docs.openshift.com/container-platform/3.7/admin_guide/manage_nodes.html#manage-node-change-node-traffic-interface
+
+Hostnames
+=========
+ The Linux host name (uname -a) should match the host names assigned to the OpenShift nodes. Otherwise, certificate
+ verification will fail. It seems a minor issue, as the system continues functioning, but it is better avoided. The check can be performed with etcd:
+  etcdctl3 --key=/etc/etcd/peer.key --cacert=/etc/etcd/ca.crt --endpoints="192.168.213.1:2379,192.168.213.3:2379,192.168.213.4:2379"
--
cgit v1.2.3
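
Note on the multiple-default-route case described under 'Complex Network': the following is a minimal shell sketch, not taken from 99-origin-dns.sh, of how one default route could be chosen deterministically instead of taking whatever the first awk match returns. The helper name pick_default_route and the rule "prefer the lowest metric" are assumptions made for illustration only. The sketch does not address the keepalived problem.

```shell
#!/bin/bash
# Hypothetical helper (not part of 99-origin-dns.sh): read `ip route` output
# on stdin, e.g.
#   default via 192.168.26.1 dev eth0 proto static metric 100
# and print "<gateway> <interface>" for the default route with the lowest
# metric. Assumption: the lowest-metric default route is the one we want.
pick_default_route() {
    awk '
        $1 == "default" {
            gw = ""; dev = ""; metric = 0      # a missing metric counts as 0
            for (i = 2; i < NF; i++) {
                if ($i == "via")    gw = $(i + 1)
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "metric") metric = $(i + 1) + 0
            }
            # Keep the first route seen, or any later one with a lower metric.
            if (best == "" || metric < best_metric) {
                best = gw " " dev
                best_metric = metric
            }
        }
        END { if (best != "") print best }
    '
}

# Possible usage on a live system (left commented out on purpose):
#   read -r def_route def_route_int < <(/sbin/ip route list match 0.0.0.0/0 | pick_default_route)
```

With two default routes in the input, the function always prints a single gateway/interface pair, so a later comparison against the interface that just came up operates on one well-defined value.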