Diffstat (limited to 'docs/consistency.txt')
-rw-r--r-- docs/consistency.txt | 36
1 files changed, 36 insertions, 0 deletions
diff --git a/docs/consistency.txt b/docs/consistency.txt
new file mode 100644
index 0000000..127d9a7
--- /dev/null
+++ b/docs/consistency.txt
@@ -0,0 +1,36 @@
+General overview
+=================
+ - etcd services (worth checking both ports)
+    etcdctl3 --endpoints="192.168.213.1:2379" member list - doesn't check health only reports members
+    oc get cs - only etcd (other services will fail on Openshift)
+ - All nodes and pods are fine and running and all pvc are bound
+    oc get nodes
+    oc get pods --all-namespaces -o wide
+    oc get pvc --all-namespaces -o wide
+ - API health check
+    curl -k https://apiserver.kube-service-catalog.svc/healthz
+
+Storage
+=======
+ - Heketi status
+    heketi-cli -s http://heketi-storage.glusterfs.svc.cluster.local:8080 --user admin --secret "$(oc get secret heketi-storage-admin-secret -n glusterfs -o jsonpath='{.data.key}' | base64 -d)" topology info
+ - Status of Gluster Volume (and its bricks which with heketi fails often)
+    gluster volume info
+    ./gluster.sh info all_heketi
+ - Check available storage space on system partition and LVM volumes (docker, heketi, ands)
+    Run 'df -h' and 'lvdisplay' on each node
+
+Networking
+==========
+ - Check that both internal and external addresses are resolvable from all hosts.
+    * I.e. we should be able to resolve 'google.com'
+    * And we should be able to resolve 'heketi-storage.glusterfs.svc.cluster.local'
+
+ - Check that keepalived service is up and the corresponding ip's are really assigned to one
+   of the nodes (vagrant provisioner would remove keepalived tracked ips, but keepalived will
+   continue running without noticing it)
+
+ - Ensure, we don't have override of cluster_name to first master (which we do during the
+   provisioning of OpenShift plays)
+
+
\ No newline at end of file
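
The "nodes, pods and pvc" checks in the General overview section could be partly scripted. What follows is a minimal sketch, not part of this commit. The function names (not_ready_nodes, unhealthy_pods, unbound_pvcs) are hypothetical. It assumes the output of 'oc get ... --no-headers' keeps the usual column order: NAME STATUS for nodes, NAMESPACE NAME READY STATUS for pods, and NAMESPACE NAME STATUS for pvc. Each function is a plain filter on stdin and prints only the entries that need attention.

```shell
#!/bin/sh
# Hypothetical helpers: each one reads `oc get ... --no-headers` output on stdin.
# Column positions are an assumption about the default table layout.

# Nodes whose STATUS is not exactly "Ready".
# Note: "Ready,SchedulingDisabled" is reported too, which is usually worth a look.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1, $2 }'
}

# Pods (listed with --all-namespaces) that are neither Running nor Completed.
# Note: a Running pod with unready containers (READY 0/1) is NOT caught here.
unhealthy_pods() {
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# PVCs (listed with --all-namespaces) whose STATUS is not "Bound".
unbound_pvcs() {
    awk '$3 != "Bound" { print $1 "/" $2, $3 }'
}

# Intended usage on a master node (not executed here):
#   oc get nodes --no-headers                | not_ready_nodes
#   oc get pods --all-namespaces --no-headers | unhealthy_pods
#   oc get pvc  --all-namespaces --no-headers | unbound_pvcs
```

Empty output from all three filters corresponds to the "fine and running and all pvc are bound" condition. The etcd, Heketi, Gluster and keepalived checks are left manual, since their output is harder to judge mechanically.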