From 18da6e4b5942f4fcaa9db3ba3bf1dfcd1857e9ea Mon Sep 17 00:00:00 2001
From: "Suren A. Chilingaryan"
Date: Thu, 10 Jan 2019 06:43:26 +0100
Subject: Update troubleshooting documentation

---
 docs/consistency.txt | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

(limited to 'docs/consistency.txt')

diff --git a/docs/consistency.txt b/docs/consistency.txt
index dcf311a..082a734 100644
--- a/docs/consistency.txt
+++ b/docs/consistency.txt
@@ -5,7 +5,7 @@ General overview
         oc get cs                               - only etcd (other services will fail on Openshift)
  - All nodes and pods are fine and running and all pvc are bound
         oc get nodes
-        oc get pods --all-namespaces -o wide
+        oc get pods --all-namespaces -o wide    - Also check that no pods are stuck in Terminating/Pending status for a long time
         oc get pvc --all-namespaces -o wide
  - API health check
         curl -k https://apiserver.kube-service-catalog.svc/healthz
@@ -50,10 +50,14 @@ Networking
         ovs-vsctl del-port br0
     This does not solve the problem, however. The new interfaces will get abandoned by OpenShift.
 
-
 ADEI
 ====
  - MySQL replication is working
- - No caching pods are hung (for whatever reason)
- 
-
\ No newline at end of file
+ - No caching pods or maintenance pods are hung (for whatever reason)
+    * Check that no ADEI pods are stuck in Deleting/Pending status
+    * Check the logs of the 'cacher' and 'maintenance' scripts and ensure none is stuck on a very old timestamp (unless we are re-caching something huge)
+    * Ensure there are no old pending scripts in '/adei/tmp/adminscripts'
+   Possible reasons:
+    * Stale 'flock' locks (can be found by analyzing backtraces in the corresponding /proc//stack)
+    * Hung connections to MySQL (can be found by executing 'SHOW PROCESSLIST' on the MySQL servers)
+ 
\ No newline at end of file
--
cgit v1.2.3
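
The pod-status and pending-script checks described in this patch could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names 'stuck_pods' and 'old_scripts' are hypothetical, the 60-minute default age threshold is an arbitrary choice, and 'old_scripts' is assumed to run somewhere the '/adei/tmp/adminscripts' directory is visible (for example inside an ADEI pod). The 'stuck_pods' helper assumes the usual column layout of 'oc get pods --all-namespaces' (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: list pods whose STATUS column is Terminating or Pending.
# Reads the output of `oc get pods --all-namespaces [-o wide]` on stdin and
# skips the header line. Whether a pod has been in that state "for a long
# time" still has to be judged from the AGE column by the operator.
stuck_pods() {
    awk 'NR > 1 && ($4 == "Terminating" || $4 == "Pending") { print $1 "/" $2, $4 }'
}

# Hypothetical helper: list regular files in a directory that were last
# modified more than N minutes ago (N defaults to 60, an assumed threshold).
old_scripts() {
    find "$1" -type f -mmin +"${2:-60}"
}

# Example usage (not executed here, requires cluster access):
#   oc get pods --all-namespaces -o wide | stuck_pods
#   old_scripts /adei/tmp/adminscripts 60
```

Both helpers only report candidates. Deciding whether a pod or script is really hung still requires looking at the logs, the 'flock' state, and the MySQL process list as described in the patch.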