Local volumes and StatefulSet to provision Master/Slave MySQL and Galera cluster

author: Suren A. Chilingaryan <csa@suren.me> 2018-03-20 15:47:51 +0100
committer: Suren A. Chilingaryan <csa@suren.me> 2018-03-20 15:47:51 +0100
commit: e2c7b1305ca8495065dcf40fd2092d7c698dd6ea (patch)
tree: abcaa7006a9c4b7a9add9bd0bf8c24f7f8ce048f /docs/README
parent: 47f350bc3aa85a8bd406d95faf084df2abf74ae9 (diff)
download: ands-e2c7b1305ca8495065dcf40fd2092d7c698dd6ea.tar.gz
ands-e2c7b1305ca8495065dcf40fd2092d7c698dd6ea.tar.bz2
ands-e2c7b1305ca8495065dcf40fd2092d7c698dd6ea.tar.xz
ands-e2c7b1305ca8495065dcf40fd2092d7c698dd6ea.zip
1 files changed, 91 insertions, 0 deletions
diff --git a/docs/README b/docs/README
new file mode 100644
index 0000000..4f75b5b
--- /dev/null
+++ b/docs/README
@@ -0,0 +1,91 @@
+OpenShift Platform
+------------------
+The OpenShift web frontend is running at 
+     https://kaas.kit.edu:8443
+
+However, I find it simpler to use the command-line tool 'oc':
+ - On RedHat platforms the package is called 'origin-clients' and is installed 
+ from the OpenShift repository, which is available as the package 'centos-release-openshift-origin'.
+ - For other distributions, check here (we are running version 3.7):
+    https://docs.openshift.org/latest/cli_reference/get_started_cli.html#installing-the-cli
+
+The following is also good documentation to get started:
+    https://docs.openshift.com/container-platform/3.7/dev_guide/index.html
+
+Infrastructure
+--------------
+ - We have 3 servers running with the names ipekatrin[1-3].ipe.kit.edu. These are internal names. External 
+ access is provided using 2 virtual ping-pong IPs, katrin[1-2].ipe.kit.edu. By default they are assigned 
+ to both master servers of the cluster, but both will migrate to a single surviving server if one of the 
+ masters dies. This is handled by the keepalived daemon and ensures load balancing and high availability. 
+ The domain name 'kaas.kit.edu' resolves to both IPs in round-robin fashion. 
+ 
+ - By default, the running services have names of the form '<service-name>.kaas.kit.edu'. For instance,
+ you can test
+    adei-katrin.kaas.kit.edu            - This is an ADEI service running on the new platform
+    adas-autogen.kaas.kit.edu           - Sample ADEI with generated data
+    katrin.kaas.kit.edu                 - This is the placeholder for the future katrin router
+    etc.
+ 
+ - The OpenVPN connection to the KATRIN virtual network is running on the master servers. Non-masters route the traffic
+ through the masters using the keepalived IP. So, the katrin network should be transparently visible from any pod in
+ the cluster.
+
+Users
+-----
+ I have configured a few user accounts using ADEI and UFO passwords. Furthermore, to avoid a mess of 
+containers, I have created a number of projects with appropriate administrators.
+  kaas  (csa, kopmann)  - This is a routing service (basically Apache mod_rewrite) to set redirects from http://katrin.kit.edu/*
+  katrin (katrin)       - Katrin database
+  adei (csa)            - All ADEI setups
+  bora (ntj)            - BORA
+  web (kopmann)         - Various web sites, etc.
+  mon (csa)             - Monitoring
+  test (*)              - Project for testing
+
+If needed, I can create more projects/users. Just let me know.
+
+Storage
+-------
+ I have created a couple of gluster volumes for different purposes:
+    katrin_data         - For katrin data files
+    datastore           - Other non-katrin large data files
+    openshift           - 3 times replicated volume for configuration, sources, and other important small files
+    temporary           - Logs, temporary files, etc.
+    
+ Again, to avoid mixing data from the different projects, each volume has subfolders for all projects. Furthermore,
+ I have tried to add a bit of protection and assigned each project a range of group ids. The subfolders can only be read
+ by the appropriate group. I also pre-created the corresponding PersistentVolumes (pv) and PersistentVolumeClaims (pvc): 'katrin', 'data', ...
+ 
+ There is a special pvc called 'host'. This is to save data on the local raid array bypassing gluster (i.e. on each OpenShift node
+ the content of the folder will be different).
+ 
+ WARNING: Gluster supports dynamic provisioning using Heketi. It is installed and works. However, Heketi is far from being 
+ of production quality. I think it is OK to use it for some temporary data if you want, but I would suggest using pre-created 
+ volumes for important data.
+
+ - Currently, I don't plan to provide access to the servers themselves. The storage should be managed solely from the OpenShift pods.
+ I made a sample 'manager' pod equipped with scp, lftp, curl, etc. It mounts all default storage. You need to start it and then
+ you can connect interactively using either the web interface or the console app.
+        oc -n katrin scale dc/kaas-manager --replicas 1
+        oc -n katrin rsh dc/kaas-manager
+ This is just an example; build your own configuration with the required set of packages.
+ 
+Databases
+---------
+ Gluster works fine if you mostly read data or if you perform mostly sequential writes. It performs very badly with databases and similar
+ loads. I guess it should not be an issue for the Katrin database, as it is relatively small (AFAIK) and does not perform many writes. For something
+ like ADEI, Gluster is not a viable option to back a MySQL server. There are several options to handle volumes for applications performing a
+ large amount of small random writes:
+    - If High Availability (HA) is not important, just pin a pod to a certain node and use the 'host' pvc.
+    - For databases, either Master/Slave replication can be enabled (you will still need to pin the node and use the 'host' pvc), or a Galera cluster 
+    can be installed for multi-master replication. It is configured using the StatefulSet feature of OpenShift. I have not tested recovery thoroughly, 
+    but it is working, performs quite well, and the masters are synchronized without problems.
+    - For non-database applications, Gluster block storage may be used. The block storage is not shared between multiple pods, but private
+    to a specific pod. So, it is possible to avoid a certain amount of locking and context switches, and performance is significantly better. I was
+    even able to run the ADEI database on top of such a device, though it is still significantly slower than native host performance. There is again
+    a heketi-based provisioner, but it works even worse than the one providing standard Gluster volumes. So, I suggest asking me to create block
+    devices manually if necessary.
+    
+ Overall, if you have a data-intensive workload, we can discuss the best approach.
+ 
+\ No newline at end of file
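
The Databases section of the README above suggests pinning a pod to a certain node when the 'host' pvc is used, but does not show how. The following is a minimal sketch of one way this might be done, not necessarily the author's method: it builds a nodeSelector patch for `oc patch` and prints the command for review instead of running it. The resource name `dc/mysql` is a hypothetical placeholder, and it is an assumption that the nodes carry the standard `kubernetes.io/hostname` label with a value equal to the internal server name.

```shell
#!/bin/sh
# Sketch only: pin the pods of a DeploymentConfig to a single node.
# Assumption: nodes carry the standard 'kubernetes.io/hostname' label and
# its value matches the internal server name (check 'oc get nodes --show-labels').

# Build the JSON patch that adds a nodeSelector to the pod template.
node_pin_patch() {
    # $1: node host name
    printf '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"%s"}}}}}\n' "$1"
}

# Print the 'oc patch' command rather than executing it, so that it can be
# reviewed before it is applied to the cluster.
pin_command() {
    # $1: project, $2: resource (hypothetical, e.g. dc/mysql), $3: node host name
    printf "oc -n %s patch %s -p '%s'\n" "$1" "$2" "$(node_pin_patch "$3")"
}

# 'dc/mysql' is a placeholder name, not a resource known to exist in the cluster.
pin_command adei dc/mysql ipekatrin1.ipe.kit.edu
```

The printed command can be pasted into a shell once the project, resource, and node names have been checked. Pinning alone is not enough: as the README notes, the pod still needs to mount the 'host' pvc to store its data on the local raid array.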