Ands v.2
========
 - Try the overlay2 storage driver (LVM is used in Ands v.1). Also check further docker configuration options: 'cgroup-driver', ...
    * This actually seems problematic in CentOS-8. Something like 'rsync portage portage/.tmp' is EXTREMELY slow (<1 MB/s). Just check eix-sync.
 - Integrate fast Ethernet and use container-native networking. OpenVSwitch is slow and causes problems. Alternatively, can we rely on hardware features of recent network cards, e.g. Mellanox ASAP2 (Accelerated Switch and Packet Processing)?
 - Do not run pods on master nodes, but Gluster and a few database pods (MySQL) are OK (multiple reasons, especially mounting a lot of Gluster volumes).
    * Restrict all periodic jobs to a specific node: easy to re-install (non-master), fast SSD storage, ...?
 - Object Storage should be integrated: either Gluster Block is ready for production or we have to use Ceph as well.
 - Automatic provisioning would be much better than handling volumes through Ands. Basically, this will render Ands redundant. We can switch to Helm, etc. But we need the ability to easily understand which volumes belong to which pod/namespace and to automatically kill redundant volumes.
 - Avoid conflicts with SCC private VLANs (KIT WiFi, VPN, ...?)

Questions
=========
 - Updates to cluster configuration (evaluate current load, etc.)? Non-scheduling masters? Something with storage? Specify appropriate node parameters.
 - Shall we switch to plain Kubernetes or keep using OpenShift? Discussion (just take a note about security: it is right to ban containers running as root, otherwise it is a hazard to our data model):
     https://cloudowski.com/articles/10-differences-between-openshift-and-kubernetes/
 - Can we find a good distributed storage for data-intensive databases? The current master-slave model requires too much manual attention.
 - Can we find a way to run GUI applications in containers? Something like having the CUDA profiler would be nice.
 - Think about monitoring. Probably SNMP, and it would be nice to have some kind of SQL database with performance metrics.
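
A possible starting point for the volume bookkeeping mentioned in the automatic-provisioning item of the Ands v.2 list: a minimal Python sketch that maps persistent volume claims to the pods mounting them and lists claims no pod references. It assumes the input is the 'items' list from 'kubectl get pods --all-namespaces -o json' and 'kubectl get pvc --all-namespaces -o json' (or the 'oc' equivalents). The function names are hypothetical, and an unreferenced claim is only a candidate for review, not proof that the volume is redundant.

```python
from typing import Dict, Iterable, List, Tuple

ClaimKey = Tuple[str, str]  # (namespace, claim name)


def claims_in_use(pods: Iterable[dict]) -> Dict[ClaimKey, List[str]]:
    """Map each (namespace, claim) to the names of the pods that mount it."""
    usage: Dict[ClaimKey, List[str]] = {}
    for pod in pods:
        meta = pod.get("metadata", {})
        namespace = meta.get("namespace", "default")
        # Only volumes backed by a PersistentVolumeClaim are of interest;
        # configMap, secret, emptyDir, etc. are skipped.
        for volume in pod.get("spec", {}).get("volumes") or []:
            pvc = volume.get("persistentVolumeClaim")
            if pvc:
                key = (namespace, pvc["claimName"])
                usage.setdefault(key, []).append(meta.get("name", "?"))
    return usage


def unused_claims(pvcs: Iterable[dict], pods: Iterable[dict]) -> List[ClaimKey]:
    """Return the sorted (namespace, claim) pairs that no pod references."""
    used = claims_in_use(pods)
    candidates = []
    for pvc in pvcs:
        meta = pvc.get("metadata", {})
        key = (meta.get("namespace", "default"), meta["name"])
        if key not in used:
            candidates.append(key)
    return sorted(candidates)


if __name__ == "__main__":
    # Hand-written sample data in the shape of the Kubernetes JSON output.
    pods = [{
        "metadata": {"namespace": "db", "name": "mysql-0"},
        "spec": {"volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": "mysql-data"}},
            {"name": "cfg", "configMap": {"name": "mysql-cfg"}},
        ]},
    }]
    pvcs = [
        {"metadata": {"namespace": "db", "name": "mysql-data"}},
        {"metadata": {"namespace": "test", "name": "old-scratch"}},
    ]
    for key, pod_names in claims_in_use(pods).items():
        print(key, "mounted by", pod_names)
    print("review candidates:", unused_claims(pvcs, pods))
```

The sketch only reports. Actual deletion is left out on purpose, since claims of scaled-down deployments or finished jobs would also show up as unreferenced.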