sysadmin:projects:s23:linuxrebuild

Differences

This shows you the differences between two versions of the page.

sysadmin:projects:s23:linuxrebuild [2023/04/12 18:51] kjohns23sysadmin:projects:s23:linuxrebuild [2023/04/12 19:50] (current) kjohns23
Line 3: Line 3:
 ==== Problem ==== ==== Problem ====
  
-The current Linux/NoMachine setup has two main issues. An increase in VSCode sessions has increased the total resource consumption on the linux nodes. +The current Linux/NoMachine setup has two main issues. An increase in VSCode sessions has increased the total resource consumption on the Linux nodes, leading to partial or whole system outages.
  
 Any solution other than rebuilding the current system as is will involve parallelizing the setup to reduce the impact any one failed node will have on other students using the cluster. Any solution other than rebuilding the current system as is will involve parallelizing the setup to reduce the impact any one failed node will have on other students using the cluster.
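The effect of parallelizing can be illustrated with a little arithmetic. The sketch below assumes sessions are spread evenly across nodes, which is an idealization and not a measurement of the current cluster; the function name is hypothetical.

```python
def affected_fraction(total_nodes: int, failed_nodes: int = 1) -> float:
    """Fraction of sessions lost when `failed_nodes` of `total_nodes` go down.

    Assumes sessions are distributed evenly across nodes (an idealization).
    """
    if total_nodes < 1 or not 0 <= failed_nodes <= total_nodes:
        raise ValueError("need total_nodes >= 1 and 0 <= failed_nodes <= total_nodes")
    return failed_nodes / total_nodes


if __name__ == "__main__":
    # Compare a single shared node against progressively wider setups.
    for nodes in (1, 2, 4, 8):
        print(f"{nodes} node(s): {affected_fraction(nodes):.1%} of sessions affected by one failure")
```

Under that assumption, each doubling of the node count halves the share of students affected by a single node failure.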
  
-Most potential solutions suggest breaking apart the NoMachine and SSH services as it is easier (and cheaper) to parallelize the SSH systems where as NoMachine is licenses per server and CPU core.+Most potential solutions suggest separating the NoMachine and SSH services, as it is easier (and cheaper) to parallelize the SSH systems, whereas NoMachine is licensed per server and CPU core.
  
 ==== Solutions ==== ==== Solutions ====
Line 23: Line 23:
   * Does not limit impact by any one node becoming unavailable   * Does not limit impact by any one node becoming unavailable
   * Limits the ability to give extra resources to bursty workloads   * Limits the ability to give extra resources to bursty workloads
 +  * NoMachine head node remains particularly vulnerable to outages (would take down all NX sessions)
 +
 +== Notes ==
 +  * A more thorough use of Ansible would be recommended to effectively manage updates
  
 === New SSH VMs === === New SSH VMs ===
  
 Build new KVM based VMs in Proxmox Build new KVM based VMs in Proxmox
 +
 +== Benefits ==
 +  * Setup is closer to current system and would involve fewer unknowns
 +  * SSH/VSCode connections would no longer impact NoMachine
 +
 +== Drawbacks ==
 +  * Limited ability to parallelize ssh before management of nodes becomes more difficult
 +
 +=== Kubernetes Based Shared Containers ===
 +
 +Build a SoCS Linux Docker container and deploy it to Kubernetes. These containers would be shared, similar to the current environment, but Kubernetes offers the opportunity to run many more nodes in parallel than could be effectively managed in a VM-based setup.
 +
 +== Benefits ==
 +  * Potential to auto-scale cluster to more responsively meet the load
 +  * SSH/VSCode connections would no longer impact NoMachine
 +  * Container based setup could also be distributed to students
 +
 +== Drawbacks ==
 +  * Potentially more complex setup with more unknowns
 +
 +== Notes ==
 +  * Will need to determine the best cluster ingress configuration. MetalLB? Traefik? HAProxy? Something else?
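To make the auto-scaling benefit concrete, here is a minimal sketch of the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The choice of CPU utilization as the metric and the numbers used are assumptions for illustration only; the real autoscaler also applies a tolerance band and configured min/max bounds, which are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: int, target_metric: int) -> int:
    """Simplified HPA-style rule: scale replicas in proportion to load.

    Metrics are given as whole-number percentages (e.g. average CPU utilization).
    Never scales below one replica.
    """
    if current_replicas < 1 or current_metric < 0 or target_metric <= 0:
        raise ValueError("replicas must be >= 1, metric >= 0, target > 0")
    return max(1, math.ceil(current_replicas * current_metric / target_metric))


if __name__ == "__main__":
    # Hypothetical numbers: 4 shared containers at 90% CPU with a 60% target.
    print(desired_replicas(4, 90, 60))
    # The same 4 containers at 30% CPU could shrink.
    print(desired_replicas(4, 30, 60))
```

With these assumed inputs the rule grows the pool under load and shrinks it when usage drops, which is the "more responsive" behavior the benefits list refers to.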
  
 === Container SSH === === Container SSH ===
  
-Use [[ContainerSSH|https://containerssh.io/]] to allow one kubernetes container per student+Use [[https://containerssh.io/|ContainerSSH]] to allow one Kubernetes container per student. Students would SSH to the cluster as they currently do, but they would instead be routed to their own dynamically provisioned container.
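The per-student routing idea can be sketched as a lookup that provisions a container on first connection and reuses it afterwards. This is a conceptual model only and does not use or represent ContainerSSH's actual API or configuration; the class name and the `ssh-<username>` naming scheme are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SessionRouter:
    """Maps each username to a dedicated container, created on first use."""

    containers: Dict[str, str] = field(default_factory=dict)

    def route(self, username: str) -> str:
        # Provision lazily: the first login creates the entry, later logins reuse it.
        if username not in self.containers:
            self.containers[username] = f"ssh-{username}"  # hypothetical naming scheme
        return self.containers[username]


if __name__ == "__main__":
    router = SessionRouter()
    print(router.route("alice"))  # first login provisions a container
    print(router.route("bob"))    # a different student gets a separate one
    print(router.route("alice"))  # a repeat login is routed to the same container
```

Because no two usernames share an entry, one student's workload stays confined to that student's own container.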
  
 == Benefits == == Benefits ==
   * Completely removes the impact of one student on another user's environment   * Completely removes the impact of one student on another user's environment
 +  * Container based setup could also be distributed to students
  
 == Drawbacks == == Drawbacks ==
   * Under relatively inactive development - new and potentially unstable   * Under relatively inactive development - new and potentially unstable
   * Complex setup for authentication server   * Complex setup for authentication server
sysadmin/projects/s23/linuxrebuild.1681325475.txt.gz · Last modified: 2023/04/12 18:51 by kjohns23