High Performance Computing Clusters in the Computer Science Department, Cal Poly, SLO
Cal Poly, SLO, where I work, is a teaching university. We don’t do a lot of research, and historically there hasn’t been much high-performance computing on campus. A number of years ago, Information Technology Services (ITS, the central campus computing group) got a grant and set up a small cluster for general use. The users of that cluster I’ve talked to weren’t very happy with it: it wasn’t particularly big, and it wasn’t particularly easy to use. Since then, various faculty with research projects have gotten their own clusters, operated by their departments rather than ITS. The first such cluster I’m aware of was ravel.csc.calpoly.edu, which I helped spec out, buy, rack up, and load the operating system onto.
Ravel.csc.calpoly.edu is a 23-node high-performance computing cluster located in the Computer Science Department, Cal Poly, SLO, building 14-238. It was originally acquired by Dr. Diana Franklin with an NSF grant in April of 2007. The compute nodes are two-socket, dual-core machines (4 cores per node), for a total of 92 64-bit Intel Xeon 5130 cores at 2 GHz. Each node has 4GB of memory. Nodes use Gigabit Ethernet for the interconnect. The head node has a 1TB external array of SAS drives to hold user data. User home directories are mounted with NFS.
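On a compute node, that NFS home-directory mount could be expressed as an /etc/fstab entry along these lines (the export path here is a guess for illustration; Rocks normally configures home-directory mounts itself):

```shell
# /etc/fstab entry mounting home directories from the head node over NFS
# (hypothetical export path -- Rocks usually manages this automatically)
ravel.csc.calpoly.edu:/export/home  /home  nfs  defaults  0 0
```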
Ravel runs a Linux distribution for HPC named Rocks.
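Rocks clusters typically ship with the Sun Grid Engine (SGE) roll for batch scheduling. Assuming SGE is available on ravel (an assumption; the job name and parallel environment name below are illustrative, not ravel’s actual configuration), a minimal batch job might look like:

```shell
#!/bin/bash
# hello.qsub -- minimal SGE batch script (a sketch, not ravel's actual config)
#$ -cwd          # run the job from the directory it was submitted from
#$ -N hello      # job name shown by qstat
#$ -pe mpi 4     # request 4 slots, i.e. one full node (hypothetical PE name)
echo "running on $(hostname)"
```

Submit it with `qsub hello.qsub`; standard output is written to a file like `hello.o<jobid>` in the submission directory.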
Dr. David Marshall in the Cal Poly Aerospace Engineering Department got a grant to buy a cluster for research. He bought a small cluster from PSSC Labs. Steger.aero.calpoly.edu has 11 nodes. Each compute node is a two-socket, quad-core machine (8 cores per node), for a total of 88 64-bit AMD Opteron 2378 cores at 2.4 GHz. Each node has 32GB of memory. Nodes have both InfiniBand and Gigabit Ethernet interconnects. The cluster has a separate 4TB filer acting as an NFS server to hold user data. User home directories are mounted with NFS.
The AERO department did not have a space with proper power or cooling to operate steger, so they asked us if we could host it for them. We moved it to the Computer Science department machine room in August 2009. So I’m now the administrator for steger. One of the first things I did was reload steger with Rocks.
Dr. Chris Lupo got a grant to explore using the rendering engines on video cards as compute nodes in a cluster. NVIDIA now sells what were once “video cards” as general-purpose graphics processing units (GPGPUs or GPUs). We wound up buying a workstation from AMAX with four NVIDIA Tesla C2050 GPU cards in it. That’s 4 cards × 448 cores per card = 1,792 GPU cores in a 4U workstation form factor. It came preloaded with Debian. We’re considering putting Rocks on it, too.
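The core count is simple arithmetic, and worth a quick sanity check:

```shell
# 4 Tesla C2050 cards x 448 CUDA cores per card
echo $((4 * 448))    # prints 1792
```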
I’m the primary system administrator for all of these systems (Greg Porter, glporter@calpoly.edu). Email me if you have issues. Other system administrators might be able to help as well; try computer-science-sysadmins@polymail.calpoly.edu. All of these clusters are open for general use by students, faculty, and staff. Email Greg Porter or the sysadmins (addresses above) if you’d like an account.