In this post I will share some thoughts about the idea behind the open source data center. The general architecture will look like this:

Open Source Datacenter

In the open source data center we have three logical units, storage servers, virtual hosts and a controller node. All servers will run with a RHEL clone, in this case with Scientific Linux.CentOS or RHEL itself will also work and, with some modifications in the bootstrap scripts, almost any Linux will be able to server this kind of structure. There is no special reason for using SL except the case that it is enterprise ready and that it is supported for more than ten years.

The storage servers will operate as a network RAID system based on GlusterFS http://www.gluster.org/. GlusterFS is used i.e. at CERN to store the huge amount of data collected during there experiments. It can scale up to many Peta bytes with standard hardware.

The virtual layer will be based on KVM. In principal, the type of the hypervisor does not matter, regardless if we are using KVM, XEN, VMware or whatsoever. The storage is presented via NFS and the control is done by the controller unit. This is supported by all of them but KVM is the product that came out of the box with nearly every Linux, is completely GPL does nor cause additional costs. The hypervisor will be controlled with oVirt and Puppet. Openstack is also an option, but currently it is easier to work with oVirt.

In the next blog I will share my thoughts about infrastructure as code and why it will become extremely useful for this project.