Over the past few months DDH has been undertaking a major infrastructure project – a complete server renewal – which we’re just about to complete. We thought it might be interesting to say a bit more about the project, and about the way in which we run our server infrastructure.
Once this work is complete – and the migration from old infrastructure to new is now underway (hooray!) – we will have a robust, resilient and high-performance development environment and platform for hosting our web-based projects. It will last us for a full five years, with a sustainability and replacement plan taking us well into the future. It’s taken us a while to get here – as well as the support of College and a fair bit of creative thinking all round – but we now have the flexibility to support and sustain projects of all sizes. We have modest annual costs associated with the colocation of our servers, but our sustainability planning covers this. Otherwise we’re self-sufficient, and this is a great place for a department like ours to be.
Where we’ve come from
A few people in the department will have fond memories of the original CCH server infrastructure – namely, Maple, a Sun SPARCstation 20 (aka ‘pizza box’) that already had a good vintage when KCL’s Information Services and Systems department gave it over to our use. (There was also ‘Pigeon’ – but we won’t go into that 😉) Maple was running Solaris, of course, with Apache and Tomcat, and provided the web environment for some of our earliest projects – most notably the first versions of PASE, PBW, CCEd and CVMA. Maple was pretty nifty (with its vast 9GB storage – later upgraded to 18GB…) but soon began to show its age, and as our stable of research collaborations increased it became pretty clear we would need a more substantial server infrastructure of our own. We employed a half-time Systems Administrator, shared with the former AHDS, and began to ensure that new project applications included decent allowances for hardware. Over the next few years our stable of servers grew rapidly, to the point where we had something like 14 separate physical servers, each with discrete storage (i.e. separate hard disks rather than a single shared filestore) – in other words, server sprawl. Our first big infrastructure project gave us 20TB of centralised storage, and we made what was, at the time, a fairly brave foray into the world of VMware and server virtualisation. It took a long time to get this project finished (not least because we moved premises a couple of times!) but we ended up with a centrally managed server infrastructure and the ability to offer web hosting and project infrastructure as a service – a much more sustainable way to go than buying new hardware every time a new project came along. As with all things, though, this equipment has aged, and the time came for renewal…
Where we are (and ‘the science bit’)
The most astonishing thing about the gear we have just installed is that it shows how rapidly technology has advanced even in the last three years. Our new infrastructure, in total, amounts to about 14 ‘U’ (units of height in a rack cabinet) of equipment. The equipment it replaces totalled about 50U, in part because we had some very old pieces of historical equipment still running, but even so – this is quite a difference. It means that, amongst other things, the space we need at the data centre we use (University of London Computing Centre) reduces from 2 full rack cabinets to less than half of one – and that means less heat, less power, and much less colocation cost. Despite that, our new equipment is an enormous improvement over what it replaces. Here’s the list:
- 3 x Dell PowerEdge R610 servers with dual Xeon X5680 CPUs, no local storage, and 96GB RAM each (VMware hosts)
- 1 x Dell EqualLogic 6500E (‘Jumbo’) iSCSI SAN with 48TB nominal storage (48 x 1TB hard disks)
- 2 x Dell PowerConnect 6224 managed switches
- (plus one not-new PowerEdge 1950 server for management purposes, and a couple of UPS batteries)
– and that’s it. The old gear, by comparison, comprised:
- 5 x Dell PowerEdge 2600 servers each with dual 3GHz Xeons and 2GB RAM (historical)
- 2 x Dell PowerEdge 2850 dual Xeon servers (legacy)
- 3 x Dell PowerEdge 2950/III dual Xeon servers each with 32GB RAM
- 3 x Dell PowerEdge 1950/III single Xeon management servers
- Dell / EMC Clariion CX3-20 Fibre-Attached SAN with 1 storage shelf of 5TB and 3 storage shelves of 12TB (33TB nominal total)
- 2 x Brocade Silkworm 200e 4Gb Fibre Channel switches
- 4 x 1300W UPS
- (Partridge in a pear tree)
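For anyone who likes to see the sums, here is a rough back-of-the-envelope comparison of the two lists. It is only a sketch: the variable names are our own invention, the old RAM total covers only the servers whose memory is quoted above (the 2600s and the 2950s), and "storage per U" is a crude figure that divides nominal storage by the whole rack footprint.

```python
# Figures quoted in this post; names are illustrative only.
NEW_RACK_UNITS = 14
OLD_RACK_UNITS = 50

NEW_STORAGE_TB = 48   # 48 x 1TB disks, nominal
OLD_STORAGE_TB = 33   # nominal total quoted for the old SAN

NEW_HOST_RAM_GB = 3 * 96            # three R610 VMware hosts
OLD_STATED_RAM_GB = 5 * 2 + 3 * 32  # assumption: only servers with RAM quoted


def percent_reduction(old: float, new: float) -> float:
    """Return how much smaller `new` is than `old`, as a percentage."""
    return (old - new) / old * 100


rack_saving = percent_reduction(OLD_RACK_UNITS, NEW_RACK_UNITS)
new_tb_per_u = NEW_STORAGE_TB / NEW_RACK_UNITS
old_tb_per_u = OLD_STORAGE_TB / OLD_RACK_UNITS

print(f"Rack space reduction: {rack_saving:.0f}%")
print(f"Nominal storage per U: {old_tb_per_u:.2f}TB -> {new_tb_per_u:.2f}TB")
print(f"Stated host RAM: {OLD_STATED_RAM_GB}GB -> {NEW_HOST_RAM_GB}GB")
```

The point is simply that the rack footprint shrinks by roughly 72% while nominal storage and host memory both go up.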
The new equipment finally sees us move to a VMware-only environment (running VMware ESXi 4.1) and greatly simplifies things. We’re using a unified technology for storage and networking (iSCSI uses standard networking cabling and equipment, whereas our older infrastructure used expensive and difficult-to-manage fibre optics), and the enormous reduction in the number of boxes we have will by itself yield huge benefits. In addition to that, we have complete redundancy: our network switches fail over; the virtual servers can rearrange themselves over the VMware hosts if one of them has a problem; and our disk storage will be replicated to a second box in our offices at Drury Lane for backup. We also have room for expansion – we can easily enough bolt in another VMware host, or storage box, if we need either in the future (and that’s something our sustainability plan and charging model allows for – more on that another time maybe). The storage infrastructure will scale to 2.3PB (yes, petabytes) if we ever need it to.
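To give a feel for what "the virtual servers can rearrange themselves" implies for capacity, here is a minimal sketch of a single-host-failure check. It assumes memory is the only constraint, and the host names and VM memory demands are hypothetical. It is not how VMware itself decides placement, just the basic arithmetic behind keeping one host's worth of headroom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str     # hypothetical host name
    ram_gb: int


def survives_single_host_failure(hosts: list[Host], vm_ram_demand_gb: int) -> bool:
    """True if total VM memory still fits after losing the largest host.

    Simplification: ignores hypervisor overhead, CPU, and per-VM placement.
    """
    if len(hosts) < 2:
        return False  # nowhere for the VMs to move to
    total = sum(h.ram_gb for h in hosts)
    worst_case = total - max(h.ram_gb for h in hosts)
    return vm_ram_demand_gb <= worst_case


# Three hosts with 96GB each, as in the equipment list above.
hosts = [Host(f"esxi-{i}", 96) for i in range(1, 4)]

# Hypothetical total memory demands from all virtual servers.
print(survives_single_host_failure(hosts, 150))
print(survives_single_host_failure(hosts, 200))
```

With three 96GB hosts, losing one leaves 192GB, so under these assumptions the virtual servers could be rehomed as long as their combined memory stays below that figure.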
Below are some front-and-back geek photos of the new gear, expertly installed by our Systems Manager Tim Watts (notice please the quality of the cabling!), who has been working round the clock on the testing and implementation of our new infrastructure. The photo with the coloured boxes shows, in green, the new equipment, and in red, the old (and we’ve already taken some out) – just to give a sense of the way the technology has progressed! If you’re a real hardware junkie you can find more of Tim’s photos online (which he has been taking throughout the upgrade process).