Social media giant Facebook has recently switched from ISC DHCP to Kea, a newer DHCP server, to speed up its server deployment. The change shrinks the time between the point a server is physically installed in a data center and the point it comes online and starts serving clients.
By the end of this year, Facebook's engineers plan to move their servers from the older ISC implementation of the Dynamic Host Configuration Protocol (DHCP) to Kea.
On the technical side, when a new device is connected to a network, DHCP assigns it an IP address from a defined range configured for that network. It is a small but important function in IT infrastructure management, and it has proven to make a big difference in the time it takes to deploy new servers in Facebook data centers.
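To make the idea of assigning an address "from a defined range" concrete, here is a minimal Python sketch. It is only an illustration: the `AddressPool` class, the MAC addresses and the 10.0.0.x range are made up for this example, and a real DHCP server also handles lease expiry, renewals and the protocol's message exchange.

```python
import ipaddress


class AddressPool:
    """Toy DHCP-style pool: hands out addresses from one configured range."""

    def __init__(self, first, last):
        self.first = ipaddress.IPv4Address(first)
        self.last = ipaddress.IPv4Address(last)
        self.leases = {}  # MAC address -> assigned IPv4Address

    def assign(self, mac):
        # A device that already holds a lease gets the same address back.
        if mac in self.leases:
            return self.leases[mac]
        in_use = set(self.leases.values())
        candidate = self.first
        # Walk the range and take the first address nobody holds.
        while candidate <= self.last:
            if candidate not in in_use:
                self.leases[mac] = candidate
                return candidate
            candidate += 1
        raise RuntimeError("address pool exhausted")


pool = AddressPool("10.0.0.10", "10.0.0.12")
print(pool.assign("aa:bb:cc:00:00:01"))  # 10.0.0.10
print(pool.assign("aa:bb:cc:00:00:02"))  # 10.0.0.11
print(pool.assign("aa:bb:cc:00:00:01"))  # 10.0.0.10 (existing lease)
```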
Previously, Facebook relied on ISC (Internet Systems Consortium) DHCP, open source software that has been around for roughly 18 years. The alternative, Kea, is better suited to today's IT, which, especially at a company like Facebook, bears little resemblance to the IT of 18 years ago.
Facebook uses Kea to install the operating system on new servers and to assign IP addresses to out-of-band management interfaces. Administrators use these interfaces to monitor and manage servers remotely, whether or not the servers are switched on or have an OS installed.
ISC DHCP was simply too slow for Facebook's scale and pace of change. Technicians replacing network cards or entire servers had to wait three to six hours for the change to take effect, which slowed repairs. The reason was that techs loaded a static configuration file into the DHCP servers, and the servers had to restart to pick up the changes. As a result, Facebook's DHCP servers running ISC DHCP were spending more time restarting than serving actual traffic.
That changed early this month, when the Kea deployment removed those annoyances. When a DHCP server needs to be deployed or changed, the system pulls the configuration information from the inventory. This means there is no longer a need to generate static configuration files and reload DHCP servers for changes to take effect.
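The difference between a static file and an inventory lookup can be sketched in a few lines of Python. This is a simplified illustration, not Kea's API or Facebook's code: the `INVENTORY` dictionary stands in for whatever inventory system is actually queried, and `handle_request` and its fields are hypothetical names.

```python
# Hypothetical inventory: in a real deployment this would be a query against
# an asset database; a plain dict stands in for it here.
INVENTORY = {
    "aa:bb:cc:00:00:01": {"ip": "10.0.0.10", "boot_file": "installer.img"},
}


def handle_request(mac, inventory):
    """Build a reply from the inventory at request time (no cached config)."""
    record = inventory.get(mac)
    if record is None:
        return None  # unknown device: no offer is made
    return {"mac": mac, "ip": record["ip"], "boot_file": record["boot_file"]}


print(handle_request("aa:bb:cc:00:00:01", INVENTORY))

# A technician swaps a network card: only the inventory entry changes.
INVENTORY["aa:bb:cc:00:00:02"] = INVENTORY.pop("aa:bb:cc:00:00:01")

# The very next request sees the new card, with no file to regenerate
# and no server to restart.
print(handle_request("aa:bb:cc:00:00:02", INVENTORY))
print(handle_request("aa:bb:cc:00:00:01", INVENTORY))  # None
```

Because the lookup happens per request, the server holds no configuration of its own that could go stale.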
According to our source, Facebook's new DHCP application runs in Tupperware, the company's Linux container technology, which is similar to Google's Borg.
The old model was also less resilient. The company used to have two redundant DHCP servers per cluster of servers, but if both of them went down, the cluster suffered.
The new approach is a virtual cluster of DHCP servers distributed across the network. They manage a common pool of IP addresses, and any virtual DHCP machine can assign an address to any device on the network. In this way, if the local DHCP servers in a cluster fail, the system can recover faster.
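A rough Python sketch of that idea follows. It assumes, purely for illustration, that the servers keep no lease state of their own and all consult one shared store; `SharedLeaseStore`, `DhcpServer` and `serve` are invented names, and the real system's failover logic is not described in this article.

```python
import ipaddress


class SharedLeaseStore:
    """Common pool of addresses used by every server in the virtual cluster."""

    def __init__(self, network):
        self.free = list(ipaddress.ip_network(network).hosts())
        self.leases = {}  # MAC address -> leased address

    def lease(self, mac):
        if mac not in self.leases:
            if not self.free:
                raise RuntimeError("address pool exhausted")
            self.leases[mac] = self.free.pop(0)
        return self.leases[mac]


class DhcpServer:
    """Holds no lease state itself, so any instance can answer any device."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.healthy = True

    def handle(self, mac):
        return self.store.lease(mac)


def serve(servers, mac):
    # Route the request to the first healthy server in the cluster.
    for server in servers:
        if server.healthy:
            return server.name, server.handle(mac)
    raise RuntimeError("no DHCP server available")


store = SharedLeaseStore("10.0.0.0/29")
cluster = [DhcpServer("dhcp-a", store), DhcpServer("dhcp-b", store)]

print(serve(cluster, "aa:bb:cc:00:00:01"))
cluster[0].healthy = False  # simulate a failed server
print(serve(cluster, "aa:bb:cc:00:00:01"))  # same address, different server
```

Because the lease lives in the shared store rather than in a particular server, losing one server does not lose the assignment.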
With the new stateless design, it takes one or two minutes to propagate changes through the system, instead of three to six hours.