How to work out the kinks in the cloud

feature
Dec 17, 2014 | 14 mins
Cloud Computing | Data Center

Large organizations are feeling the pressure to push ahead with software-defined data centers, even if they must build them using immature technologies.


Blame it on the public cloud service providers. It was, after all, the Amazons of the world that raised the bar by making the provisioning of IT resources look so easy. Why should users have to wait? If I can get it quickly and easily there, the reasoning goes, why can’t I get the same agility from my internal data center?

No one cares that the 1% of enterprises that built their business in the cloud aren’t dragging decades of legacy infrastructure with them, says Zeus Kerravala, principal analyst at ZK Research. For the 99% — traditional enterprises such as banks and manufacturers — the existential challenge is how to catch up.

“Every big company now has to compete with startups that are trying to disrupt their business,” says Mark Collier, chief operating officer at the OpenStack Foundation. “The No. 1 driver for SDDC is speed and the need to empower developers who are writing applications for their companies to move more quickly. Velocity, these days, is everything.”

Building a software-defined data center (SDDC) is the first step toward a private cloud infrastructure that can achieve those goals, but technical limitations and cultural issues make it a challenging one.

SDDC is a catch-all term for an architecture that includes, at a minimum, software-defined computing, networking and storage, plus an orchestration layer that coordinates the configuration of data center infrastructure according to the resource and service-level requirements of the applications hosted on top of it.
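To make that idea concrete, here is one way such an application-driven requirement might be expressed: a minimal Python sketch of a profile that an orchestration layer could consume. The field names, values and the toy placement rule are invented for illustration and don't come from any particular product.

```python
# A hypothetical application profile an SDDC orchestration layer might consume.
# Field names and values are invented for illustration only.
app_profile = {
    "name": "order-processing",
    "compute": {"vcpus": 8, "memory_gb": 32, "instances": {"min": 2, "max": 10}},
    "storage": {"block_gb": 500, "iops": 2000, "object_store": True},
    "network": {"zone": "dmz", "bandwidth_mbps": 1000, "firewall_profile": "pci"},
    "sla": {"availability_pct": 99.95, "max_provision_minutes": 60},
}


def plan_placement(profile: dict) -> dict:
    """Toy placement decision: choose a resource pool from the SLA target."""
    pool = "ha" if profile["sla"]["availability_pct"] >= 99.9 else "standard"
    return {"pool": pool, "replicas": profile["compute"]["instances"]["min"]}


print(plan_placement(app_profile))
# {'pool': 'ha', 'replicas': 2}
```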

What’s more, the single control point that an SDDC establishes through a series of APIs shouldn’t stop at the four walls of the traditional data center. A well-designed architecture should serve as the foundation for a broader software-defined infrastructure that extends control over all IT resources, including those available in private clouds, public clouds and traditional data center resources both on-premises and in colocation facilities.

While everyone feels a sense of urgency, convinced the competition is further along with SDDC, there’s no need to panic, says Kerravala: 95% of companies are still in the learning phase. “You’ll have to make some big investments, so don’t rush into this. We’re not even in the first inning.”

All the chips on SDDC

The standards and software for an SDDC based on open architecture aren’t fully mature. But that didn’t stop Intel from creating an SDDC based on OpenStack, a set of technologies created by the OpenStack Foundation and embedded, to one degree or another, into data center infrastructure products offered by Cisco, IBM, Hewlett-Packard and other vendors.

After proving the technology in a greenfield data center, the chipmaker is now consolidating all 13 of its internal data centers, including all traditional as well as cloud-aware applications, under a single software control point.

Intel’s architecture offers five APIs to its application developers: one for computing, one for networking, one each for block and object storage, and one for identity management. Intel also added a platform-as-a-service (PaaS) option built using the open-source Cloud Foundry platform. In this way, users can access lower-level APIs or use higher-level abstractions. “We have to serve different types of users,” says Das Kamhout, principal engineer.
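Because Intel’s environment is built on OpenStack, those five API families map roughly onto the OpenStack services for compute, networking, block and object storage, and identity. The sketch below shows what touching each family can look like with the openstacksdk Python library; the cloud name is a placeholder and nothing here is Intel’s actual code.

```python
# Minimal sketch: enumerating resources through the five API families the
# article describes, using the openstacksdk library. "intel-private" is a
# placeholder for an entry in clouds.yaml; attribute names follow recent
# openstacksdk releases.
import openstack

conn = openstack.connect(cloud="intel-private")

servers = list(conn.compute.servers())              # compute API
networks = list(conn.network.networks())            # networking API
volumes = list(conn.block_storage.volumes())        # block storage API
containers = list(conn.object_store.containers())   # object storage API
projects = list(conn.identity.projects())           # identity API

print(f"{len(servers)} servers, {len(networks)} networks, "
      f"{len(volumes)} volumes, {len(containers)} object containers, "
      f"{len(projects)} projects visible to this user")
```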

And serve them quickly. “The pace at which public cloud service providers are moving is crazy. So if private shops want to stay current, they need to stay on the leading edge — at least a little bit,” Kamhout says.

The move to an SDDC has allowed Intel to create a private cloud that makes better use of computing, storage and networking resources, offers user self-service with turnaround times of under an hour instead of weeks, and empowers application developers, who can now define infrastructure requirements in software through a series of common APIs. After years of hard work, Kamhout says, “We’re just getting to the point where developers can stand up something and use data center capacity.”
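In practice, developer self-service of the kind Kamhout describes boils down to an API call replacing a ticket. A minimal sketch using openstacksdk’s higher-level cloud interface is shown below; the cloud, image, flavor and network names are placeholders, and this illustrates the pattern rather than Intel’s tooling.

```python
# Illustrative self-service provisioning via openstacksdk's cloud layer.
# The cloud, image, flavor and network names are placeholders.
import openstack

conn = openstack.connect(cloud="intel-private")

server = conn.create_server(
    name="dev-analytics-01",
    image="ubuntu-22.04",        # resolved by name or ID
    flavor="m1.large",
    network="dev-tenant-net",
    wait=True,                   # block until the server is ACTIVE
    timeout=600,
)
print(server.name, server.status)
```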

But redeploying data center infrastructure on emerging technologies required a certain amount of panache in the face of uncertainty — and lots of baling wire to tie it all together. Because Intel started early, it didn’t use OpenStack initially. As a result, it had to do a lot of custom coding, including constructing its own orchestration layer to make everything work. “We are replacing all of that with native OpenStack, since this allows us to remove a lot of ‘technical debt’ and be part of a larger community that has the exact same issues our team has,” Kamhout says.

With the new paradigm, an IT staff used to configuring hardware manually had to be trained to use scripting tools such as Python and Puppet, and to manage virtual Linux clusters. Kamhout says the new skill sets required include knowledge of automation, scripting, Linux and data analysis. What Kamhout calls the “Next, Next, Finish” tools for configuration are still six to nine months away. Everyone wants an “easy” button for configuration. But in the meantime, he says, “You either up-level your workforce or wait until the tools are out there.”

Once all 13 data centers are on board, the focus shifts to a unified approach to hybrid cloud architecture where Intel turns to a public cloud to handle spikes in activity while its private cloud supports baseline workloads. “Bridging is the next phase,” Kamhout says.
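Stripped to its essentials, the bursting model Kamhout describes is a placement policy: baseline workloads stay on the private cloud, and spikes spill over to a public provider only when local capacity runs short. The toy sketch below illustrates such a rule; the threshold and the single utilization number are simplifying assumptions, not anything Intel has published.

```python
# Deliberately simplified hybrid-cloud burst policy. The threshold and the
# idea of a single utilization figure are hypothetical simplifications.
BURST_THRESHOLD = 0.80  # start bursting when the private cloud is 80% utilized


def choose_target(private_utilization: float, workload_is_baseline: bool) -> str:
    """Return which cloud should take the next workload."""
    if workload_is_baseline:
        return "private"                 # steady-state work stays in-house
    if private_utilization < BURST_THRESHOLD:
        return "private"                 # spare capacity, no need to burst
    return "public"                      # spike traffic spills over


if __name__ == "__main__":
    for util in (0.55, 0.85):
        print(util, "->", choose_target(util, workload_is_baseline=False))
    # 0.55 -> private
    # 0.85 -> public
```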

“We expect to deal with one set of APIs inside and outside of our data center. We want a federated, interoperable, open cloud.” But public clouds would need to expose their infrastructure to those APIs — something they aren’t ready to do yet. “They consider it their secret sauce,” he says. But it will happen over time, Kamhout says, adding that a cloud service provider’s means of acquiring a server shouldn’t be seen as a competitive differentiator.

Expecting cloud service providers to directly support industry standard APIs isn’t realistic, says Dave Shacochis, vice president of cloud platforms at cloud services provider CenturyLink. But eventually, service providers like CenturyLink “will be incented to publish their own API specs,” he says, and third parties will be needed to build the translations between the various private and public cloud platforms.

Driven by a need for speed

FedEx is all about speed of delivery, and Chris Greer, technical fellow at FedEx Corporate Services, says the company wants that to extend to IT services. “In the evolution of our data center, software rules everything,” he says. His goal: To automate everything from software-defined application processing down through computing, storage and networking all the way to the foundation — power, heat, cooling, and even white space on the floor.

All of these, he says, should be actively managed for maximum efficiency, as dictated by the resource needs of the applications at the top of the stack. While FedEx has reference implementations in place, it’s moving forward cautiously, with the full knowledge that some layers, such as power and cooling, aren’t even addressed by OpenStack today.

But what is available is solid, says Jonathan Bryce, executive director at the OpenStack Foundation. “The technical capabilities are mature, but there has to be a methodical transition across the company to make this happen.”

Don Fike, a vice president and technical architect at FedEx, says there hasn’t been nearly enough industry unification around OpenStack standards — but he’s not hesitant about moving forward. “You have to press forward now, because it will take time to achieve this,” he says. And “you have to be committed to rework everything,” he adds. “You can’t let old legacy decisions hold you back.” FedEx is following a three-step track to an SDDC.

“Virtualize your hardware first,” and move to software-defined storage and networking second, Greer advises. “Then you have a lot of work to capture the application personalities and understand how they interact out in the world.” “Within the next year, we’ll be at scale in programming hardware devices,” with software-defined networking (SDN), storage and security to follow, he says. While Greer would like to have an open implementation, the project will still require custom programming to fill in the gaps — the “secret sauce” to get it all to work together.

Networking and storage are the hardest to automate, says Fike. “Our other challenge, from a regulatory standpoint, will be firewalls and security, but that’s just as important for being able to move workloads around,” he explains. FedEx’s needs are complex and application-driven — and that was the company’s starting point. “We’re building a software-defined data center with full knowledge of the thousands of apps running in our data center,” Greer says. “The way these apps interact, their configurations and what they talk to is extremely important.”

Unfortunately, the capability to capture configuration requirements for thousands of applications and translate those into automation tasks is where product offerings are the least mature, says Fike. Eventually, he expects FedEx’s vendor partners to start offering easier, more flexible ways to capture application configuration requirements, automate associated tasks and deploy them on public or private cloud infrastructures. But, he says, “the requirements around that are just too daunting for them right now.”

Stitching it all together

Columbia Sportswear didn’t wait for standards to mature, opting instead for a custom-built stack using VMware tools running on a converged Vblock hardware infrastructure. “It was about being able to build a highly virtualized, highly flexible, highly scalable environment and be able to move it transparently for multiple use cases,” says Suzan Pickett, manager of global infrastructure services at Columbia. The company saw the automation enabled by an SDDC as a way to avoid adding IT staff even as its systems needs grew 700% over the past several years.

After segregating Columbia’s apps by business criticality, Pickett went forward, virtualizing 85% of the company’s server hardware and then migrating an SAP ERP system that controls North American operations onto two Vblock System 700 converged infrastructure platforms earlier this year. “We went live without a single critical incident,” she says.

Now the company is considering a hybrid cloud strategy that could peel away the lower application tiers from its internal data center and branch offices. “I want to extend my policies, security framework, even my Active Directory single sign-on, and have confidence in the underlying platform,” she says.

Those moves, however, must be compatible with Columbia’s VMware-based tool set. The IT staff now spends less time on rack-and-stack work and more on automation, provisioning and establishing service catalogs. System delivery times have dropped from between three and six weeks to about four minutes. “If I need a back-end database, our cloud automation center spins that up. Our staff is more highly skilled and efficient at what they’re doing,” Pickett says.

Columbia’s servers are fully virtualized. The company used EMC’s VPLEX technology to set up the virtual environment, and it’s evaluating EMC’s ViPR for software-defined storage. The next step, Pickett says, will be to adopt software-defined networking so all network layer changes will be transparent as apps migrate between data centers.

“SDN is huge for us,” Pickett says. But while Columbia is planning on adopting VMware’s NSX for that purpose, she says she sees that technology as a “Version 1” product that needs to mature. Likewise, Cisco’s Application Centric Infrastructure (ACI), built around its 9000 series switches, is “still in its infancy” and not a pure software-based model, she adds. (Ishmael Limkakeng, vice president of product management at Cisco, says ACI — when compared to the software overlays offered by competitors — is the most complete, and it can support other brands of switches in addition to Cisco’s 9000 series products).

But Columbia does have Cisco switches. Going forward, Pickett says, “I don’t feel that we need to stick with a single-vendor approach. I don’t feel that we’re tied into any one technology because we’re defining the process underneath that.”

There is also a risk associated with using a custom-built stack tied to specific cloud vendors, says Kerravala. Can you migrate from one to another? What if you merge with a company that uses another cloud provider? “Down the road, you may not be able to use it with multiple cloud providers,” he says.

Perhaps the biggest challenge, outside of the need to gain executive-level buy-in, is selling SDDC to IT staffers, whose job security is tied to the need for manual racking, stacking and configuration of servers, storage and networking. “They’re not set up in IT for a cloud-like environment. They still need domain experts in storage, networking, server and apps, but the shift to a DevOps model is mandatory,” Kerravala says.

“Our challenge was how to move toward automation and have IT be cool with the fact that we were exposing it as user self-service,” says Intel’s Kamhout. “That was a scary point.”

That attitude is understandable, says Kerravala, because nearly half of the operational cost savings that accrue from moving to an SDDC come from labor, which accounts for 40% to 45% of total costs. While IT still has rack-and-stack work to do, most configuration of low-level components can be handled entirely in software, by automated tools.

If, like Intel, your IT staff doesn’t understand Linux or scripting, expect to invest in retraining. “Most of IT isn’t interested in changing from a personal perspective,” Kamhout says, “so our leadership team had to help them transform and learn the new skills. That process takes years.” Pickett faced similar training challenges. “For us, it’s really about skill set maturity with some of these automation products,” she says, adding that the task is even harder when the networking, storage and server teams exist in their own silos.

Nirvana will be reached when workloads associated with applications trickle down to the hardware, which then adapts in real time, says Kamhout. For example, he’d like to see lower-level infrastructure, such as power and cooling, respond to workload demands in the most energy-efficient way, such as moving workloads into one resource pool at night and shutting down or stepping down power and cooling for unused areas of the data center to save energy. “We’re still a few years away from that being real,” he says. “But more is possible as we get to more advanced automation.”
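As a thought experiment only, the consolidation idea can be reduced to a packing rule: squeeze off-peak workloads onto as few resource pools as possible so the rest can be powered down. The toy sketch below shows that rule; the pool sizes and workload figures are invented, and a real scheduler would weigh far more constraints than vCPU counts.

```python
# Toy night-time consolidation rule in the spirit of what Kamhout describes:
# pack workloads onto as few pools as possible so idle pools can be powered
# down. All numbers are invented; assumes no single workload exceeds a pool.
def consolidate(workloads_vcpus: list[int], pool_capacity_vcpus: int) -> dict:
    """First-fit-decreasing packing of workloads into resource pools."""
    pools: list[int] = []  # remaining capacity per active pool
    for demand in sorted(workloads_vcpus, reverse=True):
        for i, free in enumerate(pools):
            if free >= demand:
                pools[i] -= demand
                break
        else:
            pools.append(pool_capacity_vcpus - demand)  # power up another pool
    return {"active_pools": len(pools)}


# 14 workloads totaling 64 vCPUs pack into two 32-vCPU pools off-peak,
# so any remaining pools could be powered down overnight.
print(consolidate([4, 8, 2, 6, 4, 2, 8, 4, 2, 6, 4, 2, 4, 8],
                  pool_capacity_vcpus=32))
# {'active_pools': 2}
```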


Robert L. Mitchell is a freelance writer and editor. Previously, he was a national correspondent for Computerworld. He also served as an editor at Network World and BYTE magazine, and was part of the editorial team that launched TechBeacon.com.
