CAS Leverages the Versatility of OpenStack Private Cloud

When CAS (Chemical Abstracts Service), a division of the American Chemical Society (ACS), began its journey toward a fresh approach to data management, leaders understood that meant moving into the cloud.

CAS embodies the ACS vision of “improving people’s lives through the transforming power of chemistry.” Since its founding in 1907, CAS scientists and technologists have sought innovative ways to collect, organize and share all publicly disclosed scientific information, creating the most complete content collection for researchers in the world.

Scientific researchers, patent professionals and business leaders from around the world rely on a suite of CAS products covering science, engineering, technology, patents, business information and more. That includes the flagship CAS REGISTRY℠, the most comprehensive collection of disclosed chemical substance information in the world, cataloguing more than 129 million unique organic and inorganic substances and counting.

Who says scientists don’t know how to have fun? CAS employees take a break from compiling the world’s largest chemical databases to form a human CAS logo.

As the volume of published scientific literature continued to grow exponentially, CAS recognized the need to modernize, said Matt Greenwood, a senior engineer who’s been with CAS for 17 years. And that meant moving into the cloud.

“We knew we needed to become cloud native with new solutions,” Greenwood said. “And we wanted to be very aggressive about dramatically speeding up our time to market and expanding our product portfolio to meet our global researchers’ needs.”

That also required putting more self-service power in the hands of CAS development teams, so they could draw new insights from the data that were not easily attainable with other technologies. Infrastructure as a service was an obvious way of enabling that, Greenwood said, and open source technology was a natural fit for CAS.

“We wanted to see what options would enable self-service and allocation of infrastructure that would bring greater value and computing power for both our developers and our customers,” he said, and from the beginning, OpenStack looked appealing. “We started kicking the tires, and the team quickly recognized the potential of cloud computing.”

By 2015, CAS engineers and developers had built a midsize OpenStack cloud in the company’s own data center. While they were pleased with OpenStack’s capabilities, they also faced firsthand the complexities that come with going it alone. That led CAS leadership to tackle another question: continue to DIY, or find a partner?

CAS compared the cost of hiring full-time OpenStack experts with the cost of partnering with a seasoned OpenStack service provider. Pretty quickly, Greenwood said, it became clear that keeping its own people free to work on CAS’s core business made more sense. Focusing internal resources on running infrastructure wasn’t adding value to the business, he said: “It’s the value we’re adding on top of OpenStack that differentiates us.”

CAS then spent many months looking in depth at its options, investigating the capacity of vendors, including Rackspace.

“We were really looking for depth of experience,” Greenwood said. “Maturity. How many clusters had been built? Actively run? How many customers did they have running OpenStack in production? How deep was their bench? Did the team encompass ten OpenStack experts or one hundred?”

Rackspace’s longtime leadership within the OpenStack community made it an early favorite. A co-inventor of OpenStack with NASA, Rackspace surpassed one billion server hours of experience last year, and still runs the world’s largest production OpenStack clouds.

Also, Greenwood said, “we recognized that because Rackspace is packaging OpenStack, it was making decisions about what was production ready and what was not. But Rackspace was also very flexible about its willingness to add a service we might need, even if it was on an unsupported basis. We really wanted that flexibility, and Rackspace was willing to work with us in that way.”

The CAS and Rackspace collaboration began small, with Rackspace remotely supporting the OpenStack cluster in the CAS data center in Columbus, Ohio. The engagement kept growing until Rackspace was managing two “pretty large” OpenStack clouds in the CAS data center, Greenwood said.

Scott Hollingsworth, a strategic account development manager at Rackspace who has worked closely with Greenwood as the relationship has grown, called CAS the poster child for OpenStack private cloud.

“They started by doing it themselves, and they had a good understanding of the capabilities of OpenStack,” said Hollingsworth. “They had ambitious goals of migrating their customer-facing solutions over to OpenStack, and had the foresight to know that they would need a dedicated OpenStack team to be able to design, deploy and manage their clouds. The way CAS uses its cloud is very complex, so having Rackspace help manage the OpenStack piece was critical for them. It allows their developers and product teams to focus on building on top of OpenStack rather than monitoring the stability and uptime of the cloud itself.”

The partnership — and the trust, Greenwood and Hollingsworth say — between CAS and Rackspace continues to evolve and grow. This year, the company expanded its relationship with Rackspace further, to host “a couple small clouds” in a Rackspace data center, testing out the prospect of having capacity outside CAS’s data center.

CAS is also stepping up its visibility in the OpenStack community, presenting for the first time at the OpenStack Summit next month in Boston. CAS DevOps Engineer Monica Rodriguez and Senior Technologist Scott Coplin will join Rackspace OpenStack Architect Chris Breu for a session on day one of the Summit, “A Series of Unfortunate Deployments: Running a Lambda Architecture on OpenStack.”

While the title of the talk riffs on the name of the popular book and TV series by Lemony Snicket, the session describes not just the challenges but the successes the team found as it tackled the goal of deploying a scalable, high-performance chemistry search engine based on the Lambda architecture in OpenStack.

“We wanted to share some of the major issues we faced and walk through how we solved them,” said Rodriguez. The trio will also highlight the importance of operators working closely with developers to ensure maximum performance and stability, she said.

Greenwood said it’s been heartening to see the pace of change at CAS “accelerate dramatically as we continue to modernize. Increasing the sophistication of our solutions is a continuous process, and we’re on the right path.”

