Four Considerations For A Smooth Hadoop Implementation

By Shanti Subramanyam, Co-Founder & CEO, Orzota

There are a number of roadblocks a company can encounter when implementing a Big Data strategy. There isn’t a one-size-fits-all approach.

At Orzota, we help companies overcome some of the most common Big Data hurdles: accelerating their Big Data discovery, translating their Big Data assets into business outcomes, and addressing the specific use case they’re looking to implement. We offer services and tools for running Apache Hadoop and leveraging the insights it can produce.

As we work with our clients to build out their Hadoop implementations, here are four key areas to consider:


Architecture

Apache Hadoop implementations can yield great value, but they are hard to get started with and difficult to maintain. Compounding the problem is a lack of industry knowledge about newer technologies and a scarcity of trained professionals. Expert consultants and engineers are required to deliver a Hadoop architecture based on a client’s requirements and specific needs. This helps in the creation of an integrated data platform that encompasses both structured and unstructured data, as well as hybrid cloud environments. An on-demand Hadoop service such as Rackspace’s Cloud Big Data Platform can shorten the provisioning and installation time so users can focus on beginning their data discovery activities.

Data Science

Having data in Hadoop solves only part of the problem. To get from data to insights, you need the ability to process and analyze this data efficiently. A team of data scientists can help provide business insights by creating the right analytics solutions and integrating them with the rest of the environment.

Data Management

A data management platform built atop a scalable infrastructure provider, such as Rackspace Cloud Big Data, simplifies the deployment and management of Hadoop applications. Our platform is geared towards running ETL and Data Science applications that aren’t part of the core Hadoop distribution, with the goal of delivering a complete Big Data solution that meets your specific needs.

Expediting Proof of Concept and Production Deployments

Because Hadoop is a newer technology, many companies are still trying to validate how they are going to leverage it. A proof-of-concept or proof-of-technology experience is necessary to walk through running Hadoop and deriving queries. Without a trusted consultant, these proof-of-concept exercises can be expensive, frustrating to set up and unlikely to produce accurate results. The ability to quickly spin up PoC environments empowers customers to validate their approach and move into production workloads faster.

Spinning up a production deployment using the same technique works especially well for ETL jobs that run at specific times (e.g. once every eight hours), since the environment is needed only while the job runs. Moving quickly from PoC to production also reduces the lead time for the project.
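To make the scheduled-ETL idea concrete, here is a minimal Python sketch of the pattern: work out when an every-eight-hours job should run, and wrap each run so that a cluster is brought up beforehand and always torn down afterwards. It is an illustration only. The function names (next_run_times, run_on_ephemeral_cluster) and the provision, run_job and teardown callables are hypothetical placeholders, not part of any Rackspace or Hadoop API; you would supply your own provider-specific code for them.

```python
from datetime import datetime, timedelta
from typing import Callable, List


def next_run_times(start: datetime, interval_hours: int, count: int) -> List[datetime]:
    """Return `count` run times spaced `interval_hours` apart, beginning at `start`."""
    return [start + timedelta(hours=interval_hours * i) for i in range(count)]


def run_on_ephemeral_cluster(
    provision: Callable[[], str],
    run_job: Callable[[str], None],
    teardown: Callable[[str], None],
) -> None:
    """Provision a cluster, run one ETL job on it, and tear it down.

    All three callables are hypothetical hooks supplied by the caller.
    The cluster is torn down even if the job raises an exception.
    """
    cluster_id = provision()
    try:
        run_job(cluster_id)
    finally:
        teardown(cluster_id)


if __name__ == "__main__":
    # Example schedule: three runs, eight hours apart, from an arbitrary start date.
    for run_time in next_run_times(datetime(2014, 1, 1), interval_hours=8, count=3):
        print(run_time.isoformat())
```

Because the same provisioning step serves both a PoC and each scheduled production run, the cluster exists only for the duration of the job.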

Those are just four of the many considerations companies must mull over when building a Big Data strategy. When thinking about these types of projects, focus on the architecture, data science, data management and expediting proof of concept and production deployments. Paying close attention to these four things will save you a major headache down the road.

This is a guest post written and submitted by Shanti Subramanyam, founder and CEO of Orzota. Orzota provides technology-enabled services to help businesses accelerate their Big Data deployments by helping with implementation strategy, architecture, data science and implementation.

