
Hadoop Desktop Cluster


Paul Morse | Hadoop, Hadoop Desktop Cluster, CriKit, Copyright 2012 All Rights Reserved | September 30, 2012

Hadoop Desktop Cluster - CriKit

Hadoop is Hot
The number of organizations that want to investigate and use Hadoop and other Big Data solutions is growing rapidly. It has been claimed that 50% of the world's data will be held in Hadoop by 2015. That seems like an aggressive prediction, but there is no doubt that the entire Big Data sector is exploding, with no end to the expansion in sight. One of the major inhibitors to broad adoption can be the high cost of the hardware needed to test and evaluate these new technologies. What is needed is a low-cost computing environment in which to test them and determine whether they are viable solutions for the organization.

There are, of course, many options from public cloud and SaaS providers that have built competent solutions around Hadoop and other Big Data technologies, but many organizations are reluctant to put their private data in public environments simply to try out these new solutions. Further, it seems all the major hardware vendors are standing by to offer solutions costing hundreds of thousands of dollars, coupled with services offerings that rival or exceed the hardware costs, just to take a look at the technology. What many organizations on a tight budget need is a compact, low-wattage, multi-node sandbox for testing the basic functionality and seeing whether a larger-scale environment is warranted for pursuing Hadoop further. Test small, go large if it works for you.

Enter CriKit
One solution for testing Big Data technologies is CriKit, a Desktop Private Cloud. Originally created for testing and running Private Cloud software from a variety of companies, it is well suited to budget-conscious organizations that want to test the functionality of Hadoop or Cassandra, along with functional add-on products such as DataMeer, Tableau and many others.

Why CriKit?
CriKit is a unique desktop cluster solution. It was designed to sit at the confluence of compact size, low wattage and high compute power, with future reusability in mind. Because the system is broken into discrete compute nodes rather than housed in a proprietary case, the MicroServers can be reassigned as desktop devices once they have outlived their usefulness as servers. This extends the useful life of the MicroServers by 5 to 7 years. A dual hard drive CriKit MicroServer costs roughly $1,500.00 USD. If it has a multi-role useful life of 10 years, that equates to approximately 41 cents a day per node, or less than $2.00 a day for the 4 compute nodes in the base CriKit system. This cost compares very favorably with larger, more costly on-premise solutions and Public Cloud offerings from a variety of cloud vendors. Further, CriKit can be used in a hybrid cloud configuration in which Hadoop experts refine their Hadoop environment locally until it is as efficient as possible, then burst to Public Clouds for large-scale processing. This helps organizations minimize their Public Cloud Hadoop computing spend, where simple mistakes can be very costly.
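The per-day figures quoted above can be checked with a short calculation. This is a minimal sketch that uses only the price, lifespan and node count given in the text. The function name is illustrative, a 365-day year is assumed, and the cost of electricity, the switches and the optional workstation is left out.

```python
# Figures taken from the text; names are illustrative only.
NODE_PRICE_USD = 1500.00    # dual hard drive CriKit MicroServer
USEFUL_LIFE_YEARS = 10      # multi-role useful life assumed in the text
NODES_IN_BASE_SYSTEM = 4    # compute nodes in the base CriKit system
DAYS_PER_YEAR = 365         # assumption: leap days ignored


def daily_cost_per_node(price_usd: float, years: int) -> float:
    """Spread the purchase price evenly over the useful life (USD per day)."""
    return price_usd / (years * DAYS_PER_YEAR)


per_node = daily_cost_per_node(NODE_PRICE_USD, USEFUL_LIFE_YEARS)
per_cluster = per_node * NODES_IN_BASE_SYSTEM

print(f"Per node:   ${per_node:.2f} per day")
print(f"Four nodes: ${per_cluster:.2f} per day")
```

Under these assumptions the result is about $0.41 per node per day and about $1.64 per day for the four compute nodes, consistent with the "less than $2.00 a day" figure.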

What is in a CriKit?
CriKit was designed to be very simple and to provide the compute power, networking and management necessary to build private clouds or data clusters. CriKit MicroServers can easily be added to increase the processing capability of the cluster, and the financial step function of adding MicroServers is small compared with larger, proprietary offerings. Each desktop CriKit environment for a minimal, 4 node Hadoop cluster contains:

Computing Nodes - CriKit contains 4 energy-efficient compute nodes that each include an Intel Server Motherboard, a 64 Bit Intel Xeon Server CPU, 16 GB of RAM, Dual 1 Gb Ethernet Network Interface Controllers (NICs) and varying sizes and types of SATA III, 2.5 inch drives. CriKit nodes can contain up to 2 SATA III, 2.5 inch spinning, Hybrid or Solid State Drives. Testing has shown that SSDs provide the best read performance by a wide margin.

1 Gb Ethernet Switch - CriKit comes standard with an 8 Port, unmanaged, 1 Gb Ethernet switch. Managed switches, and switches with more ports for larger CriKit implementations, are available.

Keyboard, Video and Mouse Switch - A high-quality, 8 node DVI/USB switch is included with CriKit. This switch can be daisy-chained with additional switches to accommodate 511 CriKit compute nodes and one Management Workstation.

Management/Development Workstation - CriKit comes with a high-powered workstation to manage the cloud environment and provide high developer productivity. Purchasers can select the workstation's components, such as CPU, disk and memory, or decide not to buy the workstation and use their own desktop or laptop machine as the cluster management station.

A 4 node CriKit cluster provides the minimum hardware necessary to run a Hadoop cluster. For testing and evaluation, 1 of the 4 nodes can contain all the Hadoop-related management functions and the 3 remaining nodes can be the compute or slave nodes. With 4 cores and 8 threads in each CPU, plus 16 GB of RAM and SSDs up to 1.2 TB in each node, there is enough compute horsepower, memory and high-speed storage to adequately test Hadoop with moderate amounts of data.
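To get a rough sense of what "moderate amounts of data" might mean, the resources left for the worker nodes can be tallied. The sketch below uses the per-node figures from the text (8 threads, 16 GB of RAM, up to 1.2 TB of SSD). It also assumes an HDFS replication factor of 3, which is Hadoop's default but is not stated in this paper, and it ignores space taken by the operating system and intermediate job data. The class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    threads: int = 8       # 4 cores, 8 threads per CPU (from the text)
    ram_gb: int = 16       # from the text
    ssd_tb: float = 1.2    # upper bound quoted in the text


def cluster_resources(total_nodes: int, master_nodes: int,
                      replication: int, node: Node = Node()) -> dict:
    """Tally the resources available on the worker (slave) nodes."""
    workers = total_nodes - master_nodes
    if workers < 1:
        raise ValueError("at least one worker node is required")
    raw_tb = workers * node.ssd_tb
    return {
        "workers": workers,
        "threads": workers * node.threads,
        "ram_gb": workers * node.ram_gb,
        "raw_tb": raw_tb,
        # Each HDFS block is stored `replication` times, so usable
        # capacity is roughly the raw capacity divided by that factor.
        "usable_tb": raw_tb / replication,
    }


# 4 nodes, 1 of them holding the management functions.
# replication=3 is an assumption (Hadoop's default), not a figure from the text.
summary = cluster_resources(total_nodes=4, master_nodes=1, replication=3)
for name, value in summary.items():
    print(f"{name}: {value:.1f}" if isinstance(value, float) else f"{name}: {value}")
```

With these assumptions the three worker nodes offer 24 threads, 48 GB of RAM and roughly 1.2 TB of replicated HDFS capacity. Lowering the replication factor for a test cluster would increase the usable space at the cost of redundancy.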




Further, if you want to use CriKit as a 4 node private cloud platform and run Hadoop in virtual machines to test Hadoop scalability over many virtual machine nodes, this is a viable configuration option as well.

A growing number of organizations want to test and evaluate Hadoop and other Big Data solutions and add-on products. CriKit provides a low-cost, low-wattage, compact and quiet desktop computing platform that is ideal for organizations on a tight budget.

http://www.crikit.info http://www.cloudademia.com http://www.usmicro.com