
Summary     : HBase is the Hadoop database. Use it when you need random, realtime
            : read/write access to your Big Data. This project's goal is the
            : hosting of very large tables -- billions of rows X millions of
            : columns -- atop clusters of commodity hardware.
URL         : http://hbase.apache.org/
License     : APL2
Description : HBase is an open-source, distributed, column-oriented store modeled
            : after Google's Bigtable: A Distributed Storage System for Structured
            : Data by Chang et al. Just as Bigtable leverages the distributed
            : data storage provided by the Google File System, HBase provides
            : Bigtable-like capabilities on top of Hadoop. HBase includes:
            :
            :     * Convenient base classes for backing Hadoop MapReduce jobs
            :       with HBase tables
            :     * Query predicate push down via server side scan and get
            :       filters
            :     * Optimizations for real time queries
            :     * A high performance Thrift gateway
            :     * A REST-ful Web service gateway that supports XML, Protobuf,
            :       and binary data encoding options
            :     * Cascading source and sink modules
            :     * Extensible jruby-based (JIRB) shell
            :     * Support for exporting metrics via the Hadoop metrics
            :       subsystem to files or Ganglia; or via JMX
--------------------------------------------------------------------------------
Summary     : Hive is a data warehouse infrastructure built on top of Hadoop
URL         : http://hive.apache.org/
License     : Apache License v2.0
Description : Hive is a data warehouse infrastructure built on top of Hadoop that
            : provides tools to enable easy data summarization, ad hoc querying
            : and analysis of large datasets stored in Hadoop files. It
            : provides a mechanism to put structure on this data and it also
            : provides a simple query language called Hive QL which is based on
            : SQL and which enables users familiar with SQL to query this data.
            : At the same time, this language also allows traditional map/reduce
            : programmers to plug in their custom mappers and reducers
            : to do more sophisticated analysis which may not be supported by the
            : built-in capabilities of the language.
--------------------------------------------------------------------------------
Description : Oozie client is a command line client utility that allows remote
            : administration and monitoring of workflows. Using this client
            : utility you can submit workflows, start/suspend/resume/kill
            : workflows and find out their status at any time. Apart from
            : such operations, you can also change the status of the entire
            : system and get version information. This client utility also
            : allows you to validate any workflows before they are deployed to
            : the Oozie server.
--------------------------------------------------------------------------------
Summary     : Oozie is a system that runs workflows of Hadoop jobs.
URL         : http://incubator.apache.org/oozie/
License     : APL2
Description : Oozie is a system that runs workflows of Hadoop jobs.

            : Oozie workflows are actions arranged in a control dependency DAG
            : (Directed Acyclic Graph).
            :
            : Oozie coordinator functionality allows workflows to be started at
            : regular frequencies and when data becomes available in HDFS.
            :
            : An Oozie workflow may contain the following types of action
            : nodes: map-reduce, map-reduce streaming, map-reduce pipes, pig,
            : file-system, sub-workflows, java, hive, sqoop and ssh (deprecated).

Hadoop Installation:
--------------------
sudo yum install hadoop-0.20-mapreduce-jobtracker
sudo yum install hadoop-hdfs-namenode
sudo yum install hadoop-hdfs-secondarynamenode
sudo yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode
sudo yum install hadoop-client

sudo yum install hadoop-yarn-resourcemanager
sudo yum install hadoop-hdfs-namenode
sudo yum install hadoop-hdfs-secondarynamenode
sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
sudo yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver
sudo yum install hadoop-client
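
The install commands above appear to cover two alternative stacks: the first five look like an MRv1 (JobTracker/TaskTracker) deployment and the rest a YARN deployment. That split, and the role labels used below (master, namenode, secondary, worker, history, client), are assumptions read off the package names, not something the notes state. Under those assumptions, a minimal dry-run sketch that only prints the command for a given stack and role could look like this; the function names packages_for and install_cmd are hypothetical:

```shell
#!/bin/sh
# Dry-run helper: prints the yum command for a stack/role pair, runs nothing.
# Role labels are hypothetical; package names come from the commands above
# (hadoop-mapreduce is read here as a single package name).

packages_for() {
    stack=$1
    role=$2
    # Reject anything other than the two stacks listed in the notes.
    case "$stack" in
        mrv1|yarn) ;;
        *) echo "unknown stack: $stack" >&2; return 1 ;;
    esac
    case "$stack:$role" in
        mrv1:master)  echo "hadoop-0.20-mapreduce-jobtracker" ;;
        yarn:master)  echo "hadoop-yarn-resourcemanager" ;;
        *:namenode)   echo "hadoop-hdfs-namenode" ;;
        *:secondary)  echo "hadoop-hdfs-secondarynamenode" ;;
        mrv1:worker)  echo "hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode" ;;
        yarn:worker)  echo "hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce" ;;
        yarn:history) echo "hadoop-mapreduce-historyserver hadoop-yarn-proxyserver" ;;
        *:client)     echo "hadoop-client" ;;
        *) echo "no packages for role '$role' on stack '$stack'" >&2; return 1 ;;
    esac
}

install_cmd() {
    # Fail without printing a command if the stack/role pair is unknown.
    pkgs=$(packages_for "$1" "$2") || return 1
    echo "sudo yum install $pkgs"
}
```

For example, `install_cmd yarn worker` prints `sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce`. Printing the command keeps the sketch safe to try on a machine where the packages should not actually be installed.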

http://www.strongswan.org/docs/install.htm
