Friday, 24 May 2013

Data Pipeline Ecosystem


Data I/O
  • Avro
  • Java Serialization

Data Streaming
  • Kafka
  • ActiveMQ
  • RabbitMQ

Data Stores (NoSQL)
  • HBase
  • Cassandra
  • MongoDB
  • CouchDB
 
Data Storage
  • HDFS

Data Transfer
  • Sqoop
  • Flume

Data Exploring/Retrieval
  • Hive
  • Pig

Data Analysis/Computations
  • MapReduce / YARN
  • Aggregation
  • CQL


Job Schedulers/Co-ordinators
  • Oozie
  • ZooKeeper
  • Apache Mesos

Data Management [New]
  • Apache Falcon
