Getting Started Using Hadoop, Part 3: Loading Data


In Part 2 of the "Getting Started Using Hadoop" series, I discussed how to build a Hadoop cluster on Amazon EC2 using Cloudera CDH. This post will cover how to get your data into the Hadoop Distributed File System (HDFS) using the publicly available

Innovation Will Never Be At The Push Of A Button

"@randyzwitch @benjamingaines @usujason I am envisioning the data science equivalent of an autonomous vehicle pileup." (Todd Belcher, @toddmetrics, May 16, 2013)

Recently, I've been getting my blood pressure up reading (marketing)

Getting Started Using Hadoop, Part 2: Building a Cluster

In Part 1 of this series, I discussed some of the basic concepts around Hadoop, specifically when it's appropriate to use Hadoop to solve your data engineering problems and the terminology of the Hadoop ecosystem. This post will cover how to install