Five Hard-Won Lessons Using Hive

I've been spending a ton of time lately on the data engineering side of 'data science', so I've been writing a lot of Hive queries. Hive is a great tool for querying large amounts of data without having to know very much about the underpinnings of … [Continue reading]

Building JSON in R: Three Methods

When I set out to build RSiteCatalyst, I had a few major goals: learn R, build a CRAN-worthy package and learn the Adobe Analytics API. As I reflect on how the package has evolved over the past two years and what I've learned, I think my … [Continue reading]

Using SQL Workbench with Apache Hive

If you've spent any non-trivial amount of time working with Hadoop and Hive at the command line, you've likely wished that you could interact with Hadoop like you would any other database. If you're lucky, your Hadoop administrator has already … [Continue reading]