RSiteCatalyst Version 1.3 Release Notes

Version 1.3 of the RSiteCatalyst package to access the Adobe Analytics API is now available on CRAN! Changes include:

- Search via regex functionality in QueueRanked/QueueTrended functions
- Support for Realtime API reports: Overtime and … [Continue reading]

Getting Started With Hadoop, Final: Analysis Using Hive & Pig

We've made it to the final post in this tutorial! In my prior posts about getting started with Hadoop, we've covered the entire lifecycle, from setting up a small cluster using Amazon EC2 and Cloudera to loading data using Hue. … [Continue reading]

Quickly Create Dummy Variables in a Data Frame

On Quora, a question was asked about how to fix the error raised when the randomForest package in R is given a categorical variable with more than 32 levels. Since I've seen this question asked on Kaggle forums, StackOverflow and … [Continue reading]
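
As a rough illustration of what "dummy variables" means here, the sketch below expands one categorical column into a 0/1 indicator column per level. It is written in plain Python rather than R, it is not the approach from the post itself, and the function name `make_dummies` and the sample data are hypothetical.

```python
def make_dummies(rows, column):
    """Replace `column` in each row dict with one 0/1 indicator per level."""
    # Collect the distinct levels, sorted so the column order is stable.
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being expanded.
        new_row = {key: value for key, value in row.items() if key != column}
        for level in levels:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample data: a small table with one categorical column.
rows = [
    {"id": 1, "state": "MA"},
    {"id": 2, "state": "NY"},
    {"id": 3, "state": "MA"},
]
dummies = make_dummies(rows, "state")
```

Each level becomes its own numeric column, so no single variable carries more than two values. Whether this is the right fix for a given model is a separate question, since a variable with many levels produces many columns.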