Quickly Create Dummy Variables in a Data Frame

On Quora, someone asked how to fix the error raised when the randomForest package in R is given a categorical variable with more than 32 levels. Since I've seen this question asked on Kaggle forums, StackOverflow and … [Continue reading]

Adobe Analytics Implementation Documentation in 60 Seconds

When I was working as a digital analytics consultant, no question could cause belly laughs AND angst quite like, "Can you send me an updated copy of your implementation documentation?" I saw companies that were spending … [Continue reading]

Using Amazon EC2 with IPython Notebook

Last week, I wrote a guest blog post at Bad Hessian about how to use IPython Notebook along with Amazon EC2 as your data science & analytics platform. I won't reproduce the whole article here, but if you are interested in step-by-step instruction … [Continue reading]