Four Tactics For Well Thought Out Business Requirements

One of the most common problems in business (especially at large corporations) is nailing down the requirements for a given analysis request. The “business people” on the front lines are talking to their higher-ups about what they think are important questions for the business to solve, but by the time the question gets to the analyst or developer, it sounds something like:

It would be interesting to model using SAS how our customers shop for our merchandise by channel and what overlaps there are between demographics, geography, product type and tenure. But we also have to timebox this, we can’t be boiling-the-ocean just looking for needles-in-a-haystack.

Say WHAT? Mr. Business Person, I cannot help you if you do not run that mess through Unsuck-It first.

In all seriousness, I’ve found there are a few great ways for an analyst to refine a “question” like the one above into an actionable plan of attack. So the next time you get a jargon-filled, completely generic analysis request such as the one above, try these four tactics.

1. All Requests Should Be Phrased In The Form Of A Question

The first thing to notice about the mock interaction above is that there are no question marks; it’s not a question! For an analyst or developer to work effectively, requests need to be presented as questions, not bland statements. For example, a series of refining questions from the analyst might include:

  • You need a model? What type of model? Do you mean a predictive model, a decision tree for understanding, a PivotTable for you to poke at, a one-page PowerPoint slide to give your boss?
  • You specified four attributes (demographics, geography, product type and tenure). Do you have a hypothesis around these attributes (or are you just brain-blabbing)?
  • What is meant by “shop”? Do you mean how do customers browse our goods online and in stores, the purchase cycle, what goods are frequently purchased together or something else?

Note that each of the three refinement questions above takes a generic idea and really drills into what is needed. It is the analyst who is the expert in the techniques for analyzing data, so the analyst should be helping the business person turn a raw analysis request into answerable questions.

2. Separate The Tools From The Question

The second thing to notice in the mock interaction above is the statement “using SAS”. I didn’t write that to pick on SAS; rather, this exact statement was said to me early in my career. I had a boss who would try to guess which tool was appropriate for the question he was asking. I presume that he was trying to gauge how hard the problem was, or trying to signal to me how hard he thought it was. In the end, a plain SQL query with the results copied into an Excel table was all that was necessary.

As the analyst, confirm whether the tool is actually part of the deliverable. Meaning, if you need to deliver a Tableau workbook, ok, specifying “use Tableau” is an important part of the business question. But if the requirement is “production-quality visualizations”, Tableau may or may not be the right tool or might just be one part of a larger workflow.

3. Every Question Is Interesting To Someone. Solve The Valuable Ones.

Paraphrasing the aphorism “The path to hell is paved with good intentions”, the path to doing low-value work your entire career is answering questions that start “Wouldn’t it be interesting if…”.

The basis for these statements is often a tangent in another meeting, where high-level executives think there is information that should just be available at everyone’s fingertips. But if you were to ask “What business action would you take if you knew this piece of information?” or “Is it worth me stopping a project worth $1 million in Pre-Tax Profit per month to answer this for you?”, you’ll suddenly find the question becomes a lot less interesting.

So always have estimates of the business impact of what you are currently working on, and ask for the same estimate from those who ask for your time. Projects that are valuable to the business are “interesting”; everything else is just making work for other people.
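If it helps to make that comparison concrete, here is a back-of-the-envelope sketch in Python. Everything in it is hypothetical: the request names, the dollar figures (other than the $1 million per month used in the example question above), the effort estimates, and the `Request` and `rank_requests` names are all made up for illustration. It simply divides each request's estimated monthly impact by the analyst-weeks needed and sorts by that ratio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A hypothetical analysis request with rough, hand-estimated numbers."""
    name: str
    monthly_impact_usd: float  # requester's estimate of pre-tax profit per month
    effort_weeks: float        # analyst's estimate of time to deliver

    def __post_init__(self):
        # Guard against a zero or negative effort estimate (avoids division by zero)
        if self.effort_weeks <= 0:
            raise ValueError("effort_weeks must be positive")

    @property
    def impact_per_week(self) -> float:
        # Estimated monthly impact per week of analyst effort
        return self.monthly_impact_usd / self.effort_weeks


def rank_requests(requests):
    """Return requests sorted from most to least valuable per week of effort."""
    return sorted(requests, key=lambda r: r.impact_per_week, reverse=True)


# Illustrative numbers only; none of these come from a real backlog.
backlog = [
    Request("Current project", 1_000_000, 4),
    Request("Wouldn't it be interesting if...", 0, 2),
    Request("Email click-through test plan", 150_000, 1),
]

ranked = rank_requests(backlog)
for r in ranked:
    print(f"{r.name}: ${r.impact_per_week:,.0f} per analyst-week")
```

A request whose owner cannot put any number on its impact lands at the bottom of the list, which is usually the right place for it until someone can.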

4. Don’t Just Solve The Stated Question. Solve The Unstated Question Too.

Finally, when I read the mock interaction above, there are actually two questions:

  • Stated: Do we understand our customers’ purchasing behaviors?
  • Unstated: How do we optimize our business to take into account our customers’ purchasing behaviors?

For sure, a deep understanding of the customer base is important no matter the product. But the unstated question of “What are we going to do about it?” is so much more valuable to answer (see tactic #3).

So even if the refined question becomes “Build a customer segmentation based on past purchases”, go one step further and figure out how to implement your findings. Create a test plan for increasing email click-through rates based on the segments, optimize your display bidding, or maybe build a recommender system for your website. Implementing new ideas is always going to be more valuable than just analyzing the past.

Always Be Assertive.

If the key to sales is “Always Be Closing”, the key to quality analysis is “Always Be Assertive”. Ask questions. Make people think about what they are doing, what they ask of others and what can be done to improve the business. It’s a rare, ego-centric co-worker who doesn’t appreciate collaborating to get to a better quality question (and answer!) than they originally started with.

Reading into what other people are asking for, estimating its value, then delivering more than they even knew they were asking for has helped me tremendously throughout my career. Hopefully by applying some or all of the tactics above, you’ll see a marked improvement in your analysis and career as well!


RSiteCatalyst Version 1.4.5 Release Notes

It’s only been a month since the last RSiteCatalyst update, and this one is also pretty minor in terms of functionality.

Set Your Own Endpoint

For the overseas users (or companies with weird setups), you can now use the endpoint argument in the SCAuth() function to specify your API endpoint. For the most part, this is not recommended, as RSiteCatalyst pings the Adobe Analytics API to evaluate the proper API endpoint to use, but if for some reason you are having issues, you can override what the Adobe API says.

New Functions

For this release, I briefly looked through the API explorer to see if there were any useful methods that had been missed. Four have been added: GetFunctions (get definitions of all formulas/functions in Adobe Analytics), QueueSummary (get summary metrics for numerous report suites at once), GetPrivacySettings (privacy settings at a report suite level), and GetTemplate (get the template that a current report suite was built from). With the exception of QueueSummary(), none of these functions will likely get you much in the way of additional analytics capabilities, but they are there should you want to use them.

Feature Requests/Bugs

As always, if you come across bugs or have feature requests, please continue to use the RSiteCatalyst GitHub Issues page to submit issues. Don’t worry about cluttering up the page with tickets; please fill out a new issue for anything you encounter (with the code you’ve already tried that is failing), unless you are SURE that it is the same problem someone else is facing.

Outside of patching really serious bugs, I will likely not spend any more time improving this package in the future; my interests have changed, and RSiteCatalyst is pretty much complete as far as I’m concerned. That said, contributors are very welcome. If there is a feature you’d like added, and especially if you can fix an outstanding issue reported at GitHub, we’d love to have your contributions. Willem and I are both parents of young children and have real jobs outside of open-source software creation, so any meaningful contribution to RSiteCatalyst is appreciated.


JuliaCon 2015: Everyday Analytics and Visualization (video)

At long last, here’s the video of my presentation from JuliaCon 2015, discussing common analytics tasks and visualization. This is really two talks: the first is an example of using the Citi Bike NYC API to analyze ridership of their public bike program, and the second is a discussion of the Vega.jl package.

Speaking at JuliaCon 2015 at MIT CSAIL is the professional highlight of my year; hopefully even more of you will attend next year.

Enjoy!

Edit: For those of you who would like to follow along using the actual presentation code, it is available on GitHub.

CitiBank Bike Data

Vega.jl Presentation

