Authenticated API Testing Using Travis CI

As I’ve become more serious about contributing in the open-source community, having quality tests for my packages has been something I’ve spent much more time on than when I was just writing quick-and-dirty code for my own purposes. My most used open-sourced package is RSiteCatalyst, which accesses the Adobe Analytics (authenticated) API, which poses a problem: how do you maintain a project on GitHub with a full test suite, while at the same time not hard-coding your credentials in plain sight for everyone to see?

The answer ends up being using encrypted environment variables within Travis CI.

Testthat!

In terms of a testing framework, Hadley Wickham provides a great testing framework in testthat; while I wouldn’t go as far as he does to say that the package makes testing fun, it certainly makes testing easy. Let’s take a look at some of the tests in RSiteCatalyst from the QueueOvertime function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
test_that("Validate QueueOvertime using legacy credentials", {

  skip_on_cran()

  #Correct [masked] credentials
  SCAuth(Sys.getenv("USER", ""), Sys.getenv("SECRET", ""))

  #Single Metric, No granularity (summary report)
  aa <- QueueOvertime("zwitchdev",
                      "2014-01-01",
                      "2014-12-31",
                      "visits",
                      "")

  #Validate returned value is a data.frame
  expect_is(aa, "data.frame")

  #Single Metric, Daily Granularity
  bb <- QueueOvertime("zwitchdev",
                      "2014-01-01",
                      "2014-12-31",
                      "visits",
                      "day")

  #Validate returned value is a data.frame
  expect_is(bb, "data.frame")

  #Single Metric, Week Granularity
  cc <- QueueOvertime("zwitchdev",
                      "2014-01-01",
                      "2014-12-31",
                      "visits",
                      "week")

  #Validate returned value is a data.frame
  expect_is(cc, "data.frame")

  #Two Metrics, Week Granularity
  dd <- QueueOvertime("zwitchdev",
                      "2014-01-01",
                      "2014-12-31",
                      c("visits", "pageviews"),
                      "week")

  #Validate returned value is a data.frame
  expect_is(dd, "data.frame")

  #Two Metrics, Month Granularity, Social Visitors
  ee <- QueueOvertime("zwitchdev",
                      "2014-01-01",
                      "2014-12-31",
                      c("visits", "pageviews"),
                      "month",
                      "5433e4e6e4b02df70be4ac63")

  #Validate returned value is a data.frame
  expect_is(ee, "data.frame")

  #Two Metrics, Day Granularity, Social Visitors, Anomaly Detection
  ff <- QueueOvertime("zwitchdev",
                      "2014-01-01",
                      "2014-12-31",
                      c("visits", "pageviews"),
                      "day",
                      "5433e4e6e4b02df70be4ac63",
                      anomaly.detection = "1")

  #Validate returned value is a data.frame
  expect_is(ff, "data.frame")



})

From the code above, you can see the tests are fairly simplistic; for a given number of permutations of arguments of the function, I test to see if a data frame was returned. This is because, for the most part, RSiteCatalyst is just a means of generating JSON calls, submitting them to the Adobe Analytics API, then parsing the results into an R data frame.

Since there is very little additional logic in the package, I don’t spend a bunch of time testing what data is actually returned (i.e. what is returned depends on the Adobe Analytics API, not R). What is interesting is line 6; I reference Sys.getenv() twice in order to pass in my username and key for the Adobe Analytics API, which feels very “interactive R”, but the goal is automated testing. Filling in those two environment variables is where Travis CI comes in.

Travis CI Configuration

In order to have any automation using Travis CI, you need to create a .travis.yml configuration file. While you can read the Travis docs to create the .travis.yml file for R, you’re probably better off just using the use_travis function from devtools (also from Hadley, little surprise!) to create the file for you. In terms of creating encrypted keys to use with Travis, you’ll need to use the Travis CLI tool, which is distributed as a Ruby gem (i.e. package).  If you view the RSiteCatalyst .travis.yml file, you can see that I define two global “secure” variables, the value of which are the output from running a command similar to the following in the Travis CLI tool:

$ travis encrypt RANDY=ZWITCH
Please add the following to your .travis.yml file:

  secure: "b6S4dBc7arvox8UpuFqkz+VP2UmAW/S/B/vgaAdZiZQqUp78YDR6VYdAYN3WisCK1VLGjOVVPQvGxLik0pQokF8FU3sjX0ekH6vSJeqg4utrEZmVtNvdDLEVAmagFy8Fyduow3U4CPW7rzXqvAE4cIVqGR5Lv2KLf8ANUGn+y3E="

Pro Tip: You can add it automatically by running with --add.

Note that if this seems insecure, every time you run the encrypt command with the same arguments, you get a different value; Travis CI is creating new public and private RSA keys each time.

Setting Up Authenticated Testing Locally

If you get as far as setting up encrypted Travis CI keys and tests using testthat, the final step is really for convenience. With the .travis.yml file, Travis CI sets the R environment variables on THEIR system; on your local machine, the environment variables aren’t set. Even if the environment variables were set, they would be set to the Travis CI hashed values, which is not what I want to pass to my authentication function in my R package.

To set the authentication variables locally, so that each time you hit ‘check’ to build and check against CRAN errors, you just need to modify the .Renviron file for R:

USER="myusername"
SECRET="mysecret"

With that minor change, in addition to the .travis.yml file, you’ll have a seamless environment for developing and testing R packages.

Testing Is Like Flossing…

As easy as the testthat and devtools packages make testing, and as inexpensively as Travis CI is as a service (free for open source projects!), there’s really no excuse to provide packaged-up code and not include tests. Hopefully this blog post has demonstrated that it’s possible to include tests even when authentication is necessary without compromising your credentials.

So let’s all be sure to include tests, not just pay lip service to the idea that testing is useful. Code testing only works if you actually do it 🙂

  • RSiteCatalyst Version 1.4.16 Release Notes
  • Using RSiteCatalyst With Microsoft PowerBI Desktop
  • RSiteCatalyst Version 1.4.14 Release Notes
  • RSiteCatalyst Version 1.4.13 Release Notes
  • RSiteCatalyst Version 1.4.12 (and 1.4.11) Release Notes
  • Self-Service Adobe Analytics Data Feeds!
  • RSiteCatalyst Version 1.4.10 Release Notes
  • WordPress to Jekyll: A 30x Speedup
  • Bulk Downloading Adobe Analytics Data
  • Adobe Analytics Clickstream Data Feed: Calculations and Outlier Analysis
  • Adobe: Give Credit. You DID NOT Write RSiteCatalyst.
  • RSiteCatalyst Version 1.4.8 Release Notes
  • Adobe Analytics Clickstream Data Feed: Loading To Relational Database
  • Calling RSiteCatalyst From Python
  • RSiteCatalyst Version 1.4.7 (and 1.4.6.) Release Notes
  • RSiteCatalyst Version 1.4.5 Release Notes
  • Getting Started: Adobe Analytics Clickstream Data Feed
  • RSiteCatalyst Version 1.4.4 Release Notes
  • RSiteCatalyst Version 1.4.3 Release Notes
  • RSiteCatalyst Version 1.4.2 Release Notes
  • Destroy Your Data Using Excel With This One Weird Trick!
  • RSiteCatalyst Version 1.4.1 Release Notes
  • Visualizing Website Pathing With Sankey Charts
  • Visualizing Website Structure With Network Graphs
  • RSiteCatalyst Version 1.4 Release Notes
  • Maybe I Don't Really Know R After All
  • Building JSON in R: Three Methods
  • Real-time Reporting with the Adobe Analytics API
  • RSiteCatalyst Version 1.3 Release Notes
  • Adobe Analytics Implementation Documentation in 60 Seconds
  • RSiteCatalyst Version 1.2 Release Notes
  • Clustering Search Keywords Using K-Means Clustering
  • RSiteCatalyst Version 1.1 Release Notes
  • Anomaly Detection Using The Adobe Analytics API
  • (not provided): Using R and the Google Analytics API
  • My Top 20 Least Useful Omniture Reports
  • For Maximum User Understanding, Customize the SiteCatalyst Menu
  • Effect Of Modified Bounce Rate In Google Analytics
  • Adobe Discover 3: First Impressions
  • Using Omniture SiteCatalyst Target Report To Calculate YOY growth
  • ODSC webinar: End-to-End Data Science Without Leaving the GPU
  • PyData NYC 2018: End-to-End Data Science Without Leaving the GPU
  • Data Science Without Leaving the GPU
  • Getting Started With OmniSci, Part 2: Electricity Dataset
  • Getting Started With OmniSci, Part 1: Docker Install and Loading Data
  • Parallelizing Distance Calculations Using A GPU With CUDAnative.jl
  • Building a Data Science Workstation (2017)
  • JuliaCon 2015: Everyday Analytics and Visualization (video)
  • Vega.jl, Rebooted
  • Sessionizing Log Data Using data.table [Follow-up #2]
  • Sessionizing Log Data Using dplyr [Follow-up]
  • Sessionizing Log Data Using SQL
  • Review: Data Science at the Command Line
  • Introducing Twitter.jl
  • Code Refactoring Using Metaprogramming
  • Evaluating BreakoutDetection
  • Creating A Stacked Bar Chart in Seaborn
  • Visualizing Analytics Languages With VennEuler.jl
  • String Interpolation for Fun and Profit
  • Using Julia As A "Glue" Language
  • Five Hard-Won Lessons Using Hive
  • Using SQL Workbench with Apache Hive
  • Getting Started With Hadoop, Final: Analysis Using Hive & Pig
  • Quickly Create Dummy Variables in a Data Frame
  • Using Amazon EC2 with IPython Notebook
  • Adding Line Numbers in IPython/Jupyter Notebooks
  • Fun With Just-In-Time Compiling: Julia, Python, R and pqR
  • Getting Started Using Hadoop, Part 4: Creating Tables With Hive
  • Tabular Data I/O in Julia
  • Hadoop Streaming with Amazon Elastic MapReduce, Python and mrjob
  • A Beginner's Look at Julia
  • Getting Started Using Hadoop, Part 3: Loading Data
  • Innovation Will Never Be At The Push Of A Button
  • Getting Started Using Hadoop, Part 2: Building a Cluster
  • Getting Started Using Hadoop, Part 1: Intro
  • Instructions for Installing & Using R on Amazon EC2
  • Video: SQL Queries in R using sqldf
  • Video: Overlay Histogram in R (Normal, Density, Another Series)
  • Video: R, RStudio, Rcmdr & rattle
  • Getting Started Using R, Part 2: Rcmdr
  • Getting Started Using R, Part 1: RStudio
  • Learning R Has Really Made Me Appreciate SAS