An Afternoon With Edward Tufte


There had to be 400+ people in the seminar!

Yesterday, I had the opportunity to attend the “Presenting Data and Information” seminar hosted by Edward Tufte in Philadelphia.  A world-renowned expert in the field of data presentation/visualization, Edward Tufte has written seven books outlining terrible and fantastic examples of data display (and how to make sure your charts and tables fall into the latter!)

Unfortunately, as great as each of these books is at explaining methods for data visualization, the seminar was little more than a topical discussion of his book material, rather than a concise summary of which pitfalls to avoid. However, given the relatively low cost of the seminar ($380), which includes hardcover editions of four of Tufte’s seven books, there are many worse ways to spend your time and money if you are a data enthusiast!

Course materials

Each attendee received a copy of the following books:

These books are so dense with information that it will probably take me a month or more to read each book!

If your data are boring, you’ve got the wrong data

Of the many positives of this seminar, I appreciated how Tufte hammered on a few main topics, the most important of which is 'If your data are boring, you've got the wrong data'.  I think this is often overlooked when thinking about success in business; if your meetings are dull and people dread your meeting requests, you need better content!  It's (usually) not a visualization problem, and in many ways, it's not a presentation style problem.  If you've got great data, people will overlook an annoying presenter.  But without content that speaks to what the audience is interested in knowing, you might as well not give a presentation at all.

Don't fall into the PowerPoint trap!

The other main point that Tufte really hammered on was that if you let the limitations of a tool like PowerPoint dictate how you perform and present analysis, then you've failed as an analyst.  Humans have an extraordinary ability to process dense amounts of information; by limiting yourself to presenting your analysis in 3 bullets and 10 words per page, you are just perpetuating the 'stupidity' (his words) of that 'authoritarian form of communication'.

As an alternative to PowerPoint-style charts and graphs, the seminar really focused on hand-drawn illustrations, such as Galileo's sunspot drawings, and on examples from cartographers of how to present multivariate data.  Even though paper (or a PowerPoint slide on-screen) is limited to two dimensions, there are many ways to increase the information density to six or more dimensions.  Sparklines were also discussed in detail, as a way to keep the data in-line with text and to show data trends where the actual numbers aren't necessarily important (or are presented elsewhere).
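To make the sparkline idea concrete, here is a minimal sketch in Python that draws a word-sized trend line out of Unicode block characters so it can sit inside a sentence. This is my own illustration, not something shown in the seminar; the `sparkline` helper and the visit numbers are made up for the example.

```python
# Hypothetical helper (not from the seminar): render numbers as a text sparkline.
BARS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def sparkline(values):
    """Return one block character per value, scaled to the range of the data."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat series: draw a mid-height line rather than dividing by zero.
        return BARS[3] * len(values)
    # Scale each value to the nearest of the eight bar heights.
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


# Made-up numbers, used only to show the sparkline in-line with prose.
weekly_visits = [120, 135, 128, 160, 190, 170, 210]
print(f"Weekly visits {sparkline(weekly_visits)} ended the period at {weekly_visits[-1]}.")
```

The point of the design is the one from the seminar: the reader gets the shape of the trend right where the sentence mentions it, and the exact numbers can live in a table elsewhere.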

Simultaneously negative and pie-in-the-sky

I'm not going to focus too much on the negatives here, but one thing that really surprised me about this seminar was how negative in tone the presentation seemed.  I realize part of it was sarcasm (and possibly an affectation), but I would've preferred approaching the topic as what can/should be done to advance the cause, instead of what 'sucks'. Everyone in the room was acutely aware of what sucks in the PowerPoint culture of the business world; moving past that is what everyone was there to learn.

Simultaneously, when talking about improvements, most of them seemed unrealistic to actually implement in the real world (not all of us live in ivory-tower academia, Dr. Tufte 🙂 ).  Suggestions like stripping slides of 'administrative overhead' such as corporate logos and style sheets, bringing the level of a presentation WAY up as if everyone is as smart as the presenter, and writing long prose instead of highlighting comments are just unrealistic for most workers.  Most of the suggestions are corporate culture issues, and ones that a lowly data analyst isn't going to be able to change.

Summary: The right data should be able to 'sell' any presentation

In the end, a 5-hour seminar isn't going to change the business world or turn anyone into a super-analyst. But hearing Dr. Tufte speak about elegant design in data visualizations reminded me that I'm the one who controls the outcome of any presentation.  With the right data, shown properly, I should be able to 'sell' anyone on an idea without having to do any salesmanship at all.  The data is what sells an idea, not slick talking and 3 bullets per page.
