Data Science Birds of a Feather

Click to learn more about author Steve Miller.

I had lunch with Diego Klabjan the other day. Diego’s a professor of Industrial Engineering and Management Sciences at Northwestern University and progenitor/director of the now five-year-old Master of Science in Analytics (MSIA) program at the school. MSIA was recently rated one of the top six Masters in Data Science programs in the country by Forbes magazine.

Over the past eight years, I’ve engaged with half a dozen graduate Analytics/Data Science programs, writing blogs evaluating program emphases and curricula. I first met Diego four years ago in Evanston when the initial MSIA cohort had just completed year one of the year-and-a-half curriculum. As I’ve done with all those I’ve engaged, I asked Diego about the computational foci of the new program. And like the responses from earlier analytics program directors, his answer wasn’t encouraging: there were no pure programming courses, only a two-week bootcamp before the official program start, along with appropriate computational emphasis in the statistical classes.

In the early days especially, academics often equated analytics with statistics/Machine Learning, giving short shrift to nuts-and-bolts Data Management/computation. A grizzled 40-year data/stats veteran, I well appreciate the computational preoccupation of the work I’ve done over the years, which probably summarizes to about 80% programming/Data Management and 20% stats.

Before Diego and I met again the following winter at Strata in Santa Clara, I noticed on the MSIA website that a required course in Java/Python had been added. When I kidded him for changing his position, Diego said the decision on the new course was a no-brainer – that students, employers, and the program advisory board had asked for the class. Nice! Compared to other program directors I’d spoken with, that responsiveness was quite refreshing.

Indeed, MSIA has continuously added/subtracted/modified classes, generally moving to a more computational – and Data Science (DS) – focus. There are now database retrieval, Data Visualization, Big Data, and optimization courses, all of which involve intense computation/programming. And the specific topics of the courses change each time they’re taught, reflecting the latest technology advances. For example, the Big Data course at first devoted 6 of the 10 class weeks to Hadoop/MapReduce (HMR); now HMR is little more than a one-week historical footnote. The faculty continues to tinker based on input from its constituencies, which include a healthy dose of leading-edge Silicon Valley companies and practitioners. At this point the MSIA seems more a Data Science program than Analytics, though it’s so well branded now I don’t expect a name change anytime soon.

As we finished our lunches, Diego and I shared thoughts on what constitutes “Data Science”. His point of departure, like that of seminal DS writer Mike Loukides, is “that merely using data isn’t really what we mean by ‘Data Science.’ A data application acquires its value from the data itself, and creates more data as a result. It’s not just an application with data; it’s a data product. Data Science enables the creation of data products.” In this view, Data Science includes most business functions required to design, build and deploy data products.

My conceptual foundation for DS borrows from Stanford professor David Donoho, who sees DS “as the science of learning from data, with all that this entails.” For Donoho, the mandates for a Data Science discipline are expansive, including:

“1. Data Exploration and Preparation
2. Data Representation and Transformation
3. Computing with Data
4. Data Modeling
5. Data Visualization and Presentation
6. Science about Data Science.”


In the end, both of us see a very broad role for Data Science. I think I can speak for Diego by saying we feel the Data Science foundation is business/science knowledge, enabled by requisite skills in statistics and Machine Learning, computation, technology, and Data Management – fueled by an obsession with design methodology that includes curiosity, creativity, skepticism, and story-telling.

Of course, you’re limited in what you can teach in a year and a half, so we discussed curriculum trade-offs – introducing something new while mothballing something old. And we jointly appreciate the critical importance of continuing education in a technical world where one needs to reinvent oneself every few years. Diego’s now assembling short curricula on the latest data and analytics tools and techniques for MSIA grads, while my company, Inquidia Consulting, has committed to producing training materials on the newest in Cloud Analytics while encouraging staff to take Coursera and other online training courses to enhance their skills.

Very enjoyable rendezvous. Looking forward to meeting again in the winter.
