The Data Scientist role has been called the sexiest job of the 21st century, and it has started attracting many young, ambitious contenders for this high-paying and “glamorous” post. However, there is a growing concern among employers and business leaders that candidates are enticed by the “title” alone, without building the mindset and skills needed to succeed on the job. As no licensing or regulatory body oversees the hiring process for Data Scientists, many types of candidates, some genuine and some fake, arrive at job interviews with tremendous hope and enthusiasm.
The unprepared Data Scientists are often those who show expertise in one academic discipline without comprehending that this complex science is multidisciplinary in nature. Candidates with only advanced statistics or Machine Learning knowledge may slip through the interview cracks, but will not last long as Data Scientists unless they can demonstrate equal expertise in business practices and enterprise communication.
The Confessions of a Fake Data Scientist reveals that in the real world, an organization may employ 10 Data Scientists, but only 3 of them are actually engaged in Data Science work. The others may be data analysts, data engineers, or other data-related staff biding their time under a false pretense. True Data Scientists are distinct from data analysts or data engineers; they frequently perform modeling or Machine Learning work to deliver business solutions. Those who spend their time writing SQL code or dashboard applications are doing a great dishonor to the title “Data Scientist” and should not have been hired into this position in the first place.
So, to drastically improve the interviewing methods for Data Scientists, this post suggests some practical tips based on currently available literature.
Data Science is Much More than Smart Algorithms
Data Science is not just about applying statistical theories and designing smart algorithms; Data Scientists have to appreciate and understand the role of smart systems within a large business framework to deliver effective solutions and then communicate the solution to the business leaders and peers.
As SAS Institute’s Patrick Hall, Senior Machine Learning Scientist, points out:
“Data visualization and storytelling are two important ways to communicate results. And communicating up the chain of command is very important.”
The goal of job interviewers should be to assess prospective Data Scientists on their technical, business, and communication skills, as all of these play important roles in real-world jobs.
The Early Confusion About the Data Scientist Role
According to 4 Ways to Spot a Fake Data Scientist, the Data Scientist job title was not well defined in 2013, which led to much confusion and misinterpretation in the job market. By 2015, this trend had intensified, with many fresh Statisticians and Computer Scientists joining the super-inflated bandwagon, lured by high-paying jobs and challenging work prospects. Many of these hired Data Scientists later turned out to be grossly unprepared for their industry roles and incapable of working in cross-functional teams. The truth is that without intense quantitative, Machine Learning, and analytical acumen, most would-be Data Scientists who make it through the interview process will be reduced to either a data analyst or a programmer position, doing nothing close to “scientific data exploration” for the business.
The above article includes some common techniques to screen out unsuitable interview candidates. Those screen-out criteria are:
- Lack of experience with unstructured data
- Lack of business savvy in spite of having exceptional technical skills
- Superficial tool skills like Google Analytics, SPSS, or even Excel, which do not indicate Data Science expertise
Designing Effective Interviews for Data Scientists
When structuring effective interviewing methods for Data Scientists, employers must first differentiate between a Data Scientist, a Data Analyst, and a Data Engineer. This article suggests that one practical way to judge interviewees is to explore their direct involvement with the industry. A wise interviewer will try to find out whether a candidate is familiar with well-known Data Scientists and their contributions to the industry.
KDnuggets’ 20 Questions to Detect Fake Data Scientists offers statistics- and business-based questions that will probably make the interviewee think and apply highly specific statistical theories coupled with business practices.
Inclusive Range of the Interview Questionnaire
However, as a Dataist article argues, KDnuggets’ interview questions may be useful for assessing only a small part of a Data Scientist’s skill set. The Dataist article, titled Let’s Stop Using the Fake Data Scientist Label, offers an excellent rejoinder to KDnuggets’ list of 20 questions. It challenges the questionnaire by stating that the questions do not evaluate a candidate’s interdisciplinary skills across domain expertise, computer science, and applied mathematics. KDnuggets’ questionnaire, according to Dataist, completely overlooks domain expertise and computer science. The Dataist article proposes another set of interview questions that could probably better test a Data Scientist’s interdisciplinary skills.
In response to growing market demand for accurate interview questions and answers for Data Scientists, KDnuggets later published probable interview questions with answers. A second part of the questions and answers is also available.
The blog post titled Real vs Fake Data Scientists: What You Really Need to Know makes an excellent point: someone without adequate experience in quantitative analysis will not make a good Data Scientist. Thus, the Data Scientist’s role must be clearly defined in order to improve interviewing methods for Data Scientists. The post also stresses that the current debate is not so much between a “real” and a “fake” Data Scientist as between a “beginner” and an “advanced” Data Scientist. A licensing board could drastically improve the hiring and interviewing process by drawing accurate distinctions between different levels of data professionals, including Data Scientists.
When interviewing a Data Scientist, the SAS Institute applies a unique assessment strategy that combines technical, experience-oriented, and communication-related questions to evaluate a candidate’s qualifications holistically. To pass the interview, the interviewee must be able to answer questions on statistics, Machine Learning, and practical applications while demonstrating superior communication skills. An example of an experience-oriented question is asking job candidates to explain data cleaning techniques they have used in the past. This article suggests that a Data Scientist’s communication skills must be sufficient to explain solutions and convince others of their merit.
Testing the Data Scientist’s Mindset
According to How to Consistently Hire Remarkable Data Scientists, Data Scientists must be mentally prepared for uncertainty. As Data Science constantly requires data simulation, testing, and validation, a good Data Scientist must demonstrate a certain skepticism and the curiosity to probe data beyond the surface. Further, a Data Scientist has to display openness and team-playing skills in order to succeed on the job. Job interviewers, for their part, must ensure that the decision to hire a candidate is made in a team environment, with input from multiple experts.
The article proposes a six-step interviewing method for Data Scientists, which includes pre-screening, home test, sales pitch, open competition, decision making, and direct communication.
As interview success or failure can often make or break a Data Scientist these days, it is probably a good idea for prospective Data Science candidates to review the sample projects listed on DataScienceBootcamps.com to self-test their professional skills. Additionally, check this blog post on hiring Data Scientists.