The Data Scientist - the WPC Healthcare Blog

What Is Data Science Anyway?


By Damian Mingle

 

Attempting to describe data science is a bit like describing a beautiful Hawaiian sunset – it seems easy enough, but it turns out to be pretty challenging to find the precise words.

The phrase “data science” appeared in computer science literature as early as the 1960s. However, it was not until the 1990s that the field, as it is known today, began to emerge from the statistics and data mining communities. In 2001, data science was first put forward as an independent discipline.

Since 2001, many organizations have needed innovative, multidisciplinary teams to tackle their most pressing problems. Those that formed data science teams have gained immediate insight and established analytic approaches that validate the field of data science.

More accurately, the spirit of data science is about turning an organization’s data into something that can be acted upon. This plays out in numerous ways, with the primary intent being to provide actionable information without exposing decision makers to the underlying data or analytics.

Practicing data science requires extracting timely, actionable information from varied data sources to drive data products (e.g., flu trend predictions, health diagnoses, production process improvements, movie recommendations).

Samples of the types of questions that data science attempts to answer include:

  • “Which of my products should I advertise more heavily to increase profit?”
  • “Which shoppers will become repeat buyers?”
  • “How can I improve my compliance program, while reducing costs?”
  • “What manufacturing process change will allow me to build a better product?”
  • “What’s the probability of default by a consumer on a credit product?”

The key to answering these types of questions lies in understanding the data an organization has and what can be inferred from that data inductively.
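To make the idea of learning inductively from data a little more concrete, here is a minimal sketch of how the credit default question from the list above might be approached. It is only an illustration under stated assumptions: the records are invented for this example (they are not real data), a single feature called debt-to-income ratio is assumed to be available, and plain logistic regression fitted by gradient descent stands in for whatever model a real team would choose. The names `HISTORY`, `fit_logistic`, and `default_probability` are hypothetical.

```python
import math

# Synthetic, illustrative records: (debt_to_income_ratio, defaulted).
# These values are invented for this sketch and are not real data.
HISTORY = [
    (0.10, 0), (0.15, 0), (0.20, 0), (0.25, 0), (0.30, 1), (0.35, 0),
    (0.40, 1), (0.45, 0), (0.50, 1), (0.55, 1), (0.60, 1), (0.70, 1),
]


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(records, learning_rate=0.5, epochs=5000):
    """Fit p(default) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(records)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in records:
            error = sigmoid(w * x + b) - y  # prediction minus observed outcome
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def default_probability(model, debt_to_income):
    """Estimate the probability of default for one consumer."""
    w, b = model
    return sigmoid(w * debt_to_income + b)


model = fit_logistic(HISTORY)
for ratio in (0.15, 0.40, 0.65):
    estimate = default_probability(model, ratio)
    print(f"debt-to-income {ratio:.2f}: estimated default probability {estimate:.2f}")
```

The point of the sketch is the shape of the workflow rather than the specific model: past outcomes go in, and what comes out is an estimate a decision maker can act on without having to look at the underlying records.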

If the leadership in your organization declares that they are simply renaming your Business Intelligence group the Data Science group, please explain to them why this cannot be the solution.

Here are some reasons why:

              | Business Intelligence           | Data Science
--------------+---------------------------------+-------------------------------
Content/Tools | Decision Support System Lineage | Statistical Science Lineage
              | Relational Database-Centric     | Cloud-Centric
              | Reporting/Dashboard Focus       | Statistics/Experiments Focus
              | OLAP                            | Machine Learning
Business      | IT-Owned                        | Analytics-Owned
              | Division of Labor               | Jack-of-All-Trades
Data          | Complete Data                   | Missing Data
              | Absolute                        | Approximate
              | Structured Data                 | Structured + Unstructured Data

Damian Mingle is the Director of Data Science for WPC.

Sources:

Fayyad, Usama, Gregory Piatetsky-Shapiro, and Padhraic Smyth. From Data Mining to Knowledge Discovery in Databases. AI Magazine 17.3 (1996): 37-54. Print.

Mining Data for Nuggets of Knowledge. Knowledge@Wharton, 1999. Web. Accessed 16 October 2013.

Cleveland, William S. Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics. International Statistical Review 69.1 (2001): 21-26. Print.