Doug has a Ph.D. in the life sciences and works for a large corporation in its Research & Development division. Skeptical by nature, he wonders if these latest data trends are akin to snake oil. But Doug mutes his skepticism in the workplace, especially around his new director who wears a “Data is the New Bacon” t-shirt; he doesn't want to be viewed as a data luddite. At the same time, he's feeling left behind and decides to learn what all the fuss is about.
Regina is a C-level executive who is well aware of the latest trends in data science. She oversees her company's new Data Science Division and interacts with senior data scientists on a regular basis. Regina trusts her data scientists and champions their work, but she'd like to have a deeper understanding of what they do because she's frequently presenting and defending her team's work to the company's board of directors. Regina is also tasked with vetting new technology software for the company. She suspects some of the vendors’ claims about “artificial intelligence” are too good to be true and wants to arm herself with more technical knowledge to separate marketing claims from reality.
Nelson manages three data scientists in his new role. A computer scientist by training, Nelson knows how to write software and work with data, but he's new to statistics (other than one class he took in college) and machine learning. Given his somewhat related technical background, he's willing and able to learn the details, but simply can't find time. His management has also been pushing his team to “do more machine learning,” but at this point, it all seems like a magic black box. Nelson is searching for material to help him build credibility within his team and recognize what problems can and cannot be solved with machine learning.
Hopefully, you can identify with one or more of these personas. The common thread among them, and likely you, is the desire to become a better “consumer” of the data and analytics you come across.
We also created an avatar to represent people who should read this book but probably won't (because every story needs a villain):
George: A mid-level manager, George reads the latest business articles about artificial intelligence and forwards his favorites up and down his management chain as evidence of his technical trendiness. But in the boardroom, he prides himself on “going with his gut.” George likes his data scientists to spoon-feed him the numbers in one or two slides, max. When the analysis agrees with what he (and his gut) decided before he commissioned the study, he moves it up the chain and boasts to his peers about enabling an “Artificial Intelligence Enterprise.” If the analysis disagrees with his gut feeling, he interrogates his data scientists with a series of nebulous questions and sends them on a wild goose chase until they find the “evidence” he needs to push his project forward.
Don't be like George. If you know a “George,” recommend this book and say they reminded you of “Regina.”
We think a lot of people, like our avatars, want to learn about data and don't know where to start. Existing books in data science and statistics span a wide spectrum. On one side of the spectrum are non-technical books extolling the virtues and promise of data. Some are better than others, but even the best ones read like modern-day business books, and many are written by journalists looking to add drama around the rise of data.
These books describe how specific business problems were solved by looking at a problem through the lens of data. And they might even use words like artificial intelligence, machine learning, and the like. Don't get us wrong, these books create awareness. However, they don't delve deeply into what was done, instead focusing specifically on the problem and the solution at a high level.
On the other side of the spectrum are highly technical books. These hardbound, 500-page tomes are as intimidating physically as the content inside is intimidating mentally.
The far sides of this spectrum have mountains of books. This perpetuates the communication gap—most people either read just the business books or just the technical books. Not both.
Thankfully, the gorge between the two extremes contains a handful of excellent books. Two of our favorites are:
Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, by Foster Provost and Tom Fawcett (O'Reilly Media, 2013)
Data Smart: Using Data Science to Transform Information into Insight, by John W. Foreman (Wiley, 2013)
We want to add one more to this list by writing a book you can read casually without a computer or pad of paper nearby. If you enjoy our book, we highly recommend taking the next step by reading one or both of the books listed to solidify your understanding. You won't regret it.
Plus, we love this stuff. If we can convey that to you and inspire you to learn more about data and analytics, we'll consider this book a success.
This book will help you construct a mental model of data science, statistics, and machine learning. What is a mental model? It's “a simplified representation of the most important parts of some problem domain that is good enough to enable problem solving.” 6 Think of it as a new storage room in your brain where you can put information.
Some books and articles start with a list of definitions: “Machine Learning is …”, “Deep Learning is …”, etc. Seeing a list of technical definitions without a mental model to fit the information into would be like someone dropping off boxes of clothes when you don't have a place to store them. Sooner or later, it's all going to end up in the garbage.
But with a newly constructed mental model, you will learn how to think, speak, and understand data. You'll become a Data Head.
Specifically, by reading this book, you will be able to:
Think statistically and understand the role variation plays in your life and decision making.
Become data literate—speak intelligently and ask the right questions about the statistics and results you encounter in the workplace.
Understand what's really going on with machine learning, text analytics, deep learning, and artificial intelligence.
Avoid common pitfalls when working with and interpreting data.
HOW THIS BOOK IS ORGANIZED
Data Heads are people who know how to think critically about data, regardless of their official role. A Data Head can be the analyst behind the keyboard doing the work, or the person at the head of the boardroom table reviewing the work of others. This book will put you, the Data Head, in various roles at different points.
While the “story” of the book is chronological, each chapter is effectively a standalone lesson and could be read out of order. But we recommend reading the book from beginning to end so that your mental model builds steadily from the basics up to deep learning.
The book is organized into four parts:
Part I: Thinking Like a Data Head
In this part, you'll learn to think like a Data Head: to think critically and ask the right questions about the data projects your organization takes on, to understand what data is and the right lingo to use, and to view the world through a statistical lens.
Part II: Speaking Like a Data Head
Data Heads are active participants in important data conversations. This part will teach you how to “argue” with data and what questions to ask to make sense of the statistics you encounter. You'll be exposed to basic statistics and probability concepts required to understand and challenge the results you see.