It may come as a surprise, or a relief, that both warnings are about keeping technology out of the problem definition. To be clear, methodologies and deliverables do enter the picture at some point in the project. To start, however, the problem should be stated in direct, clear language everyone can understand, which is why we recommend you scrap the technical terminology and marketing rhetoric. Start with the problem to be solved, not the technology to be used.
Why does this matter? We've noticed that project teams are a mix of people who are enamored with data and people who are intimidated by it. Once the problem definition conversation steers toward analysis methods or technology, two things happen. First, anyone intimidated by data might freeze up and stop contributing to the discussion at hand: defining the business problem. Second, those enamored with data quickly splinter the problem into technical subproblems that may or may not align with an actual business objective. Once the business problem morphs into data science subproblems, it may be weeks or months before failure is discovered. No one will want to revisit the main problem once the project work starts.
Fundamentally, teams must answer “Is this a real business problem that is worth solving, or are we doing data science for its own sake?” This is a good, albeit blunt, question to ask, especially now during the hype and confusion around data science and related fields.
Who Does This Problem Affect?
The next question you'll want to ask is, “Who does this problem affect?” The spirit of the question is to ask not only who is affected, but also how that person's work will be different going forward.
Think about all layers of the organization (and perhaps its clients, if any). We don't mean the data scientist who works on the problem or the engineering team who may have to maintain the software. The Data Head needs to know who the end users are. Often, they include more than just the people in the room crafting the problem, so it's important to find the people whose daily work will be affected and bring them into the meeting.
We suggest you name names. Whose work will be different if the question gets answered? If it's many people, bring in a small group to represent them. Create a list of these people and understand how they will be affected by the project. You'll want to tie these answers back to the last question.
An exercise to help you think through this is to do a solution trial run. Assume you can answer the question, and then ask your team:
Can we use the answer?
Whose work will change?
This, of course, assumes you even have the right data to answer the question. (As we'll see in Chapter 4, this can be a huge assumption.) But you should answer these questions and go through several scenarios where the problem has been solved. In many cases, answering these questions can strengthen the project and its impact, or may identify a project with no business benefit.
What If We Don't Have the Right Data?
Every dataset has a limited amount of information inside it, and at a certain point, no technology or analysis method will help you go any further.
In the authors' experience, not asking “What if we don't have the right data?” is where companies make some of the biggest mistakes, and these mistakes could be avoided if the question were considered before the project started. Here is what happens when it isn't: by the time the data turns out to be inadequate, everyone who has worked on the project wants to take it to completion no matter what. Data Heads, by contrast, enter the project knowing that not having the right data is a possibility. They create contingencies to pivot to collecting better data to answer the question. Or, if the data doesn't exist, they go back to the original question and attempt to redefine the project scope.
When Is the Project Over?
Many of us have been part of projects that went on too long. When expectations aren't clear before the project starts, teams wind up attending meetings out of habit and generating reports no one bothers to read. Asking “When is the project over?” before the project starts can break this trend.
The question strikes at the heart of why the project was initiated and aligns expectations. Important problems are posed because some information or product is needed in the future that does not exist today. Find out what that final deliverable is. Doing this will rekindle conversations about the project's potential return on investment and whether the team has an agreed-upon metric to measure the project's impact.
So, gather project stakeholders and identify reasons the project could end. Some reasons are obvious, like when a project ends from a lack of funding or waning interest. Set those obvious failures aside and focus on what needs to be delivered to answer the business question and conclude the project. For data projects, the final deliverable is typically an insight (e.g., “how effective was the company's last marketing campaign?”) or an application (e.g., a predictive model that forecasts next week's shipping volume). Many projects will require additional work, such as ongoing support and maintenance, but this needs to be communicated to the team up front.
Don't assume you know the answer to this question until you've asked it.
What If We Don't Like the Results?
The last question a Data Head should ask prepares the stakeholders for something they'd rather overlook: the possibility that their assumptions were wrong. “What if we don't like the results?” imagines you are at the point of no return. You've spent hours on a project only to find out the results show something different from what you expected. Notice this is different from having data that can't answer the question. Here, the data can answer the question, perhaps quite confidently, but the answer is not what the stakeholders wanted.
It's never easy to get to the end of a project only to find out the results were not what you expected. This all-too-real scenario happens more often than we'd like to admit. Thinking first about the possibility that the project might reach an unwanted conclusion will ensure you have a plan in motion when you have to deliver the bad news.
Asking this question will also expose differences in how individuals will accept the results of the project. For instance, consider our avatar George from the introduction. George is the type of person who would ignore results that don't align with his beliefs while promoting the ones that do. The question will hopefully uncover his bias early on, before the project starts.
You don't want to start a project where you know there's only one accepted result.
UNDERSTANDING WHY DATA PROJECTS FAIL
Projects can fail for a host of reasons: lack of funding, tight timelines, wrong expertise, unreasonable expectations, and the like. Add data and analysis methods into the mix, and the list of possible failures not only grows but becomes obscured behind the analysis. A project team might apply an analysis method they can't explain on data they don't understand to solve a problem that doesn't matter—and still think they've succeeded.
Let's look at a scenario.
You work for a Fortune 10 company, Company X, that recently received negative media attention for a socially insensitive marketing campaign. You've been assigned to a project to monitor “customer perception.”
The project team consists of the following:
The project manager (you)
The project sponsor (the person paying for it)
Two marketing professionals (who don't have data backgrounds)
A young data scientist (fresh out of college and eager to apply the techniques they learned)