Alan R. Simon - Data Lakes For Dummies

Data Lakes For Dummies: summary, description, and annotation

Take a dive into data lakes

“Data lakes” is the latest buzzword in the world of data storage, management, and analysis. Data Lakes For Dummies decodes and demystifies the concept and helps you get a straightforward answer to the question: “What exactly is a data lake and do I need one for my business?” Written for an audience of technology decision makers tasked with keeping up with the latest and greatest data options, this book provides the perfect introductory survey of these novel and growing features of the information landscape. It explains how they can help your business, what they can (and can’t) achieve, and what you need to do to create the lake that best suits your particular needs.

With a minimum of jargon, prolific tech author and business intelligence consultant Alan Simon explains how data lakes differ from other data storage paradigms. Once you’ve got the background picture, he maps out ways you can add a data lake to your business systems; migrate existing information and switch on the fresh data supply; clean up the product; and open channels to the best intelligence software for interpreting what you’ve stored.

Understand and build data lake architecture

Store, clean, and synchronize new and existing data

Compare the best data lake vendors

Structure raw data and produce usable analytics

Whatever your business, data lakes are going to form ever more prominent parts of the information universe every business should have access to. Dive into this book to start exploring the deep competitive advantage they make possible, and make sure your business isn’t left standing on the shore.

Data Lakes For Dummies: read the excerpt online

Figure 2-4 shows how you can leave that strategic planning data mart in place alongside the new data lake. You’re essentially isolating that data mart from the new epicenter of your enterprise analytics. True, some data feeds will be duplicated between the strategic planning data mart and the data lake. But that’s okay! And over time, maybe you’ll decide to incorporate the strategic planning data mart into the data lake itself.

FIGURE 2-4: Leaving a data mart intact and alongside your data lake.

Data mart incorporation

The primary difference between isolating an existing data mart (refer to Figure 2-4) and incorporating that data mart into the data lake (see Figure 2-5) is that you eliminate the duplicate data feeds between the two.

FIGURE 2-5: Incorporating a data mart into your data lake.

Suppose your data feeds for your strategic planning data mart are exceptionally well architected. Why not move them over to bring data into the data lake? Chances are, other analytical needs for accounting, finance, HR, marketing, and other organizations and functions within your enterprise can also leverage that data. At the same time, all the great work that your organization did to consolidate and organize data for your annual strategic planning can become part of your overall data lake.
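
To make the difference between isolation and incorporation concrete, here’s a minimal Python sketch that counts how many feed pipelines you’d run under each approach. The feed names and the `count_feed_runs` helper are made up for illustration; they don’t come from any particular product or from a real architecture.

```python
def count_feed_runs(mart_feeds, lake_feeds, incorporated):
    """Count the feed pipelines needed under each approach.

    Isolated: the data mart and the data lake each load their own feeds,
    so any shared source is loaded twice.
    Incorporated: each distinct source is loaded once, into the data lake.
    """
    if incorporated:
        return len(mart_feeds | lake_feeds)
    return len(mart_feeds) + len(lake_feeds)


# Hypothetical source feeds for the strategic planning data mart and the data lake.
mart_feeds = {"general_ledger", "sales_orders", "headcount"}
lake_feeds = {"general_ledger", "sales_orders", "headcount",
              "web_clickstream", "support_tickets"}

# Feeds that both targets load today (the duplication that isolation tolerates).
duplicated = sorted(mart_feeds & lake_feeds)
isolated_runs = count_feed_runs(mart_feeds, lake_feeds, incorporated=False)
incorporated_runs = count_feed_runs(mart_feeds, lake_feeds, incorporated=True)

print("Duplicated feeds:", duplicated)
print("Pipelines when isolated:", isolated_runs)
print("Pipelines when incorporated:", incorporated_runs)
```

In this made-up example, every feed the data mart uses is also a data lake feed, so incorporation removes all the duplicated pipelines.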

Eliminating Future Stand-Alone Data Marts

Even after getting your data mart proliferation under control as part of your data lake efforts, beware: History can easily repeat itself!

Make no mistake about it: Just because you’re now in the data lake era rather than the earlier data warehouse era, business organizations will still likely want to create their own smaller-scale data marts for their specific analytics needs.

Your data lake gives you a carrot-and-stick, one-two punch to help prevent the proliferation of future data marts.

First the stick, and then the carrot.

Establishing a blockade

Your company’s top leadership needs to help you establish a blockade against new data marts springing into existence. Your chief information officer (CIO) needs to make this policy crystal clear, in concert with their counterparts on the business side: the chief operating officer (COO), chief financial officer (CFO), and others in your company’s executive ranks.

Ideally, even your chief executive officer (CEO) should sign a declaration that another round of data mart proliferation won’t be tolerated.

Should a “no proliferation” edict be written in stone? Probably not. Some departments within your company will inevitably come up with some unique, time-is-of-the-essence analytical need that is better met through a stand-alone data mart than through the data lake.

However, the proponents of a new data mart should be required to prove their case and have their data mart project approved as an exception to the “no proliferation” rule. They need to declare the following:

What the business imperative is for building a new stand-alone data mart (for example, to address some sort of business crisis or to take advantage of a market opportunity that must be addressed immediately)

Why their analytical needs can’t be met using the data lake in the same time frame that it would take to build their new data mart

Whether their planned data mart will be used only for a short period of time and be retired or if it will subsequently be incorporated into the data lake
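
If your exception board wants to track these requests in a tool rather than on slides, the three declarations above could be captured along the following lines. This is only a sketch: the `DataMartExceptionRequest` class, its field names, and the `EndOfLifePlan` options are hypothetical, chosen to mirror the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class EndOfLifePlan(Enum):
    """What happens to the data mart after its initial use (hypothetical options)."""
    RETIRE = "retire after short-term use"
    INCORPORATE = "incorporate into the data lake"


@dataclass
class DataMartExceptionRequest:
    """One request for an exception to the "no proliferation" rule."""
    business_imperative: str
    why_data_lake_cannot_meet_need: str
    end_of_life_plan: Optional[EndOfLifePlan] = None

    def missing_declarations(self) -> List[str]:
        """List the declarations the requester still has to supply."""
        missing = []
        if not self.business_imperative.strip():
            missing.append("business imperative")
        if not self.why_data_lake_cannot_meet_need.strip():
            missing.append("why the data lake cannot meet the need in time")
        if self.end_of_life_plan is None:
            missing.append("retire-or-incorporate plan")
        return missing

    def is_ready_for_review(self) -> bool:
        """A request goes to the board only when all three declarations are present."""
        return not self.missing_declarations()


# Example: a request that skips the second and third declarations.
draft = DataMartExceptionRequest(
    business_imperative="Models needed for the semiannual evaluation cycle",
    why_data_lake_cannot_meet_need="",
)
print(draft.missing_declarations())
```

A check like this doesn’t decide whether an exception is justified; it only makes sure the board has the information it asked for before the discussion starts.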

Providing a path of least resistance

Business users around your organization build new stand-alone data marts because that’s what they’ve done for a long, long time. They realize that the best way to bring data-driven insights into the way they do business is to take charge of their own fate and build an end-to-end solution. Old habits are extremely difficult to break!

Beyond a blockade on new data mart development, your data lake can give these business users a path of least resistance. Make it easier for them to go to the data lake for the data they need instead of doing everything on their own.

Suppose that a new chief people officer (CPO) is hired to lead your company’s HR organization. Jan, the new CPO, is a big believer in applying super-advanced analytics, such as machine learning and artificial intelligence, to numerous HR functions: employee evaluations, salary adjustments and promotions, succession planning, and more.

Jan appoints an analytics team within HR and tells them that, within the next three months, they need to have some initial machine learning models built in time for the semiannual employee evaluation cycle. Raul, the analytics team leader, has been with your company for 15 years and has built several HR-specific data marts in the past for similar needs.

Raul assigns two of the team members, Julia and Dhiraj, to analyze the HR data in Workday (a cloud-based HR and financial management system) to figure out what data needs to be brought into the machine learning model. Raul also assigns another team member, Tamara, to start designing an Amazon Redshift database to store the HR data and support the machine learning algorithms.

Not so fast, Raul!

Raul submits his budget request for the new HR employee incentive evaluation and involvement operations (EIEIO) data mart and is surprised to learn that he needs to present his business case to the company’s new data mart exception board. Raul starts preparing his PowerPoint slides, and comes across item number 2: “State why your analytical needs cannot be met through existing data lake content.”

“Hmm … a data lake,” Raul thinks. “I wonder if the data we need is already in there?”

Sure enough, Raul goes browsing through the data lake catalog and finds that the data lake already has a ton of HR data from Workday that is regularly refreshed. He asks Julia and Dhiraj to match up the work that they’ve done so far with what the data lake catalog shows. Within two hours, they report back with the fantastic news: “Everything we need is in the data lake already!”
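
The matching exercise that Julia and Dhiraj do can be pictured as a simple set comparison. The sketch below assumes the catalog can be exported as a mapping from dataset name to field names; the dataset names, field names, and the `find_catalog_gaps` helper are invented for illustration and aren’t an actual Workday schema or catalog API.

```python
def find_catalog_gaps(required_fields, catalog):
    """Return the required fields that no cataloged dataset provides, sorted."""
    available = {field for fields in catalog.values() for field in fields}
    return sorted(set(required_fields) - available)


# Hypothetical export of the data lake catalog: dataset name -> field names.
catalog = {
    "workday_employees": {"employee_id", "hire_date", "job_title"},
    "workday_compensation": {"employee_id", "base_salary", "last_raise_date"},
    "workday_reviews": {"employee_id", "review_period", "rating"},
}

# Fields the HR analytics team decided the machine learning models need.
required = ["employee_id", "rating", "base_salary", "hire_date"]

gaps = find_catalog_gaps(required, catalog)
if gaps:
    print("Not in the data lake yet:", gaps)
else:
    print("Everything we need is in the data lake already!")
```

If the comparison does turn up gaps, that short list (rather than a whole new data mart) is what the team would bring to the data lake’s owners.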

A well-constructed data lake offers business users a path of least resistance when it comes to gathering the data they need for their analytical needs. Raul’s team will still need to build the machine learning models to produce the analytics that Jan, your CPO, wants to apply to the next evaluation cycle. But they no longer need to proceed with analytics on a business-as-usual basis, constantly acquiring and storing the same data over and over in different data marts.

Over time, as familiarity with the data lake spreads throughout your organization, fewer unnecessary data mart requests such as Raul’s will need to be redirected back to the data lake. Raul wasn’t deliberately trying to do everything on his own; he just wasn’t familiar enough with what the data lake provided, not only to HR but to your company as a whole.
