Alan R. Simon - Data Lakes For Dummies

Data Lakes For Dummies: summary, description, and annotation

Take a dive into data lakes
“Data lakes” is the latest buzzword in the world of data storage, management, and analysis. Data Lakes For Dummies decodes and demystifies the concept and helps you get a straightforward answer to the question: “What exactly is a data lake and do I need one for my business?” Written for an audience of technology decision makers tasked with keeping up with the latest and greatest data options, this book provides the perfect introductory survey of these novel and growing features of the information landscape. It explains how they can help your business, what they can (and can’t) achieve, and what you need to do to create the lake that best suits your particular needs.
With a minimum of jargon, prolific tech author and business intelligence consultant Alan Simon explains how data lakes differ from other data storage paradigms. Once you’ve got the background picture, he maps out ways you can add a data lake to your business systems; migrate existing information and switch on the fresh data supply; clean up the product; and open channels to the best intelligence software for interpreting what you’ve stored.
Understand and build data lake architecture
Store, clean, and synchronize new and existing data
Compare the best data lake vendors
Structure raw data and produce usable analytics
Whatever your business, data lakes are going to form ever more prominent parts of the information universe every business should have access to. Dive into this book to start exploring the deep competitive advantage they make possible—and make sure your business isn’t left standing on the shore.

Data Lakes For Dummies: read an excerpt online

The best data lakes are those that satisfy the needs of a broad range of constituencies — basically, something for everyone to make the results well worth the effort.

Carpe Diem: Seizing the Day with Big Data

Maybe your organization has been dabbling in the world of big data for a while, going back to when Hadoop was one of the hottest new technologies. You’ve built some pretty nifty predictive analytics models, and now you’re fairly adept at discovering important patterns buried in mountains of data.

So far, though, your AAA — adventures in advanced analytics — have been highly fragmented. In fact, your analytical data is all over the place. You don’t have consistent approaches to cleansing and refining raw data to get the data ready for analytics; different groups do their own thing. It’s like the Wild West out there!

The concept of a data lake helps you harness the power of big data technology to the benefit of your entire organization. By following emerging best practices, avoiding traps and pitfalls, and building a solidly architected data lake, you can seize the day and help take your organization to new heights when it comes to analytics and data-driven insights.

You’ll achieve economies of scale for the data side of analytics throughout your organization, which means that you’ll get “more bang for your buck” when it comes to acquiring, consolidating, preparing, and storing your analytical data on behalf of your enterprise as a whole rather than repetitively doing so for numerous smaller groups.

Managing Equal Opportunity Data

Your data lake’s big data foundation presents you with an opportunity that, not too long ago, was out of reach for most organizations. You can store, manage, and analyze all three types of data — structured, unstructured, and semi-structured — within a single environment, and without having to jump through hoops to do so!

Many of the business questions you ask of your data will only require structured data. Suppose you work in the supply chain organization within your company. You’ll definitely want your data lake to provide insight into the following:

Who among your strategic suppliers has the best combination of on-time component production and also very low problem rates?

Which third-party logistics firms have the best — or worst — on-time shipping performance?

What’s the percentage of product spoilage among all internal and third-party warehouses during the past six months?
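As a rough illustration of the first question, here is a minimal Python sketch that ranks suppliers by on-time rate and problem rate. The records, the field names (`supplier`, `on_time`, `defective`), and the ranking rule are all assumptions made up for this example; a real data lake would pull these rows from its own structured tables.

```python
from collections import defaultdict

# Hypothetical delivery records; field names and values are illustrative only.
deliveries = [
    {"supplier": "Acme", "on_time": True, "defective": False},
    {"supplier": "Acme", "on_time": True, "defective": False},
    {"supplier": "Acme", "on_time": False, "defective": True},
    {"supplier": "Bolt", "on_time": True, "defective": False},
    {"supplier": "Bolt", "on_time": True, "defective": False},
]


def supplier_scorecard(records):
    """Return {supplier: (on_time_rate, problem_rate)} from delivery records."""
    totals = defaultdict(lambda: {"n": 0, "on_time": 0, "defective": 0})
    for record in records:
        t = totals[record["supplier"]]
        t["n"] += 1
        t["on_time"] += record["on_time"]      # True counts as 1
        t["defective"] += record["defective"]
    return {
        supplier: (t["on_time"] / t["n"], t["defective"] / t["n"])
        for supplier, t in totals.items()
    }


def rank_suppliers(records):
    """Order suppliers by highest on-time rate, then lowest problem rate."""
    card = supplier_scorecard(records)
    return sorted(card, key=lambda s: (-card[s][0], card[s][1]))


scorecard = supplier_scorecard(deliveries)
ranking = rank_suppliers(deliveries)
```

The same group-and-divide pattern would apply to the logistics and spoilage questions; only the grouping key and the counted flag change.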

Other critical business analytics may involve unstructured or semi-structured data. You’ll want to know the following:

What percentage of tweets from your customers represent a positive sentiment about your product quality? Negative sentiment? What “hot spots” are showing up in blogs, tweets, and other social media posts, as well as YouTube videos, that can mean profitability and market share problems for you down the road?

Your reports show a dramatic increase in breakage in Warehouse #2. You have surveillance cameras in all your facilities. Is there anything that shows up on video that could indicate one or more root causes for this breakage that you can address through procedural changes?
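To make the sentiment question a little more concrete, the sketch below computes the percentage of positive, negative, and neutral posts from semi-structured JSON records. Everything here is an assumption for illustration: the sample posts, the `source` and `text` fields, and especially the toy word lists, which stand in for whatever real sentiment model your organization would use.

```python
import json

# Hypothetical semi-structured posts, one JSON document per line.
# Note that fields can vary from record to record (only one has "tags").
raw_posts = """
{"source": "twitter", "text": "Love the build quality, great product"}
{"source": "blog", "text": "Arrived broken, terrible packaging", "tags": ["shipping"]}
{"source": "twitter", "text": "Ordered one last week"}
{"source": "twitter", "text": "Great value and great support"}
""".strip().splitlines()

# Toy word lists standing in for a real sentiment model (an assumption).
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"broken", "terrible", "awful"}


def classify(text):
    """Label text by comparing counts of distinct positive and negative words."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def sentiment_shares(lines, source=None):
    """Return the percentage of posts per sentiment, optionally for one source."""
    posts = [json.loads(line) for line in lines]
    if source is not None:
        posts = [p for p in posts if p.get("source") == source]
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for post in posts:
        counts[classify(post.get("text", ""))] += 1
    total = len(posts)
    return {k: (100.0 * v / total if total else 0.0) for k, v in counts.items()}


shares = sentiment_shares(raw_posts, source="twitter")
```

The point of the sketch is the shape of the work, not the scoring: records with uneven fields are parsed, filtered, labeled, and rolled up into a percentage that can sit next to your structured reports.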

Your data lake gives you one-stop shopping for structured, unstructured, and semi-structured data in a logically centralized, cohesive environment.

BACK TO THE FUTURE, PART 2

In the first edition of Data Warehousing For Dummies (Wiley), back in 1996, I included a chapter about the future directions of data warehousing. One of the forecasts I made was that the first-generation data warehousing of that time would eventually evolve into what I called “multimedia data warehousing” and would include not only structured data but also video and audio content. I made this prediction on the basis that “not all of the business questions we need to ask out of a data warehouse will come from numbers, dates, and character strings; sometimes we need information from images and other multimedia content as well.”

Guess what? You can think of a data lake as the modern incarnation of that “multimedia data warehouse” that I wrote about more than a quarter-century ago. It’s here!

Building Today’s — and Tomorrow’s — Enterprise Analytical Data Environment

Building an all-new analytical data environment around big data technology sounds like a great idea, right? You may be worried, though, that your organization can invest a ton of money over the next couple of years, only to find that your data lake is obsolete because of an entirely new generation of technology.

In other words, can your data lake be not just today’s but also tomorrow’s go-to platform for more and more analytical data and data-driven insights? Absolutely!

Constructing a bionic data environment

Maybe you’ve heard of a B-52. No, not a member of the American new wave music group (so don’t start singing “Love Shack”) but rather the U.S. Air Force plane.

The B-52 first flew in 1952. The normal life span for an Air Force plane is around 28 years before it’s shuffled off to retirement, which means that B-52s should’ve gone out of service around 1980. Instead, the B-52 will eventually be retired sometime in the 2050s. That’s a hundred years — an entire century!

However, a B-52 today bears only a slight resemblance to one made in the ’50s or ’60s. Sure, if you were to put one of the original B-52s side by side with one of today’s planes, the two aircraft would look nearly identical. But the engines, the avionics, the flight controls … pretty much every major subsystem has been significantly upgraded and replaced in each operational B-52 at least a couple times over the years.

Better yet, a B-52 isn’t just some old plane that you may see flying at an airshow but that otherwise doesn’t have much purpose due to the passage of time. Not only is the B-52 still a viable, operational plane, but its mission has continually expanded over the years thanks to new technologies and capabilities.

In fact, you can think of a B-52 as sort of a bionic airplane. Its components and subsystems have been — and will continue to be — swapped out and substantially upgraded on a regular basis, giving the plane a planned life span of almost four times the normal longevity of the typical Air Force plane. Talk about an awe-inspiring feat of engineering!

However, all those enhancements and modifications to the B-52 happened gradually over time, not all at once. Plus, the changes were all carefully planned and implemented with longevity and continued viability top of mind.

Your data lake should follow the same model: a “bionic” enterprise-scale analytical data environment that regularly incorporates new and improved technologies to replace older ones, as well as enhancing overall function. You almost certainly won’t get an entire century’s usage out of a data lake that you build today, but if you do a good job with your planning and implementation, 10 or even 20 years of value from your data lake is certainly achievable.
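One way to picture the “bionic” idea in software is a data lake whose subsystems sit behind a stable interface, so that one of them can be swapped without disturbing the rest. The Python sketch below is only an illustration under that assumption; `StorageEngine`, `InMemoryStore`, `VersionedStore`, and `DataLake` are hypothetical names invented for this example, not part of any real product.

```python
from typing import Iterable, Protocol


class StorageEngine(Protocol):
    """The contract every storage subsystem must honor (hypothetical)."""

    def put(self, key: str, value: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> Iterable[str]: ...


class InMemoryStore:
    """First-generation engine: a plain dictionary."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]

    def keys(self) -> Iterable[str]:
        return list(self._data)


class VersionedStore:
    """Later-generation engine: keeps every version of each object."""

    def __init__(self) -> None:
        self._history = {}

    def put(self, key: str, value: bytes) -> None:
        self._history.setdefault(key, []).append(value)

    def get(self, key: str) -> bytes:
        return self._history[key][-1]  # latest version

    def keys(self) -> Iterable[str]:
        return list(self._history)

    def versions(self, key: str) -> int:
        return len(self._history[key])


class DataLake:
    """Callers depend on this class, never on a specific engine."""

    def __init__(self, storage: StorageEngine) -> None:
        self._storage = storage

    def ingest(self, key: str, value: bytes) -> None:
        self._storage.put(key, value)

    def read(self, key: str) -> bytes:
        return self._storage.get(key)

    def upgrade_storage(self, new_storage: StorageEngine) -> None:
        """Copy existing objects into the new engine, then switch over."""
        for key in self._storage.keys():
            new_storage.put(key, self._storage.get(key))
        self._storage = new_storage


lake = DataLake(InMemoryStore())
lake.ingest("sales/2024.csv", b"region,total\nwest,100\n")
new_engine = VersionedStore()
lake.upgrade_storage(new_engine)
after_upgrade = lake.read("sales/2024.csv")
```

Code that calls `ingest` and `read` keeps working before and after the upgrade, which is the software equivalent of replacing the engines while the airframe stays in service.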

More important, your data lake won’t be just another aging system hanging around long past when it should’ve been retired. You almost certainly have plenty of those antiquated systems stashed in your company’s overall IT portfolio. That’s why the B-52 is the perfect analogy for the data lake, with a “bionic” approach to regularly replacing major subsystems helping to keep your data lake viable for years to come.
