Seifedine Kadry - Big Data

Title: Big Data
Author: Seifedine Kadry
Language: English

Big Data: summary and description

Learn Big Data from the ground up with this complete and up-to-date resource from leaders in the field, Big Data: Concepts, Technology, and Architecture. You’ll learn about the creation of structured, unstructured, and semi-structured data, data storage solutions, traditional database solutions like SQL, data processing, data analytics, machine learning, and data mining. You’ll also discover how specific technologies like Apache Hadoop, SQOOP, and Flume work.

Accessibly organized, Big Data includes illuminating case studies throughout the material, showing you how the included concepts have been applied in real-world settings. Some of those concepts include:

- The common challenges facing big data technology and technologists, like data heterogeneity and incompleteness, data volume and velocity, storage limitations, and privacy concerns
- Relational and non-relational databases, like RDBMS, NoSQL, and NewSQL databases
- Virtualizing Big Data through encapsulation, partitioning, and isolation, as well as big data server virtualization
- Apache software, including Hadoop, Cassandra, Avro, Pig, Mahout, Oozie, and Hive
- The Big Data analytics lifecycle, including business case evaluation, data preparation, extraction, transformation, analysis, and visualization

Perfect for data scientists, data engineers, and database managers, Big Data also belongs on the bookshelves of business intelligence analysts who are required to make decisions based on large volumes of information. Executives and managers who lead teams responsible for keeping or understanding large datasets will also benefit from this book.

Big Data: sample excerpt

8 _______ is the process of dividing the data set and distributing the data over multiple servers.
a. Vertical
b. Sharding
c. Partition
d. All of the mentioned
Answer: b
Explanation: Sharding is the process of partitioning very large data sets into smaller and easily manageable chunks called shards.

9 A sharded cluster is _______ to provide high availability.
a. Replicated
b. Partitioned
c. Clustered
d. None of the above
Answer: a
Explanation: Replication makes the system fault tolerant since the data is not lost when an individual node fails, as the data is redundant across the nodes.

10 NoSQL databases exhibit ______ properties.
a. ACID
b. BASE
c. Both a and b
d. None of the above
Answer: b
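Questions 8 and 9 can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the book: the names `shard_for`, `replica_nodes`, `build_cluster`, and `read` are hypothetical, the hash-modulo placement is just one possible sharding scheme, and plain dictionaries stand in for servers. It assigns each key to a shard, stores every shard on several nodes, and reads from a surviving replica when a node is down.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash (hash-based sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def replica_nodes(shard, num_nodes, replication_factor):
    """Return the distinct nodes that hold a copy of the given shard."""
    if replication_factor > num_nodes:
        raise ValueError("replication factor cannot exceed the number of nodes")
    return [(shard + i) % num_nodes for i in range(replication_factor)]


def build_cluster(records, num_nodes, replication_factor):
    """Return one dict per node containing the records that node stores."""
    nodes = [dict() for _ in range(num_nodes)]
    for key, value in records.items():
        # Assumption for this sketch: as many shards as there are nodes.
        shard = shard_for(key, num_nodes)
        for node_id in replica_nodes(shard, num_nodes, replication_factor):
            nodes[node_id][key] = value
    return nodes


def read(nodes, failed, key, replication_factor):
    """Read a key from the first replica whose node has not failed."""
    shard = shard_for(key, len(nodes))
    for node_id in replica_nodes(shard, len(nodes), replication_factor):
        if node_id not in failed and key in nodes[node_id]:
            return nodes[node_id][key]
    raise KeyError(key)


if __name__ == "__main__":
    records = {"user%d" % i: i for i in range(10)}
    cluster = build_cluster(records, num_nodes=4, replication_factor=2)
    # Node 0 is treated as failed; the value is served by another replica.
    print(read(cluster, failed={0}, key="user3", replication_factor=2))
```

With a replication factor of 2, any single node can fail without losing data. With a replication factor of 1 the data is sharded but not replicated, so a read fails as soon as the owning node is down.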

Conceptual Short Questions with Answers

1 What is a distributed file system? A distributed file system is an application that stores the files across cluster nodes and allows the clients to access the files from the cluster. Though physically the files are distributed across the nodes, logically it appears to the client as if the files are residing on their local machine.

2 What is failover? Failover is the process of switching to a redundant node upon the abnormal termination or failure of a previously active node.

3 What is the difference between failover and switchover? Failover is an automatic mechanism that does not require any human intervention. This differentiates it from the switchover operation, which requires human intervention.

4 What are the types of cluster? There are two types of cluster: the high‐availability cluster and the load‐balancing cluster.

5 What is a high‐availability cluster? High‐availability clusters are designed to minimize downtime and provide uninterrupted service when nodes fail. Nodes in a high‐availability cluster must have access to shared storage. Such systems are often used for failover and backup purposes.

6 What is a load‐balancing cluster? Load‐balancing clusters are designed to distribute workloads across the cluster nodes so that the service load is shared among them. The main objective of load balancing is to optimize the use of resources, minimize response time, maximize throughput, and avoid overload on any one of the resources.

7 What is a symmetric cluster? A symmetric cluster is a type of cluster structure in which each node functions as an individual computer capable of running applications.

8 What is an asymmetric cluster? An asymmetric cluster is a type of cluster structure in which one machine acts as the head node and serves as the gateway between the user and the remaining nodes.

9 What is sharding? Sharding is the process of partitioning very large data sets into smaller and easily manageable chunks called shards. The shards are stored by distributing them across multiple machines called nodes. No two shards of the same file are stored on the same node; each shard occupies a separate node, and the shards spread across multiple nodes collectively constitute the data set.

10 What is replication? Replication is the process of copying the same data blocks across multiple nodes to prevent loss of data when a node crashes. The copy of a data block is called a replica. Replication makes the system fault tolerant since the data is not lost when an individual node fails, as the data is redundant across the nodes.

11 What is the difference between replication and sharding? Replication copies the same data blocks across multiple nodes, whereas sharding distributes different portions of the data across different nodes.

12 What is the master‐slave model? Master‐slave configuration is a model where one centralized device known as the master controls one or more devices known as slaves.

13 What is the peer‐to‐peer model? In a peer‐to‐peer configuration there is no master‐slave concept; all the nodes have the same responsibility and are at the same level.

14 What is scaling up? Scaling up, or vertical scaling, adds more resources to the existing server to increase its capacity to hold more data. The resources can be computation power, hard drive, RAM, and so on. This type of scaling is limited to the maximum scaling capacity of the server.

15 What is scaling out? Scaling out, or horizontal scaling, adds new servers or components to meet the demand. Each additional component is termed a node. Big data technologies work on the basis of scaling out storage. Horizontal scaling enables the system to scale wider to meet the increasing demand. Scaling out storage uses low‐cost commodity hardware and storage components. The components can be added as required without much complexity. Multiple components connect together to work as a single entity.

16 What is a NewSQL database? A NewSQL database is designed to provide scalable performance similar to that of NoSQL systems while retaining the ACID (atomicity, consistency, isolation, and durability) properties of a traditional database management system.
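The answers on failover (question 2) and load‐balancing clusters (question 6) above can be illustrated together. The following Python is a minimal sketch under stated assumptions, not an implementation from the book: the `LoadBalancer` class and its method names are hypothetical, round‐robin is only one of several possible balancing policies, and the `mark_failed` call stands in for an automatic health check that a real cluster would run without human intervention.

```python
class LoadBalancer:
    """Round-robin load balancer that skips nodes marked as failed."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._next = 0  # index of the node to try first on the next request

    def mark_failed(self, node):
        """Record that a node is down (stand-in for a failed health check)."""
        self.healthy.discard(node)

    def mark_recovered(self, node):
        """Return a known node to the pool of healthy nodes."""
        if node in self.nodes:
            self.healthy.add(node)

    def route(self):
        """Pick the node for the next request, failing over past dead nodes."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes available")


if __name__ == "__main__":
    lb = LoadBalancer(["node-a", "node-b", "node-c"])
    print([lb.route() for _ in range(4)])  # node-a, node-b, node-c, node-a
    lb.mark_failed("node-b")
    print([lb.route() for _ in range(2)])  # node-c, node-a (node-b is skipped)
```

While all nodes are healthy, requests rotate evenly across them, which is the load‐balancing behaviour. Once a node is marked as failed, requests that would have gone to it are redirected to the next healthy node, which is the failover behaviour.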

End of the sample excerpt.

Text provided by LitRes LLC.
