Data Mining and Machine Learning Applications
The book elaborates in detail on the current needs of data mining and machine learning and promotes mutual understanding among research in different disciplines, thus facilitating research development and collaboration.
Keywords: Data mining, KDD, clustering, classification, Python, KNIME
1.1 Introduction
1.1.1 Data Mining
‘Mining’ refers to extracting meaningful information from databases. It helps researchers, students, and other IT professionals extract the significant details they need and develop the desired applications [1, 2]. It is also known as Knowledge Discovery from Databases (KDD). Application areas of KDD include medicine and hospitals, marketing, educational systems, scientific applications, e-commerce, retail, biological analysis, counterterrorism, data warehousing, decision making in the energy sector, spatial data mining, and logistics [4–6].
1.2 Knowledge Discovery in Database (KDD)
KDD helps detect previously unknown patterns, i.e., it extracts hidden patterns from massive volumes of data [3, 6]. Figure 1.1 gives an overview of KDD, which consists of the following phases:
Data cleaning: This step removes irrelevant data, i.e., unwanted records. The collected data may also contain missing values, which must either be removed or imputed [7].
Figure 1.1 Knowledge discovery in Database (KDD).
Data integration: Data is collected from heterogeneous sources and integrated into a common store such as a data warehouse (DW). Extract-Transform-Load (ETL) is a common technique for this. Integrating data from multiple sources requires proper synchronization between the systems [2].
Data selection & transformation: Once the required data is selected, the next task is data transformation. As the name suggests, transformation converts the selected data into the form required by the mining procedure [8, 9].
Pattern evaluation: Evaluation is based on a set of measures; once these measures are applied, the retrieved results are compared against the stored patterns [9–11].
Knowledge representation: This step presents the processed data in the required formats, such as tables and reports. Knowledge representation generates the rules and makes it possible to visualize them accurately [10].
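The phases above can be illustrated with a minimal sketch in plain Python. It is not the book's implementation: the records, the field names (`id`, `age`, `spend`), and the helper functions `integrate`, `clean`, and `min_max_scale` are assumptions made up for illustration, and the pattern-evaluation phase is left out.

```python
from statistics import mean

# Hypothetical raw records from two sources; None marks a missing value.
source_a = [
    {"id": 1, "age": 34, "spend": 120.0},
    {"id": 2, "age": None, "spend": 80.0},
    {"id": 3, "age": 51, "spend": None},
]
source_b = [
    {"id": 4, "age": 29, "spend": 200.0},
    {"id": 2, "age": None, "spend": 80.0},  # duplicate of a record in source_a
]


def integrate(*sources):
    """Data integration: merge sources, keeping the first record per id."""
    merged = {}
    for source in sources:
        for record in source:
            merged.setdefault(record["id"], dict(record))
    return list(merged.values())


def clean(records, field, strategy="impute"):
    """Data cleaning: drop records missing `field`, or impute the mean."""
    if strategy == "drop":
        return [r for r in records if r[field] is not None]
    fill = mean(r[field] for r in records if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def min_max_scale(records, field):
    """Data transformation: rescale `field` to the range [0, 1]."""
    values = [r[field] for r in records]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for constant columns
    return [{**r, field: (r[field] - low) / span} for r in records]


records = integrate(source_a, source_b)
records = clean(records, "age", strategy="impute")
records = clean(records, "spend", strategy="drop")
records = min_max_scale(records, "spend")

# Knowledge representation: print the result as a simple table.
for r in records:
    print(f"{r['id']:>3} {r['age']:>6.1f} {r['spend']:>6.2f}")
```

Whether to drop or impute a missing value depends on the data; the sketch shows both options only to mirror the two choices named in the data cleaning phase.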
1.2.1 Importance of Data Mining
◦ Useful in predictive analysis.
◦ Storing and managing data in multidimensional systems.
◦ Identifying hidden patterns.
◦ Knowledge representation in desired formats, etc. [11].
1.2.2 Applications of Data Mining
Fraud Detection: Data mining identifies user-specific patterns and builds a model based on valid and invalid states. Using data mining techniques, one can classify records as fraudulent or non-fraudulent [14].
Marketing Analysis: This is based on association mining, i.e., identifying users' preferences. Such techniques reveal users' purchasing habits and make it possible to compare different items, their pricing, etc. [13].
Customer Relationship Management: Every organization keenly observes and maintains this segment, popularly known as CRM. Here, users/customers can be distinguished by their loyalty towards the organization. Customer data can be collected and analyzed to get the desired results [13].
Banking and Finance: The banking and finance sector holds huge amounts of client data. Banking and financial software systems help managers identify the right client segment and loyal clients. These systems process ‘n’ transactions that a person cannot handle manually. They store and process a large volume of data and produce the desired results in less time [13].
Healthcare Industries: Everyone is concerned about health. Different parameters and values help health care professionals diagnose a disease. Data on the number of patients, diseases, and symptoms can be processed to get an accurate prediction. Software systems used in the health care industry process a large number of observed values and compare them with stored patterns to draw an accurate conclusion [13].
Educational Purpose: Using data mining, one can identify students' interests in different fields. It also helps improve teaching methodology in line with new trends [13].
Crime Investigation: Data mining helps identify patterns shared across different crimes. Crimes, criminals, and their characteristics are analyzed under this category. A large volume of stored data can be processed to identify relationships among criminals. Face recognition, fingerprint recognition, etc., are also used in the investigation [14].
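As a concrete illustration of the association mining mentioned under Marketing Analysis, the following sketch computes two standard measures, support and confidence, over a handful of market baskets. The baskets and the function names `pair_support` and `confidence` are hypothetical; a real system would use a dedicated algorithm such as Apriori over far larger data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets; each set is one customer's purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each item pair."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair has one canonical (alphabetical) key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` appears in baskets with `antecedent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


support = pair_support(baskets)
print(support[("bread", "milk")])             # share of baskets with both items
print(confidence(baskets, "butter", "bread"))  # rule: butter -> bread
```

A pair with high support is bought together often; a rule with high confidence suggests that buyers of the first item usually also buy the second, which is the kind of purchasing habit the text describes.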
1.2.3 Databases
A database is a collection of records. The structure of a database and its records vary with the application. The following types of databases are used in many applications [15].
Transactional Database: This popular type of database consists of rows and columns, where each row is known as a transaction. A transaction has the following parameters:
◦ Transaction id
◦ Timestamp
◦ List of items
◦ Item description
The transaction id is a unique identifier generated by the system. Transactional databases are mostly related to financial matters such as banking transactions, booking a movie ticket, booking a flight, etc. [16].
Multimedia Database: The data integration phase of the KDD process integrates data from multiple sources, and that data can take the form of text, documents, video, images, audio, etc. Storing these different data types (multimedia data) requires a high-dimensional space, which is a characteristic of a multimedia database [17]. Examples are:
◦ Video-on-demand
◦ Digital libraries
◦ Animations
◦ Images
Spatial Database: Alongside multimedia and transactional databases, there is the spatial database, which stores geographical information. This information includes maps, the positioning of objects, etc. Geographic coordinates are useful in determining topographic data [17].
Figure 1.2 Time series database.
Time-Series Database: As the name suggests, a time-series database holds information related to a specific item with respect to time, e.g., weekly, monthly, or yearly values. Such patterns help predict the trends and movements of an item over a particular time period and are represented in Figure 1.2.
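To make the transactional and time-series views more concrete, here is a small sketch, assuming a record type with the four transaction parameters listed above. The class `Transaction`, the function `items_per_month`, and all sample values are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class Transaction:
    """One record carrying the four transaction parameters."""
    transaction_id: str        # unique identifier generated by the system
    timestamp: datetime
    items: Tuple[str, ...]     # list of items
    description: str           # item description


# Hypothetical records; ids and values are made up for illustration.
transactions = [
    Transaction("T001", datetime(2023, 1, 5, 10, 30), ("ticket",), "movie ticket"),
    Transaction("T002", datetime(2023, 1, 20, 9, 0), ("ticket", "snack"), "ticket and snack"),
    Transaction("T003", datetime(2023, 2, 3, 18, 45), ("flight",), "flight booking"),
]


def items_per_month(records):
    """Time-series view: count items sold per (year, month) period."""
    totals = defaultdict(int)
    for record in records:
        period = (record.timestamp.year, record.timestamp.month)
        totals[period] += len(record.items)
    return dict(sorted(totals.items()))


monthly = items_per_month(transactions)
for (year, month), count in monthly.items():
    print(f"{year}-{month:02d}: {count}")
```

Grouping by week or year instead of month only changes how `period` is derived from the timestamp.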
1.3 Issues in Data Mining
Data mining involves tasks such as user interfacing, mining, security, performance, and data sources. The following discusses the main issues and tradeoffs in data mining [3–5, 14].
◦ User interface design: As discussed in the KDD process, discovered knowledge needs to be represented using good, accurate visualization. The user interface design issue addresses the interaction between users and the system and how information is rendered. It requires analysts and programmers to work at different conceptual levels.
◦ Mining methodology issues: This issue covers the following sub-points: the algorithms to be used, error-free data, low time complexity, and metadata processing.
◦ Security issues: Security is a very important issue in data mining. Data collection and data processing require maintaining the integrity and confidentiality of the data. Data mining systems deal with users' private and sensitive information, so securing this data is a primary objective.
◦ Performance issues: Many data mining applications on the market are used in different sectors. These applications process large volumes of data, so data mining algorithms and applications must handle this data without compromising system performance.