1 Cover
2 Title Page: Data Science in Theory and Practice: Techniques for Big Data Analytics and Complex Data Sets. Maria Cristina Mariani (University of Texas, El Paso, United States); Osei Kofi Tweneboah (Ramapo College of New Jersey, Mahwah, United States); Maria Pia Beccar-Varela (University of Texas, El Paso, United States)
3 Copyright
4 List of Figures
5 List of Tables
6 Preface
7 1 Background of Data Science 1.1 Introduction 1.2 Origin of Data Science 1.3 Who is a Data Scientist? 1.4 Big Data
8 2 Matrix Algebra and Random Vectors 2.1 Introduction 2.2 Some Basics of Matrix Algebra 2.3 Random Variables and Distribution Functions 2.4 Problems
9 3 Multivariate Analysis 3.1 Introduction 3.2 Multivariate Analysis: Overview 3.3 Mean Vectors 3.4 Variance–Covariance Matrices 3.5 Correlation Matrices 3.6 Linear Combinations of Variables 3.7 Problems
10 4 Time Series Forecasting 4.1 Introduction 4.2 Terminologies 4.3 Components of Time Series 4.4 Transformations to Achieve Stationarity 4.5 Elimination of Seasonality via Differencing 4.6 Additive and Multiplicative Models 4.7 Measuring Accuracy of Different Time Series Techniques 4.8 Averaging and Exponential Smoothing Forecasting Methods 4.9 Problems
11 5 Introduction to R 5.1 Introduction 5.2 Basic Data Types 5.3 Simple Manipulations – Numbers and Vectors 5.4 Problems
12 6 Introduction to Python 6.1 Introduction 6.2 Basic Data Types 6.3 Number Type Conversion 6.4 Python Conditions 6.5 Python File Handling: Open, Read, and Close 6.6 Python Functions 6.7 Problems
13 7 Algorithms 7.1 Introduction 7.2 Algorithm – Definition 7.3 How to Write an Algorithm 7.4 Asymptotic Analysis of an Algorithm 7.5 Examples of Algorithms 7.6 Flowchart 7.7 Problems
14 8 Data Preprocessing and Data Validations 8.1 Introduction 8.2 Definition – Data Preprocessing 8.3 Data Cleaning 8.4 Data Transformations 8.5 Data Reduction 8.6 Data Validations 8.7 Problems
15 9 Data Visualizations 9.1 Introduction 9.2 Definition – Data Visualization 9.3 Data Visualization Techniques 9.4 Data Visualization Tools 9.5 Problems
16 10 Binomial and Trinomial Trees 10.1 Introduction 10.2 The Binomial Tree Method 10.3 Binomial Discrete Model 10.4 Trinomial Tree Method 10.5 Problems
17 11 Principal Component Analysis 11.1 Introduction 11.2 Background of Principal Component Analysis 11.3 Motivation 11.4 The Mathematics of PCA 11.5 How PCA Works 11.6 Application 11.7 Problems
18 12 Discriminant and Cluster Analysis 12.1 Introduction 12.2 Distance 12.3 Discriminant Analysis 12.4 Cluster Analysis 12.5 Problems
19 13 Multidimensional Scaling 13.1 Introduction 13.2 Motivation 13.3 Number of Dimensions and Goodness of Fit 13.4 Proximity Measures 13.5 Metric Multidimensional Scaling 13.6 Nonmetric Multidimensional Scaling 13.7 Problems
20 14 Classification and Tree‐Based Methods 14.1 Introduction 14.2 An Overview of Classification 14.3 Linear Discriminant Analysis 14.4 Tree‐Based Methods 14.5 Applications 14.6 Problems
21 15 Association Rules 15.1 Introduction 15.2 Market Basket Analysis 15.3 Terminologies 15.4 The Apriori Algorithm 15.5 Applications 15.6 Problems
22 16 Support Vector Machines 16.1 Introduction 16.2 The Maximal Margin Classifier 16.3 Classification Using a Separating Hyperplane 16.4 Kernel Functions 16.5 Applications 16.6 Problems
23 17 Neural Networks 17.1 Introduction 17.2 Perceptrons 17.3 Feed Forward Neural Network 17.4 Recurrent Neural Networks 17.5 Long Short‐Term Memory 17.6 Application 17.7 Significance of Study 17.8 Problems
24 18 Fourier Analysis 18.1 Introduction 18.2 Definition 18.3 Discrete Fourier Transform 18.4 The Fast Fourier Transform (FFT) Method 18.5 Dynamic Fourier Analysis 18.6 Applications of the Fourier Transform 18.7 Problems
25 19 Wavelets Analysis 19.1 Introduction 19.2 Discrete Wavelets Transforms 19.3 Applications of the Wavelets Transform 19.4 Problems
26 20 Stochastic Analysis 20.1 Introduction 20.2 Necessary Definitions from Probability Theory 20.3 Stochastic Processes 20.4 Examples of Stochastic Processes 20.5 Measurable Functions and Expectations 20.6 Problems
27 21 Fractal Analysis – Lévy, Hurst, DFA, DEA 21.1 Introduction and Definitions 21.2 Lévy Processes 21.3 Lévy Flight Models 21.4 Rescaled Range Analysis (Hurst Analysis) 21.5 Detrended Fluctuation Analysis (DFA) 21.6 Diffusion Entropy Analysis (DEA) 21.7 Application – Characterization of Volcanic Time Series 21.8 Problems
28 22 Stochastic Differential Equations 22.1 Introduction 22.2 Stochastic Differential Equations 22.3 Examples 22.4 Multidimensional Stochastic Differential Equations 22.5 Simulation of Stochastic Differential Equations 22.6 Problems
29 23 Ethics: With Great Power Comes Great Responsibility 23.1 Introduction 23.2 Data Science Ethical Principles 23.3 Data Science Code of Professional Conduct 23.4 Application 23.5 Problems
30 Bibliography
31 Index
32 End User License Agreement
1 Chapter 2 Table 2.1 Examples of random vectors.
2 Chapter 3 Table 3.1 Ramus Bone Length at Four Ages for 20 Boys.
3 Chapter 4 Table 4.1 Time series data of the volume of sales of over a six hour period. Table 4.2 Simple moving average forecasts. Table 4.3 Time series data used in Example 4.6. Table 4.4 Weighted moving average forecasts. Table 4.5 Trend projection of weighted moving average forecasts. Table 4.6 Exponential smoothing forecasts of volume of sales. Table 4.7 Exponential smoothing forecasts from Example 4.9. Table 4.8 Adjusted exponential smoothing forecasts.
4 Chapter 6 Table 6.1 Numbers. Table 6.2 Files mode in Python.
5 Chapter 7 Table 7.1 Common asymptotic notations.
6 Chapter 9 Table 9.1 Temperature versus ice cream sales.
7 Chapter 12 Table 12.1 Events information. Table 12.2 Discriminant scores for earthquakes and explosions groups. Table 12.3 Discriminant scores for Lehman Brothers collapse and Flash crash ... Table 12.4 Discriminant scores for Citigroup in 2009 and IAG stock in 2011.
8 Chapter 13 Table 13.1 Data matrix. Table 13.2 Distance matrix. Table 13.3 Stress and goodness of fit. Table 13.4 Data matrix.
9 Chapter 14 Table 14.1 Models' performances on the test dataset with 23 variables using ... Table 14.2 Top 10 variables selected by the Random forest algorithm. Table 14.3 Performance for the four models using the top 10 features from mo...
10 Chapter 15 Table 15.1 Market basket transaction data. Table 15.2 A binary representation of market basket transaction data. Table 15.3 Grocery transactional data. Table 15.4 Transaction data.
11 Chapter 16 Table 16.1 Models performances on the test dataset.
12 Chapter 18 Table 18.1 Percentage of power for Discover data. Table 18.2 Percentage of power for JPM data. Table 18.3 Percentage of power for Microsoft data. Table 18.4 Percentage of power for Walmart data.
13 Chapter 19 Table 19.1 Determining and for . Table 19.2 Percentage of total power (energy) for Albuquerque, New Mexico (A... Table 19.3 Percentage of total power (energy) for Tucson, Arizona (TUC) seis...
14 Chapter 21 Table 21.1 Moments of the Poisson distribution with intensity . Table 21.2 Moments of the distribution. Table 21.3 Scaling exponents of Volcanic Data time series.