2.14 Rank-aLASSO trace of the diabetes data set showing variable importance.
2.15 Diabetes data set showing variable ordering and adjusted R² plot.
2.16 Rank-aLASSO cleaning followed by rank-ridge estimation.
2.17 R-ridge traces and CV scheme with optimal λ₂.
2.18 MSE and MAE plots for five-fold CV scheme producing similar optimal λ₂.
2.19 LS-Enet traces for α = 0.0, 0.2, 0.4, 0.8, 1.0.
2.20 LS-Enet traces and five-fold CV results for α = 0.6 from glmnet().
3.1 Key shrinkage R-estimators to be considered.
3.2 The ADRE of the shrinkage R-estimator using the optimal c and URE.
3.3 The ADRE of the preliminary test (or hard threshold) R-estimator for different Δ² based on λ*=2ln(2).
3.4 The ADRE of nEnet R-estimators.
3.5 Figure of the ADRE of all R-estimators for different Δ².
4.1 Boxplot and Q–Q plot using ANOVA table data.
4.2 LS-ridge and ridge R traces for fertilizer problem from ANOVA table data.
4.3 LS-LASSO and LASSOR traces for the fertilizer problem from the ANOVA table data.
4.4 Effect of variance on shrinkage using ridge and LASSO traces.
4.5 Hard threshold and positive-rule Stein–Saleh traces for ANOVA table data.
8.1 Left: the Q–Q plot for the diabetes data set; right: the distribution of the residuals.
11.1 Sigmoid function.
11.2 Outlier in the context of logistic regression.
11.3 LLR vs. RLR with one outlier.
11.4 LLR vs. RLR with no outliers.
11.5 LLR vs. RLR with two outliers.
11.6 Binary classification – nonlinear decision boundary.
11.7 Binary classification comparison – nonlinear boundary.
11.8 Ridge comparison of number of correct solutions with n = 337.
11.9 LLR-ridge regularization showing the shrinking decision boundary.
11.10 LLR, RLR and SVM on the circular data set with mixed outliers.
11.11 Histogram of passengers: (a) age and (b) fare.
11.12 Histogram of residuals associated with the null, LLR, RLR, and SVM cases for the Titanic data set. SVM probabilities were extracted from the sklearn.svm package.
11.13 RLR-ridge trace for Titanic data set.
11.14 RLR-LASSO trace for the Titanic data set.
11.15 RLR-aLASSO trace for the Titanic data set.
12.1 Computational unit (neuron) for neural networks.
12.2 Sigmoid and relu activation functions.
12.3 Four-layer neural network.
12.4 Neural network example of back propagation.
12.5 Forward propagation matrix and vector operations.
12.6 ROC curve and random guess classifier line based on the RLR classifier on the Titanic data set of Chapter 11.
12.7 Neural network architecture for the circular data set.
12.8 LNNs and RNNs on the circular data set (n = 337) with nonlinear decision boundaries.
12.9 Convergence plots for LNNs and RNNs for the circular data set.
12.10 ROC plots for LNNs and RNNs for the circular data set.
12.11 Typical setup for supervised learning methods. The training set is used to build the model.
12.12 Examples from test data set with cat = 1, dog = 0.
12.13 Unrolling of an RGB image into a single vector.
12.14 Effect of over-fitting, under-fitting and regularization.
12.15 Convergence plots for LNNs and RNNs (test size = 35).
12.16 ROC plots for LNNs and RNNs (test size = 35).
12.17 Ten representative images from the MNIST data set.
12.18 LNN and RNN convergence traces – loss vs. iterations (×100).
12.19 Residue histograms for LNNs (0 outliers) and RNNs (50 outliers).
12.20 These are 49 potential outlier images reported by RNNs.
12.21 LNN (0 outliers) and RNN (144 outliers) residue histograms.
12.22 LNN and RNN confusion matrices and MCC scores.
1.1 Comparison of mean and median on three data sets.
1.2 Examples comparing order and rank statistics.
1.3 Belgium telephone data set.
1.4 Comparison of LS and Theil estimations of Figures 1.1(a) and (d).
1.5 Walsh averages for the set {0.1, 1.2, 2.3, 3.4, 4.5, 5.0, 6.6, 7.7, 8.8, 9.9, 10.5}.
1.6 The individual terms that are summed in Dn(β) and Ln(β) for the telephone data set.
1.7 The terms that are summed in Dn(θ) and Ln(θ) for the telephone data set.
1.8 The LS and R estimations of slope and intercept for Figure 1.1 cases.
1.9 Interpretation of L1/L2 loss and penalty functions.
2.1 Swiss fertility data set.
2.2 Swiss fertility data set definitions.
2.3 Swiss fertility estimates and standard errors for least squares (LS) and rank (R).
2.4 Swiss data subset ordering using |t.value|.
2.5 Swiss data models with adjusted R² values.
2.6 Estimates with outliers from diabetes data before standardization.
2.7 Estimates, MSE and MAE for the diabetes data.
2.8 Enet estimates, training MSE and test MSE as a function of α for the diabetes data
3.1 The ADRE values of ridge for different values of Δ².
3.2 Maximum and minimum guaranteed ADRE of the preliminary test R-estimator for different values of α.
3.3 The ADRE values of the Saleh-type R-estimator for λmax*=2π and different Δ².
3.4 The ADRE values of the positive-rule Saleh-type R-estimator for λmax*=2π and different Δ².
3.5 The ADRE of all R-estimators for different Δ².
4.1 Table of (hypothetical) corn crop yield from six different fertilizers.
4.2 Table of p-values from pairwise comparisons of fertilizers.
8.1 The VIF values of the diabetes data set.
8.2 Estimations for the diabetes data*. (The numbers in parentheses are the corresponding standard errors.)
11.1 LLR algorithm.
11.2 RLR algorithm.
11.3 Car data set.
11.4 Ridge accuracy vs. λ₂ with n = 337 (six outliers).
11.5 RLR-LASSO estimates vs. λ₁ with number of correct predictions.
11.6 Sample of Titanic training data.
11.7 Specifications for the Titanic data set.
11.8 Number of actual data entries in each column.
11.9 Cross-tabulation of survivors based on sex.
11.10 Cross-tabulation using Embarked for the Titanic data set.
11.11 Sample of Titanic numerical training data.
11.12 Number of correct predictions for Titanic training and test sets.
11.13 Train/test set accuracy for LLR-ridge. Optimal value at (*).
11.14 Train/test set accuracy for RLR-ridge. Optimal value at (*).
11.15 Train/test set accuracy for LLR-LASSO. Optimal value at (*).
11.16 Train/test set accuracy for RLR-LASSO. Optimal value at (*).
12.1 RNN-ridge algorithm.
12.2 Interpretation of the confusion matrix.
12.3 Confusion matrix for Titanic data sets using RLR (see Chapter 11).
12.4 Number of correct predictions (percentages) and AUROC of LNN-ridge.
12.5 Input (xᵢⱼ), output (yᵢ) and predicted values p̃(xᵢ) for the image classification problem.
12.6 Confusion matrices for RNNs and LNNs (test size = 35).
12.7 Accuracy metrics for RNNs vs. LNNs (test size = 35).
12.8 Train/test set accuracy for LNNs. F1 score is associated with the test set.
12.9 Train/test set accuracy for RNNs. F1 score is associated with the test set.
12.10 Confusion matrices for RNNs and LNNs (test size = 700).
12.11 Accuracy metrics for RNNs vs. LNNs (test size = 700).
12.12 MNIST training with 0 outliers.
12.13 MNIST training with 90 outliers.
12.14 MNIST training with 180 outliers.
12.15 MNIST training with 270 outliers.
12.16 Table of responses and probability outputs.