
Machine Learning Algorithms and Applications


The book discusses many methods rooted in different fields, including statistics, pattern recognition, neural networks, artificial intelligence, sentiment analysis, control, and data mining, in order to present a unified treatment of machine learning problems and solutions. All learning algorithms are explained so that the reader can easily move from the equations in the book to a computer program.


2. Testing Model: During testing, new data was fetched through the API and passed to the LSTM model for the respective place. The LSTM predicted future values of all parameters. These predictions were passed as input to the SVM, which produced the final air quality prediction and the corresponding AQI assignment.

1.4 Results and Discussions

The open data is provided by the OpenAQ organization [13], whose aim is to help people fight air pollution by providing open data and open-source tools. OpenAQ obtains the data from government bodies and research groups and aggregates it. We used the OpenAQ API to fetch the latest data into a data frame and saved it in .csv format for computation. Figure 1.3 shows a screenshot of the data fetched on 6th June, 2020 for Visakhapatnam, India.

Table 1.1 Range of AQI categories.

AQI category (range)	PM10 (24hr)	PM2.5 (24hr)	NO2 (24hr)	O3 (8hr)	CO (8hr)	SO2 (24hr)	NH3 (24hr)	Pb (24hr)
Good (0–50)	0–50	0–30	0–40	0–50	0–1.0	0–40	0–200	0–0.5
Satisfactory (51–100)	51–100	31–60	41–80	51–100	1.1–2.0	41–80	201–400	0.5–1.0
Moderately polluted (101–200)	101–250	61–90	81–180	101–168	2.1–10	81–380	401–800	1.1–2.0
Poor (201–300)	251–350	91–120	181–280	169–208	10–17	381–800	801–1200	2.1–3.0
Very poor (301–400)	351–430	121–250	281–400	209–748	17–34	801–1600	1200–1800	3.1–3.5
Severe (401–500)	430+	250+	400+	748+	34+	1600+	1800+	3.5+
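
As a minimal sketch of how Table 1.1 can be turned into a lookup, the snippet below maps one averaged concentration to its category. The upper bounds are copied from the table; the names `UPPER_BOUNDS`, `pollutant_category`, and `overall_category` are hypothetical. Two points are assumptions rather than something the text states: values falling between two listed ranges (for example 50.5) or on a shared boundary are resolved by a simple "less than or equal to the upper bound" rule, and the overall label is taken as the worst category across pollutants.

```python
from bisect import bisect_left

# AQI categories in the order used by Table 1.1.
CATEGORIES = ["Good", "Satisfactory", "Moderately polluted",
              "Poor", "Very poor", "Severe"]

# Upper bound of every category except "Severe", copied from Table 1.1.
UPPER_BOUNDS = {
    "PM10":  [50, 100, 250, 350, 430],
    "PM2.5": [30, 60, 90, 120, 250],
    "NO2":   [40, 80, 180, 280, 400],
    "O3":    [50, 100, 168, 208, 748],
    "CO":    [1.0, 2.0, 10, 17, 34],
    "SO2":   [40, 80, 380, 800, 1600],
    "NH3":   [200, 400, 800, 1200, 1800],
    "Pb":    [0.5, 1.0, 2.0, 3.0, 3.5],
}


def pollutant_category(pollutant, value):
    """Return the Table 1.1 category for one averaged concentration."""
    if value < 0:
        raise ValueError("concentration must be non-negative")
    # Index of the first upper bound that is >= value; values above the
    # last bound fall through to "Severe".
    index = bisect_left(UPPER_BOUNDS[pollutant], value)
    return CATEGORIES[index]


def overall_category(readings):
    """Worst category across pollutants (assumed aggregation rule)."""
    worst = max(CATEGORIES.index(pollutant_category(name, value))
                for name, value in readings.items())
    return CATEGORIES[worst]
```

For example, `pollutant_category("PM2.5", 45)` falls in the 31–60 band of the table and is labeled "Satisfactory".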

1. K-Means Clustering Outcomes: As explained in the methodology section, we applied K-means clustering to our unlabeled data so that the clusters could serve as classes. To find the optimal number of clusters, the Silhouette coefficient was calculated. It is computed for each sample from the mean intra-cluster distance (a) and the mean nearest-cluster distance (b), where b is the mean distance between the sample and the members of the nearest cluster that the sample is not a part of. The Silhouette coefficient for a sample is (b − a)/max(a, b). For our experiments, we set the number of clusters to 7 using the Elbow method. After clustering, the clustered data were assigned air quality labels using the AQI table. The required ranges for the different air quality parameters are shown in Table 1.1.

We worked with six parameters: NO2, O3, PM10, PM2.5, SO2, and CO. To build the LSTM model, we trained our model for 14 different places in India: Visakhapatnam (GVMC Ram Nagar), Ajmer (Civil Lines), Alwar, Vasundhara (Ghaziabad), Gurgaon (Vikas Sadan), Bandra (Maharashtra), Bhiwadi Industrial Area, Bengaluru (BWSSB Kadabesanaha), Amritsar (Golden Temple), Anand Vihar, R K Puram, Punjabi Bagh, NSIT (Dwarka), and Sector 62 Noida. First, K-means clustering was applied.
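
The Silhouette formula (b − a)/max(a, b) can be illustrated with a short, dependency-free sketch. It assumes Euclidean distance and already-assigned cluster labels; the function names are hypothetical, and assigning a score of 0 to a single-member cluster is a common convention rather than something stated in the text.

```python
from math import dist
from statistics import mean


def silhouette_samples(points, labels):
    """Per-sample Silhouette coefficient (b - a) / max(a, b)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for a single-member cluster
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance of the point to itself contributes 0 to the sum).
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(mean(dist(point, other) for other in members)
                for other_label, members in clusters.items()
                if other_label != label)
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator else 0.0)
    return scores


def silhouette_score(points, labels):
    """Mean Silhouette coefficient over all samples."""
    return mean(silhouette_samples(points, labels))
```

A labeling that keeps nearby points together scores close to 1, while one that mixes distant points scores below 0, which is why the mean score can be compared across candidate numbers of clusters.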

2. SVM outcomes: The 1,870 data values were divided into a training set (80%) and a testing set (20%). The SVM was trained on the clustered data against air quality so that air quality could be determined from the values of all parameters. The Sklearn library was used for this [14]. The SVM was cross-validated with GridSearchCV using 10 folds (k = 10). Results on the 374 test samples are shown in Table 1.2. The best parameter set found was {C: 0.1, gamma: 0.001, kernel: linear}.
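
The following sketch shows what an 80/20 split followed by a 10-fold grid search over an SVM might look like in scikit-learn. The data here is synthetic stand-in data with three made-up classes, and the parameter grid is an assumption: the text reports only the best values found, not the grid that was searched.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the clustered data: six pollutant-like features
# and three made-up classes. These are NOT the values used in the study.
rng = np.random.default_rng(0)
centers = np.array([
    [20.0, 30.0, 40.0, 20.0, 10.0, 0.5],
    [80.0, 90.0, 150.0, 70.0, 50.0, 1.5],
    [200.0, 180.0, 300.0, 150.0, 120.0, 5.0],
])
y = rng.integers(0, 3, size=300)
X = centers[y] + rng.normal(scale=5.0, size=(300, 6))

# 80% of the samples for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Assumed search grid; it merely includes the reported best values.
param_grid = {
    "C": [0.1, 1, 10],
    "gamma": [0.001, 0.01],
    "kernel": ["linear", "rbf"],
}

# cv=10 requests 10-fold cross-validation for every parameter combination.
search = GridSearchCV(SVC(), param_grid, cv=10)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
```

After fitting, `search.best_params_` holds the winning combination and `search.score` evaluates the refitted best model on the held-out 20%.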

3. LSTM outcomes: A separate LSTM model was trained for each of the 14 places listed above. For each model, 5,000 samples were used for training and 500 samples for testing.

Each model had different values for hyperparameters such as kernel initializer, batch size, and epochs after hyperparameter tuning. We used the Keras library in Python [15]. Performance was evaluated with two metrics: Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). Table 1.3 shows the MAE and RMSE values obtained. MAE is calculated as (∑|y − x|)/n, and RMSE as √((∑(y − x)²)/n), where y is the predicted value, x is the actual value, and n is the number of samples.
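
The two metrics translate directly into code. The sketch below uses the same symbols as the text (y for the predicted value, x for the actual value, n for the number of samples); the function names are illustrative.

```python
from math import sqrt


def mean_absolute_error(predicted, actual):
    """MAE = (sum of |y - x|) / n."""
    n = len(actual)
    return sum(abs(y - x) for y, x in zip(predicted, actual)) / n


def root_mean_square_error(predicted, actual):
    """RMSE = sqrt((sum of (y - x)^2) / n)."""
    n = len(actual)
    return sqrt(sum((y - x) ** 2 for y, x in zip(predicted, actual)) / n)
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data and penalizes large individual errors more heavily.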

Figure 1.4 shows the predicted values for Bengaluru at the present hour as well as for 2 days 3 hours after 13th December, 2017. Figure 1.5 shows the predicted values for 2 days 3 hours after 6th June, 2020. We observed that, on average, Bengaluru is a cleaner city than the other cities, even during November and December. This could be due to rainy weather: Bengaluru gets rain almost every day, which washes down the majority of air pollutants and thus reduces air pollution.

Figure 1.6 shows the predicted values at the present hour and for the following 1 day 3 hours for Anand Vihar, New Delhi, after 13th December, 2017. New Delhi suffers from heavy pollution, and the observed air quality was therefore very poor. The PM2.5 level remains high, making the air toxic and likely to cause breathing problems. We have also generated an advisory for the users of the app. Figure 1.7 shows the predicted values for 1 day 3 hours for Anand Vihar, New Delhi, after 6th June, 2020. Pollution levels are drastically reduced and air quality is better, which we attribute to the imposed lockdown, with less traffic and fewer industrial emissions.

The experiments were performed for batch sizes of 10, 24, 15, 8, and 6 with epochs of 10 and 100. The MAE scores for the LSTM hyperparameters for NO2, O3, PM10, PM2.5, and SO2 are shown in Table 1.4. After careful analysis of these scores, we settled on the batch size with minimum bias.
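
A search over these batch sizes and epoch counts can be sketched as a plain grid loop. Here `train_and_score` is a hypothetical callable, supplied by the caller, that trains one model with the given settings and returns its MAE; choosing the combination with the lowest MAE is an assumption about the selection rule, which the text describes only as minimum bias.

```python
from itertools import product

# Values taken from the text.
BATCH_SIZES = [10, 24, 15, 8, 6]
EPOCHS = [10, 100]


def select_best(train_and_score, batch_sizes=BATCH_SIZES, epochs=EPOCHS):
    """Score every (batch size, epochs) pair and return the best one.

    train_and_score(batch_size, epochs) is a hypothetical function that
    trains a model and returns its MAE (lower is better).
    """
    results = {
        (batch_size, n_epochs): train_and_score(batch_size, n_epochs)
        for batch_size, n_epochs in product(batch_sizes, epochs)
    }
    best = min(results, key=results.get)
    return best, results
```

The function returns both the winning pair and the full table of scores, so the scores can be inspected per pollutant before settling on a configuration.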

4. Data Visualization: One of the main objectives of the project was to provide better visualizations for members of the general public who are not able to interpret the relations between different values of the air pollutants. We therefore generated Heat Maps of the different parameters, both individual Heat Maps for each parameter and combined Heat Maps for all parameters.

Figure 1.8 shows the Heat Map for ozone (O3) for 12th and 13th December, 2017. The map shows that O3 fluctuates most between day and night: O3 levels fall at midnight and are very high on the evening of 13th December. This could be due to heavy vehicular traffic during evening hours. Figure 1.9 shows the Heat Map for O3 for 6th to 8th June, 2020, which shows lower O3 levels during a period of less vehicular traffic and reduced industrial emissions.

Figure 1.10 shows the Heat Map for all the parameters for 11th, 12th, and 13th December, 2017, at Sector 62, Noida. The Heat Maps show that PM2.5 is the main pollution-causing parameter in the air and that it remains at dangerous levels on all days, during both day and night. Figure 1.11 shows the Heat Map for all the parameters for 6th, 7th, and 8th June, 2020, at Sector 62, Noida. The reduced levels of all pollutants, a result of the imposed lockdown, can clearly be seen in the Heat Map. However, PM2.5 still remains the top contributing factor toward pollution in the area.