Amit Konar - Multi-Agent Coordination
- Title: Multi-Agent Coordination
- Author: Amit Konar
- Year: unknown
- ISBN: not available
Multi-Agent Coordination: summary and description
Multi-Agent Coordination: A Reinforcement Learning Approach
Multi-Agent Coordination: Trial Excerpt
Table of Contents
1 Cover
2 Title Page: Multi‐Agent Coordination. A Reinforcement Learning Approach. Arup Kumar Sadhu, Amit Konar
3 Copyright Page
4 Preface
5 Acknowledgments
6 About the Authors
7 Chapter 1: Introduction
  1.1 Introduction
  1.2 Single Agent Planning
  1.3 Multi‐agent Planning and Coordination
  1.4 Coordination by Optimization Algorithm
  1.5 Summary
  References
8 Chapter 2: Improve Convergence Speed of Multi‐Agent Q‐Learning for Cooperative Task Planning
  2.1 Introduction
  2.2 Literature Review
  2.3 Preliminaries
  2.4 Proposed MAQL
  2.5 Proposed FCMQL Algorithms and Their Convergence Analysis
  2.6 FCMQL‐Based Cooperative Multi‐agent Planning
  2.7 Experiments and Results
  2.8 Conclusions
  2.9 Summary
  2.A More Details on Experimental Results
  References
9 Chapter 3: Consensus Q‐Learning for Multi‐agent Cooperative Planning
  3.1 Introduction
  3.2 Preliminaries
  3.3 Consensus
  3.4 Proposed CoQL and Planning
  3.5 Experiments and Results
  3.6 Conclusions
  3.7 Summary
  References
10 Chapter 4: An Efficient Computing of Correlated Equilibrium for Cooperative Q‐Learning‐Based Multi‐Robot Planning
  4.1 Introduction
  4.2 Single‐Agent Q‐Learning and Equilibrium‐Based MAQL
  4.3 Proposed Cooperative MAQL and Planning
  4.4 Complexity Analysis
  4.5 Simulation and Experimental Results
  4.6 Conclusion
  4.7 Summary
  Appendix 4.A Supporting Algorithm and Mathematical Analysis
  References
11 Chapter 5: A Modified Imperialist Competitive Algorithm for Multi‐Robot Stick‐Carrying Application
  5.1 Introduction
  5.2 Problem Formulation for Multi‐Robot Stick‐Carrying
  5.3 Proposed Hybrid Algorithm
  5.4 An Overview of FA
  5.5 Proposed ICFA
  5.6 Simulation Results
  5.7 Computer Simulation and Experiment
  5.8 Conclusion
  5.9 Summary
  Appendix 5.A Additional Comparison of ICFA
  References
12 Chapter 6: Conclusions and Future Directions
  6.1 Conclusions
  6.2 Future Directions
13 Index
14 End User License Agreement
List of Tables
1 Chapter 1
  Table 1.1 Trace of Dijkstra's algorithm for Figure 1.11.
  Table 1.2 Trace of A* algorithm from Figure 1.10.
  Table 1.3 Trace of D* algorithm from Figure 1.12.
  Table 1.4 Expected reward of R1 and R2 at MSNE.
2 Chapter 2
  Table 2.1 List of acronyms.
  Table 2.2 Details of 10 × 10 grid maps.
  Table 2.3 Run‐time complexity of Algorithm 2.3 over reference algorithms in d...
  Table 2.4 Run‐time complexity of Algorithm 2.3 over reference algorithms in s...
  Table 2.5 Time taken by Khepera‐II mobile robots to reach a team‐goal with diffe...
  Table 2.A.1 Number of joint state–action pair converged in deterministic situ...
  Table 2.A.2 Number of joint state–action pair converged in stochastic situati...
  Table 2.A.3 Count of team‐goal explored in the deterministic situation for tw...
  Table 2.A.4 Count of team‐goal explored in the stochastic situation for two a...
3 Chapter 3
  Table 3.1 List of acronyms.
  Table 3.2 Planning performance.
4 Chapter 4
  Table 4.1 Average of the percentage (%) of joint state–action pair converged ...
  Table 4.2 Average run‐time complexity of different learning algorithms (secon...
  Table 4.3 Average run‐time complexity of different planning algorithms (secon...
  Table 4.4 Average run‐time complexity of different planning algorithms (secon...
  Table 4.A.1 Time‐complexity analysis.
5 Chapter 5
  Table 5.1 Comparative analysis of performance of the proposed ICFA with other...
  Table 5.2 Comparative analysis of performance of the proposed ICFA with other...
  Table 5.3 Average rankings obtained through Friedman's test
  Table 5.4 Comparison of number of steps, average path traversed, and average ...
  Table 5.A.1 No. of successful runs out of 25 runs and success performance in p...
List of Illustrations
1 Chapter 1
  Figure 1.1 Single agent system.
  Figure 1.2 Three discrete states in an environment.
  Figure 1.3 Robot executing action Right (R) at state s1 and moves to the nex...
  Figure 1.4 Deterministic state‐transition.
  Figure 1.5 Stochastic state‐transition.
  Figure 1.6 Two‐dimensional 5 × 5 grid environment.
  Figure 1.7 Refinement approach in robotics.
  Figure 1.8 Hierarchical tree.
  Figure 1.9 Hierarchical model.
  Figure 1.10 Two‐dimensional 3 × 3 grid environment.
  Figure 1.11 Corresponding graph of Figure 1.10.
  Figure 1.12 Two‐dimensional 3 × 3 grid environment with an obstacle.
  Figure 1.13 Structure of reinforcement learning.
  Figure 1.14 Variation of average reward with the number of trial for differe...
  Figure 1.15 Correlation between the RL and DP.
  Figure 1.16 Single agent Q‐learning.
  Figure 1.17 Possible next state in stochastic situation.
  Figure 1.18 Single agent planning.
  Figure 1.19 Multi‐agent system with m agents.
  Figure 1.20 Robots executing joint action at joint state <1, 8> and m...
  Figure 1.21 Classification of multi‐robot systems.
  Figure 1.22 Hands gestures in rock‐paper‐scissor game: (a) rock, (b) paper, ...
  Figure 1.23 Rock‐paper‐scissor game.
  Figure 1.24 Reward mapping from joint Q‐table to reward matrix.
  Figure 1.25 Pure strategy Nash equilibrium evaluation. (a) Fix A1 = L and A2...
  Figure 1.26 Evaluation of mixed strategy Nash equilibrium.
  Figure 1.27 Reward matrix for tennis game.
  Figure 1.28 Reward matrix of in a common reward two‐agent static game.
  Figure 1.29 Pure strategy Egalitarian equilibrium, which is one variant of C...
  Figure 1.30 Game of chicken.
  Figure 1.31 Reward matrix in the game of chicken.
  Figure 1.32 Constant‐sum game.
  Figure 1.33 Matching pennies.
  Figure 1.34 Reward matrix in Prisoner's Dilemma game.
  Figure 1.35 Correlation among the MARL, DP, and GT.
  Figure 1.36 Classification of multi‐agent reinforcement learning.
  Figure 1.37 The climbing game reward matrix.
  Figure 1.38 The penalty game reward matrix.
  Figure 1.39 The penalty game reward matrix.
  Figure 1.40 Individual Q‐values obtained in the climbing game reward matrix ...
  Figure 1.41 The penalty game reward matrix.
  Figure 1.42 Individual Q‐values obtained in the penalty game reward matrix b...
  Figure 1.43 Reward matrix of a three‐player coordination game.
  Figure 1.44 Reward matrix in a two‐player two‐agent game.
  Figure 1.45 Nonstrict EDNP in normal‐form game.
  Figure 1.46 Multistep negotiation process between agent A and B.
  Figure 1.47 Multi‐robot coordination for the well‐known stick‐carrying probl...
  Figure 1.48 Multi‐robot local planning by swarm/evolutionary algorithm.
  Figure 1.49 Surface plot of (1.97).
  Figure 1.50 Surface plot of (1.98).
  Figure 1.51 Steps of Differential evolution (DE) algorithm [132].
2 Chapter 2
  Figure 2.1 Block diagram of reinforcement learning (RL).
  Figure 2.2 Experimental workspace for two agents during the learning phase....
  Figure 2.3 Convergence plot of NQLP12 and reference algorithms for two agent...
  Figure 2.4 Average of average reward (AAR) plot of NQLP12 and reference algo...
  Figure 2.5 Joint action selection strategy in EQLP12 and reference algorithm...
  Figure 2.6 Cooperative path planning to carry a triangle by three robots in ...
  Figure 2.7 Cooperative path planning to carry a stick by two Khepera‐II mobi...
  Figure 2.8 Cooperative path planning to carry a stick by two Khepera‐II mobi...
  Figure 2.A.1 Convergence plot of FCMQL and reference algorithms for two agen...
  Figure 2.A.2 Convergence plot of EQLP12 and reference algorithms for three a...
  Figure 2.A.3 Convergence plot of EQLP12 and reference algorithms for four ag...
  Figure 2.A.4 CR versus learning epoch plot for FCMQL and reference algorithm...
  Figure 2.A.5 Average of average reward (AAR) plot of FCMQL and reference alg...
  Figure 2.A.6 Average of average reward (AAR) plot of EQLP12 and reference al...
  Figure 2.A.7 Average of average reward (AAR) plot of EQLP12 and reference al...
  Figure 2.A.8 Joint action selection strategy in EQLP12 and reference algorit...
  Figure 2.A.9 Joint action selection strategy in EQLP12 and reference algorit...
  Figure 2.A.10 Path planning with stick in deterministic situation by: (a) NQ...
  Figure 2.A.11 Path planning with stick in stochastic situation by: (a) NQIMP...
  Figure 2.A.12 Path planning with triangle in stochastic situation by: (a) NQ...
  Figure 2.A.13 Path planning with square in stochastic situation by: (a) NQIM...
  Figure 2.A.14 Path planning with square in deterministic situation by: (a) N...