Reinforcement Learning for Resilient Control in Cooperative and Adversarial Multi-agent Networks: CPS Applications in Microgrid and Human-Robot Interactions

Time:

2015-11-05 10:00:00

Source:

School of Automation Science and Engineering

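Speaker: F. L. Lewis

Talk time: 10:00 a.m., November 5, 2015

Venue: 6th-floor conference room, School of Automation Science and Engineering, South China University of Technology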

**********************************************************

Abstract:

The interactions of multiple dynamical agents have applications in vehicle formation control, trust propagation in autonomous teams, synchronization phenomena in complex systems, aerospace and satellite coordination, and the relations of multiple interacting human-robotic systems. The relations between the local feedback control system design for each agent and the restrictions imposed by allowed communication topologies are intriguing and can result in either beneficial emergent behaviors or unexpected detrimental overall performance of the integrated autonomous team. The concept of Optimality provides an organizational principle for behavior that can be exploited to increase the resilience of multiple interacting dynamical agents in communication networks. In fact, it was shown by Charles Darwin that cognitive learning principles organized along the lines of optimality and resource conservation over long timescales are responsible for the phenomenon of Natural Selection of multiple species.

Overview:

In this discussion we show that formulating team performance objectives in terms of optimality principles allows the unification of ideas from optimal control and adaptive control by using cognitive reinforcement learning principles. This leads to a new class of cognitive multi-agent controllers that allow each agent to learn optimal solutions online using real-time measurements of the actions of neighboring agents in the network. These ideas are rooted in Hamilton’s principle in classical mechanics and allow applications to cooperative multi-player graphical games. Applications are then made to cyber-physical systems, in particular to the resilient and efficient distributed control of renewable energy microgrids.

Next, we show how to solve the optimal leader-follower tracker problem online, using reinforcement learning methods and data measured in real time. This allows applications in heterogeneous multi-agent systems consisting of multiple vehicles with dynamics that may not be the same. Using optimal tracker design, a two-layer design method is then developed for CPS human-robot interactions based on observed experimental human factors studies.

Finally, some ideas for modeling and control of adversarial networks are discussed, where the interactions among multiple agents may be antagonistic. Included are basic principles of zero-sum differential games and a new notion of bipartite graphs. In bipartite graphs, distrust is captured by negative communication weights between agents. Some rather intriguing behaviors are observed in bipartite multi-agent interactions, including structural balance, which leads to the emergence of exactly two opposing adversarial teams, as observed in sports, warfare, and the interactions of international conglomerations in macroeconomics.
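
The link between negative weights and two opposing teams can be illustrated with a small sketch. It is not taken from the talk: it assumes an undirected graph with nonzero edge weights and applies the standard structural-balance test for signed graphs, under which a graph is balanced when its agents can be split into two teams so that every positive (trust) edge stays inside a team and every negative (distrust) edge crosses between teams. The function name `split_into_two_teams` and the example edge lists are hypothetical.

```python
from collections import deque


def split_into_two_teams(num_agents, signed_edges):
    """Return +1/-1 team labels if the signed graph is structurally
    balanced, otherwise None.

    signed_edges: iterable of (i, j, weight) for an undirected graph.
    Assumption: weights are nonzero; weight > 0 means trust and
    weight < 0 means distrust.
    """
    neighbors = [[] for _ in range(num_agents)]
    for i, j, weight in signed_edges:
        sign = 1 if weight > 0 else -1
        neighbors[i].append((j, sign))
        neighbors[j].append((i, sign))

    team = [0] * num_agents  # 0 means "not yet assigned"
    for start in range(num_agents):
        if team[start] != 0:
            continue
        team[start] = 1
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j, sign in neighbors[i]:
                # Trusted neighbors share a team; distrusted ones oppose.
                expected = team[i] * sign
                if team[j] == 0:
                    team[j] = expected
                    queue.append(j)
                elif team[j] != expected:
                    return None  # conflicting labels: not balanced
    return team


# Two allied pairs that distrust each other: balanced.
balanced = [(0, 1, 1.0), (2, 3, 1.0), (0, 2, -1.0), (1, 3, -1.0)]
print(split_into_two_teams(4, balanced))  # [1, 1, -1, -1]

# Three mutually distrusting agents: no two-team split exists.
unbalanced = [(0, 1, -1.0), (1, 2, -1.0), (0, 2, -1.0)]
print(split_into_two_teams(3, unbalanced))  # None
```

The breadth-first labeling is only one way to run this test. In this sketch, agents in a connected component with no distrust edges all land on the same team, and agents with no edges at all default to team +1.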