[
  {
    "path": "README.md",
    "content": "# MARL Resources Collection\nThis is a collection of Multi-Agent Reinforcement Learning (MARL) resources. The purpose of this repository is to give beginners a better understanding of MARL and to accelerate the learning process. Note that some of the resources are written in Chinese, and only important, highly cited papers are listed.\n\nI will continually update this repository, and I welcome suggestions (missing important papers, missing important resources, invalid links, etc.). This is only a first draft so far, and I'll add more resources in the next few months.\n\nThis repository is not for commercial purposes.\n\nMy email: chenhao915@mails.ucas.ac.cn\n\n\n## Overview\n* [Courses](https://github.com/TimeBreaker/MARL-resources-collection#courses)\n* [Important Conferences](https://github.com/TimeBreaker/MARL-resources-collection#important-conferences)\n* [Reviews](https://github.com/TimeBreaker/MARL-resources-collection#reviews)\n* [Books](https://github.com/TimeBreaker/MARL-resources-collection#books)\n* [Open Source Environments](https://github.com/TimeBreaker/MARL-resources-collection#open-source-environments)\n* [Research Groups](https://github.com/TimeBreaker/MARL-resources-collection#research-groups)\n* [Companies](https://github.com/TimeBreaker/MARL-resources-collection#companies)\n* [Paper Lists](https://github.com/TimeBreaker/MARL-resources-collection#paper-lists)\n* [Talks](https://github.com/TimeBreaker/MARL-resources-collection#talks)\n* [Useful Resources](https://github.com/TimeBreaker/MARL-resources-collection#useful-resources)\n* [TODO](https://github.com/TimeBreaker/MARL-resources-collection#todo)\n\n\n## Courses\n* [RLChina](https://rlchina.org/)\n* [UCL Multi-agent AI](https://www.bilibili.com/video/BV1fz4y1S72S)\n* [SJTU Multi-Agent Reinforcement Learning Tutorial](http://wnzhang.net/tutorials/marl2018/index.html)\n* [SJTU Reinforcement Learning](https://hrl.boyuai.com/slides/)\n\n\n## Important Conferences\n* AAMAS, 
AAAI, IJCAI, ICLR, ICML, NIPS\n* Sorted by difficulty (roughly)\n\n\n## Reviews\n### Recent Reviews (Since 2019)\n* [A Survey and Critique of Multiagent Deep Reinforcement Learning](https://arxiv.org/pdf/1810.05587v3)\n* [An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective](https://arxiv.org/abs/2011.00583v2)\n* [Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms](https://arxiv.org/abs/1911.10635v1)\n* [A Review of Cooperative Multi-Agent Deep Reinforcement Learning](https://arxiv.org/abs/1908.03963)\n* [Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning](https://arxiv.org/abs/1906.04737)\n* [A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity](https://arxiv.org/abs/1707.09183v1)\n* [Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications](https://arxiv.org/pdf/1812.11794.pdf)\n* [A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems](https://www.researchgate.net/publication/330752409_A_Survey_on_Transfer_Learning_for_Multiagent_Reinforcement_Learning_Systems)\n\n### Other Reviews (Before 2019)\n* [If multi-agent learning is the answer, what is the question?](https://ai.stanford.edu/people/shoham/www%20papers/LearningInMAS.pdf)\n* [Multiagent learning is not the answer. It is the question](https://core.ac.uk/download/pdf/82595758.pdf)\n* [Is multiagent deep reinforcement learning the answer or the question? 
A brief survey](https://arxiv.org/abs/1810.05587v1)   Note that [A Survey and Critique of Multiagent Deep Reinforcement Learning](https://arxiv.org/pdf/1810.05587v3) is an updated version of this paper with the same authors.\n* [Evolutionary Dynamics of Multi-Agent Learning: A Survey](https://www.researchgate.net/publication/280919379_Evolutionary_Dynamics_of_Multi-Agent_Learning_A_Survey)\n* (Worth reading although they're not recent reviews.)\n\n\n## Books\n* [Multiagent systems: Algorithmic, game-theoretic, and logical foundations](http://www.masfoundations.org/download.html)\n* [Multi‐Agent Machine Learning: A Reinforcement Approach](https://www.engineerrefe.com/multi-agent-machine-learning/)\n\n\n## Open Source Environments\n* StarCraft Micromanagement Environment\n   * [pymarl](https://github.com/oxwhirl/pymarl) is the original environment mentioned in the paper [The StarCraft Multi-Agent Challenge](https://arxiv.org/abs/1902.04043). Note that pymarl is based on [SMAC](https://github.com/oxwhirl/smac).\n   * [MARL-Algorithms](https://github.com/starry-sky6688/MARL-Algorithms) is a simplified implementation of [pymarl](https://github.com/oxwhirl/pymarl)\n   * [EPyMARL](https://github.com/uoe-agents/epymarl) is an extended Python MARL framework with more environments (Level-Based Foraging, Multi-Robot Warehouse, Multi-Agent Particle Environment) and more algorithms. [Paper](https://link.zhihu.com/?target=https%3A//arxiv.org/abs/2006.07869)\n   * [pymarl2](https://github.com/hijkzzz/pymarl2) added code-level tricks to the original pymarl. 
[Paper](https://arxiv.org/abs/2102.03479)\n* [Multi-Agent Particle Environment](https://github.com/openai/multiagent-particle-envs)  [PyTorch Implementation](https://github.com/shariqiqbal2810/maddpg-pytorch)\n* [Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents](https://github.com/openai/neural-mmo)\n* [OpenSpiel: A Framework for Reinforcement Learning in Games](https://github.com/deepmind/open_spiel)\n* [Hanabi-learning-environment](https://github.com/deepmind/hanabi-learning-environment)\n* [RoboCup 2D Half Field Offense](https://github.com/LARG/HFO)\n* [Pommerman](https://www.pommerman.com/)\n* [Multi-agent-emergence-environments](https://github.com/openai/multi-agent-emergence-environments)\n* [Google Research Football](https://github.com/google-research/football)\n* [MAgent](https://github.com/PettingZoo-Team/MAgent) Note that [the original project](https://github.com/geek-ai/MAgent) is no longer maintained.\n* [DI-engine](https://github.com/opendilab/DI-engine)\n* [MARLlib](https://github.com/Replicable-MARL/MARLlib) is a MARL extension for RLlib\n* [Multiagent Mujoco](https://github.com/schroederdewitt/multiagent_mujoco)\n* [PettingZoo](https://github.com/Farama-Foundation/PettingZoo)  [website](https://www.pettingzoo.ml/)\n* [Safe Policy Optimization (SafePO)](https://github.com/PKU-MARL/Safe-Policy-Optimization)\n* (I personally recommend the first two environments for beginners, especially EPyMARL.)\n\n\n## Research Groups\nOrganization|Researcher|Lab homepage (if any)\n--|:--:|--:\nOxford|[Shimon Whiteson](https://www.cs.ox.ac.uk/people/shimon.whiteson/), [Jakob N. 
Foerster](https://www.jakobfoerster.com/)|[link](http://whirl.cs.ox.ac.uk/ ) \nUniversity College London (UCL)|[Jun Wang](http://www0.cs.ucl.ac.uk/staff/Jun.Wang/)|\nTsinghua University (THU)|[Chongjie Zhang](http://people.iiis.tsinghua.edu.cn/~zhang/)|[link](http://group.iiis.tsinghua.edu.cn/~milab/index.html)\nTsinghua University (THU)|[Yi Wu](http://jxwuyi.weebly.com/)|\nPeking University (PKU)|[Zongqing Lu](https://z0ngqing.github.io/)|\nHUAWEI|[Hangyu Mao](https://maohangyu.github.io/)|\nNanjing University (NJU)|[Yang Yu](http://www.lamda.nju.edu.cn/yuy/)|\nFacebook|[Yuandong Tian](http://yuandong-tian.com/)|\nTianjin University (TJU)|[Jianye Hao](http://faculty.tju.edu.cn/156102/zh_CN/index/24194/list/index.htm)|[link](http://www.icdai.org/)\nUniversity of Illinois at Urbana-Champaign (UIUC)|[Kaiqing Zhang](https://kzhang66.github.io/index.html)|\nPeking University (PKU)|[Yaodong Yang](https://www.yangyaodong.com)|[Link](https://github.com/PKU-MARL)\nNanyang Technological University (NTU)|[Bo An](https://personal.ntu.edu.sg/boan/index.html)|\nShanghai Jiao Tong University (SJTU)|[Weinan Zhang](http://wnzhang.net/)|[link](http://apex.sjtu.edu.cn/)\nUniversity of Chinese Academy of Sciences (UCAS)|[Haifeng Zhang](https://pkuzhf.github.io/)|[link](http://marl.ia.ac.cn/index.html)\nUniversity of Edinburgh|[Stefano V. 
Albrecht](https://www.turing.ac.uk/people/researchers/stefano-albrecht)|[link](https://agents.inf.ed.ac.uk/) [GitHub](https://github.com/uoe-agents)\nUniversity College London (UCL)|UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab |[Link](https://dark.cs.ucl.ac.uk/)\nUniversity of Maryland|[Furong Huang](http://furong-huang.com/)|[Link](http://furong-huang.com/)\n\n\n## Companies\n* [DeepMind](https://deepmind.com/)\n* [OpenAI](https://openai.com/)\n* [Facebook](https://ai.facebook.com/)\n* [Tencent](https://ai.tencent.com/ailab/zh/index)\n* [NetEase](https://fuxi.163.com/#/home)\n* [Huawei](https://www.noahlab.com.hk/#/home)\n* [Parametrix.ai](https://chaocanshu.cn/)\n* [Inspir.ai](http://www.inspirai.com/)\n\n\n## Paper Lists\n* https://github.com/TimeBreaker/Multi-Agent-Reinforcement-Learning-papers\n* https://github.com/TimeBreaker/MARL-papers-with-code\n* https://github.com/LantaoYu/MARL-Papers\n\n\n## Talks\n### In English\n* https://www.youtube.com/watch?v=W_9kcQmaWjo\n* https://www.youtube.com/watch?v=TMTT2z8lifA\n* https://www.youtube.com/watch?v=Yd6HNZnqjis\n* https://www.youtube.com/watch?v=ufFue5_gR4c\n\n### In Chinese\n* https://www.techbeat.net/talk-info?id=501\n* https://www.bilibili.com/video/av457780236/\n* https://space.bilibili.com/551888585/channel/detail?cid=167587\n* https://www.bilibili.com/video/BV1ig4y1v7xU\n* https://www.bilibili.com/video/BV18z411q7Kc\n* https://www.bilibili.com/video/BV1k5411V7ue\n\n\n## Useful Resources\n### In English\n* https://dblp.uni-trier.de/\n* https://paperswithcode.com/\n* https://www.connectedpapers.com\n* https://deeplearn.org\n* https://spinningup.openai.com/\n* https://github.com/openai/spinningup\n* https://github.com/Jinjiarui/hrl-papers\n\n### In Chinese\n* http://www.neurondance.com/\n* https://www.zhihu.com/question/376068768\n* https://www.zhihu.com/question/323584412\n* https://zhuanlan.zhihu.com/p/372558232\n* 
https://space.bilibili.com/4801051?spm_id_from=333.788.b_765f7570696e666f.2\n* https://www.zhihu.com/people/tian-yuan-dong\n* https://www.zhihu.com/people/eyounx\n* https://www.zhihu.com/people/wan-shang-zhu-ce-de\n* WeChat public account: AIORHHC; RLCN\n* https://www.bilibili.com/video/av925922430/\n* https://www.bilibili.com/video/av626777400/\n* https://github.com/NeuronDance/DeepRL\n\n\n## TODO\n* The Research Groups part needs to be completed\n* The Companies part needs to be completed\n* The Useful Resources part needs to be perfected\n\n\n## Citation\n\nIf you find this repository useful, please cite our repo:\n```\n@misc{chen2021collection,\n  author={Chen, Hao},\n  title={A Collection of Multi-Agent Reinforcement Learning Resources},\n  year={2021},\n  publisher = {GitHub},\n  journal = {GitHub Repository},\n  howpublished = {\\url{https://github.com/TimeBreaker/MARL-resources-collection}}\n}\n```\n"
  }
]