Unity ML-Agents Toolkit

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that provides environments for training intelligent agents for use in games and a wide range of other simulations. ML-Agents provides implementations (based on PyTorch) of state-of-the-art algorithms so that game developers can easily train intelligent agents for 2D, 3D, and VR/AR games. Through a simple Python API, agents can be trained with reinforcement learning, imitation learning, neuroevolution, or other methods. Trained agents can serve many purposes, including controlling NPC behavior (in a variety of settings such as multi-agent and adversarial), automating the testing of game builds, and validating game design (balance) decisions before release. The ML-Agents Toolkit provides a foundation for developing AI agents in Unity's rich environments and makes that work accessible to the wider community of researchers and game developers, so it benefits game developers and AI researchers alike.

Features

  • 15+ example Unity environments
  • Support for multiple environment configurations and training scenarios
  • A flexible Unity SDK that can be integrated into your game or custom Unity scene
  • Training with two deep reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC)
  • Built-in support for imitation learning through Behavioral Cloning or Generative Adversarial Imitation Learning
  • A self-play mechanism for training agents in adversarial scenarios
  • Easily definable curriculum learning scenarios for complex tasks
  • Training of robust agents using environment randomization
  • Flexible agent control with on-demand decision making
  • Training with multiple concurrent Unity environment instances
  • Use of the Unity Inference Engine to provide native cross-platform support
  • Control of Unity environments from Python
  • Unity learning environments that can be wrapped as a gym
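
The curriculum learning idea in the list above (start with an easy version of a task and raise the difficulty as the agent improves) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the toolkit's implementation or its configuration schema: the `Lesson` class, the `next_lesson_index` function, the lesson names, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Lesson:
    """One curriculum stage (hypothetical structure, not the ML-Agents config schema)."""
    name: str
    difficulty: float        # environment parameter value used during this lesson
    reward_threshold: float  # mean reward required before moving to the next lesson


def next_lesson_index(lessons: Sequence[Lesson], current: int,
                      recent_rewards: Sequence[float], min_episodes: int = 3) -> int:
    """Return the lesson index to use after observing recent episode rewards."""
    is_last = current >= len(lessons) - 1
    # Stay put on the final lesson or when too few episodes have been observed.
    if is_last or len(recent_rewards) < min_episodes:
        return current
    mean_reward = sum(recent_rewards) / len(recent_rewards)
    # Advance by one lesson once the mean reward reaches the threshold.
    if mean_reward >= lessons[current].reward_threshold:
        return current + 1
    return current


# Illustrative values only.
curriculum = [
    Lesson("low wall", difficulty=1.0, reward_threshold=0.5),
    Lesson("medium wall", difficulty=4.0, reward_threshold=0.7),
    Lesson("high wall", difficulty=8.0, reward_threshold=0.9),
]
```

A training loop would call `next_lesson_index` after each batch of episodes and pass the current lesson's `difficulty` to the environment. Drawing that value from a range instead of fixing it would be one simple form of environment randomization.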

See the ML-Agents Overview page for a detailed description of all these features.
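
To show what controlling a gym-style environment from Python looks like, here is a minimal sketch of the classic `reset()` / `step(action)` interaction loop. It deliberately does not import the toolkit: `ToyLineEnv` is a hypothetical stand-in for a wrapped Unity scene, and `run_episode` with its random policy is an illustration only.

```python
import random
from typing import Dict, Tuple


class ToyLineEnv:
    """Stand-in environment with a classic gym-style reset()/step() interface.

    Hypothetical: a real setup would wrap a Unity scene instead. The agent starts
    at position 0 on a line and is rewarded for reaching position `goal`.
    """

    def __init__(self, goal: int = 3, max_steps: int = 20) -> None:
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self) -> int:
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action: int) -> Tuple[int, float, bool, Dict]:
        """Apply an action: 1 moves right, any other value moves left."""
        self.position += 1 if action == 1 else -1
        self.steps += 1
        reached = self.position >= self.goal
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else -0.01  # small per-step penalty
        return self.position, reward, done, {}


def run_episode(env: ToyLineEnv, rng: random.Random) -> float:
    """Roll out one episode with a random policy and return the total reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        action = rng.choice([0, 1])
        _, reward, done, _ = env.step(action)
        total += reward
    return total


if __name__ == "__main__":
    print(run_episode(ToyLineEnv(), random.Random(0)))
```

A trained policy would replace `rng.choice` with a function of the observation. The loop itself keeps the same shape.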

Releases & Documentation

Our latest stable release is Release 12. Click here to get started with the latest release of ML-Agents.

The table below lists all our releases, including the master branch, which is under active development and may be unstable. A few helpful guidelines:

  • The Versioning page outlines how we manage our GitHub releases and the versioning process for each of the ML-Agents components.
  • The Releases page contains details of the changes between releases.
  • The Migration page contains details on how to upgrade from earlier releases of the ML-Agents Toolkit.
  • The documentation links in the table below include installation and usage instructions specific to each release. Always use the documentation that corresponds to the release version you are using.

Version            Release Date        Source  Documentation  Download
master (unstable)  --                  source  docs           download
Release 12         December 22, 2020   source  docs           download
Release 11         December 21, 2020   source  docs           download
Release 10         November 18, 2020   source  docs           download
Release 9          November 4, 2020    source  docs           download
Release 8          October 14, 2020    source  docs           download
Release 7          September 16, 2020  source  docs           download
Release 6          August 12, 2020     source  docs           download
Release 5          July 31, 2020       source  docs           download
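
Because the documentation should match the release in use, it can help to derive the source and docs locations from the release number. The sketch below assumes that each numbered release is tagged `release_<N>` in the GitHub repository and that its documentation lives in a `docs` folder at that tag. Both are assumptions to check against the repository, and the helper names are hypothetical.

```python
REPO_URL = "https://github.com/Unity-Technologies/ml-agents"


def release_tag(release_number: int) -> str:
    """Git tag for a numbered release, assuming the `release_<N>` naming scheme."""
    if release_number < 1:
        raise ValueError("release number must be positive")
    return f"release_{release_number}"


def docs_url(release_number: int) -> str:
    """Docs folder at that tag (assumed layout: <repo>/tree/<tag>/docs)."""
    return f"{REPO_URL}/tree/{release_tag(release_number)}/docs"


def clone_command(release_number: int) -> str:
    """Shell command that checks out the source at the assumed release tag."""
    return f"git clone --branch {release_tag(release_number)} {REPO_URL}.git"


if __name__ == "__main__":
    print(docs_url(12))
    print(clone_command(12))
```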

Citation

If you are a researcher interested in a discussion of Unity as an AI platform, see the pre-print of our reference paper on Unity and the ML-Agents Toolkit.

If you use Unity or the ML-Agents Toolkit to conduct research, we ask that you cite the following paper as a reference: Juliani, A., Berges, V., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., Lange, D. (2020). Unity: A General Platform for Intelligent Agents. arXiv preprint arXiv:1809.02627. https://github.com/Unity-Technologies/ml-agents.

Additional Resources

A Unity Learn course, ML-Agents: Hummingbirds, provides a detailed introduction to Unity and the ML-Agents Toolkit.

We have also partnered with CodeMonkeyUnity to create tutorial videos on how to implement and use the ML-Agents Toolkit.

We have also published blog posts related to ML-Agents.

Community and Feedback

The ML-Agents Toolkit is an open-source project and we welcome contributions. If you wish to contribute, please review our contribution guidelines and code of conduct.

For problems with installing and setting up the ML-Agents Toolkit, or for discussions about how best to set up or train your agents, please create a new thread on the Unity ML-Agents forum and include as much detail as possible. If you run into any other problems using the ML-Agents Toolkit or have a specific feature request, please submit an issue.

Your opinion matters a great deal to us. Your feedback on the Unity ML-Agents Toolkit is what lets us keep improving and growing. Please take just a few minutes to let us know.

For any other comments or feedback, contact the ML-Agents team directly at ml-agents@unity3d.com.

Privacy

To improve the developer experience for the Unity ML-Agents Toolkit, we have added in-editor analytics. Please refer to "Information that is passively collected by Unity" in the Unity Privacy Policy.

License

Apache License 2.0

Korean Translation

The Korean translation of the Unity ML-Agents documentation was carried out by Hyeonjun Jang (https://github.com/JangHyeonJun) and Kyushik Min. If you find content errors or typos, please contact the person who translated the relevant document by email.

Hyeonjun Jang: totok682@naver.com

Kyushik Min: kyushikmin@gmail.com

최태혁 (Taehyeok Choi): chlxogur_@naver.com