lucadipalma committed on
Commit 2d2d677 · 1 Parent(s): 983e23f

update readme, add video

Files changed (4)
  1. README.md +102 -1
  2. graph.png +0 -0
  3. pages/home.py +2 -10
  4. support/game_settings.py +43 -24
README.md CHANGED
@@ -13,4 +13,105 @@ tags:
  - mcp-in-action-track-creative
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ # 🧠 Agentic Codenames Arena
+
+ ![Meme](assets/meme.png)
+
+ **Watch, or join, LLMs battling it out in Codenames.**
+
+ [Demo on YouTube](https://youtu.be/DKIfJ-j-GEg?si=sRXrr5XtP0MOvq1T)
+
+ [My post on LinkedIn](https://www.linkedin.com/posts/luca-di-palma-99024a1b7_most-of-us-use-llms-to-create-reports-write-activity-7400225424770932736-OTPU?utm_source=share&utm_medium=member_desktop&rcm=ACoAADJnVPwBh-8LoV25AQVeclIBTKNuOP6rr08)
+
+ ---
+
+ ## 🧩 What This App Does
+
+ **Agentic Codenames Arena** is an interactive dashboard where teams of LLMs compete in the game of *Codenames*.
+ Two teams, **Red** and **Blue**, face off in a **4v4 setup**, with each team composed of:
+
+ * **1 Boss**: Provides the clue and clue number for each turn.
+ * **1 Captain**: Coordinates the team’s reasoning, synthesizes the agents’ suggestions, and ultimately selects the final words to “touch”.
+ * **2 Players**: Collaborate with the Captain, proposing interpretations, evaluating associations, and contributing to the team’s final decisions.
+
+
+ The internal **communication and coordination architecture is built using LangGraph**, enabling structured multi-agent reasoning and transparent agent-to-agent interactions.
+
+ Below is the LangGraph diagram illustrating how the different roles communicate during each turn:
+
+ ![LangGraph Architecture](graph.png)
+
+ You can either **sit back and watch fully autonomous LLM teams play**, or **step in as a human Boss** to lead your AI teammates with your own clues.
+
+ ---
+
+ ## 🤖 How It Works
+
+ ### **LLM Teams**
+
+ Build teams from several providers: OpenAI, Google, Anthropic, HuggingFace...
+ Each model plays autonomously using its own reasoning chain and game strategy.
+
+ ### **Two Gameplay Modalities**
+
+ #### **1️⃣ Observation Mode — Watch AIs Battle**
+
+ Sit back and spectate.
+ See how different models reason about clues, decide associations, and occasionally produce *hilariously misaligned* guesses.
+
+ You'll see:
+
+ * Model-to-model conversations
+ * Reasoning traces
+ * Turn-by-turn decisions
+ * How each team coordinates across multiple rounds
+
+ Perfect for AI benchmarking, research, or just entertainment.
+
+ #### **2️⃣ Human Boss Mode — Enter the Fight**
+
+ Become the Boss for either team and give your own clue + number.
+ Your AI teammates will interpret your hint and take their guesses.
+
+ ---
+
+ ## 🧠 Why It’s Interesting
+
+ * **Compare LLM reasoning styles:**
+ Watch how different models interpret associations, analogies, and subtle semantic cues.
+
+ * **Analyze team dynamics:**
+ Some models coordinate beautifully. Others… not so much.
+ Observe emergent cooperation, miscommunication, or unexpected strategies.
+
+ * **Experiment with human–AI collaboration:**
+ Test how effective your clues are with LLM teammates.
+ Try pushing the limits with creative, cryptic, or minimalist hints.
+
+ ---
+
+ ## 🕹️ Main Features
+
+ * **Create & customize teams** using any mix of LLMs
+ * **Switch between AI vs AI** and **Human vs AI** modes
+ * **Detailed per-turn logs** for all model decisions
+ * **Transparent reasoning chains**
+ * **Interactive UI** for watching matches play out
+ * **Match history & analytics dashboard**
+
+ ---
+
+ ## 📊 Stats & Analytics
+
+ All games played in the Arena are stored in a database.
+ The Stats section of the app includes:
+
+ * **Model win/loss rates** across all recorded matches
+ * **Performance comparisons** between model families (OpenAI vs Google vs …)
+ * **Historical match logs** for replay & analysis
+ * **Leaderboards** highlighting the best-performing models
+
+ This turns the Arena into a dynamic benchmarking tool for evaluating LLM semantic reasoning, coordination abilities, and reliability under pressure.
+
+
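The "Stats & Analytics" bullets in the README hunk describe win/loss rates and leaderboards computed from stored matches, but the commit does not show how they are derived. A minimal sketch of that aggregation follows. It assumes a hypothetical `MatchResult` record with one winning and one losing model name; the app's real database schema and its mixed-model teams are not visible in this diff.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchResult:
    """Hypothetical record of one finished game; field names are assumptions."""
    winner_model: str
    loser_model: str


def leaderboard(matches):
    """Return (model, wins, losses, win_rate) tuples, best win rate first."""
    wins = Counter(m.winner_model for m in matches)
    losses = Counter(m.loser_model for m in matches)
    rows = []
    for model in sorted(set(wins) | set(losses)):
        played = wins[model] + losses[model]
        rows.append((model, wins[model], losses[model], wins[model] / played))
    # Highest win rate first; ties are broken by the number of wins.
    return sorted(rows, key=lambda row: (row[3], row[1]), reverse=True)


# Placeholder model names, not real results.
sample = [
    MatchResult("model-a", "model-b"),
    MatchResult("model-a", "model-c"),
    MatchResult("model-b", "model-c"),
]
board = leaderboard(sample)
```

Grouping by model family instead of by model would only need a different key when building the two counters.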
graph.png DELETED
Binary file (36.2 kB)
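The README hunk presents graph.png as the LangGraph diagram of how the roles communicate during a turn. The image is not recoverable here, so the following is only a plain-Python sketch of the flow the README text describes: the Boss gives a clue, the Players propose words, and the Captain selects the final picks. It does not use LangGraph, and every name in it (`Clue`, `play_turn`, the `toy_*` functions) is hypothetical; ordinary functions stand in for the LLM calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clue:
    word: str
    number: int  # how many board words the clue is meant to cover


def play_turn(board, boss, players, captain):
    """One turn: Boss gives a clue, Players propose, Captain picks the final words."""
    clue = boss(board)
    proposals = [player(clue, board) for player in players]
    picks = captain(clue, proposals)
    # In this sketch the team never touches more words than the clue number.
    return clue, picks[: clue.number]


# Toy stand-ins for the LLM agents (assumptions, not the app's prompts or logic).
def toy_boss(board):
    return Clue(word="ocean", number=2)


def toy_player_one(clue, board):
    return [word for word in board if word in {"wave", "shark"}]


def toy_player_two(clue, board):
    return [word for word in board if word in {"wave", "desert"}]


def toy_captain(clue, proposals):
    # Rank words by how many Players proposed them; ties keep first-seen order.
    votes = {}
    for proposal in proposals:
        for word in proposal:
            votes[word] = votes.get(word, 0) + 1
    return sorted(votes, key=lambda word: -votes[word])


clue, picks = play_turn(
    ["wave", "desert", "shark", "piano"],
    toy_boss,
    [toy_player_one, toy_player_two],
    toy_captain,
)
```

Human Boss mode would correspond to replacing `toy_boss` with a function that returns the clue typed by the user.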
 
pages/home.py CHANGED
@@ -1,19 +1,11 @@
  import gradio as gr
- from support.game_settings import APP_DESCRIPTION, GAME_RULES_HTML
+ from support.game_settings import APP_DESCRIPTION_HTML, GAME_RULES_HTML

  with gr.Blocks(fill_width=True) as demo:
      # Rules section with HTML
      with gr.Row(elem_id="row_description", equal_height=True):
-         gr.Markdown(APP_DESCRIPTION, elem_id="app_description")
+         gr.HTML(APP_DESCRIPTION_HTML, elem_id="app_description")
          gr.HTML(GAME_RULES_HTML, elem_id="rules_accordion")

-     # with gr.Row():
-     # gr.HTML("""
-     # <div style="text-align: center; margin: 2rem 0;">
-     # <p style="font-size: 1.2rem;">Ready to start playing?</p>
-     # <p style="color: #666;">Navigate to the <strong>Play</strong> page to begin your game!</p>
-     # </div>
-     # """)
-
  if __name__ == "__main__":
      demo.launch()
support/game_settings.py CHANGED
@@ -1,26 +1,45 @@
- APP_DESCRIPTION = """
- ### 🧩 What This App Does
-
- This dashboard lets you watch (or join!) teams of Large Language Models (LLMs) play **Codenames** against each other.
- Two teams — **Red** and **Blue** — face off in a 4v4 format. Each team has a **Boss** and **three Agents** working together to identify their team’s words before the other side does.
-
- ### 🤖 How It Works
-
- * **LLM Teams:** You can assemble teams using different LLMs (e.g., GPT, Claude, Gemini, or OpenSource models...).
- * **Human Mode:** You can also jump in as a **Boss** yourself, giving clues to your AI teammates and seeing how well they interpret your hints.
- * **Observation Mode:** Prefer to just watch? Sit back and enjoy the game unfold, analyzing how different models reason, cooperate, and sometimes hilariously misfire.
-
- ### 🧠 Why It’s Interesting
-
- * **Compare LLM reasoning styles:** See how different models interpret subtle associations and language cues.
- * **Team Dynamics:** Watch how collaboration (or confusion) emerges between AIs when they have to coordinate across multiple turns.
- * **Human-AI Interaction:** Experiment with leading a team of LLMs and discover how clearly (or creatively) you need to communicate to win.
-
- ### 🕹️ Main Features
-
- * Create and customize teams with any available LLMs.
- * Switch between **AI vs AI** and **Human vs AI** modes.
- * View reasoning and chat logs for each model’s decisions.
+ APP_DESCRIPTION_HTML = """
+ <div style="display: flex; flex-direction: column; gap: 20px;">
+ <div><h3>🎥 Watch the Demo</h3></div>
+ <div style="display: flex; justify-content: center; margin-top: 20px;">
+ <iframe
+ width="560"
+ height="315"
+ src="https://www.youtube.com/embed/DKIfJ-j-GEg"
+ frameborder="0"
+ style="border-radius:12px; width:100%; max-width:560px;"
+ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+ allowfullscreen>
+ </iframe>
+ </div>
+ <div>
+ <h3>🧩 What This App Does</h3>
+ <p>This dashboard lets you watch (or join!) teams of Large Language Models (LLMs) play <strong>Codenames</strong> against each other.
+ Two teams — <strong>Red</strong> and <strong>Blue</strong> — face off in a 4v4 format. Each team has a <strong>Boss</strong> and <strong>three Agents</strong> working together to identify their team's words before the other side does.</p>
+
+ <h3>🤖 How It Works</h3>
+ <ul>
+ <li><strong>LLM Teams:</strong> You can assemble teams using different LLMs (e.g., GPT, Claude, Gemini, or OpenSource models...).</li>
+ <li><strong>Human Mode:</strong> You can also jump in as a <strong>Boss</strong> yourself, giving clues to your AI teammates and seeing how well they interpret your hints.</li>
+ <li><strong>Observation Mode:</strong> Prefer to just watch? Sit back and enjoy the game unfold, analyzing how different models reason, cooperate, and sometimes hilariously misfire.</li>
+ </ul>
+
+ <h3>🧠 Why It's Interesting</h3>
+ <ul>
+ <li><strong>Compare LLM reasoning styles:</strong> See how different models interpret subtle associations and language cues.</li>
+ <li><strong>Team Dynamics:</strong> Watch how collaboration (or confusion) emerges between AIs when they have to coordinate across multiple turns.</li>
+ <li><strong>Human-AI Interaction:</strong> Experiment with leading a team of LLMs and discover how clearly (or creatively) you need to communicate to win.</li>
+ <li><strong>Benchmarking & Analytics:</strong> All games are stored in a database. The Stats section includes model win/loss rates, performance comparisons between model families, and leaderboards.</li>
+ </ul>
+
+ <h3>🕹️ Main Features</h3>
+ <ul>
+ <li>Create and customize teams with any available LLMs.</li>
+ <li>Switch between <strong>AI vs AI</strong> and <strong>Human&AI vs AI</strong> modes.</li>
+ <li>View reasoning and chat logs for each model's decisions.</li>
+ </ul>
+ </div>
+ </div>
  """

  GAME_RULES_HTML = """
@@ -509,7 +528,7 @@ ALL_MODELS = sorted({
  # Custom header HTML
  custom_header = """
  <div class="custom-navbar">
- <div class="navbar-title">🕵️ Agentic Codenames</div>
+ <div class="navbar-title">🕵️ Agentic Codenames Arena</div>
  <div class="navbar-links">
  <a href="#" class="nav-link active" data-tab-id="home_id">Home</a>
  <a href="#" class="nav-link" data-tab-id="play_id">Play</a>
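The `APP_DESCRIPTION_HTML` hunk hard-codes the YouTube iframe for the demo video. If the video ever needs to be swapped, the same fragment could be generated from a video ID. The sketch below assumes a hypothetical helper, `youtube_embed_html`, which is not part of this commit; it reuses only the attributes that appear in the diff and escapes the ID before placing it in the `src` attribute.

```python
from html import escape


def youtube_embed_html(video_id: str, width: int = 560, height: int = 315) -> str:
    """Build a centered YouTube iframe fragment (hypothetical helper)."""
    # Escape the ID so a stray quote cannot break out of the src attribute.
    src = f"https://www.youtube.com/embed/{escape(video_id, quote=True)}"
    return (
        '<div style="display: flex; justify-content: center; margin-top: 20px;">\n'
        f'<iframe width="{width}" height="{height}" src="{src}" frameborder="0"\n'
        f'style="border-radius:12px; width:100%; max-width:{width}px;"\n'
        'allow="accelerometer; autoplay; clipboard-write; encrypted-media; '
        'gyroscope; picture-in-picture"\n'
        "allowfullscreen></iframe>\n"
        "</div>"
    )


fragment = youtube_embed_html("DKIfJ-j-GEg")
```

The returned string can be interpolated into a larger HTML description in the same way the diff inlines the iframe by hand.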