Spaces: Salesforce/CRMArena-Leaderboard

Kung-Hsiang Huang committed on Nov 5, 2024

Commit e48be8b

1 Parent(s): a43d0db

update description

Files changed (2)

app.py

@@ -171,7 +171,7 @@ with demo:
                     filter_agentic_framework = gr.CheckboxGroup(
                         choices=list(original_df["Agentic Framework"].unique()),
                         value=list(original_df["Agentic Framework"].unique()),
-                        label="Agentic Framework",
+                        label="Agentic Frameworks",
                         info="",
                         interactive=True,
                     )

src/about.py

@@ -1,6 +1,6 @@
 # Your leaderboard name
 TITLE = """<h1 align="center" id="space-title">CRMArena Leaderboard</h1>
-CRMArena is a novel benchmark designed to assess LLM agents on realistic customer service tasks within professional environments. By working with CRM experts, CRMArena offers nine challenging tasks across three personas—service agent, analyst, and manager—populated within a simulated organization using 16 interrelated industrial objects. This benchmark invites the community to improve AI agent capabilities in function-calling and work task understanding, demonstrating tangible business value in a realistic Salesforce Org.
+<a href="https://arxiv.org/abs/2411.02305">CRMArena</a> is a novel benchmark designed to assess LLM agents on realistic customer service tasks within professional environments. By working with CRM experts, CRMArena offers nine challenging tasks across three personas—service agent, analyst, and manager—populated within a simulated organization using 16 interrelated industrial objects. This benchmark invites the community to improve AI agent capabilities in function-calling and work task understanding, demonstrating tangible business value in a realistic Salesforce Org.
 """
 # What does your leaderboard evaluate?