Kung-Hsiang Huang committed on
Commit
e48be8b
1 Parent(s): a43d0db

update description

Files changed (2)
  1. app.py +1 -1
  2. src/about.py +1 -1
app.py CHANGED
@@ -171,7 +171,7 @@ with demo:
 filter_agentic_framework = gr.CheckboxGroup(
     choices=list(original_df["Agentic Framework"].unique()),
     value=list(original_df["Agentic Framework"].unique()),
-    label="Agentic Framework",
+    label="Agentic Frameworks",
     info="",
     interactive=True,
 )
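Note on the hunk above: the checkbox group takes both its `choices` and its default `value` from the unique values of the "Agentic Framework" column, so every framework starts out checked and this commit only changes the visible label. The diff does not show how the selection is applied to the table, so the following is only a minimal sketch of that kind of filter. It uses plain Python lists of dicts as a stand-in for `original_df` so that it runs without pandas or Gradio; `unique_values`, `filter_rows`, and the sample rows are hypothetical names and placeholder data, not code from this repository.

```python
# Column name taken from the diff; everything else here is illustrative.
COLUMN = "Agentic Framework"


def unique_values(rows, column):
    """Return distinct values in first-seen order (the order pandas' unique() also uses)."""
    return list(dict.fromkeys(row[column] for row in rows))


def filter_rows(rows, selected):
    """Keep only the rows whose framework is among the checked boxes."""
    wanted = set(selected)
    return [row for row in rows if row[COLUMN] in wanted]


# Placeholder rows standing in for original_df (not real leaderboard data).
rows = [
    {"Model": "model-a", COLUMN: "framework-x"},
    {"Model": "model-b", COLUMN: "framework-y"},
    {"Model": "model-c", COLUMN: "framework-x"},
]

# Mirrors choices=... and value=... in the hunk: all frameworks checked by default.
choices = unique_values(rows, COLUMN)
default_selection = list(choices)

# With every box checked the table is unchanged; unchecking one narrows it.
all_rows = filter_rows(rows, default_selection)
only_y = filter_rows(rows, ["framework-y"])
```

In a Gradio app, a function like `filter_rows` would typically be registered as the callback for the checkbox group's change event, but that wiring is outside what this commit shows.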
src/about.py CHANGED
@@ -1,6 +1,6 @@
 # Your leaderboard name
 TITLE = """<h1 align="center" id="space-title">CRMArena Leaderboard</h1>
-CRMArena is a novel benchmark designed to assess LLM agents on realistic customer service tasks within professional environments. By working with CRM experts, CRMArena offers nine challenging tasks across three personas—service agent, analyst, and manager—populated within a simulated organization using 16 interrelated industrial objects. This benchmark invites the community to improve AI agent capabilities in function-calling and work task understanding, demonstrating tangible business value in a realistic Salesforce Org.
+<a href="https://arxiv.org/abs/2411.02305">CRMArena</a> is a novel benchmark designed to assess LLM agents on realistic customer service tasks within professional environments. By working with CRM experts, CRMArena offers nine challenging tasks across three personas—service agent, analyst, and manager—populated within a simulated organization using 16 interrelated industrial objects. This benchmark invites the community to improve AI agent capabilities in function-calling and work task understanding, demonstrating tangible business value in a realistic Salesforce Org.
 """
 
 # What does your leaderboard evaluate?