lewtun HF staff commited on
Commit
68d5169
1 Parent(s): 62e9987

Tweak code

Browse files
Files changed (1) hide show
  1. app/src/index.html +3 -2
app/src/index.html CHANGED
@@ -158,13 +158,14 @@
158
 
159
  <p id="15c1384e-bcac-8076-9664-caa1b098c89c" class="">One subtlety with evaluating answers to math problems is that strings like \(1/\sqrt{3}\) and \(\sqrt{3}/3\) are distinct, but represent mathematically equivalent answers. The standard <a href="https://huggingface.co/papers/2206.14858">way</a> to handle this is to convert the a pair of answers to SymPy objects and then check whether subtracting the two objects and applying <code>sympy.simplify</code> gives zero. </p><p id="15c1384e-bcac-8043-a1d3-ffa0c5c3406e" class="">While this approach works well when comparing a small number of candidate answers, we found it was terribly slow when comparing many pairs in a list of \(N\) candidates; in some cases, slower than generating the candidates in the first place! To deal with this, we first reduced each answer to its <a href="https://en.wikipedia.org/wiki/Canonical_form">canonical form</a> and then computed the frequency of each form to determine the majority vote. Expand the detail below if you’re curious about how the code looks.</p>
160
 
161
- <details><summary style="font-weight:600;font-size:1.25em;line-height:1.3;margin:0"><strong>Implementation detail</strong></summary><div class="indented"><p id="15d1384e-bcac-8080-b676-e09848694520" class="">To obtain the canonical form of an algebraic expression, we first convert the LaTeX string to SymPy, apply <code>sympy.simplify</code>, and finally convert back to LaTeX: </p><script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js" integrity="sha512-7Z9J3l1+EYfeaPKcGXu3MS/7T+w19WtKQY/n+xzmw4hZhJ9tyYmcUS+4QqAlzhicE5LAfMQSF3iFTK9bQdTxXg==" crossorigin="anonymous" referrerPolicy="no-referrer"></script><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" integrity="sha512-tN7Ec6zAFaVSG3TpNAKtk4DOHNpSwKHxxrsiw4GHKESGPs5njn/0sMCUMl2svV4wo4BK/rCP7juYz+zx+l6oeQ==" crossorigin="anonymous" referrerPolicy="no-referrer"/><pre id="15b1384e-bcac-80c0-96c2-feba0d1cc92e" class="code"><code class="language-Python">from latex2sympy2 import latex2sympy
 
162
  from sympy import latex, simplify
163
 
164
  def get_canonical_form(expression: str) -&gt; str:
165
  parsed_expr = latex2sympy(expression)
166
  simplified_expr = simplify(parsed_expr)
167
- return latex(simplified_expr)</code></pre><p id="15c1384e-bcac-80ea-92cd-d199d25281b4" class="">With this function, we can then iterate over all candidate solutions in an list and keep track of how many times a canonical form has been seen before computing the final majority vote:</p><script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js" integrity="sha512-7Z9J3l1+EYfeaPKcGXu3MS/7T+w19WtKQY/n+xzmw4hZhJ9tyYmcUS+4QqAlzhicE5LAfMQSF3iFTK9bQdTxXg==" crossorigin="anonymous" referrerPolicy="no-referrer"></script><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" integrity="sha512-tN7Ec6zAFaVSG3TpNAKtk4DOHNpSwKHxxrsiw4GHKESGPs5njn/0sMCUMl2svV4wo4BK/rCP7juYz+zx+l6oeQ==" crossorigin="anonymous" referrerPolicy="no-referrer"/><pre id="15d1384e-bcac-80f3-9062-c4f44a0d0b8b" class="code"><code class="language-Python">def find_majority_answer(answers: List[str]) -&gt; str:
168
  canonical_groups = defaultdict(int)
169
  canonical_to_original = {}
170
 
 
158
 
159
  <p id="15c1384e-bcac-8076-9664-caa1b098c89c" class="">One subtlety with evaluating answers to math problems is that strings like \(1/\sqrt{3}\) and \(\sqrt{3}/3\) are distinct, but represent mathematically equivalent answers. The standard <a href="https://huggingface.co/papers/2206.14858">way</a> to handle this is to convert the a pair of answers to SymPy objects and then check whether subtracting the two objects and applying <code>sympy.simplify</code> gives zero. </p><p id="15c1384e-bcac-8043-a1d3-ffa0c5c3406e" class="">While this approach works well when comparing a small number of candidate answers, we found it was terribly slow when comparing many pairs in a list of \(N\) candidates; in some cases, slower than generating the candidates in the first place! To deal with this, we first reduced each answer to its <a href="https://en.wikipedia.org/wiki/Canonical_form">canonical form</a> and then computed the frequency of each form to determine the majority vote. Expand the detail below if you’re curious about how the code looks.</p>
160
 
161
+ <details><summary style="font-weight:600;font-size:1.25em;line-height:1.3;margin:0"><strong>Implementation detail</strong></summary><div class="indented"><p id="15d1384e-bcac-8080-b676-e09848694520" class="">To obtain the canonical form of an algebraic expression, we first convert the LaTeX string to SymPy, apply <code>sympy.simplify</code>, and finally convert back to LaTeX: </p>
162
+ <script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js" integrity="sha512-7Z9J3l1+EYfeaPKcGXu3MS/7T+w19WtKQY/n+xzmw4hZhJ9tyYmcUS+4QqAlzhicE5LAfMQSF3iFTK9bQdTxXg==" crossorigin="anonymous" referrerPolicy="no-referrer"></script><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" integrity="sha512-tN7Ec6zAFaVSG3TpNAKtk4DOHNpSwKHxxrsiw4GHKESGPs5njn/0sMCUMl2svV4wo4BK/rCP7juYz+zx+l6oeQ==" crossorigin="anonymous" referrerPolicy="no-referrer"/><pre id="15b1384e-bcac-80c0-96c2-feba0d1cc92e" class="code"><code class="language-Python">from latex2sympy2 import latex2sympy
163
  from sympy import latex, simplify
164
 
165
  def get_canonical_form(expression: str) -&gt; str:
166
  parsed_expr = latex2sympy(expression)
167
  simplified_expr = simplify(parsed_expr)
168
+ return latex(simplified_expr)</code></pre><p id="15c1384e-bcac-80ea-92cd-d199d25281b4" class="">With this function, we can then iterate over all candidate solutions in an list and keep track of how many times a canonical form has been seen before computing the final majority vote:</p><script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js" integrity="sha512-7Z9J3l1+EYfeaPKcGXu3MS/7T+w19WtKQY/n+xzmw4hZhJ9tyYmcUS+4QqAlzhicE5LAfMQSF3iFTK9bQdTxXg==" crossorigin="anonymous" referrerPolicy="no-referrer"></script><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" integrity="sha512-tN7Ec6zAFaVSG3TpNAKtk4DOHNpSwKHxxrsiw4GHKESGPs5njn/0sMCUMl2svV4wo4BK/rCP7juYz+zx+l6oeQ==" crossorigin="anonymous" referrerPolicy="no-referrer"/><pre id="15d1384e-bcac-80f3-9062-c4f44a0d0b8b" class="code"><code class="language-Python">def find_majority_answer(answers: list[str]) -&gt; str:
169
  canonical_groups = defaultdict(int)
170
  canonical_to_original = {}
171