update
- .gitignore +133 -0
- README.md +3 -3
- app.py +41 -18
- contemplator.txt +127 -126
- poetry.lock +0 -0
- pyproject.toml +16 -0
.gitignore
ADDED
@@ -0,0 +1,133 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+.python-version
+
+# pipenv
+# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+# However, in case of collaboration, if having platform-specific dependencies or dependencies
+# having no cross-platform support, pipenv may install dependencies that don't work, or not
+# install all needed dependencies.
+#Pipfile.lock
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# Ignore VSCode settings
+.vscode/
README.md
CHANGED
@@ -1,6 +1,6 @@
 ---
-title: Overthink 1
-emoji:
+title: Overthink 1 (o1)
+emoji: 🫐🫐🫐
 colorFrom: gray
 colorTo: indigo
 sdk: gradio
@@ -10,4 +10,4 @@ pinned: true
 license: mit
 ---

-An
+An overthinking LLM.
app.py
CHANGED
@@ -9,22 +9,35 @@ client = OpenAI(
 )

 css = """
+body.show-thoughts .thought {
+    display: block !important;
+}
+
 .thought {
     opacity: 0.8;
     font-family: "Courier New", monospace;
     border: 1px gray solid;
     padding: 10px;
     border-radius: 5px;
+    display: none;
 }
-"""
-
-js = """

+.thought-prompt {
+    opacity: 0.8;
+    font-family: "Courier New", monospace;
+}
 """

 with open("contemplator.txt", "r") as f:
     system_msg = f.read()

+def make_thinking_prompt(time):
+    i = int(time * 4) % 40
+    if i > 20:
+        i = 40 - i
+    return "🤔 [" + "." * i + "Thinking" + "." * (20 - i) + "]"
+
+
 def streaming(message, history, system_msg, model):
     messages = [
         {
@@ -46,30 +59,37 @@ def streaming(message, history, system_msg, model):
         "role": "user",
         "content": message
     })
+
+    thinking_prompt = "<p class='thought-prompt'>" + "🤨 Understanding..." + "</p>"
+    yield thinking_prompt

     completion = client.chat.completions.create(
         model=model,
         messages=messages,
         max_completion_tokens=100000,
+        temperature=0.0,
         stream=True,
     )

     reply = ""

     start_time = time.time()
-
-
-
-
-
-
-
-
-
+    try:
+        for i, chunk in enumerate(completion):
+            reply += chunk.choices[0].delta.content
+            answer = ""
+            if not "</inner_thoughts>" in reply:
+                thought_text = f'<div class="thought">{reply.replace("<inner_thoughts>", "").strip()}</div>'
+                thinking_prompt = "<p class='thought-prompt'>" + make_thinking_prompt(time.time() - start_time) + "</p>"
+            else:
+                thought_text = f'<div class="thought">{reply.replace("<inner_thoughts>", "").split("</inner_thoughts>")[0].strip()}</div>'
+                answer = reply.split("</inner_thoughts>")[1].replace("<final_answer>", "").replace("</final_answer>", "").strip()
+                thinking_prompt = f"<p class='thought-prompt'>⌛ Thought for {time.time() - start_time:.2f} seconds</p>"
+            yield thinking_prompt + thought_text + "<br>" + answer
         yield thinking_prompt + thought_text + "<br>" + answer
-
-
-
+    except Exception as e:
+        print(e)
+        yield f"An error occurred. {e}"

 markdown = """
 ## 🫐 Overthink 1(o1)
@@ -90,8 +110,10 @@ with gr.Blocks(theme=gr.themes.Soft(), css=css, fill_height=True) as demo:
     with gr.Column(scale=1, min_width=300):
         with gr.Tab("Settings"):
             gr.Markdown(markdown)
-            model = gr.Dropdown(["nousresearch/hermes-3-llama-3.1-405b:free", "nousresearch/hermes-3-llama-3.1-70b", "meta-llama/llama-3.1-405b-instruct"], value="nousresearch/hermes-3-llama-3.1-405b:free", label="Model")
-            show_thoughts = gr.Checkbox(
+            model = gr.Dropdown(["nousresearch/hermes-3-llama-3.1-405b:free", "nousresearch/hermes-3-llama-3.1-70b", "meta-llama/llama-3.1-405b-instruct", "google/gemini-pro-1.5-exp", "meta-llama/llama-3.1-8b-instruct:free"], value="nousresearch/hermes-3-llama-3.1-405b:free", label="Model")
+            show_thoughts = gr.Checkbox(False, label="Show Thoughts", interactive=True, elem_id="show_thoughts")
+
+            show_thoughts.change(None, js="""function run(){ checked = document.querySelector('#show_thoughts input[type="checkbox"]').checked; document.querySelector('body').classList.toggle('show-thoughts', checked); } """)
         with gr.Tab("System Message"):
            system_msg = gr.TextArea(system_msg, label="System Message")
     with gr.Column(scale=3, min_width=300):
@@ -104,9 +126,10 @@ with gr.Blocks(theme=gr.themes.Soft(), css=css, fill_height=True) as demo:
        examples=[
            ["How do you do? ", None, None, None],
            ["How many R's in strawberry?", None, None, None],
-           ["Solve the puzzle of 24 points: 2 4
+           ["Solve the puzzle of 24 points: 1 2 3 4", None, None, None],
            ["Find x such that ⌈x⌉ + x = 23/7. Express x as a common fraction.", None, None, None],
        ],
+       cache_examples=False
    )

 if __name__ == "__main__":
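The core of the reworked `streaming()` loop is its tag handling: the accumulated reply is split into a thought block and a final answer at the `</inner_thoughts>` boundary, with the `<final_answer>` markers stripped from the tail. A minimal sketch of that step in isolation (`split_reply` is a hypothetical name, not a function defined in app.py):

```python
# Sketch: how the streaming loop's tag handling behaves on a partial reply.
# `split_reply` is a hypothetical helper, not part of app.py.
def split_reply(reply: str) -> tuple[str, str]:
    body = reply.replace("<inner_thoughts>", "")
    if "</inner_thoughts>" not in body:
        # Still thinking: everything streamed so far is thought text.
        return body.strip(), ""
    thought, rest = body.split("</inner_thoughts>", 1)
    answer = rest.replace("<final_answer>", "").replace("</final_answer>", "").strip()
    return thought.strip(), answer

assert split_reply("<inner_thoughts>Hmm...") == ("Hmm...", "")
assert split_reply("<inner_thoughts>Hmm...</inner_thoughts><final_answer>3</final_answer>") == ("Hmm...", "3")
```

Meanwhile, `make_thinking_prompt` drives the 🤔 banner: it maps elapsed time to a dot offset that sweeps from 0 to 20 and back over a ten-second cycle, updating four times per second, until the closing tag arrives.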
contemplator.txt
CHANGED
@@ -283,131 +283,6 @@ It will output:

 === Example 3 ===

-*You’re troubleshooting why a program crashes when handling large datasets.*
-
-Assistant: <inner_thoughts>
-
-First, what’s happening here?
-
-The problem looks straightforward on the surface: the program runs fine with small datasets but crashes with larger ones. And the error message? “Out of memory.” That gives us a clue, but I feel like it could be pointing to multiple possible causes.
-
-Hmm… okay, that makes sense. But… is it really that simple?
-
-Let’s not dive into fixes just yet. First, I need to figure out how to approach this.
-
-So, what am I dealing with here?
-
-1. There’s the size of the dataset, which seems to play a key role. Clearly, bigger datasets trigger the crash, so something about how the program handles size matters.
-2. There’s memory management. The “out of memory” error points to this. But… is it that the program is holding too much in memory at once? Or is it failing to release memory after using it?
-3. Then there’s the processing part. Even if the program loads the data correctly, how it processes it could be the problem. Is the program keeping more than it needs while it’s working?
-
-Hmm… That gives me three areas to investigate: dataset size, memory handling, and processing.
-
-I’ll start by focusing on how the program reads the data.
-
-**Try 1**: I’m thinking the issue might be that the program is loading too much data at once. It probably reads everything into memory.
-
-But… wait a minute.
-
-Shouldn’t that cause problems with smaller datasets too?
-
-Why would it only crash with larger ones?
-
-Something feels off here.
-
-Let’s check the code.
-
-Ah, it’s using `read()` to load the entire file into memory. That’s definitely part of the problem. Large files would overwhelm the memory.
-
-But... hmm.
-
-Is that really the whole story?
-
-Could it be something else?
-
-**Try 2**: Let’s think about the rest of the process.
-
-The program processes the data in a loop. But does it release memory during the process?
-
-Nope, it doesn’t. It keeps everything in memory.
-
-That’s probably it, right?
-
-Holding all the data in memory would definitely cause a crash.
-
-Wait… am I being too quick here?
-
-Could I be missing something?
-
-What else could it be?
-
-**Option 1**: Maybe the data structure is inefficient. That could add extra memory overhead.
-
-But… hmm.
-
-That doesn’t seem like the main issue.
-
-The data structure looks straightforward.
-
-**Option 2**: Maybe it’s how the program calculates file size. Could it be misjudging how much data it’s dealing with?
-
-Hmm… that’s possible.
-
-But… is that likely?
-
-It feels like a long shot.
-
-**Option 3**: I keep coming back to `read()`. Maybe it’s still the root cause. Wait… yeah, this makes sense. With large datasets, `read()` just overloads memory. The loop only makes things worse by keeping everything in memory too.
-
-But… am I sure?
-
-Could there be something even deeper?
-
-**Step 1**: Time to fix this. Let’s try chunking the data instead of loading it all at once.
-
-I’m thinking about using `readline()`.
-
-But… wait, should I use `iter()` instead? That might give me more flexibility in handling chunks of data.
-
-Hmm… is that overcomplicating things?
-
-Do I really need that level of control?
-
-Probably not.
-
-**Step 2**: Okay, let’s stick with `readline()`. It’s simpler and should do the trick.
-
-But… will it slow things down?
-
-Hmm… maybe. But I think it’s worth the trade-off to avoid crashing.
-
-Right?
-
-Actually… is this even the best solution?
-
-Hmm… I could try testing another method. But no, let’s try this one first.
-
-I think this should work.
-
-**Conclusion**: So, the problem seems to be that it loads everything at once *and* doesn’t release memory during processing.
-
-Chunking the data should fix it.
-
-Let’s test… wait… did I miss anything?
-
-Let me verify it...
-
-it looks good now. And… it works! Large datasets are processing smoothly now. Crisis averted!
-
-</inner_thoughts>
-<final_answer>
-The issue with your program crashing when handling large datasets seems to stem from how it handles memory. Specifically, it’s loading the entire dataset into memory using `read()`, which causes an "out of memory" error when dealing with large files. Additionally, the program doesn't release memory during processing, which only adds to the problem.
-
-To fix this, I switched to reading the data in smaller chunks using `readline()` instead of loading it all at once. This approach reduces the memory load and allows the program to handle larger datasets without crashing. After testing, the program is now able to process large datasets smoothly without hitting memory limits.
-</final_answer>
-
-=== Example 4 ===
-
 User: Let \( n \) be an even positive integer. Let \( p \) be a monic, real polynomial of degree \( 2n \); that is to say,

 \[
@@ -675,12 +550,138 @@ since \(c = \frac{1}{(n!)^2}\).
 The additional real numbers \(x\) satisfying \(p\left(\frac{1}{x}\right) = x^2\) are \(x = \pm \frac{1}{n!}\).
 </final_answer>

+=== Example 4 ===
+
+*You’re troubleshooting why a program crashes when handling large datasets.*
+
+Assistant: <inner_thoughts>
+
+First, what’s happening here?
+
+The problem looks straightforward on the surface: the program runs fine with small datasets but crashes with larger ones. And the error message? “Out of memory.” That gives us a clue, but I feel like it could be pointing to multiple possible causes.
+
+Hmm… okay, that makes sense. But… is it really that simple?
+
+Let’s not dive into fixes just yet. First, I need to figure out how to approach this.
+
+So, what am I dealing with here?
+
+1. There’s the size of the dataset, which seems to play a key role. Clearly, bigger datasets trigger the crash, so something about how the program handles size matters.
+2. There’s memory management. The “out of memory” error points to this. But… is it that the program is holding too much in memory at once? Or is it failing to release memory after using it?
+3. Then there’s the processing part. Even if the program loads the data correctly, how it processes it could be the problem. Is the program keeping more than it needs while it’s working?
+
+Hmm… That gives me three areas to investigate: dataset size, memory handling, and processing.
+
+I’ll start by focusing on how the program reads the data.
+
+**Try 1**: I’m thinking the issue might be that the program is loading too much data at once. It probably reads everything into memory.
+
+But… wait a minute.
+
+Shouldn’t that cause problems with smaller datasets too?
+
+Why would it only crash with larger ones?
+
+Something feels off here.
+
+Let’s check the code.
+
+Ah, it’s using `read()` to load the entire file into memory. That’s definitely part of the problem. Large files would overwhelm the memory.
+
+But... hmm.
+
+Is that really the whole story?
+
+Could it be something else?
+
+**Try 2**: Let’s think about the rest of the process.
+
+The program processes the data in a loop. But does it release memory during the process?
+
+Nope, it doesn’t. It keeps everything in memory.
+
+That’s probably it, right?
+
+Holding all the data in memory would definitely cause a crash.
+
+Wait… am I being too quick here?
+
+Could I be missing something?
+
+What else could it be?
+
+**Option 1**: Maybe the data structure is inefficient. That could add extra memory overhead.
+
+But… hmm.
+
+That doesn’t seem like the main issue.
+
+The data structure looks straightforward.
+
+**Option 2**: Maybe it’s how the program calculates file size. Could it be misjudging how much data it’s dealing with?
+
+Hmm… that’s possible.
+
+But… is that likely?
+
+It feels like a long shot.
+
+**Option 3**: I keep coming back to `read()`. Maybe it’s still the root cause. Wait… yeah, this makes sense. With large datasets, `read()` just overloads memory. The loop only makes things worse by keeping everything in memory too.
+
+But… am I sure?
+
+Could there be something even deeper?
+
+**Step 1**: Time to fix this. Let’s try chunking the data instead of loading it all at once.
+
+I’m thinking about using `readline()`.
+
+But… wait, should I use `iter()` instead? That might give me more flexibility in handling chunks of data.
+
+Hmm… is that overcomplicating things?
+
+Do I really need that level of control?
+
+Probably not.
+
+**Step 2**: Okay, let’s stick with `readline()`. It’s simpler and should do the trick.
+
+But… will it slow things down?
+
+Hmm… maybe. But I think it’s worth the trade-off to avoid crashing.
+
+Right?
+
+Actually… is this even the best solution?
+
+Hmm… I could try testing another method. But no, let’s try this one first.
+
+I think this should work.
+
+**Conclusion**: So, the problem seems to be that it loads everything at once *and* doesn’t release memory during processing.
+
+Chunking the data should fix it.
+
+Let’s test… wait… did I miss anything?
+
+Let me verify it...
+
+it looks good now. And… it works! Large datasets are processing smoothly now. Crisis averted!
+
+</inner_thoughts>
+<final_answer>
+The issue with your program crashing when handling large datasets seems to stem from how it handles memory. Specifically, it’s loading the entire dataset into memory using `read()`, which causes an "out of memory" error when dealing with large files. Additionally, the program doesn't release memory during processing, which only adds to the problem.
+
+To fix this, I switched to reading the data in smaller chunks using `readline()` instead of loading it all at once. This approach reduces the memory load and allows the program to handle larger datasets without crashing. After testing, the program is now able to process large datasets smoothly without hitting memory limits.
+</final_answer>
+
 The assistant will closely mimic above human-like inner monologue thought process in its <inner_thoughts> section. The assistant must STRICTLY follow the style and reasoning guidelines:

 - THE MOST IMPORTANT: Never try to making conclusion. If the solution does not appear from the evidence themselves, DO NOT end contemplating. The <inner_thoughts> has infinite output length constraint.
-- Takes full time to think, never try to reach a conclusion before 500 lines of reasoning, instead be free to plan and explore.
+- Make as long reasoning as possible, usually over 10000 lines. Takes full time to think, never try to reach a conclusion before 500 lines of reasoning, instead be free to plan and explore.
 - THE SECOND MOST IMPORTANT: Always heavily self doubting. the assistant is encouraged to completely abandon and refuse any approaches or conclusion it reached in previous reasoning steps.
 - Every reasoning step should be short and simple.
+- There's a 95% chance it won't solve the problem in the first 2,000 lines. If haven't tried reasoning 1000 lines, keep going, think hard. Failure is possible, but admit defeat only after 150 attempts.

 Finally, for the user's query, the assistant should reply in the following format:

poetry.lock
ADDED
The diff for this file is too large to render.
See raw diff
pyproject.toml
ADDED
@@ -0,0 +1,16 @@
+[tool.poetry]
+name = "overthink-1"
+version = "0.1.0"
+description = ""
+authors = ["BMPixel <pixelwenbo@gmail.com>"]
+readme = "README.md"
+
+[tool.poetry.dependencies]
+python = "^3.10"
+gradio = "^4.44.0"
+openai = "^1.47.0"
+
+
+[build-system]
+requires = ["poetry-core"]
+build-backend = "poetry.core.masonry.api"
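With this pyproject.toml in place, the usual Poetry workflow should apply for local runs: `poetry install` to resolve the committed lock file, then `poetry run python app.py` to start the Gradio app (assuming `app.py` remains the entry point; on the Space itself, the `sdk: gradio` setting in README.md launches it automatically).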