lamm-mit/PRefLexOR_ORPO_DPO_EXO_10242024


mjbuehler committed on Oct 24, 2024

Commit 1167c9f · verified · 1 Parent(s): eb037a1

Update README.md

Files changed (1)

README.md +61 -1


@@ -21,6 +21,7 @@ Figure 2: PRefLexOR Recursive Reasoning Algorithm: An iterative approach leverag
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 model_name='lamm-mit/PRefLexOR_ORPO_DPO_EXO_10242024'
 model = AutoModelForCausalLM.from_pretrained(model_name,
     torch_dtype =torch.bfloat16,
@@ -31,8 +32,13 @@ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True,
                                          )
 ```
-Simple inference:
 ```python
 txt = 'What is the relationship between materials and music? Brief answer.' + f' Use {think_start}.'
 output_text, messages = generate_local_model(
@@ -59,6 +65,60 @@ print ("THINKING:\n\n", thinking)
 print ("ANSWER:\n\n", answer_only)
 ```
 ## Citation
 ```bibtex

 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 model_name='lamm-mit/PRefLexOR_ORPO_DPO_EXO_10242024'
 model = AutoModelForCausalLM.from_pretrained(model_name,
     torch_dtype =torch.bfloat16,
                                          )
 ```
+## Inference example
+### Simple inference:
 ```python
+from PRefLexOR import *
 txt = 'What is the relationship between materials and music? Brief answer.' + f' Use {think_start}.'
 output_text, messages = generate_local_model(
 print ("ANSWER:\n\n", answer_only)
 ```
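
Reviewer note: the simple-inference snippet prints `thinking` and `answer_only`, but the visible hunks do not show how they are derived from `output_text`. A minimal sketch of one way to split them follows. The delimiter strings and the helper name `split_thinking` are assumptions for illustration; the README only refers to a `think_start` variable, so check the PRefLexOR package for the actual tokens and helpers.

```python
import re

# Assumed delimiters (hypothetical); the README only names a `think_start` variable.
THINK_START = "<|thinking|>"
THINK_END = "<|/thinking|>"


def split_thinking(output_text, think_start=THINK_START, think_end=THINK_END):
    """Return (thinking, answer_only) extracted from a generated string.

    Hypothetical helper, not part of the PRefLexOR package.
    """
    pattern = re.escape(think_start) + r"(.*?)" + re.escape(think_end)
    match = re.search(pattern, output_text, flags=re.DOTALL)
    if match is None:
        # No thinking section found: treat the whole text as the answer.
        return "", output_text.strip()
    thinking = match.group(1).strip()
    # Everything after the closing delimiter is taken as the answer.
    answer_only = output_text[match.end():].strip()
    return thinking, answer_only
```

With such a helper, `thinking, answer_only = split_thinking(output_text)` would feed the two `print` calls shown in the snippet.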
+### Recursive inference using multi-agentic modeling
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load reasoning model
+model_name='lamm-mit/PRefLexOR_ORPO_DPO_EXO_10242024'
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    attn_implementation="flash_attention_2",
+    device_map="auto",
+    trust_remote_code=True,
+)
+tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, use_fast=False)
+# Load critic model
+model_name_base = "meta-llama/Llama-3.2-3B-Instruct"
+critic_model = AutoModelForCausalLM.from_pretrained(
+    model_name_base,
+    torch_dtype=torch.bfloat16,
+    attn_implementation="flash_attention_2",
+    device_map="auto",
+    trust_remote_code=True
+)
+```
+Example inference
+```python
+output_text, output_list, output_text_integrated = recursive_response_from_thinking(
+    model=model,
+    tokenizer=tokenizer,
+    model_critic=critic_model,
+    tokenizer_critic=tokenizer,  # same tokenizer in our case
+    question="Develop an idea of how graphene can be combined with silk fibers to create a filtration membrane.",
+    N=3,
+    temperature=0.1,
+    temperature_improvement=0.1,
+    system_prompt="You are a helpful assistant.",
+    system_prompt_critic="You carefully improve responses, with attention to detail, and following all directions.",
+    verbatim=False,
+)
+```
+Printing the output:
+```python
+for i, item in enumerate(output_list):
+    print (f"i={i}", 64*"-")
+    print (item)
+print (64*"#")
+print ("INTEGRATED RESPONSE:")
+print (output_text_integrated)
+print (64*"#")
+```
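
Reviewer note: for readers unfamiliar with the pattern, the loop below sketches what a recursive generate/critique/integrate cycle of this kind might look like. It is not the implementation of `recursive_response_from_thinking`. The function name `recursive_refine`, its parameters, and the prompt wording are all hypothetical, and the three callables stand in for calls to the reasoning model and the critic model. Only the shape of the return value (last response, list of responses, integrated response) is modeled on the README example.

```python
def recursive_refine(question, generate, critique, integrate, n_rounds=3):
    """Generic refine loop (hypothetical sketch, not the PRefLexOR code).

    generate(prompt) -> str          : stands in for the reasoning model
    critique(question, resp) -> str  : stands in for the critic model
    integrate(question, resps) -> str: merges all rounds into one response
    """
    responses = []
    prompt = question
    for i in range(n_rounds):
        response = generate(prompt)
        responses.append(response)
        if i < n_rounds - 1:
            # Ask the critic for feedback and fold it into the next prompt.
            feedback = critique(question, response)
            prompt = (
                f"{question}\n\nPrevious response:\n{response}\n\n"
                f"Feedback:\n{feedback}\n\nWrite an improved response."
            )
    integrated = integrate(question, responses)
    return responses[-1], responses, integrated
```

Passing plain Python functions for the three callables makes the control flow easy to check without loading any model.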
 ## Citation
 ```bibtex