RobbiePasquale/lightbulb

RobbiePasquale committed on Oct 10, 2024

Commit 7221607 (verified) · 1 Parent(s): 17d845e

Update README.md

README.md +178 -0

@@ -116,6 +116,7 @@ The `AutonomousWebAgent` is a sophisticated, multi-component search and retrieva
    - `ToTNode` and `ToTSearch` classes enable the agent to generate thoughts, evaluate them, and navigate through them as a tree, considering various potential paths to best answer the query.
    - It combines MCTS and RAG to synthesize responses based on the generated thought paths.
 ### Training Process
 The training process for the agent involves episodic learning, where it interacts with various queries from a predefined list. Each query initiates an episode, and the agent performs actions based on its learned policy:
@@ -292,6 +293,183 @@ After each epoch, the model is evaluated on the validation set, computing the av
 ### Checkpoints
 At the end of each epoch, the model saves checkpoints of all components, enabling easy resumption or further fine-tuning as needed.
+## Inference Details
+1. Input Processing:
+   - The function takes a query (text input), world model components, a root thought node, and a tokenizer.
+   - The query is tokenized and encoded using the provided tokenizer.
+2. Inference Modes:
+   The function supports three inference modes:
+   a. 'without_world_model':
+      - This mode directly uses the transformer model to generate text.
+      - It doesn't utilize the world model components or the Tree of Thought.
+      - The transformer generates text autoregressively up to the specified max length.
+   b. 'world_model':
+      - This mode uses the world model components but doesn't use the Tree of Thought.
+      - It generates actions based on the prediction network's output.
+   c. 'world_model_tree_of_thought':
+      - This is the most comprehensive mode, using both the world model and the Tree of Thought.
+3. World Model Inference Process:
+   For the 'world_model' and 'world_model_tree_of_thought' modes:
+   a. Initial State:
+      - The query is passed through the transformer model.
+      - The representation network creates an initial state representation from the transformer output.
+   b. Action Selection:
+      - For 'world_model':
+        - The prediction network generates policy logits from the state representation.
+        - Actions are selected based on the highest probabilities in the policy.
+      - For 'world_model_tree_of_thought':
+        - It uses Monte Carlo Tree Search (MCTS) to explore the Tree of Thought.
+        - For each MCTS iteration:
+          * Selection: Traverse the tree to find a leaf node.
+          * Expansion: Add child nodes to the leaf.
+          * Evaluation: Use the prediction network to estimate the value of the node.
+          * Backpropagation: Update the values and visit counts of nodes.
+        - The best action is chosen based on visit counts after MCTS.
+   c. State Transition:
+      - The selected action is applied to the current state using the dynamics network.
+      - This creates a new state representation for the next step.
+   d. Sequence Generation:
+      - The process repeats for the specified number of steps or until a termination condition is met.
+      - For the Tree of Thought approach, it continues until reaching a leaf node in the thought tree.
+4. Output:
+   - For 'without_world_model', it returns the generated text.
+   - For 'world_model' and 'world_model_tree_of_thought', it returns a sequence of selected actions (thoughts).
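The `world_model` loop described in step 3 can be sketched roughly as follows. This is a minimal, dependency-free illustration under stated assumptions, not the repository's implementation: `represent`, `predict`, and `dynamics` are hypothetical stand-ins for the representation, prediction, and dynamics networks, and the action labels and state arithmetic are invented for the example.

```python
import math

# Hypothetical thought labels; the real action space is defined by the model.
ACTIONS = ["understand", "describe", "analyse"]


def represent(token_ids):
    """Stand-in for transformer + representation network: token ids -> state."""
    n = max(len(token_ids), 1)
    mean = sum(token_ids) / n
    return [mean % 3.0, float(len(token_ids)), 1.0]


def predict(state):
    """Stand-in for the prediction network: state -> (policy logits, value)."""
    logits = [state[0], state[1] - 2.0, state[2]]
    value = math.tanh(sum(state) / 10.0)
    return logits, value


def dynamics(state, action_index):
    """Stand-in for the dynamics network: (state, action) -> next state."""
    next_state = [0.5 * s for s in state]
    next_state[action_index] -= 1.0  # toy effect: make the chosen action less attractive
    return next_state


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def world_model_inference(token_ids, num_steps=3):
    """Greedy rollout: pick the most probable action, then step the dynamics."""
    state = represent(token_ids)  # a. initial state
    chosen = []
    for _ in range(num_steps):
        logits, _value = predict(state)  # b. policy from the current state
        probs = softmax(logits)
        action_index = max(range(len(probs)), key=probs.__getitem__)
        chosen.append(ACTIONS[action_index])
        state = dynamics(state, action_index)  # c. state transition
    return chosen


if __name__ == "__main__":
    print(world_model_inference([1, 2, 3, 4]))
```

The sketch always takes the single most probable action; sampling from `probs` instead would be an equally plausible reading of "selected based on the highest probabilities".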
+The world model inference uses the learned representations and dynamics to navigate the problem-solving process. The Tree of Thought approach adds structure by guiding the model through a predefined hierarchy of problem-solving steps, which may make it more effective on complex problem-solving tasks.
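The four MCTS phases listed for `world_model_tree_of_thought` might look like the sketch below when run over a small thought hierarchy. Everything named here is an assumption made for illustration: `THOUGHT_TREE` is a hand-written subset of a thought tree, `Node` is not the repository's `ToTNode`, `evaluate` is a fixed heuristic standing in for the prediction network's value estimate, and UCB1 is used as one common selection rule.

```python
import math

# Hypothetical thought hierarchy: each thought maps to its sub-thoughts.
THOUGHT_TREE = {
    "Problem-Solving Process": ["Problem Identification", "Problem Analysis"],
    "Problem Identification": ["Define the Problem"],
    "Problem Analysis": ["Root Cause Analysis", "Data Collection"],
}


class Node:
    """Search node holding MCTS statistics for one thought."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value_sum = 0.0

    def ucb_score(self, c=1.4):
        if self.visits == 0:
            return float("inf")  # unvisited children are tried first
        mean = self.value_sum / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits) / self.visits)


def evaluate(node):
    """Stand-in for the prediction network's value estimate (assumption)."""
    return 1.0 if "Analysis" in node.name else 0.2


def mcts_step(root):
    # Selection: follow the highest UCB score down to a node without children.
    node = root
    while node.children:
        node = max(node.children, key=Node.ucb_score)
    # Expansion: add the sub-thoughts of this node, if the hierarchy has any.
    for child_name in THOUGHT_TREE.get(node.name, []):
        node.children.append(Node(child_name, parent=node))
    # Evaluation: estimate the value of the selected node.
    value = evaluate(node)
    # Backpropagation: update visit counts and values up to the root.
    while node is not None:
        node.visits += 1
        node.value_sum += value
        node = node.parent


def best_child(root, iterations=50):
    """Run MCTS and return the root child with the most visits."""
    for _ in range(iterations):
        mcts_step(root)
    return max(root.children, key=lambda n: n.visits)


if __name__ == "__main__":
    root = Node("Problem-Solving Process")
    print(best_child(root).name)
```

Choosing the final action by visit count rather than by mean value follows the description above and tends to be more robust to a few noisy evaluations.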
+Here I use Trees of Thought to organise sets of policies and sequences of actions. These tree structures give the world model a general thought pattern, similar to how humans follow thought patterns when solving certain problems (e.g. understand, describe, analyse).
+Here are some example Trees of Thought:
+graph TD
+    A[Problem-Solving Process] --> B[Problem Identification]
+    A --> C[Problem Analysis]
+    A --> D[Solution Generation]
+    A --> E[Implementation]
+    A --> F[Evaluation and Adjustment]
+    B --> B1[Define the Problem]
+    B --> B2[Identify Stakeholders]
+    B --> B3[Determine Constraints]
+    B --> B4[Recognize Problem Type]
+    B --> B5[Historical Context]
+    C --> C1[Root Cause Analysis]
+    C --> C2[System Mapping]
+    C --> C3[Data Collection]
+    C --> C4[Impact Assessment]
+    C --> C5[Theoretical Framework]
+    D --> D1[Creative Problem Solving]
+    D --> D2[Analytical Approach]
+    D --> D3[Mathematical Computation]
+    D --> D4[Decision Making]
+    E --> E1[Action Planning]
+    E --> E2[Resource Allocation]
+    E --> E3[Change Management]
+    F --> F1[Verification]
+    F --> F2[Performance Metrics]
+    F --> F3[Feedback Loops]
+    F --> F4[Continuous Improvement]
+    C3 --> C3a[Quantitative Data]
+    C3 --> C3b[Qualitative Data]
+    C3 --> C3c[Data Validation]
+    D1 --> D1a[Divergent Thinking]
+    D1 --> D1b[Convergent Thinking]
+    D1 --> D1c[Lateral Thinking]
+    D2 --> D2a[Logical Reasoning]
+    D2 --> D2b[Critical Analysis]
+    D2 --> D2c[Systems Thinking]
+    D3 --> D3a[Basic Operations]
+    D3 --> D3b[Advanced Operations]
+    D3 --> D3c[Computational Methods]
+    D4 --> D4a[Decision Trees]
+    D4 --> D4b[Multi-Criteria Analysis]
+    D4 --> D4c[Probabilistic Reasoning]
+    G[Cross-Cutting Considerations] --> G1[Ethical Framework]
+    G --> G2[Stakeholder Management]
+    G --> G3[Interdisciplinary Connections]
+    G --> G4[Technological Integration]
+    G --> G5[Emotional Intelligence]
+    G --> G6[Collaborative Problem Solving]
+    G1 --> G1a[Value-based Decision Making]
+    G1 --> G1b[Long-term Consequences]
+    G2 --> G2a[Direct Stakeholders]
+    G2 --> G2b[Indirect Stakeholders]
+    G2 --> G2c[Conflicting Interests]
+    G3 --> G3a[Related Fields]
+    G3 --> G3b[Cross-disciplinary Impact]
+    G4 --> G4a[AI-assisted Problem Solving]
+    G4 --> G4b[Data-driven Insights]
+    G4 --> G4c[Digital Collaboration Tools]
+    G5 --> G5a[Self-Awareness]
+    G5 --> G5b[Empathy]
+    G5 --> G5c[Stress Management]
+    G6 --> G6a[Team Dynamics]
+    G6 --> G6b[Communication Strategies]
+    G6 --> G6c[Conflict Resolution]
+    H[Computational Considerations] --> H1[CPU Operations]
+    H --> H2[GPU Parallelization]
+    H --> H3[Floating-Point Precision]
+    I[Order of Operations] --> I1[Parentheses]
+    I --> I2[Exponents]
+    I --> I3[Multiplication and Division]
+    I --> I4[Addition and Subtraction]
+    J[Critical Thinking] --> J1[Assumptions Questioning]
+    J --> J2[Bias Recognition]
+    K[Future Perspective] --> K1[Short-term Projections]
+    K --> K2[Long-term Scenarios]
+    K --> K3[Potential Impacts]
+    L[Learning and Adaptation] --> L1[Reflective Practice]
+    L --> L2[Knowledge Transfer]
+    L --> L3[Adaptive Problem Solving]
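A graph like the one above can be loaded into a plain tree structure before any search runs over it. The sketch below is one possible way to do that and is not taken from the repository: `parse_mermaid` and `leaf_paths` are hypothetical helpers, the `MERMAID` string is a shortened excerpt of the graph, and only edge lines of the form `X[Label] --> Y[Label]` are handled.

```python
import re
from collections import defaultdict

# Shortened excerpt of the graph above, used only for illustration.
MERMAID = """
graph TD
    A[Problem-Solving Process] --> B[Problem Identification]
    A --> C[Problem Analysis]
    B --> B1[Define the Problem]
    C --> C3[Data Collection]
    C3 --> C3a[Quantitative Data]
"""

# Matches "SRC[optional label] --> DST[optional label]".
EDGE = re.compile(r"^\s*(\w+)(?:\[(.+?)\])?\s*-->\s*(\w+)(?:\[(.+?)\])?\s*$")


def parse_mermaid(text):
    """Return (labels, children): node id -> label, node id -> child ids."""
    labels = {}
    children = defaultdict(list)
    for line in text.splitlines():
        match = EDGE.match(line)
        if not match:
            continue  # skips "graph TD", blank lines, and node-only lines
        src, src_label, dst, dst_label = match.groups()
        if src_label:
            labels[src] = src_label
        if dst_label:
            labels[dst] = dst_label
        children[src].append(dst)
    return labels, children


def leaf_paths(node, labels, children, prefix=()):
    """Yield every root-to-leaf path as a tuple of thought labels."""
    path = prefix + (labels.get(node, node),)
    if not children.get(node):
        yield path
        return
    for child in children[node]:
        yield from leaf_paths(child, labels, children, path)


if __name__ == "__main__":
    labels, children = parse_mermaid(MERMAID)
    for path in leaf_paths("A", labels, children):
        print(" -> ".join(path))
```

Each yielded path is one candidate sequence of thoughts from the root to a leaf, which matches the idea of continuing until a leaf node of the thought tree is reached.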
+graph TD
+    A[Meta-Cognitive Strategies] --> B[Creative Problem Solving]
+    A --> C[Systems Thinking]
+    A --> D[Decision Making]
+    A --> E[Emotional Intelligence]
+    A --> F[Collaborative Problem Solving]
+    B --> B1[Divergent Thinking]
+    B --> B2[Convergent Thinking]
+    B --> B3[Lateral Thinking]
+    C --> C1[Holistic Perspective]
+    C --> C2[Feedback Loops]
+    C --> C3[Emergent Properties]
+    D --> D1[Decision Trees]
+    D --> D2[Multi-Criteria Decision Analysis]
+    D --> D3[Probabilistic Reasoning]
+    E --> E1[Self-Awareness]
+    E --> E2[Empathy]
+    E --> E3[Stress Management]
+    F --> F1[Team Dynamics]
+    F --> F2[Communication Strategies]
+    F --> F3[Conflict Resolution]
+    G[Learning and Adaptation]
+    A --> G
+    G --> G1[Reflective Practice]
+    G --> G2[Knowledge Transfer]
+    G --> G3[Adaptive Problem Solving]
+    H[Ethical Framework]
+    A --> H
+    H --> H1[Value-based Decision Making]
+    H --> H2[Stakeholder Analysis]
+    H --> H3[Long-term Consequences]
+    I[Technological Integration]
+    A --> I
+    I --> I1[AI-assisted Problem Solving]
+    I --> I2[Data-driven Insights]
+    I --> I3[Digital Collaboration Tools]
 ## Requirements