Aluode committed on
Commit
aa3cea4
1 Parent(s): 4be4001

Upload 6 files

Files changed (6)
  1. LICENSE +21 -0
  2. README.md +151 -3
  3. app.py +488 -0
  4. notes.txt +28 -0
  5. questionanswerpairsample.json +54 -0
  6. requirements.txt +9 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2024 anttiluode
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,3 +1,151 @@
- ---
- license: mit
- ---
+
+ # Dynamic AI: Fractal Universe Chocolate Wafer Model (FUCWM)
+
+ ### Watch Demo
+
+ Check out the demo of the project on YouTube: [Watch Here](https://www.youtube.com/live/d__ras4nLU4)
+
+
+ Dynamic AI is an experimental neural network model inspired by fractal structures in the universe and the human brain. It uses recursive nodes (FractalNodes) that grow dynamically and learn through Hebbian-like updates and pruning. The model also integrates a VAE (Variational Autoencoder) that encodes inputs into a latent space. This repository contains the code for training, chatting, and interacting with the model via a Gradio interface.
+
+ ## Attention Mechanism (New)
+
+ The attention mechanism adjusts the focus of the model by assigning an importance weight to each child node in the fractal structure. Each node keeps one learnable score per child, and a softmax turns these scores into weights that scale each child's contribution during the forward pass, so some children count for more than others. The model also maintains a co-activation matrix, meant to track how often nodes are activated together, whose rows are added to the attention scores. In the current code this matrix is updated with small random increments rather than measured co-activations.
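The weighting described in the attention section can be sketched in plain Python. This is a simplified illustration, not the code from `app.py`: the helper names `softmax` and `combine_with_children` are made up for the example, vectors are plain lists, and every child's output is added to the same parent output, whereas the real forward pass feeds the running result into each child in turn.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine_with_children(parent_output, child_outputs, attention_scores):
    """Add each child's output to the parent's, scaled by its attention weight."""
    weights = softmax(attention_scores)
    combined = list(parent_output)
    for weight, child in zip(weights, child_outputs):
        combined = [c + weight * v for c, v in zip(combined, child)]
    return combined

# Hypothetical values: two children, the first with the higher score
weights = softmax([2.0, 0.0])
result = combine_with_children([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
```

With equal scores, as at initialization, every child gets the same weight; the scores only matter once they drift apart.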
+
+
+ ## Features
+
+ - **Recursive Fractal Nodes**: Nodes can grow and create child nodes based on the complexity of their output, simulating the recursive, fractal-like nature of the brain and the universe.
+ - **Variational Autoencoder (VAE)**: Encodes latent representations of inputs.
+ - **Layer Normalization and Xavier Initialization**: Improve training stability.
+ - **Dynamic Complexity-based Growth**: Nodes grow when their complexity exceeds a threshold, up to a maximum number of children and a maximum depth.
+ - **Dynamic AI Chat**: Users can interact with the model to generate responses.
+ - **LM Studio Integration**: Chat with a local LM Studio instance in a collaborative conversational framework.
+ - **Gradio Interface**: A user-friendly interface to interact with the AI model, train it on Q&A pairs, and simulate conversations with LM Studio.
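The complexity-based growth rule from the feature list can be sketched as follows. This is a minimal plain-Python version, assuming the rule used in `app.py` (complexity is `log(1 + norm)` of the node output, with defaults of threshold 0.5, two children and depth 5); the function names `complexity` and `should_grow` are illustrative only.

```python
import math

def complexity(output):
    """log(1 + L2 norm) of a vector, mirroring calculate_complexity in app.py."""
    return math.log(1.0 + math.sqrt(sum(v * v for v in output)))

def should_grow(output, n_children, depth, threshold=0.5, max_children=2, max_depth=5):
    """A node grows a child only if all three conditions hold."""
    return (
        complexity(output) > threshold
        and n_children < max_children
        and depth < max_depth
    )

grows = should_grow([3.0, 4.0], n_children=0, depth=0)      # norm 5, complexity log(6)
too_quiet = should_grow([0.1, 0.1], n_children=0, depth=0)  # complexity about 0.13
full = should_grow([3.0, 4.0], n_children=2, depth=0)       # no free child slot
too_deep = should_grow([3.0, 4.0], n_children=0, depth=5)   # already at max depth
```

One thing to keep in mind: a freshly initialized LayerNorm output of dimension D has a norm of roughly sqrt(D), so with D = 256 the complexity is about log(17), well above 0.5. In practice the child and depth limits are probably what stop the growth, not the threshold.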
+
+ ## What is it?
+
+ Think of a fractal ball around the big bang, with chocolate-wafer-inspired super weights that add complexity to the normal weights.
+
+ Now with an added attention mechanism. I just asked Claude to think of node 1 as a sort of phone book that keeps tabs
+ on the child nodes and can hook them up if they fire together. So they can... wire together. Tsk tsk. You know what I mean.
+
+ The depth setting can make the ball's complexity explode into NaN territory real fast, and there was a real fight to keep the
+ complexity at bay.
+
+ ## Requirements
+ - Python 3.8+
+ - PyTorch
+ - Gradio
+ - LM Studio (optional, for integration with the `talk_with_lm_studio` feature)
+ - Etc.
+ The requirements.txt was written by ChatGPT. I have not tested whether it works as is.
+
+ ## Problems?
+
+ Ask Claude / ChatGPT. Paste them this README and the code. They will understand what to do. The NotebookLM
+ talking heads think this is groundbreaking. But since all I hear is crickets, I guess it ain't. It has most
+ definitely been a wild ride, though.
+
+ ## Installation
+
+ 1. Clone this repository:
+
+ ```bash
+ git clone https://github.com/anttiluode/DaFUC.git
+ cd DaFUC
+ ```
+
+ 2. Install dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. If you're planning to use LM Studio, ensure it's installed and running locally. Configure the `lm_studio_client` by setting your API key and URL in the code (the defaults in `app.py` are `http://localhost:1234/v1` and `lm-studio`).
+
+ 4. Run the application:
+
+ ```bash
+ python app.py
+ ```
+
+ ## Usage
+
+ ### 1. Chat with Dynamic AI
+
+ You can use the Gradio interface to chat with the Dynamic AI model.
+
+ - **Message**: Enter your message and adjust the temperature for creativity.
+ - **Response**: The AI will generate a response based on its learned knowledge.
+
+ ### 2. Train the Model on Q&A Pairs
+
+ You can train the model on a list of question-answer pairs using the Gradio interface.
+
+ - **Q&A Pairs File**: Upload a JSON file containing question-answer pairs.
+ - **Epochs**: Set the number of training epochs.
+ - **Training Output**: Monitor the progress of training, including loss metrics.
+ - **Caveat**: Training can throw the complexity wildly off, and the model begins to parrot the words in the
+   question-answer pairs.
+
+ ### 3. LM Studio Conversation
+
+ **Note:** You may have to wait a while for the conversation to start. I think there are several
+ empty interactions at first, but eventually the model says something and LM Studio grabs on to that.
+ If you teach the model with question-answer pairs, it sticks to them and the complexity
+ does not stabilize. In the initial live training video linked at the beginning of this README,
+ something amazing happened: the complexity stabilized at 16 and did not budge.
+
+ You can simulate a collaborative conversation between Dynamic AI and LM Studio:
+
+ - **Initial Message**: Set the initial message to start the conversation.
+ - **Duration**: Set the duration of the conversation in seconds.
+ - **Delay**: Set the delay between messages in seconds.
+
+ This is a good start. The question-answer pairs seem to produce a more random AI.
+
+ ### 4. Save/Load Model State
+
+ You can save and load the state of the model using the Gradio interface.
+
+ - **Save State**: Save the current model state to a file.
+ - **Load State**: Load a previously saved state to restore the model.
+
+ ## Example
+
+ To start Dynamic AI from the command line:
+
+ ```bash
+ python app.py
+ ```
+
+ Then open the Gradio interface in your browser. You can interact with the AI by typing messages, training it, or saving/loading its state.
+
+ ### Training Data Format
+
+ The Q&A pairs should be provided in a JSON file as a list of `[question, answer]` pairs, as in `questionanswerpairsample.json`:
+
+ ```json
+ [
+   ["What is the capital of France?", "Paris"],
+   ["Who wrote '1984'?", "George Orwell"]
+ ]
+ ```
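The training loop in `app.py` unpacks every entry as `(question, answer)`, and `questionanswerpairsample.json` stores two-element lists, so it may help to check a file for that shape before training. The loader below is a hypothetical helper (`load_qa_pairs` is not part of the repository) that takes the JSON text and rejects anything that is not a list of string pairs. Entries written as `{"question": ..., "answer": ...}` objects would otherwise unpack into their key names instead of their contents.

```python
import json

def load_qa_pairs(text):
    """Parse JSON text and return a list of (question, answer) string tuples."""
    data = json.loads(text)
    if not isinstance(data, list) or not data:
        raise ValueError("expected a non-empty JSON list")
    pairs = []
    for i, item in enumerate(data):
        is_pair = (
            isinstance(item, list)
            and len(item) == 2
            and all(isinstance(part, str) for part in item)
        )
        if not is_pair:
            raise ValueError(f"entry {i} is not a [question, answer] pair of strings")
        pairs.append((item[0], item[1]))
    return pairs

sample = '[["What is the capital of France?", "Paris"], ["Who wrote \'1984\'?", "George Orwell"]]'
pairs = load_qa_pairs(sample)
```

To validate a real file, read it with `open(path, encoding="utf-8").read()` and pass the text in.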
+
+ ### Contribution
+
+ Feel free to contribute to this project by submitting pull requests or opening issues for improvements or bugs.
+
+ ### Issues
+
+ The depth settings are extremely important:
+
+ `dynamic_ai = DynamicAI(vocab_size=50000, embed_dim=256, latent_dim=256, output_dim=256, max_depth=7)`
+
+ The deeper the "fractal ball" around point one (think big bang) gets, the more
+ complex it gets. You hit NaN (not a number) complexity very fast, and then the thing won't work.
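A rough way to see why depth matters is to count how many nodes the tree can hold. The sketch below assumes the defaults visible in `app.py` (at most two children per node, children allowed down to depth `max_depth`) and counts only the per-node parameters defined in `FractalNode`; the function names are illustrative. It says nothing about the NaN values themselves, only about how quickly the structure can grow.

```python
def max_nodes(max_depth, max_children=2):
    """Upper bound on node count: a full tree with levels 0..max_depth."""
    return sum(max_children ** level for level in range(max_depth + 1))

def params_per_node(dim, max_children=2):
    """Parameters of one node when input and output width are both `dim`."""
    linear = dim * dim + dim   # traditional_weight: weight matrix plus bias
    superweight = dim * dim    # square super weight
    layer_norm = 2 * dim       # LayerNorm scale and shift
    return linear + superweight + layer_norm + max_children  # plus attention scores

nodes = max_nodes(7)                   # the max_depth=7 setting from the README
total = nodes * params_per_node(256)   # 256-wide nodes, embeddings not counted
```

Under these assumptions `max_depth=7` allows up to 255 nodes and about 33.6 million node parameters, while `max_depth=5` caps the tree at 63 nodes. Each extra level roughly doubles the total.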
+
+ ### License
+
+ This project is licensed under the MIT License.
app.py ADDED
@@ -0,0 +1,488 @@
+ import torch
+ import torch.nn as nn
+ import torch.optim as optim
+ import torch.nn.functional as F
+ import random
+ import json
+ import logging
+ import gradio as gr
+ import time
+ from openai import OpenAI
+
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # VAE Class for Latent Space Encoding
+ class VAE(nn.Module):
+     def __init__(self, input_dim, latent_dim):
+         super().__init__()
+         # Encoder emits 2 * latent_dim values: the mean and log-variance halves
+         self.encoder = nn.Sequential(
+             nn.Linear(input_dim, 128),
+             nn.ReLU(),
+             nn.Linear(128, latent_dim * 2)
+         )
+         self.decoder = nn.Sequential(
+             nn.Linear(latent_dim, 128),
+             nn.ReLU(),
+             nn.Linear(128, input_dim)
+         )
+         self.latent_dim = latent_dim
+
+     def forward(self, x):
+         mu_logvar = self.encoder(x)
+         mu, logvar = torch.chunk(mu_logvar, 2, dim=-1)
+         z = self.sample_latent(mu, logvar)
+         recon = self.decoder(z)
+         return recon, mu, logvar, z
+
+     def sample_latent(self, mu, logvar):
+         # Reparameterization trick: z = mu + eps * sigma, with eps ~ N(0, I)
+         std = torch.exp(0.5 * logvar)
+         eps = torch.randn_like(std)
+         return mu + eps * std
+
+     def vae_loss(self, recon_x, x, mu, logvar):
+         # Reconstruction error (mean) plus KL divergence to N(0, I) (sum)
+         recon_loss = nn.MSELoss()(recon_x, x)
+         kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
+         return recon_loss + kld
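The KL term in `vae_loss` is the closed-form divergence between a diagonal Gaussian and the standard normal. Here is a small plain-Python check of that formula, written as a sketch: `kl_divergence` is an illustrative name and the inputs are lists instead of tensors.

```python
import math

def kl_divergence(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)) summed over dimensions, with logvar = log(sigma^2)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

kl_zero = kl_divergence([0.0, 0.0], [0.0, 0.0])     # already standard normal
kl_shifted = kl_divergence([1.0, 0.0], [0.0, 0.0])  # one mean moved by 1
```

The term is zero when the encoder outputs a standard normal and grows as the mean or variance moves away from it. In this commit `vae_loss` is defined but never called by the training loop, which optimizes only the MSE between the question and answer outputs.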
+
+ # FractalNode class for FUCWM with Attention
+ class FractalNode(nn.Module):
+     def __init__(self, input_dim, output_dim, depth=0, max_depth=5, max_children=2):
+         super().__init__()
+         self.traditional_weight = nn.Linear(input_dim, output_dim)
+         nn.init.xavier_uniform_(self.traditional_weight.weight)
+         # "Super weight": a square matrix that modulates the whole layer output
+         self.superweight = nn.Parameter(torch.eye(output_dim))
+         self.norm = nn.LayerNorm(output_dim)
+         self._children = []
+         self.is_active = True
+         self.max_children = max_children
+         self.complexity_threshold = 0.5
+         self.depth = depth
+         self.max_depth = max_depth
+         # One attention score per potential child slot
+         self.attention_weights = nn.Parameter(torch.ones(max_children))
+
+     def forward(self, x):
+         if x.dim() == 1:
+             x = x.unsqueeze(0)
+
+         base_output = self.traditional_weight(x)
+         base_output = self.norm(base_output)
+         complexity = self.calculate_complexity(base_output)
+
+         # Grow a child when the output is complex enough and there is room
+         if complexity > self.complexity_threshold and len(self._children) < self.max_children and self.depth < self.max_depth:
+             new_child = FractalNode(self.traditional_weight.out_features, self.traditional_weight.out_features,
+                                     depth=self.depth+1, max_depth=self.max_depth)
+             self._children.append(new_child)
+             self.add_module(f'child_{len(self._children)}', new_child)
+
+         modulated_output = torch.matmul(self.superweight, base_output.unsqueeze(-1)).squeeze(-1)
+
+         # Add each active child's output, scaled by its softmax attention weight
+         for i, child in enumerate(self._children):
+             if child.is_active:
+                 child_output = child(modulated_output)
+                 modulated_output = modulated_output + child_output * F.softmax(self.attention_weights, dim=0)[i]
+
+         return modulated_output
+
+     def calculate_complexity(self, output):
+         return torch.log(1 + torch.norm(output))
+
+     def calculate_relevance(self, child_output):
+         return torch.sigmoid(torch.sum(child_output * self.superweight))
+
+     def update_superweights(self, context):
+         # Hebbian-like nudge of the super weight toward the current context
+         context_influence = torch.tanh(torch.matmul(self.superweight, context.unsqueeze(-1))).squeeze(-1)
+         self.superweight.data = self.superweight.data + 0.01 * context_influence
+         for child in self._children:
+             if child.is_active:
+                 child.update_superweights(context)
+
+     def grow(self, complexity_threshold):
+         if self.calculate_complexity(self.traditional_weight.weight) > complexity_threshold and len(self._children) < self.max_children and self.depth < self.max_depth:
+             new_child = FractalNode(self.traditional_weight.out_features, self.traditional_weight.out_features,
+                                     depth=self.depth+1, max_depth=self.max_depth)
+             self._children.append(new_child)
+             self.add_module(f'child_{len(self._children)}', new_child)
+         for child in self._children:
+             child.grow(complexity_threshold)
+
+     def update_attention(self, co_activation_vector):
+         # Only slots backed by an existing child get a boost; adding a slice of a
+         # different length to the full weight vector fails for childless nodes.
+         n = min(len(self._children), co_activation_vector.numel())
+         self.attention_weights.data[:n] += co_activation_vector[:n]
+         self.attention_weights.data = F.softmax(self.attention_weights.data, dim=0)
+
+     @property
+     def complexity(self):
+         return torch.norm(self.superweight)
+
+     # Note: this property shadows nn.Module.children(), so helpers that call it
+     # as a method (for example .to() or .eval()) will not work on this class.
+     @property
+     def children(self):
+         return self._children
+
+ # FUCWM class with Attention
+ class FUCWM(nn.Module):
+     def __init__(self, vocab_size, embed_dim, output_dim, max_depth=5):
+         super().__init__()
+         self.word_embeddings = nn.Embedding(vocab_size, embed_dim)
+         self.root = FractalNode(embed_dim, output_dim, max_depth=max_depth)
+         self.max_depth = max_depth
+         # Plain tensor (not a buffer), so it is not part of state_dict
+         self.co_activation_matrix = torch.zeros((max_depth, max_depth))
+
+     def forward(self, x):
+         # Token ids are embedded and mean-pooled; float inputs are used as-is
+         if x.dtype == torch.long:
+             embedded = self.word_embeddings(x)
+             if embedded.dim() == 3:
+                 embedded = embedded.mean(dim=1)
+         else:
+             embedded = x
+
+         output = self.root(embedded)
+         self.update_co_activations()
+         return output
+
+     def grow(self, complexity_threshold):
+         self.root.grow(complexity_threshold)
+
+     def update_superweights(self, context):
+         self.root.update_superweights(context)
+
+     def manage_padding(self):
+         # Deactivate nodes at or beyond max_depth; otherwise switch nodes on above
+         # 0.5 and off below 0.1, based on the norm of their super weight
+         def _manage_padding(node, depth):
+             if depth >= self.max_depth:
+                 node.is_active = False
+             else:
+                 activation = torch.norm(node.superweight)
+                 if not node.is_active and activation > 0.5:
+                     node.is_active = True
+                 elif node.is_active and activation < 0.1:
+                     node.is_active = False
+             for child in node.children:
+                 _manage_padding(child, depth + 1)
+         _manage_padding(self.root, 0)
+
+     def update_co_activations(self):
+         # Placeholder: entries receive small random increments, not measured co-activations
+         for i in range(self.max_depth):
+             for j in range(self.max_depth):
+                 if i != j:
+                     self.co_activation_matrix[i][j] += 0.1 * random.random()
+
+         self.co_activation_matrix = F.softmax(self.co_activation_matrix, dim=1)
+
+     def update_attention_weights(self):
+         def update_node(node, depth):
+             # Nodes can sit at depth == max_depth, one past the last matrix row, so clamp
+             row = min(depth, self.max_depth - 1)
+             node.update_attention(self.co_activation_matrix[row])
+             for child in node.children:
+                 update_node(child, depth+1)
+
+         update_node(self.root, 0)
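The on/off logic in `manage_padding` uses two different thresholds, which gives it hysteresis: a node that is already active stays active until its activation drops quite low. A minimal sketch of just that rule, with the thresholds taken from the code above and the illustrative function name `next_active` (the extra depth check is left out):

```python
def next_active(is_active, activation, on_threshold=0.5, off_threshold=0.1):
    """Return the node's new active flag given its current flag and activation."""
    if not is_active and activation > on_threshold:
        return True
    if is_active and activation < off_threshold:
        return False
    return is_active  # between the thresholds nothing changes

stays_on = next_active(True, 0.3)     # active node, mid activation
stays_off = next_active(False, 0.3)   # inactive node, same activation
turns_on = next_active(False, 0.6)
turns_off = next_active(True, 0.05)
```

The same activation of 0.3 leaves an active node on and an inactive node off, which avoids flickering around a single cutoff.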
+
+ class DynamicAI:
+     def __init__(self, vocab_size=10000, embed_dim=64, latent_dim=64, output_dim=64, max_depth=5):
+         # Latent vectors from the VAE are fed to the root node, so latent_dim must equal embed_dim
+         self.vae = VAE(embed_dim, latent_dim)
+         self.model = FUCWM(vocab_size, embed_dim, output_dim, max_depth)
+         self.optimizer = optim.Adam(list(self.vae.parameters()) + list(self.model.parameters()), lr=0.0001)
+         self.scheduler = optim.lr_scheduler.StepLR(self.optimizer, step_size=5, gamma=0.1)
+         self.criterion = nn.MSELoss()
+         self.word_to_index = {}
+         self.index_to_word = {}
+         self.next_index = 0
+         self.lm_studio_client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
+
+     def tokenize(self, text):
+         # Whitespace tokenizer that assigns a new index to every unseen word
+         words = text.lower().split()
+         indices = []
+         for word in words:
+             if word not in self.word_to_index:
+                 self.word_to_index[word] = self.next_index
+                 self.index_to_word[self.next_index] = word
+                 self.next_index += 1
+             indices.append(self.word_to_index[word])
+         return torch.tensor(indices, dtype=torch.long).unsqueeze(0)
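The tokenizer builds its vocabulary on the fly. The same idea without PyTorch looks like this; it is a sketch with an illustrative class name (`Vocabulary`) that returns plain lists instead of tensors.

```python
class Vocabulary:
    """Grows a word-to-index mapping as new words are seen."""

    def __init__(self):
        self.word_to_index = {}
        self.index_to_word = {}

    def encode(self, text):
        indices = []
        for word in text.lower().split():
            if word not in self.word_to_index:
                index = len(self.word_to_index)  # next free index
                self.word_to_index[word] = index
                self.index_to_word[index] = word
            indices.append(self.word_to_index[word])
        return indices

vocab = Vocabulary()
first = vocab.encode("Hello fractal world")
second = vocab.encode("hello world again")
with_punctuation = vocab.encode("world.")
```

Because the text is split on whitespace only, punctuation stays attached to words, so `world.` gets its own index separate from `world`. Nothing in this scheme stops the index from growing past the embedding table size, so a very large vocabulary would need a bounds check.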
+
+     def chat(self, input_text, max_length=20, temperature=0.7):
+         input_tokens = self.tokenize(input_text)
+         thinking_process = []
+         with torch.no_grad():
+             embedded_q = self.model.word_embeddings(input_tokens)
+             _, _, _, z_q = self.vae(embedded_q.mean(dim=1))
+             output, node_info = self.fractal_thinking(z_q)
+             thinking_process.append(node_info)
+
+             response = []
+             for _ in range(max_length):
+                 # Temperature-scaled sampling over the output vector
+                 output = output / temperature
+                 probs = torch.softmax(output, dim=-1)
+                 # The sampled index is in [0, output_dim); an index with no known word ends the response
+                 next_word_index = torch.multinomial(probs, 1).item()
+                 next_word = self.index_to_word.get(next_word_index, "")
+                 if next_word:
+                     response.append(next_word)
+                     if next_word in ['.', '!', '?']:
+                         break
+                     next_token = self.tokenize(next_word)
+                     _, _, _, next_latent = self.vae(self.model.word_embeddings(next_token).mean(dim=1))
+                     output, node_info = self.fractal_thinking(next_latent)
+                     thinking_process.append(node_info)
+                 else:
+                     break
+
+         thinking_str = "\n".join(thinking_process)
+         response_str = ' '.join(response)
+         return f"Thinking Process:\n{thinking_str}\n\nResponse: {response_str}"
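The sampling step in `chat` divides the output by the temperature before the softmax. A plain-Python sketch of that step follows; the function names are illustrative and the standard library's `random.Random.choices` stands in for `torch.multinomial`.

```python
import math
import random

def temperature_softmax(logits, temperature):
    """Softmax of logits / temperature; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_index(logits, temperature, rng):
    """Draw one index according to the temperature-scaled probabilities."""
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.0]                  # hypothetical output vector
sharp = temperature_softmax(logits, 0.1)  # nearly all mass on index 0
flat = temperature_softmax(logits, 2.0)   # closer to uniform
picked = sample_index(logits, 0.7, random.Random(0))
```

The sampled index is then looked up in the vocabulary. In `chat`, an index with no word attached ends the response, which may be one reason an untrained model often replies with nothing.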
+
+     def fractal_thinking(self, input_vector):
+         # Walk the tree from the root, collecting a per-node report along the way
+         def traverse_node(node, x, depth):
+             node_info = f"Node depth: {depth}, Complexity: {node.complexity.item():.4f}, Children: {len(node.children)}"
+             output = node(x)
+
+             if depth < node.max_depth:
+                 for child in node.children:
+                     child_output, child_info = traverse_node(child, output, depth + 1)
+                     output = output + child_output * node.calculate_relevance(child_output)
+                     node_info += f"\n{child_info}"
+
+             return output, node_info
+
+         output, node_info = traverse_node(self.model.root, input_vector, 0)
+         return output, node_info
+
+     def talk_with_lm_studio(self, initial_message, conversation_duration=60, delay=2):
+         message = initial_message
+         start_time = time.time()
+         conversation_log = []
+
+         # Alternate turns until the time budget (in seconds) runs out
+         while time.time() - start_time < conversation_duration:
+             ai_response = self.chat(message)
+             logger.info(f"DynamicAI:\n{ai_response}")
+             conversation_log.append(f"DynamicAI:\n{ai_response}")
+             yield "\n\n".join(conversation_log)
+
+             ai_message = ai_response.split("Response: ")[-1].strip()
+
+             if not ai_message:
+                 logger.info("DynamicAI generated an empty response. Skipping LM Studio turn.")
+                 conversation_log.append("DynamicAI: [No response generated. Still learning...]")
+                 yield "\n\n".join(conversation_log)
+                 time.sleep(delay)
+                 continue
+
+             lm_studio_response = self.send_to_lm_studio(ai_message)
+             if lm_studio_response:
+                 logger.info(f"LM Studio: {lm_studio_response}")
+                 conversation_log.append(f"LM Studio: {lm_studio_response}")
+                 message = lm_studio_response
+                 yield "\n\n".join(conversation_log)
+             else:
+                 logger.warning("No response from LM Studio. Ending conversation.")
+                 break
+
+             time.sleep(delay)
+
+     def send_to_lm_studio(self, message):
+         if not message.strip():
+             logger.warning("Attempted to send an empty message to LM Studio. Skipping.")
+             return None
+
+         try:
+             completion = self.lm_studio_client.chat.completions.create(
+                 model="unsloth/Llama-3.2-3B-Instruct-GGUF",
+                 messages=[
+                     {"role": "system", "content": "You're talking to an experimental fractal AI that is still learning to communicate. If it doesn't respond or sends empty messages, please be patient and continue the conversation."},
+                     {"role": "user", "content": message}
+                 ],
+                 temperature=0.7,
+             )
+             response = completion.choices[0].message.content
+             return response
+         except Exception as e:
+             logger.error(f"Error sending to LM Studio: {str(e)}")
+             return None
+
+     def train_on_qa_pairs(self, qa_pairs, epochs=10):
+         # qa_pairs is a list of [question, answer] pairs; yields (epoch, avg_loss, errors) per epoch
+         if not isinstance(qa_pairs, list) or len(qa_pairs) == 0:
+             raise ValueError("qa_pairs must be a non-empty list")
+
+         logger.info(f"Training on {len(qa_pairs)} Q&A pairs for {epochs} epochs...")
+         for epoch in range(epochs):
+             total_loss = 0
+             errors = 0
+             random.shuffle(qa_pairs)
+             for i, (question, answer) in enumerate(qa_pairs):
+                 self.optimizer.zero_grad()
+
+                 try:
+                     q_tokens = self.tokenize(question)
+                     a_tokens = self.tokenize(answer)
+
+                     q_embedded = self.model.word_embeddings(q_tokens)
+                     _, _, _, q_latent = self.vae(q_embedded.mean(dim=1))
+
+                     a_embedded = self.model.word_embeddings(a_tokens)
+                     _, _, _, a_latent = self.vae(a_embedded.mean(dim=1))
+
+                     q_output = self.model(q_latent)
+                     a_output = self.model(a_latent)
+
+                     # Pull the question output toward the answer output
+                     loss = self.criterion(q_output, a_output)
+                     loss.backward()
+
+                     torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)
+
+                     self.optimizer.step()
+
+                     total_loss += loss.item()
+
+                     # Structural updates happen outside of backpropagation
+                     self.model.grow(complexity_threshold=0.5)
+                     self.model.update_superweights(q_output.detach())
+                     self.model.manage_padding()
+                     self.model.update_attention_weights()
+
+                     if i % 10 == 0:
+                         logger.info(f"Epoch {epoch+1}, Pair {i+1}/{len(qa_pairs)}, Loss: {loss.item():.4f}")
+
+                 except Exception as e:
+                     logger.error(f"Error processing pair: {question} | {answer}")
+                     logger.error(f"Error details: {str(e)}")
+                     errors += 1
+                     continue
+
+             avg_loss = total_loss / (len(qa_pairs) - errors) if len(qa_pairs) > errors else 0
+             logger.info(f"Epoch {epoch+1}/{epochs}, Average Loss: {avg_loss:.4f}, Errors: {errors}")
+
+             self.scheduler.step()
+
+             self.save_state(f"model_state_epoch_{epoch+1}.pth")
+
+             yield epoch + 1, avg_loss, errors
+
+     def save_state(self, filename):
+         state = {
+             'model_state': self.model.state_dict(),
+             'vae_state': self.vae.state_dict(),
+             'optimizer_state': self.optimizer.state_dict(),
+             'scheduler_state': self.scheduler.state_dict(),
+             'word_to_index': self.word_to_index,
+             'index_to_word': self.index_to_word,
+             'next_index': self.next_index
+         }
+         torch.save(state, filename)
+         logger.info(f"Model state saved to {filename}")
+
+     def load_state(self, filename):
+         state = torch.load(filename)
+         self.word_to_index = state['word_to_index']
+         self.index_to_word = state['index_to_word']
+         self.next_index = state['next_index']
+
+         # Recreate the grown child nodes first so the saved keys have somewhere to load into
+         self.rebuild_model_structure(state['model_state'])
+
+         self.model.load_state_dict(state['model_state'])
+         self.vae.load_state_dict(state['vae_state'])
+         self.optimizer.load_state_dict(state['optimizer_state'])
+         self.scheduler.load_state_dict(state['scheduler_state'])
+
+         logger.info(f"Model state loaded from {filename}")
+
+     def rebuild_model_structure(self, state_dict):
+         def rebuild_node(node, prefix):
+             # Collect the child numbers that appear directly under this prefix
+             child_indices = set()
+             for name in state_dict.keys():
+                 if name.startswith(prefix):
+                     parts = name[len(prefix):].split('.')
+                     if parts[0].startswith('child_'):
+                         child_index = int(parts[0].split('_')[1])
+                         child_indices.add(child_index)
+
+             for index in sorted(child_indices):
+                 while len(node._children) < index:
+                     new_child = FractalNode(node.traditional_weight.out_features,
+                                             node.traditional_weight.out_features,
+                                             depth=node.depth+1,
+                                             max_depth=node.max_depth)
+                     node._children.append(new_child)
+                     node.add_module(f'child_{len(node._children)}', new_child)
+
+                 child_prefix = f"{prefix}child_{index}."
+                 rebuild_node(node._children[index-1], child_prefix)
+
+         rebuild_node(self.model.root, "root.")
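Because nodes are added at run time, a saved `state_dict` can contain keys for children that a fresh model does not have yet, and `rebuild_model_structure` recovers the tree shape from the key names. The key-parsing part can be sketched on its own. The function name `child_indices` and the example keys are made up for illustration, although they follow the `child_N` naming used by `add_module` above.

```python
def child_indices(state_keys, prefix):
    """Return the sorted child numbers found directly under `prefix`."""
    found = set()
    for key in state_keys:
        if key.startswith(prefix):
            head = key[len(prefix):].split(".")[0]  # first name component after the prefix
            if head.startswith("child_"):
                found.add(int(head.split("_")[1]))
    return sorted(found)

# Hypothetical key names for a root with two children and one grandchild
keys = [
    "root.superweight",
    "root.child_1.superweight",
    "root.child_1.child_1.traditional_weight.weight",
    "root.child_2.norm.bias",
]
under_root = child_indices(keys, "root.")
under_first = child_indices(keys, "root.child_1.")
under_second = child_indices(keys, "root.child_2.")
```

Applying the same function recursively with a longer prefix at each level reproduces the whole tree, which is what the nested `rebuild_node` helper does.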
+
+     def grow(self, complexity_threshold):
+         self.model.grow(complexity_threshold)
+
+     def update_superweights(self, context):
+         self.model.update_superweights(context)
+
+     def manage_padding(self):
+         self.model.manage_padding()
+
+ # Gradio Interface for DynamicAI
+ def create_gradio_interface(ai):
+     def handle_chat(message, temperature):
+         return ai.chat(message, temperature=float(temperature))
+
+     def handle_save(filename):
+         ai.save_state(filename)
+         return f"State saved to {filename}"
+
+     def handle_load(filename):
+         ai.load_state(filename)
+         return f"State loaded from {filename}"
+
+     def handle_train_qa(qa_pairs_file, epochs):
+         try:
+             # gr.File may pass a path string or an object with a .name attribute, depending on the Gradio version
+             path = getattr(qa_pairs_file, "name", qa_pairs_file)
+             with open(path, 'r', encoding='utf-8') as f:
+                 qa_pairs = json.load(f)
+
+             epochs = int(epochs)  # gr.Number delivers a float
+             output = ["Starting training..."]
+             for epoch, loss, errors in ai.train_on_qa_pairs(qa_pairs, epochs=epochs):
+                 output.append(f"Epoch {epoch}/{epochs}, Loss: {loss:.4f}, Errors: {errors}")
+             output.append("Training completed successfully")
+             return "\n".join(output)
+         except Exception as e:
+             return f"Error during training: {str(e)}"
+
+     def handle_lm_studio_chat(initial_message, duration, delay):
+         # Stream the growing conversation log to the textbox
+         for log in ai.talk_with_lm_studio(initial_message, conversation_duration=float(duration), delay=float(delay)):
+             yield log
+
+     with gr.Blocks() as interface:
+         gr.Markdown("# Dynamic AI with Fractal Universe Chocolate Wafer Model and Attention Mechanism")
+
+         with gr.Tab("Chat"):
+             chat_input = gr.Textbox(label="Your message")
+             temperature = gr.Slider(minimum=0.1, maximum=2.0, value=0.7, label="Temperature")
+             chat_output = gr.Textbox(label="AI response")
+             chat_button = gr.Button("Chat")
+             chat_button.click(handle_chat, inputs=[chat_input, temperature], outputs=chat_output)
+
+         with gr.Tab("LM Studio Conversation"):
+             initial_message = gr.Textbox(label="Initial Message")
+             duration = gr.Number(label="Conversation Duration (seconds)", value=60)
+             delay = gr.Number(label="Delay between messages (seconds)", value=2)
+             conversation_log = gr.Textbox(label="Conversation Log", lines=20)
+             start_conversation = gr.Button("Start Conversation")
+             start_conversation.click(handle_lm_studio_chat, inputs=[initial_message, duration, delay], outputs=conversation_log)
+
+         with gr.Tab("Train on Q&A"):
+             qa_file = gr.File(label="Q&A Pairs JSON File")
+             epochs_input = gr.Number(label="Number of Epochs", value=10)
+             train_button = gr.Button("Train on Q&A Pairs")
+             train_output = gr.Textbox(label="Training status")
+             train_button.click(handle_train_qa, inputs=[qa_file, epochs_input], outputs=train_output)
+
+         with gr.Tab("Save/Load State"):
+             filename_input = gr.Textbox(label="Filename")
+             save_button = gr.Button("Save State")
+             load_button = gr.Button("Load State")
+             state_output = gr.Textbox(label="Operation result")
+             save_button.click(handle_save, inputs=filename_input, outputs=state_output)
+             load_button.click(handle_load, inputs=filename_input, outputs=state_output)
+
+     return interface
+
+ # Main execution
+ if __name__ == "__main__":
+     dynamic_ai = DynamicAI(vocab_size=50000, embed_dim=256, latent_dim=256, output_dim=256, max_depth=7)
+     iface = create_gradio_interface(dynamic_ai)
+     iface.launch()
notes.txt ADDED
@@ -0,0 +1,28 @@
+ It is most interesting when you let LM Studio talk to it from the start.
+
+ The depth settings (at the end of the code) can quickly lead to explosive complexity growth.
+
+ "I" (Claude / ChatGPT) added a lot of things from actual AIs into it, i.e. forward / backward pass, VAE etc. Have fun.
+
+ I do not really claim to understand it well. I had a faint idea of fractals originating from the big bang (matter and galaxies are spread in a fractal shape),
+ or of thoughts being processed in the brain and AI as fractals, which has been the mainstream idea since 1986.
+
+ Interestingly, most of the thinking is happening in the outermost layer, as in the "cortex". But perhaps that is natural: when you have a big lump
+ of thinking nodes, most of them are on the outer edge.
+
+ The chocolate wafer model was a way to try to make a normal AI model more dynamic by introducing a "super weight". Imagine a wafer full of normal weights,
+ with a bunch of them constituting a super weight. That was somehow transferred to this code from older code I wrote to try to make a normal AI dynamic.
+ I was thinking: if it is impossible to change them all, how about we change clusters of them?
+
+ It was coded by Claude / ChatGPT, although it was like wrangling cats to make the dynamic AI talk. The problem is that normal AIs deal with
+ matrix multiplications etc. that are known. In a fractal, nothing is known; it is a growing ball of mathematics. We did scalars etc.
+ We tried to originate the fractals from the "big bang" and do forward and backward passes between the big bang and the outer layers. I do not
+ claim to understand what was coded in myself, either.
+
+ It was the end result of a weird week.
+
+ Most of the problems came from trying to keep the complexities at bay. At some settings, the complexity grew to NaN very fast.
+
+ Chatting with it through LM Studio makes it seem almost like it makes sense. After being taught the JSON question-answer pairs, it answers in a jumbled mess.
+
+ When you talk to it for the first time, it is like the big bang. The first fractal appears... :D
questionanswerpairsample.json ADDED
@@ -0,0 +1,54 @@
1
+ [
2
+ ["What is the largest planet in our solar system?", "Jupiter is the largest planet in our solar system, known for its massive size and prominent storm, the Great Red Spot."],
3
+ ["Who was the first President of the United States?", "George Washington was the first President of the United States, serving from 1789 to 1797."],
4
+ ["What is the speed of light?", "The speed of light is approximately 299,792 kilometers per second (186,282 miles per second) in a vacuum."],
5
+ ["What is the capital of Japan?", "The capital of Japan is Tokyo, a major cultural, political, and economic hub in the world."],
6
+ ["Who developed the theory of relativity?", "Albert Einstein developed the theory of relativity, which revolutionized the understanding of space, time, and gravity."],
7
+ ["What is the primary gas found in Earth's atmosphere?", "Nitrogen makes up about 78% of Earth's atmosphere, making it the most abundant gas."],
8
+ ["Which artist painted the Mona Lisa?", "Leonardo da Vinci painted the Mona Lisa, one of the most famous artworks in history."],
9
+ ["What is the smallest unit of matter?", "The atom is the smallest unit of matter that retains the properties of an element."],
10
+ ["Who discovered penicillin?", "Alexander Fleming discovered penicillin in 1928, which led to the development of antibiotics."],
11
+ ["What is the chemical symbol for gold?", "The chemical symbol for gold is Au, derived from its Latin name 'Aurum'."],
12
+ ["What is the largest organ in the human body?", "The skin is the largest organ in the human body, protecting internal organs and regulating body temperature."],
13
+ ["Who wrote '1984'?", "George Orwell wrote '1984', a dystopian novel about a totalitarian regime and surveillance state."],
14
+ ["What is the freezing point of water?", "The freezing point of water is 0 degrees Celsius (32 degrees Fahrenheit)."],
15
+ ["Which country is the most populous democracy?", "India is the most populous democracy in the world, with over a billion citizens."],
16
+ ["What is the hardest natural substance on Earth?", "Diamond is the hardest natural substance, formed under extreme pressure and temperature conditions."],
17
+ ["What is the largest desert on Earth?", "The Antarctic Desert is the largest desert in the world, covering about 14 million square kilometers."],
18
+ ["Who is the founder of Microsoft?", "Bill Gates co-founded Microsoft in 1975 with Paul Allen, creating one of the largest tech companies in the world."],
19
+ ["What is the tallest mountain in the world?", "Mount Everest, located in the Himalayas, is the tallest mountain on Earth, standing at 8,848 meters (29,029 feet)."],
20
+ ["What is the main ingredient in bread?", "Flour is the main ingredient in bread, combined with water, yeast, and salt."],
21
+ ["What is the symbol for iron on the periodic table?", "The chemical symbol for iron is Fe, derived from its Latin name 'Ferrum'."],
22
+ ["Who wrote 'The Odyssey'?", "Homer is traditionally credited with writing 'The Odyssey', an epic poem about the Greek hero Odysseus."],
23
+ ["What is the boiling point of water?", "The boiling point of water is 100 degrees Celsius (212 degrees Fahrenheit) at sea level."],
24
+ ["Who was the first human to travel into space?", "Yuri Gagarin, a Soviet cosmonaut, became the first human to travel into space in 1961."],
25
+ ["What is the largest mammal on Earth?", "The blue whale is the largest mammal on Earth, reaching lengths of up to 30 meters (98 feet)."],
26
+ ["What is the longest river in the world?", "The Nile is considered the longest river in the world, stretching about 6,650 kilometers (4,130 miles)."],
27
+ ["Who painted the ceiling of the Sistine Chapel?", "Michelangelo painted the ceiling of the Sistine Chapel between 1508 and 1512, depicting scenes from the Bible."],
28
+ ["What is the most abundant element in the universe?", "Hydrogen is the most abundant element in the universe, making up roughly 75% of all visible matter."],
29
+ ["What does DNA stand for?", "DNA stands for deoxyribonucleic acid, the molecule that carries genetic information in living organisms."],
30
+ ["Which country is known as the Land of the Rising Sun?", "Japan is often referred to as the Land of the Rising Sun, symbolizing its eastern location in Asia."],
31
+ ["What is the largest ocean on Earth?", "The Pacific Ocean is the largest ocean, covering more than 63 million square miles."],
32
+ ["Who was the first woman to win a Nobel Prize?", "Marie Curie was the first woman to win a Nobel Prize, receiving the award in Physics in 1903."],
33
+ ["What is the longest bone in the human body?", "The femur, or thigh bone, is the longest bone in the human body."],
34
+ ["What is the capital city of Canada?", "Ottawa is the capital city of Canada, located in the province of Ontario."],
35
+ ["Who invented the telephone?", "Alexander Graham Bell is credited with inventing the first practical telephone in 1876."],
36
+ ["What is the square root of 81?", "The square root of 81 is 9."],
37
+ ["Which planet is known as the Red Planet?", "Mars is known as the Red Planet due to its reddish appearance, caused by iron oxide on its surface."],
38
+ ["What is the main language spoken in Brazil?", "Portuguese is the official and most widely spoken language in Brazil."],
39
+ ["Who composed the music for 'The Four Seasons'?", "Antonio Vivaldi composed 'The Four Seasons', a set of violin concertos representing the seasons."],
40
+ ["What is the name of the longest running Broadway show?", "The longest running Broadway show is 'The Phantom of the Opera', which premiered in 1988."],
41
+ ["What is the smallest continent by land area?", "Australia is the smallest continent by land area, covering about 7.7 million square kilometers."],
42
+ ["Who discovered gravity?", "Sir Isaac Newton is credited with formulating the law of universal gravitation in the 17th century."],
43
+ ["What is the primary function of red blood cells?", "Red blood cells transport oxygen from the lungs to the rest of the body and carry carbon dioxide back to the lungs."],
44
+ ["Who was the first Emperor of China?", "Qin Shi Huang was the first Emperor of China, uniting the country in 221 BC."],
45
+ ["What is the chemical formula for water?", "The chemical formula for water is H2O, consisting of two hydrogen atoms and one oxygen atom."],
46
+ ["What is the largest species of shark?", "The whale shark is the largest species of shark, reaching lengths of up to 12 meters (40 feet)."],
47
+ ["Which country is home to the Great Barrier Reef?", "Australia is home to the Great Barrier Reef, the world's largest coral reef system."],
48
+ ["Who wrote 'Pride and Prejudice'?", "Jane Austen wrote 'Pride and Prejudice', a classic novel first published in 1813."],
49
+ ["What is the hottest planet in the solar system?", "Venus is the hottest planet in the solar system, with surface temperatures over 450°C (840°F) due to its thick atmosphere."],
50
+ ["What is the currency of Japan?", "The currency of Japan is the yen (¥)."],
51
+ ["Which element has the atomic number 1?", "Hydrogen has the atomic number 1, making it the first element on the periodic table."],
52
+ ["Who painted 'Starry Night'?", "Vincent van Gogh painted 'Starry Night' in 1889, one of his most famous works."],
53
+ ["What is the highest-grossing film of all time?", "As of 2024, 'Avatar' is the highest-grossing film of all time, directed by James Cameron."]
54
+ ]
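The sample file above is a JSON array whose entries are two-element `[question, answer]` arrays of strings. A loader for that shape might look like the sketch below. The function name `load_qa_pairs` is hypothetical and this is not the loading code from app.py; it uses only the standard library and checks the structure before returning the pairs.

```python
import json

def load_qa_pairs(text):
    """Parse a JSON array of [question, answer] string pairs and validate it."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("top-level JSON value must be a list")
    pairs = []
    for i, item in enumerate(data):
        is_pair = (
            isinstance(item, list)
            and len(item) == 2
            and all(isinstance(part, str) for part in item)
        )
        if not is_pair:
            raise ValueError(f"entry {i} is not a [question, answer] pair of strings")
        pairs.append((item[0], item[1]))
    return pairs

# One entry copied from the sample file, used here as inline test data.
sample = '[["What is the square root of 81?", "The square root of 81 is 9."]]'
pairs = load_qa_pairs(sample)
print(pairs[0][0])
```

To read the actual file, pass its contents in, for example `load_qa_pairs(open("questionanswerpairsample.json", encoding="utf-8").read())`.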
requirements.txt ADDED
@@ -0,0 +1,9 @@
1
+ torch
2
+ gradio
3
+ openai==1.41
4
+ transformers
5
+ numpy
6
+ scipy
7
+ protobuf
8
+ requests
9
+ jsonschema