JarvisChan630 committed
Commit de65d7b
1 Parent(s): fc95199
Dockerfile CHANGED
@@ -10,6 +10,7 @@ WORKDIR /app

USER root
RUN chmod -R 777 /app
+
# Install minimal required build tools and dependencies for Playwright
RUN apt-get update && apt-get install -y \
    gcc \
@@ -44,9 +45,10 @@ COPY . .
# COPY config/config.yaml /app/config/config.yaml
# COPY config/config.yaml /config/config.yaml
# COPY agent_memory/jar3d_final_response_previous_run.txt /app/agent_memory/jar3d_final_response_previous_run.txt
- COPY agent_memory/jar3d_final_response_previous_run.txt /home/user/app/agent_memory/jar3d_final_response_previous_run.txt


+ ENTRYPOINT ["entrypoint.sh"]
+
# Expose the port Chainlit runs on
EXPOSE 8000

README.md CHANGED
@@ -30,7 +30,7 @@ Thanks John Adeojo, who brings this wonderful project to open source community!

## TODO
[] fix "/end" meta expert 503 error,maybe we should "Retry".
- [] deploy to Huggingface
+ [x] deploy to Huggingface


[] Long-term memory.
@@ -63,16 +63,30 @@ In `offline_graph_rag` file, we combine similarity search with

## Table of Contents

- 1. [Core Concepts](#core-concepts)
- 2. [Prerequisites](#prerequisites)
- 3. [Configuration](#configuration)
-    - [API Key Configuration](#api-key-configuration)
-    - [Endpoints Configuration](#endpoints-configuration)
- 4. [Setup for Basic Meta Agent](#setup-for-basic-meta-agent)
- 5. [Setup for Jar3d](#setup-for-jar3d)
-    - [Docker Setup for Jar3d](#docker-setup-for-jar3d)
-    - [Interacting with Jar3d](#interacting-with-jar3d)
- 6. [Roadmap for Jar3d](#roadmap-for-jar3d)
+ - [Super Expert](#super-expert)
+ - [Tech Stack](#tech-stack)
+ - [TODO](#todo)
+ - [PMF - What problem this project has solved?](#pmf---what-problem-this-project-has-solved)
+ - [Business Logics](#business-logics)
+ - [LLM Application Workflow](#llm-application-workflow)
+ - [Bullet points](#bullet-points)
+ - [FAQ](#faq)
+ - [Table of Contents](#table-of-contents)
+ - [Core Concepts](#core-concepts)
+ - [Prerequisites](#prerequisites)
+ - [Environment Setup](#environment-setup)
+ - [Repository Setup](#repository-setup)
+ - [Configuration](#configuration)
+   - [API Key Configuration](#api-key-configuration)
+   - [Endpoints Configuration](#endpoints-configuration)
+ - [Setup for Basic Meta Agent](#setup-for-basic-meta-agent)
+   - [Run Your Query in Shell](#run-your-query-in-shell)
+ - [Setup for Jar3d](#setup-for-jar3d)
+   - [Docker Setup for Jar3d](#docker-setup-for-jar3d)
+     - [Prerequisites](#prerequisites-1)
+     - [Quick Start](#quick-start)
+     - [Notes](#notes)
+   - [Interacting with Jar3d](#interacting-with-jar3d)

## Core Concepts

agent_memory/jar3d_final_response_previous_run.txt CHANGED
@@ -1,131 +0,0 @@
- # Literature Review on the Current State of Large Language Models (LLMs)
-
- ## Introduction
-
- Large Language Models (LLMs) have revolutionized the field of Natural Language Processing (NLP) by demonstrating unprecedented capabilities in language understanding and generation. These models have significantly impacted various domains, including machine translation, question-answering systems, and content creation. This literature review provides a comprehensive overview of the advancements in LLMs up to 2023, focusing on architecture developments, training techniques, ethical considerations, and practical applications.
-
- ## Architecture Advancements
-
- ### Transformer Architecture
-
- The introduction of the Transformer architecture by Vaswani et al. (2017) marked a pivotal moment in NLP. By utilizing self-attention mechanisms, Transformers addressed the limitations of recurrent neural networks, particularly in handling long-range dependencies and parallelization during training.
-
- ### GPT Series
-
- OpenAI's Generative Pre-trained Transformer (GPT) series has been instrumental in pushing the boundaries of LLMs:
-
- - **GPT-2** (Radford et al., 2019): Featured 1.5 billion parameters and demonstrated coherent text generation, raising awareness about the potential and risks of LLMs.
- - **GPT-3** (Brown et al., 2020): Expanded to 175 billion parameters, exhibiting remarkable few-shot learning abilities and setting new benchmarks in NLP tasks.
-
- ### Scaling Laws and Large-Scale Models
-
- Kaplan et al. (2020) established empirical scaling laws, showing that model performance scales predictably with computational resources, model size, and dataset size. This led to the development of even larger models:
-
- - **Megatron-Turing NLG 530B** (Smith et al., 2022): A collaboration between NVIDIA and Microsoft, this model contains 530 billion parameters, enhancing language generation capabilities.
- - **PaLM** (Chowdhery et al., 2022): Google's 540-billion-parameter model showcased state-of-the-art performance in reasoning and language understanding tasks.
-
- ## Training Techniques
-
- ### Unsupervised and Self-Supervised Learning
-
- LLMs are primarily trained using vast amounts of unlabelled text data through unsupervised or self-supervised learning, enabling them to learn language patterns without explicit annotations (Devlin et al., 2019).
-
- ### Fine-Tuning and Transfer Learning
-
- Fine-tuning allows LLMs to adapt to specific tasks by training on smaller, task-specific datasets. Techniques like Transfer Learning have been crucial in applying general language understanding to specialized domains (Howard & Ruder, 2018).
-
- ### Instruction Tuning and Prompt Engineering
-
- Wei et al. (2021) introduced instruction tuning, enhancing LLMs' ability to follow human instructions by fine-tuning on datasets with task instructions. Prompt engineering has emerged as a method to elicit desired behaviors from LLMs without additional training.
-
- ### Reinforcement Learning from Human Feedback (RLHF)
-
- RLHF incorporates human preferences to refine model outputs, aligning them with human values and improving safety (Christiano et al., 2017).
-
- ## Ethical Considerations
-
- ### Bias and Fairness
-
- LLMs can inadvertently perpetuate biases present in their training data. Studies have highlighted issues related to gender, race, and cultural stereotypes (Bender et al., 2021). Efforts are ongoing to mitigate biases through data curation and algorithmic adjustments (Bolukbasi et al., 2016).
-
- ### Misinformation and Content Moderation
-
- The ability of LLMs to generate plausible but incorrect or harmful content poses risks in misinformation dissemination. OpenAI has explored content moderation strategies and responsible deployment practices (Solaiman et al., 2019).
-
- ### Privacy Concerns
-
- Training on large datasets may include sensitive information, raising privacy issues. Techniques like differential privacy are being explored to protect individual data (Abadi et al., 2016).
-
- ### Transparency and Interpretability
-
- Understanding the decision-making processes of LLMs is challenging due to their complexity. Research into explainable AI aims to make models more interpretable (Danilevsky et al., 2020), which is critical for trust and regulatory compliance.
-
- ## Applications
-
- ### Healthcare
-
- LLMs assist in clinical documentation, patient communication, and research data analysis. They facilitate faster diagnosis and personalized treatment plans (Jiang et al., 2020).
-
- ### Finance
-
- In finance, LLMs are used for algorithmic trading, risk assessment, and customer service automation, enhancing efficiency and decision-making processes (Yang et al., 2020).
-
- ### Education
-
- Educational technologies leverage LLMs for personalized learning experiences, automated grading, and language tutoring, contributing to improved learning outcomes (Zawacki-Richter et al., 2019).
-
- ### Legal Sector
-
- LLMs aid in legal document analysis, contract review, and summarization, reducing manual workloads and increasing accuracy (Bommarito & Katz, 2018).
-
- ### Customer Service and Virtual Assistants
-
- Chatbots and virtual assistants powered by LLMs provide customer support, handle inquiries, and perform tasks, improving user engagement and satisfaction (Xu et al., 2020).
-
- ## Conclusion
-
- Advancements in Large Language Models up to 2023 have significantly influenced AI and NLP, leading to models capable of understanding and generating human-like text. Progress in model architectures and training techniques has expanded their applicability across diverse industries. However, ethical considerations regarding bias, misinformation, and transparency remain critical challenges. Addressing these concerns is essential for the responsible development and deployment of LLMs. Future research is expected to focus on enhancing model efficiency, interpretability, and alignment with human values.
-
- ## References
-
- - Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., & Zhang, L. (2016). **Deep Learning with Differential Privacy.** *Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security*, 308-318.
-
- - Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). **On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?** *Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency*, 610-623.
-
- - Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). **Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.** *Advances in Neural Information Processing Systems*, 4349-4357.
-
- - Bommarito, M. J., & Katz, D. M. (2018). **A Statistical Analysis of the Predictive Technologies of Law and the Future of Legal Practice.** *Stanford Technology Law Review*, 21, 286.
-
- - Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). **Language Models are Few-Shot Learners.** *Advances in Neural Information Processing Systems*, 33, 1877-1901.
-
- - Chowdhery, A., Narang, S., Devlin, J., et al. (2022). **PaLM: Scaling Language Modeling with Pathways.** *arXiv preprint* arXiv:2204.02311.
-
- - Christiano, P. F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). **Deep Reinforcement Learning from Human Preferences.** *Advances in Neural Information Processing Systems*, 30.
-
- - Danilevsky, M., Qian, Y., Aharon, R., et al. (2020). **A Survey of the State of Explainable AI for Natural Language Processing.** *arXiv preprint* arXiv:2010.00711.
-
- - Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). **BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.** *Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics*, 4171-4186.
-
- - Howard, J., & Ruder, S. (2018). **Universal Language Model Fine-tuning for Text Classification.** *Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics*, 328-339.
-
- - Jiang, F., Jiang, Y., Zhi, H., et al. (2020). **Artificial Intelligence in Healthcare: Past, Present and Future.** *Stroke and Vascular Neurology*, 5(2), 230-243.
-
- - Kaplan, J., McCandlish, S., Henighan, T., et al. (2020). **Scaling Laws for Neural Language Models.** *arXiv preprint* arXiv:2001.08361.
-
- - Radford, A., Wu, J., Child, R., et al. (2019). **Language Models are Unsupervised Multitask Learners.** *OpenAI Blog*, 1(8).
-
- - Smith, S., Gray, J., Forte, S., et al. (2022). **Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model.** *arXiv preprint* arXiv:2201.11990.
-
- - Solaiman, I., Brundage, M., Clark, J., et al. (2019). **Release Strategies and the Social Impacts of Language Models.** *arXiv preprint* arXiv:1908.09203.
-
- - Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). **Attention is All You Need.** *Advances in Neural Information Processing Systems*, 30.
-
- - Wei, J., Bosma, M., Zhao, V., et al. (2021). **Finetuned Language Models Are Zero-Shot Learners.** *arXiv preprint* arXiv:2109.01652.
-
- - Xu, P., Liu, Z., Ou, C., & Li, W. (2020). **Leveraging Pre-trained Language Model in Machine Reading Comprehension with Multi-task Learning.** *Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing*, 226-236.
-
- - Yang, X., Yin, Z., & Li, Y. (2020). **Application of Artificial Intelligence in Financial Industry.** *Journal of Physics: Conference Series*, 1486(4), 042047.
-
- - Zawacki-Richter, O., Marín, V. I., Bond, M., & Gouverneur, F. (2019). **Systematic Review of Research on Artificial Intelligence Applications in Higher Education – Where Are the Educators?** *International Journal of Educational Technology in Higher Education*, 16(1), 1-27.
-
- ---

models/llms.py CHANGED
@@ -23,7 +23,7 @@ class BaseModel:
        self.retry_delay = retry_delay

    # @retry(stop=stop_after_attempt(3), wait=wait_fixed(1), retry=retry_if_exception_type(requests.RequestException))
- @retry(stop=stop_after_attempt(5), wait=wait_exponential(multiplier=1, min=2, max=10), reraise=True)
+ @retry(stop=stop_after_attempt(5), wait=wait_exponential(multiplier=1, min=10, max=15), reraise=True)
    def _make_request(self, url, headers, payload):

        response = requests.post(url, headers=headers, data=json.dumps(payload))
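For context on the tuning above: tenacity's `wait_exponential(multiplier=1, min=10, max=15)` grows the back-off exponentially but clamps every wait to the 10-15 second window, `stop_after_attempt(5)` allows at most five attempts, and `reraise=True` re-raises the final exception instead of wrapping it in a `RetryError`. A minimal self-contained sketch of the same policy, using a placeholder function rather than the repository's `BaseModel._make_request` (the URL, headers, and payload are illustrative only):

```python
# Minimal sketch of the retry policy used in models/llms.py above.
# The endpoint, headers, and payload are placeholders, not values from this repo.
import json

import requests
from tenacity import retry, stop_after_attempt, wait_exponential


@retry(stop=stop_after_attempt(5),
       wait=wait_exponential(multiplier=1, min=10, max=15),
       reraise=True)
def make_request(url: str, headers: dict, payload: dict) -> dict:
    # Any exception raised here (a connection error, or the HTTPError from
    # raise_for_status) triggers another attempt after a clamped back-off.
    response = requests.post(url, headers=headers, data=json.dumps(payload), timeout=60)
    response.raise_for_status()
    return response.json()


# Example usage (placeholder endpoint):
# make_request("https://api.example.com/v1/chat", {"Content-Type": "application/json"}, {"prompt": "hi"})
```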
tools/legacy/offline_graph_rag_tool copy.py CHANGED
@@ -331,7 +331,7 @@ def create_graph_index(
    if os.environ.get('LLM_SERVER') == "openai":
        # require hundreds calls to api
        # we create index for every small chunk
- llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini")
+ llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18")

    else:
        llm = ChatAnthropic(temperature=0, model_name="claude-3-haiku-20240307")
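The three files below receive the same change: the floating `gpt-4o-mini` alias is pinned to the dated `gpt-4o-mini-2024-07-18` snapshot. A minimal sketch of the env-gated selection these hunks touch; the `langchain_openai` and `langchain_anthropic` import paths are assumptions, since the repository's actual imports are not part of this diff:

```python
# Sketch of the LLM_SERVER-gated model selection changed in these hunks.
# Import paths are assumed (langchain_openai / langchain_anthropic); the
# repo's real imports are not shown in this commit.
import os

from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI


def pick_llm():
    if os.environ.get("LLM_SERVER") == "openai":
        # Pinning a dated snapshot keeps graph-index extraction stable even if
        # the floating "gpt-4o-mini" alias is later re-pointed to a newer model.
        return ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18")
    return ChatAnthropic(temperature=0, model_name="claude-3-haiku-20240307")
```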
tools/legacy/rag_tool.py CHANGED
@@ -318,7 +318,7 @@ def create_graph_index(
    graph: Neo4jGraph = None,
    max_threads: int = 5
) -> Neo4jGraph:
- llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini")
+ llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18")

    # llm = ChatAnthropic(temperature=0, model_name="claude-3-haiku-20240307")

tools/offline_graph_rag_tool.py CHANGED
@@ -324,7 +324,7 @@ def create_graph_index(
) -> Neo4jGraph:

    if os.environ.get('LLM_SERVER') == "openai":
- llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini")
+ llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18")
    else:
        llm = ChatAnthropic(temperature=0, model_name="claude-3-haiku-20240307")

tools/offline_graph_rag_tool_with_async.py CHANGED
@@ -321,7 +321,7 @@ def create_graph_index(
) -> Neo4jGraph:

    if os.environ.get('LLM_SERVER') == "openai":
- llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini")
+ llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18")
    else:
        llm = ChatAnthropic(temperature=0, model_name="claude-3-haiku-20240307")
