
DEEPNIGHT ai1

The 600 Billion+ Parameter Model. Yes! We did this!

The second largest model in the world, right after GPT-4.


We at DEEPNIGHT have been working on this for quite some time. We have successfully built the second-largest model, called ai1, which comes with more than 600 billion parameters.

ai1 performs as well as GPT-4 and has a context window of 8k tokens. ai1 was trained with a new approach: we first trained the model on a corpus of text from various sources, including but not limited to:

  • RefinedWeb
  • Opensource code from GitHub
  • Common Crawl

We then fine-tuned the model on a huge dataset (generated manually and with automation) for logical understanding and reasoning. We also trained the model for function-calling capabilities.

What is special about ai1?

ai1 works on a built-in chaining methodology. When it receives an input from the user, it first tries to understand that input before starting generation: it generates an instruction-based prompt internally, and only then generates the response. The benefit of this? We'll just say the job of Prompt Engineering is over.

Unlike ChatGPT, GPT-4, Llama, and other models, ai1 doesn't require heavy prompt engineering to provide good answers. The built-in understanding phase takes care of that.
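The two-phase behaviour described above can be sketched as follows. This is only an illustration of the idea: the `generate` helper and both function names are hypothetical, since the internals of ai1 have not been published.

```python
# Hypothetical sketch of ai1's built-in chaining methodology.
# `generate` stands in for a single pass of the model; its name
# and signature are assumptions, not a published API.

def generate(prompt: str) -> str:
    # Placeholder for the model's actual text generation.
    return f"<model output for: {prompt!r}>"

def chained_response(user_input: str) -> str:
    # Phase 1: understand the input and rewrite it internally
    # as an explicit, instruction-based prompt.
    internal_prompt = generate(
        "Rewrite the following user input as a clear, "
        f"self-contained instruction:\n{user_input}"
    )
    # Phase 2: generate the final response from the internally
    # constructed instruction, not the raw user input.
    return generate(internal_prompt)
```

In this sketch the prompt-engineering work is done by Phase 1 rather than by the user, which is what would make manual prompt tuning unnecessary.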

What else?

  • performs as well as GPT-4
  • excels in automation tasks
  • can predict the user's emotions from the conversation (while understanding the input in Phase-1), resulting in better, curated generations
  • has an understanding of human emotions, which helps the model curate its content accordingly
  • excels in roleplay
  • excels in writing code
  • has a few global memory units that store data away from the context window; these are mostly used to store function schemas, but in the end the model itself decides what to keep in them
  • as for cost: on average, $0.005 per 1,000 tokens
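The global memory units mentioned above can be pictured as a small slot-based store that lives outside the context window. Everything below is an assumed illustration: DEEPNIGHT has not documented the mechanism, and the class and method names are hypothetical.

```python
# Hypothetical illustration of "global memory units": a fixed set
# of slots outside the context window, typically holding function
# schemas, with the model itself deciding what to keep. This class
# is an assumption, not a published interface.

class GlobalMemory:
    def __init__(self, num_units: int = 4):
        # A fixed number of slots, separate from the 8k-token context.
        self.units = [None] * num_units

    def store(self, index: int, value: dict) -> None:
        # Overwrite one slot; the model would choose index and value.
        self.units[index] = value

    def recall(self, index: int):
        # Read a slot back without consuming context-window tokens.
        return self.units[index]

# Example: keeping a function schema available for function calling.
memory = GlobalMemory()
memory.store(0, {
    "name": "get_weather",
    "parameters": {"city": {"type": "string"}},
})
```

The design point is that schemas recalled this way would not consume any of the 8k-token context window.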
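At the quoted rate, cost scales linearly with token count. A back-of-the-envelope helper (this is plain arithmetic on the figure above, not an official billing API):

```python
# Estimate usage cost at the quoted average rate of
# $0.005 per 1,000 tokens.

RATE_PER_1K_TOKENS = 0.005  # USD, average rate quoted above

def estimate_cost(tokens: int) -> float:
    """Return the approximate cost in USD for a given token count."""
    return tokens / 1000 * RATE_PER_1K_TOKENS

# Filling the full 8k-token context window would cost about $0.04:
print(estimate_cost(8000))
```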

Future goals

We don't discuss that. Especially after seeing how SOME AI COMPANY ON THEIR DEV DAY just used opensource research and publications for their own profit... Hah.


Are we going to allow access?

Not for some time. We are still running evaluations and have a lot to learn about how this model can be made better.


Feel free to reach out to us at research@deepnight.tech
