umuthopeyildirim
/

fin-rwkv-169M

	---
	license: apache-2.0
	datasets:
	- gbharti/finance-alpaca
	language:
	- en
	library_name: transformers
	tags:
	- finance
	widget:
	- text: "Is this headline positive or negative? Headline: Australian Tycoon Forrest Shuts Nickel Mines After Prices Crash."
	  example_title: "Sentiment analysis"
	- text: "Aluminum price per KG is 50$. Forecast max: +1$ min:+0.3$. What should be the current price of aluminum?"
	  example_title: "Forecast"
	---

	# Fin-RWKV: Attention-Free Financial Expert (WIP)
	Fin-RWKV is an attention-free language model built for financial analysis and prediction. Developed as part of a MindsDB Hackathon, it uses the simple and efficient RWKV architecture to process financial text and produce insights and forecasts. Fin-RWKV is aimed at professionals and enthusiasts in the finance sector who want to bring deep learning techniques into their financial analyses.

	## Use Cases
	- Sentiment analysis
	- Forecasting
	- Product pricing
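
The first two use cases map directly onto the widget examples in this card's metadata. The sketch below rebuilds those two prompts from their parts. The function names are hypothetical, and `forecast_band` assumes the forecast deltas are simply added to the quoted price, which the card does not state. Product pricing has no example prompt in this card, so it is left out.

```python
def sentiment_prompt(headline: str) -> str:
    """Build the sentiment prompt in the same shape as the widget example."""
    return f"Is this headline positive or negative? Headline: {headline}"


def forecast_prompt(item: str, price: float, max_delta: float, min_delta: float) -> str:
    """Build the forecast prompt in the same shape as the widget example."""
    return (
        f"{item} price per KG is {price:g}$. "
        f"Forecast max: {max_delta:+g}$ min:{min_delta:+g}$. "
        f"What should be the current price of {item.lower()}?"
    )


def forecast_band(price: float, min_delta: float, max_delta: float) -> tuple:
    """Assumption: deltas are additive, so the band is price + delta."""
    return (price + min_delta, price + max_delta)


prompt = forecast_prompt("Aluminum", 50, 1, 0.3)
low, high = forecast_band(50, 0.3, 1)
```

The band gives a quick range check on whatever number the model returns for the forecast prompt.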

	## Features
	- Attention-Free Architecture: Uses the RWKV (Receptance Weighted Key Value) architecture, which avoids the complexity of attention mechanisms while maintaining strong performance.
	- Lower Costs: 10x to over 100x lower inference cost and 2x to 10x lower training cost relative to transformer-based models.
	- Tiny: Lightweight enough to run on CPUs in real time, with no GPU required, so it can run on your laptop today.
	- Finance-Specific Training: Trained on the gbharti/finance-alpaca dataset, so the model is tuned for financial data analysis.
	- Transformers Library Integration: Built on the popular `transformers` library for easy integration with existing ML pipelines and applications.
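
As a rough illustration of the `transformers` integration and CPU-only inference described above, the sketch below wraps the standard `text-generation` pipeline. It assumes the checkpoint published under this repository id loads through that pipeline. `ask` and `strip_prompt` are hypothetical helper names, and the generation settings are illustrative rather than tuned.

```python
MODEL_ID = "umuthopeyildirim/fin-rwkv-169M"  # repository id of this model card


def strip_prompt(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt that text-generation pipelines return by default."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()


def ask(prompt: str, max_new_tokens: int = 64) -> str:
    """Run one prompt on CPU. Assumes the checkpoint loads via the pipeline API."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID, device=-1)  # -1 = CPU
    result = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return strip_prompt(result[0]["generated_text"], prompt)
```

Calling `ask("Is this headline positive or negative? Headline: ...")` downloads the weights on first use. Building the pipeline inside `ask` keeps the sketch short; a real application would build it once and reuse it.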

	## Competing Against
	| Name | Param Count | Training Cost | Inference Cost |
	|---------------|-------------|------|----------------|
	| Fin-RWKV | 169M | $1.45 | Free on HuggingFace 🤗 & Low-End CPU |
	| [BloombergGPT](https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/) | 50 Billion | $1.3 million | Enterprise GPUs |
	| [FinGPT](https://huggingface.co/FinGPT) | 7 Billion | $302.4 | Consumer GPUs |
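
To put the table above in relative terms, the short sketch below divides each model's parameter count and dollar cost by the Fin-RWKV figure. The inputs are copied from the table; the helper name `ratio_to_baseline` is hypothetical.

```python
# Figures copied from the comparison table (cost in US dollars).
costs = {"Fin-RWKV": 1.45, "BloombergGPT": 1_300_000.0, "FinGPT": 302.4}
params = {"Fin-RWKV": 169e6, "BloombergGPT": 50e9, "FinGPT": 7e9}


def ratio_to_baseline(values: dict, baseline: str = "Fin-RWKV") -> dict:
    """Express every entry as a multiple of the baseline entry."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


cost_ratios = ratio_to_baseline(costs)
param_ratios = ratio_to_baseline(params)

for name in costs:
    print(f"{name}: {param_ratios[name]:,.1f}x params, {cost_ratios[name]:,.1f}x cost")
```

By these numbers FinGPT has roughly 41x the parameters at roughly 209x the cost, and BloombergGPT has roughly 296x the parameters at roughly 900,000x the cost. The ratios say nothing about output quality.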


	| Architecture | Status | Compute Efficiency | Largest Model | Trained Tokens | Link |
	|--------------|--------|--------------------|---------------|---------------|------|
	| (Fin)RWKV | In Production | O(N) | 14B | 500B++ (The Pile+) | [Paper](https://arxiv.org/abs/2305.13048) |
	| RetNet (Microsoft) | Research | O(N) | 6.7B | 100B (mixed) | [Paper](https://arxiv.org/abs/2307.08621) |
	| State Space (Stanford) | Prototype | O(log N) | 355M | 15B (The Pile, subset) | [Paper](https://arxiv.org/abs/2302.10866) |
	| Liquid (MIT) | Research | - | <1M | - | [Paper](https://arxiv.org/abs/2302.10866) |
	| Transformer Architecture (included for contrasting reference) | In Production | O(N^2) | 800B (est) | 13T++ (est) | - |
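
The O(N) entry for RWKV comes from rewriting its token-mixing sum as a running state. The sketch below shows this for a single scalar channel, loosely following the WKV formulation in the linked RWKV paper. It is a simplified illustration, not this model's implementation: it omits the numerical stabilization real kernels use, and the decay `w` and bonus `u` are arbitrary example parameters.

```python
import math


def wkv_quadratic(k, v, w, u):
    """Direct form: every step re-sums all earlier tokens, O(N^2) overall."""
    out = []
    for t in range(len(k)):
        num = math.exp(u + k[t]) * v[t]   # current token gets the bonus u
        den = math.exp(u + k[t])
        for i in range(t):                # earlier tokens decay with distance
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out


def wkv_recurrent(k, v, w, u):
    """Same values from two running sums (a, b): one update per token, O(N)."""
    out = []
    a = b = 0.0
    decay = math.exp(-w)
    for kt, vt in zip(k, v):
        bonus = math.exp(u + kt)
        out.append((a + bonus * vt) / (b + bonus))
        a = decay * a + math.exp(kt) * vt  # fold the current token into the state
        b = decay * b + math.exp(kt)
    return out


keys = [0.1, -0.3, 0.5, 0.2]
values = [1.0, 2.0, -1.0, 0.5]
direct = wkv_quadratic(keys, values, w=0.5, u=0.3)
running = wkv_recurrent(keys, values, w=0.5, u=0.3)
```

Both functions return the same sequence. The recurrent form carries only two numbers of state per channel, so the cost per token does not grow with context length.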

	<img src="https://cdn-uploads.huggingface.co/production/uploads/631ea4247beada30465fa606/7vAOYsXH1vhTyh22o6jYB.png" width="500" alt="Inference computational cost vs. Number of tokens">

	_Note: This model needs more data and training. It is for testing purposes only._