vtriple committed on
Commit 228569e
1 Parent(s): fb3709d

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +19 -123
README.md CHANGED
@@ -1,136 +1,32 @@
- ---
- base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
- library_name: peft
- ---

- # Model Card for LLaMA 3.1 8B Instruct - Cybersecurity Fine-tuned

- This model is a fine-tuned version of the LLaMA 3.1 8B Instruct model, specifically adapted for cybersecurity-related tasks.

  ## Model Details

- ### Model Description

- This model is based on the LLaMA 3.1 8B Instruct model and has been fine-tuned on a custom dataset of cybersecurity-related questions and answers. It is designed to provide more accurate and relevant responses to queries in the cybersecurity domain.

- - **Developed by:** [Your Name/Organization]
- - **Model type:** Instruct-tuned Large Language Model
- - **Language(s) (NLP):** English (primary), with potential for limited multilingual capabilities
- - **License:** [Specify the license, likely related to the original LLaMA 3.1 license]
- - **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct
-
- ### Model Sources [optional]
-
- - **Repository:** [Link to your Hugging Face repository]
- - **Paper [optional]:** [If you've written a paper about this fine-tuning, link it here]
- - **Demo [optional]:** [If you have a demo of the model, link it here]
-
- ## Uses
-
- ### Direct Use
-
- This model can be used for a variety of cybersecurity-related tasks, including:
- - Answering questions about cybersecurity concepts and practices
- - Providing explanations of cybersecurity threats and vulnerabilities
- - Assisting in the interpretation of security logs and indicators of compromise
- - Offering guidance on best practices for cyber defense
-
- ### Out-of-Scope Use
-
- This model should not be used for:
- - Generating or assisting in the creation of malicious code
- - Providing legal or professional security advice without expert oversight
- - Making critical security decisions without human verification
-
- ## Bias, Risks, and Limitations
-
- - The model may reflect biases present in its training data and the original LLaMA 3.1 model.
- - It may occasionally generate incorrect or inconsistent information, especially for very specific or novel cybersecurity topics.
- - The model's knowledge is limited to its training data cutoff and does not include real-time threat intelligence.
-
- ### Recommendations
-
- Users should verify critical information and consult with cybersecurity professionals for important decisions. The model should be used as an assistant tool, not as a replacement for expert knowledge or up-to-date threat intelligence.
-
- ## How to Get Started with the Model
-
- Use the following code to get started with the model:

  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
- from peft import PeftModel, PeftConfig

- # Load the model
- model_name = "your-username/llama3-cybersecurity"
- config = PeftConfig.from_pretrained(model_name)
- model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
- model = PeftModel.from_pretrained(model, model_name)
-
- # Load the tokenizer
- tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

  # Example usage
- prompt = "What are some common indicators of a ransomware attack?"
- inputs = tokenizer(prompt, return_tensors="pt")
- outputs = model.generate(**inputs, max_length=200)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
- ```
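-
- Continuing from the snippet above, the LoRA adapters can optionally be folded into the base weights to serve the model without the PEFT wrapper (a minimal sketch using PEFT's `merge_and_unload`; the output path is hypothetical):
-
- ```python
- # Fold the LoRA weights into the base model for standalone inference
- merged = model.merge_and_unload()
- merged.save_pretrained("llama3-cybersecurity-merged")  # hypothetical path
- tokenizer.save_pretrained("llama3-cybersecurity-merged")
- ```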
-
- ## Training Details
-
- ### Training Data
-
- The model was fine-tuned on a custom dataset of cybersecurity-related questions and answers. [Add more details about your dataset here]
-
- ### Training Procedure
-
- #### Training Hyperparameters
-
- - **Training regime:** bf16 mixed precision
- - **Optimizer:** AdamW
- - **Learning rate:** 5e-5
- - **Batch size:** 4
- - **Gradient accumulation steps:** 4
- - **Epochs:** 5
- - **Max steps:** 4000
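-
- As a rough illustration, the settings above map onto Hugging Face `TrainingArguments` as sketched below; the `output_dir` is hypothetical and the dataset wiring is omitted:
-
- ```python
- from transformers import TrainingArguments
-
- training_args = TrainingArguments(
-     output_dir="llama3-cybersecurity",  # hypothetical path
-     bf16=True,                          # bf16 mixed precision
-     optim="adamw_torch",                # AdamW optimizer
-     learning_rate=5e-5,
-     per_device_train_batch_size=4,
-     gradient_accumulation_steps=4,      # effective batch size of 16
-     num_train_epochs=5,
-     max_steps=4000,                     # takes precedence over epochs when set
- )
- ```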
-
- ## Evaluation
-
- I used a custom YARA evaluation.
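-
- The card does not specify the metric; one plausible minimal check for generated YARA rules is whether they compile, e.g. with the yara-python package (a hypothetical harness, not necessarily the evaluation used here):
-
- ```python
- import yara  # pip install yara-python
-
- def rule_compiles(rule_text: str) -> bool:
-     """Return True if a generated YARA rule is syntactically valid."""
-     try:
-         yara.compile(source=rule_text)
-         return True
-     except yara.SyntaxError:
-         return False
-
- print(rule_compiles("rule demo { condition: true }"))  # True
- ```
-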
- ## Environmental Impact
-
- - **Hardware Type:** NVIDIA A100
- - **Hours used:** 12
- - **Cloud Provider:** vast.io
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- This model uses the LLaMA 3.1 8B architecture with additional LoRA adapters for fine-tuning. It was trained using a causal language modeling objective on cybersecurity-specific data.
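-
- In code, attaching such adapters looks roughly like the following sketch; the rank, alpha, dropout, and target modules are illustrative guesses, since the card does not state the actual values:
-
- ```python
- from transformers import AutoModelForCausalLM
- from peft import LoraConfig, TaskType, get_peft_model
-
- base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
-
- lora_config = LoraConfig(
-     task_type=TaskType.CAUSAL_LM,
-     r=16,              # assumed adapter rank
-     lora_alpha=32,     # assumed scaling factor
-     lora_dropout=0.05,
-     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
- )
-
- model = get_peft_model(base_model, lora_config)
- model.print_trainable_parameters()  # only the adapter weights are trainable
- ```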
-
- ### Compute Infrastructure
-
- #### Hardware
-
- Single NVIDIA A100 GPU
-
- #### Software
-
- - Python 3.8+
- - PyTorch 2.0+
- - Transformers 4.28+
- - PEFT 0.12.0
-
- ## Model Card Authors [optional]
-
- Wyatt Roersma
-
- ## Model Card Contact
-
- Email me at wyattroersma@gmail.com with questions.
+ # YARA-Focused Llama 3.1 8B
+
+ This model is a fine-tuned version of Meta's Llama 3.1 8B Instruct, specifically tailored for YARA tasks.
+
  ## Model Details
+
+ - **Base model:** meta-llama/Meta-Llama-3.1-8B-Instruct
+ - **Fine-tuning:** This model has been fine-tuned on a custom dataset of cybersecurity-related questions and answers.
+ - **Usage:** This model is particularly YARA-focused and can generate responses to YARA-related prompts.
+
+ ## How to Use
+
+ To use this model:
+
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM

+ model_name = "vtriple/Llama-3.1-8B-yara"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name)

  # Example usage
+ input_text = "What is an example of a common cybersecurity threat?"
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids
+ output = model.generate(input_ids, max_new_tokens=100)  # cap the newly generated tokens
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
+
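+ Since the base model is instruction-tuned, prompts formatted with the tokenizer's chat template may yield better results. A minimal sketch (the prompt and generation settings here are illustrative):
+
+ ```python
+ messages = [{"role": "user", "content": "Write a YARA rule that detects a suspicious PowerShell download cradle."}]
+ input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
+ output = model.generate(input_ids, max_new_tokens=200)
+ # Decode only the newly generated tokens, skipping the prompt
+ print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```
+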
+ ## Limitations
+
+ Please note that while this model has been fine-tuned for cybersecurity tasks, it may still produce incorrect or biased information. Always verify important information with authoritative sources.
+
+ ## License
+
+ This model inherits its license from the original Llama 3.1 8B model. Please refer to Meta's licensing terms for the Llama model family.