	---
	license: mit
	base_model: microsoft/deberta-v3-base
	tags:
	- generated_from_trainer
	metrics:
	- precision
	- recall
	- f1
	- accuracy
	model-index:
	- name: AttackER
	  results: []
	---


	# Cyber-ThreaD/DeBERTa-v3-AttackER

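	This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base). The Trainer did not record the dataset name; the model name and the citation at the end of this card point to the AttackER named entity recognition dataset.
	It achieves the following results on the evaluation set (these match the step 9500 checkpoint, epoch 7.66, which has the highest F1 in the training results table):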
	- Loss: 1.5468
	- Precision: 0.4730
	- Recall: 0.5569
	- F1: 0.5115
	- Accuracy: 0.7401
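
The reported F1 is consistent with the harmonic mean of the precision and recall listed above. The check below is a minimal sketch using the standard formula; the function name `f1_score` is illustrative, and this is not the evaluation code that produced the card.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported on the evaluation set.
reported_f1 = f1_score(0.4730, 0.5569)
print(round(reported_f1, 4))  # 0.5115
```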

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
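
Until this section is filled in, the following is a hedged sketch of how the checkpoint could be loaded for entity tagging. It assumes the repository named in the card title hosts a token-classification head and tokenizer (the card does not state this explicitly), and the helper name `tag_entities` is hypothetical. Given the evaluation F1 of 0.5115, predictions are best treated as suggestions to review.

```python
MODEL_ID = "Cyber-ThreaD/DeBERTa-v3-AttackER"  # taken from the card title


def tag_entities(text: str, model_id: str = MODEL_ID):
    """Return aggregated entity spans predicted for `text`.

    Assumption: the repository ships a token-classification model and tokenizer.
    """
    # Imported lazily so this snippet can be loaded without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge sub-word pieces into entity spans
    )
    return tagger(text)


# Example call (downloads the model; the input sentence is made up):
# for entity in tag_entities("The group used spear-phishing emails to deliver malware."):
#     print(entity["entity_group"], entity["word"], entity["score"])
```

The label names returned in `entity_group` come from the model's own configuration and are not listed in this card.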

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10.0
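
To illustrate what `lr_scheduler_type: linear` means for the learning rate above, here is a small sketch of linear decay from 2e-05 to zero. Two values are assumptions, not facts from this card: no warmup steps (none are listed), and a total of about 12,400 optimizer steps, estimated from the training results table where step 12000 corresponds to epoch 9.67.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 12_400  # assumption: roughly 10 epochs at about 1,240 steps per epoch

# Start, midpoint, and end of training under these assumptions.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at the configured 2e-05, is halved at the midpoint, and reaches zero at the final step.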

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
	|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
	| 1.7886 | 0.4 | 500 | 1.5075 | 0.1842 | 0.2103 | 0.1964 | 0.6169 |
	| 1.3644 | 0.81 | 1000 | 1.3342 | 0.2364 | 0.3056 | 0.2666 | 0.6492 |
	| 1.1181 | 1.21 | 1500 | 1.2655 | 0.2959 | 0.3585 | 0.3242 | 0.6812 |
	| 0.9833 | 1.61 | 2000 | 1.2368 | 0.2941 | 0.3902 | 0.3354 | 0.6778 |
	| 0.9036 | 2.01 | 2500 | 1.2682 | 0.3551 | 0.4021 | 0.3772 | 0.7023 |
	| 0.7102 | 2.42 | 3000 | 1.2176 | 0.3668 | 0.4590 | 0.4078 | 0.7159 |
	| 0.6868 | 2.82 | 3500 | 1.2170 | 0.3794 | 0.4683 | 0.4192 | 0.7147 |
	| 0.5671 | 3.22 | 4000 | 1.2603 | 0.3951 | 0.4881 | 0.4367 | 0.7259 |
	| 0.4878 | 3.63 | 4500 | 1.2460 | 0.3925 | 0.5093 | 0.4433 | 0.7333 |
	| 0.4942 | 4.03 | 5000 | 1.3147 | 0.4047 | 0.4802 | 0.4392 | 0.7284 |
	| 0.3812 | 4.43 | 5500 | 1.3308 | 0.4205 | 0.5146 | 0.4628 | 0.7351 |
	| 0.421 | 4.83 | 6000 | 1.3031 | 0.4275 | 0.5225 | 0.4702 | 0.7386 |
	| 0.3157 | 5.24 | 6500 | 1.3943 | 0.4132 | 0.5040 | 0.4541 | 0.7293 |
	| 0.3072 | 5.64 | 7000 | 1.4087 | 0.4303 | 0.5185 | 0.4703 | 0.7396 |
	| 0.3436 | 6.04 | 7500 | 1.4197 | 0.4461 | 0.5251 | 0.4824 | 0.7363 |
	| 0.2774 | 6.45 | 8000 | 1.4249 | 0.4275 | 0.5225 | 0.4702 | 0.7377 |
	| 0.2629 | 6.85 | 8500 | 1.4811 | 0.4580 | 0.5344 | 0.4933 | 0.7327 |
	| 0.2271 | 7.25 | 9000 | 1.5576 | 0.4733 | 0.5397 | 0.5043 | 0.7415 |
	| 0.235 | 7.66 | 9500 | 1.5468 | 0.4730 | 0.5569 | 0.5115 | 0.7401 |
	| 0.2415 | 8.06 | 10000 | 1.5956 | 0.4730 | 0.5437 | 0.5058 | 0.7433 |
	| 0.1826 | 8.46 | 10500 | 1.6168 | 0.4455 | 0.5410 | 0.4886 | 0.7413 |
	| 0.2083 | 8.86 | 11000 | 1.5866 | 0.4505 | 0.5423 | 0.4922 | 0.7413 |
	| 0.2169 | 9.27 | 11500 | 1.5974 | 0.4708 | 0.5437 | 0.5046 | 0.7468 |
	| 0.1747 | 9.67 | 12000 | 1.6219 | 0.4567 | 0.5437 | 0.4964 | 0.7405 |
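
In the table, validation loss reaches its minimum early while F1 and accuracy keep improving, so the "best" checkpoint depends on the metric used for selection. The sketch below makes that concrete; the rows are transcribed from the table as `(step, validation loss, F1, accuracy)`, and the variable names are illustrative.

```python
# (step, validation_loss, f1, accuracy), transcribed from the training results table.
ROWS = [
    (500, 1.5075, 0.1964, 0.6169), (1000, 1.3342, 0.2666, 0.6492),
    (1500, 1.2655, 0.3242, 0.6812), (2000, 1.2368, 0.3354, 0.6778),
    (2500, 1.2682, 0.3772, 0.7023), (3000, 1.2176, 0.4078, 0.7159),
    (3500, 1.2170, 0.4192, 0.7147), (4000, 1.2603, 0.4367, 0.7259),
    (4500, 1.2460, 0.4433, 0.7333), (5000, 1.3147, 0.4392, 0.7284),
    (5500, 1.3308, 0.4628, 0.7351), (6000, 1.3031, 0.4702, 0.7386),
    (6500, 1.3943, 0.4541, 0.7293), (7000, 1.4087, 0.4703, 0.7396),
    (7500, 1.4197, 0.4824, 0.7363), (8000, 1.4249, 0.4702, 0.7377),
    (8500, 1.4811, 0.4933, 0.7327), (9000, 1.5576, 0.5043, 0.7415),
    (9500, 1.5468, 0.5115, 0.7401), (10000, 1.5956, 0.5058, 0.7433),
    (10500, 1.6168, 0.4886, 0.7413), (11000, 1.5866, 0.4922, 0.7413),
    (11500, 1.5974, 0.5046, 0.7468), (12000, 1.6219, 0.4964, 0.7405),
]

best_f1_step = max(ROWS, key=lambda row: row[2])[0]
lowest_loss_step = min(ROWS, key=lambda row: row[1])[0]
best_accuracy_step = max(ROWS, key=lambda row: row[3])[0]

print(best_f1_step)        # 9500
print(lowest_loss_step)    # 3500
print(best_accuracy_step)  # 11500
```

The rising validation loss after step 3500, alongside a training loss that keeps falling, suggests overfitting at the token-probability level even though the entity-level F1 continues to improve until step 9500.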


	### Framework versions

	- Transformers 4.36.0.dev0
	- PyTorch 2.1.0+cu118
	- Datasets 2.15.0
	- Tokenizers 0.15.0


	## Citing & Authors

	If you use this model, please cite the following work:

	```
	@inproceedings{deka2024attacker,
	  title={AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset},
	  author={Deka, Pritam and Rajapaksha, Sampath and Rani, Ruby and Almutairi, Amirah and Karafili, Erisa},
	  booktitle={International Conference on Web Information Systems Engineering},
	  pages={255--270},
	  year={2024},
	  organization={Springer}
	}
	```