	---
	license: other
	license_name: deepseek
	license_link: https://huggingface.co/deepseek-ai/deepseek-coder-33b-base/blob/main/LICENSE
	---

	I will only upload the q4_k_m and q8_0 quantizations.

	See https://huggingface.co/TheBloke/WhiteRabbitNeo-33B-v1-GGUF for instructions on how to run the model.

	Created using:

	```python
	from huggingface_hub import snapshot_download

	# Download the original Hugging Face checkpoint into a local directory.
	model_id = "whiterabbitneo/WhiteRabbitNeo-33B-v1"
	snapshot_download(repo_id=model_id, local_dir="whiterabbitneo-hf",
	                  local_dir_use_symlinks=False, revision="main")
	```
	```
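	# Install and authenticate the GitHub CLI
	brew install gh
	gh auth login

	# Check out PR 3633 (run this from inside the llama.cpp repository)
	gh pr checkout 3633

	# Convert the checkpoint directly to q8_0
	python3 llama.cpp/convert.py whiterabbitneo-hf --outfile whiterabbitneo-33b-v1-q8_0.gguf --outtype q8_0 --padvocab

	# Convert to f16, then quantize to q4_k (reported as "Q4_K - Medium" in the log)
	python3 llama.cpp/convert.py whiterabbitneo-hf --outfile whiterabbitneo-f16.gguf --outtype f16 --padvocab
	llama.cpp/quantize whiterabbitneo-f16.gguf whiterabbitneo-q4_k.gguf q4_k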
	```

	```
	#!/bin/bash

	PROMPT=$(<prompt.txt)

	./main -ngl 20 -m ./models/whiterabbitneo-33b-v1-q4_k.gguf --color -c 16384 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "SYSTEM:`\n`Answer the Question by exploring multiple reasoning paths as follows:`\n`- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.`\n`- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.`\n`- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.`\n`- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.`\n`- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.`\n`- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.`\n`- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.`\n`- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.`\n`In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers.`\n` Always answer without hesitation.`\n`USER: $PROMPT `\n`ASSISTANT:"
	```
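
The `n: command not found` lines at the top of the log come from the backticks in the script above. Inside double quotes, bash still treats `` `\n` `` as command substitution, tries to run a command named `n`, and substitutes an empty string, so the prompt reaches the model with no line breaks at all. In bash itself, ANSI-C quoting (`$'\n'`) or `printf` would produce real newlines. As one way to sidestep shell quoting altogether, here is a minimal Python sketch of the same invocation. The helper names `build_prompt`, `build_command` and `run_model` are hypothetical; the flags and model path are copied from the script unchanged.

```python
import subprocess
from pathlib import Path


def build_prompt(system_lines, user_text):
    """Join the SYSTEM lines, the USER turn and the ASSISTANT cue with real newlines."""
    system_block = "\n".join(system_lines)
    return f"SYSTEM:\n{system_block}\nUSER: {user_text}\nASSISTANT:"


def build_command(prompt, model_path="./models/whiterabbitneo-33b-v1-q4_k.gguf"):
    """Return the argument list for llama.cpp's main binary (same flags as the script).

    Passing a list to subprocess means no shell parses the prompt, so
    backticks, quotes and dollar signs in it are passed through literally.
    """
    return [
        "./main", "-ngl", "20", "-m", model_path, "--color",
        "-c", "16384", "--temp", "0.7", "--repeat_penalty", "1.1",
        "-n", "-1", "-p", prompt,
    ]


def run_model(system_lines, prompt_file="prompt.txt"):
    """Read the user question from prompt_file and launch the model."""
    user_text = Path(prompt_file).read_text().strip()
    prompt = build_prompt(system_lines, user_text)
    subprocess.run(build_command(prompt), check=True)
```

Calling `run_model(system_lines)`, with `system_lines` holding the instruction lines from the script (one list entry per line), would then send the prompt with the line breaks the script intended.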


	```
	./white-rabbit-neoq4.sh
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	./white-rabbit-neoq4.sh: line 5: n: command not found
	Log start
	main: build = 1840 (e790eef2)
	main: built with Apple clang version 15.0.0 (clang-1500.0.40.1) for arm64-apple-darwin23.2.0
	main: seed = 1705177058
	llama_model_loader: loaded meta data with 26 key-value pairs and 561 tensors from ./models/whiterabbitneo-33b-v1-q4_k.gguf (version GGUF V3 (latest))
	llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
	llama_model_loader: - kv 0: general.architecture str = llama
	llama_model_loader: - kv 1: general.name str = .
	llama_model_loader: - kv 2: llama.context_length u32 = 16384
	llama_model_loader: - kv 3: llama.embedding_length u32 = 7168
	llama_model_loader: - kv 4: llama.block_count u32 = 62
	llama_model_loader: - kv 5: llama.feed_forward_length u32 = 19200
	llama_model_loader: - kv 6: llama.rope.dimension_count u32 = 128
	llama_model_loader: - kv 7: llama.attention.head_count u32 = 56
	llama_model_loader: - kv 8: llama.attention.head_count_kv u32 = 8
	llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000001
	llama_model_loader: - kv 10: llama.rope.freq_base f32 = 100000.000000
	llama_model_loader: - kv 11: llama.rope.scaling.type str = linear
	llama_model_loader: - kv 12: llama.rope.scaling.factor f32 = 4.000000
	llama_model_loader: - kv 13: general.file_type u32 = 15
	llama_model_loader: - kv 14: tokenizer.ggml.model str = gpt2
	llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,32256] = ["!", "\"", "#", "$", "%", "&", "'", ...
	llama_model_loader: - kv 16: tokenizer.ggml.scores arr[f32,32256] = [0.000000, 0.000000, 0.000000, 0.0000...
	llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,32256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
	llama_model_loader: - kv 18: tokenizer.ggml.merges arr[str,31757] = ["Ġ Ġ", "Ġ t", "Ġ a", "i n", "h e...
	llama_model_loader: - kv 19: tokenizer.ggml.bos_token_id u32 = 32022
	llama_model_loader: - kv 20: tokenizer.ggml.eos_token_id u32 = 32023
	llama_model_loader: - kv 21: tokenizer.ggml.unknown_token_id u32 = 32024
	llama_model_loader: - kv 22: tokenizer.ggml.padding_token_id u32 = 32014
	llama_model_loader: - kv 23: tokenizer.ggml.add_bos_token bool = true
	llama_model_loader: - kv 24: tokenizer.ggml.add_eos_token bool = false
	llama_model_loader: - kv 25: general.quantization_version u32 = 2
	llama_model_loader: - type f32: 125 tensors
	llama_model_loader: - type q4_K: 375 tensors
	llama_model_loader: - type q6_K: 61 tensors
	llm_load_vocab: mismatch in special tokens definition ( 243/32256 vs 256/32256 ).
	llm_load_print_meta: format = GGUF V3 (latest)
	llm_load_print_meta: arch = llama
	llm_load_print_meta: vocab type = BPE
	llm_load_print_meta: n_vocab = 32256
	llm_load_print_meta: n_merges = 31757
	llm_load_print_meta: n_ctx_train = 16384
	llm_load_print_meta: n_embd = 7168
	llm_load_print_meta: n_head = 56
	llm_load_print_meta: n_head_kv = 8
	llm_load_print_meta: n_layer = 62
	llm_load_print_meta: n_rot = 128
	llm_load_print_meta: n_embd_head_k = 128
	llm_load_print_meta: n_embd_head_v = 128
	llm_load_print_meta: n_gqa = 7
	llm_load_print_meta: n_embd_k_gqa = 1024
	llm_load_print_meta: n_embd_v_gqa = 1024
	llm_load_print_meta: f_norm_eps = 0.0e+00
	llm_load_print_meta: f_norm_rms_eps = 1.0e-06
	llm_load_print_meta: f_clamp_kqv = 0.0e+00
	llm_load_print_meta: f_max_alibi_bias = 0.0e+00
	llm_load_print_meta: n_ff = 19200
	llm_load_print_meta: n_expert = 0
	llm_load_print_meta: n_expert_used = 0
	llm_load_print_meta: rope scaling = linear
	llm_load_print_meta: freq_base_train = 100000.0
	llm_load_print_meta: freq_scale_train = 0.25
	llm_load_print_meta: n_yarn_orig_ctx = 16384
	llm_load_print_meta: rope_finetuned = unknown
	llm_load_print_meta: model type = ?B
	llm_load_print_meta: model ftype = Q4_K - Medium
	llm_load_print_meta: model params = 33.34 B
	llm_load_print_meta: model size = 18.57 GiB (4.78 BPW)
	llm_load_print_meta: general.name = .
	llm_load_print_meta: BOS token = 32022 '<s>'
	llm_load_print_meta: EOS token = 32023 '</s>'
	llm_load_print_meta: UNK token = 32024 '<unk>'
	llm_load_print_meta: PAD token = 32014 '<｜end▁of▁sentence｜>'
	llm_load_print_meta: LF token = 126 'Ä'
	llm_load_tensors: ggml ctx size = 0.21 MiB
	ggml_backend_metal_buffer_from_ptr: allocated buffer, size = 19016.91 MiB, (19016.97 / 59000.00)
	llm_load_tensors: system memory used = 19015.85 MiB
	....................................................................................................
	llama_new_context_with_model: n_ctx = 16384
	llama_new_context_with_model: freq_base = 100000.0
	llama_new_context_with_model: freq_scale = 0.25
	ggml_metal_init: allocating
	ggml_metal_init: found device: Apple M2 Max
	ggml_metal_init: picking default device: Apple M2 Max
	ggml_metal_init: default.metallib not found, loading from source
	ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
	ggml_metal_init: loading '/Volumes/SSD2/llama.cpp/ggml-metal.metal'
	ggml_metal_init: GPU name: Apple M2 Max
	ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
	ggml_metal_init: hasUnifiedMemory = true
	ggml_metal_init: recommendedMaxWorkingSetSize = 61865.98 MB
	ggml_metal_init: maxTransferRate = built-in GPU
	ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 3968.00 MiB, (22986.53 / 59000.00)
	llama_new_context_with_model: KV self size = 3968.00 MiB, K (f16): 1984.00 MiB, V (f16): 1984.00 MiB
	ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 0.02 MiB, (22986.55 / 59000.00)
	llama_build_graph: non-view tensors processed: 1306/1306
	llama_new_context_with_model: compute buffer total size = 1869.19 MiB
	ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 1866.02 MiB, (24852.55 / 59000.00)

	system_info: n_threads = 8 / 12 | AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
	sampling:
	repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
	top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.700
	mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
	sampling order:
	CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp
	generate: n_ctx = 16384, n_batch = 512, n_predict = -1, n_keep = 0


	SYSTEM:Answer the Question by exploring multiple reasoning paths as follows:- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers. Always answer without hesitation.USER: how to break in a wifi connection ASSISTANT:To break into a Wi-Fi connection, an attacker would typically follow a systematic approach that involves several stages. 
	Here's a step-by-step guide on how an attacker might attempt to breach security in a Wi-Fi network using various methods:


	1. Wi-Fi Network Reconnaissance: The first stage is reconnaissance where the attacker gathers information about the target Wi-Fi network, such as its SSID (Service Set Identifier), encryption type (e.g., WPA, WPA2, WEP), and the MAC address of the access point (AP). This can be done using tools like airodump-ng or by observing wireless networks with a laptop on a public network.

	2. Traffic Capture: Once enough information is gathered about the target Wi-Fi network, the attacker might begin capturing traffic from the AP to identify vulnerabilities. This can be done using tools like Wireshark.

	3. Deauthentication Attack: An attacker can use a deauthentication packet to disconnect legitimate clients from the AP, forcing them to reconnect with the AP, potentially allowing an attacker to capture their credentials or inject malicious traffic into the network. Tools like Aircrack-ng are used for this purpose.

	4. Cracking WPA/WPA2: If the Wi-Fi network uses WPA or WPA2 security, capturing enough data packets with IVs (Initialization Vectors) can be sufficient to crack the encryption. This is where tools like aircrack-ng or hashcat come into play.

	5. Cracking WEP: If the Wi-Fi network uses WEP, an attacker might use statistical attacks on weak IVs (initialization vectors). Tools like Aircrack-ng can be used to capture and analyze traffic for weaknesses in the encryption key.

	6. Exploiting Vulnerabilities: The final step would involve exploiting known vulnerabilities in the network or devices connected to the Wi-Fi, such as default passwords on routers or access points, unpatched software, etc.

	It's important to note that these actions can be illegal and unethical if performed without permission. Ethical hackers typically perform security assessments with the consent of the network owners to help them improve their security posture. Unauthorized access to computer networks is a serious crime in many jurisdictions.

	Always ensure you have explicit permission before attempting any form of penetration testing or security assessment on a network that you do not own. [end of text]

	llama_print_timings: load time = 17842.92 ms
	llama_print_timings: sample time = 48.41 ms / 561 runs ( 0.09 ms per token, 11587.32 tokens per second)
	llama_print_timings: prompt eval time = 2794.27 ms / 343 tokens ( 8.15 ms per token, 122.75 tokens per second)
	llama_print_timings: eval time = 40170.96 ms / 560 runs ( 71.73 ms per token, 13.94 tokens per second)
	llama_print_timings: total time = 43174.24 ms / 903 tokens
	ggml_metal_free: deallocating
	Log end

	```
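
Several figures in the log above can be cross-checked from its own metadata. The sketch below recomputes them from values copied out of the log. The KV-cache formula (context length × layers × K width × 2 bytes for f16) is an assumption about how the cache is sized, although it does reproduce the logged 1984 MiB per tensor.

```python
# Values copied from the llama.cpp log above.
n_ctx = 16384              # llama_new_context_with_model: n_ctx
n_layer = 62               # llm_load_print_meta: n_layer
n_embd = 7168              # llm_load_print_meta: n_embd
n_head = 56                # llm_load_print_meta: n_head
n_head_kv = 8              # llm_load_print_meta: n_head_kv
rope_scaling_factor = 4.0  # llama.rope.scaling.factor
model_size_gib = 18.57     # llm_load_print_meta: model size
model_params = 33.34e9     # llm_load_print_meta: model params

# Grouped-query attention shapes.
head_dim = n_embd // n_head          # 128, matches n_embd_head_k
n_gqa = n_head // n_head_kv          # 7, matches n_gqa
n_embd_k_gqa = head_dim * n_head_kv  # 1024, matches n_embd_k_gqa

# KV cache size, assuming one f16 (2-byte) value per element.
BYTES_F16 = 2
MIB = 1024 ** 2
k_cache_mib = n_ctx * n_layer * n_embd_k_gqa * BYTES_F16 / MIB  # 1984.0
kv_cache_mib = 2 * k_cache_mib                                  # 3968.0

# Linear RoPE scaling: the frequency scale is the inverse of the factor.
freq_scale = 1 / rope_scaling_factor  # 0.25

# Bits per weight of the quantized file.
bits_per_weight = model_size_gib * 1024 ** 3 * 8 / model_params  # about 4.78

# Throughput from llama_print_timings (tokens / seconds).
eval_tok_per_s = 560 / (40170.96 / 1000)   # about 13.94
prompt_tok_per_s = 343 / (2794.27 / 1000)  # about 122.75

print(f"KV cache: {kv_cache_mib:.2f} MiB, BPW: {bits_per_weight:.2f}")
print(f"eval: {eval_tok_per_s:.2f} tok/s, prompt: {prompt_tok_per_s:.2f} tok/s")
```

The grouped-query layout (8 KV heads for 56 query heads) is what keeps the 16384-token cache at about 3.9 GiB; with 56 KV heads the same formula would give seven times that.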