How big does the graphics card have to be, at minimum?
Hello there. I'm pretty much a beginner when it comes to AI. What do you say to the following? (See the terminal excerpt quoted after --xxx--.)
Q1: Are there any knobs I can turn to tame PyTorch's memory use? (See my attempt in the sketch below the questions.)
Q2: Are there other settings that would let me keep using my existing GPU :: NVIDIA GeForce RTX 3060 Ti / 4864 CUDA cores / 8192 MB VRAM ::?
Q3: Could I maybe use both the GPU and the CPU?
Q4: Or can I only run ::TheBloke_Wizard-Vicuna-13B-Uncensored-SuperHOT-8K-GPTQ:: with a larger graphics card?
Thanks for any answer :-))
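Regarding Q1: the last line of the error itself mentions one such knob, PYTORCH_CUDA_ALLOC_CONF. Here is a minimal sketch of how I understand it would be applied (my own framing, not tested; as far as I can tell the variable has to be in the environment before PyTorch makes its first CUDA allocation):

import os

# Allocator option taken from the error message; must be set before the
# first CUDA allocation, i.e. before torch is imported.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # imported only after the variable is set

Is that the right way to do it, or does it belong in the shell before launching the webui?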
--xxx-- quote from terminal:
"14:36:45-837649 ERROR Failed to load the model.
Traceback (most recent call last):
File "/home/til/oobabooga/text-generation-webui-main/modules/ui_model_menu.py", line 231, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/til/oobabooga/text-generation-webui-main/modules/models.py", line 93, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/til/oobabooga/text-generation-webui-main/modules/models.py", line 321, in ExLlamav2_HF_loader
return Exllamav2HF.from_pretrained(model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/til/oobabooga/text-generation-webui-main/modules/exllamav2_hf.py", line 183, in from_pretrained
return Exllamav2HF(config)
^^^^^^^^^^^^^^^^^^^
File "/home/til/oobabooga/text-generation-webui-main/modules/exllamav2_hf.py", line 57, in init
self.ex_cache = ExLlamaV2Cache(self.ex_model, lazy=shared.args.autosplit)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/til/oobabooga/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/exllamav2/cache.py", line 230, in init
self.create_state_tensors(copy_from, lazy)
File "/home/til/oobabooga/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/exllamav2/cache.py", line 83, in create_state_tensors
p_value_states = torch.zeros(self.shape_wv, dtype = self.dtype, device = device).contiguous()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 80.00 MiB. GPU 0 has a total capacity of 7.79 GiB of which 3.75 MiB is free. Including non-PyTorch memory, this process has 7.77 GiB memory in use. Of the allocated memory 7.54 GiB is allocated by PyTorch, and 84.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables) "
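For context on Q1/Q2, this is the tiny check I put together to see how much VRAM is actually free before loading anything (a sketch only; torch.cuda.mem_get_info is the PyTorch call, the wrapping is my own):

import torch

# mem_get_info returns (free, total) in bytes for the given device.
free, total = torch.cuda.mem_get_info(0)
print(f"GPU 0: {free / 1024**3:.2f} GiB free of {total / 1024**3:.2f} GiB total")

It reports the same 7.79 GiB total that the error message shows, so the model plus its cache apparently just doesn't fit in 8 GB.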