Tensors on different devices
I get the following error when I try to deploy this to an endpoint with 4xT4s:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:1! (when checking argument for argument index in method wrapper__index_select)
```
Do you have any tips by any chance, @philschmid?
I tried to run this in a SageMaker notebook and got the same error:

```python
input_ids = tokenizer(["inputs"], return_tensors="pt").input_ids
model.generate(input_ids)
```
This is my `hf_device_map`:
```
{'shared': 0,
 'lm_head': 0,
 'encoder.embed_tokens': 0,
 'encoder.block.0': 0,
 'encoder.block.1': 0,
 'encoder.block.2': 0,
 'encoder.block.3': 0,
 'encoder.block.4': 0,
 'encoder.block.5': 0,
 'encoder.block.6': 0,
 'encoder.block.7': 0,
 'encoder.block.8': 0,
 'encoder.block.9': 0,
 'encoder.block.10': 0,
 'encoder.block.11': 0,
 'encoder.block.12': 0,
 'encoder.block.13': 0,
 'encoder.block.14': 0,
 'encoder.block.15': 0,
 'encoder.block.16': 0,
 'encoder.block.17': 0,
 'encoder.block.18': 0,
 'encoder.block.19': 0,
 'encoder.block.20': 0,
 'encoder.block.21': 0,
 'encoder.block.22': 0,
 'encoder.block.23': 0,
 'encoder.block.24': 0,
 'encoder.block.25': 0,
 'encoder.block.26': 0,
 'encoder.block.27': 0,
 'encoder.block.28': 1,
 'encoder.block.29': 1,
 'encoder.block.30': 1,
 'encoder.block.31': 1,
 'encoder.final_layer_norm': 1,
 'encoder.dropout': 1,
 'decoder.embed_tokens': 1,
 'decoder.block.0': 1,
 'decoder.block.1': 1,
 'decoder.block.2': 1,
 'decoder.block.3': 1,
 'decoder.block.4': 1,
 'decoder.block.5': 1,
 'decoder.block.6': 1,
 'decoder.block.7': 1,
 'decoder.block.8': 1,
 'decoder.block.9': 1,
 'decoder.block.10': 1,
 'decoder.block.11': 1,
 'decoder.block.12': 1,
 'decoder.block.13': 1,
 'decoder.block.14': 1,
 'decoder.block.15': 1,
 'decoder.block.16': 1,
 'decoder.block.17': 1,
 'decoder.block.18': 1,
 'decoder.block.19': 1,
 'decoder.block.20': 2,
 'decoder.block.21': 2,
 'decoder.block.22': 2,
 'decoder.block.23': 2,
 'decoder.block.24': 2,
 'decoder.block.25': 2,
 'decoder.block.26': 2,
 'decoder.block.27': 2,
 'decoder.block.28': 2,
 'decoder.block.29': 2,
 'decoder.block.30': 2,
 'decoder.block.31': 2,
 'decoder.final_layer_norm': 2,
 'decoder.dropout': 2}
```
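From the traceback, my understanding is that an `index_select` somewhere inside `generate` receives an `index` tensor sitting on a different GPU than the tensor it indexes. A toy, CPU-only sketch of that constraint and the generic fix (the `safe_index_select` helper is mine, just to illustrate the pattern; it is not transformers code):

```python
import torch

def safe_index_select(tensor, dim, index):
    # torch.index_select requires `index` to live on the same device as
    # `tensor`; moving the index over first avoids exactly this error.
    return tensor.index_select(dim, index.to(tensor.device))

x = torch.arange(12.0).reshape(3, 4)   # rows 0..2
idx = torch.tensor([2, 0])             # pick rows 2 and 0
print(safe_index_select(x, 0, idx))
```

Given the map above, I would have guessed that sending the inputs to `cuda:0` (where `encoder.embed_tokens` sits) is enough, but the mismatch in my case is between `cuda:2` and `cuda:1` mid-generate, so I'm not sure the inputs are the real problem.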