BoghdadyJR committed
Commit 76ca076
1 Parent(s): d413abc

End of training

Files changed (40)
  1. .gitattributes +2 -0
  2. README.md +56 -0
  3. clean-layer.sh +24 -0
  4. config.json +31 -0
  5. core-js-banners +1 -0
  6. generation_config.json +7 -0
  7. jax/jaxlib-0.4.26.dev20240504-cp310-cp310-manylinux2014_x86_64.whl +3 -0
  8. kaggle.log +0 -0
  9. keras_patch.sh +41 -0
  10. merges.txt +0 -0
  11. model.safetensors +3 -0
  12. package_list +438 -0
  13. runs/Aug20_15-55-35_092165c8a85c/events.out.tfevents.1724169364.092165c8a85c.34.0 +3 -0
  14. runs/Aug20_16-09-55_092165c8a85c/events.out.tfevents.1724170199.092165c8a85c.34.1 +3 -0
  15. special_tokens_map.json +30 -0
  16. tmp0lwue7f_/model_architecture.txt +26 -0
  17. tmp3v01yz78 +0 -0
  18. tmp55m8q0wc/__pycache__/_remote_module_non_scriptable.cpython-310.pyc +0 -0
  19. tmp55m8q0wc/_remote_module_non_scriptable.py +81 -0
  20. tmp59vk47ea.json +14 -0
  21. tmpa43hu54q/model_architecture.txt +26 -0
  22. tokenizer.json +0 -0
  23. tokenizer_config.json +30 -0
  24. training_args.bin +3 -0
  25. v8-compile-cache-0/11.3.244.8-node.16/zSoptzScondazSsharezSjupyterzSlabzSstagingzSnode_moduleszSwebpackzSbinzSwebpack.js.BLOB +3 -0
  26. v8-compile-cache-0/11.3.244.8-node.16/zSoptzScondazSsharezSjupyterzSlabzSstagingzSnode_moduleszSwebpackzSbinzSwebpack.js.MAP +0 -0
  27. vocab.json +0 -0
  28. yarn--1704964927164-0.1782869183766309/node +3 -0
  29. yarn--1704964927165-0.35993847591915795/node +3 -0
  30. yarn--1704964927165-0.35993847591915795/yarn +3 -0
  31. yarn--1704964928127-0.35887085411210307/node +3 -0
  32. yarn--1704964928127-0.35887085411210307/yarn +3 -0
  33. yarn--1704964933548-0.9586437363349738/node +3 -0
  34. yarn--1704964933548-0.9586437363349738/yarn +3 -0
  35. yarn--1704965062369-0.2181574829625219/node +3 -0
  36. yarn--1704965062369-0.2181574829625219/yarn +3 -0
  37. yarn--1704965063359-0.7469983285256505/node +3 -0
  38. yarn--1704965063359-0.7469983285256505/yarn +3 -0
  39. yarn--1704965068891-0.46018400452963615/node +3 -0
  40. yarn--1704965068891-0.46018400452963615/yarn +3 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
+ jax/jaxlib-0.4.26.dev20240504-cp310-cp310-manylinux2014_x86_64.whl filter=lfs diff=lfs merge=lfs -text
+ v8-compile-cache-0/11.3.244.8-node.16/zSoptzScondazSsharezSjupyterzSlabzSstagingzSnode_moduleszSwebpackzSbinzSwebpack.js.BLOB filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,56 @@
+ ---
+ license: other
+ base_model: facebook/opt-350m
+ tags:
+ - trl
+ - sft
+ - generated_from_trainer
+ model-index:
+ - name: tmp
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/boghdady95/huggingface/runs/3xawbmyg)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/boghdady95/huggingface/runs/3xawbmyg)
+ # tmp
+
+ This model is a fine-tuned version of [facebook/opt-350m](https://huggingface.co/facebook/opt-350m) on an unknown dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 3.0
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - Transformers 4.42.3
+ - Pytorch 2.1.2
+ - Datasets 2.20.0
+ - Tokenizers 0.19.1
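Since the card ends at the framework versions, here is a minimal inference sketch for a checkpoint like this one. The local directory name "tmp" is an assumption taken from the model name in the card, not a confirmed Hub repo id:

```python
# Minimal sketch, assuming the files in this commit sit in a local
# directory named "tmp" (substitute the actual Hub repo id if pushed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("tmp")
tokenizer = AutoTokenizer.from_pretrained("tmp")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```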
clean-layer.sh ADDED
@@ -0,0 +1,24 @@
+ #!/bin/bash
+ #
+ # This script should be called at the end of each RUN command
+ # in the Dockerfiles.
+ #
+ # Each RUN command creates a new layer that is stored separately.
+ # At the end of each command, we should ensure we clean up downloaded
+ # archives and source files used to produce binaries to reduce the size
+ # of the layer.
+ set -e
+ set -x
+
+ # Delete files that pip caches when installing a package.
+ rm -rf /root/.cache/pip/*
+ # Remove packages that were installed as dependencies and are no longer needed
+ apt-get autoremove -y
+ # Delete downloaded archive files
+ apt-get clean
+ # Ensures the current working directory won't be deleted
+ cd /usr/local/src/
+ # Delete source files used for building binaries
+ rm -rf /usr/local/src/*
+ # Delete conda downloaded tarballs
+ conda clean -y --tarballs
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "facebook/opt-350m",
+   "_remove_final_layer_norm": false,
+   "activation_dropout": 0.0,
+   "activation_function": "relu",
+   "architectures": [
+     "OPTForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 2,
+   "do_layer_norm_before": false,
+   "dropout": 0.1,
+   "enable_bias": true,
+   "eos_token_id": 2,
+   "ffn_dim": 4096,
+   "hidden_size": 1024,
+   "init_std": 0.02,
+   "layer_norm_elementwise_affine": true,
+   "layerdrop": 0.0,
+   "max_position_embeddings": 2048,
+   "model_type": "opt",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 1,
+   "prefix": "</s>",
+   "torch_dtype": "float32",
+   "transformers_version": "4.42.3",
+   "use_cache": true,
+   "vocab_size": 50272,
+   "word_embed_proj_dim": 512
+ }
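One detail worth noting in this config: `hidden_size` (1024) differs from `word_embed_proj_dim` (512), because OPT-350m projects 512-dim token embeddings up to 1024-dim hidden states and back, which is why the model_architecture.txt files later in this commit show `project_in`/`project_out` layers. A sketch of inspecting it, again assuming a local directory named "tmp":

```python
# Sketch: load the config above with transformers' AutoConfig.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("tmp")  # local directory assumed
# OPT-350m uses a narrower embedding dim than its hidden dim:
print(config.hidden_size, config.word_embed_proj_dim)  # 1024 512
```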
core-js-banners ADDED
@@ -0,0 +1 @@
+ ["\u001b[96mThank you for using core-js (\u001b[94m https://github.com/zloirock/core-js \u001b[96m) for polyfilling JavaScript standard library!\u001b[0m\n\n\u001b[96mThe project needs your help! Please consider supporting of core-js:\u001b[0m\n\u001b[96m>\u001b[94m https://opencollective.com/core-js \u001b[0m\n\u001b[96m>\u001b[94m https://patreon.com/zloirock \u001b[0m\n\u001b[96m>\u001b[94m https://paypal.me/zloirock \u001b[0m\n\u001b[96m>\u001b[94m bitcoin: bc1qlea7544qtsmj2rayg0lthvza9fau63ux0fstcz \u001b[0m\n\n\u001b[96mAlso, the author of core-js (\u001b[94m https://github.com/zloirock \u001b[96m) is looking for a good job -)\u001b[0m\n","\u001b[96mThank you for using core-js (\u001b[94m https://github.com/zloirock/core-js \u001b[96m) for polyfilling JavaScript standard library!\u001b[0m\n\n\u001b[96mThe project needs your help! Please consider supporting of core-js:\u001b[0m\n\u001b[96m>\u001b[94m https://opencollective.com/core-js \u001b[0m\n\u001b[96m>\u001b[94m https://patreon.com/zloirock \u001b[0m\n\u001b[96m>\u001b[94m bitcoin: bc1qlea7544qtsmj2rayg0lthvza9fau63ux0fstcz \u001b[0m\n\n\u001b[96mAlso, the author of core-js (\u001b[94m https://github.com/zloirock \u001b[96m) is looking for a good job -)\u001b[0m\n"]
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 2,
+   "eos_token_id": 2,
+   "pad_token_id": 1,
+   "transformers_version": "4.42.3"
+ }
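This file holds the defaults that `model.generate()` falls back to when no overrides are passed. It can be loaded on its own; the local directory "tmp" is again an assumption:

```python
# Sketch: read the generation defaults shipped with the checkpoint.
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("tmp")
print(gen_config.eos_token_id, gen_config.pad_token_id)  # 2 1
```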
jax/jaxlib-0.4.26.dev20240504-cp310-cp310-manylinux2014_x86_64.whl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:89855b059830f3e498c4fc71dde38fba040ae5ae6693efb88749a71b2c199b53
+ size 132774266
kaggle.log ADDED
File without changes
keras_patch.sh ADDED
@@ -0,0 +1,41 @@
+ #!/bin/bash
+
+ # The following "sed" commands patch the current version of tf-df with
+ # a fix for keras 3. In essence, they replace uses of the package name
+ # "tf.keras" with "tf_keras".
+
+ sed -i "/import tensorflow_decision_forests as tfdf/a import tf_keras" /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/__init__.py && \
+ sed -i -e "/import tensorflow as tf/a import tf_keras" \
+     -e "/from yggdrasil_decision_forests.utils.distribute.implementations.grpc/a from tensorflow_decision_forests.keras import keras_internal" \
+     -e '/try:/{:a;N;/backend = tf.keras.backend/!ba;d}' \
+     /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras/core.py && \
+ sed -i -e "s/from typing import Optional, List, Dict, Any, Union, NamedTuple/from typing import Any, Dict, List, NamedTuple, Optional, Union/g" \
+     -e "/import tensorflow as tf/a from tensorflow_decision_forests.keras import keras_internal" \
+     -e "/import tensorflow as tf/a import tf_keras" \
+     -e '/layers = tf.keras.layers/{:a;N;/backend = tf.keras.backend/!ba;d}' \
+     /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras/core_inference.py && \
+ find /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests -type f -exec sed -i \
+     -e "s/get_data_handler/keras_internal.get_data_handler/g" \
+     -e 's/"models.Functional"/keras_internal.Functional/g' \
+     -e "s/tf.keras.utils.unpack_x_y_sample_weight/keras_internal.unpack_x_y_sample_weight/g" \
+     -e "s/tf.keras.utils.experimental/keras_internal/g" \
+     {} \; && \
+ sed -i -e "/import tensorflow as tf/a import tf_keras" \
+     -e "/from tensorflow_decision_forests.keras import core/a from tensorflow_decision_forests.keras import keras_internal" \
+     -e '/layers = tf.keras.layers/{:a;N;/callbacks = tf.keras.callbacks/!ba;d}' \
+     /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras/keras_test.py && \
+ find /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras -type f -exec sed -i \
+     -e "s/ layers.Input/ tf_keras.layers.Input/g" \
+     -e "s/layers.minimum/tf_keras.layers.minimum/g" \
+     -e "s/layers.Concatenate/tf_keras.layers.Concatenate/g" \
+     -e "s/layers.Dense/tf_keras.layers.Dense/g" \
+     -e "s/layers.experimental.preprocessing./tf_keras.layers./g" \
+     -e "s/layers.DenseFeatures/keras_internal.layers.DenseFeatures/g" \
+     -e "s/models.Model/tf_keras.models.Model/g" {} \; && \
+ sed -i "s/ models.load_model/ tf_keras.models.load_model/g" /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras/keras_test.py && \
+ sed -i "/import tensorflow as tf/a import tf_keras" /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras/test_runner.py && \
+ sed -i "/import tensorflow as tf/a import tf_keras" /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras/wrappers.py && \
+ sed -i -e "/import tensorflow as tf/a import tf_keras" \
+     -e "s/optimizer=optimizers.Adam()/optimizer=tf_keras.optimizers.Adam()/g" \
+     /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests/keras/wrappers_pre_generated.py && \
+ find /opt/conda/lib/python3.10/site-packages/tensorflow_decision_forests -type f -exec sed -i "s/tf.keras./tf_keras./g" {} \;
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:628c988ca0096de4869c21d5311d04763d3be5b5f6bbc364bc96e9bbe743d9fa
+ size 1324830880
package_list ADDED
@@ -0,0 +1,438 @@
+ # This file may be used to create an environment using:
+ # $ conda create --name <env> --file <this file>
+ # platform: linux-64
+ _libgcc_mutex=0.1=conda_forge
+ _openmp_mutex=4.5=2_gnu
+ absl-py=1.4.0=pypi_0
+ aiofiles=22.1.0=pypi_0
+ aiohttp=3.9.1=py310h2372a71_0
+ aiohttp-cors=0.7.0=pypi_0
+ aiorwlock=1.3.0=pypi_0
+ aiosignal=1.3.1=pyhd8ed1ab_0
+ aiosqlite=0.19.0=pypi_0
+ annotated-types=0.6.0=pypi_0
+ anyio=4.2.0=pyhd8ed1ab_0
+ apache-beam=2.46.0=pypi_0
+ archspec=0.2.2=pyhd8ed1ab_0
+ argon2-cffi=23.1.0=pyhd8ed1ab_0
+ argon2-cffi-bindings=21.2.0=py310h2372a71_4
+ array-record=0.5.0=pypi_0
+ arrow=1.3.0=pyhd8ed1ab_0
+ asttokens=2.4.1=pyhd8ed1ab_0
+ astunparse=1.6.3=pypi_0
+ async-timeout=4.0.3=pyhd8ed1ab_0
+ attrs=23.2.0=pyh71513ae_0
+ babel=2.14.0=pypi_0
+ backoff=2.2.1=pypi_0
+ beatrix-jupyterlab=2023.128.151533=pypi_0
+ beautifulsoup4=4.12.2=pyha770c72_0
+ bleach=6.1.0=pyhd8ed1ab_0
+ blessed=1.20.0=pypi_0
+ boltons=23.1.1=pyhd8ed1ab_0
+ brotli-python=1.1.0=py310hc6cd4ac_1
+ brotlipy=0.7.0=py310h7f8727e_1002
+ bzip2=1.0.8=h7b6447c_0
+ c-ares=1.25.0=hd590300_0
+ ca-certificates=2023.11.17=hbcca054_0
+ cached-property=1.5.2=hd8ed1ab_1
+ cached_property=1.5.2=pyha770c72_1
+ cachetools=4.2.4=pypi_0
+ certifi=2023.11.17=pyhd8ed1ab_0
+ cffi=1.16.0=py310h2fee648_0
+ charset-normalizer=3.3.2=pyhd8ed1ab_0
+ click=8.1.7=pypi_0
+ cloud-tpu-client=0.10=pypi_0
+ cloud-tpu-profiler=2.4.0=pypi_0
+ cloudpickle=2.2.1=pypi_0
+ colorama=0.4.6=pyhd8ed1ab_0
+ colorful=0.5.6=pypi_0
+ comm=0.2.1=pyhd8ed1ab_0
+ conda=23.11.0=py310hff52083_1
+ conda-libmamba-solver=23.12.0=pyhd8ed1ab_0
+ conda-package-handling=2.2.0=pyh38be061_0
+ conda-package-streaming=0.9.0=pyhd8ed1ab_0
+ contourpy=1.2.0=pypi_0
+ crcmod=1.7=pypi_0
+ cryptography=41.0.7=py310hb8475ec_1
+ cycler=0.12.1=pypi_0
+ cython=3.0.8=pypi_0
+ dacite=1.8.1=pypi_0
+ dataproc-jupyter-plugin=0.1.66=pypi_0
+ db-dtypes=1.2.0=pypi_0
+ debugpy=1.8.0=py310hc6cd4ac_1
+ decorator=5.1.1=pyhd8ed1ab_0
+ defusedxml=0.7.1=pyhd8ed1ab_0
+ deprecated=1.2.14=pypi_0
+ dill=0.3.1.1=pypi_0
+ distlib=0.3.8=pypi_0
+ distro=1.9.0=pyhd8ed1ab_0
+ dlenv-tf-2-15-gpu=1.0.20240111=py310ha20f8e0_0
+ dm-tree=0.1.8=pypi_0
+ docker=7.0.0=pypi_0
+ docopt=0.6.2=pypi_0
+ docstring-parser=0.15=pypi_0
+ entrypoints=0.4=pyhd8ed1ab_0
+ etils=1.6.0=pypi_0
+ exceptiongroup=1.2.0=pyhd8ed1ab_2
+ executing=2.0.1=pyhd8ed1ab_0
+ explainable-ai-sdk=1.3.3=pypi_0
+ farama-notifications=0.0.4=pypi_0
+ fastapi=0.108.0=pypi_0
+ fastavro=1.9.3=pypi_0
+ fasteners=0.19=pypi_0
+ filelock=3.13.1=pypi_0
+ flatbuffers=23.5.26=pypi_0
+ fmt=10.1.1=h00ab1b0_1
+ fonttools=4.47.0=pypi_0
+ fqdn=1.5.1=pyhd8ed1ab_0
+ frozenlist=1.4.1=py310h2372a71_0
+ fsspec=2023.12.2=pypi_0
+ gast=0.5.4=pypi_0
+ gcsfs=2023.12.2.post1=pypi_0
+ gitdb=4.0.11=pypi_0
+ gitpython=3.1.41=pypi_0
+ gmp=6.3.0=h59595ed_0
+ google-api-core=1.34.0=pypi_0
+ google-api-core-grpc=2.11.1=hd8ed1ab_0
+ google-api-python-client=1.8.0=pypi_0
+ google-apitools=0.5.31=pypi_0
+ google-auth=2.26.1=pyhca7485f_0
+ google-auth-httplib2=0.1.1=pypi_0
+ google-auth-oauthlib=1.2.0=pypi_0
+ google-cloud-aiplatform=1.39.0=pypi_0
+ google-cloud-artifact-registry=1.10.0=pypi_0
+ google-cloud-bigquery=3.15.0=pypi_0
+ google-cloud-bigquery-storage=2.16.2=pypi_0
+ google-cloud-bigtable=1.7.3=pypi_0
+ google-cloud-core=2.4.1=pyhd8ed1ab_0
+ google-cloud-datastore=1.15.5=pyhd8ed1ab_0
+ google-cloud-dlp=3.14.0=pypi_0
+ google-cloud-jupyter-config=0.0.5=pypi_0
+ google-cloud-language=1.3.2=pypi_0
+ google-cloud-monitoring=2.18.0=pypi_0
+ google-cloud-pubsub=2.19.0=pypi_0
+ google-cloud-pubsublite=1.9.0=pypi_0
+ google-cloud-recommendations-ai=0.7.1=pypi_0
+ google-cloud-resource-manager=1.11.0=pypi_0
+ google-cloud-spanner=3.40.1=pypi_0
+ google-cloud-storage=2.14.0=pypi_0
+ google-cloud-videointelligence=1.16.3=pypi_0
+ google-cloud-vision=3.5.0=pypi_0
+ google-crc32c=1.5.0=pypi_0
+ google-pasta=0.2.0=pypi_0
+ google-resumable-media=2.7.0=pypi_0
+ googleapis-common-protos=1.62.0=pyhd8ed1ab_0
+ gpustat=1.0.0=pypi_0
+ greenlet=3.0.3=pypi_0
+ grpc-cpp=1.48.1=hc2bec63_1
+ grpc-google-iam-v1=0.12.7=pypi_0
+ grpcio=1.60.0=pypi_0
+ grpcio-status=1.48.2=pypi_0
+ gviz-api=1.10.0=pypi_0
+ gymnasium=0.28.1=pypi_0
+ h11=0.14.0=pypi_0
+ h5py=3.10.0=pypi_0
+ hdfs=2.7.3=pypi_0
+ htmlmin=0.1.12=pypi_0
+ httplib2=0.21.0=pypi_0
+ httptools=0.6.1=pypi_0
+ icu=73.2=h59595ed_0
+ idna=3.6=pyhd8ed1ab_0
+ imagehash=4.3.1=pypi_0
+ imageio=2.33.1=pypi_0
+ importlib-metadata=6.11.0=pypi_0
+ importlib_metadata=7.0.1=hd8ed1ab_0
+ importlib_resources=6.1.1=pyhd8ed1ab_0
+ ipykernel=6.28.0=pyhd33586a_0
+ ipython=8.20.0=pyh707e725_0
+ ipython-genutils=0.2.0=pypi_0
+ ipython-sql=0.5.0=pypi_0
+ ipython_genutils=0.2.0=py_1
+ ipywidgets=8.1.1=pypi_0
+ isoduration=20.11.0=pyhd8ed1ab_0
+ jaraco-classes=3.3.0=pypi_0
+ jax-jumpy=1.0.0=pypi_0
+ jedi=0.19.1=pyhd8ed1ab_0
+ jeepney=0.8.0=pypi_0
+ jinja2=3.1.2=pyhd8ed1ab_1
+ joblib=1.3.2=pypi_0
+ json5=0.9.14=pypi_0
+ jsonpatch=1.33=pyhd8ed1ab_0
+ jsonpointer=2.4=py310hff52083_3
+ jsonschema=4.20.0=pyhd8ed1ab_0
+ jsonschema-specifications=2023.12.1=pyhd8ed1ab_0
+ jsonschema-with-format-nongpl=4.20.0=pyhd8ed1ab_0
+ jupyter-client=7.4.9=pypi_0
+ jupyter-http-over-ws=0.0.8=pypi_0
+ jupyter-server-fileid=0.9.1=pypi_0
+ jupyter-server-mathjax=0.2.6=pypi_0
+ jupyter-server-proxy=4.1.0=pypi_0
+ jupyter-server-ydoc=0.8.0=pypi_0
+ jupyter-ydoc=0.2.5=pypi_0
+ jupyter_client=8.6.0=pyhd8ed1ab_0
+ jupyter_core=5.7.1=py310hff52083_0
+ jupyter_events=0.9.0=pyhd8ed1ab_0
+ jupyter_server=2.12.3=pyhd8ed1ab_0
+ jupyter_server_terminals=0.5.1=pyhd8ed1ab_0
+ jupyterlab=3.6.6=pypi_0
+ jupyterlab-git=0.44.0=pypi_0
+ jupyterlab-server=2.25.2=pypi_0
+ jupyterlab-widgets=3.0.9=pypi_0
+ jupyterlab_pygments=0.3.0=pyhd8ed1ab_0
+ jupytext=1.16.0=pypi_0
+ keras=2.15.0=pypi_0
+ keras-tuner=1.4.6=pypi_0
+ kernels-mixer=0.0.7=pypi_0
+ keyring=24.3.0=pypi_0
+ keyrings-google-artifactregistry-auth=1.1.2=pypi_0
+ keyutils=1.6.1=h166bdaf_0
+ kfp=2.5.0=pypi_0
+ kfp-pipeline-spec=0.2.2=pypi_0
+ kfp-server-api=2.0.5=pypi_0
+ kiwisolver=1.4.5=pypi_0
+ krb5=1.21.2=h659d440_0
+ kt-legacy=1.0.5=pypi_0
+ kubernetes=26.1.0=pypi_0
+ lazy-loader=0.3=pypi_0
+ ld_impl_linux-64=2.40=h41732ed_0
+ libabseil=20220623.0=cxx17_h05df665_6
+ libarchive=3.7.2=h2aa1ff5_1
+ libclang=16.0.6=pypi_0
+ libcurl=8.5.0=hca28451_0
+ libedit=3.1.20191231=he28a2e2_2
+ libev=4.33=hd590300_2
+ libffi=3.4.2=h7f98852_5
+ libgcc-ng=13.2.0=h807b86a_3
+ libgomp=13.2.0=h807b86a_3
+ libiconv=1.17=hd590300_2
+ libmamba=1.5.6=had39da4_0
+ libmambapy=1.5.6=py310h39ff949_0
+ libnghttp2=1.58.0=h47da74e_1
+ libnsl=2.0.1=hd590300_0
+ libprotobuf=3.20.3=h3eb15da_0
+ libsodium=1.0.18=h36c2ea0_1
+ libsolv=0.7.27=hfc55251_0
+ libsqlite=3.44.2=h2797004_0
+ libssh2=1.11.0=h0841786_0
+ libstdcxx-ng=13.2.0=h7e041cc_3
+ libuuid=2.38.1=h0b41bf4_0
+ libuv=1.46.0=hd590300_0
+ libxcrypt=4.4.36=hd590300_1
+ libxml2=2.12.3=h232c23b_0
+ libzlib=1.2.13=hd590300_5
+ llvmlite=0.41.1=pypi_0
+ lz4=4.3.3=pypi_0
+ lz4-c=1.9.4=hcb278e6_0
+ lzo=2.10=h516909a_1000
+ markdown=3.5.2=pypi_0
+ markdown-it-py=3.0.0=pypi_0
+ markupsafe=2.0.1=pypi_0
+ matplotlib=3.8.2=pypi_0
+ matplotlib-inline=0.1.6=pyhd8ed1ab_0
+ mdit-py-plugins=0.4.0=pypi_0
+ mdurl=0.1.2=pypi_0
+ menuinst=2.0.1=py310hff52083_0
+ mistune=3.0.2=pyhd8ed1ab_0
+ ml-dtypes=0.2.0=pypi_0
+ more-itertools=10.2.0=pypi_0
+ msgpack=1.0.7=pypi_0
+ multidict=6.0.4=py310h2372a71_1
+ multimethod=1.10=pypi_0
+ nb_conda=2.2.1=unix_7
+ nb_conda_kernels=2.3.1=pyhd8ed1ab_3
+ nbclassic=1.0.0=pyhb4ecaf3_1
+ nbclient=0.9.0=pypi_0
+ nbconvert=7.14.0=pyhd8ed1ab_0
+ nbconvert-core=7.14.0=pyhd8ed1ab_0
+ nbconvert-pandoc=7.14.0=pyhd8ed1ab_0
+ nbdime=3.2.0=pypi_0
+ nbformat=5.9.2=pyhd8ed1ab_0
+ ncurses=6.4=h59595ed_2
+ nest-asyncio=1.5.8=pyhd8ed1ab_0
+ networkx=3.2.1=pypi_0
+ nodejs=20.9.0=hb753e55_0
+ notebook=6.5.6=pypi_0
+ notebook-executor=0.2=pypi_0
+ notebook-shim=0.2.3=pyhd8ed1ab_0
+ numba=0.58.1=pypi_0
+ numpy=1.24.4=pypi_0
+ nvidia-ml-py=11.495.46=pypi_0
+ oauth2client=4.1.3=pypi_0
+ oauthlib=3.2.2=pypi_0
+ objsize=0.6.1=pypi_0
+ opencensus=0.11.4=pypi_0
+ opencensus-context=0.1.3=pypi_0
+ openssl=3.2.0=hd590300_1
+ opentelemetry-api=1.22.0=pypi_0
+ opentelemetry-exporter-otlp=1.22.0=pypi_0
+ opentelemetry-exporter-otlp-proto-common=1.22.0=pypi_0
+ opentelemetry-exporter-otlp-proto-grpc=1.22.0=pypi_0
+ opentelemetry-exporter-otlp-proto-http=1.22.0=pypi_0
+ opentelemetry-proto=1.22.0=pypi_0
+ opentelemetry-sdk=1.22.0=pypi_0
+ opentelemetry-semantic-conventions=0.43b0=pypi_0
+ opt-einsum=3.3.0=pypi_0
+ orjson=3.9.10=pypi_0
+ overrides=7.4.0=pyhd8ed1ab_0
+ packaging=23.2=pyhd8ed1ab_0
+ pandas=2.1.4=pypi_0
+ pandas-profiling=3.6.6=pypi_0
+ pandoc=3.1.3=h32600fe_0
+ pandocfilters=1.5.0=pyhd8ed1ab_0
+ papermill=2.5.0=pypi_0
+ parso=0.8.3=pyhd8ed1ab_0
+ patsy=0.5.6=pypi_0
+ pexpect=4.9.0=pypi_0
+ phik=0.12.4=pypi_0
+ pickleshare=0.7.5=py_1003
+ pillow=10.2.0=pypi_0
+ pip=23.3.2=pyhd8ed1ab_0
+ pkgutil-resolve-name=1.3.10=pyhd8ed1ab_1
+ platformdirs=3.11.0=pypi_0
+ plotly=5.18.0=pypi_0
+ pluggy=1.3.0=pyhd8ed1ab_0
+ prettytable=3.9.0=pypi_0
+ prometheus_client=0.19.0=pyhd8ed1ab_0
+ promise=2.3=pypi_0
+ prompt-toolkit=3.0.43=pypi_0
+ proto-plus=1.23.0=pypi_0
+ protobuf=3.20.3=pypi_0
+ psutil=5.9.3=pypi_0
+ ptyprocess=0.7.0=pyhd3deb0d_0
+ pure_eval=0.2.2=pyhd8ed1ab_0
+ py-spy=0.3.14=pypi_0
+ pyarrow=9.0.0=pypi_0
+ pyasn1=0.5.1=pyhd8ed1ab_0
+ pyasn1-modules=0.3.0=pyhd8ed1ab_0
+ pybind11-abi=4=hd8ed1ab_3
+ pycosat=0.6.6=py310h2372a71_0
+ pycparser=2.21=pypi_0
+ pydantic=2.5.3=pypi_0
+ pydantic-core=2.14.6=pypi_0
+ pydot=1.4.2=pypi_0
+ pygments=2.17.2=pyhd8ed1ab_0
+ pyjwt=2.8.0=pypi_0
+ pymongo=3.13.0=pypi_0
+ pyopenssl=23.3.0=pyhd8ed1ab_0
+ pyparsing=3.1.1=pypi_0
+ pysocks=1.7.1=py310h06a4308_0
+ python=3.10.13=hd12c33a_1_cpython
+ python-dateutil=2.8.2=pyhd8ed1ab_0
+ python-dotenv=1.0.0=pypi_0
+ python-fastjsonschema=2.19.1=pyhd8ed1ab_0
+ python-json-logger=2.0.7=pyhd8ed1ab_0
+ python_abi=3.10=4_cp310
+ pytz=2023.3.post1=pypi_0
+ pyu2f=0.1.5=pyhd8ed1ab_0
+ pywavelets=1.5.0=pypi_0
+ pyyaml=6.0.1=py310h2372a71_1
+ pyzmq=24.0.1=pypi_0
+ ray=2.9.0=pypi_0
+ ray-cpp=2.9.0=pypi_0
+ re2=2022.06.01=h27087fc_1
+ readline=8.2=h8228510_1
+ referencing=0.32.1=pyhd8ed1ab_0
+ regex=2023.12.25=pypi_0
+ reproc=14.2.4.post0=hd590300_1
+ reproc-cpp=14.2.4.post0=h59595ed_1
+ requests=2.31.0=pyhd8ed1ab_0
+ requests-oauthlib=1.3.1=pypi_0
+ requests-toolbelt=0.10.1=pypi_0
+ retrying=1.3.4=pypi_0
+ rfc3339-validator=0.1.4=pyhd8ed1ab_0
+ rfc3986-validator=0.1.1=pyh9f0ad1d_0
+ rich=13.7.0=pypi_0
+ rpds-py=0.16.2=py310hcb5633a_0
+ rsa=4.9=pyhd8ed1ab_0
+ ruamel.yaml=0.18.5=py310h2372a71_0
+ ruamel.yaml.clib=0.2.7=py310h2372a71_2
+ ruamel_yaml=0.15.100=py310h7f8727e_0
+ scikit-image=0.22.0=pypi_0
+ scikit-learn=1.3.2=pypi_0
+ scipy=1.11.4=pypi_0
+ seaborn=0.12.2=pypi_0
+ secretstorage=3.3.3=pypi_0
+ send2trash=1.8.2=pyh41d4057_0
+ setuptools=69.0.3=pyhd8ed1ab_0
+ shapely=2.0.2=pypi_0
+ simpervisor=1.0.0=pypi_0
+ six=1.16.0=pypi_0
+ smart-open=6.4.0=pypi_0
+ smmap=5.0.1=pypi_0
+ sniffio=1.3.0=pyhd8ed1ab_0
+ soupsieve=2.5=pyhd8ed1ab_1
+ sqlalchemy=2.0.25=pypi_0
+ sqlite=3.38.2=hc218d9a_0
+ sqlparse=0.4.4=pypi_0
+ stack-data=0.6.3=pypi_0
+ stack_data=0.6.2=pyhd8ed1ab_0
+ starlette=0.32.0.post1=pypi_0
+ statsmodels=0.14.1=pypi_0
+ tabulate=0.9.0=pypi_0
+ tangled-up-in-unicode=0.2.0=pypi_0
+ tenacity=8.2.3=pypi_0
+ tensorboard=2.15.1=pypi_0
+ tensorboard-data-server=0.7.2=pypi_0
+ tensorboard-plugin-profile=2.15.0=pypi_0
+ tensorboardx=2.6.2.2=pypi_0
+ tensorflow=2.15.0=pypi_0
+ tensorflow-cloud=0.1.16=pypi_0
+ tensorflow-datasets=4.9.4=pypi_0
+ tensorflow-estimator=2.15.0=pypi_0
+ tensorflow-hub=0.15.0=pypi_0
+ tensorflow-io=0.35.0=pypi_0
+ tensorflow-io-gcs-filesystem=0.35.0=pypi_0
+ tensorflow-metadata=0.14.0=pypi_0
+ tensorflow-probability=0.23.0=pypi_0
+ tensorflow-serving-api=2.14.1=pypi_0
+ tensorflow-transform=0.14.0=pypi_0
+ termcolor=2.4.0=pypi_0
+ terminado=0.18.0=pyh0d859eb_0
+ threadpoolctl=3.2.0=pypi_0
+ tifffile=2023.12.9=pypi_0
+ tinycss2=1.2.1=pyhd8ed1ab_0
+ tk=8.6.13=noxft_h4845f30_101
+ toml=0.10.2=pypi_0
+ tomli=2.0.1=pypi_0
+ tornado=6.3.3=py310h2372a71_1
+ tqdm=4.66.1=pyhd8ed1ab_0
+ traitlets=5.9.0=pyhd8ed1ab_0
+ truststore=0.8.0=pyhd8ed1ab_0
+ typeguard=4.1.5=pypi_0
+ typer=0.9.0=pypi_0
+ types-python-dateutil=2.8.19.20240106=pyhd8ed1ab_0
+ typing-extensions=4.9.0=hd8ed1ab_0
+ typing_extensions=4.9.0=pyha770c72_0
+ typing_utils=0.1.0=pyhd8ed1ab_0
+ tzdata=2023.4=pypi_0
+ uri-template=1.3.0=pyhd8ed1ab_0
+ uritemplate=3.0.1=pypi_0
+ urllib3=1.26.18=pypi_0
+ uvicorn=0.25.0=pypi_0
+ uvloop=0.19.0=pypi_0
+ virtualenv=20.21.0=pypi_0
+ visions=0.7.5=pypi_0
+ watchfiles=0.21.0=pypi_0
+ wcwidth=0.2.13=pyhd8ed1ab_0
+ webcolors=1.13=pyhd8ed1ab_0
+ webencodings=0.5.1=pyhd8ed1ab_2
+ websocket-client=1.7.0=pyhd8ed1ab_0
+ websockets=12.0=pypi_0
+ werkzeug=2.1.2=pypi_0
+ wheel=0.42.0=pyhd8ed1ab_0
+ widgetsnbextension=4.0.9=pypi_0
+ witwidget=1.8.1=pypi_0
+ wordcloud=1.9.3=pypi_0
+ wrapt=1.14.1=pypi_0
+ xz=5.2.6=h166bdaf_0
+ y-py=0.6.2=pypi_0
+ yaml=0.2.5=h7b6447c_0
+ yaml-cpp=0.8.0=h59595ed_0
+ yarl=1.9.4=pypi_0
+ ydata-profiling=4.6.4=pypi_0
+ ypy-websocket=0.8.4=pypi_0
+ zeromq=4.3.5=h59595ed_0
+ zipp=3.17.0=pyhd8ed1ab_0
+ zlib=1.2.13=hd590300_5
+ zstandard=0.22.0=py310h1275a96_0
+ zstd=1.5.5=hfc55251_0
runs/Aug20_15-55-35_092165c8a85c/events.out.tfevents.1724169364.092165c8a85c.34.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f0c8609921974726e57672b0aeec9ee5624d5179b55cd4c927e345ad39501d84
+ size 5548
runs/Aug20_16-09-55_092165c8a85c/events.out.tfevents.1724170199.092165c8a85c.34.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e9abfc65ff0d566589703f10dbb21971b8801c8ddb8c12418bbb93fb1efcf36
+ size 7023
special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "bos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tmp0lwue7f_/model_architecture.txt ADDED
@@ -0,0 +1,26 @@
+ OPTForCausalLM(
+   (model): OPTModel(
+     (decoder): OPTDecoder(
+       (embed_tokens): Embedding(50272, 512, padding_idx=1)
+       (embed_positions): OPTLearnedPositionalEmbedding(2050, 1024)
+       (project_out): Linear(in_features=1024, out_features=512, bias=False)
+       (project_in): Linear(in_features=512, out_features=1024, bias=False)
+       (layers): ModuleList(
+         (0-23): 24 x OPTDecoderLayer(
+           (self_attn): OPTAttention(
+             (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
+             (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
+             (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
+             (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
+           )
+           (activation_fn): ReLU()
+           (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+           (fc1): Linear(in_features=1024, out_features=4096, bias=True)
+           (fc2): Linear(in_features=4096, out_features=1024, bias=True)
+           (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+         )
+       )
+     )
+   )
+   (lm_head): Linear(in_features=512, out_features=50272, bias=False)
+ )
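A dump like this is simply the nested repr that PyTorch prints for any `nn.Module`. A minimal sketch of how it could have been produced, assuming a local checkpoint directory named "tmp":

```python
# Sketch: str(model) on a torch.nn.Module yields the nested repr above.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("tmp")  # local path assumed
with open("model_architecture.txt", "w") as f:
    f.write(str(model))
```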
tmp3v01yz78 ADDED
The diff for this file is too large to render. See raw diff
 
tmp55m8q0wc/__pycache__/_remote_module_non_scriptable.cpython-310.pyc ADDED
Binary file (1.5 kB).
 
tmp55m8q0wc/_remote_module_non_scriptable.py ADDED
@@ -0,0 +1,81 @@
+ from typing import *
+
+ import torch
+ import torch.distributed.rpc as rpc
+ from torch import Tensor
+ from torch._jit_internal import Future
+ from torch.distributed.rpc import RRef
+ from typing import Tuple  # pyre-ignore: unused import
+
+
+ module_interface_cls = None
+
+
+ def forward_async(self, *args, **kwargs):
+     args = (self.module_rref, self.device, self.is_device_map_set, *args)
+     kwargs = {**kwargs}
+     return rpc.rpc_async(
+         self.module_rref.owner(),
+         _remote_forward,
+         args,
+         kwargs,
+     )
+
+
+ def forward(self, *args, **kwargs):
+     args = (self.module_rref, self.device, self.is_device_map_set, *args)
+     kwargs = {**kwargs}
+     ret_fut = rpc.rpc_async(
+         self.module_rref.owner(),
+         _remote_forward,
+         args,
+         kwargs,
+     )
+     return ret_fut.wait()
+
+
+ _generated_methods = [
+     forward_async,
+     forward,
+ ]
+
+
+ def _remote_forward(
+         module_rref: RRef[module_interface_cls], device: str, is_device_map_set: bool, *args, **kwargs):
+     module = module_rref.local_value()
+     device = torch.device(device)
+
+     if device.type != "cuda":
+         return module.forward(*args, **kwargs)
+
+     # If the module is on a cuda device,
+     # move any CPU tensor in args or kwargs to the same cuda device.
+     # Since torch script does not support generator expression,
+     # have to use concatenation instead of
+     # ``tuple(i.to(device) if isinstance(i, Tensor) else i for i in *args)``.
+     args = (*args,)
+     out_args: Tuple[()] = ()
+     for arg in args:
+         arg = (arg.to(device),) if isinstance(arg, Tensor) else (arg,)
+         out_args = out_args + arg
+
+     kwargs = {**kwargs}
+     for k, v in kwargs.items():
+         if isinstance(v, Tensor):
+             kwargs[k] = kwargs[k].to(device)
+
+     if is_device_map_set:
+         return module.forward(*out_args, **kwargs)
+
+     # If the device map is empty, then only CPU tensors are allowed to send over wire,
+     # so have to move any GPU tensor to CPU in the output.
+     # Since torch script does not support generator expression,
+     # have to use concatenation instead of
+     # ``tuple(i.cpu() if isinstance(i, Tensor) else i for i in module.forward(*out_args, **kwargs))``.
+     ret: Tuple[()] = ()
+     for i in module.forward(*out_args, **kwargs):
+         i = (i.cpu(),) if isinstance(i, Tensor) else (i,)
+         ret = ret + i
+     return ret
tmp59vk47ea.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "version": 1,
+   "storagePolicy": "wandb-storage-policy-v1",
+   "storagePolicyConfig": {
+     "storageLayout": "V2"
+   },
+   "contents": {
+     "model_architecture.txt": {
+       "digest": "b+9XOsxbEnOTw5uiXx8uNA==",
+       "birthArtifactID": "QXJ0aWZhY3Q6MTE1ODA1NzY3Nw==",
+       "size": 1217
+     }
+   }
+ }
tmpa43hu54q/model_architecture.txt ADDED
@@ -0,0 +1,26 @@
+ OPTForCausalLM(
+   (model): OPTModel(
+     (decoder): OPTDecoder(
+       (embed_tokens): Embedding(50272, 512, padding_idx=1)
+       (embed_positions): OPTLearnedPositionalEmbedding(2050, 1024)
+       (project_out): Linear(in_features=1024, out_features=512, bias=False)
+       (project_in): Linear(in_features=512, out_features=1024, bias=False)
+       (layers): ModuleList(
+         (0-23): 24 x OPTDecoderLayer(
+           (self_attn): OPTAttention(
+             (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
+             (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
+             (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
+             (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
+           )
+           (activation_fn): ReLU()
+           (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+           (fc1): Linear(in_features=1024, out_features=4096, bias=True)
+           (fc2): Linear(in_features=4096, out_features=1024, bias=True)
+           (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+         )
+       )
+     )
+   )
+   (lm_head): Linear(in_features=512, out_features=50272, bias=False)
+ )
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "add_bos_token": true,
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "</s>",
+   "clean_up_tokenization_spaces": true,
+   "eos_token": "</s>",
+   "errors": "replace",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "tokenizer_class": "GPT2Tokenizer",
+   "unk_token": "</s>"
+ }
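Two quirks here are typical of OPT checkpoints: `"</s>"` (id 2) is reused for bos, eos, and unk while `"<pad>"` is id 1, and the huge `model_max_length` is the library's sentinel for "no explicit length limit recorded". A sketch of loading the tokenizer defined by these files, assuming the same local "tmp" directory:

```python
# Sketch: the tokenizer files above load together via AutoTokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tmp")  # local path assumed
print(tokenizer.bos_token, tokenizer.eos_token, tokenizer.pad_token)
# add_bos_token=true means every encoded sequence starts with id 2:
print(tokenizer("hello").input_ids[0])  # 2
```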
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c0119f409bc0d7968d2c1e139a8f4b8a45c7abc39eafd809f949f279dda6de85
+ size 5368
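training_args.bin is the Trainer's `TrainingArguments` object serialized with `torch.save`, so it can be inspected directly. A sketch, noting that `weights_only=False` is needed on newer PyTorch releases because the file is a pickled Python object rather than a tensor checkpoint:

```python
# Sketch: inspect the pickled TrainingArguments stored in this commit.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate, args.num_train_epochs)  # expected: 5e-05 3.0
```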
v8-compile-cache-0/11.3.244.8-node.16/zSoptzScondazSsharezSjupyterzSlabzSstagingzSnode_moduleszSwebpackzSbinzSwebpack.js.BLOB ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e1d585bb02b985ec668a78261aed811e24963f00ac15bf945a945de1f32f956
+ size 4604272
v8-compile-cache-0/11.3.244.8-node.16/zSoptzScondazSsharezSjupyterzSlabzSstagingzSnode_moduleszSwebpackzSbinzSwebpack.js.MAP ADDED
The diff for this file is too large to render. See raw diff
 
vocab.json ADDED
The diff for this file is too large to render. See raw diff
 
yarn--1704964927164-0.1782869183766309/node ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "$@"
yarn--1704964927165-0.35993847591915795/node ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "$@"
yarn--1704964927165-0.35993847591915795/yarn ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "/opt/conda/share/jupyter/lab/staging/yarn.js" "$@"
yarn--1704964928127-0.35887085411210307/node ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "$@"
yarn--1704964928127-0.35887085411210307/yarn ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "/opt/conda/share/jupyter/lab/staging/yarn.js" "$@"
yarn--1704964933548-0.9586437363349738/node ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "$@"
yarn--1704964933548-0.9586437363349738/yarn ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "/opt/conda/share/jupyter/lab/staging/yarn.js" "$@"
yarn--1704965062369-0.2181574829625219/node ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "$@"
yarn--1704965062369-0.2181574829625219/yarn ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "/opt/conda/share/jupyter/lab/staging/yarn.js" "$@"
yarn--1704965063359-0.7469983285256505/node ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "$@"
yarn--1704965063359-0.7469983285256505/yarn ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "/opt/conda/share/jupyter/lab/staging/yarn.js" "$@"
yarn--1704965068891-0.46018400452963615/node ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "$@"
yarn--1704965068891-0.46018400452963615/yarn ADDED
@@ -0,0 +1,3 @@
+ #!/bin/sh
+
+ exec "/opt/conda/bin/node" "/opt/conda/share/jupyter/lab/staging/yarn.js" "$@"