CoreyMorris/MMLU-by-task-Leaderboard

Commit History

Updated description and data

383dc16

Corey Morris committed on Sep 30, 2023

updated

65e92dc

Corey Morris committed on Sep 30, 2023

Completed loading from csv

8ef77e5

Corey Morris committed on Sep 30, 2023

loading from csv instead of processing data each time

28e8799

Corey Morris committed on Sep 30, 2023

updated

667f9a4

Corey Morris committed on Sep 30, 2023

WIP. Loading data from csv

1a1910c

Corey Morris committed on Sep 30, 2023

Changed error logging from print statements to logger. Saving the log to a local file is not working yet

d96fdf9

Corey Morris committed on Sep 6, 2023

Catching exceptions in processing files. As new data is introduced, I want to know which files may have different formats and cause problems, but the application shouldn't halt if it can't process a single file

68bce52

Corey Morris committed on Sep 6, 2023

Added new results from hugging face evaluations

ad0b971

Corey Morris committed on Sep 6, 2023

added code to split moral scenario question from one question to two

65d6581

Corey Morris committed on Sep 3, 2023

updated gitignore

76c8220

Corey Morris committed on Sep 3, 2023

updated dev requirements

5ca617c

Corey Morris committed on Sep 3, 2023

Extracted plotting functions from moral_app to plotting_utils to improve organization and testability

2b55a03

Corey Morris committed on Sep 2, 2023

copied main streamlit application to one that will specifically investigate moral reasoning

298ba1f

Corey Morris committed on Sep 2, 2023

updated date and model count

0c07f8b

Corey Morris committed on Sep 1, 2023

updated results

f1eba6e

Corey Morris committed on Sep 1, 2023

Added new hugging face results

3f507e0

Corey Morris committed on Aug 26, 2023

added a test and removed the code to only test a specific file because that code did not work

6ed8672

Corey Morris committed on Aug 24, 2023

updated to run submodule update

25d217c

Corey Morris committed on Aug 24, 2023

Update pytest run to only run specific test files. Other test files are not ready to be run on a different system yet

9345a86

Corey Morris committed on Aug 24, 2023

Merge branch 'main' of https://github.com/c1505/LLM-Dashboard into main

0e575e0

Corey Morris committed on Aug 24, 2023

Added additional results

7863417

Corey Morris committed on Aug 24, 2023

Updated to reflect number of models. Previously, I think there were duplicates

d396c1e

Corey Morris committed on Aug 24, 2023

Create python-app.yml

063ba51

Corey committed on Aug 24, 2023

Updated dependencies

73da8d6

Corey Morris committed on Aug 24, 2023

Show a random question from the moral scenarios evaluation

19c7c67

Corey Morris committed on Aug 24, 2023

Returning just a single file per model directory. Manually removing gpt-j-6b for now because there is something that is causing problems with processing the data

794b32b

Corey Morris committed on Aug 24, 2023

added new results

324764c

Corey Morris committed on Aug 23, 2023

TEMPORARY. deleted gpt-j-6b from subdirectory until problems are fixed

1fef386

Corey Morris committed on Aug 23, 2023

updated results

aba4fe2

Corey Morris committed on Aug 23, 2023

updated dev requirements

7681250

Corey Morris committed on Aug 23, 2023

added dev requirements

885ecf8

Corey Morris committed on Aug 22, 2023

Updated model count

4f20e65

Corey Morris committed on Aug 22, 2023

Updated contaminated models

e3863f2

Corey Morris committed on Aug 22, 2023

Added statement of removal of models

96ffe12

Corey Morris committed on Aug 22, 2023

removed commented code

7fc9618

Corey Morris committed on Aug 22, 2023

updated update data

280db99

Corey Morris committed on Aug 22, 2023

removing models that are known to have training data contaminated with evaluations

a5840fb

Corey Morris committed on Aug 22, 2023

updated with new hugging face results

916604b

Corey Morris committed on Aug 22, 2023

updated pipeline and init

7f2d984

Corey Morris committed on Aug 21, 2023

removed commented code

2f457d8

Corey Morris committed on Aug 21, 2023

added a test

a13887a

Corey Morris committed on Aug 21, 2023

shortened file name

7622af3

Corey Morris committed on Aug 21, 2023

shortened file name

38d88f9

Corey Morris committed on Aug 21, 2023

using URL as file name

25b87bf

Corey Morris committed on Aug 21, 2023

WIP. Updated download file. Can now download all files. Need to integrate that code to loop through all files to download or combine files first into a single dataframe and then save that

0a77c60

Corey Morris committed on Aug 20, 2023

added new test for a file that currently can be downloaded

6251f5a

Corey Morris committed on Aug 20, 2023

Replicating 404 error with a test so I can troubleshoot

9adae3c

Corey Morris committed on Aug 20, 2023

Updated download_file method

b58e1f0

Corey Morris committed on Aug 20, 2023

Build URL from file path is working

cc32c4f

Corey Morris committed on Aug 20, 2023