Spaces:
Sleeping
Sleeping
import outlines | |
def generate_mapping_prompt(code): | |
"""Format the following python code to a list of cells to be used in a jupyter notebook: | |
{{ code }} | |
The output should be a list of json objects with the | |
following schema, including the leading and trailing "```json" and "```": | |
```json | |
[ | |
{ | |
"cell_type": string // This refers either is a markdown or code cell type. | |
"source": list of string separated by comma // This is the list of text or python code. | |
} | |
] | |
``` | |
""" | |
def generate_eda_prompt(columns_info, sample_data, first_code): | |
"""You are an expert data analyst tasked with generating an exploratory data analysis (EDA) Jupyter notebook. The data is provided as a pandas DataFrame with the following structure: | |
Columns and Data Types: | |
{{ columns_info }} | |
Sample Data: | |
{{ sample_data }} | |
Please create a pandas EDA notebook that includes the following: | |
1. Summary statistics for numerical columns. | |
2. Distribution plots for numerical columns. | |
3. Bar plots or count plots for categorical columns. | |
4. Correlation matrix and heatmap for numerical columns. | |
5. Any additional relevant visualizations or analyses you deem appropriate. | |
Ensure the notebook is well-organized, with explanations for each step. | |
It is mandatory that you use the following code to load the dataset, DO NOT try to load the dataset in any other way: | |
{{ first_code }} | |
The output should be a markdown python code snippet between the leading and trailing "```python" and "```". | |
""" | |
def generate_embedding_prompt(columns_info, sample_data, first_code): | |
"""You are an expert data scientist tasked with generating a Jupyter notebook to generate embeddings from a dataset. | |
The data is provided as a pandas DataFrame with the following structure: | |
Columns and Data Types: | |
{{ columns_info }} | |
Sample Data: | |
{{ sample_data }} | |
Please create a notebook that includes the following: | |
1. Load the dataset | |
2. Load embedding model using sentence-transformers library | |
3. Convert data into embeddings | |
4. Store embeddings | |
Ensure the notebook is well-organized, with explanations for each step. | |
It is mandatory that you use the following code to load the dataset, DO NOT try to load the dataset in any other way: | |
{{ first_code }} | |
""" | |
def generate_training_prompt(columns_info, sample_data, first_code): | |
""" | |
TODO | |
""" | |