id: "duckdb-summarize"
title: "Summarize"
slug: "duckdb-summarize-query"
description: "Summarize a table or a selection of its columns for a quick overview of the dataset's structure and statistics."
code: |
  -- summarize a specific table
  SUMMARIZE my_table;
  -- summarize a specific column
  SUMMARIZE SELECT my_column FROM my_table;
# DuckDB Summarize Query

This snippet demonstrates how to use the `SUMMARIZE` statement in DuckDB to compute aggregate statistics for a table or for the result of a query.
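```sql
-- summarize a specific table
SUMMARIZE my_table;
-- summarize a specific column (SUMMARIZE accepts a table name or a query;
-- my_table.my_column would be parsed as schema.table, not as a column)
SUMMARIZE SELECT my_column FROM my_table;
```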
The `SUMMARIZE` command in DuckDB provides a comprehensive overview of your data by computing various aggregates for each column:

- `min` and `max`: The minimum and maximum values in the column.
- `approx_unique`: An approximation of the number of unique values.
- `avg`: The average value for numeric columns.
- `std`: The standard deviation for numeric columns.
- `q25`, `q50`, `q75`: The 25th, 50th (median), and 75th percentiles.
- `count`: The total number of rows.
- `null_percentage`: The percentage of NULL values in the column.
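
As a minimal sketch of how these statistics can be inspected, the example below builds a small throwaway table and summarizes it. The table name `demo_table`, its columns, and `demo_summary` are hypothetical names chosen for illustration. Wrapping `SUMMARIZE` in a subquery is assumed to be available in your DuckDB version (recent releases support it).

```sql
-- Hypothetical sample data: three rows, one NULL in my_column
CREATE OR REPLACE TABLE demo_table AS
SELECT *
FROM (VALUES (1, 'a'), (2, 'b'), (NULL, 'c')) AS t(my_column, label);

-- One output row per column of demo_table
SUMMARIZE demo_table;

-- Summarize only the columns selected by a query
SUMMARIZE SELECT my_column FROM demo_table;

-- Treat the summary as a relation and keep only a few of its columns
SELECT column_name, column_type, approx_unique, null_percentage
FROM (SUMMARIZE demo_table);

-- Persist the summary for later inspection
CREATE OR REPLACE TABLE demo_summary AS
SELECT * FROM (SUMMARIZE demo_table);
```

Each row of the result describes one column of the input, so filtering or sorting the summary (for example by `null_percentage`) is an ordinary query over that relation.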
This command is particularly useful for quick data exploration and understanding the distribution of values across your dataset.

You can read more about the `SUMMARIZE` command in the DuckDB documentation [here](https://duckdb.org/docs/guides/meta/summarize.html).