---
id: "duckdb-summarize"
title: "Summarize"
slug: "duckdb-summarize-query"
description: "Summarize a specific table or columns for a quick overview of the dataset's structure and statistics."
code: |
  -- summarize a specific table
  SUMMARIZE my_table;

  -- summarize a specific column
  SUMMARIZE SELECT my_column FROM my_table;
---

# DuckDB Summarize Query

This snippet demonstrates how to use the `SUMMARIZE` statement in DuckDB to calculate aggregate statistics for a dataset.

```sql
-- summarize a specific table
SUMMARIZE my_table;

-- summarize a specific column
SUMMARIZE SELECT my_column FROM my_table;
```

`SUMMARIZE` accepts either a table name or a query. To restrict the summary to particular columns, summarize a query that selects them; `SUMMARIZE my_table.my_column` would be read as a reference to table `my_column` in schema `my_table`, not as a column.

The result contains one row per column, identified by `column_name` and `column_type`, with the following aggregates:

- `min` and `max`: The minimum and maximum values in the column.
- `approx_unique`: An approximation of the number of unique values.
- `avg`: The average value for numeric columns.
- `std`: The standard deviation for numeric columns.
- `q25`, `q50`, `q75`: The 25th, 50th (median), and 75th percentiles.
- `count`: The total number of rows.
- `null_percentage`: The percentage of NULL values in the column.

This makes `SUMMARIZE` useful for quick data exploration: a single statement shows the range, cardinality, distribution, and share of missing values for every column.

For more details, see the [DuckDB guide on `SUMMARIZE`](https://duckdb.org/docs/guides/meta/summarize.html).
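
To try the command end to end, the sketch below builds a small throwaway table and summarizes it in three ways. The table contents and the `price` and `category` columns are hypothetical, chosen only for illustration. The last statement assumes a DuckDB version that allows `SUMMARIZE` as a subquery in `FROM`.

```sql
-- Hypothetical sample data, for illustration only
CREATE TABLE my_table (id INTEGER, price DOUBLE, category VARCHAR);
INSERT INTO my_table VALUES
    (1, 9.99, 'a'),
    (2, 19.50, 'b'),
    (3, NULL, 'a');

-- One output row per column of my_table
SUMMARIZE my_table;

-- Summarize only the rows and columns that a query returns
SUMMARIZE SELECT price FROM my_table WHERE category = 'a';

-- Treat the summary as a relation and keep a few of its columns
SELECT column_name, column_type, null_percentage
FROM (SUMMARIZE my_table);
```

Wrapping `SUMMARIZE` in a subquery is handy when you want to filter, sort, or store the statistics instead of only displaying them.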