---
title: USDA Food Assistant
emoji: 🍴
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---

# USDA Food Assistant

The USDA Food Assistant is an interactive tool for exploring food data from the USDA Branded Food Dataset. It combines semantic search with natural language processing, so users can retrieve the record for a specific food and then ask conversational questions about its nutrients, ingredients, and serving sizes.

## Overview

The USDA Food Assistant operates in two main steps:

1. **Data Retrieval**: Users begin by inputting the name of a food item (e.g., “Oreo cookies”), which initiates a semantic search in the Pinecone Vector Store. Using the `multilingual-e5-large` embedding model, the assistant retrieves relevant data, such as ingredients, nutrients, and serving sizes for the specified food item, and loads this information as context for the interaction.

2. **Interactive Conversation**: Once the data is loaded into context, users can ask detailed follow-up questions about the food item. Questions might include:
   - “What are the vitamins and minerals in this item?”
   - “How many calories are in a 250-gram serving?”
   - “Does this food contain any allergens?”
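
The two steps above can be sketched without any external service. The example below is a minimal illustration, not the assistant's actual code: it replaces the Pinecone query with an in-memory cosine-similarity search, uses tiny hand-written vectors instead of `multilingual-e5-large` embeddings, and relies on made-up records and hypothetical field names (`name`, `vector`, `ingredients`, `serving_size_g`).

```python
import math

# Made-up records standing in for entries in the vector store.
# Real vectors would come from the embedding model and have many more dimensions.
RECORDS = [
    {"name": "Chocolate sandwich cookies", "vector": [0.9, 0.1, 0.0],
     "ingredients": "sugar, flour, cocoa", "serving_size_g": 34},
    {"name": "Plain yogurt", "vector": [0.1, 0.9, 0.2],
     "ingredients": "milk, cultures", "serving_size_g": 170},
    {"name": "Orange juice", "vector": [0.0, 0.2, 0.9],
     "ingredients": "orange juice", "serving_size_g": 240},
]


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_match(query_vector, records):
    """Step 1: pick the record whose vector is closest to the query."""
    return max(records, key=lambda r: cosine_similarity(query_vector, r["vector"]))


def build_context(record):
    """Step 2: turn the retrieved record into text that is loaded as context."""
    return (
        f"Food: {record['name']}\n"
        f"Ingredients: {record['ingredients']}\n"
        f"Serving size: {record['serving_size_g']} g"
    )


# A hand-written query vector that plays the role of an embedded food name.
query_vector = [0.8, 0.2, 0.1]
context = build_context(top_match(query_vector, RECORDS))
```

In the real assistant, the query vector would be produced by the embedding model and the nearest-neighbour lookup would be handled by the vector store; only the overall shape (embed, retrieve, load as context) is meant to carry over.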

Together, these two steps give users a detailed view of each food item's nutritional profile, which supports informed decisions about food content and nutrition.
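
A follow-up question such as the 250-gram calorie example comes down to scaling a stored nutrient value to the requested serving. The sketch below assumes nutrient amounts are expressed per 100 g, which should be checked against the dataset; the function name and the 480 kcal figure are hypothetical and chosen only for illustration.

```python
def scale_nutrient(amount_per_100g, serving_g):
    """Scale a per-100 g nutrient amount to a serving of `serving_g` grams.

    Assumes the stored amount is expressed per 100 g of food.
    """
    if serving_g < 0:
        raise ValueError("serving_g must be non-negative")
    return amount_per_100g * serving_g / 100.0


# Illustrative value only: a food with 480 kcal per 100 g, asked about 250 g.
calories = scale_nutrient(480, 250)
```

The same scaling applies to any nutrient stored on a per-100 g basis, so one helper covers calories, vitamins, and minerals alike.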

For more information on the development and structure of this assistant, see the blog post [here](https://jacktol.net/posts/building_a_data_pipeline_for_usda_fooddata_central/).

## Access the Dataset

The USDA Branded Food Dataset used by this assistant is available on Hugging Face Datasets [here](https://huggingface.co/datasets/jacktol/usda_branded_food_data).

## See the Code

The full code for the data pipeline responsible for creating the dataset, as well as for the USDA Food Assistant, can be found on GitHub [here](https://github.com/jack-tol/usda-food-data-pipeline).

## License

This project is licensed under the MIT License.