XGBoost Model for Elderly Nutrition Planning in Uganda
Model Description
This XGBoost regression model predicts daily caloric needs for elderly individuals (aged 60+) in Uganda based on nutritional content, health conditions, regional factors, and demographic information. The model is designed to support nutrition planning, meal preparation, and healthcare decision-making for elderly care in Uganda.
Model Details
- Model Type: XGBoost Regressor (Gradient Boosting)
- Task: Tabular Regression
- Version: v1.0_optimized
- Training Date: November 3, 2025
- Framework: XGBoost 2.0+
- Language: Python
- License: Apache 2.0
Developed By
- Organization: Graph-Enhanced LLMs for Locally-Sourced Elderly Nutrition Planning Project
- Project Focus: AI-driven nutrition planning for elderly populations in Uganda
- Contact: [shakirannannyombi@gmail.com]
Intended Use
Primary Use Cases
- Nutrition Planning: Calculate appropriate caloric intake for elderly individuals based on their health profile
- Meal Planning: Support caregivers and healthcare providers in designing meal plans
- Healthcare Decision Support: Assist medical professionals in nutritional assessments
- Research: Enable studies on nutrition needs for elderly populations in Uganda
- Policy Development: Inform nutrition policies for elderly care facilities
Intended Users
- Healthcare providers and nutritionists
- Elderly care facilities and nursing homes
- Family caregivers
- Public health researchers
- NGOs working in elderly nutrition
Out-of-Scope Use
- ❌ Not for children or adults under 60 years
- ❌ Not for acute medical conditions requiring immediate intervention
- ❌ Not a replacement for professional medical advice
- ❌ Not validated for use outside Uganda without regional calibration
Performance
Overall Metrics
| Metric | Training Set | Test Set |
|---|---|---|
| R² Score | 0.9309 | 0.6710 |
| MAE (kcal/day) | 1.29 | 2.84 |
| RMSE (kcal/day) | 1.65 | 3.60 |
| Training Time | 25.0 seconds | - |
Model Ranking
Ranked among the 5 models evaluated (HistGradient Boosting, XGBoost, LightGBM, MLP, GNN):
- Overall Rank: 🥇 #1 out of 5
- R² Rank: 🥇 #1 (0.6710)
- MAE Rank: 🥇 #1 (2.84 kcal/day)
- RMSE Rank: 🥇 #1 (3.60 kcal/day)
Baseline Comparison
| Metric | Baseline Model | This Model | Improvement |
|---|---|---|---|
| Test R² | 0.6311 | 0.6710 | +6.3% |
| Test MAE | 2.998 kcal/day | 2.842 kcal/day | -5.2% |
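The Improvement column is the change relative to the baseline value. A minimal sketch of that arithmetic, using only the figures in the table above (the helper name `relative_change` is illustrative, not part of the project code):

```python
def relative_change(baseline: float, new: float) -> float:
    """Return the change from baseline to new as a percentage of baseline."""
    return (new - baseline) / baseline * 100.0

# Figures copied from the Baseline Comparison table
r2_gain = relative_change(0.6311, 0.6710)    # higher is better for R²
mae_change = relative_change(2.998, 2.842)   # lower is better for MAE

print(f"Test R²:  {r2_gain:+.1f}%")
print(f"Test MAE: {mae_change:+.1f}%")
```

A negative change on MAE is the desirable direction, since MAE measures error.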
Performance Characteristics
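- Moderate generalization: Test R² of 0.67 means the model explains about two-thirds of the variance in held-out data
- Low absolute error: Test MAE of 2.84 kcal/day is small relative to the typical target range of 1,400 - 2,500 kcal/day
- Moderate overfitting: Train-test R² gap of 0.26 (0.9309 vs. 0.6710); stronger regularization may narrow it
- Consistent predictions: RMSE/MAE ratio of about 1.27 is close to the value expected for roughly normal errors (about 1.25), suggesting few large outliers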
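For readers who want to recompute these figures on their own predictions, the three metrics can be sketched in plain Python as follows. The function name and the sample values are made up for illustration and are not project data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R², MAE, RMSE) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse

# Made-up values purely to exercise the function
y_true = [1800.0, 1950.0, 2100.0, 1700.0]
y_pred = [1810.0, 1940.0, 2080.0, 1720.0]
r2, mae, rmse = regression_metrics(y_true, y_pred)
print(f"R² = {r2:.4f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

Comparing `rmse / mae` against roughly 1.25 gives a quick check on whether a few large errors dominate.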
Training Data
Dataset Overview
- Dataset Name: Uganda Elderly Nutrition Dataset (Enriched)
- Total Samples: 1,000
- Training Samples: 700 (70%)
- Test Samples: 300 (30%)
- Split Method: Random stratified split (seed=42)
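The 70/30 partition can be illustrated with a seeded shuffle of row indices. This is only a sketch: it is a plain random split written with the standard library, it does not implement stratification (the card does not say how the continuous target was stratified), and it will not reproduce the project's exact partition even with the same seed.

```python
import random

def split_indices(n_samples, test_fraction=0.3, seed=42):
    """Shuffle row indices reproducibly and split them into (train, test)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # same seed gives the same order
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(1000)
print(len(train_idx), len(test_idx))
```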
Features (18 total)
Nutritional Content (12 features)
- Energy_kcal_per_serving - Energy content per serving
- Protein_g_per_serving - Protein content (grams)
- Fat_g_per_serving - Fat content (grams)
- Carbohydrates_g_per_serving - Carbohydrate content (grams)
- Fiber_g_per_serving - Dietary fiber (grams)
- Calcium_mg_per_serving - Calcium content (milligrams)
- Iron_mg_per_serving - Iron content (milligrams)
- Zinc_mg_per_serving - Zinc content (milligrams)
- VitaminA_µg_per_serving - Vitamin A content (micrograms)
- VitaminC_mg_per_serving - Vitamin C content (milligrams)
- Potassium_mg_per_serving - Potassium content (milligrams)
- Magnesium_mg_per_serving - Magnesium content (milligrams)
Categorical Features (4 features)
- region_encoded - Geographic region in Uganda (4 regions)
- condition_encoded - Health condition (8 conditions)
- age_group_encoded - Age group (3 groups: 60-70, 70-80, 80+)
- season_encoded - Seasonal availability
Other Features (2 features)
- portion_size_g - Portion size in grams
- estimated_cost_ugx - Estimated cost in Ugandan Shillings
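Because the model expects all 18 features, it can help to check an input row before prediction. The sketch below is a hypothetical helper, not part of the released artifacts. The feature order follows the listing in this card, and the saved feature name list remains the authoritative order.

```python
NUTRIENT_FEATURES = [
    "Energy_kcal_per_serving", "Protein_g_per_serving", "Fat_g_per_serving",
    "Carbohydrates_g_per_serving", "Fiber_g_per_serving", "Calcium_mg_per_serving",
    "Iron_mg_per_serving", "Zinc_mg_per_serving", "VitaminA_µg_per_serving",
    "VitaminC_mg_per_serving", "Potassium_mg_per_serving", "Magnesium_mg_per_serving",
]
CATEGORICAL_FEATURES = [
    "region_encoded", "condition_encoded", "age_group_encoded", "season_encoded",
]
OTHER_FEATURES = ["portion_size_g", "estimated_cost_ugx"]
EXPECTED_FEATURES = NUTRIENT_FEATURES + CATEGORICAL_FEATURES + OTHER_FEATURES

def find_input_problems(row):
    """Return a list of human-readable problems found in one input row (a dict)."""
    problems = []
    for name in EXPECTED_FEATURES:
        if name not in row:
            problems.append(f"missing feature: {name}")
    for name in row:
        if name not in EXPECTED_FEATURES:
            problems.append(f"unexpected feature: {name}")
    # Nutrient amounts, portion size and cost cannot be negative
    for name in NUTRIENT_FEATURES + OTHER_FEATURES:
        value = row.get(name)
        if isinstance(value, (int, float)) and value < 0:
            problems.append(f"negative value for {name}: {value}")
    return problems

# Deliberately broken row: one feature removed, one negative amount
example = {name: 1 for name in EXPECTED_FEATURES}
del example["season_encoded"]
example["Iron_mg_per_serving"] = -3
print(find_input_problems(example))
```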
Geographic Coverage
4 Regions of Uganda:
- Central Uganda (Buganda)
- Western Uganda (Ankole, Tooro, Kigezi, Bunyoro)
- Eastern Uganda (Busoga, Bugisu, Teso)
- Northern Uganda (Acholi, Lango, Karamoja, West Nile)
Health Conditions Covered
8 Common Elderly Conditions:
- Hypertension
- Undernutrition
- Anemia
- Frailty
- Digestive issues
- Arthritis
- Osteoporosis
- Diabetes
Age Groups
- 60-70 years: Early elderly
- 70-80 years: Mid elderly
- 80+ years: Advanced elderly
Target Variable
- Name: Daily Caloric Needs
- Unit: kcal/day
- Range: Typically 1,400 - 2,500 kcal/day
- Distribution: Approximately normal
Training Details
Hyperparameters (Optimized)
{
    'n_estimators': 200,
    'max_depth': 4,
    'learning_rate': 0.05,
    'min_child_weight': 5,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'gamma': 0,
    'reg_alpha': 0,
    'reg_lambda': 1.5
}
Training Configuration
- Objective: Regression (minimize squared error)
- Evaluation Metric: R² Score, MAE, RMSE
- Validation Strategy: 70-30 train-test split
- Early Stopping: Not used (200 trees)
- Feature Scaling: StandardScaler applied to numeric features
- Encoding: Label encoding for categorical features
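The two preprocessing steps can be illustrated without any dependencies. scikit-learn's `LabelEncoder` assigns integer codes in sorted order of the class labels, and `StandardScaler` subtracts the mean and divides by the population standard deviation. The sketch below mimics both. The condition strings are assumed from the list in this card and may differ from the labels used in training, so the saved encoders remain the authoritative mapping.

```python
import statistics

def fit_label_codes(values):
    """Map each distinct label to an integer code, in sorted label order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}

def standardize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Label strings assumed from the "Health Conditions Covered" list
conditions = [
    "Hypertension", "Undernutrition", "Anemia", "Frailty",
    "Digestive issues", "Arthritis", "Osteoporosis", "Diabetes",
]
condition_codes = fit_label_codes(conditions)
print(condition_codes)

scaled = standardize([200.0, 250.0, 300.0])   # e.g. portion sizes in grams
print(scaled)
```

Because codes follow sorted label order rather than list position, it is safer to look codes up in the saved encoders than to assume them.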
Training Environment
- Hardware: CPU-based training
- Training Time: 25 seconds
- Memory Usage: <1 GB
- Reproducibility: Random seed = 42
How to Use
Installation
pip install xgboost==2.0.0 pandas numpy scikit-learn
Loading the Model
import pickle
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
# Load model files (only unpickle files from a source you trust)
with open('xgboost_nutrition_model_20251103.pkl', 'rb') as f:
    model = pickle.load(f)  # trained XGBoost regressor
with open('xgboost_scaler_20251103.pkl', 'rb') as f:
    scaler = pickle.load(f)  # fitted StandardScaler
with open('xgboost_label_encoders_20251103.pkl', 'rb') as f:
    label_encoders = pickle.load(f)  # categorical encoders
with open('xgboost_feature_names_20251103.pkl', 'rb') as f:
    feature_names = pickle.load(f)  # feature names in training order
Making Predictions
# Example input data
input_data = {
    'Energy_kcal_per_serving': 350,
    'Protein_g_per_serving': 15,
    'Fat_g_per_serving': 10,
    'Carbohydrates_g_per_serving': 45,
    'Fiber_g_per_serving': 5,
    'Calcium_mg_per_serving': 200,
    'Iron_mg_per_serving': 3,
    'Zinc_mg_per_serving': 2,
    'VitaminA_µg_per_serving': 500,
    'VitaminC_mg_per_serving': 20,
    'Potassium_mg_per_serving': 400,
    'Magnesium_mg_per_serving': 50,
    'region_encoded': 0,  # Central Uganda
    'condition_encoded': 0,  # Hypertension
    'age_group_encoded': 1,  # 70-80
    'season_encoded': 0,
    'portion_size_g': 250,
    'estimated_cost_ugx': 5000
}
# Convert to DataFrame
df = pd.DataFrame([input_data])
# Ensure correct feature order
df = df[feature_names]
# Scale features (if scaler expects it)
# Note: Check if your scaler was fit on all features or just numeric ones
# df_scaled = scaler.transform(df)
# Make prediction
predicted_calories = model.predict(df)
print(f"Predicted daily caloric needs: {predicted_calories[0]:.2f} kcal/day")
Using with the API
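import requests
url = "http://your-api-endpoint/predict"
data = {
    "data": {
        "Energy_kcal_per_serving": 350,
        "Protein_g_per_serving": 15,
        # ... other features
    }
}
# A timeout keeps the call from hanging if the endpoint is unreachable
response = requests.post(url, json=data, timeout=10)
response.raise_for_status()  # raise on HTTP 4xx/5xx instead of parsing an error body
result = response.json()
print(f"Predicted calories: {result['prediction']['caloric_needs']:.2f} kcal/day")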
Limitations and Biases
Known Limitations
Sample Size:
- Only 1,000 samples in total (700 used for training) may not capture all population variability
- Recommend caution when making predictions for rare scenarios
Geographic Scope:
- Trained specifically on Ugandan population data
- May not generalize well to other African countries or regions
Moderate Overfitting:
- Train-test R² gap of 0.26 indicates some overfitting
- Predictions should be validated against clinical guidelines
Feature Dependencies:
- Requires accurate nutritional content data
- Missing or incorrect features will degrade performance
Temporal Validity:
- Trained on 2025 data
- May need retraining as dietary patterns evolve
Potential Biases
Regional Representation:
- May have unequal representation across regions
- Ensure validation across all 4 regions
Health Condition Bias:
- Some conditions may be over/under-represented
- Validate for less common conditions
Socioeconomic Factors:
- Cost estimates may not reflect all economic situations
- Consider local affordability in deployment
Uncertainty Quantification
- Prediction Uncertainty: ±2.84 kcal/day (MAE)
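- Approximate 95% Interval: ±7.1 kcal/day (1.96 × test RMSE, assuming roughly normal errors; 2 × MAE ≈ ±5.7 kcal/day covers less than 95% under that assumption)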
- Recommended Buffer: Add 10% safety margin for meal planning
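A rough way to turn these figures into numbers for a single prediction is sketched below. It assumes approximately normal errors with constant variance, so the band is ±1.96 × test RMSE. The function names and the example prediction of 1,900 kcal/day are illustrative only.

```python
TEST_RMSE = 3.60   # kcal/day, from the Overall Metrics table

def approximate_interval(prediction, rmse=TEST_RMSE, z=1.96):
    """Return a rough 95% band around one prediction, assuming normal errors."""
    half_width = z * rmse
    return prediction - half_width, prediction + half_width

def with_safety_margin(prediction, margin=0.10):
    """Apply the recommended planning buffer (10% by default)."""
    return prediction * (1.0 + margin)

low, high = approximate_interval(1900.0)
planned = with_safety_margin(1900.0)
print(f"Band: {low:.1f} to {high:.1f} kcal/day")
print(f"Planning target with buffer: {planned:.0f} kcal/day")
```

The band describes model error only; the 10% buffer is a separate planning choice applied on top of the point prediction.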
Ethical Considerations
Fairness and Equity
- Model covers all major regions of Uganda
- Includes diverse health conditions
- Considers affordability factors
- ⚠️ Ensure equal access to technology for model deployment
Privacy
- Model trained on aggregated data (no personal identifiers)
- Predictions do not require storage of sensitive health information
- ⚠️ Implement proper data handling in deployment
Safety
- ⚠️ Critical: Model outputs should be reviewed by qualified healthcare professionals
- ⚠️ Not suitable for emergency nutritional interventions
- ⚠️ Should complement, not replace, clinical judgment
Transparency
- Open methodology and evaluation metrics
- Feature importance available for interpretation
- Model architecture and hyperparameters disclosed
Model Interpretability
Feature Importance (Top 10)
Based on XGBoost's built-in feature importance:
- Energy_kcal_per_serving - Highest importance
- Protein_g_per_serving - High importance
- Carbohydrates_g_per_serving - High importance
- age_group_encoded - Moderate importance
- condition_encoded - Moderate importance
- portion_size_g - Moderate importance
- Calcium_mg_per_serving - Moderate importance
- Fat_g_per_serving - Low-moderate importance
- region_encoded - Low-moderate importance
- Fiber_g_per_serving - Low importance
Full feature importance analysis available in model artifacts
Explainability
- SHAP Values: Can be computed for individual predictions
- Partial Dependence Plots: Available for key features
- Decision Rules: XGBoost trees can be exported for inspection
Comparison with Other Models
| Model | Test R² | Test MAE | Training Time | Rank |
|---|---|---|---|---|
| XGBoost (This Model) | 0.6710 | 2.84 | 25.0s | 🥇 #1 |
| LightGBM | 0.6649 | 2.88 | 0.93s | 🥈 #2 |
| HistGradient Boosting | 0.5116 | 3.42 | 0.14s | 🥉 #3 |
| GNN v2 | 0.5100 | 3.42 | 5.2s | #4 |
| MLP | -0.3035 | 5.66 | 4.5s | #5 |
Recommendation: Use XGBoost for best accuracy; consider LightGBM when training time matters (0.93s vs. 25.0s for a 0.006 drop in test R²).
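The card does not state how the overall rank was computed. As one plausible reading, sorting by test R² (descending) with test MAE as a tie-breaker reproduces the order in the table. The figures below are copied from the table and the sort key is an assumption.

```python
models = [
    {"name": "MLP", "test_r2": -0.3035, "test_mae": 5.66},
    {"name": "LightGBM", "test_r2": 0.6649, "test_mae": 2.88},
    {"name": "GNN v2", "test_r2": 0.5100, "test_mae": 3.42},
    {"name": "XGBoost", "test_r2": 0.6710, "test_mae": 2.84},
    {"name": "HistGradient Boosting", "test_r2": 0.5116, "test_mae": 3.42},
]

# Higher R² first; lower MAE breaks ties (assumed ranking rule)
ranked = sorted(models, key=lambda m: (-m["test_r2"], m["test_mae"]))

for position, model in enumerate(ranked, start=1):
    print(f"#{position} {model['name']}: R² = {model['test_r2']:.4f}, "
          f"MAE = {model['test_mae']:.2f}")
```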
Updates and Maintenance
Version History
- v1.0_optimized (2025-11-03): Initial release
  - Built from 1,000 samples (700 training, 300 test)
  - Hyperparameter optimization completed
  - Test R² = 0.6710
Planned Improvements
Data Collection:
- Expand dataset to 5,000+ samples
- Include more seasonal variations
- Add rural vs. urban distinctions
Feature Engineering:
- Add BMI calculations
- Include activity level metrics
- Incorporate cultural food preferences
Model Enhancements:
- Ensemble with LightGBM for improved accuracy
- Implement SHAP-based explainability
- Add prediction uncertainty intervals
Validation:
- Clinical validation studies
- Cross-regional performance assessment
- Temporal validation (seasonal changes)
Retraining Schedule
- Recommended: Every 6-12 months
- Triggers: New data availability, significant dietary changes, performance degradation
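The performance-degradation trigger could be checked by comparing the MAE on recent, verified cases against the test MAE reported in this card. The sketch below is hypothetical: the 20% tolerance and the function name are assumptions, not project policy.

```python
REPORTED_TEST_MAE = 2.84   # kcal/day, from the Overall Metrics table

def needs_retraining(recent_abs_errors, tolerance=0.20):
    """Return True when recent MAE exceeds the reported MAE by more than `tolerance`.

    recent_abs_errors: absolute errors (kcal/day) on recent cases with known targets.
    tolerance: allowed relative increase over the reported MAE (assumed value).
    """
    recent_mae = sum(recent_abs_errors) / len(recent_abs_errors)
    return recent_mae > REPORTED_TEST_MAE * (1.0 + tolerance)

print(needs_retraining([2.5, 3.0, 2.9]))   # recent MAE 2.8, within tolerance
print(needs_retraining([4.0, 3.9, 3.5]))   # recent MAE 3.8, above the 3.408 threshold
```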
Citation
If you use this model in your research or application, please cite:
@misc{uganda_elderly_nutrition_xgboost_2025,
title={XGBoost Model for Elderly Nutrition Planning in Uganda},
author={[Your Name/Organization]},
year={2025},
month={November},
howpublished={Hugging Face Model Hub},
url={https://huggingface.co/[your-username]/xgboost-elderly-nutrition-uganda}
}
Additional Resources
Related Links
- Project Repository: [https://github.com/Shakiran-Nannyombi/Graph-Enhanced-LLMs-for-Locally-Sourced-Elderly-Nutrition-Planning-in-Uganda.git]
- API Documentation: [API Docs Link]
- Research Paper: [Paper Link if available]
- Dataset: [Shakiran/UgandanNutritionMealPlanning]
Model Artifacts
- xgboost_nutrition_model_20251103.pkl - Main XGBoost model
- xgboost_scaler_20251103.pkl - Feature scaler (StandardScaler)
- xgboost_label_encoders_20251103.pkl - Categorical encoders
- xgboost_feature_names_20251103.pkl - Feature name list
- xgboost_model_metadata_20251103.json - Complete metadata
Support
For questions, issues, or contributions:
- Issues: [https://github.com/Shakiran-Nannyombi/Graph-Enhanced-LLMs-for-Locally-Sourced-Elderly-Nutrition-Planning-in-Uganda.git]
- Email: [devkiran256@gmail.com]
License
This model is released under the Apache License 2.0.
- Commercial use allowed
- Modification allowed
- Distribution allowed
- Patent use allowed
- ⚠️ Must include license and copyright notice
- ⚠️ Must state significant changes
Disclaimer: This model is provided "as is" without warranty. Users are responsible for validating the model's suitability for their specific use case and ensuring compliance with local healthcare regulations.
Acknowledgments
Data Sources and References
This model was developed using knowledge and data extracted from the following authoritative sources:
Handbook_Eldernutr_FINAL.pdf
- Comprehensive handbook on elderly nutrition
- Primary reference for nutritional requirements and guidelines
WHO ICOPE Guidelines (icope.pdf)
- World Health Organization Integrated Care for Older People (ICOPE)
- Framework for elderly healthcare and nutrition assessment
Nutritional_Requirements_of_Older_People.pdf
- Detailed nutritional requirements for elderly populations
- Evidence-based dietary recommendations
TipSheet_21_HealthyEatingForOlderAdults.pdf
- Practical tips for healthy eating in older adults
- Community-oriented nutrition guidance
MSD Manual Professional Edition
- "Drug Categories of Concern in Older Adults - Geriatrics"
- Clinical reference for medication-nutrition interactions
MSD Manual Consumer Version
- "Aging and Medications - Older People's Health Issues"
- Patient-friendly information on aging and health
Uganda Nutrition Data (download.pdf)
- Uganda-specific nutritional data and food composition
- Local context and dietary patterns
Street Food Nutritional Analysis
- "Average energy and nutrient contents of typical street food dishes in Uganda (Kampala)"
- Local food nutritional profiles for urban Uganda
Institutional Support
- Uganda Ministry of Health - Nutrition guidelines and policy frameworks
- World Health Organization (WHO) - ICOPE framework and elderly care guidelines
- MSD Manuals - Clinical and consumer health information
Technical Contributions
- Open-source community: XGBoost, scikit-learn, pandas, Python ecosystem
- Healthcare professionals who contributed domain expertise
- Data scientists and researchers in elderly nutrition and machine learning
Regional Knowledge
- Local nutrition experts from Uganda's 4 major regions:
  - Central Uganda (Buganda)
  - Western Uganda (Ankole, Tooro, Kigezi, Bunyoro)
  - Eastern Uganda (Busoga, Bugisu, Teso)
  - Northern Uganda (Acholi, Lango, Karamoja, West Nile)
Special Thanks
- Community health workers providing ground-level insights
- Elderly care facilities participating in data validation
- Nutrition researchers focusing on African elderly populations
- Open data initiatives promoting nutrition research in Uganda
Last Updated: November 4, 2025
Model Version: v1.0_optimized
Status: Production Ready