XGBoost Model for Elderly Nutrition Planning in Uganda

Model Description

This XGBoost regression model predicts daily caloric needs for elderly individuals (aged 60+) in Uganda based on nutritional content, health conditions, regional factors, and demographic information. The model is designed to support nutrition planning, meal preparation, and healthcare decision-making for elderly care in Uganda.

Model Details

  • Model Type: XGBoost Regressor (Gradient Boosting)
  • Task: Tabular Regression
  • Version: v1.0_optimized
  • Training Date: November 3, 2025
  • Framework: XGBoost 2.0+
  • Language: Python
  • License: Apache 2.0

Developed By

  • Organization: Graph-Enhanced LLMs for Locally-Sourced Elderly Nutrition Planning Project
  • Project Focus: AI-driven nutrition planning for elderly populations in Uganda
  • Contact: [shakirannannyombi@gmail.com]

Intended Use

Primary Use Cases

  1. Nutrition Planning: Calculate appropriate caloric intake for elderly individuals based on their health profile
  2. Meal Planning: Support caregivers and healthcare providers in designing meal plans
  3. Healthcare Decision Support: Assist medical professionals in nutritional assessments
  4. Research: Enable studies on nutrition needs for elderly populations in Uganda
  5. Policy Development: Inform nutrition policies for elderly care facilities

Intended Users

  • Healthcare providers and nutritionists
  • Elderly care facilities and nursing homes
  • Family caregivers
  • Public health researchers
  • NGOs working in elderly nutrition

Out-of-Scope Use

  • ❌ Not for children or adults under 60 years
  • ❌ Not for acute medical conditions requiring immediate intervention
  • ❌ Not a replacement for professional medical advice
  • ❌ Not validated for use outside Uganda without regional calibration

Performance

Overall Metrics

Metric          | Training Set | Test Set
R² Score        | 0.9309       | 0.6710
MAE (kcal/day)  | 1.29         | 2.84
RMSE (kcal/day) | 1.65         | 3.60
Training Time   | 25.0 seconds | -
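
The three metrics in the table can be computed with scikit-learn. The following is a minimal sketch; the arrays are small made-up values for illustration, not the model's actual predictions, and the helper name regression_report is hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_report(y_true, y_pred):
    """Return R², MAE, and RMSE for a set of predictions."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "r2": float(r2_score(y_true, y_pred)),
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "rmse": float(np.sqrt(mse)),  # RMSE is the square root of the MSE
    }


# Illustrative values only (not taken from the dataset)
y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred = np.array([11.0, 12.0, 13.0, 18.0])

report = regression_report(y_true, y_pred)
print(report)
```

Calling the same helper on both the training and the test split gives the train-test gap directly.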

Model Ranking

Ranked among 5 candidate models, including this one (HistGradient Boosting, XGBoost, LightGBM, MLP, GNN):

  • Overall Rank: 🥇 #1 out of 5
  • R² Rank: 🥇 #1 (0.6710)
  • MAE Rank: 🥇 #1 (2.84 kcal/day)
  • RMSE Rank: 🥇 #1 (3.60 kcal/day)

Baseline Comparison

Metric   | Baseline Model | This Model     | Improvement
Test R²  | 0.6311         | 0.6710         | +6.3%
Test MAE | 2.998 kcal/day | 2.842 kcal/day | -5.2%
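
The improvement column is a relative change against the baseline. As a quick check of the arithmetic, using only the values from the table (the function name relative_change is hypothetical):

```python
def relative_change(baseline, new):
    """Relative change from baseline to new, as a percentage."""
    return (new - baseline) / baseline * 100.0


r2_change = relative_change(0.6311, 0.6710)
mae_change = relative_change(2.998, 2.842)

print(f"Test R² change:  {r2_change:+.1f}%")   # +6.3%
print(f"Test MAE change: {mae_change:+.1f}%")  # -5.2%
```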

Performance Characteristics

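  • Reasonable generalization: a test R² of 0.67 means the model explains about two-thirds of the variance in held-out data
  • Low absolute error: test MAE of 2.84 kcal/day (clinical acceptability has not yet been formally validated; clinical validation studies are listed under Planned Improvements)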
  • Moderate overfitting: Train-test R² gap of 0.26 (manageable with regularization)
  • Consistent predictions: RMSE close to MAE suggests few outliers

Training Data

Dataset Overview

  • Dataset Name: Uganda Elderly Nutrition Dataset (Enriched)
  • Total Samples: 1,000
  • Training Samples: 700 (70%)
  • Test Samples: 300 (30%)
  • Split Method: Random stratified split (seed=42)
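
A 70/30 split with a fixed seed can be reproduced with scikit-learn. The sketch below uses synthetic data in place of the real dataset. The model card does not say which variable the split was stratified on, so stratifying on target quartiles here is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for the 1,000-row dataset (values are not real data)
X = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["f1", "f2", "f3"])
y = pd.Series(rng.normal(loc=2000, scale=200, size=1000), name="target")

# Assumption: stratify on target quartiles so both splits cover the full range
strata = pd.qcut(y, q=4, labels=False)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=strata
)

print(len(X_train), len(X_test))  # 700 300
```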

Features (18 total)

Nutritional Content (12 features)

  • Energy_kcal_per_serving - Energy content per serving
  • Protein_g_per_serving - Protein content (grams)
  • Fat_g_per_serving - Fat content (grams)
  • Carbohydrates_g_per_serving - Carbohydrate content (grams)
  • Fiber_g_per_serving - Dietary fiber (grams)
  • Calcium_mg_per_serving - Calcium content (milligrams)
  • Iron_mg_per_serving - Iron content (milligrams)
  • Zinc_mg_per_serving - Zinc content (milligrams)
  • VitaminA_µg_per_serving - Vitamin A content (micrograms)
  • VitaminC_mg_per_serving - Vitamin C content (milligrams)
  • Potassium_mg_per_serving - Potassium content (milligrams)
  • Magnesium_mg_per_serving - Magnesium content (milligrams)

Categorical Features (4 features)

  • region_encoded - Geographic region in Uganda (4 regions)
  • condition_encoded - Health condition (8 conditions)
  • age_group_encoded - Age group (3 groups: 60-70, 70-80, 80+)
  • season_encoded - Seasonal availability

Other Features (2 features)

  • portion_size_g - Portion size in grams
  • estimated_cost_ugx - Estimated cost in Ugandan Shillings

Geographic Coverage

4 Regions of Uganda:

  1. Central Uganda (Buganda)
  2. Western Uganda (Ankole, Tooro, Kigezi, Bunyoro)
  3. Eastern Uganda (Busoga, Bugisu, Teso)
  4. Northern Uganda (Acholi, Lango, Karamoja, West Nile)

Health Conditions Covered

8 Common Elderly Conditions:

  1. Hypertension
  2. Undernutrition
  3. Anemia
  4. Frailty
  5. Digestive issues
  6. Arthritis
  7. Osteoporosis
  8. Diabetes

Age Groups

  • 60-70 years: Early elderly
  • 70-80 years: Mid elderly
  • 80+ years: Advanced elderly

Target Variable

  • Name: Daily Caloric Needs
  • Unit: kcal/day
  • Range: Typically 1,400 - 2,500 kcal/day
  • Distribution: Approximately normal

Training Details

Hyperparameters (Optimized)

{
    'n_estimators': 200,
    'max_depth': 4,
    'learning_rate': 0.05,
    'min_child_weight': 5,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'gamma': 0,
    'reg_alpha': 0,
    'reg_lambda': 1.5
}

Training Configuration

  • Objective: Regression (minimize squared error)
  • Evaluation Metric: R² Score, MAE, RMSE
  • Validation Strategy: 70-30 train-test split
  • Early Stopping: Not used (200 trees)
  • Feature Scaling: StandardScaler applied to numeric features
  • Encoding: Label encoding for categorical features
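
The scaling and encoding steps can be sketched as follows. The frame is a tiny made-up example, and the raw column name "region" is hypothetical, since the feature list only names the encoded column.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Tiny illustrative frame; values are made up
raw = pd.DataFrame({
    "Energy_kcal_per_serving": [350.0, 420.0, 280.0],
    "portion_size_g": [250.0, 300.0, 200.0],
    "region": ["Central", "Northern", "Central"],  # hypothetical raw column
})
numeric_cols = ["Energy_kcal_per_serving", "portion_size_g"]

# Label-encode the categorical column; LabelEncoder sorts classes alphabetically
region_encoder = LabelEncoder()
raw["region_encoded"] = region_encoder.fit_transform(raw["region"])

# Standardize numeric columns to zero mean and unit variance
scaler = StandardScaler()
scaled = raw.copy()
scaled[numeric_cols] = scaler.fit_transform(raw[numeric_cols])

print(list(region_encoder.classes_))   # ['Central', 'Northern']
print(raw["region_encoded"].tolist())  # [0, 1, 0]
```

The integer assigned to each category depends on the fitted encoder, so at inference time the saved encoders and scaler should be reused rather than refit.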

Training Environment

  • Hardware: CPU-based training
  • Training Time: 25 seconds
  • Memory Usage: <1 GB
  • Reproducibility: Random seed = 42

How to Use

Installation

pip install xgboost==2.0.0 pandas numpy scikit-learn

Loading the Model

import pickle
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler

# Load model files
with open('xgboost_nutrition_model_20251103.pkl', 'rb') as f:
    model = pickle.load(f)

with open('xgboost_scaler_20251103.pkl', 'rb') as f:
    scaler = pickle.load(f)

with open('xgboost_label_encoders_20251103.pkl', 'rb') as f:
    label_encoders = pickle.load(f)

with open('xgboost_feature_names_20251103.pkl', 'rb') as f:
    feature_names = pickle.load(f)

Making Predictions

# Example input data
input_data = {
    'Energy_kcal_per_serving': 350,
    'Protein_g_per_serving': 15,
    'Fat_g_per_serving': 10,
    'Carbohydrates_g_per_serving': 45,
    'Fiber_g_per_serving': 5,
    'Calcium_mg_per_serving': 200,
    'Iron_mg_per_serving': 3,
    'Zinc_mg_per_serving': 2,
    'VitaminA_µg_per_serving': 500,
    'VitaminC_mg_per_serving': 20,
    'Potassium_mg_per_serving': 400,
    'Magnesium_mg_per_serving': 50,
    'region_encoded': 0,  # Central Uganda
    'condition_encoded': 0,  # Hypertension
    'age_group_encoded': 1,  # 70-80
    'season_encoded': 0,
    'portion_size_g': 250,
    'estimated_cost_ugx': 5000
}

# Convert to DataFrame
df = pd.DataFrame([input_data])

# Ensure correct feature order
df = df[feature_names]

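# Scale features. Training applied StandardScaler to the numeric features
# (see Training Configuration), so inputs should be scaled the same way.
# First check which columns the scaler was fit on (for example via
# scaler.feature_names_in_, if it was fit on a DataFrame), then uncomment:
# df_scaled = scaler.transform(df)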

# Make prediction
predicted_calories = model.predict(df)
print(f"Predicted daily caloric needs: {predicted_calories[0]:.2f} kcal/day")
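
Because missing or misnamed features degrade predictions, it can help to validate the input before calling the model. The helper below is a hypothetical sketch, not part of the released artifacts. It checks an input dict against the expected feature list and returns the columns in training order.

```python
import pandas as pd


def build_feature_frame(input_data, feature_names):
    """Return a one-row DataFrame with columns in training order.

    Raises ValueError if an expected feature is missing or an
    unexpected key is present.
    """
    missing = [name for name in feature_names if name not in input_data]
    extra = [name for name in input_data if name not in feature_names]
    if missing or extra:
        raise ValueError(
            f"missing features: {missing}; unexpected features: {extra}"
        )
    return pd.DataFrame([input_data])[list(feature_names)]


# Shortened, illustrative feature list (the real list has 18 entries)
demo_features = ["Energy_kcal_per_serving", "portion_size_g", "region_encoded"]
demo_input = {
    "region_encoded": 0,
    "Energy_kcal_per_serving": 350,
    "portion_size_g": 250,
}

frame = build_feature_frame(demo_input, demo_features)
print(list(frame.columns))
```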

Using with the API

import requests

url = "http://your-api-endpoint/predict"
data = {
    "data": {
        "Energy_kcal_per_serving": 350,
        "Protein_g_per_serving": 15,
        # ... other features
    }
}

response = requests.post(url, json=data)
result = response.json()
print(f"Predicted calories: {result['prediction']['caloric_needs']:.2f} kcal/day")
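
For anything beyond a quick test, the request usually needs a timeout and an HTTP status check. The sketch below wraps the same call. The function names and the 10-second timeout are assumptions, the URL is still the placeholder from the example, and the response layout is the one that example reads.

```python
import requests

API_URL = "http://your-api-endpoint/predict"  # placeholder, replace before use


def build_payload(features):
    """Wrap a feature dict in the {"data": {...}} envelope used by the request."""
    return {"data": dict(features)}


def predict_via_api(features, url=API_URL, timeout=10.0):
    """POST features to the endpoint and return the predicted kcal/day.

    Raises requests.HTTPError on a non-2xx response and KeyError if the
    response lacks the prediction -> caloric_needs structure.
    """
    response = requests.post(url, json=build_payload(features), timeout=timeout)
    response.raise_for_status()
    return float(response.json()["prediction"]["caloric_needs"])


payload = build_payload({"Energy_kcal_per_serving": 350, "Protein_g_per_serving": 15})
print(payload)
```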

Limitations and Biases

Known Limitations

  1. Sample Size:

    • Only 1,000 samples in total (700 used for training), which may not capture all population variability
    • Recommend caution when making predictions for rare scenarios
  2. Geographic Scope:

    • Trained specifically on Ugandan population data
    • May not generalize well to other African countries or regions
  3. Moderate Overfitting:

    • Train-test R² gap of 0.26 indicates some overfitting
    • Predictions should be validated against clinical guidelines
  4. Feature Dependencies:

    • Requires accurate nutritional content data
    • Missing or incorrect features will degrade performance
  5. Temporal Validity:

    • Trained on 2025 data
    • May need retraining as dietary patterns evolve

Potential Biases

  1. Regional Representation:

    • May have unequal representation across regions
    • Ensure validation across all 4 regions
  2. Health Condition Bias:

    • Some conditions may be over/under-represented
    • Validate for less common conditions
  3. Socioeconomic Factors:

    • Cost estimates may not reflect all economic situations
    • Consider local affordability in deployment

Uncertainty Quantification

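  • Prediction Uncertainty: typical absolute error of 2.84 kcal/day (test MAE)
  • Confidence Intervals: approximate 95% prediction interval of ±7.1 kcal/day (1.96 × test RMSE of 3.60, assuming roughly normal, unbiased errors); 2 × MAE (±5.7 kcal/day) understates this range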
  • Recommended Buffer: Add 10% safety margin for meal planning
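
An error band can also be derived from the test RMSE or taken directly from held-out residuals. The sketch below shows both. The residuals are synthetic stand-ins for (actual minus predicted) values on the test set, and the normal-error assumption behind the 1.96 multiplier may not hold for the real model.

```python
import numpy as np


def normal_interval_halfwidth(rmse, z=1.96):
    """Half-width of an approximate 95% interval, assuming normal, unbiased errors."""
    return z * rmse


def empirical_interval(residuals, coverage=0.95):
    """Lower and upper residual quantiles covering the requested share of errors."""
    tail = (1.0 - coverage) / 2.0
    lower, upper = np.quantile(residuals, [tail, 1.0 - tail])
    return float(lower), float(upper)


halfwidth = normal_interval_halfwidth(3.60)  # reported test RMSE
print(f"±{halfwidth:.1f} kcal/day")  # ±7.1 kcal/day

# Synthetic residuals (not real model errors)
rng = np.random.default_rng(42)
residuals = rng.normal(loc=0.0, scale=3.60, size=300)
low, high = empirical_interval(residuals)
print(f"empirical 95% range: [{low:.1f}, {high:.1f}] kcal/day")
```

The empirical version makes no distributional assumption, which can matter with only 300 test samples.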

Ethical Considerations

Fairness and Equity

  • Model covers all major regions of Uganda
  • Includes diverse health conditions
  • Considers affordability factors
  • ⚠️ Ensure equal access to technology for model deployment

Privacy

  • Model trained on aggregated data (no personal identifiers)
  • Predictions do not require storage of sensitive health information
  • ⚠️ Implement proper data handling in deployment

Safety

  • ⚠️ Critical: Model outputs should be reviewed by qualified healthcare professionals
  • ⚠️ Not suitable for emergency nutritional interventions
  • ⚠️ Should complement, not replace, clinical judgment

Transparency

  • Open methodology and evaluation metrics
  • Feature importance available for interpretation
  • Model architecture and hyperparameters disclosed

Model Interpretability

Feature Importance (Top 10)

Based on XGBoost's built-in feature importance:

  1. Energy_kcal_per_serving - Highest importance
  2. Protein_g_per_serving - High importance
  3. Carbohydrates_g_per_serving - High importance
  4. age_group_encoded - Moderate importance
  5. condition_encoded - Moderate importance
  6. portion_size_g - Moderate importance
  7. Calcium_mg_per_serving - Moderate importance
  8. Fat_g_per_serving - Low-moderate importance
  9. region_encoded - Low-moderate importance
  10. Fiber_g_per_serving - Low importance

Full feature importance analysis available in model artifacts

Explainability

  • SHAP Values: Can be computed for individual predictions
  • Partial Dependence Plots: Available for key features
  • Decision Rules: XGBoost trees can be exported for inspection

Comparison with Other Models

Model                 | Test R² | Test MAE (kcal/day) | Training Time | Rank
XGBoost (This Model)  | 0.6710  | 2.84                | 25.0s         | 🥇 #1
LightGBM              | 0.6649  | 2.88                | 0.93s         | 🥈 #2
HistGradient Boosting | 0.5116  | 3.42                | 0.14s         | 🥉 #3
GNN v2                | 0.5100  | 3.42                | 5.2s          | #4
MLP                   | -0.3035 | 5.66                | 4.5s          | #5

Recommendation: Use XGBoost for best accuracy; consider LightGBM when training time matters (0.93s vs. 25.0s) at nearly the same test R².


Updates and Maintenance

Version History

  • v1.0_optimized (2025-11-03): Initial release
    • Built from a 1,000-sample dataset (700 training, 300 test)
    • Hyperparameter optimization completed
    • Test R² = 0.6710

Planned Improvements

  1. Data Collection:

    • Expand dataset to 5,000+ samples
    • Include more seasonal variations
    • Add rural vs. urban distinctions
  2. Feature Engineering:

    • Add BMI calculations
    • Include activity level metrics
    • Incorporate cultural food preferences
  3. Model Enhancements:

    • Ensemble with LightGBM for improved accuracy
    • Implement SHAP-based explainability
    • Add prediction uncertainty intervals
  4. Validation:

    • Clinical validation studies
    • Cross-regional performance assessment
    • Temporal validation (seasonal changes)

Retraining Schedule

  • Recommended: Every 6-12 months
  • Triggers: New data availability, significant dietary changes, performance degradation
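
The performance-degradation trigger can be checked automatically once new labeled observations are available. The sketch below compares the MAE on a new batch with the reported test MAE. The 25% tolerance and the function name are assumptions, not project policy.

```python
import numpy as np

BASELINE_MAE = 2.84        # test MAE reported for v1.0_optimized
DEGRADATION_FACTOR = 1.25  # assumption: flag when MAE is 25% above baseline


def needs_retraining(y_true, y_pred, baseline_mae=BASELINE_MAE,
                     factor=DEGRADATION_FACTOR):
    """Return (flag, current_mae) for a batch of newly labeled observations."""
    errors = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    current_mae = float(errors.mean())
    return current_mae > baseline_mae * factor, current_mae


# Illustrative batch with made-up values
flag, mae = needs_retraining([100.0, 105.0, 98.0], [104.0, 100.0, 102.0])
print(flag, round(mae, 2))  # True 4.33
```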

Citation

If you use this model in your research or application, please cite:

@misc{uganda_elderly_nutrition_xgboost_2025,
  title={XGBoost Model for Elderly Nutrition Planning in Uganda},
  author={[Your Name/Organization]},
  year={2025},
  month={November},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/[your-username]/xgboost-elderly-nutrition-uganda}
}

Additional Resources

Model Artifacts

  • xgboost_nutrition_model_20251103.pkl - Main XGBoost model
  • xgboost_scaler_20251103.pkl - Feature scaler (StandardScaler)
  • xgboost_label_encoders_20251103.pkl - Categorical encoders
  • xgboost_feature_names_20251103.pkl - Feature name list
  • xgboost_model_metadata_20251103.json - Complete metadata

Support

For questions, issues, or contributions:

  • Issues: [https://github.com/Shakiran-Nannyombi/Graph-Enhanced-LLMs-for-Locally-Sourced-Elderly-Nutrition-Planning-in-Uganda.git]
  • Email: [devkiran256@gmail.com]


License

This model is released under the Apache License 2.0.

  • Commercial use allowed
  • Modification allowed
  • Distribution allowed
  • Patent use allowed
  • ⚠️ Must include license and copyright notice
  • ⚠️ Must state significant changes

Disclaimer: This model is provided "as is" without warranty. Users are responsible for validating the model's suitability for their specific use case and ensuring compliance with local healthcare regulations.


Acknowledgments

Data Sources and References

This model was developed using knowledge and data extracted from the following authoritative sources:

  1. Handbook_Eldernutr_FINAL.pdf

    • Comprehensive handbook on elderly nutrition
    • Primary reference for nutritional requirements and guidelines
  2. WHO ICOPE Guidelines (icope.pdf)

    • World Health Organization Integrated Care for Older People (ICOPE)
    • Framework for elderly healthcare and nutrition assessment
  3. Nutritional_Requirements_of_Older_People.pdf

    • Detailed nutritional requirements for elderly populations
    • Evidence-based dietary recommendations
  4. TipSheet_21_HealthyEatingForOlderAdults.pdf

    • Practical tips for healthy eating in older adults
    • Community-oriented nutrition guidance
  5. MSD Manual Professional Edition

    • "Drug Categories of Concern in Older Adults - Geriatrics"
    • Clinical reference for medication-nutrition interactions
  6. MSD Manual Consumer Version

    • "Aging and Medications - Older People's Health Issues"
    • Patient-friendly information on aging and health
  7. Uganda Nutrition Data (download.pdf)

    • Uganda-specific nutritional data and food composition
    • Local context and dietary patterns
  8. Street Food Nutritional Analysis

    • "Average energy and nutrient contents of typical street food dishes in Uganda (Kampala)"
    • Local food nutritional profiles for urban Uganda

Institutional Support

  • Uganda Ministry of Health - Nutrition guidelines and policy frameworks
  • World Health Organization (WHO) - ICOPE framework and elderly care guidelines
  • MSD Manuals - Clinical and consumer health information

Technical Contributions

  • Open-source community: XGBoost, scikit-learn, pandas, Python ecosystem
  • Healthcare professionals who contributed domain expertise
  • Data scientists and researchers in elderly nutrition and machine learning

Regional Knowledge

  • Local nutrition experts from Uganda's 4 major regions:
    • Central Uganda (Buganda)
    • Western Uganda (Ankole, Tooro, Kigezi, Bunyoro)
    • Eastern Uganda (Busoga, Bugisu, Teso)
    • Northern Uganda (Acholi, Lango, Karamoja, West Nile)

Special Thanks

  • Community health workers providing ground-level insights
  • Elderly care facilities participating in data validation
  • Nutrition researchers focusing on African elderly populations
  • Open data initiatives promoting nutrition research in Uganda

  • Last Updated: November 4, 2025
  • Model Version: v1.0_optimized
  • Status: Production Ready
