---
license: mit
pipeline_tag: object-detection
---

# Eye and Eyebrow Movement Recognition Model

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.9%2B-blue.svg)
![TensorFlow](https://img.shields.io/badge/tensorflow-2.8.0%2B-brightgreen.svg)

## 📖 Table of Contents

- [📚 Description](#-description)
- [🔍 Features](#-features)
- [🎯 Intended Use](#-intended-use)
- [🧠 Model Architecture](#-model-architecture)
- [📋 Training Data](#-training-data)
- [📈 Evaluation](#-evaluation)
- [💻 Usage](#-usage)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
  - [Loading the Model](#loading-the-model)
  - [Making Predictions](#making-predictions)
- [🔧 Limitations](#-limitations)
- [⚖️ Ethical Considerations](#-ethical-considerations)
- [📜 License](#-license)
- [🙏 Acknowledgements](#-acknowledgements)

## 📚 Description

The **Eye and Eyebrow Movement Recognition** model is a real-time system that detects and classifies subtle facial movements, focusing on the eyes and eyebrows. The model is currently trained to recognize three distinct movements:

- **Yes:** Characterized by the raising of the eyebrows.
- **No:** Indicated by the lowering of the eyebrows.
- **Normal:** A neutral facial expression without significant eye or eyebrow movement.

Leveraging a **CNN-LSTM** (Convolutional Neural Network - Long Short-Term Memory) architecture, the model captures both spatial features from individual frames and temporal dynamics across sequences of frames, which makes it robust in real-world scenarios.

## 🔍 Features

- **Real-Time Detection:** Continuously processes a live webcam feed to detect eye and eyebrow movements without noticeable lag.
- **GPU Acceleration:** Optimized for GPU usage via TensorFlow-Metal on macOS for efficient computation.
- **Extensible Design:** Currently supports the "Yes," "No," and "Normal" movements, and is designed to be extended to additional facial gestures.
- **User-Friendly Interface:** Overlays predictions directly onto the live video feed for immediate visual feedback.
- **High Accuracy:** Distinguishes reliably between the supported movements (85% accuracy on the held-out test set; see [Evaluation](#-evaluation)).

## 🎯 Intended Use

This model is suitable for a variety of applications, including but not limited to:

- **Human-Computer Interaction (HCI):** Enhancing user interfaces with gesture-based controls.
- **Assistive Technologies:** Providing non-verbal communication tools for individuals with speech impairments.
- **Behavioral Analysis:** Monitoring and analyzing facial expressions for psychological or market research.
- **Gaming:** Creating more immersive and responsive gaming experiences through facial gesture controls.

**Note:** The model is intended for research and educational purposes. Ensure compliance with privacy and ethical guidelines when deploying it in real-world applications.

## 🧠 Model Architecture

The model employs a **CNN-LSTM** architecture to capture both spatial and temporal features:

1. **TimeDistributed CNN Layers:**
   - **Conv2D:** Extracts spatial features from each frame independently.
   - **MaxPooling2D:** Reduces spatial dimensions.
   - **BatchNormalization:** Stabilizes and accelerates training.
2. **Flatten Layer:** Flattens each frame's CNN output to prepare it for LSTM processing.
3. **LSTM Layer:** Captures temporal dependencies across the sequence of frames.
4. **Dense Layers:** Fully connected layers that perform the final classification based on the combined spatial-temporal features.
5. **Output Layer:** A **softmax** activation provides a probability distribution over the three classes ("Yes," "No," "Normal").
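The exact layer sizes of the released checkpoint are not listed on this card, so the snippet below is only a minimal sketch of the stack described above. Filter counts, kernel sizes, and LSTM/Dense unit counts are illustrative assumptions; the `(30, 64, 256, 1)` input shape mirrors the sequence length and frame size used in the prediction example later in this card.

```python
# Minimal, untrained sketch of a CNN-LSTM stack like the one described above.
# Layer sizes are assumptions for illustration only.
import tensorflow as tf
from tensorflow.keras import layers, models

max_seq_length = 30          # frames per sequence
frame_shape = (64, 256, 1)   # grayscale eye/eyebrow ROI per frame
num_classes = 3              # "Yes", "No", "Normal"

model = models.Sequential([
    layers.Input(shape=(max_seq_length, *frame_shape)),
    # Spatial feature extraction, applied to every frame independently
    layers.TimeDistributed(layers.Conv2D(32, (3, 3), activation='relu', padding='same')),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),
    layers.TimeDistributed(layers.BatchNormalization()),
    layers.TimeDistributed(layers.Conv2D(64, (3, 3), activation='relu', padding='same')),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),
    layers.TimeDistributed(layers.BatchNormalization()),
    # Flatten each frame's feature map into a vector
    layers.TimeDistributed(layers.Flatten()),
    # Temporal modelling across the frame sequence
    layers.LSTM(64),
    # Classification head
    layers.Dense(64, activation='relu'),
    layers.Dense(num_classes, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```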
## 📋 Training Data

The model was trained on a curated dataset of short video clips (1-2 seconds) capturing the three target movements:

- **Yes:** 50 samples
- **No:** 50 samples
- **Normal:** 50 samples

Each video was recorded with a standard webcam under varied lighting conditions and backgrounds to ensure robustness. The videos were manually labeled and organized into per-class directories for preprocessing.

## 📈 Evaluation

The model was evaluated on a separate test set comprising 60 samples per class. The evaluation metrics are as follows:

- **Accuracy:** 85%
- **Precision:** 84%
- **Recall:** 86%
- **F1-Score:** 85%

## 💻 Usage

### Prerequisites

- **Hardware:** Mac with Apple Silicon (M1, M1 Pro, M1 Max, M2, etc.) for Metal GPU support.
- **Operating System:** macOS 12.3 (Monterey) or newer.
- **Python:** Version 3.9 or higher.

### Installation

1. **Clone the Repository**

   ```bash
   git clone https://huggingface.co/shayan5422/eye-eyebrow-movement-recognition
   cd eye-eyebrow-movement-recognition
   ```

2. **Install Homebrew (if not already installed)**

   Homebrew is a package manager for macOS that simplifies the installation of software.

   ```bash
   /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
   ```

3. **Install Micromamba**

   Micromamba is a lightweight package manager compatible with Conda environments.

   ```bash
   brew install micromamba
   ```

4. **Create and Activate a Virtual Environment**

   We'll use Micromamba to create an isolated environment for the project.

   ```bash
   # Create a new environment named 'eye_movement' with Python 3.9
   micromamba create -n eye_movement python=3.9

   # Activate the environment
   micromamba activate eye_movement
   ```

5. **Install Required Libraries**

   We'll install TensorFlow with Metal support (`tensorflow-macos` and `tensorflow-metal`) along with the other necessary libraries.

   ```bash
   # Install TensorFlow for macOS
   pip install tensorflow-macos

   # Install TensorFlow Metal plugin for GPU acceleration
   pip install tensorflow-metal

   # Install other dependencies
   pip install opencv-python dlib imutils tqdm scikit-learn matplotlib seaborn h5py
   ```

   > **Note:** Installing `dlib` can be challenging on macOS. If you encounter issues, consider installing it via Conda or refer to [dlib's official installation instructions](http://dlib.net/compile.html).

6. **Download Dlib's Pre-trained Shape Predictor**

   This file is required for facial landmark detection.

   ```bash
   # Navigate to your project directory
   cd /path/to/your/project/eye-eyebrow-movement-recognition/

   # Download the shape predictor
   curl -LO http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2

   # Decompress the file
   bunzip2 shape_predictor_68_face_landmarks.dat.bz2
   ```

   Ensure that the `shape_predictor_68_face_landmarks.dat` file is in the same directory as your scripts.
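Before moving on, you can optionally confirm that TensorFlow can see the Metal GPU. This quick sanity check is not part of the original setup steps, just a way to verify that `tensorflow-metal` is active:

```python
# Optional sanity check: confirm TensorFlow sees the Metal GPU.
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
print("GPUs visible to TensorFlow:", tf.config.list_physical_devices('GPU'))
# On Apple Silicon with tensorflow-metal installed, this should list one GPU device.
```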
### Loading the Model

```python
import tensorflow as tf

# Load the trained model
model = tf.keras.models.load_model('final_model_sequences.keras')
```

### Making Predictions

```python
import cv2
import numpy as np
import dlib
from imutils import face_utils
from collections import deque
import queue
import threading

# Initialize dlib's face detector and landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

# Initialize queues for threading
input_queue = queue.Queue()
output_queue = queue.Queue()

# Define sequence length and class-index-to-label mapping
max_seq_length = 30
index_to_text = {0: 'Normal', 1: 'No', 2: 'Yes'}  # adjust to match your training label order

def prediction_worker(model, input_q, output_q):
    while True:
        sequence = input_q.get()
        if sequence is None:
            break
        # Preprocess and predict
        # [Add your prediction logic here]
        # Example:
        prediction = model.predict(sequence)
        class_idx = np.argmax(prediction)
        confidence = np.max(prediction)
        output_q.put((class_idx, confidence))

# Start prediction thread
thread = threading.Thread(target=prediction_worker, args=(model, input_queue, output_queue))
thread.start()

# Start video capture
cap = cv2.VideoCapture(0)
frame_buffer = deque(maxlen=max_seq_length)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Preprocess frame
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)

    if len(rects) > 0:
        rect = rects[0]
        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)
        # Extract ROIs and preprocess
        # [Add your ROI extraction and preprocessing here]
        # Example:
        preprocessed_frame = preprocess_frame(frame, detector, predictor)
        frame_buffer.append(preprocessed_frame)
    else:
        frame_buffer.append(np.zeros((64, 256, 1), dtype='float32'))

    # If the buffer is full, send the sequence for prediction
    if len(frame_buffer) == max_seq_length:
        sequence = np.array(frame_buffer)
        input_queue.put(np.expand_dims(sequence, axis=0))
        frame_buffer.clear()

    # Check for prediction results
    try:
        while True:
            class_idx, confidence = output_queue.get_nowait()
            movement = index_to_text.get(class_idx, "Unknown")
            text = f"{movement} ({confidence*100:.2f}%)"
            cv2.putText(frame, text, (30, 30), cv2.FONT_HERSHEY_SIMPLEX,
                        0.8, (0, 255, 0), 2, cv2.LINE_AA)
    except queue.Empty:
        pass

    # Display the frame
    cv2.imshow('Real-time Movement Prediction', frame)

    # Exit on 'q' key
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Cleanup
cap.release()
cv2.destroyAllWindows()
input_queue.put(None)
thread.join()
```

**Note:** Replace the placeholder comments with your actual preprocessing and prediction logic as implemented in your scripts. The `index_to_text` mapping above is illustrative; use the class order produced by your training pipeline.
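The `preprocess_frame` helper is left to your own implementation. As a starting point, a hypothetical version that crops the eye/eyebrow region from dlib's 68 landmarks and resizes it to the `(64, 256, 1)` shape used above might look like the sketch below; the landmark indices, padding, and normalization are assumptions, not the author's original code.

```python
import cv2
import numpy as np
from imutils import face_utils

def preprocess_frame(frame, detector, predictor, out_size=(256, 64)):
    """Hypothetical ROI extraction: crop the eye/eyebrow band and return
    a (64, 256, 1) float32 array normalized to [0, 1]."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)
    if len(rects) == 0:
        # No face found: return a blank frame, matching the main loop's fallback
        return np.zeros((out_size[1], out_size[0], 1), dtype='float32')

    shape = face_utils.shape_to_np(predictor(gray, rects[0]))
    # dlib 68-point indices: 17-26 are the eyebrows, 36-47 are the eyes
    points = np.vstack([shape[17:27], shape[36:48]]).astype(np.int32)
    x, y, w, h = cv2.boundingRect(points)

    # Crop the band with a small margin, then resize to (width=256, height=64)
    roi = gray[max(y - 10, 0):y + h + 10, max(x - 10, 0):x + w + 10]
    roi = cv2.resize(roi, out_size)
    roi = roi.astype('float32') / 255.0
    return np.expand_dims(roi, axis=-1)  # shape (64, 256, 1)
```

Whatever implementation you use, it must reproduce the same preprocessing applied during training, otherwise predictions will be unreliable.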
## 🔧 Limitations

- **Movement Scope:** The model currently recognizes only the "Yes," "No," and "Normal" movements. Extending it to additional movements would require further data collection and training.
- **Environmental Constraints:** The model performs best under good lighting and with a clear, frontal view of the face. Variations in lighting, occlusions, or extreme angles may affect accuracy.
- **Single Face Assumption:** The system is designed to handle a single face in the frame. Multiple faces may lead to unpredictable behavior.

## ⚖️ Ethical Considerations

- **Privacy:** Ensure that users are aware of and consent to the use of their facial data. Handle all captured data responsibly and in compliance with relevant privacy laws and regulations.
- **Bias:** The model's performance may vary across demographics. Training on a diverse dataset is essential to minimize biases related to age, gender, ethnicity, and other factors.
- **Misuse:** Like all facial recognition technologies, this model has potential for misuse. Implement safeguards to prevent unauthorized or unethical applications.

## 📜 License

This project is licensed under the [MIT License](LICENSE).

## 🙏 Acknowledgements

- [TensorFlow](https://www.tensorflow.org/)
- [OpenCV](https://opencv.org/)
- [dlib](http://dlib.net/)
- [imutils](https://github.com/jrosebr1/imutils)
- [Hugging Face](https://huggingface.co/)
- [Metal Performance Shaders (MPS)](https://developer.apple.com/documentation/metalperformanceshaders)
- [Micromamba](https://mamba.readthedocs.io/en/latest/micromamba.html)

---

**Feel free to reach out or contribute to enhance the capabilities of this model!**