In today's fast-paced world, road safety has become a crucial concern, especially with the increasing incidence of driver fatigue-related accidents. Long hours on the road, night driving, and other stressful conditions contribute to driver drowsiness, which impairs reaction times and decision-making, ultimately putting lives at risk. To address this challenge, real-time driver drowsiness detection systems have gained significant attention in recent years. These systems utilize computer vision techniques, specifically OpenCV, to monitor driver alertness levels and detect signs of fatigue before it becomes dangerous.
OpenCV, an open-source computer vision library, provides a powerful framework for building real-time image processing applications. In a drowsiness detection system, it enables the processing of live video feeds to recognize facial landmarks and behaviors associated with drowsiness, such as eye closure and head tilting. By analyzing these indicators in real time, the system can alert the driver through visual or auditory warnings, thereby potentially preventing accidents.
This article delves into the design and implementation of a real-time driver drowsiness detection system using OpenCV. We will explore the technical aspects, from facial landmark detection to the integration of alert mechanisms, providing an in-depth look at how computer vision is transforming road safety.
You can watch the video-based tutorial with a step-by-step explanation below.
Install Modules
To set up a real-time driver drowsiness detection system with OpenCV, dlib, and face_recognition, you need to install these libraries as prerequisites.
pip install cmake
pip install dlib-19.24.99-cp312-cp312-win_amd64.whl
pip install face_recognition
Install cmake: Required for building certain libraries, including dlib.
Install dlib: Since dlib can be challenging to install, it’s best to use a precompiled .whl file specific to your Python version and operating system. The command above uses dlib-19.24.99 for Python 3.12 on Windows; the filename should match the .whl file you downloaded. (Download Link)
Install face_recognition: This library depends on dlib for facial recognition features, so install it after dlib is successfully set up.
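Before moving on, you can run a quick smoke test from the command line to confirm both libraries import cleanly (this one-liner is just a sanity check; it prints a message only if the imports succeed):
python -c "import dlib, face_recognition; print('installation OK')"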
Import Modules
import os
import cv2
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import face_recognition
from scipy.spatial import distance
import warnings
warnings.filterwarnings('ignore')
os: Helps in managing file paths and directories, especially useful for loading model files, saving outputs, or organizing datasets.
cv2 (OpenCV): Provides tools for image and video processing. You’ll likely use it for accessing camera input, detecting facial features, and performing real-time processing of frames.
numpy: Essential for numerical operations on arrays and matrices, which are often used when handling image data.
PIL (Image): Useful for handling and manipulating image files, though with OpenCV and face_recognition, you may need PIL only occasionally.
matplotlib.pyplot: Ideal for visualizing images, results, or data, which can be helpful for debugging or displaying processed frames.
face_recognition: This library, which uses dlib under the hood, is useful for detecting and recognizing faces, as well as identifying facial landmarks. These landmarks will be key for tracking eye movements or other signs of drowsiness.
scipy.spatial.distance: Provides functions to calculate distances, which is critical in drowsiness detection (e.g., computing the Euclidean distance between eye landmarks to check for eye closure).
warnings: Helps suppress any irrelevant warnings that might clutter the output, allowing you to focus on debugging the critical parts of the program.
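As a quick illustration of the distance function we will rely on later, here is a minimal example (the coordinates are arbitrary, chosen only for demonstration):
from scipy.spatial import distance

# Euclidean distance between two 2D points, e.g. two eye landmarks
p1 = (10, 20)
p2 = (13, 24)
print(distance.euclidean(p1, p2))  # 5.0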
Highlight Facial Points in the Image
First we will load and display an image using PIL and matplotlib.
image_path = 'test1.png'
image = Image.open(image_path)
plt.axis('off')
plt.imshow(image)
plt.show()
image_path = 'test1.png': Sets the path to the image you want to display.
image = Image.open(image_path): Loads the image into memory.
plt.axis('off'): Hides the axes to make the display cleaner.
plt.imshow(image): Displays the image.
plt.show(): Renders the plot.
Next we will define a function to highlight facial points.
def highlight_facial_points(image_path):
    # load the image
    image_bgr = cv2.imread(image_path)
    # convert from bgr to rgb
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    # detect faces in the image
    face_locations = face_recognition.face_locations(image_rgb, model='cnn')
    for face_location in face_locations:
        # get facial landmarks
        landmarks = face_recognition.face_landmarks(image_rgb, [face_location])[0]
        # iterate over the facial landmarks and draw them on the image
        for landmark_type, landmark_points in landmarks.items():
            for (x, y) in landmark_points:
                cv2.circle(image_rgb, (x, y), 3, (0, 255, 0), -1)
    # plot the image
    plt.figure(figsize=(6, 6))
    plt.imshow(image_rgb)
    plt.axis('off')
    plt.show()
Load and Prepare the Image:
image_bgr = cv2.imread(image_path): Reads the image in BGR format.
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB): Converts it to RGB for compatibility with face_recognition.
Face Detection:
face_locations = face_recognition.face_locations(image_rgb, model='cnn'): Detects faces in the image using the cnn model, which is more accurate but slower (especially without a GPU) than the default hog model.
Facial Landmarks Detection:
face_recognition.face_landmarks(image_rgb, [face_location])[0]: Extracts the facial landmarks for each detected face.
Draw Circles on Landmarks:
For each landmark (like eyes, nose, etc.), it places a green circle with cv2.circle(image_rgb, (x, y), 3, (0, 255, 0), -1).
Display the Result:
plt.imshow(image_rgb) displays the processed image, and plt.axis('off') hides the axes for a cleaner view.
Next we will call the function defined above.
highlight_facial_points(image_path)
The function will process the image as described, detect the facial landmarks, and display the result with landmarks highlighted.
Next we will define functions to calculate the aspect ratios for the eyes and mouth based on landmarks, which can be useful for detecting signs of drowsiness.
# calculate eye aspect ratio
def eye_aspect_ratio(eye):
    A = distance.euclidean(eye[1], eye[5])
    B = distance.euclidean(eye[2], eye[4])
    C = distance.euclidean(eye[0], eye[3])
    ear = (A + B) / (2.0 * C)
    return ear

# calculate mouth aspect ratio
def mouth_aspect_ratio(mouth):
    A = distance.euclidean(mouth[2], mouth[10])
    B = distance.euclidean(mouth[4], mouth[8])
    C = distance.euclidean(mouth[0], mouth[6])
    mar = (A + B) / (2.0 * C)
    return mar
eye_aspect_ratio:
This function calculates the Eye Aspect Ratio (EAR), which indicates eye openness. When eyes close, EAR typically decreases, making it a helpful indicator for detecting drowsiness.
Parameters:
eye: A list of coordinates representing the eye’s six landmarks. These points are usually detected in a clockwise or counterclockwise order around the eye.
Calculation:
A and B: Measure the vertical distances between landmark pairs.
C: Measures the horizontal distance.
EAR: The formula (A + B) / (2.0 * C) captures the ratio of vertical to horizontal distance, indicating whether the eye is open or closed.
mouth_aspect_ratio:
This function calculates the Mouth Aspect Ratio (MAR), which can signal yawning or prolonged mouth opening, another sign of drowsiness.
Parameters:
mouth: A list of coordinates representing the 12 points around the mouth.
Calculation:
A and B: Measure vertical distances between upper and lower lip landmarks.
C: Measures the horizontal distance.
MAR: The formula (A + B) / (2.0 * C) calculates the ratio of vertical to horizontal distances for the mouth.
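As a rough sanity check, here is what the EAR looks like for hypothetical landmark coordinates of an open and a nearly closed eye (the points below are made up purely for illustration, ordered as landmarks 0 to 5 from the left eye corner):
# hypothetical eye landmarks: wide open vs. nearly closed
open_eye = [(0, 3), (2, 4), (4, 4), (6, 3), (4, 2), (2, 2)]
closed_eye = [(0, 3), (2, 3.2), (4, 3.2), (6, 3), (4, 2.8), (2, 2.8)]

print(eye_aspect_ratio(open_eye))    # ~0.33, above a typical threshold of 0.25
print(eye_aspect_ratio(closed_eye))  # ~0.07, well below the threshold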
Next we will define the process_image function, which calculates the eye and mouth aspect ratios (EAR and MAR) to detect signs of drowsiness or yawning in a given image frame.
def process_image(frame):
    # define thresholds
    EYE_AR_THRESH = 0.25
    MOUTH_AR_THRESH = 0.6
    if frame is None:
        raise ValueError('Image is not found or unable to open')
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # find all face locations
    face_locations = face_recognition.face_locations(rgb_frame)
    # initiate flags
    eye_flag = mouth_flag = False
    for face_location in face_locations:
        # extract facial landmarks
        landmarks = face_recognition.face_landmarks(rgb_frame, [face_location])[0]
        # extract eye and mouth coordinates
        left_eye = np.array(landmarks['left_eye'])
        right_eye = np.array(landmarks['right_eye'])
        mouth = np.array(landmarks['bottom_lip'])
        # calculate ear and mar
        left_ear = eye_aspect_ratio(left_eye)
        right_ear = eye_aspect_ratio(right_eye)
        ear = (left_ear + right_ear) / 2.0
        mar = mouth_aspect_ratio(mouth)
        # check if eyes are closed
        if ear < EYE_AR_THRESH:
            eye_flag = True
        # check if yawning
        if mar > MOUTH_AR_THRESH:
            mouth_flag = True
    return eye_flag, mouth_flag
Define Thresholds:
EYE_AR_THRESH is the threshold below which eyes are considered closed.
MOUTH_AR_THRESH is the threshold above which the mouth is considered open wide (possible indication of yawning).
Frame Check:
If the frame is None, a ValueError is raised to ensure an actual image was provided.
Convert Image to RGB:
Converts the BGR frame to RGB, as face_recognition requires RGB input.
Face and Landmark Detection:
Detects face locations using face_recognition.face_locations.
For each detected face, retrieves facial landmarks using face_recognition.face_landmarks.
Extract Eye and Mouth Coordinates:
left_eye and right_eye coordinates are extracted for calculating EAR, and bottom_lip coordinates are extracted for MAR.
Calculate EAR and MAR:
Calls eye_aspect_ratio for both eyes and averages them to get a final EAR.
Calls mouth_aspect_ratio for the mouth.
Check Thresholds:
If ear < EYE_AR_THRESH, sets eye_flag to True, indicating that eyes may be closed.
If mar > MOUTH_AR_THRESH, sets mouth_flag to True, indicating potential yawning.
Return Results:
Returns eye_flag and mouth_flag as indicators of eye closure and yawning, respectively.
Next we will call the process_image function.
img = cv2.imread(image_path)
process_image(img)
(True, False)
Make sure image_path points to the image file you want to analyze, and that process_image is defined along with the eye_aspect_ratio and mouth_aspect_ratio helper functions. The output shows whether the system detected closed eyes or yawning; here, (True, False) means closed eyes were detected but no yawning.
Real Time Drowsiness Detection
Next we will process video frames to detect drowsiness in real time, keeping a running score of processed frames that show signs of drowsiness.
video_path = "test.mp4"
# video_cap = cv2.VideoCapture(0) # for getting frames from the webcam
video_cap = cv2.VideoCapture(video_path)
count = score = 0
while True:
success, image = video_cap.read()
if not success:
break
image = cv2.resize(image, (800, 500))
count += 1
# process every nth frame
n = 5
if count % n == 0:
eye_flag, mouth_flag = process_image(image)
# if any flag is true, increment the score
if eye_flag or mouth_flag:
score += 1
else:
score -= 1
if score < 0:
score = 0
# write the score values at bottom left of the image
font = cv2.FONT_HERSHEY_SIMPLEX
text_x = 10
text_y = image.shape[0] - 10
text = f"Score: {score}"
cv2.putText(image, text, (text_x, text_y), font, 1, (0, 255, 0), 2, cv2.LINE_AA)
if score >= 5:
text_x = image.shape[1] - 130
text_y = 40
text = "Drowsy"
cv2.putText(image, text, (text_x, text_y), font, 1, (0, 0, 255), 2, cv2.LINE_AA)
cv2.imshow('drowsiness detection', image)
# exit if any key is pressed
if cv2.waitKey(1) & 0xFF != 255:
break
video_cap.release()
cv2.destroyAllWindows()
Video Capture:
video_cap = cv2.VideoCapture(video_path) loads a video file for processing. You can use cv2.VideoCapture(0) to capture frames from a webcam instead.
Frame Resizing:
cv2.resize(image, (800, 500)) resizes the frame to a fixed size for consistency and potentially faster processing.
Frame Skipping:
count % n == 0 allows the code to process only every nth frame (set as n = 5). This reduces computation, which can be beneficial for real-time performance.
Drowsiness Detection:
Every nth frame, the process_image(image) function checks for eye and mouth flags.
If either eye_flag or mouth_flag is True, it increments the score. Otherwise, it decrements the score, ensuring it does not fall below zero.
Displaying Score and Status:
The score is displayed in the bottom left corner of each frame.
If the score reaches a threshold (set at 5 here), indicating consecutive frames with signs of drowsiness, it shows a "Drowsy" warning in red text on the top right of the frame.
Display and Exit:
cv2.imshow('drowsiness detection', image) shows the frame with annotations.
cv2.waitKey(1) & 0xFF != 255 checks for a key press on each frame; when no key is pressed, waitKey returns -1 (255 after masking), so the loop breaks as soon as any key is pressed.
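If you also want an audible warning alongside the on-screen text, a minimal sketch (assuming Windows, since the standard-library winsound module is Windows-only) is to beep inside the score >= 5 branch:
import winsound  # Windows-only standard library module

if score >= 5:
    # ... draw the "Drowsy" text as before ...
    winsound.Beep(1000, 300)  # 1000 Hz for 300 ms; values chosen arbitrarily
On other platforms, you could substitute a cross-platform audio library or simply print('\a') to trigger the terminal bell.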
Final Thoughts
Real-time driver drowsiness detection systems hold immense potential in improving road safety and preventing accidents caused by fatigue. With the integration of computer vision and machine learning technologies, such as OpenCV and face_recognition, these systems can analyze critical visual indicators like eye closure and yawning, providing timely alerts to drivers.
In this article, we explored the development of such a system using Python and OpenCV. The key components, including eye aspect ratio (EAR) and mouth aspect ratio (MAR) calculations, were discussed to track signs of drowsiness.
By processing frames from either a video feed or an image, the system can dynamically assess the driver's alertness and detect potential fatigue, triggering warning signals when necessary.
In conclusion, a real-time driver drowsiness detection system represents a significant step toward building safer roads. With continuous advancements in AI and computer vision, we can expect such systems to become increasingly sophisticated, helping drivers stay alert and reducing the risk of accidents caused by fatigue. As technology evolves, the integration of these solutions in vehicles could become a standard feature, contributing to the growing field of intelligent transportation systems.
Get the project notebook from here
Thanks for reading the article!!!
Check out more project videos from the YouTube channel Hackers Realm