Detection and Classification of Emergency Vehicles from Audio and Video Inputs using Deep Learning Techniques

Authors

  • Ashraf Osman Ibrahim UTP
  • Ephzibah E.P
  • Mareeswari V
  • Nishanth Samson I
  • Nasikethan R
  • Rozaida Ghazali Rozaida

Keywords:

Emergency Vehicle Detection, Image Processing, Audio Classification, YOLO, Object Detection, LSTM, Explainable AI

Abstract

With the rapid advancement of autonomous vehicles, ensuring road safety has become one of the biggest concerns of the automotive industry. One critical aspect of safety is the accurate and timely detection of emergency vehicles, such as ambulances and fire trucks, followed by a prompt lane change to give them smooth passage. This paper proposes an efficient and straightforward method to locate and label emergency vehicles using recent deep learning models, namely YOLOv8 and long short-term memory (LSTM) networks. The focus is on refining the models to improve the accuracy and efficiency of emergency vehicle detection. Data augmentation is applied to the dataset to improve the model's performance in day, night, and low-visibility conditions. The system identifies and classifies emergency vehicles from audio and video data using several signal and image processing methods, and is complemented by explainable artificial intelligence (XAI) mechanisms that provide detailed information. The system can be used in both self-driving and human-driven vehicles and can be integrated into advanced driver assistance systems (ADAS). The results show a significant improvement in emergency vehicle detection, with an accuracy of 96.6%, which can improve the safety of interactions between autonomous vehicles and emergency vehicles.
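
As an illustration of the kind of data augmentation the abstract mentions for day-night and low-visibility conditions, the following is a minimal sketch in Python with NumPy. It is not the authors' pipeline: the function names (`simulate_night`, `simulate_fog`, `augment`), the gamma, gain, and fog density ranges, and the choice of transforms are all assumptions made for illustration.

```python
import numpy as np


def simulate_night(image, gamma=2.2, gain=0.6):
    """Darken a float image in [0, 1] with a gamma curve and a global gain.

    Hypothetical helper: gamma > 1 and gain < 1 push pixel values toward black.
    """
    img = np.clip(image, 0.0, 1.0)
    return np.clip(gain * img ** gamma, 0.0, 1.0)


def simulate_fog(image, density=0.5, fog_color=0.9):
    """Blend the image toward a uniform light-gray haze (assumed fog model)."""
    img = np.clip(image, 0.0, 1.0)
    return (1.0 - density) * img + density * fog_color


def augment(image, rng):
    """Randomly apply night, fog, or no change to one training image."""
    choice = rng.integers(0, 3)  # 0: night, 1: fog, 2: unchanged
    if choice == 0:
        return simulate_night(
            image, gamma=rng.uniform(1.5, 2.5), gain=rng.uniform(0.4, 0.8)
        )
    if choice == 1:
        return simulate_fog(image, density=rng.uniform(0.2, 0.6))
    return image.copy()


# Usage: a synthetic mid-gray 4x4 RGB frame stands in for a real video frame.
rng = np.random.default_rng(0)
frame = np.full((4, 4, 3), 0.5)
night = simulate_night(frame)
fog = simulate_fog(frame, density=0.5)
sample = augment(frame, rng)
```

Augmented copies like these would typically be added to the training set alongside the original frames, so that the detector sees the same vehicle under several lighting and visibility conditions.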

Published

28-12-2025

Issue

Vol. 6 No. 3 (2025)

Section

Articles

How to Cite

Ibrahim, A. O., Ephzibah E.P, Mareeswari V, Nishanth Samson I, Nasikethan R, & Rozaida, R. G. (2025). Detection and Classification of Emergency Vehicles from Audio and Video Inputs using Deep Learning Techniques. Journal of Soft Computing and Data Mining, 6(3), 20-32. https://penerbit.uthm.edu.my/ojs/index.php/jscdm/article/view/21149