Abstract:
Facial Expression Recognition (FER) is a vital component in human-computer
interaction, enabling machines to interpret human emotions. Deep learning models,
particularly Convolutional Neural Networks (CNNs), have demonstrated high
accuracy in FER tasks. However, these models are often computationally intensive
and memory-demanding, limiting their deployment on low-end devices. This
study explores the application of model compression techniques—quantization and
network pruning—on deep CNN architectures including VGG16, ResNet50, and
DenseNet121 using the FER-2013 dataset. The goal is to reduce model size and
inference time while maintaining classification accuracy. Experimental results
indicate that 8-bit quantization of VGG16 achieved the best trade-off, reducing
model size more than fourfold with negligible loss in accuracy. Pruning showed
limited effectiveness on the transfer-learning models, yielding only minimal size
reduction, but proved useful on simpler architectures. The findings offer practical guidance for
deploying efficient FER systems on edge devices.