Real-time action localization of manual assembly operations using deep learning and augmented inference state machines

Author(s): Vignesh Selvaraj, Md Al-Amin, Xuyong Yu, Wenjin Tao, and Sangkee Min


Publication: Journal of Manufacturing Systems

Acknowledgment: This material is based on work supported by the National Research Foundation of Korea, South Korea (Brain Pool Program 2022H1D3A2A01093491). The authors of this work would like to acknowledge Foxconn iAI, a division of Foxconn, for their support in providing the data required to perform this study.

Citation: Selvaraj, V., Al-Amin, M., Yu, X., Tao, W., & Min, S. (2024). Real-time action localization of manual assembly operations using deep learning and augmented inference state machines. Journal of Manufacturing Systems, 72, 504-518.

Real-time monitoring of assembly operations enables manufacturing process optimization, which is crucial to manufacturers. It improves productivity by automatically identifying bottlenecks and enhances product quality by detecting errors and providing feedback to rectify them in real time. However, developing a robust and reliable assembly monitoring system is not trivial because fine-grained assembly steps vary in length and assembly workers vary anthropometrically. Conventionally, wireless body-worn sensors have been used to tackle this challenge, but they cause operator discomfort and raise safety concerns. In this work, we propose a novel technique, called State Machine Integrated Recognition and Localization (SMIRL), to automatically recognize and localize assembly steps in real time. SMIRL comprises an inference machine and a state machine, which are responsible for detecting and localizing the actions, respectively. SMIRL can measure the duration of individual assembly steps and of the entire cycle. It can also detect mistakes that occur during assembly and raise an alert to notify operators in real time. A mistake here is either a break in the predefined assembly sequence (Sequence Break) or the omission of an assembly step (Missed Step). The effectiveness of SMIRL was evaluated on two datasets, one from an industrial assembly workstation and the other from a laboratory. The results show that SMIRL can detect and localize actions with an Intersection over Union (IoU) score of 87.53% and identify Sequence Breaks and Missed Steps with F1-scores of 86.64% and 87.45%, respectively. Through this study, we aim to contribute to real-time monitoring of human-centric assembly operations to facilitate smart manufacturing.
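
To make the two kinds of mistake and the IoU metric concrete, the following is a minimal Python sketch and not the SMIRL implementation. It assumes that the per-frame predictions have already been collapsed into an ordered list of step labels for one cycle, that a Sequence Break is a step observed after a later step of the predefined sequence, and that a Missed Step is a predefined step never observed in the cycle. The function names, the step labels, and these working definitions are illustrative assumptions; the abstract does not specify them.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_time, end_time) in seconds


def temporal_iou(pred: Segment, truth: Segment) -> float:
    """Intersection over Union of two time intervals."""
    inter = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - inter
    return inter / union if union > 0 else 0.0


def check_cycle(expected: List[str], observed: List[str]) -> Tuple[List[str], List[str]]:
    """Return (sequence_breaks, missed_steps) for one assembly cycle.

    Assumed definitions (not taken from the paper):
      - Sequence Break: a step observed after a later step in `expected`.
      - Missed Step: a step in `expected` that never appears in `observed`.
    """
    order = {step: i for i, step in enumerate(expected)}
    seen = set()
    breaks: List[str] = []
    highest = -1  # index of the latest expected step observed so far

    for step in observed:
        if step not in order:
            continue  # ignore labels outside the sequence, e.g. background
        idx = order[step]
        if idx < highest:
            breaks.append(step)  # performed after a later step
        highest = max(highest, idx)
        seen.add(step)

    missed = [step for step in expected if step not in seen]
    return breaks, missed


# Hypothetical four-step assembly sequence used only for illustration.
STEPS = ["pick", "place", "screw", "inspect"]

if __name__ == "__main__":
    print(check_cycle(STEPS, ["pick", "screw", "place", "inspect"]))
    print(check_cycle(STEPS, ["pick", "place", "inspect"]))
    print(round(temporal_iou((0.0, 10.0), (5.0, 15.0)), 4))
```

Under these assumptions, swapping two steps is reported as a Sequence Break without a Missed Step, while omitting a step is reported only as a Missed Step. A real-time system would raise the alert inside the loop rather than at the end of the cycle.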