Developing Video-Based Language Models for Time Study and Value Analysis in Manufacturing

Awarded Resources (in node hours): 32,000

System Partition: MareNostrum5 ACC

Allocation Period: 3 June 2024 - 2 June 2025

AI Technology: Deep Learning; Natural Language Processing; Vision (image recognition, image generation, text recognition OCR, etc.); Generative Language Modeling.

The project focuses on creating Video-Based Language Models (VBLMs) that can precisely identify, measure, and categorize operator movements within manufacturing environments.

These models combine Large Language Models (LLMs) with computer vision to interpret video data accurately. Developing and testing VBLMs requires high-performance computing hardware, which is a key reason for this application: access to such hardware will enable timely and effective model development.

Khenda aims to transform factory operations by developing an AI application that automates time studies and value analysis from videos captured on mobile devices. By streamlining the analysis of manufacturing operations, Khenda seeks to set a new standard in industrial efficiency and to provide a powerful tool for improving productivity.
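
To make the terms concrete, the sketch below shows how a time study and value analysis might be summarized once a model has segmented a video into labeled operator activities. It is an illustration only, not Khenda's implementation: the `Segment` type, the `summarize` function, the activity labels, the category names, and the timings are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One labeled operator activity (hypothetical model output)."""
    start_s: float   # segment start time in seconds
    end_s: float     # segment end time in seconds
    activity: str    # assumed label, e.g. "assemble"
    category: str    # assumed category: "value_added" or "non_value_added"

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def summarize(segments):
    """Return total seconds per category and the value-added share of the cycle."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.category] += seg.duration_s
    cycle_s = sum(totals.values())
    va_ratio = totals.get("value_added", 0.0) / cycle_s if cycle_s else 0.0
    return dict(totals), va_ratio


# Hypothetical segments for a single work cycle.
segments = [
    Segment(0.0, 4.0, "pick part", "non_value_added"),
    Segment(4.0, 16.0, "assemble", "value_added"),
    Segment(16.0, 20.0, "walk to rack", "non_value_added"),
]

totals, va_ratio = summarize(segments)
print(totals)
print(f"value-added share: {va_ratio:.0%}")
```

In this made-up cycle, 12 of the 20 seconds are spent on value-added work, so the value-added share is 60%. A real pipeline would obtain the segments from the video model rather than from a hand-written list.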