Using machine learning for image captioning

  • Aleksije Micic, Bravo Systems d.o.o.
  • Zoran Djuric, Faculty of Electrical Engineering, University of Banja Luka
Keywords: Recommendation systems, Image captioning, Shop-to-shop retrieval, Consumer-to-shop retrieval, Large language models

Abstract

Modern e-commerce information systems increasingly integrate product recommendation engines to enhance the user experience. This paper analyzes the application of machine-learning-based image captioning techniques, with the goal of generating better recommendations for e-commerce platforms that sell clothing. Special attention is given to the limitations of traditional approaches that rely solely on visual similarity between products. Image captions generated by eight different models were evaluated, and large language models performed best. The proposed solution, which combines the traditional visual similarity approach with semantic similarity analysis of the generated descriptions, was then evaluated against the previously defined problems. The results show that the new approach is effective and justified. Finally, the paper outlines potential directions for future research.
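
As a rough illustration of the idea summarized above, the sketch below ranks catalog items by a weighted sum of visual similarity and caption similarity. The abstract does not state how the two signals are combined, so the weighted sum, the weight `alpha`, the field names `image_vec` and `caption_vec`, and the toy vectors are all assumptions made for this example; in practice the vectors would come from an image encoder and from a text embedding of the generated caption.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_score(query, candidate, alpha=0.5):
    """Weighted sum of visual and caption similarity (assumed fusion rule).

    alpha = 1.0 reduces to the purely visual baseline;
    alpha = 0.0 uses only the generated captions.
    """
    visual = cosine_similarity(query["image_vec"], candidate["image_vec"])
    semantic = cosine_similarity(query["caption_vec"], candidate["caption_vec"])
    return alpha * visual + (1.0 - alpha) * semantic


def recommend(query, catalog, alpha=0.5, top_k=3):
    """Return the ids of the top_k catalog items by combined score."""
    ranked = sorted(
        catalog,
        key=lambda item: combined_score(query, item, alpha),
        reverse=True,
    )
    return [item["id"] for item in ranked[:top_k]]


# Hypothetical toy data: item A looks identical to the query but is described
# differently; item B looks slightly different but has a matching description.
query = {"id": "Q", "image_vec": [1.0, 0.0], "caption_vec": [1.0, 0.0]}
catalog = [
    {"id": "A", "image_vec": [1.0, 0.0], "caption_vec": [0.0, 1.0]},
    {"id": "B", "image_vec": [0.8, 0.6], "caption_vec": [1.0, 0.0]},
    {"id": "C", "image_vec": [0.0, 1.0], "caption_vec": [0.0, 1.0]},
]

visual_only = recommend(query, catalog, alpha=1.0, top_k=2)
combined = recommend(query, catalog, alpha=0.5, top_k=2)
```

With these toy vectors, the purely visual ranking puts item A first, while the combined ranking prefers item B because its caption matches the query. This is the kind of case in which caption semantics can correct a ranking based on appearance alone.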

Published
2025-12-30
Section
Information technologies