Visual Art Generation for Music

Publisher

University of the Peloponnese (Πανεπιστήμιο Πελοποννήσου)

Abstract

This thesis explores the potential use of generative AI for creating visual art from music and introduces a tool named Deforum Music Visualizer. The tool automatically generates visual art from music and is built on Deforum Stable Diffusion, an open-source generative text-to-video diffusion framework. To incorporate both high- and low-level musical elements, it integrates extensive Music Information Retrieval (MIR) data into music-informed settings, along with conditional generation based on the song’s album cover. A survey of 45 participants (balanced female/male ratio, ages 19–59) was conducted to evaluate the tool’s effectiveness. Regardless of the participants’ music background, the fully automated process produced baseline results, scoring 3.0 ± 1.06 for Mean Enjoyment and 2.93 ± 1.20 for Mean ISA (incorporation of the song’s atmosphere) on a 1–5 Likert scale. User-curated prompts yielded a statistically significant improvement in both Mean Enjoyment (3.63 ± 1.03) and Mean ISA (3.74 ± 1.06). The project's GitHub repository is available at https://github.com/nickpadd/DeforumMusicVisualizer.

Description

Master's thesis (Μ.Δ.Ε.) 112