Spiritual Hallucinations in Architecture With VQGAN+CLIP
Abstract
Spiritual Hallucinations in Architecture began with the realization that spiritual spaces shape how people feel through their use of lighting, color, materials, acoustics and spatial compression. Some of the most iconic architectural spaces have left us feeling overwhelmed and anxious. This led us to the intersection between human emotions and machine learning. Our aim is to explore how an Artificial Intelligence interprets human emotions when they are juxtaposed with spiritual architecture.
Using Generative Adversarial Networks (GANs), we can generate synthetic images of spaces with a neural network trained on spatial and visual characteristics. The first step is generating synthetic images of architectural spaces using VQGAN+CLIP. The images are then categorized and organized into a matrix of human emotions. Synthetic images of the architectural spaces are then generated again, this time using the categorical description.
State of the Art
GAN Loci is a notable project by Kyle Steinfeld, which inspired the concept for Spiritual Hallucinations in Architecture. The images below are synthetic images of cities, each generated to capture the genius loci (the distinctive spirit of place) of that city.
Synthetic images of Rotterdam and San Francisco based on the genius loci of each city.
Source: GAN Loci, Kyle Steinfeld (2019)
Methodology
Using VQGAN+CLIP, we can begin to explore the juxtaposition of emotions and architecture. VQGAN+CLIP stands for Vector Quantized Generative Adversarial Network and Contrastive Language–Image Pre-training. GANs are systems in which two neural networks work against each other: a generator, which synthesizes images, and a discriminator, which scores how plausible the results are. The system feeds back on itself to incrementally improve its score. CLIP is a companion third neural network that scores how well an image matches a natural language description; that score is fed back into the VQGAN to steer the generated image toward the text prompt.
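The feedback loop described above can be illustrated with a toy sketch. This is not the VQGAN+CLIP implementation: the generator and the image encoder are replaced by a hypothetical stand-in (`toy_generate`, a fixed linear map), the prompt is a made-up vector (`text_embedding`), and gradients are estimated by finite differences using only the standard library. What it keeps is the structure of the loop: a latent vector is nudged step by step so that the embedding of the generated output scores a higher cosine similarity with the embedding of the prompt.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical stand-in for "generator + image encoder": a fixed linear map.
# In the real system this would be VQGAN decoding followed by CLIP encoding.
WEIGHTS = [
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.5],
    [0.5, 0.0, 1.0],
]

def toy_generate(latent):
    """Map a latent vector to a toy 'image embedding'."""
    return [sum(w * z for w, z in zip(row, latent)) for row in WEIGHTS]

def score(latent, text_embedding):
    """How well the generated output matches the prompt (higher is better)."""
    return cosine(toy_generate(latent), text_embedding)

def optimise(latent, text_embedding, steps=200, lr=0.1, eps=1e-4):
    """Gradient ascent on the score, with finite-difference gradients."""
    latent = list(latent)
    for _ in range(steps):
        base = score(latent, text_embedding)
        grad = []
        for i in range(len(latent)):
            bumped = list(latent)
            bumped[i] += eps
            grad.append((score(bumped, text_embedding) - base) / eps)
        # Move the latent a small step in the direction that raises the score.
        latent = [z + lr * g for z, g in zip(latent, grad)]
    return latent

# Made-up prompt embedding and starting latent, for illustration only.
text_embedding = [0.2, 0.9, 0.4]
start_latent = [1.0, 0.0, 0.0]

initial_score = score(start_latent, text_embedding)
final_latent = optimise(start_latent, text_embedding)
final_score = score(final_latent, text_embedding)
print(f"score before: {initial_score:.3f}, after: {final_score:.3f}")
```

In the real pipeline the latent is the grid of VQGAN codes and the gradients come from backpropagation through both networks, but the principle of repeatedly adjusting the latent to raise the match score is the same.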
Source: Generating AI “Art” with VQGAN+CLIP Overview
Once we had identified the spiritual architecture, we listed the possible emotions that humans may experience there. The goal is to create a matrix of images and human emotions. The emotions are euphoric, delighted, calm, anxious, sad and creepy. Synthetic images of the spaces are generated using these descriptions.
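The matrix of spaces and emotions can be sketched as a small table of text prompts, one per cell. The space and emotion names are taken from this post; the prompt wording (`"{emotion} {space}"`) and the function name `build_prompt_matrix` are assumptions for illustration, since the exact prompts used in the project are not recorded here.

```python
# Spaces and emotions as listed in this post.
SPACES = [
    "luis barragan church",
    "the vatican",
    "hallgrimskirkja",
    "ronchamp",
    "cathedral of brasilia",
    "sagrada familia",
]
EMOTIONS = ["euphoric", "delighted", "calm", "anxious", "sad", "creepy"]

def build_prompt_matrix(spaces, emotions, template="{emotion} {space}"):
    """Return {space: {emotion: prompt}} with one prompt per matrix cell.

    The template is a hypothetical prompt format, not the project's own.
    """
    return {
        space: {
            emotion: template.format(emotion=emotion, space=space)
            for emotion in emotions
        }
        for space in spaces
    }

prompt_matrix = build_prompt_matrix(SPACES, EMOTIONS)

# Print one row of the matrix as a quick check.
for emotion, prompt in prompt_matrix["ronchamp"].items():
    print(f"{emotion:>10}: {prompt}")
```

Each prompt would then be passed to VQGAN+CLIP as the text description for one image, giving a 6 by 6 grid of results.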
Results
The results of the experiment are documented below.
luis barragan church
the vatican
hallgrimskirkja
ronchamp
cathedral of brasilia
sagrada familia
Observations
Looking at the images as a whole, we can start to see clear similarities. Euphoric and delighted tended to generate colorful patterns. Calm and anxious were somewhat neutral and did not produce notably distinguishable features. For negative emotions like sad and creepy, the model was only able to relate the emotion to a person, so creepy or sad people were added to the images instead of the feeling being attributed to the place itself.
We also noticed that the positive emotions resulted in brighter-looking images with more windows, while the more negative ones produced darker colors and fewer windows. Overall, we thought that the model was successful in understanding sentiment, that is, whether an emotion is negative or positive, but could not capture the nuances between individual emotions.
Conclusion
Our initial objective was to identify how an AI represents the human experience of an architectural space. Throughout this project, we had fun exploring the possibilities of VQGAN+CLIP. Although we were a bit underwhelmed by the lack of depth in these results, they were still insightful in depicting how human emotions are translated into architectural images, and we look forward to seeing how this can be developed further.
Credits
Spiritual Hallucinations in Architecture is a project of IAAC, the Institute for Advanced Architecture of Catalonia, developed in the Master in Advanced Computation for Architecture & Design in 2021/22 by
Students: amanda gioia . salvador calgua . sophie moore . zoé lewis
Faculty: mark balzar . zeynep aksöz