Part 1: How can machine learning support people's existing creative practices? Expand people's creative capabilities?
I am very interested in this question, but also quite uncertain about it. I think the key is figuring out what ‘creative capabilities’ consist of.
It is no secret that the advent of generative ML has been met with both controversy and optimism. The optimists believe it is just another tool, and an extremely powerful one, at our disposal, which will unlock more creativity than ever. The pessimists believe it is a shortcut that will automate and displace crucial aspects of the creative process.
I think the optimists and pessimists are both right about certain things, and that the difference between ML as an empowering tool and a disempowering shortcut depends on preserving certain internal aspects of the creative process.
I have been speaking with faculty members at ITP about this question, trying to figure out what those internal aspects of creative practice and creative capability are. A number of psychological qualities frequently come up: creativity, curiosity, critical thinking, love of learning, persistence, and perspective. There is also a suite of behavioral skills that many highlight: open exploration and risk taking, reflective process and practice, taking pride in work, generating ideas and iterating on them, a sense of learning and mastery, building enthusiasm for a medium, interrogating the tools, going from an idea to a finished project, and communicating one's process and ideas.
Importantly, these qualities all contribute to the value of the creative process and are not easily automated. ML models cannot take risks for us; they cannot reflect on our process for us. We are responsible for bringing curiosity, persistence, love of learning, and the rest to the process, regardless of whether our chosen tools are pencils, paintbrushes, or generative models.
If we can be clear about which creative capabilities we care about, then ML is indeed just another tool. What we need to study and better understand is: when are these capabilities empowered by ML, and when are they disempowered?
Part 2: Dream up and design the inputs and outputs of a real interactive system for interaction or audio/visual performance.
My idea is for a system that converts emotional states into color palettes. The input will be facial expression data from a webcam and the output will be a color palette that corresponds to the detected emotional state. The idea is that this tool can help people convert their mood into aesthetic choices.
This system will require two components. The first is an emotion recognition model. I like the model at hume.ai because it detects nuanced emotional states like surprise, concentration, boredom, and tranquility. This model gets us from webcam data to emotions.

The second component maps emotions to colors. To do this, I would provide Teachable Machine with a database of color-focused paintings, such as works by Rothko and Albers. I would rate each painting by emotion, or perhaps use a coding system that associates certain colors with certain emotions, and then train a model to recognize which 'emotions' are present in which color palettes. With these two models together, we can match the emotions detected in a facial expression to paintings with corresponding emotional content.

Once we match an emotion to a painting, we need to extract the color data from the painting. To do this, we convert the image of the painting into pixels with RGB values, cluster those pixels, and average each of the 3-5 dominant color groups. The result is a color palette that matches our emotional state. A rough sketch of this matching-and-extraction step follows.
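Here is a minimal sketch of that second half of the pipeline in Python. Everything in it is an assumption for illustration: the emotion labels, the hand-rated painting vectors, and the filenames are hypothetical placeholders, and the dominant-color step uses a simple k-means clustering over RGB pixels rather than any particular library's palette extractor.

```python
# Sketch: match a detected emotion vector to the closest-rated painting,
# then extract that painting's dominant colors as a palette.
import numpy as np
from PIL import Image

# Hypothetical order of the dimensions in each emotion vector.
EMOTIONS = ["surprise", "concentration", "boredom", "tranquility"]

# Hypothetical hand-coded ratings: painting file -> emotion vector.
PAINTING_RATINGS = {
    "rothko_orange_and_red.jpg":  np.array([0.1, 0.2, 0.0, 0.7]),
    "albers_homage_blue.jpg":     np.array([0.0, 0.6, 0.1, 0.3]),
}

def nearest_painting(emotion_vector: np.ndarray) -> str:
    """Pick the painting whose rated emotions are closest to the detected ones."""
    return min(
        PAINTING_RATINGS,
        key=lambda name: np.linalg.norm(PAINTING_RATINGS[name] - emotion_vector),
    )

def dominant_colors(image_path: str, k: int = 4, iters: int = 10) -> np.ndarray:
    """Extract k dominant colors via simple k-means over the painting's RGB pixels."""
    pixels = np.asarray(
        Image.open(image_path).convert("RGB").resize((64, 64)), dtype=float
    ).reshape(-1, 3)
    rng = np.random.default_rng(0)
    centers = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        # Assign each pixel to its nearest center, then recompute center means.
        labels = np.argmin(((pixels[:, None] - centers) ** 2).sum(axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return centers.astype(int)  # k rows of [R, G, B]

# Example: a frame scored mostly "tranquility" -> matched painting -> palette.
frame_emotions = np.array([0.05, 0.15, 0.10, 0.70])
palette = dominant_colors(nearest_painting(frame_emotions))
```

In the real system, `frame_emotions` would come from the emotion recognition model each frame rather than being hard-coded, and the ratings table would cover the whole painting database.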
Part 3: Sketch: Emotion Visualizer
For this sketch I created a simple Teachable Machine model detecting whether I was happy and calm or upset and frustrated. I used this model's outputs as inputs for an emotion visualizer. I like the visualizer I created, but for some reason I can't get a smooth output from the emotion detection model. In the sketch, the user can toggle between webcam mode and slider mode. In slider mode, the user controls the visual with the slider. In webcam mode, the visual is influenced by their facial expression and how happy or sad they are. One possible fix for the jumpy output is sketched below.
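The jitter likely comes from the model's per-frame confidences bouncing around. One common remedy is to smooth them with an exponential moving average before driving the visual. A minimal sketch in Python, where the list of raw scores is a hypothetical stand-in for whatever the Teachable Machine model emits each frame:

```python
# Sketch: exponential moving average over noisy per-frame confidences.
class ConfidenceSmoother:
    def __init__(self, alpha: float = 0.15):
        self.alpha = alpha  # lower alpha = smoother output, slower response
        self.value = None   # running smoothed confidence

    def update(self, raw_confidence: float) -> float:
        if self.value is None:
            self.value = raw_confidence
        else:
            # Blend the new reading with the running value.
            self.value = self.alpha * raw_confidence + (1 - self.alpha) * self.value
        return self.value

# Each frame: feed in the raw "happy/calm" confidence and drive the
# visualizer with the smoothed value instead of the raw one.
smoother = ConfidenceSmoother(alpha=0.15)
for raw in [0.9, 0.2, 0.95, 0.88, 0.1, 0.92]:  # noisy example scores
    smooth = smoother.update(raw)
```

The same logic ports directly to the sketch's own language; the only design choice is `alpha`, which trades responsiveness for stability.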