I’m a writer.
It’s not easy to define myself as a writer, because I don’t write much anymore. I used to write non-fiction in Chinese, my first language, for publications and blogs. Then I went to school and started to write in English: fiction, plays, and screenplays. I’d write in Chinese about the people, places, and events in LA, where I lived, and in English I’d write imagined monologues for my Chinese-speaking mother and fictional family dramas that take place in Chinese households. My writings exist in two spaces, in two languages, each safely hidden in the narrative of the other half of my life.
The only places where these brutally honest thoughts met without the anxiety of cultural barriers were notebooks, diaries, and sticky notes on my wall. Lots of those thoughts were thrown away, but a few survived in digital or analog formats. I often wonder about how I perform my identity in texts that are more self-absorbed and less audience-minded, so when I was introduced to Runway ML, I decided to train a model on my personal, often painful writings and see what insight AI has to offer.
My initial idea was “I-statements” – sentences I wrote in which I was the protagonist. I read through and compiled texts from:
- Bits and pieces in iPhone Notes, Notion, and journals
- Text messages
- Emails to close friends, an ex, and a lost lover
- “Non-fiction” and half-fiction writings
- ITP blog
Eventually, I manually collected about 80 samples and put them into one txt file.
Unfortunately, Runway told me the file was only one tenth of its minimum size, and that I needed at least 100k (!!) of text to train.
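For anyone repeating this step, here is a minimal sketch of how the collecting and size check could be scripted. It is not the process used here, since the samples were gathered by hand. The function names are hypothetical, and reading "100k" as 100,000 bytes is an assumption; the actual unit Runway uses may differ.

```python
from pathlib import Path

# Assumption: "100k" is read as 100,000 bytes. The real unit may differ.
MIN_BYTES = 100_000


def combine_samples(samples, out_path):
    """Write non-empty samples to one UTF-8 text file, separated by
    blank lines, and return the file size in bytes."""
    cleaned = [s.strip() for s in samples if s.strip()]
    text = "\n\n".join(cleaned) + "\n"
    # write_bytes avoids platform-specific newline translation,
    # so the reported size is the same everywhere.
    Path(out_path).write_bytes(text.encode("utf-8"))
    return Path(out_path).stat().st_size


def shortfall(size_bytes, minimum=MIN_BYTES):
    """Return how many more bytes are needed to reach the minimum
    (0 if the file is already large enough)."""
    return max(0, minimum - size_bytes)
```

Calling `shortfall(combine_samples(my_samples, "corpus.txt"))` would then show roughly how much more writing needs to be dug up before training can start.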
I had to expand my library and include more complete pieces rather than single sentences. I have neglected my own writing so much that I rarely kept anything from my past. When I moved out of LA I threw away my old notebooks, many of which already had pages torn out. This project really reminded me to make a better effort at documenting and archiving.
Even with some longer essays included, my files were still not big enough. I imagined GPT-2 would not respond well to non-English characters, but I had to include Chinese writings (and some bilingual ones) in order to start training. In the end, my input was roughly 4:6 English to Chinese. I set it to 3000 steps; now we wait.
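A ratio like this can be estimated in several ways, and the post does not say which measure was used. As one rough sketch, the snippet below counts Chinese characters against English letters. The function names are hypothetical, and the Chinese range is limited to the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) as a simplifying assumption. Characters and letters are not equivalent units of meaning, so the result is only a loose indicator.

```python
def language_counts(text):
    """Count CJK ideographs and ASCII letters in a string.

    Assumption: only the basic CJK Unified Ideographs block
    (U+4E00..U+9FFF) is counted as Chinese; punctuation, digits,
    and whitespace are ignored for both languages.
    """
    chinese = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    english = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return {"chinese": chinese, "english": english}


def chinese_share(text):
    """Return the fraction of counted characters that are Chinese
    (0.0 if nothing was counted)."""
    counts = language_counts(text)
    total = counts["chinese"] + counts["english"]
    return counts["chinese"] / total if total else 0.0
```

A share near 0.6 would correspond to the 4:6 split, though counting by bytes or by words would give different numbers.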
Not at all to my surprise, the Chinese output is a hot mess. Here’s a snippet of the result:
有多了在有多久了。所有几乎在这些有人的过去，但有知有理想了。所有几乎在我们升大块钱，整家事的经常视长。所有家的时候，我时常见的时候，但有家不再时见的结果好，但有忍不擅颇的一天和姑娘。我的家，我撑觉得好自己的事情事情。好求医了能同样的温求医了。
I can’t translate this because it’s gibberish made up of mostly recognizable words; they just make no sense when put together. I’ve found a Chinese poetry model, also built on GPT-2, that produces mostly acceptable poems. I wonder whether its creators made some sort of linguistic effort while training it, or whether they just had much more input and more training steps. Perhaps it’s simply because poetry allows for a more random assemblage of words to create meaning. Either way, it doesn’t surprise me how little the model understood of anything other than English. Dominance of language is a big part of cultural hegemony.
In contrast, the English outputs are, by and large, normal sentences in paragraphs. A lot of them use my sentence structures and mix and match common words I’ve written about. As a result, this Viola writing bot produces content about dancing, Los Angeles, anxiety, and friendship, and it is, more often than not, very, very sad.
I’m not the best when it comes to art, it really only scratches my soul. I can’t really write anything, and because I don’t have an art identity, I can’t stop writing about how I am going to write anything anymore. I never thought of myself as a graphic novel, or a physical medium. I worked with bodies, sweat, sound, touch and desire to modify the physical world, and then I moved to a more physical, physical space where everything was physical. I wrote about how I was going to college, and how I ended up at the hospital when my “last” moment came. I started meditating, and I thought it was going to end anytime I wanted to be. Then I started writing poetry, and things started falling out of my way.
This could potentially evolve into a bigger project when I stop throwing away my pages. For now, more output results are available to download here: