This end-of-the-month issue is free but took a huge amount of time to compile. Please consider becoming a paid supporter if you value it. It isn’t just so you’ll get the full recs list and the news section mid-month, but because it’s a signal that I should keep doing it.

TITAA #60: Creative Gum and Rag

Creative RAG - Lost Poetry - Minecraft Gen - SAE Papers - Narrative Agents

TOC to another big issue (links on the web site):
AI & Games NextLevel 2024: Tiny Notes

Last weekend I sent myself to London for the one-day AI & Games Next Level event organized by Mike Cook at King’s College. I posted live on Bluesky during it (starting here). Here are some short remarks in light of this newsletter’s interest in creative AI and some of my own projects. Everyone who spoke was interesting; I’m eliding a lot for the sake of an already very long issue!

Mike Cook, whose articles I’ve shared before, discussed his research on game mechanic generation with his Pixie system. He noted that he laughed out loud at some of the ideas it came up with. This is almost a theorem to me now: if you made something that made something that made you laugh, you have made a Good Thing. Or, in any case, something that gave you a wonderful surprise, confounding expectations. This delightful alien thinking is a way we learn about our own thinking patterns and implicit rule systems. He noted an authorial problem of “credit” in this context.

Kate Compton, aka GalaxyKate, of Tracery fame and more, gave a thought-provoking talk about ways to think about being creative with AI and procedural generation tools. She discussed ontologies as a way of structuring systems, and domain-specific notations as ways to talk about artistic goals (for instance, for dance or music performances), as well as frameworks for writing. She is interested in making better design tools, but also in helping people understand what they do when they work creatively. As many UX people have been saying, a text input to a language (or image gen) model isn’t an ideal interface for most deep craft work. What might be better? What are the units in which we want to work?
Mike’s Pixie design goals are closely related to this question. Kate suggests a framework: that we think in terms of combining rule systems or structural components with AI gen output, along with a human special touch. These are her 3Gs, “girders, gum, and gargoyles.” She notes that LLMs are generally bad at generating fiction from a simple prompt (by most human measures of quality), but that given structure (girders) they do better. This is also the observation of many narrative AI research folks, who are using discourse structure components (plot, character, scene…) and levels of detail to weave together a better AI-generated story. And it goes even better when humans are in the loop to add direction or curation. Sadly, narrative research folks often aren’t UX designers or tool builders. Nevertheless, I highly recommend the links to the latest research in my Narrative & Creativity section below. Agentic systems are also becoming a Thing in this space, with multiple automated “writing experts” at work on different aspects of the fiction problem.

My own projects for the last few Nanogenmos (National Novel Generation Month, starting today 😃) have featured data sets and rules (often Tracery grammars) combined with gen AI “gum”: 4 years ago, my “Directions in Venice” used a GPT-2 model trained on historical guides combined with Tracery walking directions (and Flickr photos); and last year I used Google Maps and location reviews in a game-like wrapper, rewritten by GPT-4. (I wrote about it here.) My “gargoyle” component in the Google travel project was adding weird goals for each trip: ghost hunting, or industrial espionage.

I’m thinking of such systems as creative RAG: a bit like RAG (retrieval augmented generation), in which there is a background data set, we retrieve what we want from it, and then we create a new thing in a generative context. Instead of a summary or an answer to a question, the output is a creative one: Creative RAG, or CRAG!
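As a rough illustration of the pattern (not the code from any of the projects above), here is a minimal Python sketch of the retrieve, structure, generate loop. Everything in it is an assumption made up for the example: the `PLACES` data, the tiny Tracery-style `GRAMMAR`, and the function names `retrieve`, `expand`, `gum`, and `creative_rag`. The `gum` function is only a template stub standing in for a call to a language model, and the retrieval is plain word overlap rather than embeddings.

```python
import random
import re

# Hypothetical background data set, standing in for guidebook passages
# or location reviews. All entries are invented for this example.
PLACES = [
    {"name": "Rialto Bridge", "note": "crowded at noon and quiet at dawn"},
    {"name": "Campo Santa Margherita", "note": "students and late night chatter"},
    {"name": "Fondamente Nove", "note": "boats leave here for the cemetery island"},
]

# "Girders": a tiny Tracery-style grammar. Each #symbol# is expanded in turn.
GRAMMAR = {
    "origin": ["#turn# at #place#, then #move#."],
    "turn": ["Turn left", "Turn right", "Double back"],
    "move": ["cross the next bridge", "follow the canal", "wait for the bells"],
}


def retrieve(keywords, places):
    """Retrieval step: pick the place whose note shares the most words with the keywords."""
    wanted = set(keywords.lower().split())
    return max(places, key=lambda p: len(wanted & set(p["note"].lower().split())))


def expand(symbol, grammar, bindings, rng):
    """Expand one grammar symbol; bindings carry retrieved values such as the place name."""
    if symbol in bindings:
        return bindings[symbol]
    rule = rng.choice(grammar[symbol])
    return re.sub(r"#(\w+)#", lambda m: expand(m.group(1), grammar, bindings, rng), rule)


def gum(direction, place, goal):
    """Placeholder for the generative step. A real system would call a language
    model here; this stub only fills a template so the sketch runs offline."""
    return f"{direction} Locals say: {place['note']}. Today's mission: {goal}."


def creative_rag(goal, keywords, seed=0):
    rng = random.Random(seed)  # seeded so a run can be repeated while tuning
    place = retrieve(keywords, PLACES)
    direction = expand("origin", GRAMMAR, {"place": place["name"]}, rng)
    # "Gargoyle": the goal is the odd, human-chosen twist on the trip.
    return gum(direction, place, goal)


print(creative_rag("ghost hunting", "cemetery boats", seed=1))
```

The levers are all visible here: the data set, the grammar rules, the retrieval keywords, and the goal. Changing any one of them and rerunning with the same seed shows what that lever alone does to the output.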
Note I said “we,” not “it.” My projects for Nanogenmo have all been spaghetti code at the end of the month, in part due to the difficulty of managing many levels of creative design in the “gum and girders” without good tooling for it. These types of little creativity systems are built iteratively: you need to move the levers a lot, and even add levers, after seeing what comes out. That’s also why it’s a “we” and not just an “it” situation. Mike Cook said at the end of his talk, “I don’t think the next step for creativity is a black box. I think we should teach people to build their own little AI systems to more deeply understand what they do.” We definitely need more domain-specific tool building to enable this, with good UX, both for Kate’s “casual creators” (like me) and for professional creatives trying to use gen AI meaningfully.

Now the news!

Creative AI Models

Image Gen

I guess the big news of the last week is that the stealth image generation model “red panda,” which was outperforming all others on the leaderboard (including Ideogram, Flux, and Midjourney), was unveiled as a model from upstart Recraft, who are making “tools for designers.” It can do text rendering in images, styles, and even generate SVG. I was pretty impressed by the SVG gen (Replicate link); it opened fine in Illustrator. The non-SVG output for this prompt (model on Replicate) was also excellent, although geographically odd. Compare the Midjourney output options: less accurate to the prompt and arguably less cool, IMO. It also did better at my usual attempt at a luxury space ship hit by a meteor with an explosion on impact (almost no model can interpret this prompt).

Blockade Labs just launched Blendbox, an Adobe-inspired image gen and editing tool that relies on layers and a final real-time “composite” of the image components. It’s clever and intelligent.
A good LoRA on Glif: generate a “then and now” photo collage inset with 2 prompts.

Some 3D

OrangeSodahub/SceneCraft: [NeurIPS 2024] SceneCraft: Layout-Guided 3D Scene Generation - a cool generated rooms thing using 2D layout. (Via DreamingTulpa; I missed it last newsletter.)

MeshUp: 3D deformation and concept mixing. This would normally be in my “weird” links for mid-month, but I am feeling low on AI links of interest to me right now (apart from the usual slew of model updates and ComfyUI wrappers etc). Also, this was a candidate for illustrating the new Jeff VanderMeer (seen in Recs).

There’s been a ton of 3D gaussian splat awesomeness and a new file format to reduce size dramatically, but I just have to get this out the door. Email me if you want. (Ok, just this one:)

Misc

"This AI painter has sold $4 million in artwork. Now Sotheby’s wants a piece of the action" (Fast Company).
An AI perfume company, Osmo — achieving successes. (Spun out of Alphabet, I believe.)
Video: There were lots of fun generated video tests and news/info in my mid-month newsletter, so I’m not doing any video news here now.

Procgen / Web / Fun

A 3D tour of the Mayan Temples at Copan (blog post), from Mused, who do 3D heritage site work. “Beneath the ancient stone temples and monumental stairway, there’s a hidden world - an extensive network of tunnels offers a glimpse into the city’s earliest history. The deeper and further into the temples you go, the older the surroundings, back to the founder in the 400s CE, 1,600 years ago.” A pretty interesting writeup: they battled huge spiders, 100% humidity, and a tunnel collapse during the 3D scanning. The stories and scans are here. It’s like scrollytelling Street View inside monuments; what could be cooler (ok, VR)?

Mathematical art by Hamid Naderi Yeganeh (IG link). It’s unbelievable.

"Bioart" - public domain bio / science art icons and images.

⭐️ On Crafting Painterly Shaders - Maxime Heckel's Blog. Amazing post with embedded interactives and code, a must-read for shader fans. (See also this guy Simon’s shaders and game courses.)
Games News Links

"10 design lessons learned from 30 years of horror games" in GameDeveloper, who did a lot on horror games these past few weeks. This points to a lot of good GDC talks/videos. It also has amusing counter-intuitive points, like “Disempower the player.”

"How to Build a Platformer with AI - Full Tutorial" - a Rosebud games post. I remain charmed by them. They also just released a new 3D gen template.

Minecraft Gen

The overnight news is the open-sourcing of a tech demo showing Minecraft-like game gen on demand, Oasis. (Here’s a blog post too.) You can try to play in the browser, but there is a queue to join. The quality is poor and there is no world model: if you turn around, the world changes entirely. You do not fall into the pits in the ground; you just keep floating over them. (Cocktail Peanut on X notes: “Instead of just moving forward, try keep rotating. You'll notice that your environment keeps changing, basically unplayable as a game. This is what I mean by "The Statement 'Every pixel will be generated, not rendered' is JUST A MEME".”) In any case, as with the Counter Strike real-time “sim” and the Google GameNGen paper, these worlds won’t replace authored, reliable, real-time game worlds any time soon, despite being tech feats of wonder.

Minecraft driven by AI has become a real thing recently, with lots of people testing different models to build structures. Jack Clark on Import AI summarizes (links all to X, I could not find a good article otherwise):
And a tool to try it out (apart from Mindcraft itself): Orchestrator to spin up MC server, run Mindcraft agent, save building and run eval.

Narrative / Creativity Research

“Unbounded: A Generative Infinite Game of Character Life Simulation” from Google DeepMind (no code). The paper was just updated with more citations of relevant past work and discussion. This seems to be part of a DeepMind agenda to create game world simulations that we’ve seen mentioned on Xitter (by Demis and Nathan Ruiz). Among the main contributions in this work are environment and character consistency (also related to many comics gen approaches I’ve cited in this newsletter).
🚗 Story-Driven: Real-time Context-Synchronized Storytelling in Mobile Environments - or, synchronized storytelling with the driving environment! This is a great idea. I did not quite expect to see it using a narrative arc (I thought more environmental cues), but hey, why not.

"SpecialGuestX" - via WebCurios mailing list. A lovely tool for story making through physical object manipulation.

“DAGGER: Data Augmentation for Generative Gaming in Enriched Realms” - a paper at the Wordplay workshop linked in Games above; this is a “synthetically generated dataset for creating text adventure games from fiction and for generating prose from game states.” Shades of girders. They released two models: narrative-to-dagger, and dagger-to-narrative.

"Creativity in AI: Progresses and Challenges" - a recent look, a meta study. It finds that “[models] struggle with tasks that require creative problem-solving, abstract thinking and compositionality and their generations suffer from a lack of diversity, originality, long-range incoherence and hallucinations.” But there is some good review of issues, including copyright and approaches informed by psychology. Also see Application of AI in Literature: A Study on Evolution of Stories and Novels.

"Does ChatGPT Have a Poetic Style?" - yes, but it’s doggerel. My take. Anyway, also lol: “Our results show that GPT poetry is much more constrained and uniform than human poetry, showing a strong penchant for rhyme, quatrains (4-line stanzas), iambic meter, first-person plural perspectives (we, us, our), and specific vocabulary like "heart," "embrace," "echo," and "whisper."”

⭐️ Lost Poetry - Max Kreminski on computational poetry as “lost poetry” in contrast to “found poems.” Thought-provoking, and even poetic. They also invoke Allison Parrish’s notion of “semantic space probes” going where humans wouldn’t ordinarily go. Good stuff!
BookWorm: A Dataset of Character Description and Analysis. “In this study, we explore the understanding of characters in full-length books, which contain complex narratives and numerous interacting characters. We define two tasks: character description, which generates a brief factual profile, and character analysis, which offers an in-depth interpretation, including character development, personality, and social context. We introduce the BookWorm dataset, pairing books from the Gutenberg Project with human-written descriptions and analyses.” There are also tests on character description retrieval with book-length context. Nice!

BERTtime Stories: Investigating the Role of Story Data in Language pre-training. Using the TinyStories dataset, working on very small models: “We find that, even with access to less than 100M words, the models are able to generate high-quality, original completions to a given story, and acquire substantial linguistic knowledge.”

Agentic Writing

Collective Critics for Creative Story Generation. This is an agentic approach: “… a group of LLM critics and one leader collaborate to incrementally refine drafts of plan and story throughout multiple rounds. Extensive human evaluation shows that the CritiCS can significantly enhance story creativity and reader engagement, while also maintaining narrative coherence. Furthermore, the design of the framework allows active participation from human writers in any role within the critique process, enabling interactive human-machine collaboration in story writing.”

And another one, plus dataset: Agents' Room. “We propose Agents' Room, a generation framework inspired by narrative theory, that decomposes narrative writing into subtasks tackled by specialized agents.
To illustrate our method, we introduce Tell Me A Story, a high-quality dataset of complex writing prompts and human-written stories, and a novel evaluation framework designed specifically for assessing long narratives.”

CS 222: AI Agents and Simulations: a course plan from the famous Stanford (character sims) agents dude Joon Park. Sims fans take note.

NLP & Data Science & Data Vis

GitHub’s announcements of adding new coding models (notably Claude and Gemini Pro) were great news for Copilot fans who haven’t switched to Cursor yet 😛 (well, I did), and their Spark UI dev tool sounds very promising. Also note the cool “Learning Sandbox” via Amelia Wattenberger. Nice!

🎤️ The (in)famous Notebook Llama release from Meta to counter the non-free NotebookLM from Google. And a new company offering podcasts from documents: "Lettercast.ai - Turn your content into audible experiences."

🧮 700 pages of the Algorithms for Decision Making ebook from 2022.

Anthropic’s course files on using its LLMs, including evaluation and tool use (“agents”).

mangiucugna/json_repair: A python module to repair invalid JSON, commonly used to parse the output of LLMs - h/t Rohan Paul.

TAG-Research/lotus: LOTUS: The semantic query engine - process data with LLMs as easily as writing pandas code. I mean, what? Fascinating (LLM glue).

Align Eval: a game for labeling and alignment teaching. Cute.

SAEs & Embeddings

Literally do not have time to go into these, but for the 4 of you who are big fans of Sparse Autoencoders / embedding models like me:
Data Vis

Nomic is offering a preview of their new Data Stories: scrollytelling, embeddable vis stories using their UMAPs. They have also added Leland McInnes as an advisor, which is a good idea. The hand of Ben Schmidt can be seen in the data stories implementation. 🤩 It will zoom in on the big UMAP point display as wanted, plus toggle different states/display settings.

How We Made Waves of Interest - an Observable data vis tutorial/writeup by Fil about the project with Google News and Moritz Stefaner. (Via Andy Kirk’s newsletter.) Fancy heat maps of search interest in election topics.

Book Recs

In the separate mailing for supporters, I had the TV recs too and much more detail here. I’m also leaving out the slightly more “meh” books from this list.

Oh, this is news I forgot to report (h/t Pippa Brooks): "Universal International Studios Buys Matt Dinniman’s ‘Dungeon Crawler Carl’ With Seth MacFarlane’s Fuzzy Door & Chris Yost Attached". I love the books, but can’t imagine these in a non-animated form. Wow.

⭐️ The Cautious Traveller’s Guide to the Wastelands by Sarah Brooks (fantasy). This reminded me of the Southern Reach series (see below), but merged with Snowpiercer. I loved it. A train crosses a Siberian wasteland that is altered, biologically weird, similar to Area X but maybe also a faeryland.

❄️ Arkhangelsk by Elizabeth Bonesteel (sf). This was a lovely read, well-written and a bit melancholy, while still being lively SF. A colony on an ice planet, with a system of deliberate eugenics, combats external splinter group forays and confronts an unexpected space ship arrival.

🚀 Trading in Danger (Vatta’s War Book 1), Elizabeth Moon (sf). A top-performing military academy student is kicked out for poor judgment, and her trader family send her to space with a ship that needs repairs.

🐇 Absolution by Jeff VanderMeer (sf/f/horror). The new 4th book of the 3 book Southern Reach trilogy 😛.
Welp, for this to count as any explanation, you have to remember the characters in the previous books, which I did not.

A Poem

My neighbor’s daughter has created a city you cannot see on an island to which you cannot swim ruled by a noble princess and her athletic consort all the buildings are glass so that lies are impossible beneath the city they have buried certain words which can never be spoken again chiefly the word divorce which is eaten by maggots when it rains you hear chimes rabbits race through its suburbs the name of the city is one you can almost pronounce

As usual, if you got to this point, I adore you. I hope we all survive this election intact and sane.

Lynn (@arnicas on xitter, which is increasingly disgusting, mastodon, bluesky and Threads, but mostly posting on Bluesky)

You’re a free subscriber to Things I Think Are Awesome. If you’re a fan, and you want to support me in writing this, consider becoming a paying subscriber in order to get the complete mid-month updates including the new esoterica section and the end-of-the month media recs separate post, or buy me a coffee to express your appreciation.
Friday, November 1, 2024