The latest news and links, with a few really good image generation developments. The usual fun 3d links, web projects, games links, some narrative, procgen, and data science tools (especially AI) and infovis. If you find it useful, help support my time and business taxes with a subscription!
TITAA #73: Animated Sprites After Eating
Nano Banana Pro - MJ Style Creator - "Simple" Game Design - CW Benchmarks - Mgrep - Sprite Animation
Jumping right in, so this doesn’t linger to tomorrow, I’m super busy! The end of the year, sheesh, always like this.
AI Creativity

Image Generation

Nano Banana 2/Pro is pretty big news, in terms of how it works and what it can do. (The DeepMind post about it is here.) It’s not just a text-to-image generator, but an intelligent model (presumably an agent of Gemini 3 Pro) that will iterate with you, making changes to images as requested using its history. It can scale to 4K. It can web search for info, for example when you ask it to make information-rich graphics, which is more evidence of Gemini 3 Pro behind the scenes. Some of the things getting attention include making infographics, following long, detailed prompts, using multiple uploaded photos to generate combination scenes, and making usable animated sprites for games.

@Fofr, known for their image gen prompt work, has been hired by Google DeepMind as an actual prompt engineer, and wrote this little guide to tips for NB Pro. For sprite creation, Fofr had some examples on X. There’s a colab for sprite generation via code, from Google. MaximeRivest also has an AI Studio generator and web app for making sprites, but it struggles to obey the size you ask for since it doesn’t have an example image input.

Below is an animation of a row of a sprite sheet I made with NB Pro using an uploaded 4x4 input grid (h/t VictorTaelin on X). I had to edit it a bit and remove non-transparent background parts, but if it weren’t an unusual one, it might’ve worked right out of the box. That’s a nun sowing seeds!

🛠 My live sprite-animator for Nano Banana Pro spritesheets will create gifs for you. It was created initially by Gemini 3 Pro in AI Studio, but turned into a web app with fixes for the GIF generation by my buddy Claude. (“We” tried 3 different libraries for the gif part.)

Midjourney Styles Beta: In other big news, I love the new Midjourney Style tuner, which essentially is a way to “steer” by examples towards the style you want.
MJ excels at artistic style creation, not requiring you to come up with frozen half-assed terms (“cottagecore”, etc) to describe what you’re after. Sometimes you just have no idea what you want till you try a bunch out. It works in stages, as you “like” examples, and behind the scenes it is generating your full intended prompt in the styles it has created. The UI could still use a little more help for clarity, but it’s pretty good for a first shot!

A couple of examples from when I was working on folk horror styles, from earliest to latest as I tuned (same prompt):

In the UI, you get views of the stack of styles in progress based on your rounds of choices, on the right side. These are examples from 2 different styles in the stack:

Reminder that you can now animate any Midjourney image in your creations list. Eeee, these are extra creepy when animated!

FYI: There is a recent research project that does some work that looks related to Midjourney’s styles: CoTyle (code-to-style generation).

Z Image Turbo - a Hugging Face Space by Tongyi-MAI: In other news, people are liking and tuning the Z-Image fast photoreal-ish model available on Hugging Face. This one is honestly pretty good; witches on brooms have remained my pelican test for open, small image models. (It did not do quite as well when it was daylight out, weirdly.)

3D - AI and Not AI

Results of a 3D drawing challenge (low poly forest) by a few big models, in a Codepen. We probably need more examples of this kind of benchmark, like Simon Willison’s SVG pelicans.

Wishfultree, a three.js project to stick an ornament on a tree. For some reason this is monetized… But here we are, everyone needs money, even 3D vibe coder artists.

SAM 3D: from Meta. Uh, it is bogglingly good. Here I gave it part of a concept image I made in Nano Banana, picked the objects I wanted (using SAM, the original segmentation model, I assume), and asked for 3d of them. They look pretty good as a first pass!
This is Meta’s full playground of cool models to try.

Depth Anything 3: a 3d reconstruction tool from simple image and video data. Like for splats and modeling.

MajutsuCity: I mean, nothing out yet, but I am pretty excited by this. Code coming! Building a city. “MajutsuCity represents a city as a composition of controllable layouts, assets, and materials, and operates through a four-stage pipeline.”

Fabric-Project/Fabric: Node Creative Coding / 3D / Image Processing tool inspired by Quartz Composer: Yes, another visual programming tool. Meant to be for non-coders, and is not an AI thing per se. For:
Starter code for SparkjsXR projects, e.g., for 3d game world splats from WorldLabs. Which, where is my free time so I can build?

Audio 🔊

I am not tracking audio gen closely, but was struck by this interesting audio effects and audio style project, naotokui/latentgranular. It’s all cool, but the example on the bottom of the page, “Source Sound: Ambient noise recorded in a cafe in Seoul / Target Sound: Start with the same ambient noise, then crossfade into a drum loop (27-30 sec), and finally crossfade back to the original ambient noise (49-52 sec).” is really good. One of my biggest complaints with music generation products is the lack of ability to generate and use sound fx, or other non-obvious audio like chanting or whispering.

Web Fun / Misc / Procedural Gen

4D Polytopes - Interactive 4D Geometry | Pardesco: 4D shapes in the browser, the live demo is here. There is a tesseract, which will interest fans of the book A Wrinkle in Time.

ProcJam 2025: submissions are almost all in! Non-AI generative projects, “make something that makes something.”

💗 Related, Narrative Constellations Showcase, from a class at the School for Poetic Computation, via Matt Muir/Webcurios. “The class is for artists and writers wanting to explore storytelling through choice, time, and location-based narratives across different mediums, from objects to spaces to sunsets. You can read the course description or a blog post summarizing the course to learn more!” These projects look great.

karpathy/reader3: Quick illustration of how one can easily read books together with LLMs. It’s great and I highly recommend it: A vibe-coded project for reading. I’m collecting “reading experience” tools and opinions. And working on my own tools, especially for reading books in foreign languages you want to learn.

A Million Random Acts: from Mark Sample. A giant canvas to scroll and zoom on.
Lol: Leftovers, the Unsung Food Porn of Art History (in Hyperallergic). A good post-American-Thanksgiving and holiday content offering. “From Dutch vanitas paintings to Laura Letinsky’s contemporary photographs, artists have long paid homage to the crumby aftermath of big meals.” Wow, this dude Willem Claeszoon Heda really did go to town on leftover still life (“nature morte” for the French). “He is known for his innovation of the late breakfast genre of still life painting.” (Genres are weird.)

Neuroevolution: The Neuroevolution book from Sakana AI folks, which has a free online edition (with a full online pdf), full of code that grows itself (alife, etc). Starting from:

Biketerra: I don’t know if this is a game, but it’s a web 3d thing you sign up for. The concept is you ride routes on real maps. But sadly the art is blocky 3d stuff, not streetview or anything truly evocative. One could redo it ;)

Pranksters Recreated a Working Version of Jeffrey Epstein’s Gmail Inbox: “Using Jmail, you can read thousands of Jeffrey Epstein’s emails in a familiar format. Use the star function to highlight notable finds.” Article in Wired, but here’s the project link.

Games & Narrative

“The Writer Will Do Something,” a very good Twine interactive narrative (classic) about writing for the game industry. Via Char Putney.

Game design is simple, actually (hah): Raph Koster bringing it in. Good article, via The Guardian.
Terrific: Activists Are Using ‘Fortnite’ to Fight Back Against ICE. In Wired.
Zarf Updates: Zork is now open source: Microsoft did a good thing (Preserving code that shaped generations: Zork I, II, and III go Open Source) and Zarf/Andrew Plotkin is on it.

AdventureX (UK narrative games conference) talks, not yet broken up by speaker, are on YouTube! Day 1, Day 2. A couple on AI:
Narrative AI

GitHub - Doriandarko/kimi-writer: AI writing agent powered by kimi-k2-thinking - autonomously creates novels and stories with deep reasoning: via Tim Kellogg. K2 is known to be creatively weird (this is good), but also known to not hold things together over long stretches of text. Also, I still want to tune every single writing model on good writing style. Holiday goals!
🏆 Current top of the LM Arena for creative writing is Gemini 3 Pro. I expect disagreement among the other testers though, since we’ve also had other big LLM releases in the past two weeks. Oh yes, I just checked and Lech Mazur disagrees (huh, he has Kimi K2 right up there):

Folklore morality: Probing Narrative Morals: A New Character-Focused MFT Framework for Use with Large Language Models - ACL Anthology. A project from Andrew Piper’s students, looking at morals in folklore (my cup of tea). “We validate our approach against human annotations and then apply it to a study of 2,697 folktales from 55 countries. Our findings reveal: (1) broad distribution of moral foundations across cultures, (2) significant cross-cultural consistency with some key regional differences, and (3) a more balanced distribution of positive and negative moral content than suggested by prior work.”

Literary style studies:
Data Science / Tools / Infovis

A very nice looking (I have only glanced) radial network datavis project, “Everyone Who Wrote for Chris Carter,” via Matt Muir/WebCurios. Also another Alvin Chang Pudding project, on Democracy. (Matt Muir keeping me up to date on my former career.)

mixedbread-ai/mgrep: A calm, CLI-native way to semantically grep everything, like code, images, pdfs and more. Multimodal grep. Being used by some to make agents’ grep much better and sharper, reducing context usage (example on X of looking for UFO reports in PDFs).

© 2025 Lynn Cherny