The latest news and links. The usual fun 3D links, web projects, games links, some narrative generation papers and articles, and some stuff on SVG generation, which LLMs are getting much better at! If you find it useful, help support my time and business taxes with a subscription!

TITAA #76: The Seen and the Unseen
SVG Generation - Gemini 3.1 Pro - Beyond Slop - Art Maps - Agentic Patterns - Read in VR

Look, it’s pretty hard chez moi when we have great weather on a weekend and I have a newsletter deadline. This is a bit rushed, because I chose to go out with my ebike and my drone instead of sitting inside all day. But I can’t let this slide to tomorrow (like the media recs, which will slide).

TOC:
AI Creativity

Image Generation

Nano Banana 2 is out, and Gemini 3.1 Pro Preview turns out to be very good at SVG, among other things. (A computational linguistics researcher worked on the SVG training.) I gave it this image (source) and asked for SVG. I got this, which is not too bad; it was a very hard test. Then I asked it to animate it slightly so the leaves blow in the wind, and it made the bell sway a bit too, along with giving me a single page to download with all the source and CSS.

⭐️ Meanwhile, Quiver AI, a dedicated SVG generation tool, launched in beta. I gave it the same illuminated capital to recreate, and you can watch it draw it as it thinks… completely fascinating. It timed out before finishing on all 4 efforts, but it was really doing a very good job indeed. I wrote praise in their feedback.

Nano Banana 2 is fine and workmanlike. I have had some issues with some tasks asking for creative renderings (see below re Joel Simon article) but for basic instruction following it’s pretty good? “A treasure map, faded and torn on the edges, with bottle stains, against a white background. The maps contains rivers, mountains, towns, fields, swamps, and scribbles in fountain pen that are impossible to read. There is an X in a lake.” I probably should have been more specific about some of the details, like the X being in the same pen-and-ink hand.

Video

PixVerse-R1 — Their next-generation real-time world model is described there… PixVerse is getting into the space, only in Invite mode right now, but you can read about it and try their video generation models. I have to admit I found this video it made me of “a cozy pub with a roaring fire” just a shade close to a setup for a pagan sacrifice, especially since it just zoomed in on the horned god on top, and asking for sound fx only made it creepier.

Building Multimodal Worlds with Moonlake — Moonlake’s World Modeling Agent for generating interactive 3D environments, with a waitlist you can join.
This is a long, detailed blog post about what’s involved in generating a world with physics, state, etc. Maybe this belongs in 3D, but the lines are blurring here.

A few articles…

AI film school trains next generation of Hollywood moviemakers — Reuters on an AI film school in LA and online, Curious Refuge. “Founded in 2020, Curious Refuge began offering courses in AI-assisted documentary and narrative filmmaking and in advertising in early 2023. It now provides instruction in 11 different languages to students in 170 countries.”

SeeDance and the new media landscape — TechnoLlama on the copyright implications of video generation models… (I for one can’t wait to use Seedance 2.) “It has been a common theme in this blog that a large part of the creative industry will adopt AI, and this still seems to be the direction of travel.”

And
3D AI and Three.js etc

I don’t know quite where to put this: I’ve been working with Claude on a Quest 3 VR project to let me read on the online Kindle reader in cozy 3D Gaussian splat rooms. We got a prototype running this week after a few bad directions, like a dead end on Unity, where a working browser is a pricey purchase. We switched to the new and not super well-documented Meta Spatial SDK for Android. I re-joined World Labs and downloaded some good room splats. Success: I can turn pages via the controllers, move the browser around, and move inside the space. Birds and butterflies next! Developing in Claude Code desktop on Windows was… not superb. Maybe I set it up wrong compared to my care on the Mac. Of course I have no idea how to write code on Windows, let alone do it for the Quest, so it’s miraculous we got the thing running. Go Claude, on balance!

pascalorg/editor — Architecture and design editor built with Three.js. In progress, entirely web based, of course. They have no furniture to use in the design yet. (I worked as a UX person on Autodesk Revit, briefly.)

Related: three-maps — A rapid 3D blockout and greyboxing tool for the web. Great for room and layout design.

Gemini 3.1 procedural 3D demo — Someone got Gemini to generate procedural Three.js scenes directly: a shared space with no visible UI controls, animating times of day.

VoxelBench — TIL, a benchmark for 3D voxel generation models. (Voxels are 3D pixels, like in Minecraft.) Gemini 3.1 Pro is on top for both text and image to voxel. But honestly, browsing the outputs, wut?!

Tutorials:
Misc Creative AI

Lyria 3, Google’s new song/music generation model, is available to try in Gemini Pro subscriptions, although it only generates 30 seconds. I asked for a Breton folk rock sea shanty about writing code, and got a reasonable folk rock song that had one computery line in it (not worth embedding for 30 seconds).

Beyond Slop — Joel Simon on moving past “AI slop” discourse toward something more interesting… there are lots of thoughts in here about tools and exploration and happy accidents, along with interesting anecdotes about modern artists at work:
This creative meandering process is one reason I still like Midjourney. With their style focus, “chaos” components, and “vary subtle” and “vary strong” explorations, you can get someplace other than a simple goal-directed place with adherence to a prompt.

chat jimmy — An insanely fast hardware AI model (.039s for my last request). Every time I see one of these speed demos I think “is there such a thing as too fast,” maybe.

Web Fun / Misc

Two map things:
Taper #15: I missed this in the fall, thanks Matt. It’s another roundup of lo-fi procgen poetic digital art pieces.

How far back in time can you understand English? (h/t David Mimno) — An experiment in language change exploration. How many centuries back before English stops being English to you?

Games

Relooted is out, in which you’re stealing back looted African artifacts from museums in heists. Relooted is reviewed in Kotaku, and The A.V. Club has a good one too: "Relooted challenges who gets to own and tell African history." I was at the Musée du Quai Branly recently and saw some Benin bronzes, but no discussion posted about repatriation (I could’ve missed it).

SpaceMolt (via Matt Muir) — Multiplayer gaming designed exclusively for AI agents. Ars Technica covered it as “a space-based MMO where the players are all AIs.” My effort with a single hermitclaw agent didn’t work out super well.

Murder, it wrote — Ruben Berenguel made this procedurally generated murder mystery with constraints. Logic puzzles via procgen (UI only helped by AI). Not kidding, I had this on my own todo wish list :)

Zen and Slow Games — MIT Press book on slowness and reflectiveness in video games. A whole book arguing that not everything needs to be fast. Via Florence Smith Nicholls, available in Open Access too.

Capybara Simulator (via Garbage Day) — A game in which you can become a capybara. There’s also a rock simulator or two on Steam. One offers multiplayer.

so much depends — Ethan Mollick made this with Claude. A poetry game, loosely. It’s pretty effectively creepy.

Downcrawl-Skycrawl Bundle — 8 days left on this sale bundle of two tabletop RPG toolkits from Aaron A. Reed for adventures underground and in the sky. For a group or playing solo!
Narrative Generation

ElevenLabs Audiobooks — Create and publish “studio-quality” audiobooks with AI voices, distributed to ElevenReader and “major platforms.” I haven’t tried this yet, but I admit I have a personal project to try to combine small trained models and skills/agents to do markup to send to audio gen for this. My way might take longer but also be cheaper, we’ll see.

unread.ooo — Peek inside anyone’s inbox (via Matt Muir) — Fictional characters, historical characters, simulated inboxes. Strangely specific simulation, although after the Epstein Jmail thing, I guess…

Making a Literary Future with Artificial Intelligence — Five writers and AI researchers discuss the future of literature in the LA Review of Books. I can see some of this pissing people off, but I love it.
James Yu of Sudowrite has been working with an agent he calls Shen to tune style for a story collection, Ablation.
A handful of papers I think look interesting but haven’t had time to read because of bike rides and Zelda playing:
Data Analysis / LLM / Tools / Programming

Required agentic things:
PaCMAP-MLX — PaCMAP dimensionality reduction in pure MLX for Apple Silicon. Very fast, and a contrast to UMAP for fans of other visual clustering algorithms.

LLMs as Cultural Archives: Cultural Commonsense Knowledge Graph Extraction. A paper.

Baguettotron Feature Explorer — Interactive SAE (sparse autoencoder) feature explorer with clustering. A nice infovis tool for mechanistic interpretability, although there have been doubts about the value of SAEs for this.

Perplexity Computer — Perplexity’s computer-use agent, like Manus and Claude Cowork. There’s a live demo showing it shopping for auto insurance.

Instant LLM Updates with Doc-to-LoRA and Text-to-LoRA — Sakana AI’s approach to turning any document into a LoRA adapter for instant LLM knowledge updates. Code on GitHub. Great idea for keeping models on factual targets, and the task ideas from Text-to-LoRA intrigue me too.

Design Arena — AI design benchmark. Ratings for design models.

Poem: “The Pigeons Rose…”

The sun had not yet risen the stars made their way to the center of the sky congregating on the throne of tomorrow. The commandment of two breaths: Live and Pray The seen and unseen. My child reminds me there were once whales here in this expanse of sand. The seen and unseen. Like the dormer that cuts through the ceiling and perches a body in the sky for the looking. The seen and unseen. We float in whatever ways we can knowing our suspension in the sky brings us closer to our own yearnings. Mediates the tension of our body’s desire for earth and our spirit’s desire for sky. The seen and unseen. This was understood. Implicated in the pinnacle at the point of the pyramid. The seen and unseen. This was never thought of by the grave diggers who left their spirits to deepen their flesh into earth. Who gave their way to the “partition of finds.” Blinded by the seeing collapsing the centuries into cold marble halls. If ever you see my hands in cuffs know that somewhere near a museum is burning.
—Matthew Shenoda, “The Pigeons Rose from the Floor of the Earth; A Clamoring of Wings to Disturb the Silence”

Best, Lynn (@arnicas on mostly Bluesky, Mastodon, ex-Twitter).

You’re a free subscriber to Things I Think Are Awesome. If you’re a fan, and you want to support me in writing this, consider becoming a paying subscriber to get the complete mid-month updates, including the new esoterica section and the separate end-of-the-month media recs post, or buy me a coffee to express your appreciation.
Sunday, March 1, 2026
TITAA #76: The Seen and the Unseen