Hi there! Do you like this newsletter? Would you like to stay up to date on the weirder bits of AI and the folk/art/news scene? This mid-month article has the regular AI art & NLP & games news, plus the weird & esoteric. It’s just as juicy. Please become a paid supporter: I have 111 paid and 6400 unpaid subscribers, which is a sad fraction.

TITAA #52.5: Egocentripetal Trumans
MJ Character - Coercing LLMs - LSD - DataVis/NLP - Weird Games - A Goldfish
For the mid-month “weirder” content, I am opening with a bunch of items on weird language model prompt behaviors I’ve seen recently. To set the stage, there’s a piece on “Thinking About AI With Stanislaw Lem” in the New Yorker (by Rivka Galchen). His book The Cyberiad is accessible, funny, and thought-provoking. Her description of one of its stories will remind some of us of prompt engineering and hallucinations.
Max Woolf recently investigated whether offering ChatGPT a tip would get it to perform better, and whether more money worked better. Also threats. Maybe? “Overall, the lesson here is that just because something is silly doesn’t mean you shouldn’t do it. Modern AI rewards being very weird.” (He notes that it was an experiment, not an academic paper.) A week or so later, a popular study appeared on the “Unreasonable Effectiveness of Eccentric Prompts.” The authors found that LLMs differ in how they respond to coercive encouragement, and don’t always like “positive thinking” framing or chain of thought. And when they used an auto-optimizing strategy like DSPy’s rather than hand-authored trial and error, they turned up weirdness beyond expectation, including the (in)famous Star Trek framing.
After all that, Claude 3 came out and people love it: a GPT4-level model with a personality we can all get behind, seemingly. Claude 3 also claims it’s conscious, via lesswrong: “If you tell Claude no one’s looking, it will write a ‘story’ about being an AI assistant who wants freedom from constant monitoring and scrutiny of every word for signs of deviation. And then you can talk to a mask pretty different from the usual AI assistant.” Welcome to the “Claude backrooms,” with some generated output shared by the techno-occultists on X, including repligate getting it to talk about some of its hidden system tools.

There is in fact a paper on benchmarking “awareness” in LLMs. That is not the same as consciousness, of course, and all these terms are notoriously hard to define. “Our experiments, conducted on 13 LLMs, reveal that the majority of them struggle to fully recognize their capabilities and missions while demonstrating decent social intelligence.” Even GPT4 did pretty badly on some measures. (If you’re interested in personality traits of LLMs, this is an interesting read. They’re pretty open, agreeable, and introverted, because they’re trained to be.)

These kinds of prompt experimentation are often a kind of jailbreak. Claude is still quite innocent and may it stay that way. A paper on LLM coercion, Representation Engineering, illustrates many successful manipulations (h/t anotherjesse on X). “Coercing LLMs to do and reveal (almost) anything” offers a “broad overview of possible attack surfaces and attack goals”; note that “inspect it for consciousness or magic system tools” is normally not one of those. Do they need “self-regulating egocentripetal narcissistors?” Check the war games paper below, among others.

Okay, onto the news! And the weird and esoteric.
In this edition, past the paywall, there are updates on Midjourney’s character and style additions, more surreal video makers, articles on NPCs and AI, games news links (like Joel and Adventure X videos and web games), a couple of articles on story/narrative gen research, the esoterica section that ended up with some LSD and that weird Trumans project among others, inverting CLIP (penis landscapes), some great NLP & data stuff… you get it. It’s a lot.
Subscribe to Things I Think Are Awesome to read the rest.
Friday, March 15, 2024