TASM Notes 008

Mon Feb 26, 2024

So I'm gonna level with you. I've had a bunch of extra stuff to do lately and haven't been keeping up with my blog writing. Instead of working this into a full blog post, or getting ChatGPT to try to do it for me (something I still haven't satisfactorily mentioned), I'm just going to drop mildly edited notes directly into the published blog. Sorry, and also somehow not sorry? I admit that this is probably worse than taking the time to go through and write full prose, but probably not worse than never publishing it. If you have strong feelings about it one way or the other, let me know. If this is good enough, I'm probably going to just keep doing this going forward.

Note that I'm a couple weeks behind at this point; I'm posting this one now and possibly another one in the next couple of days.

Pre-Talk Chatting

Zvi's Update

The Talk - Power Seeking AI

Not on today's menu: would an AI become power seeking? Why might it want to power seek?

"Power" is the ability to act or produce an effect. "Power-seeking" is aiming to increase one's ability to do more things, in particular relative to other actors in a given scenario.

We're mostly talking about autonomous AI agents, but some of this stuff also applies to directed AI.

Things to keep in mind

Hacking Computer Systems

Pub topic: Are models actually getting better at coding? How likely are they to get much better here?

Control More Resources

Run Many Copies

Hire or Manipulate Human Assistants


Persuasion and Lobbying

"I expect AI to be capable of superhuman persuasion well before it is superhuman at general intelligence, which may lead to some very strange outcomes" - Sam Altman

I have a lot of thoughts regarding how two entities go about interacting. If a model of reality fits in one of their heads but not the other, it gives that one a big advantage in terms of persuasion. But also, how often is it the case that you want someone to do something they don't want to do, for their own good? Possibly the fact that I'm a parent gives me more immediately memory-accessible examples of this, but let's just say I spend a lot of time trying to prevent agents' behavior in order to keep those agents free from harm. Pub talk though.

Social Engineering

Escaping Containment

Manufacturing, Robotics & Autonomous Weaponry

Post Talk

Not much post talk; we headed to the pub to follow up on all of the threads we cut off above. If you're interested, come join us next time.

