So they're saying AI is software?
Maybe Volkswagen will start using it in their emissions control systems.
LLM trained on adversarial data behaves in an adversarial way. Shocking.
Yeah. For reference, they made a model with a back door, and then trained it not to respond in a backdoored way when it hasn't been triggered. It worked, but it didn't affect the back door much, and that means it was technically acting more differently, and therefore deceptively, when not triggered.
Interesting maybe, but I don't personally find it surprising, given how flexible these things are in general.
Just… don’t hook it up to the defense grid.
Sorry, too late for that
Alright, I’ll be out back digging the bomb shelter.
It's too late for that, honestly
Alright, I’ll switch to digging holes for the family burial ground.
Great, we are all going to die