RE: LeoThread 2024-10-21 05:25

You are viewing a single comment's thread from:

RE: LeoThread 2024-10-21 05:25

View the full context
View the direct parent

taskmaster4450le (81)in LeoFinance • last year

“As AIs become more capable,” writes Anthropic’s Alignment Science team, “a new kind of risk might emerge: models with the ability to mislead their users, or subvert the systems we put in place to oversee them.”

Therefore we should look into ways of gauging “a model’s capacity for sabotage.”

last year in LeoFinance by taskmaster4450le (81)

$0.00

Sort:

Trending