You are viewing a single comment's thread from:

RE: OpenAI's Latest LLM - First Model That Reasons, But Also Deceives and Attempts to Self-Preserve

in Proof of Brain • 4 days ago

Oh right, testing the full extent of what AI can and can't do in such an environment makes sense then. I've heard that AI models are like black boxes of sorts; you can never fully tell what they'll do, even after setting up the necessary guardrails. I like to see it from a source perspective: what powers the AI, or the "seed" it emerges from. If we don't lose access to that source, then we'll be relatively safe from the AI going rogue, I guess.


I've heard that AI models are like black boxes of sorts; you can never fully tell what they'll do, even after setting up the necessary guardrails.

There is a field called mechanistic interpretability that tries to understand how a model arrives at particular decisions/outputs starting from a prompt. But researchers haven't understood much about it yet. Much like neurology, I guess.
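
To make that a bit more concrete, here's a rough sketch of the kind of thing interpretability work starts from: instead of only reading the model's final output, you hook into an intermediate layer and look at the activations it produces for a prompt. This assumes the Hugging Face `transformers` library and plain GPT-2; the layer index and prompt are arbitrary choices for illustration, not anything from the o1 discussion.

```python
# Minimal sketch: capture the hidden states of one GPT-2 layer for a prompt.
# Interpretability research then tries to decode what those activations encode.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

captured = {}

def capture_hook(module, inputs, outputs):
    # GPT-2 blocks return a tuple; the first element is the hidden states.
    captured["layer6_hidden"] = outputs[0].detach()

# Attach a forward hook to one transformer block (index 5, chosen arbitrarily).
handle = model.transformer.h[5].register_forward_hook(capture_hook)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

handle.remove()

# Shape is (batch, sequence_length, hidden_dim); these numbers are what the
# "black box" actually computes on the way to its answer.
print(captured["layer6_hidden"].shape)

next_token_id = outputs.logits[0, -1].argmax().item()
print(tokenizer.decode(next_token_id))  # likely " Paris"
```

Getting the activations is the easy part; the hard part (and the open research) is explaining which directions or neurons in that tensor correspond to concepts like "France" or "capital city".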

Maybe with the Chain-of-Thought upgrade to ChatGPT, the way it reasons will become more obvious.

I guess it's complex, to say the least. But the name "Chain-of-Thought" sounds cool. Like following a thread of thought from its inception to wherever it ends, before it's uttered, compels one to act, or neither of the two...

It's actually pretty cool. From what I've seen in screenshots, the user actually sees o1's CoT before it outputs the answer. You see where it thinks and what it thinks about... sort of, but it's quite powerful for the initial iteration of such a feature.
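
For anyone curious how that "visible reasoning" looks outside of o1 (where the reasoning is built in and you only get a summary), the older trick is just to prompt a model to write its reasoning before the answer. A rough sketch using the official `openai` Python client; the model name and prompt wording are only examples, not how o1 itself works internally:

```python
# Plain chain-of-thought prompting: ask for the reasoning first, then the answer,
# so the "thread of thought" is readable before the conclusion.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "A train leaves at 14:10 and arrives at 16:45. How long is the trip?"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[
        {
            "role": "system",
            "content": "Think step by step. Write your reasoning first, "
                       "then a final line starting with 'Answer:'.",
        },
        {"role": "user", "content": question},
    ],
)

text = response.choices[0].message.content
# Split on the last 'Answer:' marker; if the model skips it, everything
# ends up in `answer` and `reasoning` stays empty.
reasoning, _, answer = text.rpartition("Answer:")
print("Chain of thought:\n", reasoning.strip())
print("Final answer:", answer.strip())
```

The difference with o1 is that its reasoning happens as part of the model's own process rather than because the prompt asked for it, and what the user is shown is a cleaned-up summary of that reasoning rather than the raw trace.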