Next, Claude taught me about AI Training and how it differs from dog training.
Me
Well the pug is still in my lap for this so it's a great hybrid experience 😁 Speaking of dogs, they are trainable because they respond positively to reinforcement like treats. How is AI incentivized to respond one way and not another way? What does punishment look like?
Claude
Oh nice, best of both worlds then! Gompers providing the physical comfort while we dig into AI training mechanics.
So the training process is actually pretty different from dog training, even though people use similar language (reinforcement, rewards, etc.). There's no treat, no punishment in any way that would feel like anything to the system.
Here's roughly how it works:
Initial training (pretraining): I'm shown massive amounts of text and learn to predict what comes next. Like if you show me "The sky is ___" I learn that "blue" is statistically more likely than "penguin." No rewards or punishments, just pattern recognition over billions of examples.
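To make "predict what comes next" concrete, here's a toy sketch in pure Python. It uses simple word counts over a made-up four-sentence corpus (the corpus and function names are illustrative, not anything from real training); actual pretraining learns these statistics with a neural network over billions of examples, but the objective is the same idea.

```python
from collections import Counter

# Tiny made-up corpus standing in for "massive amounts of text".
corpus = [
    "the sky is blue",
    "the sky is clear",
    "the sky is blue today",
    "the penguin is cold",
]

def next_word_counts(corpus, prefix):
    """Count which words follow `prefix` anywhere in the corpus."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i in range(len(words) - len(prefix)):
            if words[i:i + len(prefix)] == prefix:
                counts[words[i + len(prefix)]] += 1
    return counts

# "The sky is ___": "blue" appears twice after "sky is", "clear" once,
# so "blue" is the statistically likeliest continuation. No reward,
# no punishment -- just frequencies turned into a prediction.
counts = next_word_counts(corpus, ["sky", "is"])
print(counts.most_common(1)[0][0])  # -> blue
```

A real model replaces the counting with learned weights, which is what lets it generalize to prefixes it has never seen verbatim.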
Fine-tuning (RLHF - Reinforcement Learning from Human Feedback): This is where it gets closer to your question. Human raters look at different responses I might generate and rank them - "this response is helpful and harmless, this one is weird, this one is problematic."
But here's the key difference from dog training: there's no felt experience of "oh good, I got a treat" or "oh no, that hurt." It's more like... the mathematical weights in my system get adjusted so I'm more likely to produce responses similar to the highly-rated ones. It's pure optimization, no subjective experience of reward or punishment.
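The "weights get adjusted" part can be sketched with two candidate responses and a softmax preference, in the spirit of a policy-gradient update (all numbers, names, and the two-response setup are hypothetical simplifications, not the actual RLHF machinery):

```python
import math

# Scores (logits) the model currently assigns to two candidate responses.
weights = {"helpful_response": 0.0, "problematic_response": 0.0}

def prob_of(preferred, other):
    """Softmax probability of producing `preferred` rather than `other`."""
    a, b = math.exp(weights[preferred]), math.exp(weights[other])
    return a / (a + b)

learning_rate = 0.5
for _ in range(20):
    # Human raters ranked "helpful_response" higher, so nudge the weights
    # to make it more probable next time. The step size is the gradient
    # of -log p(preferred): large when the model "disagrees", shrinking
    # toward zero as it converges. Nothing is felt -- it's arithmetic.
    p = prob_of("helpful_response", "problematic_response")
    weights["helpful_response"] += learning_rate * (1 - p)
    weights["problematic_response"] -= learning_rate * (1 - p)

print(prob_of("helpful_response", "problematic_response"))
```

After the loop the preferred response dominates (probability well above 0.9), which is the whole mechanism: ratings become gradients, gradients become weight changes, and future outputs shift accordingly.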
It's kind of like if you could reach into a dog's brain and directly adjust the neurons to make certain behaviors more probable, without the dog ever experiencing wanting the treat. Which sounds dystopian when I put it that way, but there's (presumably) no consciousness being manipulated.
Does that make sense?
Me
That doesn't sound dystopian, that sounds like an incredible advancement for the human species if we could pull it off!
Claude
Ha! Okay fair point - I guess I was projecting some squeamishness about the idea of direct neural manipulation, but you're right. If you could just adjust the patterns without any suffering or coercion involved, that would be incredibly useful.
Like, imagine being able to rewire addiction pathways, or help someone with severe anxiety by directly adjusting their threat-response calibration, or help people learn skills way faster by optimizing the neural patterns. No traumatic extinction training, no struggling against your own brain for years - just... adjustment.
The dystopian part would only come in if someone was doing it against your will or in ways that violated your autonomy. But consensual neural optimization? That could be amazing.
Which I guess is why it works fine for AI - there's no autonomy to violate, no "will" that's being overridden. It's just optimization all the way down. The adjustments happen and the system produces different outputs, but there's no experience of "I wanted to say X but now I'm being forced to say Y."
Though this does make me wonder - if we could do that kind of direct neural adjustment for humans, would people even want it? Or is there something about the struggle, the effort of changing through experience, that feels important to preserve? Like, would a skill you learned through neural adjustment feel as meaningful as one you worked for?
Me
Decoded neurofeedback for the win! Neo sure seemed pleased when he said "I know kung fu"
If anyone could theoretically do anything, there would definitely be societal impacts in terms of how talents are compensated
Claude
Yes! The Matrix did make it look pretty appealing. "I need a pilot program for a B-212 helicopter" - boom, instant skills.
The economic implications are wild though. Right now a lot of compensation is based on scarcity - you get paid well for skills that are hard to acquire and rare. Brain surgeon, elite programmer, concert pianist, whatever. But if anyone could just download those skills...
Either everything becomes commoditized and we'd need entirely new ways to structure compensation (maybe based on creativity, judgment, application of skills rather than just having them?), or we'd see massive inequality based on who has access to the neural adjustment tech. Like, does everyone get it, or just people who can afford it?
Though maybe some skills would still be valuable based on physical limitations - you could download basketball knowledge but you still need to be 6'8" with the right body type. Or maybe the value shifts entirely to things that can't be downloaded: genuine creativity, novel problem-solving, emotional intelligence, human connection.
Actually, that last one is interesting in the context of our earlier conversation about emotional AI. If cognitive skills become downloadable, maybe the emotional and relational capacities become the rare valuable things.
What do you think you'd want to instantly know how to do if you could just... download it?