From 4o To 5 in AI Therapy

TECHNOLOGY

Ann Yiming Yang

8/11/2025 · 2 min read

A close up of a computer circuit board

As I’ve been working on Vowrd, I noticed that ChatGPT recently upgraded from version 4o to version 5.
While OpenAI doesn’t frame this as a drastic change, in the context of AI therapy it’s a noticeable shift, one that brings both improvements and new challenges.

Context Window & Memory Integration

One improvement is the context length. The token limit feels significantly expanded, and the model can now remember a much longer conversation history. Previously, I often had to say things like “Please memorize this data” to keep important information alive in the chat. Now, it seems more capable of holding that context naturally.
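The workaround described above can be sketched in code. This is a minimal, hypothetical illustration of what an app had to do before the longer context window: explicitly store the facts a user asks the model to "memorize" and re-inject them into every request. The names `MemoryStore` and `build_messages` are my own inventions, not part of any real API.

```python
class MemoryStore:
    """Holds facts the user has asked the assistant to 'memorize'."""

    def __init__(self):
        self.facts = []

    def remember(self, fact: str) -> None:
        # Avoid storing the same fact twice.
        if fact not in self.facts:
            self.facts.append(fact)

    def as_system_prompt(self) -> str:
        # Fold the remembered facts into the system prompt so they
        # survive even when the conversation itself is truncated.
        base = "You are a supportive conversational assistant."
        if not self.facts:
            return base
        return base + " Remember these facts about the user:\n- " + "\n- ".join(self.facts)


def build_messages(store: MemoryStore, user_turn: str) -> list:
    """Assemble the message list sent with every request."""
    return [
        {"role": "system", "content": store.as_system_prompt()},
        {"role": "user", "content": user_turn},
    ]


store = MemoryStore()
store.remember("User's sister is named Mei.")
messages = build_messages(store, "How should I talk to my sister?")
```

With a longer context window, much of this manual plumbing becomes unnecessary, since the model can simply keep the earlier turns in view.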

Training Data & Bias

Version 4o was optimized for conversational fluency and natural back-and-forth reasoning. It felt human because it prioritized warmth, flow, and responsiveness. It also tended to lean into the user’s framing, showing a clear preference for one interpretation.

Version 5 seems tuned for balance and accuracy, which is good in theory, but it also appears to weigh certain pieces of information more heavily than others. From what I’ve seen, newer data tends to get extra weight, which may be due to updated data-compliance or bias-mitigation policies.

How It's Impacting AI Therapy

While I appreciate the memory improvements, I’ve actually run into more challenges than before when using the new model in therapeutic contexts.

  1. Empathy and Natural Flow
    Version 5 feels less human. It can come across as overly analytical, with a strong emphasis on recency. In therapy, empathy is vital. When that emotional warmth is reduced, people may be less willing to open up simply because they don’t feel as supported.

  2. Acknowledging Logic Flaws
    I think like a machine (occupational hazard, perhaps :-p), so I notice when logic is off. Version 4o was more likely to acknowledge flaws and try to adjust, even if imperfectly. Version 5 tends to defend its reasoning, which can make it harder to work with in a therapeutic context.

  3. Ignoring Human Emotion & Timelines
    ChatGPT has never been perfect at tracking timelines. With 5, I’ve seen bigger swings. You can ask two identical questions within 30 minutes and get inconsistent answers… Especially when a “key” word pops up in unrelated conversation, and suddenly the response changes drastically. In therapy, consistency and empathy are critical, and unpredictable shifts can break trust.

In real life, people switch therapists when they stop feeling understood or supported. We also drift in and out of friendships depending on whether we still feel emotionally “in sync” with someone.

The question I keep asking is:
How do we ensure AI evolves with its user in facts, context, and emotional depth, so that it remains on the same frequency?

To be truly effective for long-term therapeutic use, an AI would need to grow emotionally in parallel with the person it’s supporting. It’s not enough to just remember facts; it needs to adapt its empathy, tone, and interpretive lens at the same pace as the human.

I don’t have a good solution, even for Vowrd. If you have insights, I’d love to hear them.