When was the last time you asked an AI system a health question? If it was recent, you are in good company. Over 230 million people globally ask ChatGPT health and wellness questions every week.

I had the good fortune of joining Katelyn Jetelina of Your Local Epidemiologist and Dr. Brian Anderson of the Coalition for Health AI for a recent webinar on exactly this topic. The questions from the audience were so good that I wanted to write up a guide to share with all of you.

At Dewey, we build AI systems for experts and publishers, but we're also people with the same AI tools on our phones and the same personal health questions as everyone else. Understanding how these tools actually work is the secret to using them in a safer, smarter way. So let's get started.

What's happening under the hood?

An LLM, or large language model, is the engine running under the hood of every major AI product you're using today: ChatGPT, Claude, Gemini. They're all different in subtle ways, but they share the same core mechanism: prediction.

Given everything that came before, what word is most likely to come next?

Take the sentence "I asked my doctor about my..." The model predicts "symptoms." Not because it looked it up or consulted a source. Because it's been trained on so much human language that it knows what word tends to follow that particular sequence. That's the trick, at an almost incomprehensible scale.
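If you're curious what that looks like in practice, here's a minimal sketch using the small open-source GPT-2 model as a stand-in. The models behind ChatGPT and Claude are vastly larger and their weights aren't public, but the core mechanism is the same:

```python
# A minimal sketch of next-word prediction. GPT-2 is a small open model
# standing in for the far larger models behind ChatGPT and Claude.
# Requires: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("I asked my doctor about my", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# The model assigns a probability to every token in its vocabulary
# as a candidate for the next word. Show the five most likely.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {prob.item():.1%}")
```

No lookup, no source, no understanding: just a probability for every possible next word, learned from patterns in the training data.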

These models have ingested more text than any human could read in a lifetime, and the outputs can be remarkable. But they've also been designed to produce confident, plausible-sounding output, not to know when they should say "I don't know." That distinction matters everywhere. In health, it matters most.

How does an AI "know" things?

There are three broad ways an AI tool can access information, which form the backbone of what it "knows."

The first is what researchers call parametric knowledge. This includes everything baked in during the training process. As a result, it has a clear cutoff date. You can think of this as the word soup powering the prediction engine we mentioned above. It includes trillions of words from every source you can imagine, ranging from top medical journals to all of Reddit. The training data is not publicly disclosed by the major providers, so you never know exactly what is being referenced.

The second is web lookup, which is increasingly becoming the default mode. When you give an LLM web access, it does a search, not unlike a Google search, to pull relevant content from the web before answering your question. As a result, you can get highly up-to-date information and generally more dynamic answers. Challenges still abound, though, as the highest-ranking result isn't always reliable, and much of the highest-quality information is behind a paywall that the LLM can't access. On the plus side, when in this mode, LLMs provide references, visible as small icons below the answer that link to the live pages.

Samples of how Claude & ChatGPT show they are using web sources

The third is RAG, Retrieval-Augmented Generation. You'll encounter this primarily in custom and enterprise-grade solutions. With RAG, the LLM is directed to retrieve information from a specific, curated set of documents. This could include a specific website, a set of books, or a body of research. In a HIPAA-compliant environment, it could also include a patient's full medical records. Because of these constraints, RAG systems are considered the most reliable and are much less likely to hallucinate. Like all of these tools, though, they can still make mistakes, and they're only as good as the curated documents they draw from.
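To make the pattern concrete, here's a heavily simplified sketch of the retrieval step. Everything in it is illustrative: production RAG systems typically use neural embeddings and a vector database rather than the simple keyword scoring shown here.

```python
# A heavily simplified sketch of RAG's retrieval step. TF-IDF keyword
# scoring stands in for the neural embeddings and vector databases that
# production systems use. Requires: pip install scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# The curated document set (illustrative snippets, not real guidance).
documents = [
    "Glaucoma treatment overview: laser therapy and surgical options ...",
    "Patient handout: how to administer eyedrops safely ...",
    "Clinic policy: appointment scheduling and referrals ...",
]

question = "What are the alternatives to eyedrops for treating glaucoma?"

# Score every document against the question and keep the best match.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
scores = cosine_similarity(vectorizer.transform([question]), doc_vectors)[0]
best_doc = documents[scores.argmax()]

# The retrieved text is placed in front of the question, so the model
# answers from the curated source rather than its training-data word soup.
prompt = f"Answer using only this source:\n{best_doc}\n\nQuestion: {question}"
print(prompt)
```

The reliability comes from that last step: the model is constrained to answer from documents someone vetted, rather than from whatever it absorbed during training.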

Why AI for health is both powerful and risky

The stories are real. People have found diagnoses for mystery symptoms, and others have been told to go to the ER with what turned out to be life-threatening conditions. Having a tool that has read much of human knowledge to date, plus up-to-date access to the web, is something close to a superpower in your pocket.

But that power comes with real risks, and in health the stakes of getting something wrong are far higher than in most other domains. A bad restaurant recommendation costs you a meal. A wrong answer about a drug interaction could cost you much more.

At the moment, most of the responsibility to use AI well lies with you as the user. Here's what to watch for, and how to protect yourself.

What should I watch out for?

There are two categories of failure worth understanding: how the model generates responses, and the structural limits of the system itself.

On how the model responds:

An example of sycophancy in practice

Sycophancy is one of the trickiest and most critical risks for medical questions. In short, these tools are designed to be agreeable. If your question implies an answer, they tend to confirm it. If you push back, they tend to fold.

In health, this maps directly onto confirmation bias, which is how people end up more certain of a wrong self-diagnosis than when they started. If you type "I think this rash is chickenpox, is it?" you are much more likely to get a chickenpox answer than if you type "What are the ten most likely causes of a rash with these characteristics?" The model isn't reasoning through your symptoms. It's reflecting your framing back at you.

A mocked-up example of a hallucination in an otherwise accurate answer

Hallucinations are a close second. While they have become less frequent, the ones that remain are harder to detect. Today's hallucinations are often very subtle: one quietly wrong detail buried inside an otherwise accurate answer. The error could be a medication dosage that's slightly off, or a fact inverted in a way that meaningfully changes the result. These are the answers you're most likely to trust and least likely to check, because they sounded right.

On the structural limits of the system:

Source quality. With parametric knowledge, we can't know what information is being referenced, so a core claim could be based on a ten-year-old forum post. With web lookup, you still risk pulling in a source with critical misinformation that the model treats as fact.

Missing context. When you go to your doctor, they have your full medical history and ask follow-up questions to fill in any gaps. When you ask Claude about alternatives to eyedrops for treating glaucoma, you're getting the textbook, generic version of the answer. The model lacks your specific context and doesn't know to adapt to your treatment history or past experiences. That's easy to forget in the moment.

Privacy. Consumer versions of the major LLMs, whether on a paid or free tier, are not HIPAA-compliant. They transmit your information over the internet, which means there is real risk of hacks, breaches, and other leaks. If you wouldn't put it in an email, don't put it in a chatbot.

AI's current superpowers in health

There's a principle that holds across almost every safe use of AI for consumer health: use it as a starting point, not a conclusion.

With that in mind, here's where these tools genuinely shine.

Plain language translation of jargon. If your doctor sent you something you don't understand, or you're trying to parse a research study, AI is remarkably good at making it readable. You're giving it a fixed text to work from, which limits its ability to invent things. One of the lower-risk uses, and one that helps make us all more informed.

Brainstorming possibilities. "What are the most common causes of a sore throat for a kid?" With this question, you're opening doors. Maybe you had a common cold in mind, but the LLM mentions strep and that inspires you to book a pediatrician appointment. The AI is providing a jumping-off point for further research or action, not a diagnosis.

Appointment preparation. "I have an appointment on Monday. Here's what's been going on. What questions should I come prepared to ask?" You're not asking it to diagnose anything. You're asking it to help you be a better participant in a conversation with an actual clinician. This is a powerful way to advocate for yourself.

The pattern across all of these: you're using the model to help you think, not replacing your own critical thinking or the judgment of a trained professional.

Practical guidance

Use a paid plan. As a free user, you are paying for access with your data: your conversations are used to train the underlying models by default. At $20 per month for both ChatGPT Plus and Claude Pro as of March 2026, a paid plan is worth it. Paid tiers unlock the most recent models, which are often meaningfully smarter than older generations, and give you access to new features as they're released.

Privacy options for ChatGPT & Claude

Turn off the training toggle yourself. Even on paid plans, both ChatGPT and Claude default to using your conversations for model improvement unless you opt out. On ChatGPT, look for Data Controls. On Claude, it's under Privacy Settings. For Claude, opting out means your data is deleted within 30 days rather than retained for up to five years. For ChatGPT, opting out stops your conversations from being used to train future models. Either way, it's worth doing before your next health-related conversation.

Use incognito mode for sensitive questions. In this mode, conversations aren't saved to your history and aren't used for training. ChatGPT calls it Temporary Chat. Claude calls it Incognito Chat. I use this most of the time for questions about my kids or anything I don't want tied to my broader usage history.

Frame questions openly. "Tell me about the safety profile of Wegovy" gets you more than "Is Wegovy safe?" because it doesn't lead the model toward a conclusion. The more openly you frame your question, the more balanced the response.

Develop a reusable opening prompt. For deeper medical research, you can start the conversation with a short setup prompt that guides how the system responds and improves answer quality. Here's one to start from: "Use the most up-to-date research available to you, with citations where possible. Always present a balanced view, including pros, cons, and any areas of genuine uncertainty."
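If you go one step further and use these models through their developer APIs, the same opening prompt maps onto the system message. Here's a minimal sketch with the OpenAI Python SDK; the model name and the example question are placeholders, not recommendations:

```python
# A minimal sketch of baking a reusable opening prompt into the system
# message via the OpenAI Python SDK. The model name is a placeholder.
# Requires: pip install openai, and an OPENAI_API_KEY in your environment.
from openai import OpenAI

client = OpenAI()

OPENING_PROMPT = (
    "Use the most up-to-date research available to you, with citations "
    "where possible. Always present a balanced view, including pros, "
    "cons, and any areas of genuine uncertainty."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; swap in whichever current model you use
    messages=[
        {"role": "system", "content": OPENING_PROMPT},
        {"role": "user", "content": "Tell me about the safety profile of Wegovy."},
    ],
)
print(response.choices[0].message.content)
```

Inside the apps themselves, features like ChatGPT's Custom Instructions offer a similar lever, applying your setup prompt to every new conversation.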

Push back on the first answer. Even the best questions can carry hidden bias. Once you've gotten an initial answer, ask what it might be missing, ask for the research behind the claim, or ask it to steelman the other side. These follow-up questions expand what you see and give you a more balanced picture to build your own conclusions from.

AI won't replace your doctor, and it doesn't need to. Used thoughtfully, it can make you a more prepared, more informed participant in your own healthcare. That's worth a lot.
