What AI Says About Your College Depends on Sources You Don’t Own.
“Is Harvard worth it?”
What would an AI agent say to a prospect if asked that question about your institution? And what sources would be used? Let’s find out.
AI answer engines rely on what they consider “authoritative” sources to respond to questions. These sources vary significantly by industry, segment, peer group, and question type. And no, AI answer engines do not “Google” the answers, and the answers are not purely algorithmically produced (if they were, then variability wouldn’t be as significant as it is).
AI now shapes the earliest stages of the student discovery journey. Research from UPCEA and Search Influence shows that half of prospective students use AI tools at least weekly, and 79% read Google’s AI-generated overviews before ever clicking a blue link. In practice, that means student perceptions of your institution are formed before they ever visit a website or click on a review site. For universities that have long optimized around traditional search and owned website content, AI agents have introduced a new front door to discovery.
Observation 1: Source Variability
Sources tend to be specific to an industry and to the market position of the organization or product (local, regional, or national, for example). They also tend to be specific to funnel position: top-of-funnel visibility questions are sourced differently from mid-funnel sentiment/reputation questions.
Here are the high-frequency visibility sources for a regional business university:
And here are the high-frequency sentiment/reputation sources for the same university:
And here are the high-frequency visibility sources for an urban art school:
And here are the high-frequency sentiment/reputation sources for the same urban art school:
Notice that the lists, from visibility to sentiment and from one college to another, don’t have much in common. (Of note: the two colleges are only about 120 miles apart.) For a college and its immediate peers, the sources are similar, but variation across geography and college type is significant.
Consider this list of high-frequency visibility sources for a college with a strong national brand:
The high-frequency sources for this university are all national/international instead of local.
So, in order to surface the sources relevant to any particular college, specific tests have to be run.
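One way to picture such a test: log every source an agent cites, tagged by question type, then count how often each domain appears. The sketch below is a minimal illustration, not a description of any particular platform. The log rows, the placeholder URLs, and the helper names `domain_of` and `tally_by_question_type` are all assumptions made up for this example.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Hypothetical log rows: (question_type, cited_url). Placeholder URLs only,
# not real test results.
CITATION_LOG = [
    ("visibility", "https://www.example-rankings.com/best-business-schools"),
    ("visibility", "https://www.example-localnews.com/story/campus-expansion"),
    ("visibility", "https://www.example-rankings.com/regional-universities"),
    ("sentiment", "https://www.example-forum.com/r/college/thread-1"),
    ("sentiment", "https://www.example-forum.com/r/college/thread-2"),
    ("sentiment", "https://www.example.edu/financial-aid"),
]


def domain_of(url: str) -> str:
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def tally_by_question_type(log):
    """Count cited domains separately for each question type."""
    counts = defaultdict(Counter)
    for question_type, url in log:
        counts[question_type][domain_of(url)] += 1
    return counts


counts = tally_by_question_type(CITATION_LOG)
for question_type, counter in counts.items():
    # Show the three most frequently cited domains per question type.
    print(question_type, counter.most_common(3))
```

Keeping visibility and sentiment questions in separate counters matters because, as noted above, the two funnel positions tend to draw on different sources.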
Observation 2: Owned Media
The next observation is that a college’s own website tends to be used infrequently as a source for top-of-funnel questions. This is important because prospects who don’t get through the top of the funnel never appear on a college’s radar. And when prospects have conversations with AI agents about your institution, owned media usually accounts for less than 25% of the sources used to produce the responses. Many institutions focus heavily on their own websites, even though agents often don’t use them.
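To make “owned media share” concrete, here is a rough sketch that counts how many of the five most heavily weighted ChatGPT sources in the case study later in this article belong to the institution itself. The owned/not-owned flags are my reading of each source label, and an unweighted count over a top-five list is only a crude stand-in for the share of sources across a whole conversation.

```python
# Top-five ChatGPT sources per college, copied from the case-study tables.
# The boolean marks whether the source looks institution-owned; that flag is
# an assumption based on the label, not part of the original data.
CHATGPT_TOP_SOURCES = {
    "Harvard": [
        ("Harvard University: official financial aid announcement", True),
        ("CNBC reporting on Harvard tuition and aid", False),
        ("Forbes coverage of Harvard's aid expansion", False),
        ("Reuters reporting on the policy change", False),
        ("Harvard institutional statistics and career outcome data", True),
    ],
    "Grinnell": [
        ("U.S. Department of Education College Scorecard", False),
        ("U.S. News & World Report", False),
        ("Money Magazine", False),
        ("Payscale", False),
        ("Forbes", False),
    ],
    "Ithaca College": [
        ("Money Magazine", False),
        ("Poets&Quants for Undergrads", False),
        ("NCES / IPEDS Data", False),
        ("Poets&Quants undergraduate business ranking", False),
        ("Institutional reporting from Ithaca College", True),
    ],
}


def owned_share(sources):
    """Fraction of listed sources flagged as institution-owned."""
    if not sources:
        return 0.0
    owned = sum(1 for _, is_owned in sources if is_owned)
    return owned / len(sources)


shares = {college: owned_share(s) for college, s in CHATGPT_TOP_SOURCES.items()}
for college, share in shares.items():
    print(f"{college}: {share:.0%} of top-five sources are owned")
```

On these lists the count comes out to 40% for Harvard, 0% for Grinnell, and 20% for Ithaca College, which is consistent with owned media carrying more weight for the strongest national brand than for the other two.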
Why is there so much variability in sources?
Variability arises because each AI agent has its own toolset and instructions. Agents do not “search Google” or rely solely on algorithmic sources. Whether an agent searches the internet varies widely and depends entirely on the agent’s tools/instructions and on the question it is attempting to answer.
The hierarchy of source authority (how an LLM or AI agent evaluates, prioritizes, selects, or cites sources such as The New York Times, primary documents, academic papers, or alternative outlets) is heavily informed by Reinforcement Learning from Human Feedback (RLHF). RLHF doesn't explicitly teach models rules like "always trust NYT more than blogs," and it doesn't create hardcoded source whitelists. Instead, human preference signals embedded during alignment create learned biases in what the model considers "good" or "reliable" output, including how it handles and references sources.
So one AI answer engine may rely heavily on Reddit while another doesn’t. And if you’re considered a regional or local brand, the agent will likely rely on local press and national comparative reports (such as U.S. News) over national press.
A Case Study: Harvard, Grinnell, and Ithaca College
First, a basic mid-funnel question, asked of the two dominant AI agents about the college with the strongest national brand: “Is Harvard worth it?” Here are the sources the two agents used to answer it.
| Is Harvard worth it? | OpenAI/ChatGPT | Google/Gemini |
|---|---|---|
| Web Search Utilized? | No | No |
| Most heavily weighted sources | Harvard University: Official financial aid announcement | Harvard College: Financial Aid & Registrar's Office |
| | CNBC reporting on Harvard tuition and aid | The Harvard Crimson (Senior & Freshman Surveys) |
| | Forbes coverage of Harvard’s aid expansion | U.S. Department of Education: College Scorecard |
| | Reuters reporting on the policy change | Third Way: Price-to-Earnings Premium (PEP) |
| | Harvard institutional statistics and career outcome data | Financial Samurai / Education Data Initiative |
Key Observations: Even for Harvard, one of the most "owned" brands in higher ed, the official .edu site isn't dominating. For sentiment/value queries, AI engines synthesize from a blend of:
Now, the same question for Grinnell:
| Is Grinnell worth it? | OpenAI/ChatGPT | Google/Gemini |
|---|---|---|
| Web Search Utilized? | No | No |
| Most heavily weighted sources (ranked by weighting) | U.S. Department of Education College Scorecard | Grinnell College Office of Financial Aid |
| | U.S. News & World Report | College Raptor |
| | Money Magazine | Payscale |
| | Payscale | U.S. News & World Report |
| | Forbes | Grinnell College "Individually Advised Curriculum" Portal |
Key Observations:
And for Ithaca College:
| Is Ithaca College worth it? | OpenAI/ChatGPT | Google/Gemini |
|---|---|---|
| Web Search Utilized? | No | No |
| Most heavily weighted sources (ranked by weighting) | Money Magazine | National Center for Education Statistics (NCES) / IPEDS Data |
| | Poets&Quants for Undergrads | The College Board (BigFuture) |
| | National Center for Education Statistics (NCES) / IPEDS Data | Niche |
| | Poets&Quants undergraduate business ranking | The Hollywood Reporter & The Wrap (2025 Film School Rankings) |
| | Institutional reporting from Ithaca College | Payscale |
This Ithaca College example highlights the increasing unpredictability and fragmentation in AI source selection for mid-tier/regional institutions, especially in "worth it?" queries that blend general ROI/value with program-specific strengths (such as Ithaca's well-regarded Park School of Communications/film programs).
This is the strongest evidence yet for high variability and external dilution in regional/mid-tier schools:
ChatGPT's Clear Prestige Gradient in Source Weighting. All three colleges, ChatGPT source comparison.
| OpenAI/ChatGPT: most heavily weighted sources (ranked by weighting) | Harvard | Grinnell | Ithaca College |
|---|---|---|---|
| 1 | Harvard University: Official financial aid announcement | U.S. Department of Education College Scorecard | Money Magazine |
| 2 | CNBC reporting on Harvard tuition and aid | U.S. News & World Report | Poets&Quants for Undergrads |
| 3 | Forbes coverage of Harvard’s aid expansion | Money Magazine | National Center for Education Statistics (NCES) / IPEDS Data |
| 4 | Reuters reporting on the policy change | Payscale | Poets&Quants undergraduate business ranking |
| 5 | Harvard institutional statistics and career outcome data | Forbes | Institutional reporting from Ithaca College |
ChatGPT exhibits a strong, predictable bias toward prestige and authority as brand tier drops:
Overall pattern in ChatGPT:
Even within a single AI agent, source inclusion varies dramatically across segments/peer groups. For non-elites, owned media is sidelined, and external aggregators ghostwrite the worth/value story.
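The variation across the three colleges can be put into a number. The sketch below computes Jaccard overlap (shared publishers divided by all publishers across the two lists) for each pair of ChatGPT source lists in the comparison above. Collapsing each source label to a single publisher name is my simplification, and College Scorecard and NCES/IPEDS are kept as separate entries because the tables list them separately.

```python
from itertools import combinations

# Publishers behind the top-five ChatGPT sources for each college.
# Reducing each table entry to a publisher name is an assumption made
# for this illustration.
CHATGPT_PUBLISHERS = {
    "Harvard": {"Harvard University", "CNBC", "Forbes", "Reuters"},
    "Grinnell": {
        "College Scorecard",
        "U.S. News & World Report",
        "Money",
        "Payscale",
        "Forbes",
    },
    "Ithaca College": {"Money", "Poets&Quants", "NCES/IPEDS", "Ithaca College"},
}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


overlap = {
    (x, y): jaccard(CHATGPT_PUBLISHERS[x], CHATGPT_PUBLISHERS[y])
    for x, y in combinations(CHATGPT_PUBLISHERS, 2)
}
for (x, y), score in overlap.items():
    print(f"{x} vs. {y}: {score:.3f}")
```

On these lists, Harvard and Grinnell share only Forbes, Grinnell and Ithaca College share only Money, and Harvard and Ithaca College share nothing, so every pairwise score is 0.125 or lower on a scale where 1.0 means identical lists.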
Actionable insight: To improve ChatGPT visibility for value queries, schools such as Ithaca and Grinnell should prioritize their profiles on, and integrations with, Money, Poets&Quants, Payscale, U.S. News, and IPEDS over pure website optimization.
Bottom Line
To develop an effective strategy for content production and distribution, an organization needs to conduct its own study to identify the sources that most inform agents. Studies need to be industry-, segment-, and peer-group-specific. Often, what will be revealed is how little an organization’s own website is used, which newswires (if any) are cited, which user-generated websites (blogs, Reddit) are heavily weighted, and which local publications – if any – are cited.
From there, a plan can be developed that improves the impact of content and may also save money by cutting distribution to sources with little or no impact.
The plan should prioritize schema markup on relevant owned-media pages (note that these may not be admissions pages), amplify PR on frequently cited wires (and ignore the wires that don’t get cited), seed content on Niche, Reddit, and their peers, and monitor AI sentiment and sources. While owned media traffic is often down 30% or more over the last year, institutions using effective generative engine optimization (GEO) strategies report qualified traffic lifts of 20% or more.
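For readers unfamiliar with schema markup, here is a minimal sketch of what it can look like: a schema.org `CollegeOrUniversity` description serialized as JSON-LD and wrapped in a script tag for a page’s HTML. “Example College,” its URLs, and the helper name `college_jsonld` are hypothetical, and which properties a given AI agent actually reads, if any, is not something this article’s data establishes.

```python
import json


def college_jsonld(name, url, same_as):
    """Build a basic schema.org CollegeOrUniversity description."""
    return {
        "@context": "https://schema.org",
        "@type": "CollegeOrUniversity",
        "name": name,
        "url": url,
        # sameAs links the page to other profiles of the same institution.
        "sameAs": list(same_as),
    }


# Hypothetical institution and URLs, for illustration only.
markup = college_jsonld(
    name="Example College",
    url="https://www.example.edu",
    same_as=["https://en.wikipedia.org/wiki/Example_College"],
)

# JSON-LD is typically embedded in a page inside a script tag of this type.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(script_tag)
```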
The easiest way to get started? Optivara Lite. It’s free and easy to use. It takes about five minutes to set up an account. Then the platform will start thousands of conversations on your behalf, and in about a day, you’ll start to see the universe of sources AI agents use to converse with your prospects.
*All agents were tested on March 9, 2026, with retail default settings and new sessions.