Writing · Economics & measurement
The "This Changes Everything" Index.
I tried to measure the AI hype cycle in Google Trends, and the instrument kept breaking. The way it broke was the finding: a measurement goes blind in precisely the direction the technology is moving, and the blind spot itself is the reading.
The premise
An argument with myself.
I built an index this year, mostly to settle an argument with myself. Every few weeks for two years the AI discourse has announced, with total confidence and a straight face, that everything just changed. I wanted to know whether everything actually kept changing or whether I was only marinating in the announcements, so I started counting them. Month by month, how loud was the "this changes everything" drumbeat, and against what.
What came out was a shape anyone who has watched a technology cycle will recognize on sight. A first act of pure excitement through 2023: GPT-4, the Bing-versus-Google search wars, the sense that the floor had dropped out from under the old way of doing nearly everything. A second act of disillusionment through 2024, the slop era, the uncanny six-fingered hands and the flopped hardware gadgets and the Goldman Sachs note asking whether any of the capital expenditure would ever earn its way back. Then a third act of resurgence through 2025, quieter than the first, as the agentic tooling started actually working and the people who build things began rearranging their work around it.
That shape is not my discovery, and I want to say so plainly before I go any further, because the whole point of this essay turns on being honest about what is mine and what is borrowed. It is the Gartner Hype Cycle, the peak of inflated expectations and the trough of disillusionment and the slope of enlightenment, drawn ten thousand times since the nineties and fitted to AI by every analyst with a slide deck. I was not revealing a pattern. I was a practitioner trying to replace a vibe with something I could point at, which is a smaller and more useful ambition. I track this cycle the way I track weather: to read direction before I make a decision, not to claim I discovered the front coming through.
So I went looking for a harder version of my index in Google Trends. The plan was simple. Pull the search interest for the major models and the major moments, line it up against my hand-counted hype, and see whether the public's curiosity actually followed the three acts. The plan did not survive contact with the data, and the ways it failed turned out to be far more interesting than the chart I set out to draw.
Five things broke
The instrument keeps breaking.
The first thing that broke was the baseline. I reached for "ChatGPT" as the obvious proxy for AI interest, and it swallowed everything else whole. Not because OpenAI's product is that dominant, though it is dominant, but because "ChatGPT" has quietly become the word people use when they mean AI at all, the way "Google" became the verb for search and "Kleenex" became the noun for tissue. A large share of the people typing it are not asking about a specific product; they are pointing at the whole category with the one name they own. So the "ChatGPT" curve is not a measurement of OpenAI's mindshare. It is an ambient hum of general AI awareness, and it sits under every other signal loud enough to drown all of them at once. To see anything else, I had to stop treating it as a line on the chart and start treating it as the noise floor, and subtract it out.
The second thing that broke was comparability. Google keeps clean, distinct entities for the OpenAI versions, so "GPT-4" and "GPT-4o" and "GPT-5" each read cleanly. It does not keep them for Claude or Gemini; there is no tidy "Claude 4.8" entity, only the literal string, which catches a fraction of the real interest and bleeds into unrelated substrings besides. The instant I put the two together I was comparing a calibrated gauge against a bent one, and the bent one always read low. Anthropic's models looked like a rounding error beside OpenAI's, and that gap was an artifact of how Google chose to catalog them, not a fact about the world.
The third thing, once I had a roughly comparable scale, was how small the serious signal actually is. The people who search a specific model number, "GPT-5" or "Claude 4.8" or "Gemini 3," the ones tracking what actually shipped rather than what the category is called, are a minority of a minority against the generic hum. At its strongest, the entire model-number field, every version of every model summed together, ran several times smaller than the bare "ChatGPT" stream, and against any single flagship term the gap widened toward twenty-fold, so the audience that knows the difference between releases is a rounding error against the audience that knows only the brand. I assumed that was a measurement failure, but it turned out to be the most honest thing the chart had to say, and I will come back to it.
The fourth thing that broke is the one that reorganized how I think about the whole exercise. Google Trends measures Google search. That reads as a tautology until you ask who is not on Google for this particular question, and the answer is two groups, and they are exactly the two the chart was burying. The developers who live closest to these models do their research inside the tool, the docs, the repositories, and the model itself; they ask Claude about Claude. Measured by usage on OpenRouter, where developers actually route their model traffic, Claude served the majority of programming tokens through most of 2025 and ranked first in developer spend, a thunderous signal that is nearly silent in Google search, because the people generating it never typed the question into a search bar. The Chinese models carry the same problem at national scale: Google holds on the order of three percent of search in mainland China, where Baidu dominates, so the domestic interest in DeepSeek or Qwen is invisible to the instrument almost by construction. The chart was undercounting these two not by accident or by noise but because of who they are and where they go, which is a structural blindness rather than a measurement error.
Which points straight at the fifth thing, the one that comes with a name and a number already attached. The reason the instrument is going blind is that the territory moved underneath it. Ordinary information-seeking has been migrating off the search box and onto the models themselves for two years; Gartner has forecast traditional search volume falling roughly a quarter by 2026 as answer-engines absorb the queries, and the spread of AI Overviews and zero-click results points the same direction. People who used to Google a question now ask a model and never touch a search engine on the way. This, too, is not mine. It is a well-covered shift that analysts have been charting for a while, and I am borrowing it rather than pretending to break it.
The reflexive turn
The blind spot is the measurement.
Borrowing it is what closes the loop, because here is what the whole frustrating exercise actually taught me, and it is the only part I would call my own. The undercount is not noise to be scrubbed out of the chart, because the undercount is the chart, since the instrument is going blind in precisely the direction the technology is moving, which means the blind spot itself is a reading: the harder a population is to see in Google Trends, the more completely that population has already left for the new way of working. The developers and the Chinese users are faint not because they are few but because they got there first. I set out to measure the migration to AI, and the cleanest evidence of the migration was that my instrument for measuring it had stopped working in exactly the places the migration had already finished. You cannot measure this shift with the old ruler, because the shift is what is bending the ruler.
I did eventually force a defensible picture out of it, and the figure is worth keeping for what it shows once the corrections are applied: the generic baseline removed, the developer and Chinese signals lifted back toward where the platform data says they belong, the model field shown for the meaningful, growing minority it is rather than the rounding error the raw chart made it look like. But I hold that corrected chart loosely, and I label it as what it is, an estimate stitched together from documented adjustments rather than a clean measurement. The honest output of the whole effort was never the chart but the realization about the ruler that the chart had forced on me.
The work
This is not a story about AI.
I want to step out of the hype index here, because this was never really a story about AI, and it is not a new story even as a story about measurement, since it is the one I spend my professional life inside.
In security data work the instrument is the telemetry, and the standing temptation is to mistake what the instrument can see for what is there. You measure the threat by what lands in the SIEM, and you quietly conclude the threat looks like your log sources. You measure tool adoption by who answered the survey, and you conclude the field looks like the kind of people who answer surveys. You measure detection coverage by the rules you wrote, and the attacks you have no rule for become, by definition, attacks that are not happening. Every one of these is the Google Trends problem wearing a different uniform: a signal that is only ever as good as the platform it was collected on, and a platform that silently encodes who is in the audience and who never shows up at all.
The discipline that follows is not "find the perfect instrument," because there is no such thing. It is to hold two questions about every signal at once: what is this actually measuring, and who or what is structurally invisible to it? And then the harder move, the one the hype index forced on me, to read the blind spot as data in its own right, because a population you cannot see is telling you something by being unseeable. The threat actor who never trips your telemetry is not absent so much as operating where your telemetry is not pointed, and the serious AI audience you cannot find in Google search is not small so much as somewhere a search index cannot follow, so the blind spot has a shape, and that shape is information if you are willing to read it instead of scrubbing it.
This is the part of my work that does not photograph well, the unglamorous insistence that a number is a claim about an instrument before it is a claim about the world. It is also the part that earns its keep. The most expensive mistakes I have watched security teams make were not made on bad data. They were made on good data from an instrument nobody had asked the second question about, data that was accurate and complete about a world slightly narrower than the one the team actually lived in.
The payoff
What the index actually measured.
So the "This Changes Everything" index never did settle my argument about whether everything keeps changing. It did something more useful by accident, because it measured, precisely, the one thing it was built to detect, which was never the hype at all but the moment the old way of seeing stopped being able to see the new way of working. The most reliable signal that a shift is real may be that the instruments you would use to detect it have begun to fail in its direction, so when the ruler bends, the bend is the reading, and it is worth paying attention to which way it bends, because that is where everything actually changed.
The rest of the writing works the same seam: the security-data layer underneath the AI conversation, where a number is a claim about an instrument before it is a claim about the world.
Back to writing →