correlation is not causation
don't miss spurious scholar
where each of these is an academic paper
GenAI's made-up explanation: As more people tuned in to watch The Big Bang Theory, they couldn't help but be influenced by all the talk of theoretical physics and geeky romance. With all that intellectual and romantic stimulation, it's no wonder there was a sudden surge in people trying to figure out the mechanics of baby-making!

GenAI's made-up explanation: Will Smith's on-screen charisma and megawatt smile evidently sparked a surge in motivation among the people of Kosovo. As they tuned in to watch his performances, they couldn't help but feel charged up and full of energy. It's as if his blockbuster presence somehow electrified the nation, leading to a rewiring of productivity and a fresh power play in the realm of electricity generation. It's a shocking connection, but it seems that when it comes to boosting power production in Kosovo, Will Smith's roles truly have the spark!

Alaskan Visual Merchandising: The Lane of Name Popularity
What else correlates?
Popularity of the first name Lane · all first names
The number of merchandise displayers and window trimmers in Alaska · all occupations
Gone with the Wind: The Balloon Boy Meme's Inflated Influence on Fiji's Wind Power Generation
GenAI's made-up explanation: As the 'balloon boy' meme deflated in popularity, it created a shortage of hot air in the online space. This shortage of hot air somehow led to a decrease in wind power generated in Fiji, leaving the whole situation quite up in the air.

LULU-lemonade: A Statistical Study of the Stevie-nized Market
GenAI's made-up explanation: Fewer engineering technology graduates means there's no one left to shrink the sun or tinker with time. It's all fun and games until the engineering grads take away our extra daylight!

Par-Fecting the Market: A Link Between Master's Degrees in Parks & Recreation and GOOGL Stock Price
Burning Questions: The Kerosene Connection - A Squirrely Correlation?
GenAI's made-up explanation: As kerosene usage decreased in El Salvador, there were fewer open flames to attract the attention of the power-hungry squirrel overlords, leading to a decrease in their organized attacks on unsuspecting individuals. Without the allure of fiery chaos, the squirrels decided to pursue more peaceful activities like acorn gathering and competitive tree climbing, leaving the people of El Salvador with a slightly lower risk of encountering vengeful, flame-loving squirrels. Remember, a squirrel's plans can be easily derailed when there are no kerosene-fueled sparks flying to ignite their bushy-tailed ambitions!
Nicolas Cage on Stage: The Movie Craze and North Dakota's Screeners' Raise
Spreading Love and Margarine: An Examination of the Butter-Splitter Correlation in Maine
GenAI's made-up explanation: Perhaps as people used less margarine, they became less slippery in their relationships. The lack of artificial spread may have kept the couples from buttering each other up, leading to a decrease in overall marital strife. That's the reality when you can't believe it's not butter - it's a recipe for marital success. Alternatively, it could be that as the margarine consumption decreased, so did the overall slickness in the state, leading to fewer instances of partners feeling like they couldn't grip the marriage.

Why this works
- Data dredging: I have 25,237 variables in my database. I compare all these variables against each other to find ones that randomly match up. That's 636,906,169 correlation calculations! This is called “data dredging.” Instead of starting with a hypothesis and testing it, I tossed a bunch of data in a blender to see what correlations would shake out. It’s a dangerous way to go about analysis, because any sufficiently large dataset will yield strong correlations completely at random.
- Lack of causal connection: There is probably no direct connection between these variables, despite what the AI says above. This is exacerbated by the fact that I used "Years" as the base variable. Lots of things happen in a year that are not related to each other! Most studies would use something like "one person" instead of "one year" to be the "thing" studied.
- Observations not independent: For many variables, sequential years are not independent of each other. You will often see trend-lines form. If a population of people is continuously doing something every day, there is no reason to think they would suddenly change how they are doing that thing on January 1. A naive p-value calculation does not take this into account.
- Y-axis doesn't start at zero: I truncated the Y-axes of the graphs above. I also used a line graph, which makes the visual connection stand out more than it deserves. Mathematically what I showed is true, but it is intentionally misleading. If you click on any of the charts that abuse this, you can scroll down to see a version that starts at zero.
- Confounding variable: Confounding variables (like global pandemics) will cause two variables to look connected when in fact a "sneaky third" variable is influencing both of them behind the scenes.
- Outliers: Some datasets here have outliers which drag up the correlation. I intentionally mishandled outliers, which makes the correlation look extra strong.
- Low n: There are not many data points included in some of these charts. Even if the p-value is low, we should be suspicious of using so few data points in a correlation.
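To make the data dredging and non-independence points concrete, here is a minimal sketch in plain Python. It is not the code behind this site: the series are made up, the sizes (200 variables, 12 "years") are illustrative assumptions, and `pearson_r`, `random_walk`, and `dredge` are hypothetical helper names. It compares every unique pair of series and reports the strongest correlation found, once for trending series (random walks, where each year builds on the last) and once for independent noise. Note that 636,906,169 is 25,237 × 25,237, which counts each pair in both orders plus each variable against itself; the sketch counts unique pairs only.

```python
import itertools
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def random_walk(rng, n_years):
    """A made-up yearly series in which each year builds on the previous one."""
    value, series = 0.0, []
    for _ in range(n_years):
        value += rng.gauss(0, 1)
        series.append(value)
    return series


def dredge(series_list):
    """Compare every unique pair; return (abs(r), i, j) for the strongest match."""
    best = (0.0, -1, -1)
    for i, j in itertools.combinations(range(len(series_list)), 2):
        r = pearson_r(series_list[i], series_list[j])
        if abs(r) > best[0]:
            best = (abs(r), i, j)
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
N_VARIABLES, N_YEARS = 200, 12  # assumed sizes, far smaller than 25,237 variables

# Trending series: sequential years are NOT independent of each other.
walks = [random_walk(rng, N_YEARS) for _ in range(N_VARIABLES)]
# Independent noise: every year is drawn fresh.
noise = [[rng.gauss(0, 1) for _ in range(N_YEARS)] for _ in range(N_VARIABLES)]

best_walk = dredge(walks)
best_noise = dredge(noise)
n_pairs = N_VARIABLES * (N_VARIABLES - 1) // 2

print(f"unique pairs compared: {n_pairs}")
print(f"strongest |r| among trending series: {best_walk[0]:.3f}")
print(f"strongest |r| among independent noise: {best_noise[0]:.3f}")
```

With so few data points per series and so many pairs, even pure noise usually produces at least one strong-looking match, and trending series tend to produce many more. None of these series has anything to do with any other.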
Pro-tip: click on any correlation to see:
- Detailed data sources
- Prompts for the AI-generated content
- Explanations of each of the calculations (correlation, p-value)
- Python code to calculate it yourself
- Originally published: May 2014
- Update launched: January 27, 2024
- What it is: Random correlations dredged up from silly data, turned into linear line charts. Now with AI-generated explanations of the causal connection and silly images to accompany the charts!
- What it is: A 6,000-word article detailing my (completely unnecessary) investigation into why a pedestrian footbridge was built over a highway in 1959.
Who is Tyler Vigen?
Tyler Vigen is an author, programmer, management consultant, and the 40th most famous Aquarius Named Tyler. My defining qualities are "curious" and "stubborn."
I don't show ads, track pageviews, load external scripts, or use a copyright. I don't have anything for sale, I'm not for hire, and you can't buy me a coffee. Thanks though!
I hope you enjoy the projects. If you do, shoot me an email: emailme@tylervigen.com or subscribe.
