correlation is not causation
don't miss spurious scholar
where each of these is an academic paper
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Viewership of The Big Bang Theory and the second variable is Google searches for 'how to make baby'. The chart goes from 2008 to 2019, and the two variables track closely in value over that time.
The Big Bang Theory: A Procreative Catalyst? An Examination of the Relationship between Viewership of a Pop Culture Phenomenon and Online Searches for Baby-Making Techniques
GenAI's made-up explanation: As more people tuned in to watch The Big Bang Theory, they couldn't help but be influenced by all the talk of theoretical physics and geeky romance. With all that intellectual and romantic stimulation, it's no wonder there was a sudden surge in people trying to figure out the mechanics of baby-making!
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is The number of movies Will Smith appeared in and the second variable is Electricity generation in Kosovo. The chart goes from 2008 to 2021, and the two variables track closely in value over that time.
Switching on Will Power: An Electrifying Connection Between Will Smith's Filmography and Electricity Generation in Kosovo
GenAI's made-up explanation: Will Smith's on-screen charisma and megawatt smile evidently sparked a surge in motivation among the people of Kosovo. As they tuned in to watch his performances, they couldn't help but feel charged up and full of energy. It's as if his blockbuster presence somehow electrified the nation, leading to a rewiring of productivity and a fresh power play in the realm of electricity generation. It's a shocking connection, but it seems that when it comes to boosting power production in Kosovo, Will Smith's roles truly have the spark!
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is UFO sightings in Utah and the second variable is Patents granted in the US. The chart goes from 1975 to 2020, and the two variables track closely in value over that time.
Interstellar Innovation: Exploring the Correlation Between UFO Sightings in Utah and US Patent Grants
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Popularity of the first name Lane and the second variable is The number of merchandise displayers and window trimmers in Alaska. The chart goes from 2003 to 2021, and the two variables track closely in value over that time.
Alaskan Visual Merchandising: The Lane of Name Popularity
What else correlates?
Popularity of the first name Lane · all first names
The number of merchandise displayers and window trimmers in Alaska · all occupations
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Popularity of the 'balloon boy' meme and the second variable is Wind power generated in Fiji. The chart goes from 2009 to 2021, and the two variables track closely in value over that time.
Gone with the Wind: The Balloon Boy Meme's Inflated Influence on Fiji's Wind Power Generation
GenAI's made-up explanation: As the 'balloon boy' meme deflated in popularity, it created a shortage of hot air in the online space. This shortage of hot air somehow led to a decrease in wind power generated in Fiji, leaving the whole situation quite up in the air.
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is The distance between Uranus and the Sun and the second variable is Global count of operating nuclear power plants. The chart goes from 1975 to 2022, and the two variables track closely in value over that time.
Uncovering the Cosmic Correlation: The Orbital Distance between Uranus and the Sun and the Global Count of Operating Nuclear Power Plants
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Popularity of the first name Stevie and the second variable is Lululemon's stock price (LULU). The chart goes from 2008 to 2022, and the two variables track closely in value over that time.
LULU-lemonade: A Statistical Study of the Stevie-nized Market
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Associates degrees awarded in Engineering technologies and the second variable is Google searches for 'daylight savings time'. The chart goes from 2011 to 2021, and the two variables track closely in value over that time.
Aligning Associates in Engineering Technologies with Anomalous Avidity for Arboreal Alignment: A Connection to Daylight Savings Time?
GenAI's made-up explanation: Fewer engineering technology graduates means there's no one left to shrink the sun or tinker with time. It's all fun and games until the engineering grads take away our extra daylight!
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Popularity of the first name Camden and the second variable is UFO sightings in Florida. The chart goes from 1975 to 2021, and the two variables track closely in value over that time.
Camden's Cosmic Connection: Correlating the Popularity of the Name Camden with UFO Sightings in Florida
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Master's degrees awarded in Parks & Recreation and the second variable is Alphabet's stock price (GOOGL). The chart goes from 2012 to 2021, and the two variables track closely in value over that time.
Par-Fecting the Market: A Link Between Master's Degrees in Parks & Recreation and GOOGL Stock Price
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Kerosene used in El Salvador and the second variable is Google searches for 'attacked by a squirrel'. The chart goes from 2004 to 2021, and the two variables track closely in value over that time.
Burning Questions: The Kerosene Connection - A Squirrely Correlation?
GenAI's made-up explanation: As kerosene usage decreased in El Salvador, there were fewer open flames to attract the attention of the power-hungry squirrel overlords, leading to a decrease in their organized attacks on unsuspecting individuals. Without the allure of fiery chaos, the squirrels decided to pursue more peaceful activities like acorn gathering and competitive tree climbing, leaving the people of El Salvador with a slightly lower risk of encountering vengeful, flame-loving squirrels. Remember, a squirrel's plans can be easily derailed when there are no kerosene-fueled sparks flying to ignite their bushy-tailed ambitions!
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is The number of movies Nicolas Cage appeared in and the second variable is The number of transportation security screeners in North Dakota. The chart goes from 2012 to 2022, and the two variables track closely in value over that time.
Nicolas Cage on Stage: The Movie Craze and North Dakota's Screeners' Raise
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Google searches for 'im not even mad' and the second variable is Popularity of the 'whip nae nae' meme. The chart goes from 2015 to 2023, and the two variables track closely in value over that time.
I'm Not Even Mad, Said Rad, So Whip, Nae-Nae: A Correlational Study of Google Searches and Internet Memes
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Per capita consumption of margarine and the second variable is The divorce rate in Maine. The chart goes from 2000 to 2009, and the two variables track closely in value over that time.
Spreading Love and Margarine: An Examination of the Butter-Splitter Correlation in Maine
GenAI's made-up explanation: Perhaps as people used less margarine, they became less slippery in their relationships. The lack of artificial spread may have kept the couples from buttering each other up, leading to a decrease in overall marital strife. That's the reality when you can't believe it's not butter - it's a recipe for marital success. Alternatively, it could be that as the margarine consumption decreased, so did the overall slickness in the state, leading to fewer instances of partners feeling like they couldn't grip the marriage.
Why this works
- Data dredging: I have 25,237 variables in my database. I compare all these variables against each other to find ones that randomly match up. That's 636,906,169 correlation calculations! This is called “data dredging.” Instead of starting with a hypothesis and testing it, I instead tossed a bunch of data in a blender to see what correlations would shake out. It’s a dangerous way to go about analysis, because any sufficiently large dataset will yield strong correlations completely at random.
- Lack of causal connection: There is probably no direct connection between these variables, despite what the AI says above. This is exacerbated by the fact that I used "Years" as the base variable. Lots of things happen in a year that are not related to each other! Most studies would use something like "one person" instead of "one year" to be the "thing" studied.
- Observations not independent: For many variables, sequential years are not independent of each other. You will often see trend-lines form. If a population of people is continuously doing something every day, there is no reason to think they would suddenly change how they are doing that thing on January 1. A naive p-value calculation does not take this into account.
- Y-axis doesn't start at zero: I truncated the Y-axes of the graphs above. I also used a line graph, which makes the visual connection stand out more than it deserves. Mathematically what I showed is true, but it is intentionally misleading. If you click on any of the charts that abuse this, you can scroll down to see a version that starts at zero.
- Confounding variable: Confounding variables (like global pandemics) will cause two variables to look connected when in fact a "sneaky third" variable is influencing both of them behind the scenes.
- Outliers: Some datasets here have outliers which drag up the correlation. I intentionally mishandled outliers, which makes the correlation look extra strong.
- Low n: There are not many data points included in some of these charts. Even if the p-value is low (which would normally suggest significance), we should be suspicious of using so few data points in a correlation.
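The first and third points in the list above (data dredging and non-independent years) can be reproduced on a small scale. What follows is a minimal sketch, not the code behind this site: the data is synthetic, the sizes (200 variables, 12 "years") and the 0.9 cutoff are hypothetical, and the helper names are made up for illustration. It builds two piles of unrelated series, one where every year is drawn fresh and one where every year is last year's value plus noise, then compares every pair in each pile.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def white_noise(rng, length):
    """Independent observations: each 'year' is drawn fresh."""
    return [rng.gauss(0, 1) for _ in range(length)]


def random_walk(rng, length):
    """Non-independent observations: each 'year' is last year plus noise."""
    value, series = 0.0, []
    for _ in range(length):
        value += rng.gauss(0, 1)
        series.append(value)
    return series


def dredge(series_list, threshold=0.9):
    """Compare every pair; return (strongest |r|, number of pairs with |r| > threshold)."""
    strongest, hits = 0.0, 0
    for i in range(len(series_list)):
        for j in range(i + 1, len(series_list)):
            r = abs(pearson_r(series_list[i], series_list[j]))
            strongest = max(strongest, r)
            if r > threshold:
                hits += 1
    return strongest, hits


# Hypothetical sizes, far smaller than the real database.
N_VARIABLES, N_YEARS = 200, 12
rng = random.Random(0)  # fixed seed so reruns give the same numbers

noise = [white_noise(rng, N_YEARS) for _ in range(N_VARIABLES)]
walks = [random_walk(rng, N_YEARS) for _ in range(N_VARIABLES)]

noise_best, noise_hits = dredge(noise)
walk_best, walk_hits = dredge(walks)

n_pairs = N_VARIABLES * (N_VARIABLES - 1) // 2
print(f"pairs compared per pile: {n_pairs}")
print(f"white noise : strongest |r| = {noise_best:.3f}, pairs above 0.9 = {noise_hits}")
print(f"random walks: strongest |r| = {walk_best:.3f}, pairs above 0.9 = {walk_hits}")
```

None of these series has anything to do with any other, yet comparing enough pairs should turn up strong matches in both piles, and the trending pile should produce far more of them. The exact figures depend on the seed and the sizes chosen, so treat them as illustrative only.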
Pro-tip: click on any correlation to see:
- Detailed data sources
- Prompts for the AI-generated content
- Explanations of each of the calculations (correlation, p-value)
- Python code to calculate it yourself
- Originally published: May 2014
- Update launched: January 27, 2024
- What it is: Random correlations dredged up from silly data, turned into linear line charts. Now with AI-generated explanations of the causal connection and silly images to accompany the charts!
- What it is: A 6,000-word article detailing my (completely unnecessary) investigation into why a pedestrian footbridge was built over a highway in 1959.
Who is Tyler Vigen?
Tyler Vigen is an author, programmer, management consultant, and the 40th most famous Aquarius Named Tyler. My defining qualities are "curious" and "stubborn."
I don't show ads, track pageviews, load external scripts, or use a copyright. I don't have anything for sale, I'm not for hire, and you can't buy me a coffee. Thanks though!
I hope you enjoy the projects. If you do, shoot me an email: emailme@tylervigen.com or subscribe.