Have you ever dreamed of your words, like those of Shakespeare, Einstein, or Trump, remaining forever etched in the memory of humanity? Would you like to be famous? Then this website might help you!

First, try writing something, and we’ll tell you whether your words have a chance of becoming famous (max 140 characters):

Anyway, you’d better try our famous quote generator if you want to succeed.

Click on the button and here we go:

Inspiring, no? Post it right away on Twitter or Instagram and let the magic happen…

(What? The syntax isn’t right? Maybe our generator needs a little bit more training)

While you are waiting for the fame to come thanks to your quote, you can explore the data and read more about what makes a quote famous.

Just think about the politicians or advertisers who convinced you with just a few words, just by saying ‘Yes we can!’, ‘Make America great again’ or ‘Logarithms are our friends’. These tiny sentences of 3 or 4 words have so much power when pronounced by Barack Obama, Donald Trump, or Robert West that they are engraved forever in your mind. Not sure your colleague’s remark in front of the coffee machine, ‘The weather is nice today.’, will have as much impact on your brain… Why are these quotes so powerful? Is it because of their speaker? Is it because of the quote itself? Why are some quotes remembered forever while others disappear into nothingness?

That’s what we wanted to investigate with our data story, in order to unravel the mystery behind famous quotes.

First, we tried to look into the soul of the quote to find an answer. For instance, does grammatical syntax play a role in the fame of words? Are negative quotes more famous than positive ones? Are some topics trendier than others? Then, we tried to understand the impact of the speaker on the quote. Are the speakers of famous quotes distributed equally across continents? Does it matter whether the speaker is alive or dead when the quote is cited? Do people in sports get more media coverage than artists? Politicians, influencers, and companies all have a strong interest in crafting the perfect quote that will catch people’s attention.

So while you have the time before the celebrity tornado sweeps you away, slip into the shoes of a data analyst to discover what impacts the fame of a quote!

Say Hi to the dataset: Quotebank

The data used in this analysis is Quotebank, a corpus of quotations extracted from newspapers over the years 2008 to 2020. If you are interested in the methods used to extract quotes and attribute them to speakers, have a look at the paper. You can also use this related tool: enter a keyword, and it will search the database for related quotes and show you how often they occur over time.

From the quotes dated 2015 to 2020, we kept only the top 1% most famous quotes (more than 215 occurrences in newspapers), and also sampled random quotes with fewer than 10 occurrences to serve as non-famous quotes. With this set of quotes, we aim to analyze what makes a quote famous or what makes it fall into the abyss of oblivion. To avoid artifacts where a quote is cited often simply because its speaker is more famous, we selected both famous and non-famous quotes from speakers appearing in the Pantheon database. Pantheon was built from Wikipedia biography page views, the number of different Wikipedia languages, the coefficient of variation, etc., and it combines those values into a single metric, the historical popularity index (HPI).
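For the curious, the selection described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not our actual pipeline: the function name `split_quotes`, the record layout (dictionaries with `quotation`, `speaker`, and `numOccurrences` keys), and the toy occurrence counts are all made up for the example. Only the two thresholds (more than 215, fewer than 10) come from the text.

```python
import random

FAMOUS_MIN = 215      # famous: strictly more than 215 occurrences (from the text)
NON_FAMOUS_MAX = 10   # non-famous: strictly fewer than 10 occurrences (from the text)


def split_quotes(quotes, pantheon_speakers, n_non_famous, seed=0):
    """Return (famous, non_famous) lists of quote records.

    `quotes` is a list of dicts; the key names are assumptions for this sketch.
    `pantheon_speakers` is a set of speaker names found in Pantheon.
    """
    # Keep only quotes whose speaker appears in the Pantheon set.
    kept = [q for q in quotes if q["speaker"] in pantheon_speakers]

    # Famous quotes: above the occurrence threshold.
    famous = [q for q in kept if q["numOccurrences"] > FAMOUS_MIN]

    # Non-famous quotes: a random sample among the rarely cited ones.
    rare = [q for q in kept if q["numOccurrences"] < NON_FAMOUS_MAX]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    non_famous = rng.sample(rare, min(n_non_famous, len(rare)))

    return famous, non_famous


# Toy records with invented occurrence counts, purely for illustration.
toy_quotes = [
    {"quotation": "Yes we can!", "speaker": "Barack Obama", "numOccurrences": 900},
    {"quotation": "The weather is nice today.", "speaker": "Barack Obama", "numOccurrences": 2},
    {"quotation": "Pass the salt.", "speaker": "Barack Obama", "numOccurrences": 50},
    {"quotation": "Nice coffee.", "speaker": "A colleague", "numOccurrences": 3},
]

famous, non_famous = split_quotes(toy_quotes, {"Barack Obama"}, n_non_famous=5)
```

Note that quotes falling between the two thresholds (here, the one with 50 occurrences) end up in neither group, which keeps the famous and non-famous sets clearly separated.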

OK, enough with the boring pre-processing steps (crucial in data science, tho). Now let us dive into the data and see if we can answer our existential question.

Who are the most cited people on this planet?

First, you can take a look at the top 20 people with the most famous quotes, along with their number of cited quotes and THE quote with the most occurrences.