Cosine similarity vs. Euclidean distance

In NLP, we often come across the concept of cosine similarity, especially when we need to measure how close two vectors are. I always wondered why we don’t use Euclidean distance instead. Cosine similarity looks only at the angle between the two vectors, whereas Euclidean distance adds up the squared differences across all dimensions:

d(p, q) = \sqrt{(p_1- q_1)^2 + (p_2 - q_2)^2+\cdots+(p_i - q_i)^2+\cdots+(p_n - q_n)^2}.

Here is an explanation I found on Quora that helped me ‘grok’ it.

You are a very polite person and you liked my answer, so in the comment section you wrote “good” 4 times and “helpful” 8 times (just numbers! :)), something like: “A very good answer which is very helpful. It will be helpful for good understanding. People who are not that good at maths can find the answer helpful…” and so on.

A friend of yours, who doesn’t talk much, might write just: “Good and helpful. I found it helpful for my studies.”

What is the count? “Good”: 1, and “helpful”: 2.

If I compute the cosine similarity between these two comments (or “documents”, in text-mining terms :)), it will be exactly 1! (Look up the formula online; it’s very easy.)
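For reference, the formula is short. Writing the two comments as count vectors p = (4, 8) and q = (1, 2), where the first entry counts “good” and the second counts “helpful” (this ordering is just an assumption for illustration), the computation works out as:

\cos(\theta) = \frac{p \cdot q}{\|p\| \, \|q\|} = \frac{\sum_{i=1}^{n} p_i q_i}{\sqrt{\sum_{i=1}^{n} p_i^2} \, \sqrt{\sum_{i=1}^{n} q_i^2}} = \frac{4 \cdot 1 + 8 \cdot 2}{\sqrt{4^2 + 8^2} \, \sqrt{1^2 + 2^2}} = \frac{20}{\sqrt{80} \, \sqrt{5}} = \frac{20}{20} = 1.

The short comment is the long one scaled down by a factor of 4, so the two vectors point in exactly the same direction.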

There you go: with cosine similarity, you measure the similarity of direction rather than magnitude.
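To see the contrast with Euclidean distance on the same example, here is a minimal sketch in plain Python. The function names and the vector layout ([count of “good”, count of “helpful”]) are illustrative assumptions, not something taken from the Quora answer.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between vectors p and q (assumes neither is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def euclidean_distance(p, q):
    """Square root of the summed squared differences over all dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical count vectors: [count of "good", count of "helpful"]
long_comment = [4, 8]   # the polite, talkative commenter
short_comment = [1, 2]  # the friend who doesn't talk much

print("cosine similarity: ", cosine_similarity(long_comment, short_comment))
print("Euclidean distance:", euclidean_distance(long_comment, short_comment))
```

The cosine similarity comes out as 1 (up to floating-point rounding), while the Euclidean distance is about 6.71. The two comments use the words in the same proportion, so cosine similarity treats them as identical in direction, whereas Euclidean distance mostly reflects that one comment is longer than the other.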

Author: Lucia

I’m a Ph.D. student at the University of Edinburgh, School of Informatics. My research project involves using digital data to track the affective, cognitive, and behavioral changes of users, identifying the associations between digital signals and symptoms of mental disorders. I am interested in early symptom detection as a human-centered approach to assist interventions and early prevention of mental disorders or harmful behaviors. Along the way, I deeply care about ethical research practices, model bias and fairness. My work involves understanding model biases and examining the ‘noise’ in social media signals. I am writing up my Ph.D. thesis at the moment and looking for a post-doctoral research position. I am also passionate about communicating my research and machine learning methods. Check out my YouTube channel: ML_made_simple
