In NLP, we often come across the concept of cosine similarity, especially when we need to measure how close two vectors are. I always wondered why we don’t just use Euclidean distance instead. I used to think of cosine similarity as a 2D measurement, whereas with Euclidean you add up all the dimensions, but both are defined for vectors of any number of dimensions; the real difference is in what they measure.
Here is an explanation I found on Quora that finally made me ‘grok’ it.
You are a very polite person and you liked my answer, so in the comment section you have written “good” 4 times and “helpful” 8 times (just numbers!! :)). Something like: “A very good answer which is very helpful. It will be helpful for good understanding. People who are not that good at maths can find the answer helpful...” and so on.
A friend of yours, who doesn’t talk much, might write just: “Good and helpful. I found it helpful for my studies.”
What is the count? “Good”: 1, and “helpful”: 2.
If I try to find the cosine similarity between these comments (or documents, in text-mining terms :)), treating each one as a vector of word counts, (4, 8) and (1, 2), it will be exactly 1! (Look up the formula on Google; it’s ultra easy.)
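For anyone who would rather not look it up, here is the standard formula worked through for this example. The names A and B are just labels chosen here for the two count vectors (“good”, “helpful”): A = (4, 8) for the polite commenter and B = (1, 2) for the quiet friend.

\[
\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
= \frac{4 \cdot 1 + 8 \cdot 2}{\sqrt{4^2 + 8^2}\,\sqrt{1^2 + 2^2}}
= \frac{20}{\sqrt{80}\,\sqrt{5}}
= \frac{20}{\sqrt{400}}
= 1
\]

The result is 1 because A is exactly 4 times B: both vectors point in the same direction, and only their lengths differ.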
There you go: with cosine similarity, you measure the similarity of direction rather than magnitude, which is why a long comment and a short one with the same mix of words count as identical.
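To see the contrast with Euclidean distance in numbers, here is a minimal sketch in plain Python. The function names and the variables `polite` and `friend` are made up for this illustration; the vectors are the (“good”, “helpful”) counts from the two comments.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between the points a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Word counts as (good, helpful); names are hypothetical labels.
polite = [4, 8]   # the long, polite comment
friend = [1, 2]   # the short comment from the quiet friend

cos = cosine_similarity(polite, friend)
dist = euclidean_distance(polite, friend)

print(f"cosine similarity:  {cos:.3f}")
print(f"euclidean distance: {dist:.3f}")
```

The cosine similarity comes out at 1 (up to floating-point rounding), while the Euclidean distance is about 6.7. The two comments say the same thing in the same proportions, but Euclidean distance treats them as far apart simply because one is longer.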