People keep inventing practical and interesting new ways to use emoji. Emoji from different categories are combined into memes, and seemingly unrelated emoji end up expressing the same meaning, so connections gradually form between them. We wanted to show these relationships visually and give users a deeper understanding of how emoji are used. To do that, we used machine learning algorithms to calculate the relationships between emoji and drew the results as an easy-to-understand picture: the Emoji Relationship Graph.

What is the Emoji Relationship Graph?

Emoji are usually presented as images, but images alone cannot tell us how emoji relate to one another. Emoji are mainly used to convey information and emotions, which makes them more like a language, so it is more accurate to explore their relationships through their meaning and usage. We therefore obtained all the tweets containing emoji from 2018 to 2021, 812 million tweets in total. Because emoji usage differs from one language environment to another, we classified the tweets by language and then used machine learning algorithms to calculate the text similarity between emoji in each language, which gives a separate Emoji Relationship Graph for each language.

How to understand the Emoji Relationship Graph?

This is the Emoji Relationship Graph of 👉 for Spanish. The red box shows the 9 emoji nearest to it. The length of each ray represents the degree of relationship: the shorter the ray, the closer the relationship. The graph also shows part of the relationship graphs of other emoji. The black box contains the relationship graph of 👆, and the orange box contains the relationship graph of .

The relationship graph gives us a richer understanding of how emoji are used. For example, in recent years 👈 and 🥺 have often been combined with 👉 to express feeling aggrieved, shy, or pleading, so they appear in this relationship graph:

These two emoji are often used to indicate links. Because their usage is similar, they sit close to each other in the relationship graph:

If you delve into the Emoji Relationship Graph, you may see some emoji in a new light.

How to calculate the relationship between emoji?

Next, we describe the calculation process in detail. It can be roughly divided into the following three steps:

  • First, we use the TF-IDF algorithm to extract the tags of each emoji from the tweets, along with a weight for each tag. Tags are the words most closely related to an emoji and act as its characteristics. The weight measures how closely a tag is related to the emoji: the higher the weight, the closer the relationship. We have written a separate article that describes the algorithm and the calculation process for getting tags in detail: ☁️Emoji Tag Cloud: Help You To Get More Knowledge Of Emoji!
  • Once we have the tags, a new problem arises. Calculations generally work only on numerical values, but the tags are text, so how can an algorithm compute with them? Our second step is therefore to convert the text into numerical values that can be calculated: vectors. This process is called word embedding. We first use the word2vec algorithm (one method of word embedding) to read a large amount of tweet data and transform each word in the text into a vector, which gives a word embedding matrix made up of the high-dimensional vectors of all the words. We then use that matrix to map the word behind each tag from the first step to its high-dimensional vector, which completes the text-to-vector conversion. Because these vectors are calculated by analyzing the context in which each word appears, they preserve the semantic information of each word well, which helps keep the text similarity accurate. The word2vec algorithm is also explained in detail on our blog, if you want to read more: 🔍Emoji Sentiment Analysis
  • The last step is to calculate the text similarity between emoji. A commonly used algorithm for this is the VSM (Vector Space Model). It is one of the most widely used similarity models, but it obtains its result from the co-occurring words of two texts (words that appear in both), so it is inaccurate for texts that have the same meaning but different wording. To avoid this, we chose another algorithm: the SCM (Soft Cosine Measure). It takes the similarity between words into account, so even if two texts have no words in common, it can still calculate their similarity by evaluating how similar their words are. The final result is a number: the larger the number, the higher the text similarity between the two emoji, and the higher the text similarity, the closer their relationship.
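
To make the three steps more concrete, here is a minimal Python sketch, not our production code. Everything in it is an assumption made for illustration: the toy word lists, the hand-written 3-dimensional vectors that stand in for real word2vec output, the smoothed TF-IDF formula, the choice to clip negative word similarities to zero, and all function names.

```python
import math
from collections import Counter

# Hypothetical toy data: words seen in tweets alongside each emoji.
words_by_emoji = {
    "🔗": ["link", "click", "link", "here"],
    "👉": ["url", "visit", "url"],
    "🍕": ["pizza", "dinner", "pizza", "here"],
}

# Hand-written stand-ins for word2vec vectors (assumption: real vectors
# are learned from tweets and have many more dimensions).
vectors = {
    "link": [1.0, 0.0, 0.1], "click": [0.8, 0.0, 0.3],
    "url": [0.9, 0.1, 0.1], "visit": [0.7, 0.1, 0.3],
    "pizza": [0.0, 1.0, 0.1], "dinner": [0.1, 0.9, 0.2],
    "here": [0.1, 0.1, 1.0],
}

def tfidf_tags(words_by_emoji):
    """Step 1: weight every word of every emoji with TF-IDF (smoothed IDF)."""
    n = len(words_by_emoji)
    counts = {e: Counter(ws) for e, ws in words_by_emoji.items()}
    # Number of emoji whose word list contains each word.
    doc_freq = Counter(w for c in counts.values() for w in c)
    return {
        e: {w: (k / sum(c.values())) * (math.log((1 + n) / (1 + doc_freq[w])) + 1)
            for w, k in c.items()}
        for e, c in counts.items()
    }

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def word_sim(w1, w2):
    """Step 2: similarity of two words from their vectors, clipped at zero."""
    if w1 == w2:
        return 1.0
    if w1 not in vectors or w2 not in vectors:
        return 0.0
    return max(0.0, cosine(vectors[w1], vectors[w2]))

def soft_dot(a, b):
    # Every tag pair contributes, scaled by how similar the two words are.
    return sum(wa * wb * word_sim(ta, tb)
               for ta, wa in a.items() for tb, wb in b.items())

def soft_cosine(a, b):
    """Step 3: soft cosine between two weighted tag sets."""
    denom = math.sqrt(soft_dot(a, a)) * math.sqrt(soft_dot(b, b))
    return soft_dot(a, b) / denom if denom else 0.0

def plain_cosine(a, b):
    """Plain vector space model: only identical words contribute."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def nearest(emoji, tags, k=9):
    """Rank the other emoji by soft cosine, most similar first."""
    scores = [(other, soft_cosine(tags[emoji], tags[other]))
              for other in tags if other != emoji]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

tags = tfidf_tags(words_by_emoji)
for other in ("👉", "🍕"):
    print(other,
          round(plain_cosine(tags["🔗"], tags[other]), 3),
          round(soft_cosine(tags["🔗"], tags[other]), 3))
print(nearest("🔗", tags))
```

With this toy data, the plain cosine between 🔗 and 👉 is zero because they share no tags, while 🔗 and 🍕 score above zero only because both happen to contain "here". The soft cosine corrects this and ranks 👉 as the nearer neighbor of 🔗, which is the behavior described in the third step.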


Through the relationship graph, we can understand people's habits and preferences in using emoji and explore trends in emoji usage. You may be surprised to find that some emoji you would never associate with each other are actually very closely related, which may point to a trendy new use of emoji that you have not come across yet! Also, if you have any suggestions, please tell us in the comments!