An Instagram study
At NinjaOutreach we know perfectly well how important it is to offer up-to-date and relevant information to our customers. For that reason we have invested a lot of money and time into a complete rewrite of our social engines. The rewrite took 5 weeks, and we have since collected 7,340,568 Instagram profiles, a considerable increase from the 400,000 we had before. Of these:
- 557,043 have been deleted since we started scraping
- 2,414,877 are private profiles
- 46,576 are verified profiles
- 2,015,533 have more than 1,000 followers
- 89,694 have more than 100,000 followers (major influencers)
In the following sections we share some interesting insights from our data science team (of course, only the profiles that still exist are taken into account).
Followers and following
The distribution of the number of followers is extremely skewed, which makes it difficult to visualize in a readable manner. The great majority of profiles have a low number of followers (fewer than 1,000), whereas a few profiles have hundreds of millions of followers, a difference of 5 orders of magnitude. Thus, the only way to produce readable graphics is to use a logarithmic scale on the x axis:
As one can see, profiles with a high number of followers are rare: only 10 profiles have more than 100 million followers, and 420 have between 10 and 100 million. The log scale is necessary because, at the other extreme, we have almost 4.5 million profiles with fewer than 1,000 followers. One of the challenges we face at NinjaOutreach is to present only the profiles that are relevant to each of our users.
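The logarithmic binning behind such a chart can be sketched in a few lines. This is only an illustration, not the code used for the study: the function name `log10_bins` and the sample follower counts are made up, and each bin simply groups profiles by order of magnitude.

```python
from collections import Counter


def log10_bins(follower_counts):
    """Count profiles per order of magnitude of followers.

    Bin k holds counts n with 10**k <= n < 10**(k + 1). Profiles with
    zero followers are counted under the key None, because log10(0)
    is undefined.
    """
    bins = Counter()
    for n in follower_counts:
        if n <= 0:
            bins[None] += 1
        else:
            # For a positive integer, digits - 1 equals floor(log10(n)).
            bins[len(str(n)) - 1] += 1
    return bins


# Hypothetical follower counts, not taken from the dataset.
sample = [0, 12, 850, 999, 1_000, 45_000, 2_300_000, 150_000_000]
bins = log10_bins(sample)
for k in sorted(key for key in bins if key is not None):
    print(f"10^{k} to 10^{k + 1}: {bins[k]} profile(s)")
```

With equal-width bins on a linear axis, almost every profile would fall into the first bar; grouping by powers of ten keeps both ends of the distribution visible.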
Another useful metric to gauge profiles is the ratio between the number of followers and the number of profiles one is following:
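Written out as a formula (a reconstruction from the description above; the symbol $R$ is chosen here only for illustration):

```latex
\[
R = \frac{\text{number of followers}}{\text{number of accounts followed}}
\]
```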
The ratio is very low if a person follows a lot of profiles while being followed by very few accounts. On the other hand, it is very high for the so-called influencers, who have a big audience while following very few accounts. If the denominator is zero, the ratio is undefined. On average, Instagram accounts have a followers to following ratio of 296, but for verified profiles this ratio jumps to 17,910.
Another important distinction is between micro and major influencers. For simplicity, here we call micro influencers the profiles with fewer than 100,000 followers, and major influencers the profiles with more than 100,000 followers. We calculate that the average followers to following ratio for micro influencers is 37.7, whereas for major influencers it is 18,104. The distributions are once again heavily skewed: the median value for the former group is just 1 (meaning that half of the micro influencers have a followers to following ratio equal to or below 1, i.e. they are not really influencers), and a small number of profiles with very high ratios pull the average up. For the major influencers, the median is 650; the distribution is similarly skewed.
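The gap between average and median is typical of skewed data, and a small made-up example shows the effect. The ratios below are hypothetical and chosen only so that a single large value dominates the mean.

```python
from statistics import mean, median

# Hypothetical followers to following ratios for eight profiles.
ratios = [0.2, 0.5, 0.8, 1.0, 1.0, 1.5, 3.0, 400.0]

average = mean(ratios)   # pulled up by the single ratio of 400
middle = median(ratios)  # unaffected by how large the outlier is

print(f"mean = {average:.1f}, median = {middle:.1f}")
```

Here the mean is 51 while the median is 1, which is why the median is the better description of a typical profile.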
What do these values tell us? If you are looking to run campaigns across niche markets you should look for micro-influencers with a followers to following ratio greater than 30 (which amounts to only 5% of the total number of micro-influencers). On the other hand, if your campaign needs major influencers, you should look for profiles with a followers to following ratio greater than 20,000 (which amounts to 8% of the total number of major influencers).
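The thresholds above translate into a simple filter. The sketch below is one possible reading, not NinjaOutreach's implementation: the `Profile` class and the function names are hypothetical, a profile with exactly 100,000 followers is treated as micro, and a profile that follows nobody (undefined ratio) is excluded. Both of those edge cases are assumptions, since the text does not cover them.

```python
from dataclasses import dataclass

MAJOR_FOLLOWER_THRESHOLD = 100_000  # more than this counts as major
MICRO_MIN_RATIO = 30                # suggested cut-off for micro influencers
MAJOR_MIN_RATIO = 20_000            # suggested cut-off for major influencers


@dataclass
class Profile:
    username: str
    followers: int
    following: int


def tier(profile):
    """Classify a profile as 'major' or 'micro' by follower count."""
    if profile.followers > MAJOR_FOLLOWER_THRESHOLD:
        return "major"
    return "micro"


def passes_ratio_filter(profile):
    """True if the followers to following ratio exceeds the tier cut-off."""
    if profile.following == 0:
        return False  # assumption: an undefined ratio is excluded
    ratio = profile.followers / profile.following
    minimum = MAJOR_MIN_RATIO if tier(profile) == "major" else MICRO_MIN_RATIO
    return ratio > minimum


# Hypothetical profiles, for illustration only.
candidates = [
    Profile("niche_baker", 9_000, 200),       # micro, ratio 45
    Profile("follow_back", 9_000, 900),       # micro, ratio 10
    Profile("big_star", 5_000_000, 100),      # major, ratio 50,000
    Profile("mid_celebrity", 500_000, 1_000), # major, ratio 500
]
selected = [p.username for p in candidates if passes_ratio_filter(p)]
print(selected)
```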
Engagement, engagement, engagement
Perhaps the single most important metric that can reasonably summarize an Instagram profile is the engagement rate. It is defined for each unit of content published by the profile (that is, each post) as the ratio between the sum of the number of likes and replies and the number of followers of the profile:
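As a formula (reconstructed from the definition above, with $E$ as an illustrative symbol):

```latex
\[
E = \frac{\text{likes} + \text{replies}}{\text{followers}}
\]
```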
This metric is of course an approximation, but as we will see it is extremely useful in identifying the quality of a profile.
First off, we plot the distribution of the engagement rate across micro and major influencers. This is the result:
The chart is again very useful to get an idea of which values of the engagement rate are average or better, and we see that there are considerable differences between micro and major influencers. For the former group, 11% is the average engagement rate, while for major influencers values are of course lower and the average is 3%.
One may also wonder how the engagement rate and the followers to following ratio are related. As the chart below illustrates, there is no simple (e.g. linear) relationship between the two metrics:
We can therefore conclude that the engagement rate is a metric that we can use in combination with the followers to following ratio.
Images or Videos?
There is a noticeable difference in engagement between image posts and video posts, and the contrast is even more pronounced for major influencers.
The chart above is fairly dense but extremely interesting. It contains two different histograms, overlaid on the same axes, that display the distribution of the ratio between the engagement rate for image posts and the engagement rate for video posts. If you think that image posts have a higher engagement than video posts, then you would expect this ratio to be higher than 1. The chart confirms this belief. The process that allowed us to produce such an image is the following: for each profile in our database we took the average engagement rate for image posts and the average engagement rate for video posts and we computed the ratio between them. Then we created the histograms above, which display the frequency of each value.
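The per-profile computation described above can be sketched as follows. The function name, the tuple layout of a post and the sample numbers are all hypothetical; the text does not say how profiles lacking one of the two media types were handled, so this sketch simply returns `None` for them.

```python
from statistics import mean


def image_to_video_ratio(posts, followers):
    """Ratio of average image engagement to average video engagement.

    posts is a list of (media_type, likes, replies) tuples, where
    media_type is "image" or "video". Returns None when the ratio is
    undefined (no followers, a missing media type, or zero video engagement).
    """
    if followers <= 0:
        return None
    rates = {"image": [], "video": []}
    for media_type, likes, replies in posts:
        if media_type in rates:
            rates[media_type].append((likes + replies) / followers)
    if not rates["image"] or not rates["video"]:
        return None
    video_average = mean(rates["video"])
    if video_average == 0:
        return None
    return mean(rates["image"]) / video_average


# Hypothetical profile with 1,000 followers and four posts.
posts = [
    ("image", 90, 10),
    ("image", 180, 20),
    ("video", 45, 5),
    ("video", 135, 15),
]
ratio = image_to_video_ratio(posts, followers=1_000)
print(f"image/video engagement ratio: {ratio:.2f}")
```

A value above 1 means the profile's image posts outperform its video posts; collecting this value for every profile gives the data for the histograms.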
The result is that, on average, for a micro influencer image posts have 1.5 times the engagement of video posts, while for major influencers this multiple increases to more than 2. Note: this is not to say that video posts have terrible engagement, only that, as calculated by the formula above, engagement is lower for videos than for image posts. This may be explained by the fact that the Instagram application itself makes it easier to like image posts than video posts. However, investigating such systematic biases is outside the scope of this post.
That is only one side of the coin: the quality of engagement also depends on the mix of likes and replies. On Instagram, the number of likes is much higher than the number of replies. On average, the profile of a major influencer will have 237 times more likes than replies, while for micro influencers the multiple is lower at 42 times. Replies are usually considered stronger engagement than simple likes, so this should also be considered by anyone deciding which influencers to partner with for sponsored campaigns.
The engagement rate can be used as an indicator of profile quality and can produce some accurate results. In particular, it appears to be very good at pointing out which profiles don't have genuine engagement.
From the formula that defines the engagement rate, we can see that this rate can be above 1 if the sum of likes and replies exceeds the number of followers of a particular profile. This can happen for a fraction of the total posts of a profile (e.g. if one or two posts gain extreme popularity for some reason) but it certainly cannot be the norm, since the number of followers will balance it out.
For healthy profiles the engagement rate will be well below 1, but we can look for profiles with huge rates as an example of the opposite. In fact, in our dataset we have profiles with engagement rates well above 100 (that is, above 10,000%). For example, the user ksjh2537g has a rate of roughly 32,200%, having 21 followers and a single post with 6,662 likes and 118 replies. This is a profile that advertises an Indonesian business that sells likes and followers.
Another example of an artificially high engagement rate is the user johnmaldonado023, who has 2 followers and is following 30 accounts, but has a single post with 277 likes and 2 replies. Lo and behold, one of the comments is from the profile express.followers581, which is yet another business that sells fake likes and fake followers. There are countless other examples with similar statistics: the user priora has 50 followers, is following 93 profiles and has a single post with 10,400 likes and 12 replies.
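The arithmetic for these three profiles is easy to check. The snippet below is a minimal sketch that applies the engagement rate definition to the numbers quoted above; the function name and the rule of flagging any rate above 1 are illustrative choices, not a description of a production filter.

```python
def engagement_rate(likes, replies, followers):
    """Return (likes + replies) / followers, or None with no followers."""
    if followers == 0:
        return None
    return (likes + replies) / followers


# (likes, replies, followers) of the single post of each profile cited above.
cited = {
    "ksjh2537g": (6662, 118, 21),
    "johnmaldonado023": (277, 2, 2),
    "priora": (10_400, 12, 50),
}

rates = {user: engagement_rate(*stats) for user, stats in cited.items()}
for user, rate in rates.items():
    # A rate above 1 means more likes and replies than followers.
    flag = "suspicious" if rate > 1 else "ok"
    print(f"{user}: {rate:.1f} ({rate:.0%}) {flag}")
```

All three rates come out far above 1 (roughly 323, 139.5 and 208), which is exactly what sets these profiles apart from healthy ones.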
As we observed before, the followers to following ratio is to be used in conjunction with the engagement rate. In this case, the followers to following ratio would not necessarily have been sufficient to discard these low quality profiles, but the engagement rate is quite effective.
In this post we performed some very high-level exploratory analysis of the profile data we collected. In particular, we observed the usefulness of two key metrics, the followers to following ratio and the engagement rate. If you are looking for influencers as part of an advertising campaign you should focus on these ratios to filter out the profiles that don't align with your campaign targets.
In the next post of this series we will perform an even bigger analysis with 6.5M more profiles (for a total of 13.5M profiles), and we will explore the profiles' geolocation data and the differences between profiles belonging to different categories.