I started this Data Visualization Blog back at the end of May 2011. WordPress provides decent analytics to measure things like views, referrers, clicks, etc. The built-in stats show bar charts by day/week/month, views by country, top posts and pages, search engine terms, comments, followers, tags and so on. I have accumulated the view data and want to share some analysis of it.
At this point there are 17,000 views and 56 posts (about 1 post per week). The weekly views have grown as follows:
The WordPress dashboard for monthly views looks like this:
Assuming an exponential growth process, this amounts to a doubling roughly every 3 months. This may not sound like much, but if it were to continue, it would lead to a 16x increase per year or a 4096x increase in 3 years. Throughout the first year this model has been fairly accurate and has allowed me to predict when certain milestones would be reached (such as 10k views, reached in Apr-2012, or 100k views, predicted for Jan-2013).
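The arithmetic behind these factors can be checked with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the 3-month doubling time is the rough estimate from the text, not a fitted value.

```python
import math

def growth_factor(months, doubling_months=3.0):
    """Multiplicative growth over `months` if views double every `doubling_months`."""
    return 2.0 ** (months / doubling_months)

def months_to_reach(current, target, doubling_months=3.0):
    """Months until `current` grows to `target` under the same doubling assumption."""
    return doubling_months * math.log2(target / current)

print(growth_factor(12))  # 16.0   (four doublings in one year)
print(growth_factor(36))  # 4096.0 (twelve doublings in three years)
```

The helper `months_to_reach` applies the same assumption in reverse, which is how a milestone date can be estimated from the current count.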
However, the underlying process is not a simple exponential growth process. Instead it is the result of multiple forces, some increasing and some decreasing the views: how interesting fresh content is to the target audience, the rather short half-life of web content, the size of the audience, the frequency of emails or tweets linking to the content, and so on. So I expect growth to slow down and consequently the 100k views milestone to be pushed out past Jan-2013.
Views come from some 112 countries, albeit very unevenly distributed.
The Top 2 countries (United States and United Kingdom) contribute nearly half of the views, and the Top 10 countries (9% of all countries) nearly 75% of all views. The fairly high Gini index of this distribution (~0.83) indicates a strong dependency on just a few countries. The only surprise for me in the Top 10 list was South Korea, ranking fifth and slightly ahead of India. Germany is probably a bit over-represented due to my German business partner (RapidBusinessModeling) and related network.
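For readers who want to compute a Gini index for their own per-country views, here is a minimal sketch using the standard formula over sorted values. The function name and the sample inputs are illustrative only; they are not the blog's actual country data.

```python
def gini(values):
    """Gini index of non-negative values: 0 = perfectly even, near 1 = very concentrated."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with i = 1..n over ascending x_i
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n

print(gini([1, 1, 1, 1]))   # 0.0  (every country contributes the same)
print(gini([0, 0, 0, 10]))  # 0.75 (one of four countries has all the views)
```

With n countries the maximum of this formula is (n - 1) / n, so for a list of some 112 countries a value around 0.83 indeed points to a distribution dominated by a handful of them.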
One interesting analysis comes from looking at the distribution of views over weekdays. Not every weekday is the same. Thursday is the busiest day, Saturday the quietest. After a little more than one year, averaging over some 56 weeks, the distribution looks like this.
Of course, time zone boundaries may cause some distortions here, but it looks like the view activity builds during the week until it hits a peak on Thursday. Then it falls sharply to a low on Saturday, and builds from there again. This fits with intuition: one would expect the weekend days to be low, and Monday and Friday to be lower than the mid-week days. It’s tempting to correlate that with the amount of work or research getting done by professionals. The underlying assumption is that people discover or revisit my Blog when it fits into their work.
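A weekday distribution like this can be derived from daily view counts roughly as follows. This is a sketch under assumptions: the input format (a dict from date to views) is hypothetical, and the two weeks of numbers are invented purely to exercise the function.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def weekday_shares(daily_views):
    """Share of all views falling on each weekday; `daily_views` maps date -> views."""
    totals = defaultdict(int)
    for day, views in daily_views.items():
        totals[WEEKDAYS[day.weekday()]] += views  # date.weekday(): Monday is 0
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in WEEKDAYS}
    return {name: totals[name] / grand_total for name in WEEKDAYS}

# Invented example: two identical weeks starting Monday, 4 June 2012.
start = date(2012, 6, 4)
pattern = [10, 12, 12, 14, 9, 5, 8]
sample = {start + timedelta(days=i): pattern[i % 7] for i in range(14)}
shares = weekday_shares(sample)
print(max(shares, key=shares.get))  # Thu
```

Summing per weekday and dividing by the grand total gives the same shares as averaging week by week, as long as every weekday occurs equally often in the data.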
A large fraction (> 65%) of referrals comes from search engines. Within those, it’s mostly Google (> 90% summed across many countries) with just a small amount from others like Bing. It’s safe to say that without Google search my Blog would have practically no views. Chances are that your first exposure to this Blog came from a Google search as well. One unexpected insight for me was to see a high ratio of image to text searches, typically 3:1 or 4:1. In some ways it shouldn’t be surprising that a blog on data visualizations gets discovered more often by searching for visual elements than for text. It also jibes with the enormous growth of image-related sites such as Instagram or Pinterest. I just would not have expected the ratio to be that high.
The beginning is always slow. But any exponential growth sooner or later leads to rather large numbers. So the real question is: how can one keep the exponential growth process going? I’d love to hear your comments. If you want to compare this against your own Blog stats, I have shared the underlying data as a Google doc here. I have no idea how this compares to other blog stats in similar domains. If you know of any other public Blog stats analysis, please comment with a pointer below. Thanks.
Addendum 7/11/2012: Today my Blog reached 20,000 views. I noticed over the last few weeks that the deviation from an exponential growth model was getting quite large. For an exponential trend line R² = 0.9886.
If the weekly views are instead modeled as growing linearly, the total views grow quadratically. Curve fitting the total views with a 2nd order polynomial yields a very good fit (R² = 0.9977).
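Such a 2nd order fit and its R² can be reproduced without a spreadsheet. The following is a minimal pure-Python sketch of an ordinary least-squares quadratic fit; it is not necessarily the method behind the trend line quoted above, and the function names and test data are illustrative only.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2; returns (c0, c1, c2)."""
    # Normal equations (A^T A) c = A^T y, where each row of A is [1, x, x^2].
    s = [sum(x ** k for x in xs) for k in range(5)]               # power sums of x
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]  # augmented 3x4 matrix
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fitted quadratic."""
    c0, c1, c2 = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (c0 + c1 * x + c2 * x * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Synthetic check: points that lie exactly on y = 5 + 2x + 3x^2.
weeks = list(range(1, 11))
totals = [5 + 2 * w + 3 * w * w for w in weeks]
coeffs = fit_quadratic(weeks, totals)
```

Here `weeks` would be the week number and `totals` the cumulative views at the end of each week; at least three distinct week numbers are needed for the fit to be defined.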
Linear growth of weekly views is compatible with approximately linear increase in content (steady frequency of about 1 post / week) and thus increased chance of Google search indexing new content (with Google search the main source of view traffic). Quadratic growth of total views is also nonlinear, but far slower than exponential growth. For example, the 100,000 view milestone is now projected to be reached in 08/2013 instead of in 01/2013, i.e. in 13 months as compared to 7 months.
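The link between linear weekly views and a quadratic total, and the resulting milestone estimate, can be sketched as follows. The parameters `a` and `b` (base weekly views and weekly increase) are hypothetical placeholders, not values fitted to this Blog's data.

```python
import math

def total_views(weeks, a, b):
    """Cumulative views after `weeks` weeks if week t brings a + b*t views (t = 1..weeks)."""
    return a * weeks + b * weeks * (weeks + 1) / 2

def weeks_to_reach(target, a, b):
    """Smallest whole number of weeks with total_views >= target (assumes a >= 0, b > 0)."""
    # Positive root of (b/2) n^2 + (a + b/2) n - target = 0.
    qa, qb, qc = b / 2, a + b / 2, -target
    n = max(0, math.ceil((-qb + math.sqrt(qb * qb - 4 * qa * qc)) / (2 * qa)))
    # Guard against floating-point rounding at the boundary.
    while n > 0 and total_views(n - 1, a, b) >= target:
        n -= 1
    while total_views(n, a, b) < target:
        n += 1
    return n

print(weeks_to_reach(110, 0, 2))  # 10, since total_views(10, 0, 2) = 10 * 11 = 110
```

Because the total grows with the square of time rather than doubling at a fixed interval, the same target is reached noticeably later than under the exponential model.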
Addendum 11/1/2012: The Blog reached 30,000 views on Oct-19 and here is a chart of the monthly views through Oct-2012:
August and September have been slow, presumably due to seasonal variation. I also didn’t post between late August and mid October. The view data of the last couple of months no longer support the theory of significant growth in view frequency. Instead, multiple dynamic factors come into play. At times views spike due to a mention or a post of temporary interest, such as the recent post on visualizing superstorm Sandy. But such spikes quickly fade away, given the very limited half-life of web information these days. The undulating 4-week trailing average in weekly views below visualizes this clearly. The net effect has been a plateau in view frequency around 3,000 per month.
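A trailing average of this kind is simple to compute from the weekly counts. Here is a minimal sketch; the function name is made up, and averaging over fewer weeks at the start of the series is one possible convention, not necessarily the one used for the chart.

```python
def trailing_average(values, window=4):
    """Mean of the most recent `window` values at each position (fewer at the start)."""
    result = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        result.append(running / min(i + 1, window))
    return result

print(trailing_average([4, 8, 12, 16, 20]))  # [4.0, 6.0, 8.0, 10.0, 14.0]
```

Smoothing over four weeks damps the short spikes from individual posts, so what remains visible is the slower rise and fall of the underlying view rate.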
I continue to see most of the referrals coming from Google searches, still with a majority of those being image searches. Engagement growth has been anemic, with relatively few comments, backlinks or other forms of engagement. It seems to me that growth proceeds in phases, with growth spurts interspersed with plateaus of varying length. One such growth spurt has been reported by Andrei Pandre on his Data Visualization Blog through the use of Google+. Perhaps it’s time to extend this Blog to Google+ as well.
With regard to variation of views by weekday, the qualitative pattern remains. Tuesday is now emerging as the day with the most views, with Monday, Wednesday, and Thursday slightly behind, but still above average. Friday is slightly below average, Saturday is the lowest day with only half the views, and Sunday falls in between.
I’m not sure whether this means that important posts should be published on a particular weekday. Again, most views come from Google searches and are accumulated over time, so perhaps only the height of the initial spike will vary somewhat based on the publishing weekday.