Sep 06 2020

Light as a Feather: Removing Google Analytics

Gathering analytics data is becoming increasingly difficult for publishers. uBlock is commonly used by the average internet user and Pi-hole is quickly growing in the technologist space. I applaud these privacy-concerned folks for taking personal agency at a time when internet privacy is more difficult than ever. From a publisher’s point of view, users ghosting analytics is simply a case of “rolling with the punches”.

While I had Google Analytics on my site for about two years, I seldom checked it. It felt like giving Google free data more than anything. Part of the issue is that I never really took the time to learn the ins and outs of the Google Analytics tooling. Likewise, I did not engage in marketing campaigns where this feedback could be vital.

What are my “business” requirements for analytics on this blog? Not much. It’s a personal site tucked really far back in the corners of the internet. Since nothing is monetized, engagement is not something that’s really on the todo list.

With this in mind, it seems excessive that the Google Analytics tracker constituted approximately 75% of the page-weight. In this case, I’ve decided that it is better to drop Google Analytics, which is a giant step towards my philosophy of “only give the users what they ask for.”

So what now?

I’ve decided to exchange Google Analytics for a completely server-side analytics tool called awstats. Awstats parses server logs to gather information regarding traffic to the site. I have these statistics digested daily, and review the results about once per month. The installation is relatively easy, and reviewing the data is as simple as rsyncing a PDF to my local machine. Of course, I don’t run the rsync command directly. It is baked into my daily routine (last paragraph).
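
To give a rough idea of what log-based analytics involves, here is a minimal Python sketch that counts hits per path and per user agent. It is not how awstats works internally. It assumes the server writes the common "combined" log format, and the function name `summarize` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed input: the "combined" access log format used by many
# Apache and nginx setups. Other formats need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count hits per requested path and per user agent."""
    paths = Counter()
    agents = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the expected format
        paths[match["path"]] += 1
        agents[match["agent"]] += 1
    return paths, agents


# Hypothetical sample lines; real input would be read from the access log.
sample = [
    '203.0.113.5 - - [06/Sep/2020:10:00:00 +0000] "GET /posts/feather HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '198.51.100.7 - - [06/Sep/2020:10:01:00 +0000] "GET /posts/feather HTTP/1.1" '
    '200 5120 "-" "Blackboard_Safeassign"',
    'not a log line',
]

paths, agents = summarize(sample)
print(paths.most_common(1))
print(agents.most_common())
```

Everything here runs on the server against data it already records, so nothing extra is shipped to the visitor's browser.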

Some of the general information that is reported includes:

The above statistics are broken down further by:

Other sections include:

While going through some of this data, I learned that the most frequent unknown user agent visiting this site is Blackboard_Safeassign. It is interesting that my site is archived by Blackboard and used in conjunction with their plagiarism detection software.

One other interesting discovery is that 80.2% of my traffic is from non-mobile devices. Though with 17.7% being Linux, I’m sure a good chunk of the desktop traffic is just from me. However, it is also very possible that some bots are spoofing their operating system.
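
Device and operating system figures like these come from the User-Agent header, which the client reports about itself. The Python sketch below shows one rough way to bucket user agents. The substring rules and the `classify` helper are illustrative assumptions, not the rules awstats applies, and any client can send whatever string it likes.

```python
from collections import Counter


def classify(agent):
    """Rough (device, os) guess from a self-reported User-Agent string."""
    # "Mobi" appears in most mobile browser user agents.
    device = "mobile" if "Mobi" in agent else "non-mobile"
    # Order matters: Android agents also contain "Linux", and
    # iPhone/iPad agents also contain "Mac OS X".
    if "Android" in agent:
        os_name = "Android"
    elif "iPhone" in agent or "iPad" in agent:
        os_name = "iOS"
    elif "Windows" in agent:
        os_name = "Windows"
    elif "Mac OS X" in agent:
        os_name = "macOS"
    elif "Linux" in agent:
        os_name = "Linux"
    else:
        os_name = "unknown"
    return device, os_name


# Hypothetical user agents for illustration only.
sample_agents = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0",
    "Mozilla/5.0 (Linux; Android 10; Pixel 3) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/85.0 Mobile Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/85.0 Safari/537.36",
    "Blackboard_Safeassign",
]

devices = Counter(classify(agent)[0] for agent in sample_agents)
systems = Counter(classify(agent)[1] for agent in sample_agents)
non_mobile_share = 100 * devices["non-mobile"] / len(sample_agents)
print(devices, systems, non_mobile_share)
```

A bot that copies a desktop browser's user agent lands in the same bucket as a real desktop visitor, which is why these percentages are best read as estimates.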

If you’re not concerned about client-exclusive information, I implore you to look into server-side analytics. This lightens the JavaScript load on the client and does not donate your data to other providers. Finally, since server-side analytics are not dependent on the client, you’re bound to get precise data!

Goodbye JavaScript page-weight, for a feather I will become.