Blog posts tagged "stats"

Google Analytics, Solving Someone Else’s Problem

August 22nd, 2006

Still frustrated and disappointed that MeasureMap has gone away, even if it never went as far as I wanted it to go.

Finally got around to trying out Google Analytics. Maybe I’m doing it wrong, but I’m bored by it. It doesn’t answer the questions I’m interested in, though it did flag the interesting statistic that 94% of my visitors are “first time visitors”, which I imagine could also be dubbed the “I’ve got full content feeds” usage profile.

I guess I’ll try Mint next, though the demos don’t look like a lot more than a (very) pretty face on my old analog install, and $30 seems a bit much to pay for that. And really I think stats for blogs need a domain specific package. But there seems to be an active plugin community around Mint, which is promising.


update: Checked back today. It’s interesting to note that with a few more days of data IE’s percentage has already fallen from 42% to 37.6%. Maybe it will fall to 1/3? I, for one, have never believed that 95% market share number folks like to throw around.

Tagged: Uncategorized

MeasureMap Alpha: Review

November 22nd, 2005

I’ve been running the alpha of MeasureMap for a few weeks now, and I thought I’d do a quick brain dump.

First thing you notice? It’s pretty, just ridiculously, gratuitously whizbangy. And that can make it kind of fun to play with, in and of itself. Still, that Flash can be slow (though apparently not compared to Google Analytics), but really, if I were getting my questions answered, I don’t think I’d notice.

And as a domain specific (blogs) stats package, they’ve done some nice work breaking up the reports into appropriate discrete units. (I love the daily overview screen you get when you click thru from the timeline and you’ve posted multiple blog posts in a single day)

The Wrong Questions

Maybe LaughingMeme is abnormal, but I have about 10 posts that Google loves so much that (according to MMap) they account for 47% of all my traffic. (and it’s a power law distribution) In general they aren’t posts I’m all that excited about: year-old speculation on Google IMAP brings in hundreds of visitors a day, while Beklin Sucks! has been a perennial favorite, and daily traffic to I’m feeling lucky often exceeds 1000 visitors.

The fact that people are visiting these pages is boring. And the fact that Google is sending them there, day in and day out? Boring.

Show Me the Novel

What’s new? What’s different today than yesterday, this week than last week? Freak outliers, and emerging trends please.
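To make "what's different today than yesterday" a little more concrete, here is a rough sketch of the kind of report I have in mind. It assumes you already have per-page hit counts for two days as plain dictionaries; the function name `movers`, the thresholds, and the sample paths are all made up for illustration, not anything MeasureMap actually does.

```python
def movers(yesterday, today, min_ratio=3.0, min_hits=10):
    """List (page, yesterday_hits, today_hits) for pages whose traffic jumped.

    A page is flagged when it has at least `min_hits` today and at least
    `min_ratio` times yesterday's hits (pages unseen yesterday count as 1).
    """
    flagged = []
    for page, now in today.items():
        before = yesterday.get(page, 0)
        if now >= min_hits and now >= min_ratio * max(before, 1):
            flagged.append((page, before, now))
    # Biggest relative jump first.
    return sorted(flagged, key=lambda row: row[2] / max(row[1], 1), reverse=True)


# Hypothetical counts: the perennial Google favorite barely moves, so it stays quiet.
yesterday = {"/google-imap": 300, "/coffee": 4}
today = {"/google-imap": 310, "/coffee": 40, "/new-post": 25}

for page, before, now in movers(yesterday, today):
    print(f"{page}: {before} -> {now}")
```

The point of the ratio test is that the steady heavy hitters drop out of the report entirely, and only the outliers are left.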

55 Posts About Coffee And Still They Come

Help me out with audience. What brings the readers? What brings the links? What brings the comments? I’ve got my posts marked up with microformat tags, Yahoo has the term extraction API, let’s use some of that domain specificity to do something new. (and while rel="tag" is the only widely deployed microformat currently, more will follow)

Gurchin has got Gads integration; MMap needs to distinguish itself by exploiting its specialty.

And Speaking of “conversions”

Any chance of hooking up with Feedburner to allow me to plot subscriber spikes to blog posts? No idea if the data would be compelling, but I know that most of the people in my subscription list got there by writing one really good post. (staying there is harder)

Sources and Fans

You’re tracking links in, and links out, I’d love to see that information compiled into its social mesh.

Quirky Stats Munching AI

Okay, what I really want is an AI that gets a kick out of poring over the logs all day, and finding the quirky and sublime.

Imagine logging in to be told that “the query ‘bush in freefall’ was your 22nd most popular search yesterday, but your 1st most popular on searches coming from .mil” (true), or “the spike in ‘weather rss’ this weekend corresponded to freak hail storms across the country” (actually I have no idea why that query spiked). But I’m willing to settle for a bit less.
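Short of an actual AI, the ".mil" observation is just a comparison of two rankings: where a query sits overall versus where it sits inside one slice of visitors. A minimal sketch, assuming the search queries and visitor hostnames have already been pulled out of the logs into (query, hostname) pairs; the function names and the sample hosts are hypothetical.

```python
from collections import Counter


def ranks(counter):
    """Return {query: rank}, rank 1 being the most frequent query."""
    ordered = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
    return {query: position for position, (query, _) in enumerate(ordered, start=1)}


def segment_surprises(hits, suffix, min_gap=1):
    """Find queries that rank higher among hosts ending in `suffix` than overall.

    `hits` is an iterable of (query, hostname) pairs.
    Returns (query, overall_rank, segment_rank) tuples, biggest jump first.
    """
    hits = list(hits)
    overall = ranks(Counter(query for query, _ in hits))
    segment = ranks(Counter(query for query, host in hits if host.endswith(suffix)))
    surprises = [
        (query, overall[query], seg_rank)
        for query, seg_rank in segment.items()
        if overall[query] - seg_rank >= min_gap
    ]
    return sorted(surprises, key=lambda row: (row[2] - row[1], row[0]))


# Made-up sample data, just to show the shape of the input.
SAMPLE_HITS = (
    [("google imap", "dsl.example.com")] * 4
    + [("google imap", "gw.example.mil")] * 1
    + [("weather rss", "cache.example.net")] * 3
    + [("bush in freefall", "gw.example.mil")] * 2
)

for query, overall_rank, mil_rank in segment_surprises(SAMPLE_HITS, ".mil"):
    print(f"'{query}' is #{overall_rank} overall but #{mil_rank} from .mil")
```

Run that over every TLD (or referrer, or country) and sort by the size of the jump, and you have a crude "quirky" report.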

Shell Scripting and Agg Stats

September 21st, 2005

Been a while since I dug into my aggregator stats (intrigued by FeedBurner mentioning they’re tracking 2000 aggregators), and while I’ve got my Perl script, I was alarmed to realize that I had forgotten the shell for doing the equivalent.

So, killing time waiting for J, I re-created it. Assumes you’re using Apache’s “full” (combined) log format, where the User-Agent is the sixth quote-delimited field (and that your feed is “index.rdf”).

sudo grep '/index.rdf' access.log | cut --delimiter=\" -f6,1 --output-delimiter="=" | 
sed 's/ - - \[[^=]*//' | sort | uniq | cut --delimiter="=" -f2 | sort | uniq -c | sort -n

Returns a count of unique IPs per User-Agent. Tack on a little awk to get aggregate counts.

| awk '{sum += $1; print sum}'
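For anyone who would rather not squint at cut and sed, here is a rough Python sketch of the same count. It assumes Apache's combined log format; the function name `unique_ips_per_agent` and the sample log lines are invented for illustration. One difference from the shell: it only looks for the feed path in the request itself, not anywhere on the line.

```python
import re
from collections import defaultdict

# Combined format: IP ident user [date] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" \S+ \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def unique_ips_per_agent(lines, feed_path="/index.rdf"):
    """Map each User-Agent to the number of distinct IPs that fetched the feed."""
    ips_by_agent = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None or feed_path not in match.group("request"):
            continue  # unparseable line, or not a hit on the feed
        ips_by_agent[match.group("agent")].add(match.group("ip"))
    return {agent: len(ips) for agent, ips in ips_by_agent.items()}


# Made-up log lines; in practice pass an open file handle instead.
SAMPLE = [
    '10.0.0.1 - - [21/Sep/2005:10:00:00 -0400] "GET /index.rdf HTTP/1.1" 200 5120 "-" "NetNewsWire/2.0"',
    '10.0.0.1 - - [21/Sep/2005:11:00:00 -0400] "GET /index.rdf HTTP/1.1" 304 - "-" "NetNewsWire/2.0"',
    '10.0.0.2 - - [21/Sep/2005:11:05:00 -0400] "GET /index.rdf HTTP/1.1" 200 5120 "-" "NetNewsWire/2.0"',
    '10.0.0.3 - - [21/Sep/2005:11:10:00 -0400] "GET /index.rdf HTTP/1.1" 200 5120 "-" "Bloglines/2.0"',
    '10.0.0.3 - - [21/Sep/2005:11:11:00 -0400] "GET /about.html HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
]

counts = unique_ips_per_agent(SAMPLE)
for agent, n in sorted(counts.items(), key=lambda item: item[1]):
    print(n, agent)
print("total:", sum(counts.values()))
```

The final line is the stand-in for the awk sum.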

Of course folks like Bloglines, Rojo, Yahoo FeedSeeker, Feedster, and FeedLounge (among others I’m sure) are rolling up the user counts, so one User-Agent stands in for many readers. FeedLounge and FeedSeeker get counted multiple times, as they add time-sensitive info to their User-Agent (that has got to be against some best practice!), and Bloglines comes from a couple of different IPs.

Interestingly, Google Desktop is showing up as generating not only the highest number of hits, but the highest number of 200s.

Tagged: Uncategorized

RSS Aggregator Popularity

March 20th, 2004

9 months ago I ran a report on which of the RSS aggregators hitting LaughingMeme were most popular by total requests. Using some Perl, some awk, some cool shell-fu Steve sent me last time, and Haiko Hebig’s elegant CSS, I whipped up some new graphs: most popular RSS aggregators by unique IP. These stats are for the last month of traffic on LaughingMeme, for each of the RSS feeds associated with this blog.

The first graph is the 20 most popular RSS aggregators regardless of version. The second graph treats each version of the aggregators as distinct.

Couple of things to note.

1. NetNewsWire still dominates, after all this time, not only in popularity, but also in diversity, with a total of 18 distinct versions (counting Lite and paid as distinct) observed.

2. In the first graph Bloglines is listed twice. The first number treats all the subscribers for each of the 4 feeds I surveyed as unique; the second number assumes that the subscribers for the less popular feeds are a subset of the people who subscribe to the main feed. The real number lies somewhere in between (as I think a large number of people only subscribe to MLPs, or to LM)
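Put another way, the two Bloglines numbers are an upper and a lower bound on the number of distinct subscribers. A tiny sketch of the arithmetic, with entirely made-up subscriber counts and feed names (the real per-feed numbers aren't reproduced here):

```python
def subscriber_bounds(per_feed_subscribers):
    """Return (lower, upper) bounds on distinct subscribers across several feeds.

    lower: every smaller feed's audience is a subset of the biggest feed's.
    upper: nobody subscribes to more than one feed.
    """
    counts = list(per_feed_subscribers.values())
    return max(counts), sum(counts)


# Hypothetical counts for four feeds, purely for illustration.
example = {"main": 120, "mlp": 45, "comments": 10, "photos": 5}
low, high = subscriber_bounds(example)
print(f"between {low} and {high} distinct Bloglines subscribers")
```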

Most popular RSS aggregators (all versions) by unique ip

RSS aggregators by unique ip
Tagged: Uncategorized

Tracking Referers for Sourceforge Hosted Site

December 23rd, 2003

Hosting a project with Sourceforge is great. They provide bandwidth, CVS with anonymous access (something I never set up on my own boxes), and above all, a certain credibility. But they don’t provide good traffic analysis. There was some recent discussion about whether people were actually finding Magpie by searching for PHP RSS parsing, or if they were simply searching for a picture of a magpie.

So here is a bit of Javascript for your Sourceforge project home page that will fetch a remote image and tack the original referer url onto the image request.

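<script language="javascript">
// URL of the remote tracking image (set this to an image on a server whose logs you can read)
var img = "";
// Referrer of the enclosing page, i.e. where the visitor came from
var ref = parent.document.referrer;
document.write("<img src=\"" + img + "?" + encodeURIComponent(ref) + "\">");
</script>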

I imagine this is what someone like SiteMeter does.
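The other half of the trick is reading the referrers back out of the access log on whatever box serves the image. A minimal sketch, assuming the logged request looks like `GET /pixel.gif?<encoded referrer> HTTP/1.1`; the function name and the `/pixel.gif` path are hypothetical.

```python
from urllib.parse import unquote


def referrer_from_request(request_line):
    """Pull the original referrer out of a logged request line.

    Expects something like 'GET /pixel.gif?http%3A//example.com/ HTTP/1.1'.
    Returns None when there is no query string to decode.
    """
    parts = request_line.split(" ")
    if len(parts) < 2 or "?" not in parts[1]:
        return None
    encoded = parts[1].split("?", 1)[1]
    return unquote(encoded) or None


sample = "GET /pixel.gif?http%3A//www.google.com/search%3Fq%3Dphp%20rss%20parser HTTP/1.1"
print(referrer_from_request(sample))
```

From there it is the usual sort and count to see which searches are actually bringing people in.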

Tagged: Uncategorized

Aggregator Traffic Stats

July 10th, 2003

Inspired by hebig’s post on the subject, I ran the user agent numbers for my RSS feeds (last 30 days of data). I also stole the nifty CSS bar graphs from the same location.

RSS Aggregators by percent of total requests
NetNewsWire Lite/1.0.2
Hep Messaging Library/0.0.2
Radio UserLand/8.0.8
Feedster Harvester/1.0;
rssSearch Harvester/1.0;
[not listed: 149 browsers]

Tagged: Uncategorized

Tracking RSS Users

May 30th, 2003

Tbray has an article on counting RSS subscribers as part of making RSS commerce ready. (though I’m sure we can find some better use than that)

Predictably he stumbled onto the coals of the old referer flamewars, which while gone dark and black, apparently had enough heat to set him straight quickly.

Personally I never understood the problem. I think the practice of passing extra info in the referer logs got tarred by bad implementation. Yes, it’s annoying when Syndirella or Amphetadesk put a raw url into the referer field (the 2 clients that still seem to do this in my logs), and this confuses analog, but I liked Straw’s implementation, which sent you a referer url in the (approximate) form of […]. I miss this feature, which was dropped from Straw about the time aggregator writers were being drawn and quartered for “referer spam”. And Tim’s suggestion of using email hashes, while adding back the countability, doesn’t reopen the channels of communication. (well, you can take every email address you know, and run it against the hashes, and see if anything turns up, but that has some obvious limits)
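For the curious, the "run every address you know against the hashes" trick looks roughly like this. The hash function is an assumption on my part (SHA-1 over the lowercased address); the proposal would have to pin down the real one, and the function names and addresses here are made up.

```python
import hashlib


def email_hash(address):
    """SHA-1 hex digest of a normalised address (the hash choice is an assumption)."""
    return hashlib.sha1(address.strip().lower().encode("utf-8")).hexdigest()


def match_known_addresses(seen_hashes, known_addresses):
    """Return the known addresses whose hash shows up among the logged hashes."""
    seen = set(seen_hashes)
    return [addr for addr in known_addresses if email_hash(addr) in seen]


# Hypothetical data: one hash from a reader we know, one from a stranger.
logged = [email_hash("alice@example.com"), "0" * 40]
address_book = ["bob@example.com", "Alice@Example.com"]
print(match_known_addresses(logged, address_book))
```

Which shows the obvious limit: you only ever find the people already in your address book.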

Tagged: Uncategorized


March 17th, 2003

Alright, one of you people is a little weird. Since I’ve started including the full content of posts in my feed, I’ve definitely seen an increase in aggregators. (and a decrease in page views) That’s fine.

But you, the one who is running RssBandit, nntp//rss, and something involving LWP, that’s overdoing it a bit. I promise I serve the same RSS feeds to all user-agents, really. One aggregator should be good enough.

Tagged: Uncategorized