Saturday, May 23, 2009
Editors and edits
Hans and I met up this evening to discuss the moving average data I had collected. “But what about editors?” he asked. So I extended the data collection to count editors as well. Here’s the data. (Hat tip to Vaibhav Bhawsar, who also pointed that out.)
This chart is a visual mess. It’s also close to the limit of how much data I can pass to the Google Chart API, so I’ll need a better system in place for the next round, something that allows zooming in for closer analysis. For what it’s worth, here are the key things about this chart:
- The blue Moving Window line is now the sum of edits over the preceding seven-day period, not the average.
- The dark gray Editors in Window line is the number of unique editors within each window.
- The y-axis labels are off by a little bit. I can’t figure out why they are not properly calibrated.
- The daily Edit Count and Editor Count lines hug each other closely, but their moving-window counterparts differ visibly.
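For readers who want to reproduce the two window lines, here is a minimal sketch. It is not the script behind the chart: the function name `window_stats`, the `(date, editor)` input shape, and the choice to count the current day as part of its seven-day window are all assumptions made for illustration, and the sample data is invented.

```python
from datetime import date, timedelta

def window_stats(edits, days=7):
    """Return {day: (edits_in_window, editors_in_window)}.

    `edits` is an iterable of (date, editor_name) pairs, one per edit.
    Assumption: each window covers the day itself plus the days - 1
    days before it.
    """
    edits = sorted(edits)
    if not edits:
        return {}
    stats = {}
    day, last = edits[0][0], edits[-1][0]
    while day <= last:
        start = day - timedelta(days=days - 1)
        # Editors of every edit that falls inside this window.
        in_window = [editor for d, editor in edits if start <= d <= day]
        stats[day] = (len(in_window), len(set(in_window)))
        day += timedelta(days=1)
    return stats

# Invented sample: four edits by two people, then one edit a week later.
sample = [
    (date(2009, 1, 1), "alice"),
    (date(2009, 1, 2), "alice"),
    (date(2009, 1, 2), "bob"),
    (date(2009, 1, 2), "alice"),
    (date(2009, 1, 10), "carol"),
]
stats = window_stats(sample)
```

Here `stats[date(2009, 1, 2)]` is `(4, 2)`: four edits in the window, but only two distinct editors. The rescan of the whole edit list for each day is slow but easy to check by hand; a sliding deque would be the obvious optimisation for a full year of data.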
If you look at the three biggest blue peaks, the first on June 11 (54 edits) and the third on March 1 (59 edits) have a large number of editors (27 and 24 respectively), while the second peak on August 25 (50 edits) has only 11 editors. You may recall from the last graph that August 25 was the day with the highest number of edits in the year.
Hans thinks that if we render a scatter graph plotting 1/edits-in-window for the x-axis and editors-in-window/edits-in-window for the y-axis, the first and third peaks will show up close to (0,1).
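As a quick worked example of that transform, the sketch below maps the three peak figures quoted above onto the proposed axes. The helper name `scatter_point` is made up for illustration; only the edit and editor counts come from the post.

```python
def scatter_point(edits_in_window, editors_in_window):
    """Map one window to (x, y) on the proposed scatter graph.

    x = 1 / edits-in-window
    y = editors-in-window / edits-in-window
    """
    return (1 / edits_in_window, editors_in_window / edits_in_window)

# (edits in window, editors in window) for the three biggest peaks.
peaks = {
    "June 11": (54, 27),
    "August 25": (50, 11),
    "March 1": (59, 24),
}

points = {label: scatter_point(e, ed) for label, (e, ed) in peaks.items()}

for label, (x, y) in points.items():
    print(f"{label}: x={x:.4f}, y={y:.2f}")
```

On these figures all three points have x of roughly 0.02. June 11 and March 1 land at y of about 0.50 and 0.41, while August 25 sits lower at 0.22.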
Assuming this works as predicted, we’re close to building a first-level, user-facing analysis tool: give it a page and a date range, and it’ll tell you approximately when there was an edit war, so those periods can be inspected more closely using content analysis.
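A rough sketch of what the date-range part of such a tool might look like, under heavy assumptions: the function name `flag_edit_wars` and both thresholds (`min_edits`, `max_ratio`) are hypothetical and untuned, and "many edits from few editors" is only one possible signal for an edit war. Fetching a page's history is left out entirely. The sample reuses the three peak figures from the post, attached to placeholder years because the post does not give them.

```python
from datetime import date

def flag_edit_wars(stats, start, end, min_edits=30, max_ratio=0.3):
    """Return the days in [start, end] whose window looks suspicious.

    `stats` maps day -> (edits_in_window, editors_in_window).
    A day is flagged when its window has at least `min_edits` edits and
    the editors-per-edit ratio is at most `max_ratio`.
    Both thresholds are illustrative guesses, not measured values.
    """
    flagged = []
    for day in sorted(stats):
        edits, editors = stats[day]
        if not (start <= day <= end):
            continue
        if edits > 0 and edits >= min_edits and editors / edits <= max_ratio:
            flagged.append(day)
    return flagged

# Peak figures from the post; the years are placeholders.
peak_stats = {
    date(2000, 6, 11): (54, 27),
    date(2000, 8, 25): (50, 11),
    date(2001, 3, 1): (59, 24),
}

flagged = flag_edit_wars(peak_stats, date(2000, 1, 1), date(2001, 12, 31))
```

With these guessed thresholds only the August 25 window is flagged, since 11 editors over 50 edits gives a ratio of 0.22.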
Guillaume Marceau, May 23, 2009 10:54:27 PM
I’m curious. Can you post the data for this as well?
Kiran Jonnalagadda, May 23, 2009 11:53:25 PM
I’ve updated the post to link to the data. I haven’t done the scatter-plot chart yet. Will post when done.