Question #70228 posted on 12/16/2012 9:40 p.m.
Q:

Dear 100 Hour Board,

Based on historical question-asking trends, when do you project that the 100,000th question will be asked? The 250,000th? The 1,000,000th?

-I dibs the 100,000th

A:

Dear no calling shotgun until you can see the car,

The timing of your question was actually great, since it coincided with me studying linear regressions for my stats final. In order to answer it without spending crazy amounts of time, I took a small sample of past questions (all of them asked after Katya went through the archives and assigned question numbers to questions asked before question numbers were instituted) and recorded the month and year that they were asked. Limiting myself to nine questions and rounding the time to the nearest month means that this regression won't be as accurate as it could be, but I claim finals week as a legitimate excuse. Also, this measures when questions were posted, not when they were asked (which is what determines their question number), but since I'm already rounding to the nearest month, I'm going to work under the reasonable assumption that that doesn't have more than a trivial effect. Here's the data I used:

Question number (x)    When the question was asked (y)
38000                  September 2007
42000                  January 2008
46000                  June 2008
50000                  February 2009
54000                  October 2009
58000                  June 2010
62000                  February 2011
66000                  January 2012
70000                  December 2012

To make calculations easier, I did a couple of transformations. I divided each of the question numbers by 1000, and I converted the months and years into a single count of months (with zero representing December 2006). Here's what that looks like:

Question number / 1000 (x)    Months since December 2006 (y)
38                            7
42                            13
46                            18
50                            26
54                            34
58                            42
62                            50
66                            61
70                            72
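For anyone who wants to redo the conversion, here is a minimal Python sketch of the two transformations as described. The function names are made up for illustration, and the month formula is just one way to implement "zero represents December 2006". Note that under this convention September 2007 comes out as 9, while the table lists 7 for that row (7 is what July 2007 would give); the sketch only implements the stated rule and does not try to settle which is right.

```python
# Sketch of the two transformations described above.
# Function names are hypothetical, chosen for this example only.

def months_since_dec_2006(year: int, month: int) -> int:
    """Whole months elapsed since December 2006 (month is 1 through 12)."""
    return (year - 2006) * 12 + (month - 12)

def scale_question_number(number: int) -> float:
    """Divide the question number by 1000, as in the second table."""
    return number / 1000

print(months_since_dec_2006(2006, 12))  # December 2006 -> 0
print(months_since_dec_2006(2008, 1))   # January 2008  -> 13
print(months_since_dec_2006(2012, 12))  # December 2012 -> 72
print(scale_question_number(70000))     # -> 70.0
```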

With that information, I was then able to do the linear regression.

And I could tell you exactly how I got the linear regression, but I've already taken the final and I feel lazy now, so we'll just pretend I did it on my calculator. Based on the linear model y = ax + b, I got the following results:

a = 2.016666667
b = -73.011111111

Cutting off some of the repeating digits, that makes the linear model y = 2.0167x - 73.0111. The r² value, which I may have improperly computed but don't feel like redoing now that it's Christmas break, is approximately 0.988; a value of 1 indicates that the model fits the data points perfectly.
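If you'd like to check the numbers yourself, here is a minimal sketch of an ordinary least-squares fit in plain Python, using the transformed values exactly as listed in the table. This is not necessarily how the original calculation was done (that was a calculator); the function name `fit_line` is an assumption for this example.

```python
# x: question number / 1000; y: months since December 2006,
# copied from the transformed table as listed.
xs = [38, 42, 46, 50, 54, 58, 62, 66, 70]
ys = [7, 13, 18, 26, 34, 42, 50, 61, 72]

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                       # slope
    b = mean_y - a * mean_x             # intercept
    r_squared = sxy ** 2 / (sxx * syy)  # squared correlation coefficient
    return a, b, r_squared

a, b, r_squared = fit_line(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}, r^2 = {r_squared:.3f}")
# prints: a = 2.0167, b = -73.0111, r^2 = 0.988
```

Run on these nine points, the sketch agrees with the slope, intercept, and r² quoted above.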

Once I had that regression, I plugged 100, 250, and 1000 (remember, I divided x by 1000 earlier) into the equation. Here are the results, transformed back into months and years:

Board Question #100000: August 2017
Board Question #250000: November 2042
Board Question #1000000: November 2168
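The plug-in step can be sketched in Python as well. This assumes the rounded model y = 2.0167x - 73.0111 and assumes fractional months are truncated to whole months, which is the choice that reproduces the three dates above (rounding to the nearest month would instead give September 2017 and December 2168 for the first and last). The function names are hypothetical.

```python
import math

SLOPE = 2.0167        # a, from the linear model above
INTERCEPT = -73.0111  # b, from the linear model above

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

def predicted_month_index(question_number: int) -> float:
    """Months since December 2006 predicted for a given question number."""
    return SLOPE * (question_number / 1000) + INTERCEPT

def index_to_month_year(index: float) -> str:
    """Convert a month index (0 = December 2006) back to 'Month Year'."""
    whole = math.floor(index)       # assumption: drop the fractional month
    since_jan_2006 = whole + 11     # December 2006 is 11 months after January 2006
    year = 2006 + since_jan_2006 // 12
    return f"{MONTH_NAMES[since_jan_2006 % 12]} {year}"

for n in (100000, 250000, 1000000):
    print(f"Board Question #{n}: {index_to_month_year(predicted_month_index(n))}")
# prints:
# Board Question #100000: August 2017
# Board Question #250000: November 2042
# Board Question #1000000: November 2168
```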

So, if you feel like making a note in your calendar for four years and eight months from now, you just might be able to ask the one hundred thousandth question!

-yayfulness