- Probably going to be 77 pts.
- The current leader is o4-mini (high) at 70.
- The next-best OpenAI model is o3 at 67; Google Gemini 2.5 Pro scores 69. Competitive pressure keeps OpenAI incentivized to stay 1-3 pts ahead.
- Previous OpenAI flagships: GPT-4 scored 36 and GPT-4o scored 41; o3 and o4-mini (high) are at 67-70.
- About +30 pts in roughly 20 months. The rate is slowing as benchmarks saturate.
- Their index methodology weights 8 hard evals and claims a measurement error of about 1 pt.
- Long-run rate: 34 pts over 26 months (Mar 2023 to May 2025), or about 1.3 pts/month.
- o3 -> o4-mini (high) gained only +3 pts, which shows diminishing marginal returns and points to benchmark saturation.
- Approach 1: 1.3 pts/mo * 7 months (May to Dec) = about 9 pts. Approach 2: assume the rate tapers to 0.8 pts/mo, which gives about 6 pts. The midpoint of the two approaches is about +7 pts, so 70 + 7 = 77 pts.
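The projection arithmetic in the notes above can be checked with a short sketch. All figures come from the bullets; the variable and function names are illustrative assumptions, and averaging the two approaches is one reading of "median of two approaches".

```python
# Figures taken from the notes above; names are illustrative only.
current_score = 70          # o4-mini (high), the current leader
historical_rate = 34 / 26   # pts per month, Mar 2023 to May 2025
tapered_rate = 0.8          # pts per month, assumed slowdown
months_remaining = 7        # May to Dec


def projected_gain(rate_per_month: float, months: int) -> float:
    """Linear extrapolation: points gained at a constant monthly rate."""
    return rate_per_month * months


gain_trend = projected_gain(historical_rate, months_remaining)
gain_taper = projected_gain(tapered_rate, months_remaining)

# With only two approaches, the median is simply their midpoint.
midpoint_gain = (gain_trend + gain_taper) / 2
forecast = current_score + midpoint_gain

print(f"trend: +{gain_trend:.1f}, taper: +{gain_taper:.1f}, "
      f"midpoint: +{midpoint_gain:.1f}, forecast: {forecast:.0f}")
```

Both approaches are straight-line extrapolations, so the result is only as good as the assumed monthly rates.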
Relative Brier Score: -0.046227
Forecasts: 3
Upvotes: 1
Forecasting Activity
Forecasting Calendar
No forecasts in the past 3 months
| | Past Week | Past Month | Past Year | This Season | All Time |
|---|---|---|---|---|---|
| Forecasts | 0 | 0 | 3 | 3 | 3 |
| Comments | 0 | 0 | 3 | 3 | 3 |
| Questions Forecasted | 0 | 0 | 2 | 2 | 2 |
| Upvotes on Comments By This User | 0 | 0 | 1 | 1 | 1 |
New Prediction
| Probability | Answer |
|---|---|
| 10% | Less than or equal to 70 |
| 21% | Between 71 and 74, inclusive |
| 27% | Between 75 and 78, inclusive |
| 25% | Between 79 and 82, inclusive |
| 17% | More than or equal to 83 |
Why do you think you're right?
Why might you be wrong?
- If the metric is rebased (a wider dataset mix or tighter scoring), raw scores might dip 3-6 pts.
- If scaling laws flatten and no GPT-5 equivalent is production-ready, OpenAI might end 2025 with only incremental fine-tuning gains (70-73).
- A multimodal "o5" or an agent-augmented o5-reasoning model might bring significant gains, if it ships or if the science is there.
New Prediction
| Probability | Answer |
|---|---|
| 18% | Less than or equal to 1349 |
| 21% | Between 1350 and 1499, inclusive |
| 23% | Between 1500 and 1649, inclusive |
| 18% | Between 1650 and 1799, inclusive |
| 20% | More than or equal to 1800 |
Why do you think you're right?
- I think the most likely outcome is about 1550 cases (the median of the scenarios below).
- The recent trajectory shows cases slowing down. https://www.theguardian.com/us-news/2025/may/24/texas-measles-outbreak
- Low-spread scenario: 1 case/day
- Moderate: 3 cases/day
- Status quo: about 5 cases/day
- High (no slowdown; based on the average rate from Jan 1 to May 23): 7.4 cases/day = about 970 additional cases between May 23 and Sep 30, for a total of about 2000.
- High baseline immunity.
- Past analog: 2019 had 1249 cases by 30 Sep and then only 25 more through year end.
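The scenario totals above can be reproduced with a rough sketch. The daily rates and dates come from the bullets. The case count as of May 23 is not stated in the notes, so it is backed out here from the 7.4 cases/day average, which is an assumption. All names are illustrative.

```python
from datetime import date
from statistics import median

start = date(2025, 1, 1)
cutoff = date(2025, 5, 23)    # last date covered by the data
end = date(2025, 9, 30)       # end of the forecast window

days_elapsed = (cutoff - start).days
days_remaining = (end - cutoff).days

high_rate = 7.4  # cases/day, average from Jan 1 to May 23 per the notes

# ASSUMPTION: cases to date are inferred from the average rate,
# because the notes do not give the May 23 count directly.
cases_to_date = high_rate * days_elapsed

# Daily case rates for each scenario, as listed in the notes.
scenarios = {"low": 1.0, "moderate": 3.0, "status quo": 5.0, "high": high_rate}

totals = {
    name: cases_to_date + rate * days_remaining
    for name, rate in scenarios.items()
}
median_total = median(totals.values())

for name, total in totals.items():
    print(f"{name:>10}: {total:.0f} cases by {end}")
print(f"median of scenarios: {median_total:.0f}")
```

Under these assumptions the median lands in the 1500-1649 bucket, close to the 1550 estimate, and the high scenario comes out near 2000.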
Why might you be wrong?
- If the total comes in higher: some communities may have lower vaccination rates, there may have been a summer travel spike, or the healthcare system in Texas may have been stretched by the spring outbreak.
- If the total comes in lower: the current outbreak would have to burn out rapidly, perhaps through higher-than-usual emergency vaccination uptake. World events that reduce travel may also reduce infections. Case counts may also be off in either direction if some cases are misreported.