93rd
Accuracy Rank

404_NOT_FOUND

Nicolò

Relative Brier Score: No Scores Yet

Forecasts: 0

Upvotes: 0
Forecasting Activity
Forecasting Calendar
 

                                  Past Week  Past Month  Past Year  This Season  All Time
Forecasts                                 2          22        254          238       802
Comments                                  3          24        269          260       413
Questions Forecasted                      2          20         44           36        66
Upvotes on Comments By This User          1           9        219          200       828
New Prediction
Why do you think you're right?

Trump just signed an executive order to block individual States from enforcing their own AI laws and regulations. 

While this can be seen as a further attempt to ensure the AI industry remains unregulated, the full text of the executive order actually lays the foundation for the regulation of AI systems at a national level.

My Administration must act with the Congress to ensure that there is a minimally burdensome national standard — not 50 discordant State ones. The resulting framework must forbid State laws that conflict with the policy set forth in this order. That framework should also ensure that children are protected, censorship is prevented, copyrights are respected, and communities are safeguarded. A carefully crafted national framework can ensure that the United States wins the AI race, as we must. [1]

The probability of AI being regulated through a comprehensive national framework within the next year increases substantially. Yet I still see no realistic way for anything to happen by the end of this year.

[1] ENSURING A NATIONAL POLICY FRAMEWORK FOR ARTIFICIAL INTELLIGENCE

Why might you be wrong?

No change

New Badge
404_NOT_FOUND
earned a new badge:

Active Forecaster

New Prediction
Why do you think you're right?

The recent news that Germany has summoned the Russian ambassador to formally accuse Russia of a cyber-attack on its air traffic control [1] adds layers of complexity to this question:

  • First, the attack took place in August 2024. The formal accusation arrives after a 15-month lag. Granted, the resolution criteria would allow a positive resolution even without a formal governmental accusation, as a report by reputable cybersecurity researchers would suffice. Yet this shows how long a formal investigation can take to remove doubt about who carried out an attack.
  • Second, as this instance shows, the full extent of the effects of a cyberattack on nationally critical infrastructure might remain hidden from the public as it is treated as sensitive information that has to remain classified.

Just as in basically every other instance, the August 2024 cyberattack didn't cause any apparent kinetic effects, and its objective might have been purely to disrupt systems or gather intelligence. 

In conclusion, once more, there is evidence that resolving this question as "Yes" might not be as easy as it seems.

[1] BBC - Germany accuses Russia of air traffic control cyber-attack

Why might you be wrong?

Anthropic reported that a Chinese state-sponsored group recently jailbroke Claude and had it execute cyberattacks on multiple global targets in an espionage campaign. [2]

Since these agentic AIs occasionally succeeded in extracting data from sensitive databases at a few of the targeted organizations, we need to update our beliefs about what LLMs can be used to achieve.

In this particular instance, it was still an expert hacker group that attempted the cyberattack, and they used AI as a tool to carry out the most labor-intensive portions of the cyberattack in a fraction of the time it would take humans. As long as there is no evidence that this will cause the number of attacks of various types to skyrocket, I remain resistant to increasing the baseline probability of a cyberattack. 

Nonetheless, this is a significant development for this question.

[2] Disrupting the first reported AI-orchestrated cyber espionage campaign

New Badge
404_NOT_FOUND
earned a new badge:

Power Forecaster - Nov 2025

Earned for making 20+ forecasts in a month.
New Badge
404_NOT_FOUND
earned a new badge:

Star Commenter - Nov 2025

Earned for making 5+ comments in a month (rationales not included).
New Prediction
Why do you think you're right?
  • The FDA's approval process for De Novo medical devices (the use of a novel technology wouldn't seem suited to a 510(k) application) typically takes 12-14 months
  • Only devices that are already in the approval pipeline have a chance of being approved before the end of March
  • Two such devices seem to exist. Their applications date back to earlier than May 2022 (Wysa) and earlier than January 2025 (PathChat DX).
  • The typical FDA approval rate for De Novo applications is lower than 50%

Both Wysa and PathChat could reach the end of the pipeline by the end of March, yet the former has been under review for 3 years now, implying that there might be some concerns there. Similar concerns would apply to PathChat as well, so I'm expecting both a) a longer approval timeline and b) a lower success rate.

This would seem to be confirmed by a recent DHAC meeting (as highlighted by @grainmummy here) where the FDA signalled the intention to create an ad-hoc regulatory framework for these new devices. This would likely take months and result in a new set of requirements that manufacturers would be asked to meet, further extending the timeline.

The creation of a new risk-based framework specifically for GenAI technologies is likely to take more than the 4 months that are left in this question, so the likelihood of approval of any LLM-based device is very low.

I'm remaining higher than the crowd due to the existence of these two companies, whose products have already been in the pipeline for long enough.

Why might you be wrong?
  • The DHAC is only an advisory body, so the FDA might proceed with approval without implementing any new framework
grainmummy
made a comment:
As you said, a balanced perspective should be considered, where the risks outweigh the benefits, but the benefits don't make it entirely impossible. General regulatory headwinds and past De Novo deadlines suggest a "No," but the presence of two strong candidates already in the pipeline and undergoing expedited review makes the likelihood of a "Yes" low.  Thank you so much for tagging me.
New Prediction
Why do you think you're right?

TL;DR: We have no official data for any of the largest models. Epoch AI is guesstimating the total training compute and often basing new estimates on older ones. 

Rationale

Leading AI companies are not disclosing any data regarding the total training compute necessary for their models. Therefore, we have no official data for any of the largest models in the list.

Grok and GPT's total training compute is estimated. For Claude and Gemini, there is an estimate in the "Training compute notes" section, but it's not reported in the adjacent column that will be used for resolution.

What this means in practice is that we are currently forecasting the likelihood that Epoch AI's estimate for one model will surpass the required threshold.

So, how is Epoch AI estimating these values? There appears to be no single consistent methodology; instead, they use multiple approaches depending on which data are available. Here are a few examples:

  • In most cases, it's a formula based on the number of parameters the model has (which, in most cases, is not publicly disclosed and therefore estimated)
  • GPT-5's training compute is based on nothing solid. It's just a series of hypotheses of what the compute should be in relation to GPT-4 (whose compute notes read "this is a rough estimate based on public information, much less information than most other systems in the database")
  • Claude 3.7's total training compute is based on the claim that it took "a few tens of millions of dollars" to train. Epoch AI took the geometric average of $20-90 M and reverse-engineered the total compute based on hypotheses of the cost of training.
  • Most importantly, the estimate for Grok 4 (our current benchmark model) is based on qualitative assumptions relative to Grok 3's training, which was itself based on estimates of a training duration of "approximately 3 months".
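
To make the Claude 3.7 example above concrete, here is a minimal sketch of the arithmetic involved. Only the $20-90 M range and the use of a geometric average come from the compute notes as described above. The `cost_to_flop` helper and all of its parameters (hourly price, peak throughput, utilization) are hypothetical placeholders of mine and not Epoch AI's actual method or values.

```python
import math


def geometric_mean(low: float, high: float) -> float:
    """Geometric mean of two positive bounds."""
    return math.sqrt(low * high)


def cost_to_flop(cost_usd: float, usd_per_gpu_hour: float,
                 peak_flop_per_second: float, utilization: float) -> float:
    """Hypothetical cost-to-compute conversion (an assumption, not Epoch AI's formula).

    Converts a training budget into GPU-hours, then into FLOP, given an
    assumed hardware throughput and an assumed utilization rate.
    """
    gpu_hours = cost_usd / usd_per_gpu_hour
    return gpu_hours * 3600 * peak_flop_per_second * utilization


# Geometric average of the $20-90 M range, in millions of dollars.
midpoint_musd = geometric_mean(20, 90)
print(round(midpoint_musd, 1))  # 42.4
```

The geometric average lands near $42 M, well below the arithmetic midpoint of $55 M. Every further input needed to turn that figure into total FLOP is itself an assumption, which is the point made above.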

Overall, it seems like a giant guesstimation game, which yields reasonable and plausible estimates but might not be well suited to serve as the resolution criterion for a forecasting question.

Why might you be wrong?

No change

New Prediction
Why do you think you're right?
Slightly adjusting to the current crowd consensus.
Why might you be wrong?

One key issue with this question is that it's challenging to update the prediction in a meaningful way regularly. New information is sporadic and is usually not available through mainstream media sources. Rather, one would need to read through the cryptic abstracts of highly domain-specific newly published papers to see if the content is relevant to the question and how big the effect could be. It seems to me that advances are incremental, and it's unclear how much they contribute to the overall outcome.

New Prediction
Why do you think you're right?

Our own current biology appears as mirror-life to any mirror-life organism we might be able to create. The existential risk assumption is that newly created mirror-life bacteria would pose a threat to our ecosystems while also being immune to our natural defenses. It seems kind of contradictory to me, as it would make sense to assume that the threat would be at least reciprocal.

The idea seems to be that mirror-life can thrive and survive by eating/infecting non-mirror-life organisms, while our collective terrestrial biology would be incapable of affecting it.

The more I think about this, the less the alarmists seem to make sense, and I'm starting to think that policymakers will soon have similar doubts.

Global searches for "mirror life" on Google Trends. Are we already past peak hype? If so, the likelihood that any policy decision is made would be greatly reduced.


The only recent mention of mirror-life in mainstream media is in a CBC review of the Plur1bus drama TV series, where it, somehow, occupies the first 6 paragraphs of the article.

Why might you be wrong?

No change
