By when will AIs perform at least as well as humans on GAIA?

AI · Technology · One-Off · 9a

Manifold Markets · No KYC · Verified resolution data · Well calibrated

Data quality notice: Stale data

Data as of 4 Jun 2026, 4:44 UTC · policy pm-quality-3

Current community forecast

Before 2035-01-01: 97.1%

Leading among 7 options

Forecasters

Question type: multiple choice

Methodology: Play-money forecasting platform

Source type: Forecast

Market data

Updated 53 days ago (stale)

21 Feb 24, 4:36 · 2 Jan 36, 7:59

Trends

Outcome | 24h | Probability

Before 2024-06-01

Before 2025-01-01

Simulated funds only, no real money · Not financial advice

Selected outcome

Before 2027-01-01: 92%

Rules

The GAIA benchmark (https://arxiv.org/abs/2311.12983) aims to test for the next level of capability for AI agents.

Quoting from the paper: "GAIA proposes real-world questions that require a set of fundamental abilities such as reasoning, multi-modality handling, web browsing, and generally tool-use proficiency.
GAIA questions are conceptually simple for humans yet challenging for most advanced AIs: we show that human respondents obtain 92% vs. 15% for GPT-4 equipped with plugins."
This market will resolve based on when an AI system performs as well as or better than humans on all 3 of the different levels of the benchmark.
I'll use the numbers from Table 4 in the paper: 93.9% on level 1, 91.8% on level 2, and 87.3% on level 3.
(I'm using the conjunction of all 3 levels rather than the average to be somewhat conservative about this level being achieved.)
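
As a rough illustration of the resolution rule above, the sketch below checks an AI system's per-level scores against the human numbers quoted from Table 4. The function names and the example scores are hypothetical, and the "average" comparison is assumed here to be an unweighted mean over the three levels, which the rules do not specify.

```python
# Human scores per GAIA level, as quoted in the market rules (Table 4 of the paper).
HUMAN_BASELINE = {1: 93.9, 2: 91.8, 3: 87.3}


def meets_human_level(ai_scores: dict[int, float]) -> bool:
    """Conjunction rule: the AI must match or beat humans on every level.

    A missing level counts as a score of 0, so it fails the check.
    """
    return all(
        ai_scores.get(level, 0.0) >= human
        for level, human in HUMAN_BASELINE.items()
    )


def mean_meets_human_level(ai_scores: dict[int, float]) -> bool:
    """Looser alternative the rules decline to use (assumed unweighted mean)."""
    ai_mean = sum(ai_scores[level] for level in HUMAN_BASELINE) / len(HUMAN_BASELINE)
    human_mean = sum(HUMAN_BASELINE.values()) / len(HUMAN_BASELINE)
    return ai_mean >= human_mean


# Hypothetical scores: strong on levels 1 and 2, just short on level 3.
example = {1: 97.0, 2: 93.0, 3: 85.0}

print(meets_human_level(example))       # conjunction rule
print(mean_meets_human_level(example))  # average rule
```

With these made-up scores the average rule would pass while the conjunction rule would not, which is the sense in which the conjunction is the more conservative criterion.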

Related Markets

Will Anthropic’s valuation hit __ by December 31? · €59.6K · ↑$1.1T: 100% · Polymarket

Which company has best AI model end of July? · €28.4K · Anthropic: 99% · Polymarket

Will any AI model reach ___ Overall Arena Score by September 30? · €11.9K · 1510: 100% · Polymarket

When will a non-SpaceX successfully reusable booster be first launched? · €6.2K · By Dec 31, 2025: 74% · Manifold Markets

When will any company achieve AGI? · €2.4K · Before Oct 1, 2027: 37% · Kalshi

When will Google release Gemini 3.5 Pro? · €2K · Before Jul 31, 2026: 3% · Kalshi
