{"id":2578,"date":"2025-09-29T08:13:01","date_gmt":"2025-09-29T08:13:01","guid":{"rendered":"https:\/\/3way.social\/blog\/ai-researcher-contests-slowdown-claims-data-findings\/"},"modified":"2025-09-29T08:27:39","modified_gmt":"2025-09-29T08:27:39","slug":"ai-researcher-contests-slowdown-claims-data-findings","status":"publish","type":"post","link":"https:\/\/3way.social\/blog\/ai-researcher-contests-slowdown-claims-data-findings\/","title":{"rendered":"AI researcher contests slowdown claims with recent exponential data findings"},"content":{"rendered":"<p>Julian Schrittwieser, a prominent AI researcher and Member of Technical Staff at <a style=\"display: inline;\" href=\"https:\/\/www.anthropic.com\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Anthropic<\/a>, has published a detailed analysis countering claims of stagnation in AI development. Schrittwieser is known for his pivotal roles in creating algorithms such as <a style=\"display: inline;\" href=\"https:\/\/en.wikipedia.org\/wiki\/AlphaGo\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">AlphaGo<\/a>, <a style=\"display: inline;\" href=\"https:\/\/en.wikipedia.org\/wiki\/AlphaZero\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">AlphaZero<\/a>, and <a style=\"display: inline;\" href=\"https:\/\/en.wikipedia.org\/wiki\/MuZero\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">MuZero<\/a>. His latest findings, released on September 27, 2025, present evidence of continued exponential growth in AI capabilities.<\/p>\n<p>His analysis draws parallels to misconceptions seen during the early days of the COVID-19 pandemic, when the exponential nature of virus transmission was misunderstood. 
Schrittwieser remarked, &#8220;Long after the timing and scale of the coming global pandemic was obvious from extrapolating the exponential trends, politicians, journalists and most public commentators kept treating it as a remote possibility or a localized phenomenon.&#8221;<\/p>\n<p><a href=\"https:\/\/3way.social\/blog\/ai-researcher-contests-slowdown-claims-data-findings\/anthropic\/\" rel=\"attachment wp-att-2585\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2585\" src=\"https:\/\/3way.social\/blog\/wp-content\/uploads\/2025\/09\/anthropic.jpg\" alt=\"anthropic logo\" width=\"600\" height=\"315\" srcset=\"https:\/\/3way.social\/blog\/wp-content\/uploads\/2025\/09\/anthropic.jpg 2400w, https:\/\/3way.social\/blog\/wp-content\/uploads\/2025\/09\/anthropic-300x158.jpg 300w, https:\/\/3way.social\/blog\/wp-content\/uploads\/2025\/09\/anthropic-1024x538.jpg 1024w, https:\/\/3way.social\/blog\/wp-content\/uploads\/2025\/09\/anthropic-768x403.jpg 768w, https:\/\/3way.social\/blog\/wp-content\/uploads\/2025\/09\/anthropic-1536x806.jpg 1536w, https:\/\/3way.social\/blog\/wp-content\/uploads\/2025\/09\/anthropic-2048x1075.jpg 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<h2 id=\"key-data-reveals-consistent-growth\" class=\"sb h2-sbb-cls\" tabindex=\"-1\">Key Data Reveals Consistent Growth<\/h2>\n<p>Schrittwieser&#8217;s report relies heavily on two evaluation frameworks, METR (Model Evaluation &amp; Threat Research) and OpenAI\u2019s GDPval. METR findings indicate that AI systems are now capable of autonomously completing software engineering tasks lasting up to two hours, achieving a 50% success rate. This marks consistent progress, with task-length capabilities doubling approximately every seven months. 
Notably, models such as Grok 4, Opus 4.1, and GPT-5 have exceeded performance expectations.<\/p>\n<p>Complementing this, OpenAI\u2019s GDPval evaluation, which spans 44 occupations across nine industries, provides additional validation. The study assessed 1,320 tasks designed by professionals with an average of 14 years of experience. Using a blinded comparison methodology, evaluators graded the performance of various AI systems against human-generated solutions. While GPT-5 delivered high accuracy in multiple industries, Claude Opus 4.1 outperformed it, matching expert-level human performance in numerous tasks. Schrittwieser commended these results, stating, &#8220;I want to especially commend OpenAI here for releasing an eval that shows a model from another lab outperforming their own model &#8211; this is a good sign of integrity and caring about beneficial AI outcomes.&#8221;<\/p>\n<h2 id=\"exponential-trends-across-industries\" class=\"sb h2-sbb-cls\" tabindex=\"-1\">Exponential Trends Across Industries<\/h2>\n<p>According to METR, AI\u2019s task-completion capabilities have improved dramatically since 2019, when <a style=\"display: inline;\" href=\"https:\/\/en.wikipedia.org\/wiki\/GPT-2\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">GPT-2<\/a> could handle only tasks of about one second. By 2025, newer models such as Sonnet 3.7, Grok 4, Opus 4.1, and GPT-5 demonstrate capacity for tasks lasting two hours or more. Schrittwieser projects that AI systems will achieve autonomous work capabilities for full eight-hour tasks by mid-2026 and may surpass human expert performance across multiple industries before the end of 2026.<\/p>\n<p>The GDPval evaluation highlights progress beyond software engineering, encompassing areas such as healthcare, finance, manufacturing, and legal analysis. 
Tasks ranged from regulatory compliance and strategic planning to real estate management and technical engineering, designed to mirror realistic workplace scenarios.<\/p>\n<h2 id=\"industry-wide-insights-and-challenges\" class=\"sb h2-sbb-cls\" tabindex=\"-1\">Industry-Wide Insights and Challenges<\/h2>\n<p>The report underscores performance variations among leading AI models, with some, like Grok 4 and Gemini 2.5 Pro, underperforming relative to initial benchmarks. These discrepancies emphasize the critical importance of standardized evaluation methodologies.<\/p>\n<p>Despite the progress, Schrittwieser\u2019s analysis also acknowledges limitations. METR tasks, for instance, are rated on a 16-point &#8220;messiness&#8221; scale, with an average score of 3, while real-world software engineering tasks often score between 7 and 8. Similarly, GDPval tasks are structured for digital-only scenarios with complete instructions, which do not fully reflect the complexities of organizational settings, such as ambiguity, multi-team coordination, or iterative processes.<\/p>\n<h2 id=\"broader-implications-and-looking-ahead\" class=\"sb h2-sbb-cls\" tabindex=\"-1\">Broader Implications and Looking Ahead<\/h2>\n<p>Schrittwieser argues that many misconceptions about AI\u2019s development arise from a focus on surface-level interactions rather than structured evaluations. He emphasizes the importance of understanding exponential growth trends, citing historical examples of technological adoption, such as the internet and mobile devices. &#8220;Mathematical extrapolation often provides more accurate predictions than expert intuition in rapidly changing technical domains,&#8221; he explained.<\/p>\n<p>The findings are particularly timely as debates around AI progress and investment intensify. Schrittwieser\u2019s work suggests that the apparent &#8220;slowdown&#8221; in AI is a misinterpretation and that exponential progress remains on track. 
His projections and data highlight the transformative potential of AI across industries, with significant milestones expected in the near future.<\/p>\n<p>Overall, the analysis reinforces the need for objective frameworks like METR and GDPval to accurately measure AI\u2019s capabilities and ensure transparency in reporting development progress. As the industry continues to evolve, Schrittwieser\u2019s call for a deeper understanding of exponential trends may prove critical for guiding investments, policies, and strategies in the AI space.<\/p>\n<p><em><a style=\"display: inline;\" href=\"https:\/\/ppc.land\/ai-researcher-challenges-claims-of-development-slowdown-with-exponential-data\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Read the source<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Schrittwieser analyzes METR and GDPval data showing continued exponential AI progress and models nearing expert performance.<\/p>\n","protected":false},"author":3,"featured_media":2582,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_lmt_disableupdate":"no","_lmt_disable":"no","footnotes":""},"categories":[25,22],"tags":[],"class_list":["post-2578","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-social-media"],"blocksy_meta":[],"modified_by":"Becky 
Halls","_links":{"self":[{"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/posts\/2578","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/comments?post=2578"}],"version-history":[{"count":3,"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/posts\/2578\/revisions"}],"predecessor-version":[{"id":2586,"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/posts\/2578\/revisions\/2586"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/media\/2582"}],"wp:attachment":[{"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/media?parent=2578"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/categories?post=2578"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3way.social\/blog\/wp-json\/wp\/v2\/tags?post=2578"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}