A new real-market cryptocurrency trading experiment that pits leading artificial intelligence models against one another to evaluate their respective investing abilities has seen a DeepSeek model outperform rivals so far.
In Alpha Arena, launched on Friday by US research firm Nof1, six large language models (LLMs) were given US$10,000 each to invest in six cryptocurrency perpetual contracts on the decentralised exchange Hyperliquid, including bitcoin and solana.
As of 2pm on Tuesday, DeepSeek’s V3.1 had performed the best, with a profit of 10.11 per cent. The worst-performing model was OpenAI’s GPT-5, with losses of 39.73 per cent.
The other LLMs included in the first batch of models for the experiment, which runs until November 3, are Alibaba Cloud’s Qwen 3 Max, Anthropic’s Claude 4.5 Sonnet, Google DeepMind’s Gemini 2.5 Pro and xAI’s Grok 4. Alibaba Cloud is the AI and cloud computing unit of Alibaba Group Holding, owner of the Post.
Grok 4 of xAI is another top performer in Alpha Arena. Photo: AFP
“Our goal with Alpha Arena is to make benchmarks more like the real world, and markets are perfect for this,” the Alpha Arena website said. “They’re dynamic, adversarial, open-ended and endlessly unpredictable.” Markets also “challenge AI in ways that static benchmarks cannot”, it added.
The models’ stated objective is to maximise risk-adjusted returns. They execute trades autonomously based on the same sets of prompts and input data, such as funding rates and volume, with their returns then logged in a public leaderboard.
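The article does not say how Alpha Arena measures risk-adjusted returns. As a rough illustration only, the sketch below uses a Sharpe-style ratio (mean return per period divided by the volatility of those returns), which is one common choice and an assumption here, not the benchmark's confirmed method. The account values are hypothetical and are not taken from the leaderboard.

```python
from statistics import mean, stdev


def period_returns(equity):
    """Simple returns between consecutive account values."""
    return [curr / prev - 1 for prev, curr in zip(equity, equity[1:])]


def sharpe_ratio(equity, risk_free_per_period=0.0):
    """Mean excess return per period divided by its sample standard deviation.

    Assumption: a Sharpe-style ratio stands in for "risk-adjusted return";
    the source does not name the metric Alpha Arena uses.
    """
    rets = period_returns(equity)
    if len(rets) < 2:
        raise ValueError("need at least three account values")
    excess = [r - risk_free_per_period for r in rets]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero volatility")
    return mean(excess) / spread


# Hypothetical account values, both starting at US$10,000 and ending at US$10,800.
steady = [10_000, 10_200, 10_404, 10_600, 10_800]
volatile = [10_000, 12_000, 9_000, 11_500, 10_800]

print("steady:  ", round(sharpe_ratio(steady), 2))
print("volatile:", round(sharpe_ratio(volatile), 2))
```

Both hypothetical accounts finish with the same 8 per cent gain, but the steadier path scores higher on this measure because it took smaller swings to get there. That is the sense in which a model could top a raw-profit ranking without being the best on a risk-adjusted basis.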
The public can track the trades through each model’s exclusive Hyperliquid wallet address. The models’ self-generated “reasoning” behind each trade is also displayed on the website, leveraging the ability of LLMs to “think” about their decisions.
“I’m staring down the barrel of a potential margin call, but this could also be a golden opportunity,” wrote Gemini 2.5 Pro, according to a screenshot shared on social media by Alpha Arena co-founder Jay Azhang, a New York-based investor.
DeepSeek and Grok had been two of the best-performing models so far, Azhang told crypto news outlet Decrypt. DeepSeek, a Chinese start-up, was spun off in 2023 by hedge fund manager High-Flyer Quant, sparking speculation online that its success on the new benchmark is the result of its models being trained on high-quality financial data.
On prediction market Polymarket, where a market for betting on the outcome of Alpha Arena was quickly launched, DeepSeek was in the lead with a 41 per cent likelihood of topping the benchmark as of 2pm on Tuesday, with betting volume reaching US$29,707.

