Alibaba Group Holding Ltd (阿里巴巴) cofounder Jack Ma (馬雲)-backed Ant Group Co (螞蟻集團) used Chinese-made semiconductors to develop techniques for training artificial intelligence (AI) models that would cut costs by 20 percent, people familiar with the matter said.
Ant used domestic chips, including those from Alibaba and Huawei Technologies Co (華為), to train models using the so-called “mixture of experts” machine learning approach, the people said.
It got results similar to those from Nvidia Corp chips, such as the H800, they said.
Hangzhou-based Ant is still using Nvidia for AI development, but is now relying mostly on alternatives, including chips from Advanced Micro Devices Inc and Chinese makers, for its latest models, one of the people said.
The models mark Ant’s entry into a race between Chinese and US companies that has accelerated since DeepSeek (深度求索) demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet Inc’s Google.
It underscores how Chinese companies are trying to use local alternatives to the most advanced Nvidia semiconductors. While not the most advanced, the H800 is a relatively powerful processor and is barred by the US from being exported to China.
The company published a research paper this month that said its models at times outperformed those of Meta Platforms Inc in certain benchmarks, a claim that has not been independently verified.
However, if they work as advertised, Ant’s platforms could mark another step forward for Chinese AI development by slashing the cost of inferencing or supporting AI services.
Ant said it cost about 6.35 million yuan (US$875,952) to train a model on 1 trillion tokens using high-performance hardware, but its optimized approach would cut that down to 5.1 million yuan using lower-specification hardware.
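The two figures can be checked against the roughly 20 percent saving cited earlier. The short Python sketch below is only an arithmetic illustration using the numbers reported in the article; the variable and function names are made up for this example and do not come from Ant's paper.

```python
# Costs reported in the article, in yuan per 1 trillion training tokens.
HIGH_PERFORMANCE_COST_YUAN = 6_350_000
OPTIMIZED_COST_YUAN = 5_100_000


def percent_saving(baseline: float, optimized: float) -> float:
    """Return the cost reduction as a percentage of the baseline cost."""
    return (baseline - optimized) / baseline * 100


saving_yuan = HIGH_PERFORMANCE_COST_YUAN - OPTIMIZED_COST_YUAN
saving_pct = percent_saving(HIGH_PERFORMANCE_COST_YUAN, OPTIMIZED_COST_YUAN)

# Prints: Saving: 1,250,000 yuan (19.7 percent)
print(f"Saving: {saving_yuan:,} yuan ({saving_pct:.1f} percent)")
```

The result, about 19.7 percent, is consistent with the "20 percent" figure attributed to people familiar with the matter.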
Tokens are the fundamental units of text — such as words, characters or parts of words — that a language model breaks down and analyzes to understand context, meaning and structure.
In essence, they are the building blocks that enable the model to interpret human language and produce intelligent output.
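As a rough illustration of what breaking text into tokens means, the Python sketch below splits a sentence into words and punctuation marks. It is a toy example under simplifying assumptions: the function name `toy_tokenize` is hypothetical, and production language models generally use learned subword tokenizers rather than a rule like this one. It does not represent the tokenizer used for Ant's models.

```python
import re


def toy_tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens.

    Illustrative only: real model tokenizers typically split text into
    learned subword units instead of whole words.
    """
    return re.findall(r"\w+|[^\w\s]", text)


tokens = toy_tokenize("Ant trained its models on domestic chips.")

# Prints: ['Ant', 'trained', 'its', 'models', 'on', 'domestic', 'chips', '.']
print(tokens)
# Prints: 8
print(len(tokens))
```

Training cost is quoted per trillion tokens because the amount of computation grows with the number of such units a model processes.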
The company plans to leverage the recent breakthrough in the large language models it has developed, Ling-Plus and Ling-Lite, for industrial AI solutions including healthcare and finance, the people said.
On English-language understanding, Ant said in its paper that the Ling-Lite model outperformed one of Meta’s Llama models in a key benchmark.
Ling-Lite and Ling-Plus models outperformed DeepSeek’s equivalents on Chinese-language benchmarks.
Ant has made the Ling models open-source. Ling-Lite contains 16.8 billion parameters, which are adjustable settings that work like knobs and dials to direct the model’s performance.
Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT’s GPT-4.5 has 1.8 trillion parameters, MIT Technology Review said. DeepSeek-R1 has 671 billion.
The company faced challenges in some areas of the training, including stability.
Even small changes in the hardware or the model’s structure led to problems, including jumps in the models’ error rate, it said in the paper.
Additional reporting by staff writer