Chinese startup DeepSeek’s launch of its latest AI models, which it says are on a par with or better than industry-leading models in the US at a fraction of the cost, is threatening to upset the technology world order.
The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than US$6 million worth of computing power from Nvidia H800 chips.
DeepSeek’s AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple’s App Store in the US.
This has raised doubts about the reasoning behind some US tech companies’ decisions to pledge billions of dollars in AI investment, and shares of several big tech players, including Nvidia, have been hit.
WHY IS DEEPSEEK CAUSING A STIR?
The release of OpenAI’s ChatGPT in late 2022 caused a scramble among Chinese tech firms, which rushed to create their own chatbots powered by artificial intelligence.
But after the release of the first Chinese ChatGPT equivalent, made by Chinese search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between US and Chinese firms.
The quality and cost efficiency of DeepSeek’s models have turned this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and US tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI’s and Meta’s most advanced models, the Chinese startup said.
They are also cheaper to use. DeepSeek-R1, released last month, is 20 to 50 times cheaper to use than OpenAI’s o1 model, depending on the task, according to a post on DeepSeek’s official WeChat account.
But some have publicly expressed scepticism about DeepSeek’s success story.
Scale AI CEO Alexandr Wang (汪滔) said during an interview with CNBC on Jan. 23, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington’s export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation.
Bernstein analysts on Jan. 27 highlighted in a research note that DeepSeek’s total training costs for its V3 model were unknown but were much higher than the US$5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally acclaimed R1 model were not disclosed.
WHO IS BEHIND DEEPSEEK?
DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng (梁文鋒), co-founder of quantitative hedge fund High-Flyer, according to Chinese corporate records.
Liang’s fund announced in March 2023 on its official WeChat account that it was “starting again,” going beyond trading to concentrate resources on creating a “new and independent research group, to explore the essence of AGI [Artificial General Intelligence].” DeepSeek was created later that year.
ChatGPT maker OpenAI defines AGI as autonomous systems that surpass humans in most economically valuable tasks.
It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, Chinese corporate records show.
High-Flyer’s AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips.
HOW DOES BEIJING VIEW DEEPSEEK?
DeepSeek’s success has already been noticed in China’s top political circles. On Jan. 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businesspeople and experts hosted by Chinese Premier Li Qiang (李強), Chinese state news agency Xinhua said.
Liang’s presence at the gathering is potentially a sign that DeepSeek’s success could be important to Beijing’s policy goal of overcoming Washington’s export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li (李彥宏).
(Reuters)
(Translated by Lin Lee-kai, Taipei Times)