Chinese startup DeepSeek’s launch of its latest AI models, which it says are on a par with or better than industry-leading models in the US at a fraction of the cost, is threatening to upset the technology world order.
The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than US$6 million worth of computing power from Nvidia H800 chips.
DeepSeek’s AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple’s App Store in the US.
This has raised doubts about the reasoning behind some US tech companies’ decisions to pledge billions of dollars in AI investment, and shares of several big tech players, including Nvidia, have been hit.
WHY IS DEEPSEEK CAUSING A STIR?
The release of OpenAI’s ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence.
But after the release of the first Chinese ChatGPT equivalent, made by Chinese search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between US and Chinese firms.
The quality and cost efficiency of DeepSeek’s models have turned this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and US tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on a par with OpenAI’s and Meta’s most advanced models, the Chinese startup said.
They are also cheaper to use. DeepSeek-R1, released last month, is 20 to 50 times cheaper to use than OpenAI’s o1 model, depending on the task, according to a post on DeepSeek’s official WeChat account.
But some have publicly expressed scepticism about DeepSeek’s success story.
Scale AI CEO Alexandr Wang (汪滔) said during an interview with CNBC on Jan. 23, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington’s export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation.
Bernstein analysts on Jan. 27 highlighted in a research note that DeepSeek’s total training costs for its V3 model were unknown but were much higher than the US$5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally acclaimed R1 model were not disclosed.
WHO IS BEHIND DEEPSEEK?
DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng (梁文鋒), co-founder of quantitative hedge fund High-Flyer, according to Chinese corporate records.
Liang’s fund announced in March 2023 on its official WeChat account that it was “starting again,” going beyond trading to concentrate resources on creating a “new and independent research group, to explore the essence of AGI [Artificial General Intelligence].” DeepSeek was created later that year.
ChatGPT maker OpenAI defines AGI as autonomous systems that surpass humans in most economically valuable tasks.
It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, Chinese corporate records show.
High-Flyer’s AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips.
HOW DOES BEIJING VIEW DEEPSEEK?
DeepSeek’s success has already been noticed in China’s top political circles. On Jan. 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businesspeople and experts hosted by Chinese Premier Li Qiang (李強), Chinese state news agency Xinhua said.
Liang’s presence at the gathering is potentially a sign that DeepSeek’s success could be important to Beijing’s policy goal of overcoming Washington’s export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li (李彥宏).
(Reuters)
中國新創公司DeepSeek(深度求索)推出了最新的AI(人工智慧)模型,據稱與美國領先業界的模型旗鼓相當,甚至更好,但所需成本只有美國模型的一小部分,這可能顛覆科技世界的秩序。
該公司上個月在一篇論文中指出,DeepSeek-V3的訓練,只需要不到六百萬美元的輝達 H800晶片的運算力,引起了全球AI界的關注。
使用DeepSeek-V3的DeepSeek AI助手,已超越競爭對手ChatGPT,成為美國蘋果App Store上評價最高的免費應用程式。
這引發人們質疑為何一些美國科技公司要在AI領域投入數十億美元,輝達等幾家大型科技公司的股價也因此重挫。
DeepSeek為何引起轟動?
2022年底,OpenAI ChatGPT的發布,讓中國科技公司紛紛跟進,爭相創造自己的AI聊天機器人。
但在中國搜尋引擎巨頭百度發布第一個類似ChatGPT的中文版應用程式後,中國民眾對中美企業在AI能力上的差距普遍感到失望。
DeepSeek模型的品質及成本效益徹底顛覆了此說法。這家中國新創公司表示,DeepSeek-V3和DeepSeek-R1這兩款模型受到矽谷高層及美國科技公司工程師的一致好評,其水準與OpenAI及Meta最先進的模型不相上下。
而且使用DeepSeek也比較便宜。根據DeepSeek微信官方帳號上的一篇文章稱,上月發布的DeepSeek-R1,其使用成本比OpenAI o1模型低20到50倍,視任務而定。
但有些人對DeepSeek的成功故事公開表示懷疑。
Scale AI執行長汪滔1月23日接受CNBC採訪時表示,DeepSeek有五萬個輝達 H100晶片,但他並未提供證據,並聲稱不會揭露這些晶片的下落,因為這會違反華盛頓的出口管制規定,即禁止將此類先進的AI晶片出售給中國公司。對此指控,DeepSeek並未立即回應。
華爾街投資機構伯恩斯坦的分析師1月27日在一份研究報告中強調,DeepSeek的V3模型訓練總成本尚不清楚,但遠高於該新創公司所稱用於算力的558萬美元。分析師也表示,同樣廣受好評的R1模型的訓練成本尚未揭露。
DeepSeek的幕後推手是誰?
DeepSeek是一家位於杭州的新創公司,根據中國公司記錄,其控股股東是量化對沖基金幻方量化的共同創辦人梁文鋒。
2023年3月,梁文鋒的基金在其微信官方帳號上宣布「重新出發」,超越交易,集中資源打造「全新獨立研究團隊,探索AGI(通用人工智慧)的本質」。DeepSeek於同年稍後創立。
開發ChatGPT的OpenAI將AGI定義為:在最具經濟價值的任務中超越人類的自主系統。
目前仍不清楚幻方量化對DeepSeek投資了多少。根據中國公司記錄,幻方量化的辦公室與DeepSeek位於同一棟大樓,並且還擁有訓練AI模型用之晶片群集的相關專利。
幻方量化的AI部門2022年7月在其官方微信上表示,他們所擁有並營運的晶片群集,有一萬個A100晶片。
北京如何看待DeepSeek?
DeepSeek的成功已引起中國高層政界的關注。據新華社報導,1月20日,DeepSeek-R1向公眾發布當天,創始人梁文鋒參加了由中國國務院總理李強主持的一場商人及專家秘密座談會。
梁文鋒出席該會議可能意味,DeepSeek的成功對於北京克服華盛頓的出口管制、實現AI等戰略產業的自給自足的政策目標至關重要。百度執行長李彥宏去年也出席了類似的研討會。
(台北時報林俐凱編譯)