The frenzy surrounding Computex Taipei 2024, which took place early this month, has given everyone a clear sense that the artificial intelligence (AI) era has arrived. Now that the bustling exhibition is over, a question we should consider is: What will be the biggest change in the AI era?
The importance of computing power has been widely recognized, and the development of AI applications is thriving. However, AI’s deeper impact on the world is that it marks the progress of technology from “bit” to “token.” The impact on society as a whole would be a shift from digitalization to tokenization.
In fact, this idea has been indicated in many of Nvidia CEO Jensen Huang’s (黃仁勳) speeches and interviews. He has been emphasizing the importance of floating-point numbers or tokens, saying that in the era of AI, a large number of tokens would be produced and a large amount of AI computational power would turn into AI factories.
This would drive the world to invest trillions of dollars in the innovation of computing frameworks, creating new economic value worth hundreds of trillions of dollars for the world. This is the core of the AI revolution.
In this wave of the AI gold rush, if we liken graphics processing units (GPUs) and AI computing power to “shovels” for digging gold, then the economic value generated by these tokens is the “gold mine” to be dug.
When we see the world’s Internet giants rushing to get their hands on GPUs, what we should really pay attention to is not the “shovels” themselves, but the real target of their huge investments: the new global economic value created by tokens in the AI wave.
In the digital era, the bit is the most basic computing unit. In the AI era, the most basic computing unit would be the token.
If you look up the definition of token on the Internet, the answer would be: In the field of AI, “token” usually refers to the smallest unit in text processing.
“Tokenization” is the process of breaking a continuous sequence of text into tokens. These tokens can be words, parts of words, characters or other small units of text. “Token” seems like a very technical term, so why is it so important? Because it is the smallest unit of computing in AI.
In text-based AI, tokens are like the words in an AI’s dictionary. All language input must first be tokenized, meaning it is matched to the appropriate tokens in this dictionary, so that the AI knows what you want to express.
The result of the AI’s computation would also be output in tokens, which would then be translated back to human language through the process of de-tokenization.
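The round trip described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular AI system works: the eight-entry vocabulary, the `<unk>` placeholder and the greedy longest-match rule are all assumptions made for the example, and real systems use far larger, learned vocabularies.

```python
# Toy "AI dictionary": each token's position in the list is its integer ID.
# The entries are hypothetical and chosen only for this illustration.
VOCAB = ["<unk>", " ", "Taiwan", "Asia", "is", "in", "token", "s"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def tokenize(text):
    """Convert text to token IDs by greedy longest-match against VOCAB."""
    ids = []
    pos = 0
    while pos < len(text):
        match = None
        for tok in VOCAB[1:]:  # skip the <unk> placeholder itself
            if text.startswith(tok, pos) and (match is None or len(tok) > len(match)):
                match = tok
        if match is None:
            # Nothing in the dictionary fits: emit <unk> and skip one character.
            ids.append(TOKEN_TO_ID["<unk>"])
            pos += 1
        else:
            ids.append(TOKEN_TO_ID[match])
            pos += len(match)
    return ids


def detokenize(ids):
    """Convert token IDs back into text."""
    return "".join(VOCAB[i] for i in ids)


ids = tokenize("Taiwan is in Asia")
print(ids)              # [2, 1, 4, 1, 5, 1, 3]
print(detokenize(ids))  # Taiwan is in Asia
```

A character that is missing from the dictionary, such as “!”, comes back as `<unk>`, which is a small-scale version of an AI lacking the tokens it needs to express something.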
The number of tokens contained in the AI dictionary is one factor that determines the range of the AI’s capabilities.
Having the right tokens to express itself can greatly increase the AI’s capabilities.
Without the proper tokens, the AI would be at a loss for words.
The biggest difference between tokens and bits is that tokens are not just numeric expressions, but contain more implicit meanings, so that the meanings contained in these tokens can also be computed.
For example, the tokens of “Taiwan,” “US,” “Asia” and “North America” contain more meanings than the simple numeric zeroes and ones.
Training an AI model means learning the meanings of tokens and the connections between them from a large amount of data.
So, when we ask the AI: “The US is to North America as Taiwan is to what?” the trained AI system would be able to correctly identify the relation between the tokens, and answer “Asia.”
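One common way to picture how meanings “can be computed” is to give each token a vector of numbers and solve the analogy with arithmetic. The sketch below is a hedged illustration only: the four-dimensional vectors are hand-picked for the example rather than learned from data, and the `analogy` helper is a hypothetical name, not part of any library.

```python
import math

# Hand-made token vectors (assumed for illustration; real models learn these).
# Dimensions: [is a continent, North America-ness, Asia-ness, Europe-ness]
EMBEDDINGS = {
    "US":            [0.0, 1.0, 0.0, 0.0],
    "North America": [1.0, 1.0, 0.0, 0.0],
    "Taiwan":        [0.0, 0.0, 1.0, 0.0],
    "Asia":          [1.0, 0.0, 1.0, 0.0],
    "France":        [0.0, 0.0, 0.0, 1.0],
    "Europe":        [1.0, 0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' using the vector b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])]
    # Exclude the three query tokens, then pick the closest remaining token.
    candidates = [w for w in EMBEDDINGS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[w], target))


print(analogy("US", "North America", "Taiwan"))  # Asia
```

A string of zeroes and ones has no notion of “closeness” between “Taiwan” and “Asia”; the vectors do, which is the sense in which tokens carry meaning that can be computed.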
Tokens are not limited to text. Many other types of signals, such as images, video and audio, robot movement, weather information, factory data, environment perception for autonomous driving, DNA and protein structure, as well as physical and chemical signals, can also be converted into tokens to allow AI systems to carry out computation and produce AI results.
Therefore, in the future world, AI computing will deal with huge amounts of tokens.
The large amount of data in human history — from ancient times to the present, including text, video, knowledge and measurement records — would be converted into tokens to train powerful AI models.
All kinds of inquiries and external inputs to the AI system are also converted into tokens to drive the AI system.
The AI-generated tokens are then translated into words, images, sounds, robot movements, weather forecasts, factory simulations, physics and mathematics answers or drug structures that can be understood by the outside world to further influence the world.
In fact, from a historical point of view, this wave of AI-driven tokenization is the latest advancement of civilization.
Human civilization has gone through several important stages in processing signals from the natural world, from “human observation signals,” “physical signals,” “analog signals” and “digital signals” to the latest “AI token signals.”
During the Renaissance, science, mathematics, astronomy and medicine began to flourish.
Natural phenomena in astronomy, physics, chemistry and medicine that could be observed by human senses began to be systematized through science and mathematics.
Observational data were described in objective, scientific formulas of physics and mathematics.
In the first industrial revolution, as scientific knowledge based on Newtonian mechanics matured, the power of machines, such as steam engines, trains and ships, drove the development of civilization.
More importantly, the invention of various types of machines allowed the mass production of precision machines like clocks, watches, gears and textile machines.
Since this period, human beings have been able to control and process “physical signals” such as temperature, pressure, speed and so on, through the power of machinery.
In the second industrial revolution, through Scottish physicist James Clerk Maxwell’s equations of electromagnetism, mankind gained an understanding of the abstract forces of electricity and magnetism.
This led to telephones, radio, electricity and motors. From there, humans were able to utilize electricity and radio waves to process and transmit signals in the form of “analog signals.”
In recent decades, the third industrial revolution, also known as the digital revolution, took place, seeing the emergence of semiconductors, integrated circuits, computers, the Internet, mobile communications, smartphones and many other technologies.
Since this period, human beings have converted signals into “digital signals” expressed as zeroes and ones, thus dramatically increasing the accuracy and complexity of signal processing.
The computation, communication and storage of digital information built today’s technological civilization.
In this wave of AI progress, with the evolution of machine learning, neural network architectures and large language models, the “AI token signals” enable the implicit relations and meaning between information to be learned and reasoned by AI systems, to create more intelligent functions.
AI is still developing, and if we can successfully unleash the huge potential of AI, it would become the fourth industrial revolution.
In the AI gold rush, Taiwan’s ability to provide high-quality semiconductors and computing mainframes is as crucial as the must-have shovels for gold mining.
The world’s current computing mainframes are worth about US$1 trillion, and the demand for AI computing power could even double to US$2 trillion, Huang said.
Yet the higher value of the “gold mine” is hidden in the huge AI applications based on tokens.
He said that in the future, the products and services created by AI tokens would be valued at more than US$100 trillion. This is the core of this AI boom.
Therefore, we are now in a critical period in the evolution of human history and civilization. Taiwan’s position as a key player in the world’s semiconductor and information and communications industry chain has attracted global attention.
We should not stop there. We should seize the trend of AI technology evolution and the world’s shift from digitalization to tokenization to advance overall technological, economic and social progress.
Liang Bor-sung is senior director of MediaTek Inc’s Corporate Strategy and Strategic Technology division, a visiting professor in National Taiwan University’s Department of Computer Science and Information Engineering and Graduate School of Advanced Technology, and a professor-ranked specialist at National Yang Ming Chiao Tung University’s Institute of AI Innovation, Industry Academia Innovation School.
Translated by Lin Lee-kai