The Cross-Industry Integration of AI and Crypto Assets: How Deep Learning is Reshaping the Industry Landscape

AI and Crypto: From Zero to Peak

The recent development of the AI industry is viewed by some as the Fourth Industrial Revolution. The emergence of large models has significantly improved efficiency across various sectors, with Boston Consulting Group estimating that GPT has raised work efficiency in the United States by about 20%. At the same time, the generalization capability of large models is seen as a new software design paradigm: where software design used to mean writing precise code, it now means embedding generalized large-model frameworks into software, allowing applications to perform better and support a wider range of input and output modalities. Deep learning has indeed brought about a fourth boom in the AI industry, and this trend has also spread to the crypto industry.

This report will explore in detail the development history of the AI industry, the classification of its technologies, and the impact of the invention of deep learning on the industry. It will then analyze in depth the upstream and downstream of the deep learning industry chain, including GPUs, cloud computing, data sources, and edge devices, as well as their current state and trends. After that, we will discuss in detail the relationship between the Crypto and AI industries and outline the structure of the Crypto-related AI industry chain.


The Development History of the AI Industry

The AI industry started in the 1950s. In order to achieve the vision of artificial intelligence, academia and industry have developed many schools of thought to realize artificial intelligence under different historical contexts and disciplinary backgrounds.

The main term used in modern artificial intelligence technology is "machine learning". The concept of this technology is to allow machines to iteratively improve system performance on tasks based on data. The main steps involve feeding data into algorithms, training models with this data, testing and deploying the models, and using the models to perform automated prediction tasks.

Currently, there are three main schools of thought in machine learning: connectionism, symbolism, and behaviorism, which imitate the human nervous system, thinking, and behavior respectively.

Currently, connectionism, represented by neural networks (also known as deep learning), is dominant. The main reason is that this architecture has an input layer, an output layer, and multiple hidden layers. Once the number of layers and the number of neurons (parameters) becomes large enough, the network has enough capacity to fit complex, general tasks. By feeding in data, the parameters of the neurons can be continuously adjusted, and after many passes over the data the neurons reach an optimal set of parameters. This is what is meant by "great strength brings miracles," and it is also the origin of the word "deep": enough layers and enough neurons.
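
As a concrete illustration (a minimal sketch, not taken from the source), the following NumPy snippet builds a tiny network with an input layer, one hidden layer, and an output layer; the weight matrices and bias vectors are the "parameters" that training would adjust.

```python
import numpy as np

# A tiny fully connected network: 2 inputs -> 3 hidden neurons -> 1 output.
# The weight matrices and bias vectors are the "parameters" that training adjusts.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)   # input layer -> hidden layer
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # hidden layer -> output layer

def forward(x):
    hidden = np.tanh(x @ W1 + b1)   # hidden layer with a non-linear activation
    return hidden @ W2 + b2         # output layer

x = np.array([2.0, 3.0])
print("output:", forward(x))
print("parameter count:", W1.size + b1.size + W2.size + b2.size)  # 2*3 + 3 + 3*1 + 1 = 13
```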

For example, it can be understood simply as constructing a function. When we observe X=2, Y=3 and X=3, Y=5, if we want one function to handle every X, we need to keep increasing the degree of the function and adding parameters. For instance, a function satisfying these two points is Y = 2X - 1. However, if another data point arrives, say X=2, Y=11, we need to find a function that suits all three points. Brute-forcing the search with a GPU, we might find that Y = X² - 3X + 5 is fairly suitable; it does not need to overlap the data exactly, it only needs to stay balanced and give roughly similar outputs. Here X², X, and X⁰ (the constant term) can be thought of as different neurons, and 1, -3, and 5 are their parameters.

At this point, if we feed a large amount of data into the neural network, we can add neurons and iterate the parameters to fit the new data, and in this way fit all of the data.
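
A minimal sketch of this fitting idea, using the toy data points above: solve for the quadratic coefficients by least squares, and note that the resulting curve only approximates the data rather than passing through every point, since no single function can fit two different Y values at the same X.

```python
import numpy as np

# The three (X, Y) pairs from the example above; note X = 2 appears with two
# different Y values, so no single function can fit all of them exactly.
X = np.array([2.0, 3.0, 2.0])
Y = np.array([3.0, 5.0, 11.0])

# Candidate function: Y = a*X^2 + b*X + c. Solve for (a, b, c) by least squares,
# i.e. minimise the total squared error instead of demanding an exact fit.
A = np.column_stack([X**2, X, np.ones_like(X)])
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)

print("fitted coefficients (a, b, c):", coeffs)
print("predictions:", A @ coeffs)   # about [7, 5, 7] vs. targets [3, 5, 11]
print("targets:    ", Y)
```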

Deep learning based on neural networks has itself gone through multiple iterations and evolutions, from the earliest neural networks to feedforward neural networks, RNNs, CNNs, and GANs, eventually evolving into modern large models such as GPT that use the Transformer. The Transformer is just one evolutionary direction of neural networks: it adds a converter (the "transformer") used to encode data from all modalities (such as audio, video, and images) into corresponding numerical representations. This data is then fed into the neural network, allowing the network to fit any type of data and thus achieve multimodality.


The development of AI has gone through three technological waves. The first occurred in the 1960s, a decade after AI was proposed. This wave was driven by symbolist techniques, which addressed problems of general natural language processing and human-computer dialogue. During the same period, expert systems emerged, notably the DENDRAL expert system completed at Stanford University. This system possessed strong chemistry knowledge and inferred answers like a chemistry expert through questioning; it can be seen as a combination of a chemistry knowledge base and an inference system.

After expert systems, in the 1980s Judea Pearl proposed Bayesian networks, also known as belief networks. In the same period, Brooks introduced behavior-based robotics, marking the birth of behaviorism.

In 1997, IBM's Deep Blue defeated world chess champion Garry Kasparov 3.5:2.5. This victory is regarded as a milestone for artificial intelligence, marking a peak of the second wave of AI development.

The third wave of AI technology occurred in 2006, when the three giants of deep learning, Yann LeCun, Geoffrey Hinton, and Yoshua Bengio, proposed the concept of deep learning, an algorithm that uses artificial neural networks as its architecture to perform representation learning on data. Deep learning algorithms subsequently evolved from RNNs and GANs to Transformers and Stable Diffusion; together these shaped the third technological wave and marked the heyday of connectionism.

Many iconic events have gradually emerged alongside the exploration and evolution of deep learning technology, including:

  • In 2011, IBM's Watson defeated human contestants to win the championship on the quiz show Jeopardy!.

  • In 2014, Goodfellow proposed the GAN (Generative Adversarial Network), which learns through a game between two neural networks and can generate photos that look real. Goodfellow also co-authored the book "Deep Learning", known as the "flower book", one of the important introductory texts in the field of deep learning.

  • In 2015, LeCun, Bengio, and Hinton published the paper "Deep Learning" in the journal Nature, which immediately caused a huge response in both academia and industry.

  • In 2015, OpenAI was founded, with Musk, Y Combinator president Altman, angel investor Peter Thiel, and others announcing a joint investment of $1 billion.

  • In 2016, AlphaGo, based on deep learning technology, faced Go world champion and professional 9-dan player Lee Sedol in a man-machine Go battle, winning with a total score of 4 to 1.

  • In 2017, Hanson Robotics unveiled the humanoid robot Sophia, referred to as the first robot in history to receive citizenship, which possesses a wide range of facial expressions and the ability to understand human language.

  • In 2017, Google published the paper "Attention Is All You Need", proposing the Transformer, and large-scale language models began to emerge.

  • In 2018, OpenAI released GPT (Generative Pre-trained Transformer), built on the Transformer, which was one of the largest language models at the time.

  • In 2018, Google's DeepMind team released AlphaFold, based on deep learning, which is capable of predicting protein structures and is regarded as a major milestone in the field of artificial intelligence.

  • In 2019, OpenAI released GPT-2, which has 1.5 billion parameters.

  • In 2020, OpenAI developed GPT-3, which has 175 billion parameters, over 100 times more than the previous version, GPT-2. The model was trained on about 570GB of text and achieves state-of-the-art performance on multiple NLP (natural language processing) tasks such as question answering, translation, and article writing.

  • The ChatGPT application, initially built on the GPT-3.5 model, launched at the end of November 2022, and within about two months it reached 100 million users, becoming the fastest application in history to reach 100 million users.

  • In 2023, OpenAI released GPT-4, reported to have about 1.76 trillion parameters, roughly ten times that of GPT-3.

  • In 2024, OpenAI launched GPT-4o (GPT-4 omni).

Note: Because there are numerous papers on artificial intelligence, many schools of thought, and differing technological evolutions, this section mainly follows the historical development of deep learning (connectionism); other schools and technologies are still developing rapidly.


Deep Learning Industry Chain

Current large language models are all based on deep learning methods built on neural networks. Led by GPT, large models have created a wave of enthusiasm for artificial intelligence and attracted a large number of players into the field, and market demand for data and computing power has surged. This part of the report therefore focuses on the industry chain of deep learning algorithms: in an AI industry dominated by deep learning, how the upstream and downstream are composed, what the current state and supply-demand relationship of each is, and how they will develop in the future.

First, it is important to clarify that training a large model of the GPT type, built on Transformer technology, involves three steps in total.

Before training, because the model is based on the Transformer, a converter must first turn the text input into numerical values, a process known as "Tokenization"; the resulting values are called Tokens. As a general rule of thumb, an English word or symbol can roughly be regarded as one Token, while each Chinese character can roughly be considered as two Tokens. This is also the basic unit used for GPT's pricing.
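
As a rough, self-contained sketch (real GPT tokenizers use byte-pair encoding over subword pieces, which is more involved than this), the toy code below simply maps whitespace-separated words to integer IDs and counts the resulting tokens, the quantity that usage-based pricing applies to.

```python
# Toy tokenizer: map each distinct whitespace-separated word to an integer ID.
# Real LLM tokenizers (e.g. GPT's BPE tokenizer) split text into subword pieces
# instead, so one English word is often one token and a Chinese character may
# become more than one token; this sketch only illustrates the idea.
def tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    tokens = []
    for word in text.split():
        if word not in vocab:
            vocab[word] = len(vocab)   # assign the next unused ID
        tokens.append(vocab[word])
    return tokens

vocab: dict[str, int] = {}
ids = tokenize("the model reads the text as tokens", vocab)
print("token IDs:", ids)          # [0, 1, 2, 0, 3, 4, 5]
print("token count:", len(ids))   # the unit that API pricing is based on
```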

Step one: pre-training. By giving the input layer enough data pairs, similar to the (X, Y) examples in the first part of the report, we search for the optimal parameters of each neuron under the model. This requires a very large amount of data and is also the most computationally intensive process, since the neurons must iterate repeatedly while trying different parameters. After one batch of data pairs has been trained on, the same batch is generally reused for a second pass to further iterate the parameters.
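
A minimal PyTorch sketch of this idea, using made-up toy numbers rather than real text tokens: feed (X, Y) pairs through a small network, measure the error, and let the optimizer iterate the parameters over many passes.

```python
import torch
from torch import nn

# Toy "pre-training": a small network repeatedly sees (X, Y) pairs and its
# parameters are iterated to reduce the prediction error. Real pre-training
# does the same thing at vastly larger scale, on text tokens instead of
# these made-up numbers.
X = torch.tensor([[2.0], [3.0], [4.0], [5.0]])
Y = torch.tensor([[3.0], [5.0], [9.0], [15.0]])

model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(2000):            # many passes over the same batch of data
    optimizer.zero_grad()
    loss = loss_fn(model(X), Y)
    loss.backward()                 # compute gradients for every parameter
    optimizer.step()                # adjust the parameters slightly

print("final loss:", loss.item())
print("prediction for X=6:", model(torch.tensor([[6.0]])).item())
```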

Step two: fine-tuning. Fine-tuning means training on a smaller batch of higher-quality data, which leads to higher-quality outputs from the model. Pre-training needs a huge amount of data, but much of it may contain errors or be of low quality; the fine-tuning step can raise the model's quality through high-quality data.
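
Continuing the toy sketch above, fine-tuning reuses the pre-trained parameters and runs a much shorter pass over a small, carefully curated dataset, typically with a lower learning rate; the snippet below is an illustration of that pattern, not the procedure of any specific model.

```python
import torch
from torch import nn

# Stand-in for the network pre-trained in the previous sketch; in practice you
# would reuse those learned parameters rather than a fresh initialisation.
model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))

# A small batch of carefully checked, high-quality (X, Y) pairs.
X_ft = torch.tensor([[1.0], [2.5]])
Y_ft = torch.tensor([[1.5], [4.0]])

# Lower learning rate and far fewer steps than pre-training: we only want to
# nudge the already-learned parameters toward higher-quality behaviour.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for step in range(200):
    optimizer.zero_grad()
    loss_fn(model(X_ft), Y_ft).backward()
    optimizer.step()
```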

Step three: reinforcement learning. First, a brand-new model, which we call the "reward model", is built. Its purpose is very simple: to rank output results. It is therefore relatively straightforward to implement, since the business scenario is quite narrow. This model is then used to judge whether the output of our large model is of high quality, so the reward model can be used to automatically iterate the large model's parameters. (However, sometimes human participation is also needed to assess the quality of the model's output.)
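
A heavily simplified sketch of the reward-model idea (in real systems such as RLHF the reward model is trained on human preference rankings, and the large model is then updated with algorithms like PPO): a separate network assigns a score to each candidate output, and the scores are used to rank them.

```python
import torch
from torch import nn

# Reward model: takes some numeric representation of an output and returns a
# single score. In practice it is trained on human preference comparisons.
reward_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

# Two candidate outputs from the large model, already encoded as vectors
# (in reality these would be representations of generated text).
candidate_a = torch.randn(4)
candidate_b = torch.randn(4)

score_a = reward_model(candidate_a).item()
score_b = reward_model(candidate_b).item()
ranking = sorted([("A", score_a), ("B", score_b)], key=lambda x: x[1], reverse=True)
print("ranked candidates:", ranking)
# In full reinforcement learning from human feedback, these scores serve as the
# reward signal used when updating the large model's parameters.
```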

In short, during the training of large models, pre-training demands a very large amount of data and the most GPU computing power; fine-tuning needs higher-quality data to improve the parameters; and reinforcement learning can iterate the parameters repeatedly through a reward model to produce higher-quality outputs.

During training, the more parameters there are, the higher the ceiling of the model's generalization ability. For example, in the earlier function example Y = aX + b, there are really only two neurons, X and the constant term; however the parameters change, the data the function can fit is extremely limited, because it is still a straight line. With more neurons, more parameters can be iterated and more data can be fit. This is why large models achieve remarkable results, and also why they are called large models in the first place: in essence they are a massive number of neurons and parameters plus a massive amount of data, which in turn requires massive computing power.
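
To make "more neurons means more parameters" concrete, the small sketch below counts the parameters of a fully connected network as it gets wider and deeper; the layer sizes are arbitrary examples.

```python
def mlp_param_count(layer_sizes: list[int]) -> int:
    """Parameters of a fully connected network: weights plus biases per layer."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# The linear model Y = aX + b has just two parameters; widening and deepening
# the network makes the count explode, which raises the ceiling on what it can fit.
print(mlp_param_count([1, 1]))                    # 2 (a and b)
print(mlp_param_count([1, 64, 64, 1]))            # 4,353
print(mlp_param_count([1, 1024, 1024, 1024, 1]))  # 2,102,273
```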

Therefore, the performance of a large model is mainly determined by three things: the number of parameters, the amount and quality of the data, and computing power. Together these determine the quality of the results and the generalization ability of the large model. Suppose the number of parameters is p and the amount of data is n (counted in Tokens); then a general empirical rule lets us estimate the required amount of computation, and from that roughly estimate how much computing power we need to purchase and how long training will take.

Computing power is generally measured in FLOPs as the basic unit, where one FLOP represents a single floating-point operation.
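
One widely cited empirical rule approximates training compute as roughly 6 × parameter count × token count FLOPs (the exact constant varies by source). The sketch below plugs in GPT-3-scale numbers as an illustration; the throughput figure at the end is an assumption, not a measured value.

```python
# Rule of thumb (approximate): training FLOPs ≈ 6 * parameter_count * token_count.
# The factor of ~6 covers the forward pass plus the roughly 2x more expensive
# backward pass for each parameter-token interaction.
def training_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

p = 175e9      # GPT-3-scale parameter count (175 billion)
n = 300e9      # roughly the number of training tokens reported for GPT-3

flops = training_flops(p, n)
print(f"estimated training compute: {flops:.2e} FLOPs")   # ~3.15e+23

# Assuming hardware that sustains 1e15 floating-point operations per second
# (1 PFLOPS), the wall-clock time would be on the order of:
seconds = flops / 1e15
print(f"~{seconds / 86400:.0f} days on a single 1 PFLOPS device")
```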
