Contents
1 Introduction
1.1 Main contributions
- A complete transmitter and receiver for a given channel is developed. The key idea is to represent the transmitter, the channel, and the receiver as a single deep neural network that can be trained as an autoencoder. The advantage is that this applies even to channel models and loss functions for which no optimal solution is known.
- This concept is extended to networks of multiple competing transmitter-receiver pairs, where all transmitter and receiver implementations can be jointly optimized for one or more common or individual performance metrics.
- Neural networks for signal transformation tasks (radio transformer networks) can be integrated into the end-to-end training process.
- The experimental results reflect an ongoing trend of deep learning across many fields: the learned features eventually outperform and replace long-used expert features.
1.2 Overview of the paper
- I-A: discusses the potential of DL for the physical layer
- I-B: historical context and related work
- II: background on DL
- III: several example applications of DL to information transmission
- IV: discusses open problems and key areas of future research
- V: conclusions
2 I-A Potential of DL for the physical layer
- 2.1 A DL-based system requires neither a tractable mathematical model nor a specific hardware configuration, and can achieve better performance
- 2.2 A DL-based end-to-end information transmission system does not split the system into independent modules as before; this offers a simple way to optimize end-to-end performance
- 2.3 Learned algorithms can be executed faster and at lower energy cost
- 2.4 Neural networks map well onto massively parallel processing architectures with distributed memory
3 I-B Historical context and related work
- There are two main approaches to applying DL to the physical layer: using DL to improve or augment parts of existing algorithms, or replacing them entirely.
4 II DEEP LEARNING BASICS
4.1 Basic introduction
- The layers of a neural network
- Common activation functions
- Common loss functions
4.2 A. Convolutional layers
4.3 B. Deep learning libraries
- Many tools exist for quickly building neural networks, such as Keras.
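For instance, a small network takes only a few lines in Keras; the sketch below uses arbitrary layer sizes just to show the API:

```python
# Minimal Keras example: a small fully connected classifier.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(16,)),               # 16-dimensional input
    layers.Dense(32, activation="relu"),     # one hidden layer
    layers.Dense(4, activation="softmax"),   # 4-class output
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```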
4.4 C. Network dimensions and training
- "Width" describes the number of output activations per layer, or the average across all layers
5 III EXAMPLES OF MACHINE LEARNING APPLICATIONS FOR THE PHYSICAL LAYER
Main points of this section:
- How an end-to-end communication system can be implemented as an autoencoder and trained with the SGD algorithm
- This concept is then extended to networks of multiple transmitters and receivers
- RTNs are then introduced to improve performance over fading channels
- A CNN is demonstrated on a modulation classification task over raw RF time-series data.
5.1 A. Autoencoders for end-to-end communications systems
- A simple end-to-end transmitter - channel - receiver model
- An autoencoder-based communication system over a Gaussian channel
- The autoencoder:
- Transmitter: uses deep learning to map the message s to the transmitted signal x. In autoencoder terms, this amounts to nonlinearly compressing and reconstructing the input; it covers the whole path from generating the signal to sending it. s is an M-dimensional one-hot vector.
- Receiver: also a feedforward neural network, with a softmax in the last layer for classification. The output is the candidate for s with the highest probability.
- The end-to-end autoencoder can be trained with the SGD algorithm, using an appropriate cross-entropy loss function.
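A minimal sketch of this autoencoder in Keras, following the description above. The exact layer sizes, the power normalization, and the use of `GaussianNoise` as the AWGN channel are our assumptions, not the paper's verified code:

```python
# Sketch of a (7,4) end-to-end autoencoder: one-hot message in,
# softmax over messages out, with an AWGN channel in the middle.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

k, n = 4, 7              # (n, k) = (7, 4): 4 bits over 7 channel uses
M = 2 ** k               # number of messages, one-hot encoded
R = k / n                # rate in bits per channel use
ebno = 10 ** (7 / 10)    # train at Eb/N0 = 7 dB
noise_std = float(np.sqrt(1 / (2 * R * ebno)))

model = models.Sequential([
    layers.Input(shape=(M,)),
    # Transmitter: maps the one-hot message s to n real channel uses x.
    layers.Dense(M, activation="relu"),
    layers.Dense(n, activation=None),
    # Energy constraint: normalize x to average power 1 per channel use.
    layers.Lambda(lambda x: np.sqrt(float(n)) * tf.math.l2_normalize(x, axis=1)),
    # AWGN channel (noise is injected during training).
    layers.GaussianNoise(noise_std),
    # Receiver: feedforward net, softmax over the M candidate messages.
    layers.Dense(M, activation="relu"),
    layers.Dense(M, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```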
- Baseline: a communication system using BPSK modulation and a Hamming code with either binary hard-decision decoding or maximum-likelihood decoding. Figure (a) below shows the block error rate (BLER) of the autoencoder against several baseline communication schemes (a Hamming (7,4) code under a fixed energy constraint; autoencoder (7,4)).
- The results show that the autoencoder has learned the functions of encoder and decoder without any prior knowledge, matching the performance of the Hamming code with maximum-likelihood decoding (MLD).
- Experiments show that SGD converges to a better global solution when using two transmitter layers instead of one. By adding dimensions to the search space, suboptimal solutions are more likely to manifest as saddle points during optimization, which in practice reduces the chance of converging to a suboptimal minimum.
- Training used Adam with a fixed learning rate of 0.001 at Eb/N0 = 7 dB (Eb is the average signal energy per bit, N0 the noise power spectral density). We observed that increasing the batch size while reducing the learning rate during training helps improve accuracy.
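Training then reduces to ordinary supervised learning on random messages, since the label equals the input. A sketch continuing from the model above; the concrete batch-size/learning-rate schedule is illustrative:

```python
# Train the autoencoder sketch above on random one-hot messages.
import numpy as np

s = np.eye(M)[np.random.randint(0, M, size=100_000)]
# Per the observation above: grow the batch size while shrinking the
# learning rate as training progresses (values here are illustrative).
for batch_size, lr in [(256, 1e-3), (1024, 1e-4)]:
    model.optimizer.learning_rate.assign(lr)
    model.fit(s, s, batch_size=batch_size, epochs=10, verbose=0)
```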
- Figure (b) below shows the BLER of the autoencoder against several baseline communication schemes (an (8,8) and a (2,2) communications system).
Experiments show that the autoencoder achieves the same BLER as uncoded BPSK for (2,2), but outperforms it across the whole Eb/N0 range for (8,8). This means the autoencoder has learned some joint coding and modulation scheme, thereby obtaining a coding gain.
- The layout of the autoencoder
- The signal x produced by the transmitter (a quadrature phase shift keying (QPSK) constellation): a simple (2,2) system converges rapidly to the classical QPSK constellation.
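To see what the transmitter has learned, one can encode every message and scatter-plot the two output dimensions. A sketch, assuming the autoencoder above was rebuilt with k = n = 2 (so M = 4) and that the normalization Lambda is the third layer:

```python
# Plot the learned constellation of a (2,2) autoencoder.
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import models

# Transmitter = everything up to and including the normalization layer
# (layer index follows the ordering of the sketch above).
transmitter = models.Model(inputs=model.input,
                           outputs=model.layers[2].output)
x = transmitter.predict(np.eye(M))    # encode all M messages
plt.scatter(x[:, 0], x[:, 1])         # ideally resembles QPSK
plt.axis("equal")
plt.show()
```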
5.2 B. Autoencoders for multiple transmitters and receivers
- A model of two users sharing a Gaussian interference channel
- How to train two coupled autoencoders with conflicting objectives (two methods)
- Method one: minimize a weighted sum of the two losses (see the sketch at the end of this subsection)
- Method two: ???
- BLER for communication over the two-user interference channel, achieved by the autoencoder and by time-sharing with QAM of different orders
Experimental analysis: the NN layout of both autoencoders is given in Table IV above, with n replaced by 2n. An average power constraint is used so the autoencoder can compete with higher-order modulation schemes. As a baseline, uncoded 2^(2k/n)-QAM is chosen, which has the same rate when used together with time-sharing (TS) between the two transmitters. Although the autoencoder and time-sharing achieve the same BLER for (1,1) and (2,2), the former achieves a real gain of about 0.7 dB for (4,4) and about 1 dB for (4,8) at a BLER of 10^-3. The reasons are similar to those explained in Section III-A.
Experiments show that the transmitters have learned to use binary phase shift keying (BPSK)-like constellations in orthogonal directions, achieving the same performance as QPSK time-sharing. For (2,2), however, the learned constellations are no longer orthogonal and can be interpreted as some form of superposition coding. The constellations of both transmitters resemble ellipses with orthogonal axes and different focal lengths. The effect is more pronounced for (4,8) than for (4,4) because the number of constellation points has increased.
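A minimal sketch of training method one from above: both users' cross-entropy losses are combined into one weighted objective and minimized jointly. The model shapes, the fixed α, and the interference-plus-noise channel below are illustrative assumptions, not the paper's exact setup:

```python
# Two coupled (2,2) autoencoders trained on a weighted sum of losses.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

M, n = 4, 2  # per user: 4 messages, 2 real channel uses

def make_tx():
    return models.Sequential([
        layers.Input(shape=(M,)),
        layers.Dense(M, activation="relu"),
        layers.Dense(n),
        layers.Lambda(lambda x: np.sqrt(float(n)) * tf.math.l2_normalize(x, axis=1)),
    ])

def make_rx():
    return models.Sequential([
        layers.Input(shape=(n,)),
        layers.Dense(M, activation="relu"),
        layers.Dense(M, activation="softmax"),
    ])

tx1, tx2, rx1, rx2 = make_tx(), make_tx(), make_rx(), make_rx()
cce = tf.keras.losses.CategoricalCrossentropy()
opt = tf.keras.optimizers.Adam(1e-3)
alpha = 0.5  # fixed weight between the two users' losses

def train_step(s1, s2, noise_std=0.1):
    with tf.GradientTape() as tape:
        # Gaussian interference channel: each receiver observes the sum
        # of both transmitted signals plus noise.
        x = tx1(s1) + tx2(s2)
        y = x + tf.random.normal(tf.shape(x), stddev=noise_std)
        loss = alpha * cce(s1, rx1(y)) + (1 - alpha) * cce(s2, rx2(y))
    variables = sum((m.trainable_variables for m in (tx1, tx2, rx1, rx2)), [])
    opt.apply_gradients(zip(tape.gradient(loss, variables), variables))
    return loss
```

A training loop would then repeatedly draw random one-hot batches s1, s2 and call `train_step`; sweeping α trades off the two users' error rates.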
5.3 C. Radio transformer networks for augmented signal processing algorithms
Using an RTN to augment signal processing in the receiver. An RTN mainly consists of three parts:
- Part one: a learned parameter estimator (computes a parameter vector w from the input y)
- Part two: a parametric transform (applies a deterministic function to y, parameterized by w and chosen to suit the propagation phenomenon)
- Part three: a learned discriminative network (produces the final output from the normalized signal)
Principle: the gain comes from optimizing through a parameter estimate rather than optimizing the parameters directly. An RTN simplifies the target manifold by incorporating domain knowledge, much as convolutional layers do when translation invariance is appropriate. This leads to a simpler search space and improved generalization. The autoencoder and RTN described above can be extended with minor modifications to operate directly on IQ samples rather than symbols, so that pulse shaping, timing, frequency, and phase offset compensation can be handled effectively.
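A sketch of this three-part structure for one concrete case, phase-offset correction. The choice of phase as the estimated parameter and all layer sizes are our assumptions:

```python
# Radio transformer network sketch: estimate a phase offset, derotate
# the received samples, then classify the transmitted message.
import tensorflow as tf
from tensorflow.keras import layers, models

n = 8   # complex channel uses, packed as 2n real values (I parts, then Q)
M = 16  # e.g. 16 candidate messages

inp = layers.Input(shape=(2 * n,))

# Part 1: learned parameter estimator, maps y to a parameter w
# (here a single phase angle).
w = layers.Dense(32, activation="relu")(inp)
w = layers.Dense(1, activation=None)(w)

# Part 2: deterministic parametric transform: rotate the complex
# samples by -w (a function of y parameterized by w).
def derotate(args):
    y, phi = args
    i, q = y[:, :n], y[:, n:]
    c, s = tf.cos(phi), tf.sin(phi)
    return tf.concat([c * i + s * q, -s * i + c * q], axis=1)

y_corr = layers.Lambda(derotate)([inp, w])

# Part 3: learned discriminative network producing the final estimate.
h = layers.Dense(64, activation="relu")(y_corr)
out = layers.Dense(M, activation="softmax")(h)

rtn = models.Model(inp, out)
rtn.compile(optimizer="adam", loss="categorical_crossentropy")
```

Note that gradients reach the estimator only through the transform, which is exactly the "optimize through a parameter estimate" idea above.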
Analysis of experimental results: there are two advantages
- First: comparing the BLER with and without an RTN over a multipath fading channel, adding the RTN reduces the BLER.
- Second: training converges faster with the RTN.
Limitation: the performance gap shrinks when the encoder and decoder networks are enlarged and the number of training iterations is increased.
5.4 D. CNNs for classification tasks
- The structure of the CNN applied to modulation classification is as follows:
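Since no figure survives here, a hedged sketch of such a CNN on raw IQ input; the filter counts, kernel sizes, and dropout rate are assumptions rather than the paper's exact hyperparameters:

```python
# CNN sketch for modulation classification on raw IQ samples:
# 128 complex samples are treated as a 2x128 single-channel "image".
from tensorflow.keras import layers, models

num_classes = 11  # number of modulation schemes to distinguish

inp = layers.Input(shape=(2, 128, 1))
h = layers.Conv2D(64, (1, 3), padding="same", activation="relu")(inp)
h = layers.Dropout(0.5)(h)
h = layers.Conv2D(16, (2, 3), padding="valid", activation="relu")(h)
h = layers.Dropout(0.5)(h)
h = layers.Flatten()(h)
h = layers.Dense(128, activation="relu")(h)
out = layers.Dense(num_classes, activation="softmax")(h)

cnn = models.Model(inp, out)
cnn.compile(optimizer="adam", loss="categorical_crossentropy",
            metrics=["accuracy"])
```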
- The figure below compares the CNN's classification accuracy against extreme gradient boosting with 1000 estimators and against a single scikit-learn decision tree
- The figure below shows the CNN's confusion matrix at SNR = 10 dB, revealing confusion between QAM16 and QAM64, and between wideband FM (WBFM) and double-sideband AM (AM-DSB)
- Analysis of experimental results: the short-time nature of these examples places this task at the difficult end of the modulation classification spectrum, since expert features cannot be computed with high stability over long observation windows. In the low-to-medium SNR range, the CNN classifier outperforms the boosted feature-based classifier by about 4 dB, with similar performance at high SNR. The single tree performs about 6 dB worse than the CNN at medium SNR and about 3.5% worse at high SNR.
6 IV. DISCUSSION AND OPEN RESEARCH CHALLENGES
A. Data sets and challenges
B. Data representation, loss functions, and training SNR
C. Complex-valued neural networks
D. ML-augmented signal processing
E. System identification for end-to-end learning
7 Source download (404)
Unfortunately, after reading the whole paper and eagerly going to look for the code, it turns out the code is gone. Heartbreaking...
https://github.com/radioml/introdlphy/