In this article, "RNN" is used as a general term that also covers LSTM, GRU, and similar variants.

CNNs and RNNs place the batch dimension at different default positions.

  • In a CNN, batch_size is at position 0.
  • In an RNN, batch_size is at position 1.
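
A minimal sketch of the difference, assuming PyTorch is installed. The layer sizes and tensor shapes are arbitrary illustrative values, not taken from any particular model.

```python
import torch
from torch import nn

# CNN: the batch dimension comes first -> (batch, channels, height, width)
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
images = torch.randn(4, 3, 32, 32)      # batch of 4 images
conv_out = conv(images)                 # batch is still dimension 0

# RNN (default batch_first=False): batch is dimension 1 -> (seq_len, batch, input_dim)
rnn = nn.RNN(input_size=5, hidden_size=10)
sequences = torch.randn(6, 4, 5)        # 6 time steps, batch of 4
rnn_out, _ = rnn(sequences)             # batch is dimension 1

print(conv_out.shape)  # torch.Size([4, 8, 30, 30])
print(rnn_out.shape)   # torch.Size([6, 4, 10])
```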

Input data format for an RNN:

The simplest RNN can be called in two ways. torch.nn.RNNCell() accepts only a single time step of the sequence as input; you pass the hidden state in explicitly at each step and feed the returned state into the next one. torch.nn.RNN() accepts a whole sequence as input. By default it starts from an all-zero hidden state, but you can also pass in your own initial hidden state.
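
The relationship between the two calls can be sketched as follows. This is an illustration rather than a recommended pattern: the weights of a single-layer nn.RNN are copied into an nn.RNNCell so that stepping through the sequence by hand reproduces the output of the sequence-level module. The sizes are arbitrary.

```python
import torch
from torch import nn

torch.manual_seed(0)
seq_len, batch_size, input_dim, hidden_dim = 6, 3, 5, 10

rnn = nn.RNN(input_dim, hidden_dim)        # consumes a whole sequence
cell = nn.RNNCell(input_dim, hidden_dim)   # consumes one time step

# Copy the single-layer weights so both modules compute the same function.
with torch.no_grad():
    cell.weight_ih.copy_(rnn.weight_ih_l0)
    cell.weight_hh.copy_(rnn.weight_hh_l0)
    cell.bias_ih.copy_(rnn.bias_ih_l0)
    cell.bias_hh.copy_(rnn.bias_hh_l0)

x = torch.randn(seq_len, batch_size, input_dim)

# Sequence-level call: the hidden state defaults to all zeros.
out_seq, h_last = rnn(x)

# Step-level calls: the hidden state is threaded through by hand.
h = torch.zeros(batch_size, hidden_dim)
steps = []
for t in range(seq_len):
    h = cell(x[t], h)
    steps.append(h)
out_steps = torch.stack(steps)             # (seq_len, batch_size, hidden_dim)

same = torch.allclose(out_seq, out_steps, atol=1e-5)
print(same)
```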

  1. The input is a three-dimensional tensor of shape [seq_len, batch_size, input_dim]
  • input_dim is the dimension of each input vector, for example 128.
  • batch_size is the number of sentences fed to the RNN in one pass, for example 5.
  • seq_len is the maximum sentence length, for example 15.
    Note that the RNN input is a sequence: all sentences in the batch are fed in at once, and the returned output and hidden hold the outputs and hidden states for the whole batch. Both are also three-dimensional.
    Conceptually there are batch_size independent RNN components, each with input dimension input_dim and run for seq_len time steps, so at each time step the input to the RNN module as a whole has shape [batch_size, input_dim].
import torch
from torch import nn

# Build an RNN: input dimension 5, hidden dimension 10, 2 layers
rnn_seq = nn.RNN(5, 10, 2)
# Build an input sequence: sentence length 6, batch size 3, each word represented by a 5-dimensional vector
x = torch.randn(6, 3, 5)
# out, ht = rnn_seq(x, h0)
out, ht = rnn_seq(x)  # h0 is optional; it defaults to all zeros

Question 1: What are the sizes of out and ht here?
Answer: out is 6 * 3 * 10 and ht is 2 * 3 * 10. out has shape [seq_len, batch_size, output_dim], and ht has shape [num_layers * num_directions, batch, hidden_size]. For a unidirectional, single-layer RNN, each sentence has only one hidden state.
Question 2: Are out[-1] and ht[-1] equal?
Answer: Yes, for a unidirectional RNN such as this one. ht[-1] is the final hidden state of the last layer, and out[-1] is the last layer's output at the final time step. Each output is simply the last layer's hidden state at that time step.
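
Both answers can be checked directly. The sketch below reuses the same sizes as the example above; torch.allclose is used for the comparison to stay robust to backend differences.

```python
import torch
from torch import nn

rnn_seq = nn.RNN(5, 10, 2)        # input dim 5, hidden dim 10, 2 layers
x = torch.randn(6, 3, 5)          # seq_len 6, batch 3, input dim 5
out, ht = rnn_seq(x)

print(out.shape)  # torch.Size([6, 3, 10])
print(ht.shape)   # torch.Size([2, 3, 10])

# The output at the last time step is the final hidden state of the last layer.
last_equal = torch.allclose(out[-1], ht[-1])
print(last_equal)
```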

  2. Other parameters of RNN
RNN(input_dim, hidden_dim, num_layers, …)
– input_dim is the feature dimension of the input (the PyTorch argument is named input_size)
– hidden_dim is the feature dimension of the hidden state (the PyTorch argument is named hidden_size); unless changed otherwise, it is also the feature dimension of out
– num_layers is the number of stacked layers in the network
– nonlinearity is the nonlinear activation function, 'tanh' or 'relu'; the default is 'tanh'
– bias indicates whether bias terms are used; the default is True
– batch_first sets the layout of the input data; the default is False, meaning (seq, batch, feature), with the sequence length first and batch second
– dropout is the dropout probability applied to the output of every layer except the last; the default is 0
– bidirectional indicates whether to use a bidirectional RNN; the default is False
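
As an illustration of how batch_first and bidirectional change the shapes, here is a small sketch with arbitrary sizes. Note that batch_first affects the input and out, but not the layout of the returned hidden state, and that in the bidirectional case the last layer's final states are found at different time steps of out for each direction.

```python
import torch
from torch import nn

hidden_dim = 10
rnn = nn.RNN(input_size=5, hidden_size=hidden_dim, num_layers=2,
             batch_first=True, bidirectional=True)

x = torch.randn(3, 6, 5)   # (batch, seq, feature) because batch_first=True
out, ht = rnn(x)

print(out.shape)  # torch.Size([3, 6, 20]) -> (batch, seq, 2 * hidden_dim)
print(ht.shape)   # torch.Size([4, 3, 10]) -> (num_layers * 2, batch, hidden_dim)

# Forward direction of the last layer ends at the last time step.
fwd_match = torch.allclose(out[:, -1, :hidden_dim], ht[-2], atol=1e-6)
# Backward direction of the last layer ends at the first time step.
bwd_match = torch.allclose(out[:, 0, hidden_dim:], ht[-1], atol=1e-6)
print(fwd_match, bwd_match)
```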

An LSTM has one additional output: the memory cell state.

# Input dimension 50, hidden dimension 100, two layers
lstm_seq = nn.LSTM(50, 100, num_layers=2)
# Input sequence: seq = 10, batch = 3, input dimension = 50
lstm_input = torch.randn(10, 3, 50)
out, (h, c) = lstm_seq(lstm_input)  # use the default all-zero hidden state

Question 1: What are the sizes of out and (h, c)?
Answer: out is (10 * 3 * 100); h and c are both (2 * 3 * 100).
Question 2: Are out[-1,:,:] and h[-1,:,:] equal?
Answer: Yes.
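
A short check of these answers, as a sketch with the same sizes as above. It also passes an explicit all-zero (h0, c0) pair to show that this matches the default initial state; the names h0 and c0 are illustrative.

```python
import torch
from torch import nn

lstm_seq = nn.LSTM(50, 100, num_layers=2)
lstm_input = torch.randn(10, 3, 50)        # seq 10, batch 3, input dim 50

# Explicit initial states: (num_layers, batch, hidden_dim) each.
h0 = torch.zeros(2, 3, 100)
c0 = torch.zeros(2, 3, 100)
out, (h, c) = lstm_seq(lstm_input, (h0, c0))

# Omitting the states uses all zeros, so the result should match.
out_default, _ = lstm_seq(lstm_input)

print(out.shape)  # torch.Size([10, 3, 100])
print(h.shape)    # torch.Size([2, 3, 100])
print(c.shape)    # torch.Size([2, 3, 100])

last_equal = torch.allclose(out[-1, :, :], h[-1, :, :])
default_equal = torch.allclose(out, out_default, atol=1e-6)
print(last_equal, default_equal)
```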

A GRU is closer to the traditional RNN: it returns only an output and a hidden state, with no separate memory cell.

gru_seq = nn.GRU(10, 20, 2)  # x_dim, h_dim, layer_num
gru_input = torch.randn(3, 32, 10)  # seq, batch, x_dim
out, h = gru_seq(gru_input)
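
The resulting shapes follow the same rules as for nn.RNN. A small sketch with the sizes from the example above:

```python
import torch
from torch import nn

gru_seq = nn.GRU(10, 20, 2)           # x_dim, h_dim, layer_num
gru_input = torch.randn(3, 32, 10)    # seq, batch, x_dim
out, h = gru_seq(gru_input)

print(out.shape)  # torch.Size([3, 32, 20]) -> (seq, batch, h_dim)
print(h.shape)    # torch.Size([2, 32, 20]) -> (layer_num, batch, h_dim)

# As with the plain RNN, the last output equals the last layer's final hidden state.
last_equal = torch.allclose(out[-1], h[-1])
print(last_equal)
```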

