Hidden state tensors have a different order than the returned tensors

As part of GRU training, I want to retrieve the hidden state tensors.

I have defined a GRU with two layers:

    self.lstm = nn.GRU(params.vid_embedding_dim, params.hidden_dim, 2)

The forward function is defined as follows (only part of the implementation is shown):

    def forward(self, s, order, batch_size, where, anchor_is_phrase=False):
        """Forward prop."""
        # s is of shape [128, 1, 300]; 128 is the batch size
        output, (a, b) = self.lstm(s.cuda())

The output tensor has shape [128, 400] (128 is the number of samples, each embedded as a 400-dimensional vector).

I understand that output is the output of the last hidden state, so I expected it to be equal to b. After checking the values, I found that they do contain the same vectors, but b holds them in a different order: for example, output[0] equals b[49]. Am I missing something here?
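
For reference, here is a minimal CPU sketch of what nn.GRU returns, which may help when comparing the two tensors. It is not my actual model: the sizes (300-dimensional input, 400-dimensional hidden state) are assumed from the shapes quoted above, the input is random, and the default batch_first=False is assumed because the constructor call does not set it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed sizes, taken from the shapes quoted in the post: 300-d input, 400-d hidden, 2 layers.
gru = nn.GRU(input_size=300, hidden_size=400, num_layers=2)

# With the default batch_first=False the input is read as [seq_len, batch, features],
# so a [128, 1, 300] tensor is treated as ONE sequence of 128 steps, not 128 samples.
s = torch.randn(128, 1, 300)

# nn.GRU returns (output, h_n); h_n is a single tensor, not an (h, c) pair as in nn.LSTM.
output, h_n = gru(s)

print(output.shape)  # torch.Size([128, 1, 400]): top-layer hidden state at every step
print(h_n.shape)     # torch.Size([2, 1, 400]): final hidden state of each layer

# Unpacking h_n splits it along the layer dimension: a is layer 0, b is layer 1 (top).
a, b = h_n

# The last time step of output should match the top layer's final hidden state.
last_step_matches = torch.allclose(output[-1], b)
print(last_step_matches)
```

Under these assumptions, the first dimension of output indexes time steps rather than samples, so passing batch_first=True (or permuting the input to [seq_len, batch, features]) changes which rows line up with b.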

