As part of GRU training, I want to retrieve the hidden state tensors.
I have defined a GRU with two layers:
self.lstm = nn.GRU(params.vid_embedding_dim, params.hidden_dim, 2)
The forward function is defined as follows (only the relevant part of the implementation is shown):
def forward(self, s, order, batch_size, where, anchor_is_phrase=False):
    """Forward prop."""
    # s is of shape [128, 1, 300]; 128 is the batch size
    output, (a, b) = self.lstm(s.cuda())
    output.data.contiguous()
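For context, here is a minimal, self-contained sketch (on CPU, with the same dimensions but a freshly initialized model, so the names and values are illustrative only) of what nn.GRU returns for a two-layer model and why unpacking the hidden state into (a, b) works:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for self.lstm: 300-dim input, 400-dim hidden, 2 layers.
gru = nn.GRU(input_size=300, hidden_size=400, num_layers=2)

# With the default layout, dim 0 is the sequence length, dim 1 the batch.
s = torch.randn(128, 1, 300)

# nn.GRU returns (output, h_n); with num_layers=2, h_n has shape
# [2, batch, hidden], so it can be unpacked into per-layer final states.
output, (a, b) = gru(s)

print(output.shape)  # torch.Size([128, 1, 400]) - top layer, every step
print(a.shape)       # torch.Size([1, 400])      - layer 1 final state
print(b.shape)       # torch.Size([1, 400])      - layer 2 final state
```

Note that output holds the top layer's hidden state at every step, while b is that same layer's state at the final step only.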
And output has shape [128, 400] (128 samples, each embedded as a 400-dimensional vector).
I understand that output contains the hidden state of the last layer, so I expect it to be equal to b. After checking the values I saw that they are indeed equal, but b contains them in a different order. Am I missing something here?