Encoder Decoder attention batching

I am trying to implement encoder-decoder attention with batching, but I'm getting some errors.

I have followed the Colab notebook for encoder-decoder attention to implement the attention model with batching.

I guess there is some mismatch between the shapes of outputs and hindi_batch in the criterion function, but I'm not sure. Please suggest a solution. Also, please share an implemented batched solution notebook if one is available.

Error:

```
AttributeError: 'list' object has no attribute 'dim'

     13     print('hindi_batch shape', hindi_batch.shape)
     14     hindi_batch = hindi_batch.numpy()
---> 15     loss = criterion(outputs, hindi_batch)
     16     loss.backward(retain_graph=True)
```

Please help!!

Hi @gyanchandani.dipesh,
It looks like one of your variables is a Python list, but it should be a torch.Tensor so that `.dim()` can be called on it.
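
For example (a minimal, illustrative snippet; the sizes are made up):

```
import torch

outputs_list = [torch.randn(3, 5) for _ in range(4)]  # a plain Python list of tensors
# outputs_list.dim()                  # AttributeError: 'list' object has no attribute 'dim'

outputs_tensor = torch.stack(outputs_list)  # stack into a single tensor of shape (4, 3, 5)
print(outputs_tensor.dim())                 # 3 -- .dim() now works
```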

I am sharing my code. Can you please tell me where I am going wrong? Also, please refer to the error message which I posted before.

```
import torch
import torch.nn as nn

# class encoder-decoder:

MAX_OUTPUT_CHARS = 30

class Transliteration_EncoderDecoder(nn.Module):

    def __init__(self, input_size, hidden_size, output_size, verbose=False):
        super(Transliteration_EncoderDecoder, self).__init__()
        self.hidden_size = hidden_size
        self.output_size = output_size

        self.encoder_rnn_cell = nn.GRU(input_size, hidden_size)
        self.decoder_rnn_cell = nn.GRU(output_size, hidden_size)

        self.h2o = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=2)
        self.verbose = verbose

    def forward(self, input, max_output_chars=MAX_OUTPUT_CHARS, batch_size=3,
                device='cpu', ground_truth=None):
        # encode the full input sequence; the final hidden state initialises the decoder
        out, hidden = self.encoder_rnn_cell(input)
        decoder_state = hidden
        decoder_input = torch.zeros(1, batch_size, self.output_size).to(device)

        outputs = []

        for i in range(max_output_chars):
            out, decoder_state = self.decoder_rnn_cell(decoder_input, decoder_state)
            out = self.h2o(decoder_state)
            out = self.softmax(out)
            outputs.append(out.view(out.shape[1], -1))

            # next decoder input: one-hot of the predicted (or ground-truth) character
            max_idx = torch.argmax(out, 2, keepdim=True)
            if ground_truth is not None:
                max_idx = ground_truth[i].reshape(1, batch_size, 1)
            one_hot = torch.zeros(out.shape, device=device)
            one_hot.scatter_(2, max_idx, 1)
            decoder_input = one_hot.detach()

        return outputs


# train_batch function:

def train_batch(net, opt, criterion, batch_size, device='cpu', teacher_force=False):
    net.train().to(device)
    opt.zero_grad()
    eng, hindi, eng_batch, hindi_batch = train_data.get_batch(batch_size)
    total_loss = 0

    outputs = net(eng_batch, hindi_batch.shape[0], batch_size, device,
                  ground_truth=hindi_batch if teacher_force else None)

    loss = criterion(outputs, hindi_batch)  # getting error in this line
    loss.backward(retain_graph=True)
    opt.step()

    return loss
```

Hi, can you also share a screenshot with the complete traceback of the error?

Please note that the variable `outputs` should be a torch.Tensor, but it's a list.
Can you try doing the following before calling the criterion:
`outputs = torch.stack(outputs).to(device)`
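
For example (a small self-contained sketch; the sizes are only illustrative, matching the shapes your `forward` produces):

```
import torch

max_output_chars, batch_size, output_size = 30, 64, 129  # illustrative sizes

# what forward() currently returns: a list of (batch_size, output_size) tensors
outputs = [torch.randn(batch_size, output_size) for _ in range(max_output_chars)]

# stack into one tensor before calling the criterion
outputs = torch.stack(outputs)        # shape (max_output_chars, batch_size, output_size)
print(outputs.dim(), outputs.shape)   # 3 torch.Size([30, 64, 129])
```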

I guess the output size is not matching the ground truth, and now I am getting this error:

```
ValueError: Expected target size (12, 129), got torch.Size([12, 64])
```
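
If the criterion is `nn.NLLLoss` (which would pair with the `LogSoftmax` in the model), the mismatch is most likely in how the loss interprets the dimensions. A minimal sketch, with the sizes taken from the error message and the variable names assumed from the code above:

```
import torch
import torch.nn as nn

criterion = nn.NLLLoss()
steps, batch, vocab = 12, 64, 129                       # sizes from the error message

outputs = torch.log_softmax(torch.randn(steps, batch, vocab), dim=-1)
hindi_batch = torch.randint(0, vocab, (steps, batch))   # target character indices

# With input (12, 64, 129), NLLLoss reads dim 1 (64) as the class dimension and
# therefore expects a (12, 129) target -- hence "Expected target size (12, 129)".
# Flattening the step and batch dimensions is one way to make the shapes agree:
loss = criterion(outputs.reshape(steps * batch, vocab),
                 hindi_batch.reshape(steps * batch))
print(loss.item())
```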