Categories
Artificial Intelligence (AI) Mastering Development

Why does the BERT NSP head linear layer have two outputs?

Here’s the code in question. https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_bert.py#L491

    class BertOnlyNSPHead(nn.Module):
        def __init__(self, config):
            super().__init__()
            self.seq_relationship = nn.Linear(config.hidden_size, 2)

        def forward(self, pooled_output):
            seq_relationship_score = self.seq_relationship(pooled_output)
            return seq_relationship_score

I think it was just ranking how likely one sentence would follow another? Wouldn’t it be one score?
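One possible reading of the two outputs: they are the logits of a two-class classifier (sentence B follows sentence A, or it does not), trained with cross-entropy over both classes. The sketch below is not taken from the library. It only illustrates, in plain Python with made-up logit values, that a softmax over two logits gives the same probability as a sigmoid over their difference, so the two-output form carries the same information as a single score.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical logits for the two classes; the values are illustrative only.
logits = [2.0, -1.0]

# Probability of the first class from the two-logit head.
p_two_outputs = softmax(logits)[0]

# The same probability from a single score: the difference of the logits.
p_one_output = sigmoid(logits[0] - logits[1])

print(p_two_outputs, p_one_output)
```

Under this reading, the choice between one and two outputs is a matter of convention rather than capacity: two logits let the head reuse a generic multi-class cross-entropy loss.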

Which is more important when comparing loss optimization algorithms: speed or the route?

We have different kinds of algorithms to optimize the loss, such as AdaGrad, SGD + Momentum, etc. Some are more commonly used than others. Some algorithms tend to wander before they converge, find the steepest slope, and reach the minimum, while others are significantly faster. So my question is that […]

Is the distribution of state-action pairs from sample-based planning accurate for small experience sets?

In the David Silver lectures (based on Sutton and Barto), he talks about sample-based planning: using our model to sample a state and then using model-free methods, such as Monte Carlo, to run the trajectory and observe the reward. He goes on to say that this […]

How to use one-hot encoding for multiple columns (multi-class) with varying number of labels in each class?

I am a beginner in TensorFlow as well as in AI. I come from a pharma background and am learning AI from scratch. I have data with 5038 inputs (Float64) and 826 outputs (categorical, with multiple labels in each column). I have used one-hot encoding, but the neural network handles only one output at a time. […]
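As a rough illustration of the encoding step described above, each output column can be given its own label vocabulary, so columns with different numbers of labels produce one-hot vectors of different widths. This is a minimal sketch in plain Python, not TensorFlow code; the column names and label values are hypothetical.

```python
def fit_vocab(column):
    # Map each distinct label in a column to an index (sorted for determinism).
    return {label: i for i, label in enumerate(sorted(set(column)))}

def one_hot(column, vocab):
    # Encode every label in the column as a vector with a single 1.
    rows = []
    for label in column:
        vec = [0] * len(vocab)
        vec[vocab[label]] = 1
        rows.append(vec)
    return rows

# Hypothetical output columns with different numbers of labels.
columns = {
    "col_a": ["low", "high", "low"],  # 2 distinct labels
    "col_b": ["x", "y", "z"],         # 3 distinct labels
}

# Encode each column independently with its own vocabulary.
encoded = {name: one_hot(vals, fit_vocab(vals)) for name, vals in columns.items()}

for name, rows in encoded.items():
    print(name, rows)
```

Keeping the columns separate like this leaves room for a model with one output per column, each sized to that column's vocabulary; whether that suits the data in the question is an assumption.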

What is the effect of using Pooling Layers in CNNs?

I know how pooling works, and what effect it has on the input dimensions – but I’m not sure why it’s done in the first place. It’d be great if someone could provide some intuition behind it – while explaining the following excerpt from a blog: A problem with the output feature maps is that […]
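To make the effect on dimensions concrete, here is a small sketch of 2x2 max pooling in plain Python. It also shows one commonly cited motivation, tolerance to small shifts: two feature maps whose strongest activation sits at slightly different positions pool to the same output. The feature maps are made up for illustration, and the function name is hypothetical.

```python
def max_pool2d(fmap, size=2, stride=2):
    # Slide a size x size window over the map and keep the maximum of each window.
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            row.append(max(fmap[i + di][j + dj]
                           for di in range(size) for dj in range(size)))
        out.append(row)
    return out

# Two 4x4 feature maps: the same activation, shifted by one pixel diagonally.
fmap_a = [[9, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
fmap_b = [[0, 0, 0, 0],
          [0, 9, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]

pooled_a = max_pool2d(fmap_a)
pooled_b = max_pool2d(fmap_b)
print(pooled_a, pooled_b)
```

The 4x4 input becomes 2x2, and the shift inside a pooling window disappears from the output. A shift that crosses a window boundary would still change the result.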

What is the intuition behind the Xavier Initialization for Deep Neural Networks?

The aim of weight initialization is to prevent layer activation outputs from exploding or vanishing during the course of a forward pass through a deep neural network. I am really having trouble understanding weight initialization techniques, and Xavier initialization for deep neural networks in particular. I mean, how does initialization work in deep learning? In simple words […]
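A minimal sketch of the uniform variant of Xavier (Glorot) initialization, in plain Python: weights are drawn from a uniform range whose width depends on the layer's fan-in and fan-out, so that their variance is 2 / (fan_in + fan_out). The layer sizes and the seed below are arbitrary choices for illustration, and the function name is hypothetical.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    # Uniform(-limit, limit) has variance limit**2 / 3 = 2 / (fan_in + fan_out).
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
fan_in, fan_out = 100, 100      # arbitrary layer sizes
weights = xavier_uniform(fan_in, fan_out, rng)

# Compare the empirical variance of the weights with the target variance.
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
empirical_var = sum((w - mean) ** 2 for w in flat) / len(flat)
target_var = 2.0 / (fan_in + fan_out)
print(empirical_var, target_var)
```

The idea behind the scaling is that wider layers sum more terms per activation, so each weight has to be smaller for the activation variance to stay roughly constant from layer to layer.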

What is an autoassociator, and how do you design one for a given pattern?

What is an autoassociator and how does it work? How can we design an autoassociator for a given pattern? [I couldn’t find a clear explanation for this anywhere on the internet.] Example:

German Chatbot or conversational AI

I want to build a German-language chatbot, mostly BERT (Transformer) based, but I cannot find any German chatbot dataset. Does it make sense to use the Google Translate API to translate an English dataset into German and then train the model on it? Any idea where I can find German datasets […]

Does anyone know of a model for comparing the eyes of people in two images to see if they match?

There’s a lot of talk of undercover cops intentionally starting violence in otherwise peaceful protests. The evidence is primarily images like this: https://images.app.goo.gl/4n3o2EXwFzMQfsKq6 It looks pretty convincing, but I’d like something more solid. Does anyone know of a model that can detect with a high level of certainty if the “mask” area of two photos […]

EfficientDet training: use-tpu error when running the command locally with only a single GPU [closed]

https://github.com/google/automl/issues/458

Train efficientdet from scratch with backbone checkpoint.

    backbone_name = {
        'efficientdet-d0': 'efficientnet-b0',
        'efficientdet-d1': 'efficientnet-b1',
        'efficientdet-d2': 'efficientnet-b2',
        'efficientdet-d3': 'efficientnet-b3',
        'efficientdet-d4': 'efficientnet-b4',
        'efficientdet-d5': 'efficientnet-b5',
        'efficientdet-d6': 'efficientnet-b6',
        'efficientdet-d7': 'efficientnet-b6',
    }[MODEL]

    # generating train tfrecord is large, so we skip the execution here.
    import os
    if backbone_name not in os.listdir():
        !wget https://storage.googleapis.com/cloud-tpu-checkpoints/efficientnet/ckptsaug/{backbone_name}.tar.gz
        !tar xf {backbone_name}.tar.gz
    !mkdir model_dir

key option: […]