Applying an artificial neural network to Kaggle’s House Prices dataset gave bad predictions

I am trying to solve Kaggle’s House Prices competition using a neural network. I have already tackled it by ensembling several models (XGBoost, GradientBoosting and Ridge) and got a great score, placing me in the top 25%. I imagined that adding a new model such as an ANN to the ensemble would increase prediction accuracy, so…

Small dataset for object detection, object segmentation and object localization

I am looking for a small dataset on which I can implement object detection, object segmentation and object localization. Can anyone suggest a dataset smaller than 5 GB? Also, is there anything I need to know before implementing these algorithms?

Labeling for multi-label image classification

A friend and I got into an argument over how to label images for multi-label classification. Note: it is important to recognize both groups of a single species and the catfish species. The labels are: ‘I’: an individual fish of any type except catfish; ‘R’: a group of the same species; ‘K’: catfish. First conflict: for an…

Why does ResNet avoid the vanishing gradient problem?

I have seen that if we use Sigmoid or Tanh activation functions in a deep NN we can have problems with vanishing gradients, and this is visible from the shapes of the derivatives of these functions. ReLU solves this problem thanks to its derivative, even if there may be some dead units. ResNet…
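As a toy sketch of the intuition behind this question (my own example, not taken from any paper): for a residual block $y = f(x) + x$, the derivative with respect to $x$ is $f'(x) + 1$, so even when the non-linearity saturates and $f'(x) \approx 0$, the identity shortcut still carries a gradient of about 1:

```python
import numpy as np

# Compare the gradient of a plain sigmoid layer with that of a residual
# block y = sigmoid(w * x) + x. When the sigmoid saturates, its derivative
# vanishes, but the identity shortcut contributes a constant gradient of 1.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def plain_grad(w, x):
    # d/dx sigmoid(w * x) = sigmoid'(w * x) * w
    s = sigmoid(w * x)
    return s * (1 - s) * w

def residual_grad(w, x):
    # d/dx [sigmoid(w * x) + x] = sigmoid'(w * x) * w + 1
    return plain_grad(w, x) + 1.0

w, x = 1.0, 10.0  # deep in the saturated region of the sigmoid
print(plain_grad(w, x))     # nearly 0: the gradient vanishes
print(residual_grad(w, x))  # nearly 1: the shortcut keeps the gradient alive
```

Stacking many such blocks multiplies factors close to 1 rather than factors close to 0, which is one common way to explain why ResNets train well at depth.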

How to tell if two hotel reviews address the same thing

I am playing with a large dataset of hotel reviews, which contains both positive and negative reviews (the reviews are labeled). I want to use this dataset to perform textual style transfer – given a positive review, output a negative review that addresses the same thing. For example, if the positive review mentioned how spacious…

BatchNorm Layer effect during inference

I modified the resnet50 architecture to get a regression network. I just added batchnorm1d and relu layers right before the fully connected layer. During training, the output of the batchnorm1d layer is nearly 3, and this gives good training results. However, during inference the output of the batchnorm1d layer is about 30, so this leads to too…
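A minimal sketch of why this train/inference gap can happen (my own numpy simplification, not the asker’s code): batch norm normalizes with the current batch’s statistics during training, but with running estimates at inference time. If those running estimates have not converged to the true feature statistics, the two modes produce very different outputs:

```python
import numpy as np

# Simplified 1-D batch norm (no learnable affine parameters).

def batchnorm_train(x, running, momentum=0.1, eps=1e-5):
    mean, var = x.mean(0), x.var(0)
    # exponential-moving-average update of the running statistics
    running["mean"] = (1 - momentum) * running["mean"] + momentum * mean
    running["var"] = (1 - momentum) * running["var"] + momentum * var
    return (x - mean) / np.sqrt(var + eps)       # uses BATCH stats

def batchnorm_eval(x, running, eps=1e-5):
    return (x - running["mean"]) / np.sqrt(running["var"] + eps)  # RUNNING stats

rng = np.random.default_rng(0)
running = {"mean": np.zeros(1), "var": np.ones(1)}
x_train = rng.normal(3.0, 1.0, size=(32, 1))     # features centered around 3
_ = batchnorm_train(x_train, running)            # only ONE update step
x_test = rng.normal(3.0, 1.0, size=(32, 1))
out = batchnorm_eval(x_test, running)
print(out.mean())  # far from 0: the running stats have barely moved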

How to handle set-like size agnostic input format

Let’s set up some hypothetical simplified scenario: Each instance $i$ of my imaginary dataset $D=\{i_{1}, \ldots, i_{MAX}\}$ has different number $k_{i}$ of $n$-dimensional vectors as input into my neural network. Each of them will be transformed with $m \times n$ matrix $M$ (so, matrices with same parameters) and acted point-wise with some non-linearity $\sigma_{1}$. Now…

How to pass observation from CartPole-v0 to neural network using tensorflow

I am trying to learn about RL by implementing DQN with tensorflow. However, I am really stuck with tensorflow.. I just don’t understand it. I think I have found the core of what I understand – I dont understand how I should pass placeholders to the network. When I run the code below I get…

Transpose convolution in TiF-GAN: How does “same” padding works?

This question should be quite generic but I faced the problem in the case of the TiF-GAN generator so I am going to use it as an example. (Link to paper) If you check the penultimate page in the paper you can find the architecture design of the generator. The generator has a dense layer…

Did people analyze dynamics of very simple LSTMs?

I wonder if researchers tried to understand how LSTMs work by analyzing dynamics of simple LSTM (e.g. with 2 units)? For example how hidden state evolves depending on the properties of weight matrices. It seems like a very natural thing to try (especially because it is easy to draw hidden states with 2 units on…