### Formula for the learning curve in neural networks?

When training a neural network, you often see a curve showing how fast the network is learning. It usually grows very fast and then slows down to almost horizontal. Is there a mathematical formula that matches these curves? Some similar curves are: $$y=1-e^{-x}$$ $$y=\frac{x}{1+x}$$ $$y=\tanh(x)$$ $$y=1+x-\sqrt{1+x^2}$$ Is there a theoretical reason for this shape?
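A quick sketch (plain Python, illustrative only) of what these four candidates have in common: each starts at 0 with slope close to 1 and saturates toward the asymptote y = 1, which is the "fast rise, then flat" shape in question:

```python
import math

# The four candidate saturating curves from the question.
curves = {
    "1 - exp(-x)":          lambda x: 1 - math.exp(-x),
    "x / (1 + x)":          lambda x: x / (1 + x),
    "tanh(x)":              lambda x: math.tanh(x),
    "1 + x - sqrt(1+x^2)":  lambda x: 1 + x - math.sqrt(1 + x * x),
}

for name, f in curves.items():
    # Finite-difference slope near the origin vs. far out on the curve:
    # steep at the start, nearly flat later.
    early = (f(0.1) - f(0.0)) / 0.1
    late = (f(10.1) - f(10.0)) / 0.1
    print(f"{name:21s} early slope ~ {early:.3f}, late slope ~ {late:.4f}, f(50) ~ {f(50.0):.3f}")
```

Running this shows all four agree near x = 0 and all flatten out toward 1, which is why they look so similar when plotted against a training curve.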

### Recognizing Set CARDs

Set is a card game, nicely described here. Each Set card has 4 properties: the number (1, 2, or 3), the color (Red, Green, or Purple), the fill (Full, Stripes, or None), and the form (Wave, Oval, or Diamond). For example, one card converts to 2 Purple Waves, No fill (code: 2PWN), and others convert to codes 1RON and 3GDN. For every combination there is…
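A minimal sketch of the coding scheme as I read it from the examples (the letter tables and the number/color/form/fill ordering are my inference from 2PWN, 1RON, and 3GDN, not a stated spec):

```python
# Map each property value to the single letter used in the card code.
COLORS = {"Red": "R", "Green": "G", "Purple": "P"}
FILLS = {"Full": "F", "Stripes": "S", "None": "N"}
FORMS = {"Wave": "W", "Oval": "O", "Diamond": "D"}

def encode(number, color, fill, form):
    """Encode one Set card as a 4-character code: number, color, form, fill."""
    return f"{number}{COLORS[color]}{FORMS[form]}{FILLS[fill]}"

print(encode(2, "Purple", "None", "Wave"))  # -> 2PWN
```

With 3 choices for each of the 4 properties there are 3^4 = 81 distinct codes, one per card in the deck.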

### Is there a way to parallelize GloVe cooccur function?

I would like to create a GloVe word embedding on a very large corpus (trillions of words). However, creating the co-occurrence matrix with the GloVe cooccur script is projected to take weeks. Is there any way to parallelize the process of creating a co-occurrence matrix, either using GloVe or another resource that is out there?
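The counting step is embarrassingly parallel in principle. A hedged sketch (plain Python on a toy corpus, not the GloVe C implementation) of one way to do it: attribute every window pair to its left token's index, so disjoint index ranges never double-count and merging shard counts reproduces the serial result exactly. Each `cooccur_range` call could then run in its own process, e.g. via `multiprocessing.Pool`:

```python
from collections import Counter

def cooccur_range(tokens, start, end, window=2):
    """Count symmetric co-occurrence pairs whose LEFT index lies in [start, end).

    Attributing each pair to its left token means disjoint ranges never
    double-count, so merged shard counts equal the serial counts.
    """
    counts = Counter()
    for i in range(start, end):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            counts[tuple(sorted((tokens[i], tokens[j])))] += 1
    return counts

def merge(partials):
    """Sum partial co-occurrence counts from all shards."""
    total = Counter()
    for p in partials:
        total.update(p)
    return total

corpus = "the cat sat on the mat the cat sat".split()
serial = cooccur_range(corpus, 0, len(corpus))
sharded = merge([cooccur_range(corpus, 0, 4),
                 cooccur_range(corpus, 4, len(corpus))])
print(serial == sharded)  # True: sharding does not change the counts
```

At trillion-word scale the real work is I/O and memory (each shard only needs its slice plus a `window`-sized lookahead, and partial counts can be spilled to disk and merged), but the correctness argument is the same.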

### Batch size vs. number of GPUs

Let’s say I have a data set that was split using a fixed random seed, and I am going to: use a batch size of 64 with one GTX 1080 Ti, training on 80% and validating on 20% to get accuracy as a metric; use a batch size of 128 with two GTX 1080 Tis, training on 80% and validating on…

### How can I do hyperparameter optimization for a CNN-LSTM neural network?

I have built a CNN-LSTM neural network with 2 inputs and 2 outputs in Keras. I trained the network with model.fit_generator() rather than model.fit(), to load only parts of the training data when needed, because the training data is too large to load at once. After training, the model was not working. So I…

### Autoencoder produces repeated artifacts after convergence

As an experiment, I tried using an autoencoder to encode height data from the Alps; however, the decoded image is very pixelated after training for several hours, as shown in the image below. This repeating pattern is larger than the final kernel size, so I would think it should be possible to remove these repeating patterns…

### Can BERT convert paragraph to vectors (doc2vec embedding) for classification tasks using another model?

1) I’ve heard the term “BERT embeddings” used a lot. Is this similar to the Word2Vec or Doc2Vec embeddings used in NLP? I need to convert articles of text into one single vector as an input for my model. I have tried summing the word embeddings and scaling them down to get a final vector. Is there…
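For the "summing word embeddings, scaled down" part: that is mean pooling, sketched below with toy 2-dimensional vectors (illustrative only; in practice the token vectors would come from Word2Vec or from BERT's per-token outputs, and the result is one fixed-length document vector regardless of document length):

```python
def mean_pool(token_vectors):
    """Average equal-length token vectors into a single document vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(v[d] for v in token_vectors) / n for d in range(dim)]

# 3 tokens, each with a toy 2-dimensional embedding.
doc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(mean_pool(doc))  # one 2-dimensional vector, averaged per dimension
```

Mean pooling keeps the vector scale independent of document length, which is why it is usually preferred over a raw sum when the pooled vector feeds a downstream classifier.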

### NLP : Extract the reason of the legal compensation

I’m working on a project (court-related). At a certain point, I have to extract the reason for a legal compensation. For instance, let’s take these sentences (from a court report): ‘order mister X to pay EUR 5000 for compensation for unpaid wages’ and ‘to cover damages, mister X must pay EUR 4000 to mister…
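A hedged baseline sketch for such sentences (pure pattern matching, not a real NLP solution; the cue phrases "compensation for" and "to cover" are assumptions read off the two examples, so real court reports would need many more patterns or a trained extractor):

```python
import re

# Assumed cues: the amount follows "pay EUR", and the reason follows
# either "compensation for" or "to cover", running to a comma/period/end.
AMOUNT = re.compile(r"pay EUR (?P<amount>\d+)", re.IGNORECASE)
REASON = re.compile(
    r"(?:compensation for|to cover) (?P<reason>[a-z ]+?)(?:[,.]|$)",
    re.IGNORECASE,
)

def extract(sentence):
    """Return (amount, reason) from a sentence, with None for missing parts."""
    a = AMOUNT.search(sentence)
    r = REASON.search(sentence)
    return (a.group("amount") if a else None,
            r.group("reason").strip() if r else None)

print(extract("order mister X to pay EUR 5000 for compensation for unpaid wages"))
```

Searching for the amount and the reason independently lets the same function handle both word orders in the examples (reason before or after the amount).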

### What would be the definition of a company that has an AI-first strategy?

Although many organizations use AI, not every company is an AI company. How would you define this? And what would be examples of companies that follow this strategy?