Mastering Development

Add zero-valued features to the training data when a new phrase contains unseen words

I'm training a model to detect entities in phrases.
My training set is composed of 500 phrases, which together contain 1000 distinct words. So my

X_train.shape = (500,1000) 

X_train = [[0. 0. 0. 0. ...], [0. 0. ...], ...]. <-- already have this

Each column corresponds to a specific word (the column order is very important).

When I want to predict the entity of a new phrase, it may contain words the model has never seen.
Consider that I receive the input: "My shirt is yellow"

I need to put this input in the form of an np.array with shape (1, 1000). If the word "yellow" doesn't exist in the training vocabulary, I need a shape of (1, 1001) instead and have to retrain the model (with all zeros in that new column of the training data, of course). How can I do this?

Small example:

           "I" "am" "dark" "Vader's" "son". (trained corpus)
X_train = [[1,   1,   0,      0,      0], 
           [1,   1,   1,      0,      0]]

New input: Predict "I am dark Vader’s daughter"

So I need to retrain my model with:

       "I" "am" "dark" "Vader's" "son" "daughter". (trained corpus)
X_train = [[1,   1,   0,      0,      0,   0], 
           [1,   1,   1,      0,      0,   0]]

So I can predict the new input:

X_predict = [[1,1,1,1,0,1]] – I also need to build the input in this form
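
To make the small example concrete, here is a minimal sketch of what I am after, assuming whitespace tokenization and a plain Python list as the vocabulary. The helper names `extend_vocabulary` and `encode` are hypothetical, made up for illustration only, not from any library.

```python
import numpy as np

def extend_vocabulary(vocab, X_train, phrase):
    """Append unseen words of `phrase` to `vocab` and add one
    all-zero column to X_train per new word (hypothetical helper)."""
    new_words = []
    for word in phrase.split():  # assumption: words are separated by spaces
        if word not in vocab and word not in new_words:
            new_words.append(word)
    # New words go at the end so the existing column order is preserved.
    padding = np.zeros((X_train.shape[0], len(new_words)), dtype=X_train.dtype)
    return vocab + new_words, np.hstack([X_train, padding])

def encode(vocab, phrase):
    """Turn a phrase into a (1, len(vocab)) presence vector (hypothetical helper)."""
    index = {word: i for i, word in enumerate(vocab)}
    row = np.zeros((1, len(vocab)))
    for word in phrase.split():
        if word in index:
            row[0, index[word]] = 1
    return row

vocab = ["I", "am", "dark", "Vader's", "son"]
X_train = np.array([[1, 1, 0, 0, 0],
                    [1, 1, 1, 0, 0]], dtype=float)

new_phrase = "I am dark Vader's daughter"
vocab, X_train = extend_vocabulary(vocab, X_train, new_phrase)
X_predict = encode(vocab, new_phrase)

print(X_train.shape)  # (2, 6)
print(X_predict)      # [[1. 1. 1. 1. 0. 1.]]
```

The model would still have to be refit on the padded X_train before X_predict can be passed to it.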
