My first question here so please bare with me.
I’m trying to feed my neural network with training data read in from an excel file. It works perfectly fine when i have less than 50 rows in the sheet. But when i try with the real excel file containing almost 4.000 rows it suddenly takes forever. Although 4.000 is a lot i’m pretty sure my way of doing it is still very inefficient.
as you can see in the code below i’m using the read_excel over and over again in the loop.
I feel like there should be a way to only read the whole column 1 time and then work with it from there.
My goal is to read in 5 rows as the 1st input starting from row 0. then reading 5 rows in again but starting from row 1 and 5 rows again starting from row 3
So it’s like a window of 5 rows that is read and then moving the window by 1.
The output should allways be the 1 row after the window.
**Example:** if row 1-20 contains numbers 1-20 then: input1 = [1,2,3,4,5] and output1 = 6 input2 = [2,3,4,5,6] and output2 = 7 ... input15 = [15,16,17,18,19] and output15 = 20
notice how inputs are lists and outputs are just numbers. So when i append those to the final input & output lists i end up with inputs being a list of lists and out being list of outputs
from pandas import read_excel # initialize final input & output lists. The contents of the temporary input & output lists # are gonna be appended to these final lists training_input =  training_output =  # excel relevant info my_sheet = 'Junaid' file_name = '../Documents/Junaid1.xlsx' # initialize counters loop_count = 0 row_counter = 0 for x in range(25): # load the excel file containing inputs & outputs # using parameters skiprows, nrows (number of rows) and index col df = read_excel(file_name, sheet_name = my_sheet, skiprows=row_counter, nrows=6, index_col=0) # initialize temporary input & output lists input_temp =  output_temp =  for y in df.index: # append the first 5 rows of the 6 to input list if loop_count < 5: input_temp.append(df.index[loop_count]) loop_count += 1 else: # append the 6th data to output list training_output.append(df.index[loop_count]) training_input.append(input_temp) row_counter += 1 loop_count = 0