I would like to filter and split my original dataframe into several dataframes, one per cycle in which progressPercentage goes from 1.0 to 100, as in the following example:
Input:
id_B, ts_B,course,weight,Phase,remainingTime,progressPercentage
id1,2017-04-27 01:35:30,cotton,3.5,A,01:15:00,23.0
id1,2017-04-27 01:37:30,cotton,3.5,B,01:13:00,24.0
id1,2017-04-27 01:38:00,cotton,3.5,B,01:13:00,24.0
id1,2017-04-27 01:38:30,cotton,3.5,C,01:13:00,24.0
id1,2017-04-27 01:39:00,cotton,3.5,C,00:02:00,99.0
id1,2017-04-27 01:39:30,cotton,3.5,C,00:01:00,100.0
id1,2017-04-27 01:40:00,cotton,3.5,Finish,00:01:00,100.0
id1,2017-04-27 02:35:30,cotton,3.5,A,03:15:00,1.0
id1,2017-04-27 02:36:00,cotton,3.5,A,03:14:00,2.0
id1,2017-04-27 02:36:30,cotton,3.5,A,03:14:00,2.0
id1,2017-04-27 02:37:00,cotton,3.5,B,03:13:00,3.0
id1,2017-04-27 02:37:30,cotton,3.5,B,03:13:00,4.0
id1,2017-04-27 02:38:00,cotton,3.5,B,03:13:00,5.0
id1,2017-04-27 02:38:30,cotton,3.5,C,03:13:00,98.0
id1,2017-04-27 02:39:00,cotton,3.5,C,00:02:00,99.0
id1,2017-04-27 02:39:30,cotton,3.5,C,00:01:00,100.0
id1,2017-04-27 02:40:00,cotton,3.5,Finish,00:01:00,100.0
id2,2017-04-27 03:36:00,cotton,3.5,A,03:15:00,1.0
id2,2017-04-27 03:36:30,cotton,3.5,A,03:14:00,1.0
id2,2017-04-27 03:37:00,cotton,3.5,B,03:13:00,2.0
id2,2017-04-27 03:37:30,cotton,3.5,B,03:13:00,2.0
id2,2017-04-27 03:38:00,cotton,3.5,B,03:13:00,3.0
id2,2017-04-27 03:38:30,cotton,3.5,C,03:13:00,98.0
id2,2017-04-27 03:39:00,cotton,3.5,C,00:02:00,99.0
id2,2017-04-27 03:39:30,cotton,3.5,C,00:01:00,100.0
id2,2017-04-27 03:40:00,cotton,3.5,Finish,00:01:00,100.0
id1,2017-05-27 01:35:30,cotton,3.5,A,03:15:00,23.0
id1,2017-05-27 01:37:30,cotton,3.5,B,03:13:00,24.0
id1,2017-05-27 01:38:00,cotton,3.5,B,03:13:00,24.0
id1,2017-05-27 01:38:30,cotton,3.5,C,03:13:00,24.0
id1,2017-05-27 01:39:00,cotton,3.5,C,00:02:00,99.0
id1,2017-05-27 01:39:30,cotton,3.5,C,00:01:00,100.0
id1,2017-05-27 01:40:00,cotton,3.5,Finish,00:01:00,100.0
id1,2017-05-27 02:35:30,cotton,3.5,A,01:15:00,1.0
id1,2017-05-27 02:36:00,cotton,3.5,A,01:14:00,2.0
id1,2017-05-27 02:36:30,cotton,3.5,A,01:13:00,2.0
id1,2017-05-27 02:37:00,cotton,3.5,B,01:12:00,3.0
id1,2017-05-27 02:37:30,cotton,3.5,B,01:11:00,4.0
id1,2017-05-27 02:38:00,cotton,3.5,B,01:10:00,5.0
id1,2017-05-27 02:38:30,cotton,3.5,C,01:09:00,98.0
id1,2017-05-27 02:39:00,cotton,3.5,C,00:08:00,99.0
Outputs:
id_B, ts_B,course,weight,Phase,remainingTime,progressPercentage
id1,2017-04-27 01:35:30,cotton,3.5,A,01:15:00,23.0
id1,2017-04-27 01:37:30,cotton,3.5,B,01:13:00,24.0
id1,2017-04-27 01:38:00,cotton,3.5,B,01:13:00,24.0
id1,2017-04-27 01:38:30,cotton,3.5,C,01:13:00,24.0
id1,2017-04-27 01:39:00,cotton,3.5,C,00:02:00,99.0
id1,2017-04-27 01:39:30,cotton,3.5,C,00:01:00,100.0
id1,2017-04-27 01:40:00,cotton,3.5,Finish,00:01:00,100.0

id_B, ts_B,course,weight,Phase,remainingTime,progressPercentage
id1,2017-04-27 02:35:30,cotton,3.5,A,03:15:00,1.0
id1,2017-04-27 02:36:00,cotton,3.5,A,03:14:00,2.0
id1,2017-04-27 02:36:30,cotton,3.5,A,03:14:00,2.0
id1,2017-04-27 02:37:00,cotton,3.5,B,03:13:00,3.0
id1,2017-04-27 02:37:30,cotton,3.5,B,03:13:00,4.0
id1,2017-04-27 02:38:00,cotton,3.5,B,03:13:00,5.0
id1,2017-04-27 02:38:30,cotton,3.5,C,03:13:00,98.0
id1,2017-04-27 02:39:00,cotton,3.5,C,00:02:00,99.0
id1,2017-04-27 02:39:30,cotton,3.5,C,00:01:00,100.0
id1,2017-04-27 02:40:00,cotton,3.5,Finish,00:01:00,100.0

id_B, ts_B,course,weight,Phase,remainingTime,progressPercentage
id2,2017-04-27 03:36:00,cotton,3.5,A,03:15:00,1.0
id2,2017-04-27 03:36:30,cotton,3.5,A,03:14:00,1.0
id2,2017-04-27 03:37:00,cotton,3.5,B,03:13:00,2.0
id2,2017-04-27 03:37:30,cotton,3.5,B,03:13:00,2.0
id2,2017-04-27 03:38:00,cotton,3.5,B,03:13:00,3.0
id2,2017-04-27 03:38:30,cotton,3.5,C,03:13:00,98.0
id2,2017-04-27 03:39:00,cotton,3.5,C,00:02:00,99.0
id2,2017-04-27 03:39:30,cotton,3.5,C,00:01:00,100.0
id2,2017-04-27 03:40:00,cotton,3.5,Finish,00:01:00,100.0

id_B, ts_B,course,weight,Phase,remainingTime,progressPercentage
id1,2017-05-27 01:35:30,cotton,3.5,A,03:15:00,1.0
id1,2017-05-27 01:37:30,cotton,3.5,B,03:13:00,2.0
id1,2017-05-27 01:38:00,cotton,3.5,B,03:13:00,3.0
id1,2017-05-27 01:38:30,cotton,3.5,C,03:13:00,4.0
id1,2017-05-27 01:39:00,cotton,3.5,C,00:02:00,99.0
id1,2017-05-27 01:39:30,cotton,3.5,C,00:01:00,100.0
id1,2017-05-27 01:40:00,cotton,3.5,Finish,00:01:00,100.0

id_B, ts_B,course,weight,Phase,remainingTime,progressPercentage
id1,2017-05-27 02:35:30,cotton,3.5,A,01:15:00,1.0
id1,2017-05-27 02:36:00,cotton,3.5,A,01:14:00,2.0
id1,2017-05-27 02:36:30,cotton,3.5,A,01:13:00,2.0
id1,2017-05-27 02:37:00,cotton,3.5,B,01:12:00,3.0
id1,2017-05-27 02:37:30,cotton,3.5,B,01:11:00,4.0
id1,2017-05-27 02:38:00,cotton,3.5,B,01:10:00,5.0
id1,2017-05-27 02:38:30,cotton,3.5,C,01:09:00,98.0
id1,2017-05-27 02:39:00,cotton,3.5,C,00:08:00,99.0
id1,2017-05-27 02:39:00,cotton,3.5,C,00:01:00,100.0
I have tried using .shift() together with groupby, as follows:
a = dfb['Operation.progressPercentage'].shift().eq(100)
grouping = dfb.groupby([dfb.wm_id, a])
but it did not give the expected results. How should I change the code to get the outputs shown above?
Many thanks in advance.
Best regards,
Carlo
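
A minimal sketch of one way such a split could be done, offered as an illustration rather than a definitive answer. A boolean flag such as `shift().eq(100)` can only ever produce two groups per id, so the sketch instead turns "a new cycle starts here" into a running counter with `cumsum()`. It assumes the rows are already ordered by id and timestamp, that the percentage never decreases within a cycle, and that the columns are named `id_B` and `progressPercentage` as in the sample (the attempted code uses `wm_id` and `Operation.progressPercentage`, so adjust accordingly). The function name `split_cycles` and the small sample frame are hypothetical.

```python
import pandas as pd


def split_cycles(df, id_col="id_B", pct_col="progressPercentage"):
    """Split df into one dataframe per cycle (hypothetical helper)."""
    # A cycle is assumed to start when the percentage drops compared
    # with the previous row, or when the id changes.
    pct_drops = df[pct_col].diff().lt(0)
    id_changes = df[id_col].ne(df[id_col].shift())
    # Running count of cycle starts: every row gets its cycle number.
    cycle = (pct_drops | id_changes).cumsum()
    return [part for _, part in df.groupby(cycle)]


# Small made-up sample with the same shape as the data in the question.
sample = pd.DataFrame({
    "id_B": ["id1"] * 5 + ["id2"] * 3,
    "progressPercentage": [23.0, 99.0, 100.0, 1.0, 100.0, 1.0, 2.0, 100.0],
})
parts = split_cycles(sample)
for part in parts:
    print(part, end="\n\n")
```

Because a repeated value (for example the Finish row that stays at 100.0) is not a drop, it remains in the same cycle as the row before it. A cycle that starts partway through, such as the one beginning at 23.0, is still returned as its own dataframe; filtering those out would need an extra check on the first value of each part.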