Pandas measure elapsed time since a condition

I have the following dataframe:



               Time   Work
2018-12-01 10:00:00 Off
2018-12-01 10:00:02 On
2018-12-01 10:00:05 On
2018-12-01 10:00:06 On
2018-12-01 10:00:07 On
2018-12-01 10:00:09 Off
2018-12-01 10:00:11 Off
2018-12-01 10:00:14 On
2018-12-01 10:00:16 On
2018-12-01 10:00:18 On
2018-12-01 10:00:20 Off


I would like to create a new column with the elapsed time since the device started working.



               Time   Work    Elapsed Time
2018-12-01 10:00:00 Off 0
2018-12-01 10:00:02 On 2
2018-12-01 10:00:05 On 5
2018-12-01 10:00:06 On 6
2018-12-01 10:00:07 On 7
2018-12-01 10:00:09 Off 0
2018-12-01 10:00:11 Off 0
2018-12-01 10:00:14 On 3
2018-12-01 10:00:16 On 5
2018-12-01 10:00:18 On 7
2018-12-01 10:00:20 Off 0


How can I do it?
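
For reference, a minimal, hypothetical setup snippet that reproduces the sample frame above (the timestamps and states are copied from the tables; nothing else is assumed):

import pandas as pd

df = pd.DataFrame({
    'Time': pd.to_datetime([
        '2018-12-01 10:00:00', '2018-12-01 10:00:02', '2018-12-01 10:00:05',
        '2018-12-01 10:00:06', '2018-12-01 10:00:07', '2018-12-01 10:00:09',
        '2018-12-01 10:00:11', '2018-12-01 10:00:14', '2018-12-01 10:00:16',
        '2018-12-01 10:00:18', '2018-12-01 10:00:20']),
    'Work': ['Off', 'On', 'On', 'On', 'On', 'Off',
             'Off', 'On', 'On', 'On', 'Off'],
})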
python pandas time timedelta

asked Dec 6 at 17:36 by Rafael

    Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
    – zarose
    Dec 6 at 20:21
5 Answers

You can use groupby:



# df['Time'] = pd.to_datetime(df['Time'], errors='coerce')  # Uncomment if needed.
sec = df['Time'].dt.second
df['Elapsed Time'] = (
    sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0
1 2018-12-01 10:00:02 On 2
2 2018-12-01 10:00:05 On 5
3 2018-12-01 10:00:06 On 6
4 2018-12-01 10:00:07 On 7
5 2018-12-01 10:00:09 Off 0
6 2018-12-01 10:00:11 Off 0
7 2018-12-01 10:00:14 On 3
8 2018-12-01 10:00:16 On 5
9 2018-12-01 10:00:18 On 7
10 2018-12-01 10:00:20 Off 0


The idea is to extract the seconds portion and subtract, within each group, the seconds value of the group's first row, i.e., the moment the current run started. This is done using transform and first.



cumsum is used to identify groups:



df.Work.eq('Off').cumsum()

0 1
1 1
2 1
3 1
4 1
5 2
6 3
7 3
8 3
9 3
10 4
Name: Work, dtype: int64




If there's a possibility your device can stay "On" across more than one minute, then initialise sec as:



# Requires numpy (import numpy as np); wrapping in a Series keeps groupby/transform available.
sec = pd.Series(df['Time'].values.astype(np.int64) // 10e8)

df['Elapsed Time'] = (
    sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0





answered Dec 6 at 17:42, edited Dec 6 at 19:50 – coldspeed

  • @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
    – coldspeed
    Dec 6 at 19:19










  • @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
    – coldspeed
    Dec 6 at 19:21










  • The code worked fine for seconds. However, when the first cell of the column Work was 'On', the elapsed time did not begin at zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
    – Rafael
    Dec 6 at 19:32










  • @Rafael can you read my comments just above yours again please?
    – coldspeed
    Dec 6 at 19:32










  • I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try using the date change as a condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
    – Rafael
    Dec 6 at 19:49
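
Following up on the date-change idea from the last comment, one possible sketch (an assumption, not part of the original answer) is to add the calendar date to the grouping key, so the counter also restarts whenever a new day of data begins with 'On':

# Hypothetical sketch: restart the elapsed counter at each 'Off' row and at each date change.
sec = pd.Series(df['Time'].values.astype(np.int64) // 10e8)
key = [df['Time'].dt.date, df.Work.eq('Off').cumsum()]
df['Elapsed Time'] = sec - sec.groupby(key).transform('first')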

IIUC (if I understand correctly), use first with transform:



(df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
Out[1090]:
0 0
1 2
2 5
3 6
4 7
5 0
6 0
7 3
8 5
9 7
10 0
Name: Time, dtype: int64
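
One caveat worth noting (not from the original answer): .dt.seconds returns only the seconds component of each timedelta, so for an 'On' run that spans more than a day, .dt.total_seconds() is the safer choice:

(df.Time - df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.total_seconds()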





answered Dec 6 at 18:05 – W-B

  • If I set the column Time as the index, how should I change the code so it would also work?
    – Rafael
    Dec 11 at 14:53










  • @Rafael df.reset_index(inplace=True)
    – W-B
    Dec 11 at 14:56










  • I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
    – Rafael
    Dec 11 at 15:06










  • @Rafael this is df.reset_index(inplace=True): reset, not set.
    – W-B
    Dec 11 at 15:07
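
Alternatively, if you would rather keep Time as the index than reset it, one sketch (an assumption, not part of the original answer) is to turn the index into a Series first, so that groupby and transform are available again:

# Hypothetical sketch: here 'Time' is the DatetimeIndex of df.
t = df.index.to_series()
df['Elapsed Time'] = (t - t.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds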

You could use two groupbys. The first calculates the time difference within each group. The second then cumulatively sums those differences within each group.



s = (df.Work=='Off').cumsum()
df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()


Output



                  Time Work  Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0





answered Dec 6 at 17:41 – ALollz

  • The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
    – Rafael
    Dec 6 at 19:25










  • @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never turns on, but that can be fixed or handled too.
    – ALollz
    Dec 6 at 19:34
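
Putting the answer together with the fix from the comment above, a sketch for a frame that may start with 'On' (the guard for data containing no 'Off' row at all is an added assumption):

s = (df.Work == 'Off').cumsum()
df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()

# Zero out a leading 'On' run that precedes the first 'Off' row, as suggested in the comment.
if (s == 1).any():
    df.loc[df.index < s[s == 1].idxmax(), 'Elapsed Time'] = 0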

Using a groupby, you can do this:



df['Elapsed Time'] = (df.groupby(df.Work.eq('Off').cumsum()).Time
                        .transform(lambda x: x.diff()
                                              .dt.total_seconds()
                                              .cumsum())
                        .fillna(0))

>>> df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0





answered Dec 6 at 17:42 – sacul

A numpy slicy approach

import numpy as np

# f holds the first row position of each 'Off'-delimited group, and i maps every
# row back to its group, so t[f[i]] is each row's group-start timestamp.
u, f, i = np.unique(df.Work.eq('Off').values.cumsum(), True, True)
t = df.Time.values

df['Elapsed Time'] = t - t[f[i]]
df

Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 00:00:00
1 2018-12-01 10:00:02 On 00:00:02
2 2018-12-01 10:00:05 On 00:00:05
3 2018-12-01 10:00:06 On 00:00:06
4 2018-12-01 10:00:07 On 00:00:07
5 2018-12-01 10:00:09 Off 00:00:00
6 2018-12-01 10:00:11 Off 00:00:00
7 2018-12-01 10:00:14 On 00:00:03
8 2018-12-01 10:00:16 On 00:00:05
9 2018-12-01 10:00:18 On 00:00:07
10 2018-12-01 10:00:20 Off 00:00:00


We can nail down the integer bit with

df['Elapsed Time'] = (t - t[f[i]]).astype('timedelta64[s]').astype(int)
df

Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0
1 2018-12-01 10:00:02 On 2
2 2018-12-01 10:00:05 On 5
3 2018-12-01 10:00:06 On 6
4 2018-12-01 10:00:07 On 7
5 2018-12-01 10:00:09 Off 0
6 2018-12-01 10:00:11 Off 0
7 2018-12-01 10:00:14 On 3
8 2018-12-01 10:00:16 On 5
9 2018-12-01 10:00:18 On 7
10 2018-12-01 10:00:20 Off 0





answered Dec 6 at 17:59 – piRSquared
