Average neighbours inside a vector























My data:



data <- c(1,5,11,15,24,31,32,65)


There are two neighbours: 31 and 32. I want to remove them and keep only their mean (i.e. 31.5), so that data becomes:



data <- c(1,5,11,15,24,31.5,65)


It seems simple, but I want to do it automatically, and sometimes on vectors containing more neighbours. For instance:



data_2 <- c(1,5,11,15,24,31,32,65,99,100,101,140)

































  • Is this only about pairs of consecutive numbers or also about longer runs, e.g. 31, 32, 33, 34?
    – Klaus Gütter
    13 hours ago










  • It could also be longer runs (like 99, 100, 101 in data_2)
    – Loulou
    13 hours ago






  • Maybe use the cumsum(... diff(...)) idiom to create groups, like tapply(data, cumsum(c(1L, diff(data) > 1)), mean)
    – Henrik
    13 hours ago












  • Is your data sorted?
    – Konrad Rudolph
    11 hours ago










  • Yes, always in increasing order
    – Loulou
    10 hours ago















r vector difference neighbours






edited 13 hours ago

























asked 13 hours ago









Loulou





















4 Answers




















accepted










Here is another idea. It creates a group id via cumsum(c(TRUE, diff(a) > 1)), where 1 is the gap threshold (a is defined under DATA below):



# our group variable: a new group starts wherever the gap to the previous value exceeds 1
grp <- cumsum(c(TRUE, diff(a) > 1))

# keep only the values whose group has length 1 (i.e. values with no neighbour)
i1 <- a[ave(a, grp, FUN = length) == 1]

# mean of each group with more than one element (groups of length 1 give NaN)
i2 <- unname(tapply(a, grp, function(i) mean(i[length(i) > 1])))

# concatenate the two, dropping the NaNs from i2, to get the final result
# (the means are appended after the singletons, so the original order is not kept)
c(i1, i2[!is.na(i2)])
#[1] 1.0 5.0 11.0 15.0 24.0 65.0 31.5


You can also wrap it in a function. I left the gap as a parameter so you can adjust it:



get_vec <- function(x, gap) {
  # a new group starts wherever the gap to the previous value exceeds `gap`
  grp <- cumsum(c(TRUE, diff(x) > gap))
  # values with no neighbour
  i1 <- x[ave(x, grp, FUN = length) == 1]
  # means of the groups with more than one element (NaN for the others)
  i2 <- unname(tapply(x, grp, function(i) mean(i[length(i) > 1])))
  return(c(i1, i2[!is.na(i2)]))
}

get_vec(a, 1)
#[1] 1.0 5.0 11.0 15.0 24.0 65.0 31.5

get_vec(a_2, 1)
#[1] 1.0 5.0 11.0 15.0 24.0 65.0 140.0 31.5 100.0


DATA:



a <- c(1,5,11,15,24,31,32,65)
a_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)
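
The outputs above list the averaged values after the singletons rather than in their original position. If the original order matters, one possible variant is a single tapply() over the same group id, as suggested in the comments on the question: the mean of a one-element group is the element itself, so no separate handling of singletons is needed. This is only a sketch; the name avg_runs and the default gap = 1 are assumptions made for illustration, and the input is assumed to be sorted in increasing order.

```r
# Hypothetical helper (not from the answer above): average each run of values
# whose successive gaps are <= gap, keeping the result in the original order.
# Assumes x is numeric and sorted in increasing order.
avg_runs <- function(x, gap = 1) {
  if (length(x) == 0) return(x)            # nothing to group
  grp <- cumsum(c(TRUE, diff(x) > gap))    # new group id after every large gap
  as.vector(tapply(x, grp, mean))          # one mean per group, in group order
}

a   <- c(1, 5, 11, 15, 24, 31, 32, 65)
a_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

avg_runs(a)    # 1, 5, 11, 15, 24, 31.5, 65
avg_runs(a_2)  # 1, 5, 11, 15, 24, 31.5, 65, 100, 140
```

Because the group ids increase along the vector, tapply() returns the means in the same order as the input.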


















    Here is my solution, which uses run-length encoding to identify groups:



    foo <- function(x) {
      y <- x - seq_along(x)           # constant within a run of consecutive integers
      ind <- rle(y)                   # run-length encoding
      ind$values <- ind$lengths != 1  # TRUE for runs longer than 1, i.e. groups of neighbours
      ind$values[ind$values] <- cumsum(ind$values[ind$values])  # group ids 1, 2, ...; singletons stay 0
      ind <- inverse.rle(ind)         # expand back to one group id per element
      xnew <- x
      xnew[ind != 0] <- ave(x, ind, FUN = mean)[ind != 0]  # replace group members by their group mean
      xnew[!(duplicated(ind) & ind != 0)]                  # keep one value per group, plus all singletons
    }

    foo(data)
    #[1] 1.0 5.0 11.0 15.0 24.0 31.5 65.0
    foo(data_2)
    #[1] 1.0 5.0 11.0 15.0 24.0 31.5 65.0 100.0 140.0
    data_3 <- c(1, 2, 4, 1, 2)
    foo(data_3)
    #[1] 1.5 4.0 1.5


    I assume that you don't need an extremely efficient solution. If you do, I'd recommend a simple C++ for loop in Rcpp.
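
To make the x - seq_along(x) step in the answer above more concrete, here is a small worked sketch on the question's first vector. It only illustrates the idea and is not the answer's function; the variable names r, grp and run_means are chosen for this example. Subtracting the position from each value gives the same number for all members of a run that increases by exactly 1, so rle() can find those runs.

```r
x <- c(1, 5, 11, 15, 24, 31, 32, 65)

# Within a run of consecutive integers, value minus position is constant.
y <- x - seq_along(x)
y            # 0 3 8 11 19 25 25 57

# Run-length encoding of y: a length above 1 marks a run of neighbours.
r <- rle(y)
r$lengths    # 1 1 1 1 1 2 1

# One group id per element, then one mean per group.
grp <- rep(seq_along(r$lengths), r$lengths)
run_means <- as.vector(tapply(x, grp, mean))
run_means    # 1, 5, 11, 15, 24, 31.5, 65
```

Unlike the diff(x) > gap approach, this treats only steps of exactly 1 as neighbours, and it does not require the input to be sorted.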



















      I have a data.table-based solution; I guess the same could be translated into dplyr:



      library(data.table)
      df <- data.table(data2 = c(1,5,11,15,24,31,32,65,99,100,101,140))
      # flag a value with 1 if it is exactly 1 larger than the previous value
      df[,neighbours := ifelse(c(0,diff(data2)) == 1,1,0)]
      # put a 1 at the end of each group so that the first member of a run is flagged too
      df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]
      # one id per block of equal flags
      df[,neigh_seq := rleid(neighbours)]

      unique(df[,ifelse(neighbours == 1,mean(data2),data2),by = neigh_seq])

         neigh_seq    V1
      1:         1   1.0
      2:         1   5.0
      3:         1  11.0
      4:         1  15.0
      5:         1  24.0
      6:         2  31.5
      7:         3  65.0
      8:         4 100.0
      9:         5 140.0


      What it does:
      The first line sets neighbours to 1 if the difference from the previous number is 1:



          data2 neighbours
       1:     1          0
       2:     5          0
       3:    11          0
       4:    15          0
       5:    24          0
       6:    31          0
       7:    32          1
       8:    65          0
       9:    99          0
      10:   100          1
      11:   101          1
      12:   140          0


      I want to group so that the neighbours variable is 1 for all neighbours. For that I need to put a 1 at the end of each group:



      df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]
          data2 neighbours
       1:     1          0
       2:     5          0
       3:    11          0
       4:    15          0
       5:    24          0
       6:    31          1
       7:    32          1
       8:    65          0
       9:    99          1
      10:   100          1
      11:   101          1
      12:   140          0


      After that I just group on the changes in the neighbours value, and set the value to the group mean if they are neighbours:



      df[,ifelse(neighbours == 1,mean(data2),data2),by = rleid(neighbours)]
          rleid    V1
       1:     1   1.0
       2:     1   5.0
       3:     1  11.0
       4:     1  15.0
       5:     1  24.0
       6:     2  31.5
       7:     2  31.5
       8:     3  65.0
       9:     4 100.0
      10:     4 100.0
      11:     4 100.0
      12:     5 140.0


      and take the unique values. And voilà.
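
The grouping in the answer above relies on data.table's rleid(), which gives a new id every time the value changes. As a rough base R sketch of that step, the hypothetical helper run_id below builds the same kind of id from rle(), and the flag vector is copied from the second table above. It is meant only to show what the grouping does, not to replace the data.table code.

```r
# Hypothetical base R stand-in for the grouping step: a new id is started
# every time the value of v changes.
run_id <- function(v) {
  r <- rle(v)
  rep(seq_along(r$lengths), r$lengths)
}

data2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)
flag  <- c(0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0)  # neighbours column from the table above

id <- run_id(flag)
id   # 1 1 1 1 1 2 2 3 4 4 4 5

# Per block: the mean if the block is flagged as neighbours, else the values themselves.
out <- unlist(lapply(split(seq_along(data2), id), function(ix) {
  if (flag[ix[1]] == 1) mean(data2[ix]) else data2[ix]
}), use.names = FALSE)
out  # 1, 5, 11, 15, 24, 31.5, 65, 100, 140
```

Returning a single mean per flagged block makes the final unique() step unnecessary in this sketch.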



















        This is a dplyr version, which also uses cumsum(c(1,diff(x)!=1)) as the grouping variable:



        library(dplyr)
        data_2 %>% data.frame(x = .) %>%
        group_by(id = cumsum(c(1,diff(x)!=1))) %>%
        summarise(res = mean(x)) %>%
        select(res)
        # A tibble: 9 x 1
        res
        <dbl>
        1 1.0
        2 5.0
        3 11.0
        4 15.0
        5 24.0
        6 31.5
        7 65.0
        8 100.0
        9 140.0





              edited 11 hours ago

























              answered 12 hours ago









              Sotos

























                      edited 13 hours ago

























                      answered 13 hours ago









                      Roland























                          up vote
                          2
                          down vote













                          I have a data.table based solution, same could be translated into dplyr I guess:



                          library(data.table)
                          df <- data.table(data2 = c(1,5,11,15,24,31,32,65,99,100,101,140))
                          df[,neighbours := ifelse(c(0,diff(data_2)) == 1,1,0)]
                          df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]
                          df[,neigh_seq := rleid(neighbours)]

                          unique(df[,ifelse(neighbours == 1,mean(data2),data2),by = neigh_seq])

                          neigh_seq V1
                          1: 1 1.0
                          2: 1 5.0
                          3: 1 11.0
                          4: 1 15.0
                          5: 1 24.0
                          6: 2 31.5
                          7: 3 65.0
                          8: 4 100.0
                          9: 5 140.0


                          What it does :
                          first line set neigbours to 1 if the difference with following number is 1



                           1:     1          0
                          2: 5 0
                          3: 11 0
                          4: 15 0
                          5: 24 0
                          6: 31 0
                          7: 32 1
                          8: 65 0
                          9: 99 0
                          10: 100 1
                          11: 101 1
                          12: 140 0


                          I wanr to group so that neighbour variable is 1 for all neigbours. I need to add 1 to each end of each groups:



                          df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]
                          data2 neighbours
                          1: 1 0
                          2: 5 0
                          3: 11 0
                          4: 15 0
                          5: 24 0
                          6: 31 1
                          7: 32 1
                          8: 65 0
                          9: 99 1
                          10: 100 1
                          11: 101 1
                          12: 140 0


                          Then after I just do a grouping on changing neighbour value, and set the value to mean if they are neihbours



                          df[,ifelse(neighbours == 1,mean(data2),data2),by = rleid(neighbours)]
                          rleid V1
                          1: 1 1.0
                          2: 1 5.0
                          3: 1 11.0
                          4: 1 15.0
                          5: 1 24.0
                          6: 2 31.5
                          7: 2 31.5
                          8: 3 65.0
                          9: 4 100.0
                          10: 4 100.0
                          11: 4 100.0
                          12: 5 140.0


                          and take the unique values. And voila.






                          share|improve this answer



























                            up vote
                            2
                            down vote













                            I have a data.table based solution, same could be translated into dplyr I guess:



                            library(data.table)
                            df <- data.table(data2 = c(1,5,11,15,24,31,32,65,99,100,101,140))
                            df[,neighbours := ifelse(c(0,diff(data_2)) == 1,1,0)]
                            df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]
                            df[,neigh_seq := rleid(neighbours)]

                            unique(df[,ifelse(neighbours == 1,mean(data2),data2),by = neigh_seq])

                            neigh_seq V1
                            1: 1 1.0
                            2: 1 5.0
                            3: 1 11.0
                            4: 1 15.0
                            5: 1 24.0
                            6: 2 31.5
                            7: 3 65.0
                            8: 4 100.0
                            9: 5 140.0


                            What it does :
                            first line set neigbours to 1 if the difference with following number is 1



                             1:     1          0
                            2: 5 0
                            3: 11 0
                            4: 15 0
                            5: 24 0
                            6: 31 0
                            7: 32 1
                            8: 65 0
                            9: 99 0
                            10: 100 1
                            11: 101 1
                            12: 140 0


                            I wanr to group so that neighbour variable is 1 for all neigbours. I need to add 1 to each end of each groups:



                            df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]
                            data2 neighbours
                            1: 1 0
                            2: 5 0
                            3: 11 0
                            4: 15 0
                            5: 24 0
                            6: 31 1
                            7: 32 1
                            8: 65 0
                            9: 99 1
                            10: 100 1
                            11: 101 1
                            12: 140 0


                            Then after I just do a grouping on changing neighbour value, and set the value to mean if they are neihbours



                            df[,ifelse(neighbours == 1,mean(data2),data2),by = rleid(neighbours)]
                            rleid V1
                            1: 1 1.0
                            2: 1 5.0
                            3: 1 11.0
                            4: 1 15.0
                            5: 1 24.0
                            6: 2 31.5
                            7: 2 31.5
                            8: 3 65.0
                            9: 4 100.0
                            10: 4 100.0
                            11: 4 100.0
                            12: 5 140.0


                            and take the unique values. And voila.






                            share|improve this answer

























                              up vote
                              2
                              down vote










                              up vote
                              2
                              down vote









                              I have a data.table based solution; the same logic could probably be translated to dplyr:



                              library(data.table)
                              df <- data.table(data2 = c(1,5,11,15,24,31,32,65,99,100,101,140))
                              # 1 if the value is exactly 1 more than the previous value
                              df[,neighbours := ifelse(c(0,diff(data2)) == 1,1,0)]
                              # also flag the value just before each flagged value (the start of a run)
                              df[,neighbours := pmax(neighbours, shift(neighbours, type = "lead", fill = 0))]
                              df[,neigh_seq := rleid(neighbours)]

                              unique(df[,ifelse(neighbours == 1,mean(data2),data2),by = neigh_seq])

                                 neigh_seq    V1
                              1:         1   1.0
                              2:         1   5.0
                              3:         1  11.0
                              4:         1  15.0
                              5:         1  24.0
                              6:         2  31.5
                              7:         3  65.0
                              8:         4 100.0
                              9:         5 140.0


                              What it does:
                              the first := line sets neighbours to 1 if the difference from the previous number is 1:



                                  data2 neighbours
                               1:     1          0
                               2:     5          0
                               3:    11          0
                               4:    15          0
                               5:    24          0
                               6:    31          0
                               7:    32          1
                               8:    65          0
                               9:    99          0
                              10:   100          1
                              11:   101          1
                              12:   140          0


                              I want the neighbours variable to be 1 for every member of a run, so the value just before each flagged value (the start of the run) has to be flagged as well:



                              df[,neighbours := pmax(neighbours, shift(neighbours, type = "lead", fill = 0))]
                                  data2 neighbours
                               1:     1          0
                               2:     5          0
                               3:    11          0
                               4:    15          0
                               5:    24          0
                               6:    31          1
                               7:    32          1
                               8:    65          0
                               9:    99          1
                              10:   100          1
                              11:   101          1
                              12:   140          0


                              Then I group on runs of the neighbours flag and replace the value by the group mean where the rows are neighbours:



                              df[,ifelse(neighbours == 1,mean(data2),data2),by = rleid(neighbours)]
                                  rleid    V1
                               1:     1   1.0
                               2:     1   5.0
                               3:     1  11.0
                               4:     1  15.0
                               5:     1  24.0
                               6:     2  31.5
                               7:     2  31.5
                               8:     3  65.0
                               9:     4 100.0
                              10:     4 100.0
                              11:     4 100.0
                              12:     5 140.0


                              Finally, take the unique values, and voilà.
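
                              As a possible shortcut, the flagging steps can be folded into a single run id, following the cumsum(...diff(...)) idiom suggested in the comments on the question. This is only a sketch under the assumption that the input is sorted and that "neighbour" means a step of exactly 1; the column names run and value are made up for illustration. A 0/1 flag cannot tell apart two runs that follow each other directly (for example 31, 32, 40, 41), whereas a run id can.

```r
library(data.table)

df <- data.table(data2 = c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140))

# A new run starts at the first row and wherever the step from the
# previous value is not exactly 1; cumsum() turns those starts into run ids.
res <- df[, .(value = mean(data2)), by = .(run = cumsum(c(TRUE, diff(data2) != 1)))]
res$value
# 1.0 5.0 11.0 15.0 24.0 31.5 65.0 100.0 140.0

# Two runs with no isolated value between them stay separate.
adjacent <- data.table(data2 = c(31, 32, 40, 41))
adjacent_res <- adjacent[, .(value = mean(data2)),
                         by = .(run = cumsum(c(TRUE, diff(data2) != 1)))]
adjacent_res$value
# 31.5 40.5
```

                              Because every row gets a run id, isolated values simply form runs of length one and no unique() step is needed.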














                              edited 12 hours ago

























                              answered 13 hours ago









                              denis




































                                  This is a dplyr version that also uses cumsum(c(1,diff(x)!=1)) as the grouping variable:



                                  library(dplyr)
                                  data_2 %>% data.frame(x = .) %>%
                                  group_by(id = cumsum(c(1,diff(x)!=1))) %>%
                                  summarise(res = mean(x)) %>%
                                  select(res)
                                  # A tibble: 9 x 1
                                  res
                                  <dbl>
                                  1 1.0
                                  2 5.0
                                  3 11.0
                                  4 15.0
                                  5 24.0
                                  6 31.5
                                  7 65.0
                                  8 100.0
                                  9 140.0
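
                                  For readers without dplyr, the same grouping can be sketched in base R. The helper name average_runs and its step parameter are hypothetical, introduced here only for illustration; the sketch assumes a sorted numeric vector.

```r
# Hypothetical helper: collapse each run of consecutive values to its mean.
# `step` is an assumed parameter for the gap that counts as "neighbouring".
average_runs <- function(x, step = 1) {
  if (length(x) == 0) return(numeric(0))
  # TRUE marks the start of a run; cumsum() numbers the runs 1, 2, 3, ...
  run_id <- cumsum(c(TRUE, diff(x) != step))
  # tapply() returns a named 1-d array; as.vector() drops names and dim
  as.vector(tapply(x, run_id, mean))
}

data_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)
average_runs(data_2)
# 1.0 5.0 11.0 15.0 24.0 31.5 65.0 100.0 140.0
```

                                  The result is a plain numeric vector, which matches the shape of the output asked for in the question.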















                                      answered 6 hours ago









                                      Lamia






























