Average neighbours inside a vector

My data:

data <- c(1,5,11,15,24,31,32,65)

There are two neighbours, 31 and 32. I wish to remove them and keep only their mean (i.e. 31.5), so that data would become:

data <- c(1,5,11,15,24,31.5,65)

It seems simple, but I wish to do it automatically, and sometimes with vectors containing more neighbours. For instance:

data_2 <- c(1,5,11,15,24,31,32,65,99,100,101,140)

edited 13 hours ago

asked 13 hours ago

Loulou

1186

Is this only about pairs of consecutive numbers or also about longer runs, e.g. 31, 32, 33, 34?
– Klaus Gütter
13 hours ago

It could also be longer runs (like 99, 100, 101 in data_2)
– Loulou
13 hours ago

Maybe use the cumsum(...diff(... idiom to create groups, like tapply(data, cumsum(c(1L, diff(data) > 1)), mean)
– Henrik
13 hours ago

Is your data sorted?
– Konrad Rudolph
11 hours ago

Yes, always in increasing order
– Loulou
10 hours ago

r vector difference neighbours

4 Answers

accepted

Here is another idea: create a group id via cumsum(c(TRUE, diff(a) > 1)), where 1 is the gap threshold above which a new group starts, i.e.

# our group variable
grp <- cumsum(c(TRUE, diff(a) > 1))

# keep only groups of length 1 (i.e. values with no neighbour)
i1 <- a[!ave(a, grp, FUN = function(i) length(i) > 1)]

# mean of each group with more than one element (NaN for single-element groups)
i2 <- unname(tapply(a, grp, function(i) mean(i[length(i) > 1])))

# concatenate the two, dropping the NaNs from i2; the averaged values
# come after the singletons, so the result is not in sorted order
c(i1, i2[!is.na(i2)])
#[1]  1.0  5.0 11.0 15.0 24.0 65.0 31.5
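To make the grouping step above easier to follow, here is a small worked illustration of the intermediate values for the question's first vector. The names steps and starts are introduced only for this illustration and are not part of the answer.

```r
a <- c(1, 5, 11, 15, 24, 31, 32, 65)

# Difference between each element and the one before it
steps <- diff(a)
steps
# 4 6 4 9 7 1 33

# TRUE marks the first element of every new group:
# the very first element, and any element whose gap is larger than 1
starts <- c(TRUE, steps > 1)

# The running count of group starts gives one id per element;
# 31 and 32 share id 6 because the step between them is exactly 1
grp <- cumsum(starts)
grp
# 1 2 3 4 5 6 6 7
```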

You can also wrap it in a function. I left the gap as a parameter so you can adjust it:

get_vec <- function(x, gap) {
    grp <- cumsum(c(TRUE, diff(x) > gap))
    i1 <- x[!ave(x, grp, FUN = function(i) length(i) > 1)]
    i2 <- unname(tapply(x, grp, function(i) mean(i[length(i) > 1])))
    return(c(i1, i2[!is.na(i2)]))
}

get_vec(a, 1)
#[1]  1.0  5.0 11.0 15.0 24.0 65.0 31.5

get_vec(a_2, 1)
#[1]   1.0   5.0  11.0  15.0  24.0  65.0 140.0  31.5 100.0

DATA:

a <- c(1,5,11,15,24,31,32,65)

a_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)
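If the result should stay in the original order, as in the question's expected output, one possible variant is to average every group directly, since the mean of a single value is the value itself. This is only a sketch along the lines of the tapply() idea mentioned in the comments on the question; the name collapse_runs is hypothetical, and it assumes a non-empty, sorted numeric vector.

```r
# Hypothetical helper (not from the answer): average each run of neighbours
# while keeping the original order. `gap` has the same meaning as in get_vec().
collapse_runs <- function(x, gap = 1) {
  # new group whenever the step to the previous element exceeds `gap`
  grp <- cumsum(c(TRUE, diff(x) > gap))
  # one mean per group, in group order; as.vector() drops names and dims
  as.vector(tapply(x, grp, mean))
}

a   <- c(1, 5, 11, 15, 24, 31, 32, 65)
a_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

collapse_runs(a)
# 1.0 5.0 11.0 15.0 24.0 31.5 65.0

collapse_runs(a_2)
# 1.0 5.0 11.0 15.0 24.0 31.5 65.0 100.0 140.0
```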

edited 11 hours ago

answered 12 hours ago

Sotos

27.2k51640


Here is my solution, which uses run-length encoding to identify groups:

foo <- function(x) {
  y <- x - seq_along(x) #normalize to zero differences in groups
  ind <- rle(y) #run-length encoding
  ind$values <- ind$lengths != 1 #to find groups
  ind$values[ind$values] <- cumsum(ind$values[ind$values]) #group ids
  ind <- inverse.rle(ind)
  xnew <- x
  xnew[ind != 0] <- ave(x, ind, FUN = mean)[ind != 0] #calculate means
  xnew[!(duplicated(ind) & ind != 0)] #remove duplicates from groups
}

foo(data)
#[1]  1.0  5.0 11.0 15.0 24.0 31.5 65.0

foo(data_2)
#[1]   1.0   5.0  11.0  15.0  24.0  31.5  65.0 100.0 140.0

data_3 <- c(1, 2, 4, 1, 2)
foo(data_3)
#[1] 1.5 4.0 1.5

I assume that you don't need an extremely efficient solution. If you do, I'd recommend a simple C++ for loop in Rcpp.
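As a small illustration of the first two steps of foo(), the sketch below shows why subtracting the position works: within a run of consecutive integers, value and position both grow by 1, so the difference stays constant and rle() sees it as a run. The variable name runs is introduced only for this illustration.

```r
x <- c(1, 5, 11, 15, 24, 31, 32, 65)

# Subtract each element's position; consecutive values collapse to the same number
y <- x - seq_along(x)
y
# 0 3 8 11 19 25 25 57

# Run-length encoding of y: the single run of length 2 is the pair 31, 32
runs <- rle(y)
runs$lengths
# 1 1 1 1 1 2 1
```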

edited 13 hours ago

answered 13 hours ago

Roland

98.5k6106177


Here is a data.table-based solution; the same could probably be translated into dplyr:

library(data.table)
df <- data.table(data2 = c(1,5,11,15,24,31,32,65,99,100,101,140))
df[,neighbours := ifelse(c(0,diff(data2)) == 1,1,0)]
df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]
df[,neigh_seq := rleid(neighbours)]

unique(df[,ifelse(neighbours == 1,mean(data2),data2),by = neigh_seq])

   neigh_seq    V1
1:         1   1.0
2:         1   5.0
3:         1  11.0
4:         1  15.0
5:         1  24.0
6:         2  31.5
7:         3  65.0
8:         4 100.0
9:         5 140.0

What it does:
The first line sets neighbours to 1 if the difference from the preceding number is 1:

    data2 neighbours
 1:     1          0
 2:     5          0
 3:    11          0
 4:    15          0
 5:    24          0
 6:    31          0
 7:    32          1
 8:    65          0
 9:    99          0
10:   100          1
11:   101          1
12:   140          0

I want the neighbours variable to be 1 for every member of a run of neighbours, so I set the last element of each group to 1:

df[,neighbours := c(neighbours[1:(.N-1)],1),by = rleid(neighbours)]

    data2 neighbours
 1:     1          0
 2:     5          0
 3:    11          0
 4:    15          0
 5:    24          0
 6:    31          1
 7:    32          1
 8:    65          0
 9:    99          1
10:   100          1
11:   101          1
12:   140          0

After that, I group on each change of the neighbours value and set the value to the group mean where the rows are neighbours:

df[,ifelse(neighbours == 1,mean(data2),data2),by = rleid(neighbours)]

    rleid    V1
 1:     1   1.0
 2:     1   5.0
 3:     1  11.0
 4:     1  15.0
 5:     1  24.0
 6:     2  31.5
 7:     2  31.5
 8:     3  65.0
 9:     4 100.0
10:     4 100.0
11:     4 100.0
12:     5 140.0

Finally, take the unique values. And voilà.
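For readers without data.table, the flagging described above (mark every value that belongs to a run of neighbours) can be sketched in base R. This is not the answer's code, only an illustration of the same idea; step_is_one and is_neighbour are names made up for it.

```r
x <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

# TRUE where the step from one element to the next is exactly 1
step_is_one <- diff(x) == 1

# An element is a neighbour if the step into it or the step out of it is 1
is_neighbour <- c(FALSE, step_is_one) | c(step_is_one, FALSE)

as.integer(is_neighbour)
# 0 0 0 0 0 1 1 0 1 1 1 0
```

Note that a 0/1 flag alone cannot tell apart two runs that touch, such as 1, 2, 4, 5, where every element is flagged; a group id such as cumsum(c(TRUE, diff(x) != 1)) is still needed to separate them.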

edited 12 hours ago

answered 13 hours ago

denis

1,9611218


This is a dplyr version, which also uses cumsum(c(1, diff(x) != 1)) as the grouping variable:

library(dplyr)

data_2 %>% data.frame(x = .) %>%
  group_by(id = cumsum(c(1,diff(x)!=1))) %>%
  summarise(res = mean(x)) %>%
  select(res)

# A tibble: 9 x 1
    res
  <dbl>
1   1.0
2   5.0
3  11.0
4  15.0
5  24.0
6  31.5
7  65.0
8 100.0
9 140.0
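One detail worth noting: this grouping uses diff(x) != 1, while the accepted answer uses diff(x) > 1. For strictly increasing integers the two give the same groups, but they differ when a value repeats. The vector below is a made-up example (the question's data contains no duplicates), shown only to illustrate the difference.

```r
# Hypothetical input with a repeated value
x <- c(1, 2, 2, 3, 10)

# "> 1": a step of 0 does not start a new group, so 1, 2, 2, 3 stay together
grp_gt <- cumsum(c(TRUE, diff(x) > 1))
grp_gt
# 1 1 1 1 2

# "!= 1": a step of 0 does start a new group, so the run is split at the repeat
grp_neq <- cumsum(c(TRUE, diff(x) != 1))
grp_neq
# 1 1 2 2 3
```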

answered 6 hours ago

Lamia

3,0651717
