Loop to identify data location in a Dataframe, and populate to new Dataframe (R)

Multi tool use
Multi tool use
The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP


Loop to identify data location in a Dataframe, and populate to new Dataframe (R)



So I have a dataframe of user IDs as columns with rows (corresponding to period intervals) with binary variables (where an event happened) like this:


date Id1 Id2 id3 id4 id5

row1 1 0 0 1 0
row2 0 0 1 1 0
row3 0 1 0 0 1
row4 1 1 0 0 0
row5 0 0 1 1 1
...



I am trying to build a loop that runs through each row of each column which identifies any cell with a 1 and populates a new data frame with the row number of each occurrence, e.g:


occ. Id1 Id2 id3 id4 id5
1 1 3 2 1 3
2 4 4 5 2 5
3 5
4
5



I am pretty lost in how to approach this, if anyone is able to help?





Your second frame isn't really a frame since you have different lengths of each column. Additionally, your sample frames don't match, can you edit your question to make them representative?
– r2evans
12 hours ago





Best to store in a list, e.g. lapply(df[, -1], function(x) which(x == 1))
– Maurits Evers
12 hours ago


list


lapply(df[, -1], function(x) which(x == 1))





@r2evans - updated!
– user10137196
11 hours ago




3 Answers
3



You can using reshape::melt with spread from tidyverse


reshape::melt


spread


tidyverse


reshape::melt(df)%>%filter(.,value==1)%>%group_by(variable)%>%mutate(n=row_number())%>%spread(variable,date)
# A tibble: 3 x 7
value n Id1 Id2 id3 id4 id5
* <int> <int> <fctr> <fctr> <fctr> <fctr> <fctr>
1 1 1 row1 row3 row2 row1 row3
2 1 2 row4 row4 row5 row2 row5
3 1 3 NA NA NA row5 NA



Here is an option using tidyverse


tidyverse


library(tidyverse)
gather(df1, key, val, Id1:id5) %>%
filter(val != 0) %>%
mutate(val = readr::parse_number(date)) %>%
spread(key, val) %>%
mutate_at(vars(Id1:id5), funs(c(.[!is.na(.)], .[is.na(.)]))) %>%
rename(occ = date) %>%
mutate( occ = readr::parse_number(occ))
# occ Id1 Id2 id3 id4 id5
#1 1 1 3 2 1 3
#2 2 4 4 5 2 5
#3 3 NA NA NA 5 NA
#4 4 NA NA NA NA NA
#5 5 NA NA NA NA NA


df1 <- structure(list(date = c("row1", "row2", "row3", "row4", "row5"
), Id1 = c(1L, 0L, 0L, 1L, 0L), Id2 = c(0L, 0L, 1L, 1L, 0L),
id3 = c(0L, 1L, 0L, 0L, 1L), id4 = c(1L, 1L, 0L, 0L, 1L),
id5 = c(0L, 0L, 1L, 0L, 1L)), .Names = c("date", "Id1", "Id2",
"id3", "id4", "id5"), class = "data.frame", row.names = c(NA,
-5L))



Here is another slightly different tidyverse option:


tidyverse


library(tidyverse)
df %>%
gather(id, value, Id1:id5) %>%
filter(value != 0) %>%
mutate(row = str_replace(date, 'row', '')) %>%
select(id, row) %>%
group_by(id) %>%
mutate(occ = row_number()) %>%
spread(id, row)

# A tibble: 3 x 6
occ Id1 Id2 id3 id4 id5
<int> <chr> <chr> <chr> <chr> <chr>
1 1 1 3 2 1 3
2 2 4 4 5 2 5
3 3 NA NA NA 5 NA


#sample data:

df <- read.table(text='date Id1 Id2 id3 id4 id5
row1 1 0 0 1 0
row2 0 0 1 1 0
row3 0 1 0 0 1
row4 1 1 0 0 0
row5 0 0 1 1 1', header = TRUE)






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

KU pBFA WunA5CBGUic,W,n19sy00hWhBQBnxUZjMuqxMg,zSC5cYd84 722Ei50s1s,dEly 9i6LRM7qyz Gx
pM9ah2SeuZrZ3zj7x,gfxLJ,e1qRh TB,kD,m3,J8XjxODuaJcAS77KrmICyOZ9z uO4pux,nEYyfKNR,x,7C,qYy,OqxilH

Popular posts from this blog

Makefile test if variable is not empty

Visual Studio Code: How to configure includePath for better IntelliSense results

Will Oldham