Using nested apply functions instead of nested for loops






My objective here was to iterate across each column in a df and then for each column iterate down each row and perform a function. The specific function in this case replaces the NA values with the corresponding value in the final column, but the details of the function required are not relevant to the question here. I got the results I needed using two nested for loops like this:




for (j in 1:ncol(df.i)) {
  for (i in 1:nrow(df.i)) {
    df.i[i, j] <- ifelse(is.na(df.i[i, j]), df.i[i, 39], df.i[i, j])
  }
}



However, I believe this should be possible using an apply(df.i, 1, function) nested within an apply(df.i, 2, function), but I'm not sure whether that is possible or how to do it. Does anyone know how to achieve the same thing with nested calls to apply?







ifelse is a vectorized function, so your inner loop can be replaced with: df.i[,j] <- ifelse(is.na(df.i[,j]), df.i[,39], df.i[,j]). This can now be used in your apply function.
– Dave2e
15 mins ago









Beware when using apply() with data.frames. apply() coerces the data.frame to a matrix, in which all columns have the same data type. This does not seem to be an issue in your particular case, but in general it is safer to use lapply().
– Uwe
10 mins ago
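
The coercion mentioned in this comment can be seen with a small sketch. The data frame `df` and its columns `id`, `a` and `fallback` are hypothetical names invented for illustration; they are not from the question.

```r
# Hypothetical mixed-type data frame; `fallback` plays the role of column 39
df <- data.frame(id = c("r1", "r2", "r3"),
                 a = c(1, NA, 3),
                 fallback = c(10, 20, 30),
                 stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first, so the numeric
# columns are turned into character along with `id`
m <- apply(df, 2, function(col) col)
typeof(m)

# lapply() visits each column as its own vector and keeps its type;
# here only the numeric columns are processed
num_cols <- vapply(df, is.numeric, logical(1))
df[num_cols] <- lapply(df[num_cols],
                       function(col) ifelse(is.na(col), df$fallback, col))
df
```

Assigning the list returned by lapply() back into df[num_cols] keeps the object a data frame, which is one reason this pattern is often preferred over apply() for data frames.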








1 Answer



Here are three ways to get the same result: a single loop with a vectorized body, a fully vectorized call, and your original nested loops.



First, an example dataset (a matrix with 10 rows, 40 columns and 50 randomly placed NA values).


set.seed(5345) # Make the results reproducible
df.i <- matrix(1:400, ncol = 40)
is.na(df.i) <- sample(400, 50)



Now, the approach from the comment by @Dave2e: keep just one for loop over the columns and vectorize the innermost one.




df.i1 <- df.i # Work with a copy

for (j in 1:ncol(df.i1)) {
  df.i1[, j] <- ifelse(is.na(df.i1[, j]), df.i1[, 39], df.i1[, j])
}



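Then, fully vectorized, with no loops at all. This relies on df.i being a matrix: ifelse() recycles the column-39 vector down every column, so each NA is filled from its own row.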


df.i2 <- ifelse(is.na(df.i), df.i[, 39], df.i)
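
A small worked case may make it easier to check how the one-liner pairs each NA with the value from its own row. The 3 x 3 matrix `m` below is a hypothetical example, with its last column standing in for column 39; the sketch assumes a matrix, as in this answer. A data frame is a list of columns, so it is probably safer to handle it column by column instead.

```r
# Hypothetical 3 x 3 matrix, filled column by column;
# the third column is the fallback
m <- matrix(c(NA,  2,  3,
               4, NA,  6,
              70, 80, 90), nrow = 3)

# is.na(m) has 9 elements in column-major order; m[, 3] has 3 and is
# recycled, so element k is paired with m[(k - 1) %% 3 + 1, 3],
# which is the fallback value from the same row
filled <- ifelse(is.na(m), m[, 3], m)
filled
#      [,1] [,2] [,3]
# [1,]   70    4   70
# [2,]    2   80   80
# [3,]    3    6   90
```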



And your solution, as posted in the question.


for (j in 1:ncol(df.i)) {
  for (i in 1:nrow(df.i)) {
    df.i[i, j] <- ifelse(is.na(df.i[i, j]), df.i[i, 39], df.i[i, j])
  }
}



Compare the results.


identical(df.i, df.i1)
#[1] TRUE

identical(df.i, df.i2)
#[1] TRUE
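
Since the question asked specifically about apply(), here is a minimal sketch of a column-wise apply() version. It rebuilds the same example data under the hypothetical name `m` so that it runs on its own; `via_apply` and `via_ifelse` are likewise illustrative names. A second apply() over rows nested inside does not seem necessary, because ifelse() already handles a whole column at once.

```r
set.seed(5345)                 # Same example data as above
m <- matrix(1:400, ncol = 40)
is.na(m) <- sample(400, 50)

# apply() over columns (MARGIN = 2): each column arrives as a plain
# vector, and the vectorized ifelse() fills its NAs from column 39
via_apply <- apply(m, 2, function(col) ifelse(is.na(col), m[, 39], col))

# Reference result from the fully vectorized one-liner
via_ifelse <- ifelse(is.na(m), m[, 39], m)

all.equal(via_apply, via_ifelse)
```

Note that this works on a matrix. For a data frame, apply() would first coerce it to a matrix, so a column-wise lapply() is likely the safer choice there.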





