library(doParallel)
<- detectCores()
ncores <- makeCluster(getOption("cl.cores", ncores/2))
cl registerDoParallel(cl)
<- ncores
nruns
<- 2
x <- foreach(i=1:4, .combine=c) %dopar% { c(i, x^2) }
res
stopCluster(cl)
res
[1] 1 4 2 4 3 4 4 4
Jong-Hoon Kim
June 14, 2024
I find that parallel
, doParallel
and foreach
packages provide the easiest approach for parallel computing in R. The doParallel
vignette provides a great overview. library(doParallel)
command automatically loads required packages.
library(doParallel)
ncores <- detectCores()
cl <- makeCluster(getOption("cl.cores", ncores/2))
registerDoParallel(cl)
nruns <- ncores
x <- 2
res <- foreach(i=1:4, .combine=c) %dopar% { c(i, x^2) }
stopCluster(cl)
res
[1] 1 4 2 4 3 4 4 4
We can export packages.
cl <- makeCluster(getOption("cl.cores", ncores/2))
registerDoParallel(cl)
res <-
foreach(i=1:5, .packages=c("data.table"), .combine=cbind) %dopar% {
frollmean(rnorm(10), 7)
}
stopCluster(cl)
res
result.1 result.2 result.3 result.4 result.5
[1,] NA NA NA NA NA
[2,] NA NA NA NA NA
[3,] NA NA NA NA NA
[4,] NA NA NA NA NA
[5,] NA NA NA NA NA
[6,] NA NA NA NA NA
[7,] 0.10610803 -0.2996241 0.03278228 0.4701361 -0.004181703
[8,] 0.05807023 -0.3114088 0.20836936 0.3251219 0.400871629
[9,] 0.14870568 -0.3538062 0.35330620 0.3301056 0.293785357
[10,] -0.01413357 -0.4768548 0.19839694 0.2946405 -0.029407561
We may want to set the same seed for each worker.
cl <- makeCluster(getOption("cl.cores", ncores/2))
registerDoParallel(cl)
res <-
foreach(i=1:5, .packages=c("data.table"), .combine=cbind) %dopar% {
set.seed(12)
frollmean(rnorm(10), 7)
}
stopCluster(cl)
res
result.1 result.2 result.3 result.4 result.5
[1,] NA NA NA NA NA
[2,] NA NA NA NA NA
[3,] NA NA NA NA NA
[4,] NA NA NA NA NA
[5,] NA NA NA NA NA
[6,] NA NA NA NA NA
[7,] -0.6236335 -0.6236335 -0.6236335 -0.6236335 -0.6236335
[8,] -0.5018746 -0.5018746 -0.5018746 -0.5018746 -0.5018746
[9,] -0.7423937 -0.7423937 -0.7423937 -0.7423937 -0.7423937
[10,] -0.5445709 -0.5445709 -0.5445709 -0.5445709 -0.5445709
Different foreach
sessions with the same random seed are not reproducible.
cl <- makeCluster(getOption("cl.cores", ncores/2))
registerDoParallel(cl)
set.seed(12)
a1 <- foreach(i=1:2, combine=cbind) %dopar% { rnorm(5) }
b1 <- foreach(i=1:2, combine=cbind) %dopar% { rnorm(5) }
set.seed(12)
a2 <- foreach(i=1:2, combine=cbind) %dopar% { rnorm(5) }
b2 <- foreach(i=1:2, combine=cbind) %dopar% { rnorm(5) }
identical(a1, a2) && identical(b1, b2)
[1] FALSE
Reproducibility across different foreach
sessions are possible using doRNG package. Examples below were adapted from the stackoverflow post.
You use %dorng%
instead of %dopar%
in the doRNG
approach
library(doRNG)
cl <- makeCluster(getOption("cl.cores", ncores/2))
registerDoParallel(cl)
set.seed(12)
a1 <- foreach(i=1:2, combine=cbind) %dorng% { rnorm(5) }
b1 <- foreach(i=1:2, combine=cbind) %dorng% { rnorm(5) }
set.seed(12)
a2 <- foreach(i=1:2, combine=cbind) %dorng% { rnorm(5) }
b2 <- foreach(i=1:2, combine=cbind) %dorng% { rnorm(5) }
identical(a1, a2) && identical(b1, b2)
[1] TRUE
We may want to examine parallel workers.