| availableCores {parallelly} | R Documentation |
Get Number of Available Cores on The Current Machine
Description
The current/main R session counts as one, meaning the minimum number of cores available is always at least one.
Usage
availableCores(
  constraints = NULL,
  methods = getOption2("parallelly.availableCores.methods", c("system",
    "/proc/self/status", "cgroups.cpuset", "cgroups.cpuquota", "cgroups2.cpu.max",
    "nproc", "mc.cores", "BiocParallel", "_R_CHECK_LIMIT_CORES_", "Bioconductor", "LSF",
    "PJM", "PBS", "SGE", "Slurm", "fallback", "custom")),
  na.rm = TRUE,
  logical = getOption2("parallelly.availableCores.logical", TRUE),
  default = c(current = 1L),
  which = c("min", "max", "all"),
  omit = getOption2("parallelly.availableCores.omit", 0L),
  max = getOption2("parallelly.availableCores.max", Inf)
)
Arguments
constraints |
An optional character specifying under what
constraints ("purposes") we are requesting the values.
For instance, on systems where multicore processing is not supported
(i.e. Windows), using |
methods |
A character vector specifying how to infer the number of available cores. |
na.rm |
If TRUE, only non-missing settings are considered/returned. |
logical |
Passed to
|
default |
The default number of cores to return if no non-missing settings are available. |
which |
A character specifying which settings to return.
If |
omit |
(integer; non-negative) Number of cores to not include. |
max |
(integer; positive) Maximum number of cores returned.
|
Details
The following settings ("methods") for inferring the number of cores are supported:
- "system" - Query detectCores(logical = logical).
- "/proc/self/status" - Query Cpus_allowed_list of /proc/self/status.
- "cgroups.cpuset" - On Unix, query control group (cgroup v1) value cpuset.set.
- "cgroups.cpuquota" - On Unix, query control group (cgroup v1) value cpu.cfs_quota_us / cpu.cfs_period_us.
- "cgroups2.cpu.max" - On Unix, query control group (cgroup v2) values cpu.max.
- "nproc" - On Unix, query system command nproc.
- "mc.cores" - If available, returns the value of option mc.cores. Note that mc.cores is defined as the number of additional R processes that can be used in addition to the main R process. This means that with mc.cores = 0 all calculations should be done in the main R process, i.e. we have exactly one core available for our calculations. The mc.cores option defaults to environment variable MC_CORES (and is set accordingly when the parallel package is loaded). The mc.cores option is used by, for instance, mclapply() of the parallel package.
- "connections" or "connections-N" - Query the current number of available R connections per freeConnections(). This is the maximum number of socket-based parallel cluster nodes that are possible to launch, because each one needs its own R connection. The "connections-N" form (e.g. connections-16) works like "connections" but uses freeConnections() - N as the upper limit, leaving N connections free for other purposes. The exception is when the result is zero or less, in which case 1L is still returned, because availableCores() should always return a positive integer.
- "BiocParallel" - Query environment variable BIOCPARALLEL_WORKER_NUMBER (integer), which is defined and used by BiocParallel (>= 1.27.2). If the former is set, this is the number of cores considered.
- "_R_CHECK_LIMIT_CORES_" - Query environment variable _R_CHECK_LIMIT_CORES_ (logical or "warn") used by R CMD check and set to true by R CMD check --as-cran. In addition, package parallelly sets _R_CHECK_LIMIT_CORES_=true when loaded if it detects that R CMD is running and builds package vignettes via R CMD build and R CMD check, which is something R CMD check does not do itself. If _R_CHECK_LIMIT_CORES_ is set to a non-false value, then a maximum of 2 cores is considered.
- "Bioconductor" - Query environment variable IS_BIOC_BUILD_MACHINE (logical) used by the Bioconductor (>= 3.16) build and check system. If set to true, then a maximum of 4 cores is considered.
- "LSF" - Query Platform Load Sharing Facility (LSF)/OpenLava environment variable LSB_DJOB_NUMPROC. Jobs with multiple (CPU) slots can be submitted on LSF using bsub -n 2 -R "span[hosts=1]" < hello.sh.
- "PJM" - Query Fujitsu Technical Computing Suite (that we choose to shorten as "PJM") environment variables PJM_VNODE_CORE and PJM_PROC_BY_NODE. The first is set when submitted with pjsub -L vnode-core=8 hello.sh.
- "PBS" - Query TORQUE/PBS environment variables PBS_NUM_PPN and NCPUS. Depending on PBS system configuration, these resource parameters may or may not default to one. An example of a job submission that results in this is qsub -l nodes=1:ppn=2, which requests one node with two cores.
- "SGE" - Query the "Grid Engine" scheduler environment variable NSLOTS. An example of a job submission that results in this is qsub -pe smp 2 (or qsub -pe by_node 2), which requests two cores on a single machine. Known Grid Engine schedulers are Oracle Grid Engine (OGE; acquired Sun Microsystems in 2010), Univa Grid Engine (UGE; fork of open-source SGE 6.2u5), Altair Grid Engine (AGE; acquired Univa Corporation in 2020), Son of Grid Engine (SGE aka SoGE; open-source fork of SGE 6.2u5), and
- "Slurm" - Query Simple Linux Utility for Resource Management (Slurm) environment variable SLURM_CPUS_PER_TASK. This may or may not be set. It can be set when submitting a job, e.g. sbatch --cpus-per-task=2 hello.sh, or by adding #SBATCH --cpus-per-task=2 to the 'hello.sh' script. If SLURM_CPUS_PER_TASK is not set, then it will fall back to using SLURM_CPUS_ON_NODE if the job is a single-node job (SLURM_JOB_NUM_NODES is 1), e.g. sbatch --ntasks=2 hello.sh. To make sure all tasks are assigned to a single node, specify --nodes=1, e.g. sbatch --nodes=1 --ntasks=16 hello.sh.
- "custom" - If option parallelly.availableCores.custom is set and is a function, then this function will be called (without arguments) and its value will be coerced to an integer, which will be interpreted as a number of available cores. If the value is NA, then it will be ignored. It is safe for this custom function to call availableCores(); if done, the custom function will not be recursively called.
For any other value of a methods element, the R option with the
same name is queried. If that is not set, the system environment
variable is queried. If neither is set, a missing value is returned.
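The fallback lookup for unrecognized method names can be sketched as follows. This is a minimal illustration: the name MY_PROJECT_CORES is hypothetical (it is not a built-in method), and the sketch assumes no R option of that name is already set in the session.

```r
library(parallelly)

## "MY_PROJECT_CORES" is a hypothetical name chosen for illustration only.
## Remember any previous value so the environment can be restored afterwards.
old <- Sys.getenv("MY_PROJECT_CORES", unset = NA_character_)
Sys.setenv(MY_PROJECT_CORES = "3")

## No R option with this name is set, so the environment variable
## of the same name is queried instead.
n <- availableCores(methods = "MY_PROJECT_CORES")
print(n)

## Restore the environment
if (is.na(old)) {
  Sys.unsetenv("MY_PROJECT_CORES")
} else {
  Sys.setenv(MY_PROJECT_CORES = old)
}
```

Setting an R option named MY_PROJECT_CORES would take precedence over the environment variable, per the lookup order described above.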
Value
Return a positive (>= 1) integer.
If which = "all", then more than one value may be returned.
Together with na.rm = FALSE, missing values may also be returned.
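As a rough sketch of the different return shapes, the calls below compare the default result with which = "all", with and without na.rm. The actual values depend entirely on the machine and environment, so none are shown here.

```r
library(parallelly)

## Default (which = "min"): a single positive integer
n <- availableCores()
print(n)

## One value per method that reported a setting
all_cores <- availableCores(which = "all")
print(all_cores)

## Also keep the methods that reported nothing; these show up as NA
all_with_na <- availableCores(which = "all", na.rm = FALSE)
print(all_with_na)
```

Inspecting the which = "all" output can help when tracking down which setting limits the number of cores in a given environment.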
Avoid ending up with zero cores
Note that some machines might have a limited number of cores, or the R process might run in a container or a cgroup that only provides a small number of cores. A real-world example is running R in webR, which is single-core by design. Another example is a free Posit Cloud account, which is limited to a single core. In such cases,
ncores <- availableCores() - 1
may return zero, which is often not intended and is likely to give an error downstream. Instead, use:
ncores <- availableCores(omit = 1)
to put aside one of the cores from being used. Regardless of how many cores you put aside, this function is guaranteed to return at least one core.
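The difference between the two approaches can be illustrated in base R. The helper cores_after_omit() below is hypothetical and not part of parallelly; it only mimics the documented guarantee that the result never drops below one, and is not the package's actual implementation.

```r
## Hypothetical helper, NOT part of parallelly: mimics the documented
## guarantee that omitting cores never drops the result below one.
cores_after_omit <- function(n, omit = 0L) {
  stopifnot(n >= 1L, omit >= 0L)
  max(1L, as.integer(n) - as.integer(omit))
}

n <- 1L                            # e.g. a single-core container
naive <- n - 1L                    # 0: no cores left to run anything
safe  <- cores_after_omit(n, 1L)   # 1: clamped at one

print(c(naive = naive, safe = safe))
```

With more cores than are omitted, both approaches agree; they differ only when the subtraction would reach zero or below.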
Advanced usage
It is possible to override the maximum number of cores on the machine
as reported by availableCores(methods = "system"). This can be
done by first specifying
options(parallelly.availableCores.methods = "mc.cores") and
then the number of cores to use, e.g. options(mc.cores = 8).
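A minimal sketch of this override, assuming parallelly is installed. It sets both options temporarily and restores the previous settings afterwards.

```r
library(parallelly)

## Set both options at once; options() returns the previous values
## so that they can be restored afterwards.
old <- options(
  parallelly.availableCores.methods = "mc.cores",
  mc.cores = 8L
)

## Only the "mc.cores" method is consulted now
n <- availableCores()
print(n)

## Restore the previous settings
options(old)
```

Because only the "mc.cores" method is consulted here, limits from job schedulers or cgroups are no longer taken into account, so this is best reserved for cases where that is intended.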
See Also
To get the set of available workers regardless of machine,
see availableWorkers().
Examples
message(paste("Number of cores available:", availableCores()))
## Not run:
options(mc.cores = 2L)
message(paste("Number of cores available:", availableCores()))
## End(Not run)
## Not run:
## IMPORTANT: availableCores() may return 1L
options(mc.cores = 1L)
ncores <- availableCores() - 1 ## ncores = 0
ncores <- availableCores(omit = 1) ## ncores = 1
message(paste("Number of cores to use:", ncores))
## End(Not run)
## Not run:
## Use 75% of the cores on the system but never more than four
options(parallelly.availableCores.custom = function() {
  ncores <- max(parallel::detectCores(), 1L, na.rm = TRUE)
  ncores <- min(as.integer(0.75 * ncores), 4L)
  max(1L, ncores)
})
message(paste("Number of cores available:", availableCores()))
## Use 50% of the cores according to availableCores(), e.g.
## allocated by a job scheduler or cgroups.
## Note that it is safe to call availableCores() here.
options(parallelly.availableCores.custom = function() {
  0.50 * parallelly::availableCores()
})
message(paste("Number of cores available:", availableCores()))
## End(Not run)