hwloc-calc(1) hwloc hwloc-calc(1)
NAME
hwloc-calc - Operate on cpu mask strings and objects
SYNOPSIS
hwloc-calc [topology options] [options] <location1> [<location2> [...] ]
Note that hwloc(7) provides a detailed explanation of the hwloc system
and of valid <location> formats; it should be read before reading this
man page.
TOPOLOGY OPTIONS
All topology options must be given before all other options.
--no-smt, --no-smt=<N>
Only keep the first PU per core in the input locations. If
<N> is specified, keep the <N>-th instead, if any. PUs are
ordered by physical index during this filtering.
Note that this option is applied after searching locations.
Hence --no-smt pu:2-5 will first select the PUs #2 to #5 in
the machine before keeping one of them per core. To rather
get PUs #2 to #5 after filtering one per core, you should
combine invocations:
hwloc-calc --restrict $(hwloc-calc --no-smt all) pu:2-5
--cpukind <n>, --cpukind <infoname>=<infovalue>
Only keep PUs whose CPU kind matches. Either a single CPU kind
is specified as an index, or the info attribute name-value
will select matching kinds.
When specified by index, it corresponds to hwloc's ranking of
CPU kinds, which returns energy-efficient cores first, and
high-performance power-hungry cores last. The full list of
CPU kinds may be seen with lstopo --cpukinds.
Note that this option is applied after searching locations.
Hence --cpukind 0 core:1 will return the second core of the
machine if it is of kind 0, and nothing otherwise. To rather
get the second core among those of kind 0, you should combine
invocations:
hwloc-calc --restrict $(hwloc-calc --cpukind 0 all) core:1
--default-nodes
Only keep NUMA nodes that are considered default nodes on
heterogeneous memory platforms. This usually includes DRAM
memory nodes (or nodes of the same memory tier) rather than
nodes with specific characteristics (HBM, NVM, CXL, etc).
This option is useful for splitting the topology by NUMA
domain when binding one task per domain even if some NUMA
domains have the same locality (e.g. one DRAM and one HBM
node per socket).
See hwloc_topology_get_default_nodeset() for details.
--restrict <cpuset>
Restrict the topology to the given cpuset. This removes some
PUs and their now-child-less parents.
This is useful when combining invocations to filter some
objects before selecting among them.
Beware that restricting the PUs in a topology may change the
logical indexes of many objects, including NUMA nodes.
--restrict nodeset=<nodeset>
Restrict the topology to the given nodeset (unless
--restrict-flags specifies something different). This
removes some NUMA nodes and their now-child-less parents.
Beware that restricting the NUMA nodes in a topology may
change the logical indexes of many objects, including PUs.
--restrict-flags <flags>
Enforce flags when restricting the topology. Flags may be
given as numeric values or as a comma-separated list of flag
names that are passed to hwloc_topology_restrict(). Those
names may be substrings of actual flag names as long as a
single one matches, for instance bynodeset,memless. The
default is 0 (or none).
--disallowed
Include objects disallowed by administrative limitations.
-i <path>, --input <path>
Read the topology from <path> instead of discovering the
topology of the local machine.
If <path> is a file, it may be an XML file exported by a
previous hwloc program. If <path> is "-", the XML is read from
the standard input.
On Linux, <path> may be a directory containing the topology
files gathered from another machine with
hwloc-gather-topology.
On x86, <path> may be a directory containing a cpuid dump
gathered with hwloc-gather-cpuid.
When the archivemount program is available, <path> may also
be a tarball containing such Linux or x86 topology files.
-i <specification>, --input <specification>
Simulate a fake hierarchy (instead of discovering the
topology on the local machine). If <specification> is "node:2
pu:3", the topology will contain two NUMA nodes with 3
processing units in each of them. The <specification> string
must end with a number of PUs.
--if <format>, --input-format <format>
Force the input to be parsed in the given format, among xml, fsroot,
cpuid and synthetic.
OUTPUT CONVERSION OPTIONS
By default, the output is a CPU set (or nodeset). These options
convert this set into objects, count them, etc.
All these options must be given after all topology options above.
-N --number-of <type|depth>
Report the number of objects of the given type or depth that
intersect the CPU set. This is convenient for finding how many
cores, NUMA nodes or PUs are available in a machine.
<type> may contain a filter to select specific objects among the
type. For instance -N "numa[hbm]" counts NUMA nodes marked with
subtype "HBM", while -N "numa[mcdram]" only counts MCDRAM NUMA
nodes on KNL.
If an OS device subtype such as gpu is given instead of osdev,
only the OS devices of that subtype will be counted.
Special values such as cpukind and memorytier may be given to
return the number of cpukinds or memory tiers matching the input
location.
-I --intersect <type|depth>
Find the list of objects of the given type or depth that
intersect the CPU set and report the comma-separated list of
their indexes instead of the cpu mask string. This may be used
for determining the list of objects above or below the input
objects.
When combined with --physical, the list is convenient to pass to
external tools such as taskset or numactl --physcpubind or
--membind. This is different from --largest since the latter
requires that all reported objects are strictly included inside
the input objects.
<type> may contain a filter to select specific objects among the
type. For instance -I "numa[hbm]" lists NUMA nodes marked with
subtype "HBM", while -I "numa[mcdram]" only lists MCDRAM NUMA
nodes on KNL. Note that this filter applies when selecting
objects, but not when outputting them, e.g. MCDRAM NUMA node #3
is output as 7 (NUMA node #7) instead of 3.
If an OS device subtype such as gpu is given instead of osdev,
only the OS devices of that subtype will be returned.
Special values such as cpukind and memorytier may be given to
return the list of cpukind or memory tier indexes matching the
input location.
If combined with --object-output, object indexes are prefixed
with types (e.g. Core:0 instead of 0).
-H --hierarchical <type1>.<type2>...
Find the list of objects of type <type2> that intersect the CPU
set and report the space-separated list of their hierarchical
indexes with respect to <type1>, <type2>, etc. For instance, if
package.core is given, the output would be Package:1.Core:2
Package:2.Core:3 if the input contains the third core of the
second package and the fourth core of the third package.
Only normal CPU-side object types should be used.
NUMA nodes may be used but they may cause redundancy in the
output on heterogeneous memory platforms. For instance, on a
platform with both DRAM and HBM memory on a package, the first
core will be considered both as first core of first NUMA node
(DRAM) and as first core of second NUMA node (HBM).
--largest
Report (in a human readable format) the list of largest objects
which exactly include all input objects (by looking at their CPU
sets). None of these output objects intersect each other, and
the sum of them is exactly equivalent to the input. No larger
object is included in the input.
This is different from --intersect where reported objects may
not be strictly included in the input.
--local-memory
Report the list of NUMA nodes that are local to the input
objects.
This option is similar to -I numa but the way nodes are selected
is different: The selection performed by --local-memory may be
precisely configured with --local-memory-flags, while -I numa
just selects all nodes that are somehow local to any of the
input objects.
If combined with --object-output, object indexes are prefixed
with types (e.g. NUMANode:0 instead of 0).
--local-memory-flags
Change the flags used to select local NUMA nodes. Flags may be
given as numeric values or as a comma-separated list of flag
names that are passed to hwloc_get_local_numanode_objs(). Those
names may be substrings of actual flag names as long as a single
one matches. The default is 0xb (or smaller,larger,intersects),
which means NUMA nodes are displayed if their locality
contains, is contained in, or intersects the locality of the given
object.
This option enables --local-memory.
--best-memattr <name>
Enable the listing of local memory nodes with --local-memory,
but only display the local nodes that have the best value for
the memory attribute given by <name> (or as an index).
If the memory attribute values depend on the initiator, the
hwloc-calc input objects are used as the initiator.
Standard attribute names are Capacity, Locality, Bandwidth, and
Latency. All existing attributes in the current topology may be
listed with
$ lstopo --memattrs
If combined with --object-output, the object index is prefixed
with its type (e.g. NUMANode:0 instead of 0).
<name> may be suffixed with flags to tune the selection of best
nodes, for instance as bandwidth,strict,default.
default means that default nodes are reported if no best could
be found (see --default-nodes). If neither best nor default
nodes could be found, all local nodes are reported.
strict means that nodes are selected only if their performance
is the best for all the input CPUs. On a dual-socket machine
with HBM in each socket, both HBMs are the best for their local
socket, but not for the remote socket. Hence both HBMs are
considered best for the entire machine by default, but neither is
if strict is given.
INPUT / OUTPUT SET AND OBJECT OPTIONS
These options configure how objects and CPU/node sets are parsed on
input and formatted on output.
All these options must be given after all topology options above.
-p --physical
Use OS/physical indexes instead of logical indexes for both
input and output.
-l --logical
Use logical indexes instead of physical/OS indexes for both
input and output (default).
--pi --physical-input
Use OS/physical indexes instead of logical indexes for input.
--li --logical-input
Use logical indexes instead of physical/OS indexes for input
(default).
--po --physical-output
Use OS/physical indexes instead of logical indexes for
output.
--lo --logical-output
Use logical indexes instead of physical/OS indexes for output
(default, except for cpusets which are always physical).
-n --nodeset
Interpret both input and output sets as nodesets instead of
CPU sets. See --nodeset-output and --nodeset-input below for
details.
--no --nodeset-output
Report nodesets instead of CPU sets. This output is more
precise than the default CPU set output when memory locality
matters because it properly describes CPU-less NUMA nodes, as
well as NUMA-nodes that are local to multiple CPUs.
--ni --nodeset-input
Interpret input sets as nodesets instead of CPU sets.
FORMATTING OPTIONS
All these options must be given after all topology options above.
--oo --object-output
When reporting object indexes (e.g. with -I or --local-memory),
this option prefixes these indexes with types (e.g. Core:0
instead of 0).
--sep <sep>
Change the field separator in the output. By default, a space
is used to separate output objects (for instance when
--hierarchical or --largest is given) while a comma is used to
separate indexes (for instance when --intersect is given).
--single
Singlify the output to a single CPU.
--cpuset-output-format <hwloc|list|taskset|systemd-dbus-api> --cof
<hwloc|list|taskset|systemd-dbus-api>
Change the format of displayed bitmap strings (CPU set or
nodeset). By default, the hwloc-specific format is used. If
list is given, the output is a comma-separated list of numbers or
ranges, e.g. 2,4-5,8. If taskset is given, the output is
compatible with the taskset program (replaces the former
--taskset option). If systemd-dbus-api is given, the output is
compatible with systemd's D-Bus API, e.g. "ay 0x0002 0x78 0x04"
for the CPU set list "3-6,10".
For convenience, --nodeset-output-format (or --nof) behaves the
same but also implies --nodeset-output.
This option has no impact on the format of input CPU set
strings, see --cpuset-input-format.
--cpuset-input-format <hwloc|list|taskset> --cif <hwloc|list|taskset>
Change the format of input bitmap strings (CPU set or nodeset).
By default, the tool tries to guess the type automatically
between hwloc, list or taskset formats. This option forces the
parsing format to avoid ambiguity for instance when "1,3,5" may
be parsed as a hwloc cpuset "0x1,0x00000003,0x00000005" or as
list "1-1,3-3,5-5".
This option has no impact on the format of output CPU set
strings, see --cpuset-output-format.
-q --quiet
Hide non-fatal error messages. These mostly concern locations
pointing to non-existing objects.
-v --verbose
Verbose output.
--version
Report version and exit.
-h --help
Display help message and exit.
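The relation between the list format and a CPU mask string, as described for --cpuset-output-format above, can be sketched with plain shell arithmetic. The helper name list_to_mask is hypothetical and not part of hwloc. The sketch assumes that all indexes are below 32, so the mask fits in a single 32-bit word, and it only mimics the single-word hwloc format.

```shell
# Hypothetical helper (not part of hwloc): convert a CPU list such as
# "2,4-5,8" into a single-word hexadecimal mask.
# Assumption: every index is below 32.
list_to_mask() (
    mask=0
    IFS=,
    set -- $1                      # split the list on commas
    for item in "$@"; do
        lo=${item%-*}              # "4-5" -> 4, "8" -> 8
        hi=${item#*-}              # "4-5" -> 5, "8" -> 8
        i=$lo
        while [ "$i" -le "$hi" ]; do
            mask=$(( $mask | (1 << $i) ))
            i=$(( $i + 1 ))
        done
    done
    printf '0x%08x\n' "$mask"
)

list_to_mask "2,4-5,8"
```

The function body runs in a subshell, so the temporary variables and the modified IFS do not leak into the calling shell.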
DESCRIPTION
hwloc-calc generates and manipulates CPU mask strings or objects. Both
input and output may be either objects (with physical or logical
indexes), CPU lists (with physical or logical indexes), or CPU mask
strings (always physically indexed). Input location specification is
described in hwloc(7).
If objects or CPU mask strings are given on the command-line, they are
combined and a single output is printed. If no object or CPU mask
strings are given on the command-line, the program will read the
standard input. It will combine multiple objects or CPU mask strings
that are given on the same line of the standard input, with spaces
as separators. Different input lines will be processed separately.
Command-line arguments and options are processed in order. First
topology configuration options should be given. Then, for instance,
changing the type of input indexes with --li or changing the input
topology with -i only affects the processing of the following arguments.
NOTE: It is highly recommended that you read the hwloc(7) overview page
before reading this man page. Most of the concepts described in
hwloc(7) directly apply to the hwloc-calc utility.
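The ordering and standard-input rules above can be tried without depending on the local machine by using a synthetic topology. This is a minimal sketch, assuming hwloc-calc is installed and reusing the "node:2 pu:3" specification given as an example for -i. The variable name topo is only illustrative.

```shell
# Assumption: hwloc-calc is in PATH; "node:2 pu:3" is the synthetic
# topology used as an example in the description of -i.
topo="node:2 pu:3"

if command -v hwloc-calc >/dev/null 2>&1; then
    # Topology option first, then the conversion option, then the location.
    hwloc-calc -i "$topo" -N pu all

    # No location on the command line: each line of the standard input
    # is combined and reported separately.
    printf 'pu:0 pu:1\npu:5\n' | hwloc-calc -i "$topo"
else
    echo "hwloc-calc not found, skipping" >&2
fi
```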
EXAMPLES
hwloc-calc's operation is best described through several examples.
To display the (physical) CPU mask corresponding to the second package:
$ hwloc-calc package:1
0x000000f0
To display the (physical) CPU mask corresponding to the third package,
excluding its even numbered logical processors:
$ hwloc-calc package:2 ~PU:even
0x00000c00
To display the (physical) CPU mask of the entire topology except the
third package:
$ hwloc-calc all ~package:2
0x0000f0ff
To combine two (physical) CPU masks:
$ hwloc-calc 0x0000ffff 0xff000000
0xff00ffff
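The set operations behind these results are plain bitwise arithmetic on the masks. The following sketch reproduces the last two outputs in the shell. The helper names mask_or and mask_andnot are hypothetical, and the sketch assumes that the full machine mask is 0x0000ffff and that the excluded package covers 0x00000f00. It only handles masks that fit in a single word.

```shell
# Hypothetical helpers (not part of hwloc) for single-word masks.
mask_or()     { printf '0x%08x\n' "$(( $1 | $2 ))"; }    # union of two sets
mask_andnot() { printf '0x%08x\n' "$(( $1 & ~$2 ))"; }   # first set minus second

# Combining two masks, as hwloc-calc does with two locations.
mask_or 0x0000ffff 0xff000000

# Assumed values: the whole machine is 0x0000ffff, the excluded package
# is 0x00000f00.
mask_andnot 0x0000ffff 0x00000f00
```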
Examples of listing or counting objects
To display the list of logical numbers of processors included in the
second package:
$ hwloc-calc --intersect PU package:1
4,5,6,7
To bind GNU OpenMP threads logically over the whole machine, we need to
use physical number output instead:
$ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU all`
$ echo $GOMP_CPU_AFFINITY
0,4,1,5,2,6,3,7
To display the list of NUMA nodes, by physical indexes, that intersect
a given (physical) CPU mask:
$ hwloc-calc --physical --intersect NUMAnode 0xf0f0f0f0
0,2
To find how many cores are in the second CPU kind (those cores are
likely higher-performance and more power-hungry than cores of the first
kind):
$ hwloc-calc --cpukind 1 -N core all
4
To convert a cpu mask to human-readable output, the -H option can be
used to emit a space-delimited list of locations:
$ echo 0x000000f0 | hwloc-calc -q -H package.core
Package:1.Core:0 Package:1.Core:1 Package:1.Core:2 Package:1.Core:3
To use some other character (e.g., a comma) instead of spaces in
output, use the --sep option:
$ echo 0x000000f0 | hwloc-calc -q -H package.core --sep ,
Package:1.Core:0,Package:1.Core:1,Package:1.Core:2,Package:1.Core:3
To synthesize a set of cores into largest objects on a 2-node 2-package
2-core machine:
$ hwloc-calc core:0 --largest
Core:0
$ hwloc-calc core:0-1 --largest
Package:0
$ hwloc-calc core:4-7 --largest
L3Cache:1
$ hwloc-calc core:2-6 --largest
Package:1 Package:2 Core:6
$ hwloc-calc pack:2 --largest
Package:2
$ hwloc-calc package:2-3 --largest
L3Cache:1
To get the set of first threads of all cores:
$ hwloc-calc core:all.pu:0
0xffff0000
$ hwloc-calc --no-smt all -I pu
0,2,4,6,8,10,12,14
To get the number of cpukinds inside a package:
$ hwloc-calc -N cpukind package:0
2
Examples of listing or counting NUMA nodes
To display the list of NUMA nodes, by physical indexes, whose locality
is exactly equal to a Package:
$ hwloc-calc --local-memory-flags 0 --physical-output pack:1
4,7
To display the list of default NUMA nodes, by logical indexes, in the
entire machine:
$ hwloc-calc --default-nodes -I numa all
0,2,4,6
To display the best-capacity NUMA node(s), by physical indexes, whose
locality is exactly equal to a Package:
$ hwloc-calc --local-memory-flags 0 --best-memattr capacity --physical-output pack:1
4
To find the number of NUMA nodes with subtype "HBM":
$ hwloc-calc -N "numa[hbm]" all
4
To find the number of NUMA nodes in memory tier 1 (DRAM nodes on a
server with HBM and DRAM):
$ hwloc-calc -N "numa[tier=1]" all
4
To find the NUMA node of subtype MCDRAM (on KNL) near a PU:
$ hwloc-calc -I "numa[mcdram]" --oo pu:157
NUMANode:1
To find the memory tier of a NUMA node:
$ hwloc-calc -I memorytier node:2
1
Examples with physical and logical indexes
Converting object logical indexes (default) from/to physical/OS indexes
may be performed with --intersect combined with either
--physical-output (logical to physical conversion) or --physical-input
(physical to logical):
$ hwloc-calc --physical-output PU:2 --intersect PU
3
$ hwloc-calc --physical-input PU:3 --intersect PU
2
This may also be used for converting indexes of memory objects, even
with heterogeneous memory:
$ hwloc-calc --physical-output node:2 --intersect node
3
$ hwloc-calc --physical-input node:3 --intersect node
2
To combine both physical and logical indexes as input:
$ hwloc-calc PU:2 --physical-input PU:3
0x0000000c
Examples with I/O devices
To display the set of CPUs near network interface eth0:
$ hwloc-calc os=eth0
0x00005555
To display the indexes of packages near PCI device whose bus ID is
0000:01:02.0:
$ hwloc-calc pci=0000:01:02.0 --intersect Package
1
OS devices may also be filtered by subtype. In this example, there are
9 OS devices in the system, 4 of them are near NUMA node #1, and only 2
of these are CoProcessors:
$ utils/hwloc/hwloc-calc -I osdev all
0,1,2,3,4,5,6,7,8
$ utils/hwloc/hwloc-calc -I osdev node:1
5,6,7,8
$ utils/hwloc/hwloc-calc -I coproc node:1
7,8
Examples with other tools
To make GNU OpenMP use exactly one thread per core, and in logical core
order:
$ export OMP_NUM_THREADS=`hwloc-calc --number-of core all`
$ echo $OMP_NUM_THREADS
4
$ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU --no-smt all`
$ echo $GOMP_CPU_AFFINITY
0,2,1,3
To export a bitmask in a format accepted by the resctrl Linux
subsystem (for configuring cache partitioning, etc.), apply a sed regexp
to the output of hwloc-calc:
$ hwloc-calc pack:all.core:7-9.pu:0
0x00000380,,0x00000380 <this format cannot be given to resctrl>
$ hwloc-calc pack:all.core:7-9.pu:0 | sed -e 's/0x//g' -e 's/,,/,0,/g' -e 's/,,/,0,/g'
00000380,0,00000380
# echo 00000380,0,00000380 > /sys/fs/resctrl/test/cpus
# cat /sys/fs/resctrl/test/cpus
00000000,00000380,00000000,00000380 <the modified bitmask was
correctly parsed by resctrl>
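The sed expressions above can be wrapped in a small filter and checked on the sample string without access to resctrl. The function name to_resctrl is hypothetical. The empty-field substitution is applied twice because a single pass with the g flag does not rewrite overlapping matches, so a run of three commas would still contain an empty field after one pass.

```shell
# Hypothetical filter (not part of hwloc): strip the 0x prefixes and
# fill empty fields with 0, using the same sed expressions as above.
to_resctrl() {
    sed -e 's/0x//g' -e 's/,,/,0,/g' -e 's/,,/,0,/g'
}

# Sample mask taken from the example above.
echo '0x00000380,,0x00000380' | to_resctrl

# Two consecutive empty fields need the second substitution pass.
echo '0x1,,,0x2' | to_resctrl
```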
Example of using the systemd-dbus-api cpuset and nodeset output formats
hwloc-calc allows one to generate the very cryptic AllowedCPUs and
AllowedMemoryNodes strings, which the D-Bus API of systemd expects,
from other hwloc representations. This is especially useful when the
systemd-run command, which understands numeric lists, cannot be used.
First, create a systemd slice:
$ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartUnit ss my_slice.slice fail
Then, configure the CPU and Node sets of the slice, using hwloc-calc to
translate the syntax:
$ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager SetUnitProperties 'sba(sv)' my_slice.slice 1 1 AllowedCPUs $(hwloc-calc pu:0 pu:31 pu:32 pu:63 pu:64 pu:77 --cpuset-output-format systemd-dbus-api)
$ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager SetUnitProperties 'sba(sv)' my_slice.slice 1 1 AllowedMemoryNodes $(hwloc-calc pu:0 pu:31 pu:32 pu:63 pu:64 pu:77 --nodeset-output-format systemd-dbus-api)
Finally, add the current process to the slice:
$ busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartTransientUnit 'ssa(sv)a(sa(sv))' my_scope.scope fail 3 Delegate b 1 PIDs au 1 $$ Slice s my_slice.slice 0
More info in the org.freedesktop.systemd1(5) manual page.
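To see what the systemd-dbus-api strings encode, the following sketch rebuilds one from a CPU list in plain shell. The helper name list_to_dbus is hypothetical and not part of hwloc. The layout (a byte count followed by one byte per group of 8 CPUs, lowest indexes first) is inferred from the single "3-6,10" example given for --cpuset-output-format, so it should be read as an illustration rather than as a specification of the format.

```shell
# Hypothetical helper (not part of hwloc): convert a CPU list such as
# "3-6,10" into an "ay <count> <byte>..." string.
# Assumption: the layout is inferred from the one documented example.
list_to_dbus() (
    IFS=,
    set -- $1                      # split the list on commas
    IFS=' '
    max=-1                         # highest CPU index in the list
    for item in "$@"; do
        hi=${item#*-}
        [ "$hi" -gt "$max" ] && max=$hi
    done
    nbytes=$(( ($max + 8) / 8 ))
    out=$(printf 'ay 0x%04x' "$nbytes")
    b=0
    while [ "$b" -lt "$nbytes" ]; do
        byte=0
        for item in "$@"; do
            lo=${item%-*}
            hi=${item#*-}
            bit=0
            while [ "$bit" -lt 8 ]; do
                cpu=$(( $b * 8 + $bit ))
                if [ "$cpu" -ge "$lo" ] && [ "$cpu" -le "$hi" ]; then
                    byte=$(( $byte | (1 << $bit) ))
                fi
                bit=$(( $bit + 1 ))
            done
        done
        out="$out $(printf '0x%02x' "$byte")"
        b=$(( $b + 1 ))
    done
    printf '%s\n' "$out"
)

list_to_dbus "3-6,10"
```

In practice hwloc-calc itself should be used to generate these strings, since it also resolves objects and nodesets; the sketch only makes the byte layout visible.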
RETURN VALUE
Upon successful execution, hwloc-calc displays the (physical) CPU mask
string, (physical or logical) object list, or (physical or logical)
object number list. The return value is 0.
hwloc-calc will return nonzero if any kind of error occurs, such as
(but not limited to): failure to parse the command line.
SEE ALSO
hwloc(7), lstopo(1), hwloc-info(1)
2.12.1 May 12, 2025 hwloc-calc(1)
