hwlocality_memattrs(3) Library Functions Manual hwlocality_memattrs(3)
NAME
hwlocality_memattrs - Comparing memory node attributes for finding
where to allocate on
SYNOPSIS
Data Structures
struct hwloc_location
Typedefs
typedef unsigned hwloc_memattr_id_t
Enumerations
enum hwloc_memattr_id_e { HWLOC_MEMATTR_ID_CAPACITY,
HWLOC_MEMATTR_ID_LOCALITY, HWLOC_MEMATTR_ID_BANDWIDTH,
HWLOC_MEMATTR_ID_READ_BANDWIDTH, HWLOC_MEMATTR_ID_WRITE_BANDWIDTH,
HWLOC_MEMATTR_ID_LATENCY, HWLOC_MEMATTR_ID_READ_LATENCY,
HWLOC_MEMATTR_ID_WRITE_LATENCY, HWLOC_MEMATTR_ID_MAX }
enum hwloc_location_type_e { HWLOC_LOCATION_TYPE_CPUSET,
HWLOC_LOCATION_TYPE_OBJECT }
enum hwloc_local_numanode_flag_e {
HWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY,
HWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY,
HWLOC_LOCAL_NUMANODE_FLAG_INTERSECT_LOCALITY,
HWLOC_LOCAL_NUMANODE_FLAG_ALL }
Functions
int hwloc_memattr_get_by_name (hwloc_topology_t topology, const char
*name, hwloc_memattr_id_t *id)
int hwloc_get_local_numanode_objs (hwloc_topology_t topology, struct
hwloc_location *location, unsigned *nr, hwloc_obj_t *nodes,
unsigned long flags)
int hwloc_topology_get_default_nodeset (hwloc_topology_t topology,
hwloc_nodeset_t nodeset, unsigned long flags)
int hwloc_memattr_get_value (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, hwloc_obj_t target_node, struct
hwloc_location *initiator, unsigned long flags, hwloc_uint64_t
*value)
int hwloc_memattr_get_best_target (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, struct hwloc_location *initiator,
unsigned long flags, hwloc_obj_t *best_target, hwloc_uint64_t
*value)
int hwloc_memattr_get_best_initiator (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, hwloc_obj_t target_node, unsigned
long flags, struct hwloc_location *best_initiator, hwloc_uint64_t
*value)
int hwloc_memattr_get_targets (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, struct hwloc_location *initiator,
unsigned long flags, unsigned *nr, hwloc_obj_t *targets,
hwloc_uint64_t *values)
int hwloc_memattr_get_initiators (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, hwloc_obj_t target_node, unsigned
long flags, unsigned *nr, struct hwloc_location *initiators,
hwloc_uint64_t *values)
Detailed Description
Platforms with heterogeneous memory require ways to decide whether a
buffer should be allocated on 'fast' memory (such as HBM), 'normal'
memory (DDR) or even 'slow' but large-capacity memory (non-volatile
memory). These memory nodes are called 'Targets' while the CPU
accessing them is called the 'Initiator'. Access performance depends on
their locality (NUMA platforms) as well as the intrinsic performance of
the targets (heterogeneous platforms).
The following attributes describe the performance of memory accesses
from an Initiator to a memory Target, for instance their latency or
bandwidth. Initiators performing these memory accesses are usually some
PUs or Cores (described as a CPU set). Hence a Core may choose where to
allocate a memory buffer by comparing the attributes of different
target memory nodes nearby.
There are also some attributes that are system-wide. Their value does
not depend on a specific initiator performing an access. The memory
node Capacity is an example of such attribute without initiator.
One way to use this API is to start with a cpuset describing the Cores
where a program is bound. The best target NUMA node for allocating
memory in this program on these Cores may be obtained by passing this
cpuset as an initiator to hwloc_memattr_get_best_target() with the
relevant memory attribute. For instance, if the code is latency
limited, use the Latency attribute.
A more flexible approach consists in getting the list of local NUMA
nodes by passing this cpuset to hwloc_get_local_numanode_objs().
Attribute values for these nodes, if any, may then be obtained with
hwloc_memattr_get_value() and manually compared with the desired
criteria.
Memory attributes are also used internally to build Memory Tiers which
provide an easy way to distinguish NUMA nodes of different kinds, as
explained in Heterogeneous Memory.
Beside tiers, hwloc defines a set of 'default' nodes where normal
memory allocations should be made from (see
hwloc_topology_get_default_nodeset()). This is also useful for dividing
the machine into a set of non-overlapping NUMA domains, for instance
for binding tasks per domain.
See also
An example is available in doc/examples/memory-attributes.c in the
source tree.
Note
The API also supports specific objects as initiator, but it is
currently not used internally by hwloc. Users may for instance use
it to provide custom performance values for host memory accesses
performed by GPUs.
The interface actually also accepts targets that are not NUMA
nodes.
Typedef Documentation
typedef unsigned hwloc_memattr_id_t
A memory attribute identifier. hwloc predefines some commonly-used
attributes in hwloc_memattr_id_e. One may then dynamically register
custom ones with hwloc_memattr_register(), they will be assigned IDs
immediately after the predefined ones. See Managing memory attributes
for more information about existing attribute IDs.
Enumeration Type Documentation
enum hwloc_local_numanode_flag_e
Flags for selecting target NUMA nodes.
Enumerator
HWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY
Select NUMA nodes whose locality is larger than the given
cpuset. For instance, if a single PU (or its cpuset) is given in
initiator, select all nodes close to the package that contains
this PU.
HWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY
Select NUMA nodes whose locality is smaller than the given
cpuset. For instance, if a package (or its cpuset) is given in
initiator, also select nodes that are attached to only a half of
that package.
HWLOC_LOCAL_NUMANODE_FLAG_INTERSECT_LOCALITY
Select NUMA nodes whose locality intersects the given cpuset.
This includes larger and smaller localities as well as
localities that are partially included. For instance, if the
locality is one core of both packages, a NUMA node local to one
package is neither larger nor smaller than this locality, but it
intersects it.
HWLOC_LOCAL_NUMANODE_FLAG_ALL
Select all NUMA nodes in the topology. The initiator initiator
is ignored.
enum hwloc_location_type_e
Type of location.
Enumerator
HWLOC_LOCATION_TYPE_CPUSET
Location is given as a cpuset, in the location cpuset union
field.
HWLOC_LOCATION_TYPE_OBJECT
Location is given as an object, in the location object union
field.
enum hwloc_memattr_id_e
Predefined memory attribute IDs. See hwloc_memattr_id_t for the generic
definition of IDs for predefined or custom attributes.
Enumerator
HWLOC_MEMATTR_ID_CAPACITY
The "Capacity" is returned in bytes (local_memory attribute in
objects). Best capacity nodes are nodes with higher capacity.
No initiator is involved when looking at this attribute. The
corresponding attribute flags are HWLOC_MEMATTR_FLAG_HIGHER_FIRST.
Capacity values may not be modified using hwloc_memattr_set_value().
HWLOC_MEMATTR_ID_LOCALITY
The "Locality" is returned as the number of PUs in that locality
(e.g. the weight of its cpuset). Best locality nodes are nodes
with smaller locality (nodes that are local to very few PUs).
Poor locality nodes are nodes with larger locality (nodes that
are local to the entire machine).
No initiator is involved when looking at this attribute. The
corresponding attribute flags are HWLOC_MEMATTR_FLAG_HIGHER_FIRST.
Locality values may not be modified using hwloc_memattr_set_value().
HWLOC_MEMATTR_ID_BANDWIDTH
The "Bandwidth" is returned in MiB/s, as seen from the given
initiator location. Best bandwidth nodes are nodes with higher
bandwidth.
The corresponding attribute flags are HWLOC_MEMATTR_FLAG_HIGHER_FIRST
and HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
This is the average bandwidth for read and write accesses. If the
platform provides individual read and write bandwidths but no explicit
average value, hwloc computes and returns the average.
HWLOC_MEMATTR_ID_READ_BANDWIDTH
The "ReadBandwidth" is returned in MiB/s, as seen from the given
initiator location. Best bandwidth nodes are nodes with higher
bandwidth.
The corresponding attribute flags are HWLOC_MEMATTR_FLAG_HIGHER_FIRST
and HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
HWLOC_MEMATTR_ID_WRITE_BANDWIDTH
The "WriteBandwidth" is returned in MiB/s, as seen from the
given initiator location. Best bandwidth nodes are nodes with
higher bandwidth.
The corresponding attribute flags are HWLOC_MEMATTR_FLAG_HIGHER_FIRST
and HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
HWLOC_MEMATTR_ID_LATENCY
The "Latency" is returned as nanoseconds, as seen from the given
initiator location. Best latency nodes are nodes with smaller
latency.
The corresponding attribute flags are HWLOC_MEMATTR_FLAG_LOWER_FIRST
and HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
This is the average latency for read and write accesses. If the
platform provides individual read and write latencies but no explicit
average value, hwloc computes and returns the average.
HWLOC_MEMATTR_ID_READ_LATENCY
The "ReadLatency" is returned as nanoseconds, as seen from the
given initiator location. Best latency nodes are nodes with
smaller latency.
The corresponding attribute flags are HWLOC_MEMATTR_FLAG_LOWER_FIRST
and HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
HWLOC_MEMATTR_ID_WRITE_LATENCY
The "WriteLatency" is returned as nanoseconds, as seen from the
given initiator location. Best latency nodes are nodes with
smaller latency.
The corresponding attribute flags are HWLOC_MEMATTR_FLAG_LOWER_FIRST
and HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
Function Documentation
int hwloc_get_local_numanode_objs (hwloc_topology_t topology, struct
hwloc_location * location, unsigned * nr, hwloc_obj_t * nodes, unsigned
long flags)
Return an array of local NUMA nodes. By default only select the NUMA
nodes whose locality is exactly the given location. More nodes may be
selected if additional flags are given as a OR'ed set of
hwloc_local_numanode_flag_e.
If location is given as an explicit object, its CPU set is used to find
NUMA nodes with the corresponding locality. If the object does not have
a CPU set (e.g. I/O object), the CPU parent (where the I/O object is
attached) is used.
On input, nr points to the number of nodes that may be stored in the
nodes array. On output, nr will be changed to the number of stored
nodes, or the number of nodes that would have been stored if there were
enough room.
Returns
0 on success or -1 on error.
Note
Some of these NUMA nodes may not have any memory attribute values
and hence not be reported as actual targets in other functions.
The number of NUMA nodes in the topology (obtained by
hwloc_bitmap_weight() on the root object nodeset) may be used to
allocate the nodes array.
When an object CPU set is given as locality, for instance a
Package, and when flags contain both
HWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY and
HWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY, the returned array
corresponds to the nodeset of that object.
int hwloc_memattr_get_best_initiator (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, hwloc_obj_t target_node, unsigned long
flags, struct hwloc_location * best_initiator, hwloc_uint64_t * value)
Return the best initiator for the given attribute and target NUMA node.
If value is non NULL, the corresponding value is returned there.
If multiple initiators have the same attribute values, only one is
returned (and there is no way to clarify how that one is chosen).
Applications that want to detect initiators with identical/similar
values, or that want to look at values for multiple attributes, should
rather get all values using hwloc_memattr_get_value() and manually
select the initiator they consider the best.
The returned initiator should not be modified or freed, it belongs to
the topology.
target_node cannot be NULL.
flags must be 0 for now.
Returns
0 on success.
-1 with errno set to ENOENT if there are no matching initiators.
-1 with errno set to EINVAL if the attribute does not relate to a
specific initiator (it does not have the flag
HWLOC_MEMATTR_FLAG_NEED_INITIATOR).
int hwloc_memattr_get_best_target (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, struct hwloc_location * initiator,
unsigned long flags, hwloc_obj_t * best_target, hwloc_uint64_t * value)
Return the best target NUMA node for the given attribute and initiator.
If the attribute does not relate to a specific initiator (it does not
have the flag HWLOC_MEMATTR_FLAG_NEED_INITIATOR), location initiator is
ignored and may be NULL.
If value is non NULL, the corresponding value is returned there.
If multiple targets have the same attribute values, only one is
returned (and there is no way to clarify how that one is chosen).
Applications that want to detect targets with identical/similar values,
or that want to look at values for multiple attributes, should rather
get all values using hwloc_memattr_get_value() and manually select the
target they consider the best.
flags must be 0 for now.
Returns
0 on success.
-1 with errno set to ENOENT if there are no matching targets.
-1 with errno set to EINVAL if flags are invalid, or no such
attribute exists.
Note
The initiator initiator should be of type
HWLOC_LOCATION_TYPE_CPUSET when refering to accesses performed by
CPU cores. HWLOC_LOCATION_TYPE_OBJECT is currently unused
internally by hwloc, but users may for instance use it to provide
custom information about host memory accesses performed by GPUs.
int hwloc_memattr_get_by_name (hwloc_topology_t topology, const char *
name, hwloc_memattr_id_t * id)
Return the identifier of the memory attribute with the given name.
Returns
0 on success.
-1 with errno set to EINVAL if no such attribute exists.
int hwloc_memattr_get_initiators (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, hwloc_obj_t target_node, unsigned long
flags, unsigned * nr, struct hwloc_location * initiators,
hwloc_uint64_t * values)
Return the initiators that have values for a given attribute for a
specific target NUMA node. Return initiators for the given attribute
and target node in the initiators array. If values is not NULL, the
corresponding attribute values are stored in the array it points to.
On input, nr points to the number of initiators that may be stored in
the array initiators (and values). On output, nr points to the number
of initiators (and values) that were actually found, even if some of
them couldn't be stored in the array. Initiators that couldn't be
stored are ignored, but the function still returns success (0). The
caller may find out by comparing the value pointed by nr before and
after the function call.
The returned initiators should not be modified or freed, they belong to
the topology.
target_node cannot be NULL.
flags must be 0 for now.
If the attribute does not relate to a specific initiator (it does not
have the flag HWLOC_MEMATTR_FLAG_NEED_INITIATOR), no initiator is
returned.
Returns
0 on success or -1 on error.
Note
This function is meant for tools and debugging (listing internal
information) rather than for application queries. Applications
should rather select useful NUMA nodes with
hwloc_get_local_numanode_objs() and then look at their attribute
values for some relevant initiators.
int hwloc_memattr_get_targets (hwloc_topology_t topology,
hwloc_memattr_id_t attribute, struct hwloc_location * initiator,
unsigned long flags, unsigned * nr, hwloc_obj_t * targets,
hwloc_uint64_t * values)
Return the target NUMA nodes that have some values for a given
attribute. Return targets for the given attribute in the targets array
(for the given initiator if any). If values is not NULL, the
corresponding attribute values are stored in the array it points to.
On input, nr points to the number of targets that may be stored in the
array targets (and values). On output, nr points to the number of
targets (and values) that were actually found, even if some of them
couldn't be stored in the array. Targets that couldn't be stored are
ignored, but the function still returns success (0). The caller may
find out by comparing the value pointed by nr before and after the
function call.
The returned targets should not be modified or freed, they belong to
the topology.
Argument initiator is ignored if the attribute does not relate to a
specific initiator (it does not have the flag
HWLOC_MEMATTR_FLAG_NEED_INITIATOR). Otherwise initiator may be non NULL
to report only targets that have a value for that initiator.
flags must be 0 for now.
Note
This function is meant for tools and debugging (listing internal
information) rather than for application queries. Applications
should rather select useful NUMA nodes with
hwloc_get_local_numanode_objs() and then look at their attribute
values.
Returns
0 on success or -1 on error.
Note
The initiator initiator should be of type
HWLOC_LOCATION_TYPE_CPUSET when referring to accesses performed by
CPU cores. HWLOC_LOCATION_TYPE_OBJECT is currently unused
internally by hwloc, but users may for instance use it to provide
custom information about host memory accesses performed by GPUs.
int hwloc_memattr_get_value (hwloc_topology_t topology, hwloc_memattr_id_t
attribute, hwloc_obj_t target_node, struct hwloc_location * initiator,
unsigned long flags, hwloc_uint64_t * value)
Return an attribute value for a specific target NUMA node. If the
attribute does not relate to a specific initiator (it does not have the
flag HWLOC_MEMATTR_FLAG_NEED_INITIATOR), location initiator is ignored
and may be NULL.
target_node cannot be NULL. If attribute is HWLOC_MEMATTR_ID_CAPACITY,
target_node must be a NUMA node. If it is HWLOC_MEMATTR_ID_LOCALITY,
target_node must have a CPU set.
flags must be 0 for now.
Returns
0 on success.
-1 on error, for instance with errno set to EINVAL if flags are
invalid or no such attribute exists.
Note
The initiator initiator should be of type
HWLOC_LOCATION_TYPE_CPUSET when refering to accesses performed by
CPU cores. HWLOC_LOCATION_TYPE_OBJECT is currently unused
internally by hwloc, but users may for instance use it to provide
custom information about host memory accesses performed by GPUs.
int hwloc_topology_get_default_nodeset (hwloc_topology_t topology,
hwloc_nodeset_t nodeset, unsigned long flags)
Return the set of default NUMA nodes. In machines with heterogeneous
memory, some NUMA nodes are considered the default ones, i.e. where
basic allocations should be made from. These are usually DRAM nodes.
Other nodes may be reserved for specific use (I/O device memory, e.g.
GPU memory), small but high performance (HBM), large but slow memory
(NVM), etc. Buffers should usually not be allocated from there unless
explicitly required.
This function fills nodeset with the bits of NUMA nodes considered
default.
It is guaranteed that these nodes have non-intersecting CPU sets, i.e.
cores may not have multiple local NUMA nodes anymore. Hence this may be
used to iterate over the platform divided into separate NUMA
localities, for instance for binding one task per NUMA domain.
Any core that had some local NUMA node(s) in the initial topology
should still have one in the default nodeset. Corner cases where this
would be wrong consist in asymmetric platforms with missing DRAM nodes,
or topologies that were already restricted to less NUMA nodes.
The returned nodeset may be passed to hwloc_topology_restrict() with
HWLOC_RESTRICT_FLAG_BYNODESET to remove all non-default nodes from the
topology. The resulting topology will be easier to use when iterating
over (now homogeneous) NUMA nodes.
The heuristics for finding default nodes relies on memory tiers and
subtypes (see Heterogeneous Memory) as well as the assumption that
hardware vendors list default nodes first in hardware tables.
flags must be 0 for now.
Returns
0 on success.
-1 on error.
Note
The returned nodeset usually contains all nodes from a single
memory tier, likely the DRAM one.
The returned nodeset is included in the list of available nodes
returned by hwloc_topology_get_topology_nodeset(). It is strictly
smaller if the machine has heterogeneous memory.
The heuristics may return a suboptimal set of nodes if hwloc could
not guess memory types and/or if some default nodes were removed
earlier from the topology (e.g. with hwloc_topology_restrict()).
Author
Generated automatically by Doxygen for Hardware Locality (hwloc) from
the source code.
Hardware Locality (hwloc) Version 2.12.2 hwlocality_memattrs(3)
hwloc 2.12.2 - Generated Sat Oct 4 07:13:38 CDT 2025
