Max Core Value & Performance

The document provides guidelines on setting the max-core parameter for components like SORT, JOIN, and ROLLUP, emphasizing that the optimal value varies based on the specific graph and data. It explains the consequences of setting max-core too low or too high, including performance degradation and potential system failures. Additionally, it discusses memory usage for graphs, filesystem performance, and the importance of testing configurations to optimize performance.

What should you set it to?

What you should set max-core to, or whether you should leave it at its default setting, depends
on the component and the data you’re working with. For specific guidelines, see the
documentation for the individual components. If you happen to have a “just right” max-core
setting, or if the default value serves (it often does), that’s fine. But because any value you do
set must be an estimate, it’s important to know what happens when you set it too low or
too high.
Too low
Giving max-core a value less than what the component needs to complete the job entirely in
memory forces the component to write more temporary data to disk at runtime. Depending
on how much data is involved, this can slow performance. But this kind of controlled disk
activity is far preferable to the uncontrolled disk activity of paging or thrashing.
Too high
Allocating too much memory with max-core can have various effects, depending on the
circumstances:
Perhaps max-core is set to a value higher than needed by the component, but still within the
capacity of the system. In this case, no harm is done — unless the data size increases in later
runs, causing more memory to be allocated, with the result that system performance begins to be
affected (see “Conclusion”).
NOTE: If you set a SORT component’s max-core too low, the component may create too
many small temporary files. This can be a problem, and a good reason to increase max-core.
If max-core is set high enough that your graph’s working set no longer fits in physical memory,
the computer will have to start paging simply to run the graph. This will certainly have an
adverse effect on the graph’s performance, and on the performance of any other applications
running at the time.
If max-core is set so high that the computer’s swap space is exhausted, you can cause your own
graph, and possibly other applications, or even the computer itself, to fail. This has the worst
possible effect on performance.

=========================================

How does a single graph use memory?


In many cases, the working set of a graph is only a fraction of the total memory demands of all
its components and data, for the following reasons:
Every graph consists of one or more phases
A graph that has more than one phase runs one phase at a time, sequentially, to completion. All
data is written to disk at each phase break. The amount of memory needed by a multiphase graph
is thus equal only to the amount of memory needed by the graph’s single most memory-intensive
phase.
Memory demand is relevant only to separate individual computers
A graph with layouts that involve more than one computer will make memory demands on all
those computers. But the ultimate unit of available memory is the individual computer. You
should calculate each graph phase’s memory demands in terms only of each particular computer
that the phase is running on.
However, it’s important to remember that just because a graph is running in parallel, it isn’t
necessarily running on more than one machine. It could be running (for example) on an
SMP, with multiple processors but only one memory space.
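The two points above can be sketched as a small calculation. This is an illustrative example only (host and phase names and the memory figures are made up): because phases run sequentially, each machine's demand is its single most memory-intensive phase, not the sum of all phases.

```python
# Per-machine, per-phase memory demands in MB (hypothetical values).
phase_demands_mb = {
    "host_a": {"phase0": 250, "phase1": 900, "phase2": 400},
    "host_b": {"phase0": 120, "phase1": 300},
}

# Phases run one at a time, so each computer's working set is the
# maximum over its phases, computed separately per machine.
peak_per_machine = {host: max(phases.values())
                    for host, phases in phase_demands_mb.items()}
print(peak_per_machine)  # {'host_a': 900, 'host_b': 300}
```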

Filesystem layout and performance


A graph’s performance depends as much on the performance of the file systems it uses as on
its own computational efficiency. Graphs work best with an application file space that is
optimized for fast reading and writing of large contiguous blocks of data.
What determines the read/write efficiency of a filesystem is the number of independent disk
controllers and disks available, and whether the files are cached in the controller. These things
are often hidden, however, beneath the configured filesystem (for example, within a storage area
network (SAN)) as it appears to you at the user level. You only see them indirectly, in the
sometimes surprising effects they have on filesystem operations.
The simple tests described in this section will give you a good idea of the performance
capabilities of your filesystem. Three things are separately measured:
Write performance
Read performance
Write/read performance in the same graph
Filesystem performance testing should begin with serial operations. This gives you a set of
simple base observations. You can then go on to test parallel performance running with a
succession of multifiles using increasing numbers of partitions. When testing the performance of
a system or an application in various scenarios and configurations, it is critical to change only
one thing at a time. That way, you will always know the precise cause of any performance
change.
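A minimal serial write-then-read timing test along these lines might look like the following. This is a sketch, not an Ab Initio tool: the file path and sizes are assumptions, and you should point TEST_FILE at the application file space you want to measure and use a file larger than RAM so the OS page cache does not inflate the read figure.

```python
import os
import time

TEST_FILE = "/tmp/fs_perf_test.dat"   # change to the filesystem under test
BLOCK = b"\0" * (8 * 1024 * 1024)     # write in 8 MB blocks
N_BLOCKS = 16                         # 128 MB total (use far more in practice)

# Measure write throughput, including the time for data to reach disk.
start = time.time()
with open(TEST_FILE, "wb") as f:
    for _ in range(N_BLOCKS):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())
write_secs = time.time() - start

# Measure read throughput over the same file.
start = time.time()
with open(TEST_FILE, "rb") as f:
    while f.read(len(BLOCK)):
        pass
read_secs = time.time() - start
os.remove(TEST_FILE)

size_mb = len(BLOCK) * N_BLOCKS / 2**20
print(f"write: {size_mb/write_secs:.0f} MB/s, read: {size_mb/read_secs:.0f} MB/s")
```

Run it once serially for a baseline, then repeat against multifile partitions, changing only one variable at a time as described above.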
Formula for calculating a component’s memory usage
As explained above, the memory used by a graph phase should be roughly equivalent to the
following:
Component_instance1 + Component_instance2 + Component_instance3 + . . .
where Component_instance1 (and so on) represent the memory requirements of each component
instance process in the graph phase. We say component instance because, in cases where a
component runs in parallel in a partitioned layout, each partition’s instance of the component
process must be added into the total. Thus, a component running four ways parallel is really four
separate instances of the component, and all four instances have to be counted in the memory
usage total.
Note that these are program components only: an INPUT FILE or OUTPUT FILE component
doesn’t count, although (for example) READ MULTIPLE FILES or WRITE MULTIPLE FILES
does.
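The summation above can be sketched as follows. The component names, per-instance sizes, and partition counts here are hypothetical; the point is that a component running n-ways parallel contributes n instances to the phase total.

```python
# Hypothetical phase: per-instance memory in MB and degree of parallelism.
components = [
    {"name": "SORT",     "per_instance_mb": 107, "partitions": 4},
    {"name": "JOIN",     "per_instance_mb": 57,  "partitions": 4},
    {"name": "REFORMAT", "per_instance_mb": 7,   "partitions": 1},
]

# Each partition's instance is a separate process, so multiply before summing.
phase_total_mb = sum(c["per_instance_mb"] * c["partitions"] for c in components)
print(phase_total_mb)  # 663
```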
The formula for (roughly) calculating the amount of memory required by one component
instance process is as follows:
base amount + lookups + max-core
where:
base amount is an amount of memory, usually equal to about 7 MB
The actual base amount varies by the particular component, platform and compilation mode, and
can be as low as 3 MB and as high as 10 MB. However, 7 MB is a good middle figure to use in
these calculations.
lookups is the amount of memory required by any lookup file referenced by the component
Lookup files use memory for the lookup data itself, and for the “indexes” used to do the lookup.
In some cases, the lookup data can be shared among components. In these cases, the memory
used by the data should only be counted once in a graph phase. See “Memory needs for lookup
files”.
max-core is an extra amount of memory specified by the component’s max-core parameter (if
any)
Certain components have extra memory needs which can vary, depending on the size of the data
involved, and other things. This parameter, if a component has it, allows you to specify how
much extra memory can be allocated to the component. See “Choosing the max-core setting”.
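The per-instance formula can be written as a one-line helper. The 7 MB base is the middle figure suggested above; the lookup and max-core values in the example call are made up.

```python
def instance_memory_mb(lookups_mb=0, max_core_mb=0, base_mb=7):
    """Rough memory for one component instance: base + lookups + max-core."""
    return base_mb + lookups_mb + max_core_mb

# e.g. a component with a 40 MB lookup and a 64 MB max-core setting:
print(instance_memory_mb(lookups_mb=40, max_core_mb=64))  # 111
```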

Memory needs for lookup files


You can assume that both parts of any lookup file (both the data and the indexes) are always
shared among the graph components that use it, unless:
The lookup file is remote. If (for example) you have two REFORMAT components in your
graph that access the same lookup file, they cannot share a copy of it if they run on two different
computers.
The lookup file is an MVS dataset.
The lookup file is of a type that does not have a precomputed index — for example, appendable
lookup files and updatable lookup files.
When you’re counting up the memory needs of components in a graph phase, count any
shared lookup file only once.
Two things, taken together, make up the size of a lookup file:
Lookup data size
This is the same as the size of the file itself. If the file is a multifile, and the component doing the
lookup is partitioned on more than one computer, then you should count only the data in a single
partition on the same computer.
Note also that if the lookup file will be growing over time, you should allow for this growth in
your memory estimate as well.
Index size
This is equal to:
number of records in the data * index entry size
The index entry size varies, depending on the types and numbers of key fields:
For a 32-bit Co>Operating System, an index entry will be about 20 bytes
Simpler keys (fixed length types, contiguous fields) will have entries of about 12 bytes
For a 64-bit Co>Operating System, an index entry will be about 32 bytes
Simpler keys (fixed length types, contiguous fields) will have entries of about 24 bytes
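The index-size formula can be sketched with the per-entry figures above. The record count in the example is made up; the entry sizes follow the approximate values just listed.

```python
# Approximate bytes per index entry, keyed by (word size, simple key?).
ENTRY_BYTES = {
    (32, False): 20, (32, True): 12,
    (64, False): 32, (64, True): 24,
}

def index_size_mb(n_records, bits=64, simple_key=False):
    """Index size = number of records * index entry size, in MB."""
    return n_records * ENTRY_BYTES[(bits, simple_key)] / 2**20

# 10 million records on a 64-bit Co>Operating System, complex key:
print(f"{index_size_mb(10_000_000):.0f} MB")  # 305 MB
```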

File table overflow


Question

What does the error message “File table overflow” mean?

Short answer

This error message indicates that the system-wide limit on open files has been exceeded. Either
there are too many processes running on the system, or the kernel configuration needs to be
changed.

Details

This error message might occur if the maximum number of open files allowed on the machine is
set too low, or if max-core is set too low in the components that are processing large amounts of
data. In the latter case, much of the data processed in a component (such as a SORT or JOIN
component) spills to disk, causing many files to be opened. Increasing the value of max-core is
an appropriate first step in the case of a sort, because it reduces the number of separate merge
files that must be opened at the conclusion of the sort.
NOTE: Because increasing max-core also increases the memory requirements of your
graph, be careful not to increase it too much (you might need to consider changing
the graph’s phasing to reduce memory requirements). It is seldom necessary to increase max-
core beyond 100 MB.
If the error still occurs, see your system administrator. Note that the kernel setting for the
maximum number of system-wide open files is operating system-dependent (for example, this is
the nfile parameter on Unix systems), and, on many platforms, requires a reboot in order to take
effect. See the Ab Initio Server Software Installation Guide for Unix for the recommended
settings.

Value for max-core

Question

What value should I set for the max-core parameter?

Short answer

The max-core parameter is found in the SORT, JOIN, and ROLLUP components, among others. There is
no single optimal value for the max-core parameter, because a “good” value depends on your particular
graph and the environment in which it runs, and on the data.

Details
The Sort, Rollup, Scan and Join components have a parameter max-core which determines the
maximum amount of memory they will consume per partition before they spill to disk. When the value
of max-core is exceeded, all of the input (in the case of Sort) or the excess input (in the case of the other
components) is dropped to disk in the form of temporary files. This can have a dramatic impact on
performance, but it does not mean that it is always better to increase the value of max-core in these
situations.

The higher you set the value of max-core, the more memory the component can use. Using more
memory generally improves performance — up to a point. Beyond this point, performance will not
improve and may even worsen. If the value of max-core is set too high, operating system swapping can
occur and the graph may fail if virtual memory on the machine is exhausted.

When setting the value for max-core, you can use the suffixes k, m, and g (uppercase is also supported)
to indicate powers of 1024. For max-core, the suffix k (kilobytes) means precisely 1024 bytes, not 1000.
Similarly, the suffix m (megabytes) means precisely 1,048,576 bytes (1024²), and g (gigabytes) means precisely
1024³ bytes. Note that the maximum allowed value for max-core is 2147483647 in 32-bit builds of the
Co>Operating System.
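The suffix scaling can be illustrated with a small parser. The parser itself is illustrative, not an Ab Initio API; it only demonstrates that k, m, and g multiply by powers of 1024.

```python
def parse_max_core(value):
    """Convert a max-core string like '100m' or '1G' to bytes (powers of 1024)."""
    factors = {"k": 1024, "m": 1024**2, "g": 1024**3}
    s = value.strip().lower()
    if s and s[-1] in factors:
        return int(s[:-1]) * factors[s[-1]]
    return int(s)

print(parse_max_core("100m"))  # 104857600
print(parse_max_core("1G"))    # 1073741824
```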

In general, using additional memory can improve the performance of in-memory Rollup or Join, but not
of Sort.

When spillage occurs, consider setting the configuration variable AB_SPILL_FILE_COMPRESSION_LEVEL.


This variable compresses the temporary files spilled to disk. It is most helpful when you have a fast CPU
but slow disk (which is common).

In-memory ROLLUP or JOIN


It is difficult to be precise about the amount of memory an in-memory Rollup or Join can use.

An in-memory Join tries to hold all its nondriving inputs in memory. Thus you should make the largest
input by volume the driving one by setting the driving parameter to the number of its port.

When the non-driving inputs fit in memory, the driving input is pipelined, resulting in pipeline
parallelism. Any spillage of the non-driving input (which happens incrementally when its size exceeds
the value of max-core) breaks the pipeline and eliminates the parallelism.

An in-memory Rollup component must have enough memory to hold the size of its keys, plus the size of
its temporaries, plus the size of any input fields required in finalize to produce the output. In practice, in
most Rollup components, this is simply the size of the output. In addition, some space is needed for the
in-memory index.

If the totality of this data exceeds the value of max-core, the component spills the excess to disk
incrementally.
You should always set max-core’s value in in-memory Rollup and Join components as a reference to a
sandbox input parameter declared with an appropriate default value. The input parameter’s value can
be changed at runtime if required.

NOTE: The Ab Initio Environment’s AI_GRAPH_MAX_CORE parameter is predefined for this
purpose. AI_GRAPH_MAX_CORE is defined in terms of declarations for AI_GRAPH_MAX_CORE_HALF
and AI_GRAPH_MAX_CORE_QUARTER, so you can easily divide the available max-core among
different in-memory components in a phase. The Ab Initio Environment checks that
$AI_GRAPH_MAX_CORE has a sensible value by comparing it to $AI_GRAPH_MAX_CORE_MIN and
$AI_GRAPH_MAX_CORE_MAX.

If two or more in-memory components each need most or all of the memory available for max-core, you
should put the components in separate phases, provided you have the disk space to hold the data at the
phase break.

Another use of phasing is to control the allocation of memory among in-memory components. When
there is a limited amount of memory available you can use phasing to make sure each in-memory
component gets a sufficient amount. Typically, only one to four in-memory components of significant
size should occupy the same phase, depending on memory availability and demands.

To compute a runtime estimate for max-core, take two thirds of the total memory available on the
machine and subtract any memory used by lookups and competing jobs, including other graphs, running
at the same time on the machine. This is the available memory. Divide this result by the number of
partitions to get your max-core estimate — max-core is measured per partition. The formula is thus:

AI_GRAPH_MAX_CORE = ((2/3 * total memory) - memory used elsewhere)/(number of partitions)
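The formula can be checked with example numbers. The inputs here are hypothetical (a 64 GB machine, 2 GB used by lookups and competing jobs, running 8-ways parallel); all figures are in MB.

```python
def max_core_estimate_mb(total_mb, used_elsewhere_mb, partitions):
    """Runtime max-core estimate: (2/3 of total - memory used elsewhere) / partitions."""
    available = (2 / 3) * total_mb - used_elsewhere_mb
    return available / partitions

# 64 GB machine, 2 GB committed elsewhere, 8 partitions:
print(max_core_estimate_mb(65536, 2048, 8))  # ≈ 5205 MB per partition
```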

SORT component
For the Sort component, 100 MB is the default value for max-core. This default works well for a wide
variety of situations, and you rarely need to change it.

You should increase max-core when the data volume is so large that the number of temporary (spillage)
files exceeds 1000 (approximately — the actual value depends on ulimit). In this case SORT writes to disk
twice, slowing performance significantly. You can estimate the number of temporary files by multiplying
the data volume being sorted by three and dividing by the value of max-core (because data is written to
disk in blocks that are one-third the size of the max-core setting). For example, suppose you are sorting
100 GB of data with the default max-core setting of 100 MB and the process is running in serial. The
number of temporary files that will be created is:

3 × 100000 MB / 100 MB = 3000 files

In this case, increasing max-core would reduce the number of temporary files. (Remember that if you
are using a multifile system, you should use the data volume per partition in this calculation.)
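The spill-file estimate above can be expressed as a one-liner. Remember that on a multifile system the data volume should be the per-partition figure; the 100 GB example matches the worked calculation above.

```python
def sort_spill_files(data_mb, max_core_mb=100):
    """Estimated temporary files: 3 * data volume / max-core (blocks are
    one-third the size of the max-core setting)."""
    return 3 * data_mb / max_core_mb

# 100 GB sorted in serial with the default 100 MB max-core:
print(sort_spill_files(100_000))  # 3000.0
```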
For other cases, where a SORT component is a critical bottleneck, you must experiment to determine
whether increasing max-core to keep more data in memory is worthwhile. To keep data in memory, you
need max-core to be 50% greater than the volume of data being sorted. However, increasing the max-
core to accommodate larger volumes of data in memory typically does not increase performance.
(Briefly, SORT divides the data into blocks; each block is one-third the size of max-core. When you
increase max-core, the size of each block increases. However, the time to sort each block increases
disproportionately, slowing performance even with fewer blocks to sort.)

Rarely, you may see a “Too many open files” error message. Most often this occurs when the sort
operation encounters the system limit for the number of open files. To avoid the error, decrease the
value of the configuration variable AB_MAX_SIMULTANEOUS_MERGE_FILES. For more information, see
“Too many open files in the system or Too many open files”.

NOTE: We recommend setting the value of max-core as a $ reference to a parameter (for example,
$AI_SORT_MAX_CORE) so you can easily adjust the value at runtime if required.
