PCP2SPARK

Section: User Commands (1)

NAME

pcp2spark - pcp-to-spark metrics exporter

SYNOPSIS

pcp2spark [ -CGHIjLnrRvV? ] [ -8 | -9 limit ] [ -a archive ] [ --archive-folio folio ] [ -A align ] [ -b | -B space-scale ] [ -c config ] [ --container container ] [ --daemonize ] [ -e derived ] [ -E password ] [ -g server ] [ -h host ] [ -i instances ] [ -J rank ] [ -K spec ] [ -N predicate ] [ -O origin ] [ -p port ] [ -P | -0 precision ] [ -q | -Q count-scale ] [ -s samples ] [ -S starttime ] [ -t interval ] [ -T endtime ] [ -U username ] [ -x database ] [ -X tags ] [ -y | -Y time-scale ] metricspec [...]

DESCRIPTION

pcp2spark is a customizable performance metrics exporter tool from PCP to Apache Spark. Any available performance metric, live or archived, system and/or application, can be selected for exporting using either command line arguments or a configuration file.

pcp2spark is a close relative of pmrep (1). Refer to pmrep (1) for a description of the metricspec accepted on the pcp2spark command line, and to pmrep.conf (5) for the overall syntax of the pcp2spark.conf configuration file. This page describes the options specific to pcp2spark and the configuration file differences from pmrep.conf (5). pmrep (1) also lists usage examples, most of which apply to pcp2spark as well.

Only the command line options listed on this page are supported; other options recognized by pmrep (1) are not.

Options set via environment variables (see pmGetOptions (3)) override the corresponding built-in default values (if any). Configuration file options override the corresponding environment variables (if any). Command line options override the corresponding configuration file options (if any).
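
The precedence chain above can be illustrated with a minimal Python sketch. It is not how pcp2spark itself is implemented; the function name and the sample values are hypothetical, and only the ordering (defaults, then environment, then configuration file, then command line) comes from this page.

```python
from collections import ChainMap


def resolve_options(defaults, environment, config_file, command_line):
    """Merge option sources so that the more specific source wins."""
    # ChainMap looks keys up left to right, so the highest-priority
    # source (the command line) is listed first.
    return dict(ChainMap(command_line, config_file, environment, defaults))


# Hypothetical values, for illustration only.
defaults = {"interval": "1", "spark_port": "44325"}
environment = {"interval": "5"}
config_file = {"interval": "10", "spark_port": "55000"}
command_line = {"spark_port": "44400"}

options = resolve_options(defaults, environment, config_file, command_line)
print(options["interval"], options["spark_port"])
```

Here interval comes from the configuration file (which overrides the environment) and spark_port from the command line (which overrides the configuration file).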

CONFIGURATION FILE

pcp2spark uses a configuration file with the overall syntax described in pmrep.conf (5). The following options are common with pmrep.conf: version, source, speclocal, derived, header, globals, samples, interval, type, type_prefer, ignore_incompat, instances, live_filter, rank, limit_filter, limit_filter_force, invert_filter, predicate, omit_flat, precision, precision_force, count_scale, count_scale_force, space_scale, space_scale_force, time_scale, time_scale_force. The output option is recognized but ignored for pmrep.conf compatibility.

pcp2spark specific options

spark_server (string)

Specify the address of the local server to host the metrics. The corresponding command line option is -g. The default is 127.0.0.1.

spark_port (integer)

Specify the port to run the local server on. The corresponding command line option is -p. The default is 44325.
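
As an illustration of how these two options and their defaults fit together, the following Python sketch reads them from a sample configuration text. It assumes the options live in an [options] section, following the layout described in pmrep.conf (5); the sample values and the function name are hypothetical, and pcp2spark performs its own parsing.

```python
import configparser

# Hypothetical configuration text; the [options] section name is an
# assumption based on the pmrep.conf (5) layout.
SAMPLE_CONF = """
[options]
spark_server = 127.0.0.1
spark_port = 44400
interval = 5
"""


def spark_endpoint(conf_text):
    """Return (server, port), using the documented defaults when unset."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    server = parser.get("options", "spark_server", fallback="127.0.0.1")
    port = parser.getint("options", "spark_port", fallback=44325)
    return server, port


print(spark_endpoint(SAMPLE_CONF))
```

A configuration that sets neither option yields the documented defaults, 127.0.0.1 and 44325.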

OPTIONS

The available command line options are:
-8 limit , --limit-filter = limit
Limit results to instances with values above/below limit. A positive integer will include instances with values at or above the limit in reporting. A negative integer will include instances with values at or below the limit in reporting. A value of zero performs no limit filtering. This option will not override possible per-metric specifications. See also -J and -N.
-9 limit , --limit-filter-force = limit
Like -8 but this option will override per-metric specifications.
-a archive , --archive = archive
Performance metric values are retrieved from the set of Performance Co-Pilot (PCP) archive log files identified by the argument archive, which is a comma-separated list of names, each of which may be the base name of an archive or the name of a directory containing one or more archives.
--archive-folio = folio
Read metric source archives from the PCP archive folio created by tools like pmchart (1) or, less often, manually with mkaf (1).
-A align , --align = align
Force the initial sample to be aligned on the boundary of a natural time unit align. Refer to PCPIntro (1) for a complete description of the syntax for align.
-b scale , --space-scale = scale
Unit/scale for space (byte) metrics; possible values include bytes, Kbytes, KB, Mbytes, MB, and so forth. This option will not override possible per-metric specifications. See also pmParseUnitsStr (3).
-B scale , --space-scale-force = scale
Like -b but this option will override per-metric specifications.
-c config , --config = config
Specify the config file to use. The default is the first found of: ./pcp2spark.conf, $HOME/.pcp2spark.conf, $HOME/pcp/pcp2spark.conf, and $PCP_SYSCONF_DIR/pcp2spark.conf. For details, see the CONFIGURATION FILE section above and pmrep.conf (5).
--container = container
Fetch performance metrics from the specified container, either local or remote (see -h).
-C , --check
Exit before reporting any values, but after parsing the configuration and metrics and printing possible headers.
--daemonize
Daemonize on startup.
-e derived , --derived = derived
Specify derived performance metrics. If derived starts with a slash ("/") or with a dot ("."), it will be interpreted as a derived metrics configuration file; otherwise it will be interpreted as comma- or semicolon-separated derived metric expressions. For details see pmLoadDerivedConfig (3) and pmRegisterDerived (3).
-G , --no-globals
Do not include global metrics in reporting (see pmrep.conf (5)).
-h host , --host = host
Fetch performance metrics from pmcd (1) on host, rather than from the default localhost.
-H , --no-header
Do not print any headers.
-i instances , --instances = instances
Report only the listed instances from current instances (if present, see also -j). By default all instances, present and future, are reported. This is a global option that is used for all metrics unless a metric-specific instance definition is provided as part of a metricspec. By default single-valued "flat" metrics without multiple instances are still reported as usual; use -v to change this. Refer to pmrep (1) for more details on this option.
-I , --ignore-incompat
Ignore incompatible metrics. By default incompatible metrics (that is, their type is unsupported or they cannot be scaled as requested) will cause pcp2spark to terminate with an error message. With this option all incompatible metrics are silently omitted from reporting. This may be especially useful when requesting non-leaf nodes of the PMNS tree for reporting.
-j , --live-filter
Perform instance live filtering. This allows capturing all filtered instances even if processes are restarted at some point (unlike without live filtering). Live filtering over a large number of instances comes with some overhead, so some caution is advised.
-J rank , --rank = rank
Limit results to highest/lowest rank instances of set-valued metrics. A positive integer will include highest valued instances in reporting. A negative integer will include lowest valued instances in reporting. A value of zero performs no ranking. See also -8.
-K spec , --spec-local = spec
When fetching metrics from a local context (see -L), the -K option may be used to control the DSO PMDAs that should be made accessible. The spec argument conforms to the syntax described in pmSpecLocalPMDA (3). More than one -K option may be used.
-L , --local-PMDA
Use a local context to collect metrics from DSO PMDAs on the local host without PMCD. See also -K.
-n , --invert-filter
Perform ranking before live filtering. By default instance live filtering (when requested, see -j) happens before instance ranking (when requested, see -J). With this option the logic is inverted and ranking happens before live filtering.
-N predicate , --predicate = predicate
Specify a comma-separated list of predicate filter reference metrics. By default ranking (see -J) happens for each metric individually. With predicate filter reference metrics, ranking is done only for the specified metrics. When reporting, the rest of the metrics sharing the same instance domain (see PCPIntro (1)) as the predicates will include only the highest/lowest ranking instances of the corresponding predicates.

So for example, when using proc.memory.rss (resident size of process) as the predicate and including proc.io.total_bytes and mem.util.used as metrics to be reported, only the processes using most/least memory (as per -J) will be included when reporting total bytes written by processes. Since mem.util.used is a single-valued metric (thus not sharing the same instance domain as the process-related metrics), it will be reported as usual.
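
The predicate example can be sketched in Python as follows. This is only an illustration of the documented behaviour, not the pcp2spark implementation; the function name, instance names, and values are hypothetical.

```python
def predicate_filter(predicate_values, other_metrics, rank):
    """Keep only the instances selected by ranking the predicate metric.

    predicate_values maps instance name -> value of the predicate metric.
    other_metrics maps metric name -> {instance name -> value} for metrics
    sharing the predicate's instance domain.
    """
    if rank == 0:
        keep = set(predicate_values)  # no ranking requested
    else:
        ordered = sorted(predicate_values, key=predicate_values.get,
                         reverse=rank > 0)
        keep = set(ordered[:abs(rank)])
    return {name: {inst: val for inst, val in values.items() if inst in keep}
            for name, values in other_metrics.items()}


# Hypothetical per-process samples.
rss = {"1001 httpd": 500, "1002 sshd": 20, "1003 postgres": 900}
io = {"proc.io.total_bytes":
      {"1001 httpd": 10, "1002 sshd": 1, "1003 postgres": 7}}

top_two = predicate_filter(rss, io, rank=2)
print(top_two)
```

With rank set to 2, only the two processes with the largest proc.memory.rss remain in proc.io.total_bytes. A single-valued metric such as mem.util.used would not be passed through this filter at all, since it does not share the instance domain.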

-O origin , --origin = origin
When reporting archived metrics, start reporting at origin within the time window (see -S and -T). Refer to PCPIntro (1) for a complete description of the syntax for origin.
-P precision , --precision = precision
Use precision for numeric non-integer output values. The default is to use 3 decimal places (when applicable). This option will not override possible per-metric specifications.
-0 precision , --precision-force = precision
Like -P but this option will override per-metric specifications.
-q scale , --count-scale = scale
Unit/scale for count metrics; possible values include count x 10^-1, count, count x 10, count x 10^2, and so forth from 10^-8 to 10^7. (These values are currently space-sensitive.) This option will not override possible per-metric specifications. See also pmParseUnitsStr (3).
-Q scale , --count-scale-force = scale
Like -q but this option will override per-metric specifications.
-r , --raw
Output raw metric values; do not convert cumulative counters to rates. This option will override possible per-metric specifications.
-R , --raw-prefer
Like -r but this option will not override per-metric specifications.
-s samples , --samples = samples
The argument samples defines the number of samples to be retrieved and reported. If samples is 0 or -s is not specified, pcp2spark will sample and report continuously (in real time mode) or until the end of the set of PCP archives (in archive mode). See also -T.
-S starttime , --start = starttime
When reporting archived metrics, the report will be restricted to those records logged at or after starttime. Refer to PCPIntro (1) for a complete description of the syntax for starttime.
-t interval , --interval = interval
Set the update interval to something other than the default of 1 second. The interval argument follows the syntax described in PCPIntro (1), and in the simplest form may be an unsigned integer (the implied units in this case are seconds). See also the -T option.
-T endtime , --finish = endtime
When reporting archived metrics, the report will be restricted to those records logged before or at endtime. Refer to PCPIntro (1) for a complete description of the syntax for endtime.

When -T is used to define the runtime before pcp2spark exits and samples is not given (see -s), the number of reported samples depends on interval (see -t). If samples is given, interval is adjusted so that the requested number of samples is reported during the runtime. If all of -T, -s, and -t are given, endtime determines the actual time pcp2spark will run.
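
The interaction between -T, -s, and -t can be sketched in Python as below. The sketch assumes a sample is taken at both the start and the end of the runtime window, which is an assumption rather than something stated on this page; the function name is hypothetical and the exact rounding used by pcp2spark may differ.

```python
def plan_sampling(runtime, interval=1.0, samples=None):
    """Return (samples, interval) for a run lasting `runtime` seconds.

    Assumes one sample at each end of the window, so n samples span
    n - 1 intervals.
    """
    if samples is not None:
        # samples given: stretch or shrink the interval to fit the runtime.
        samples = max(samples, 2)
        interval = runtime / (samples - 1)
    else:
        # samples not given: the interval decides how many samples fit.
        samples = int(runtime / interval) + 1
    return samples, interval


print(plan_sampling(60, interval=10))             # interval decides samples
print(plan_sampling(60, samples=5))               # samples decide interval
print(plan_sampling(60, interval=10, samples=5))  # runtime still wins
```

In the third call the requested 10 second interval is replaced so that five samples fit into the 60 second runtime, matching the rule that endtime determines how long the tool runs.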

-v , --omit-flat
Omit single-valued "flat" metrics from reporting; only consider set-valued metrics (i.e., metrics with multiple values) for reporting. See -i and -I.
-V , --version
Display version number and exit.
-X tags , --db-tags = tags
Specify strings of tags to add to the metrics.
-y scale , --time-scale = scale
Unit/scale for time metrics; possible values include nanosec, ns, microsec, us, millisec, ms, and so forth up to hour, hr. This option will not override possible per-metric specifications. See also pmParseUnitsStr (3).
-Y scale , --time-scale-force = scale
Like -y but this option will override per-metric specifications.
-? , --help
Display usage message and exit.
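
The instance selection rules described for -8 and -J above can be sketched in Python as follows. This is an illustration only: the function names and sample values are hypothetical, and the sketch assumes that the sign of a negative limit only selects the direction while its magnitude is the threshold.

```python
def limit_filter(values, limit):
    """Apply -8 style filtering to {instance: value}."""
    if limit > 0:
        return {i: v for i, v in values.items() if v >= limit}
    if limit < 0:
        # Assumption: the magnitude of a negative limit is the threshold.
        return {i: v for i, v in values.items() if v <= abs(limit)}
    return dict(values)  # zero: no limit filtering


def rank_filter(values, rank):
    """Apply -J style ranking to {instance: value}."""
    if rank == 0:
        return dict(values)  # zero: no ranking
    ordered = sorted(values, key=values.get, reverse=rank > 0)
    return {i: values[i] for i in ordered[:abs(rank)]}


# Hypothetical per-disk values.
disk_writes = {"sda": 120, "sdb": 5, "sdc": 40}

print(limit_filter(disk_writes, 40))
print(limit_filter(disk_writes, -40))
print(rank_filter(disk_writes, 1))
print(rank_filter(disk_writes, -1))
```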

FILES

pcp2spark.conf
pcp2spark configuration file (see -c).
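
The default search order documented for -c can be sketched in Python as below. The function name is hypothetical, and the value of $PCP_SYSCONF_DIR is passed in as a parameter because it is installation-specific (see pcp.conf (5)); the directory used in the usage line is only a placeholder.

```python
import os


def find_config(sysconf_dir, home=None, exists=os.path.isfile):
    """Return the first existing default configuration file, or None."""
    if home is None:
        home = os.path.expanduser("~")
    # Candidates in the order listed for the -c option.
    candidates = [
        os.path.join(".", "pcp2spark.conf"),
        os.path.join(home, ".pcp2spark.conf"),
        os.path.join(home, "pcp", "pcp2spark.conf"),
        os.path.join(sysconf_dir, "pcp2spark.conf"),
    ]
    for path in candidates:
        if exists(path):
            return path
    return None


# "/etc/pcp" is a placeholder for the local $PCP_SYSCONF_DIR value.
print(find_config("/etc/pcp"))
```

The exists parameter is there only to make the lookup easy to exercise without touching the file system.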

PCP ENVIRONMENT

Environment variables with the prefix PCP_ are used to parameterize the file and directory names used by PCP. On each installation, the file /etc/pcp.conf contains the local values for these variables. The $PCP_CONF variable may be used to specify an alternative configuration file, as described in pcp.conf (5).

For environment variables affecting PCP tools, see pmGetOptions (3).

SEE ALSO

mkaf (1), PCPIntro (1), pcp (1), pcp2elasticsearch (1), pcp2graphite (1), pcp2json (1), pcp2xlsx (1), pcp2xml (1), pcp2zabbix (1), pmcd (1), pminfo (1), pmrep (1), pmGetOptions (3), pmSpecLocalPMDA (3), pmLoadDerivedConfig (3), pmParseUnitsStr (3), pmRegisterDerived (3), LOGARCHIVE (5), pcp.conf (5), pmns (5) and pmrep.conf (5).

