'\" te
.\" Copyright (c) 2007, Sun Microsystems, Inc. All Rights Reserved.
.\" The contents of this file are subject to the terms of the Common Development and Distribution License (the "License").  You may not use this file except in compliance with the License.
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing.  See the License for the specific language governing permissions and limitations under the License.
.\" When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE.  If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
.TH CPC_BIND_EVENT 3CPC "Sep 10, 2013"
.SH NAME
cpc_bind_event, cpc_take_sample, cpc_rele \- use CPU performance counters on
lwps
.SH SYNOPSIS
.LP
.nf
cc [ \fIflag\fR... ] \fIfile\fR... \(milcpc [ \fIlibrary\fR... ]
#include <libcpc.h>

\fBint\fR \fBcpc_bind_event\fR(\fBcpc_event_t *\fR\fIevent\fR, \fBint\fR \fIflags\fR);
.fi

.LP
.nf
\fBint\fR \fBcpc_take_sample\fR(\fBcpc_event_t *\fR\fIevent\fR);
.fi

.LP
.nf
\fBint\fR \fBcpc_rele\fR(\fBvoid\fR);
.fi

.SH DESCRIPTION
.sp
.LP
Once the events to be sampled have been selected using, for example,
\fBcpc_strtoevent\fR(3CPC), the event selections can be bound to the calling
\fBLWP\fR using \fBcpc_bind_event()\fR. If \fBcpc_bind_event()\fR returns
successfully, the system has associated performance counter context with the
calling \fBLWP\fR. The context allows the system to virtualize the hardware
counters to that specific \fBLWP\fR, and the counters are enabled.
.sp
.LP
Two flags, \fBCPC_BIND_LWP_INHERIT\fR and \fBCPC_BIND_EMT_OVF\fR, can be passed
to \fBcpc_bind_event()\fR to modify its behavior, as described in NOTES.
.sp
.LP
Counter values can be sampled at any time by calling \fBcpc_take_sample()\fR,
and dereferencing the fields of the \fBce_pic\fR\fB[]\fR array returned. The
\fBce_hrt\fR field contains the timestamp at which the kernel last sampled the
counters.
.sp
.LP
To immediately remove the performance counter context on an \fBLWP\fR, the
\fBcpc_rele()\fR interface should be used. Otherwise, the context will be
destroyed after the \fBLWP\fR or process exits.
.sp
.LP
The caller should take steps to ensure that the counters are sampled often
enough to avoid the 32-bit counters wrapping. The events most prone to wrap are
those that count processor clock cycles. If such an event is of interest,
sampling should occur frequently enough that fewer than 4 billion clock cycles
elapse between samples. Practically speaking, this is only likely to be a
problem for otherwise idle systems, or when processes are bound to processors,
since normal context switching behavior will otherwise hide this problem.
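.sp
.LP
The wrap arithmetic above can be illustrated with a minimal sketch. The helper
names \fBpic_delta32()\fR and \fBseconds_to_wrap32()\fR and the \fIclock_hz\fR
parameter are hypothetical and are not part of \fBlibcpc\fR. The sketch assumes
the counter wrapped at most once between the two samples; at a 1 GHz clock, a
32-bit cycle counter wraps roughly every 4.3 seconds.
.sp
.in +2
.nf
```c
#include <stdint.h>

/*
 * Events counted between two samples of a 32-bit counter.
 * Unsigned subtraction is performed modulo 2^32, so the result is
 * correct provided fewer than 2^32 events occurred between samples.
 */
static uint32_t
pic_delta32(uint32_t before, uint32_t after)
{
    return ((uint32_t)(after - before));
}

/*
 * Seconds until a 32-bit cycle counter wraps at the given clock rate.
 * clock_hz is a hypothetical, caller-supplied value in hertz.
 */
static double
seconds_to_wrap32(double clock_hz)
{
    return (4294967296.0 / clock_hz);
}
```
.fi
.in -2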
.SH RETURN VALUES
.sp
.LP
Upon successful completion, \fBcpc_bind_event()\fR and \fBcpc_take_sample()\fR
return \fB0\fR. Otherwise, these functions return \fB\(mi1\fR, and set
\fBerrno\fR to indicate the error.
.SH ERRORS
.sp
.LP
The \fBcpc_bind_event()\fR and \fBcpc_take_sample()\fR functions will fail if:
.sp
.ne 2
.na
\fB\fBEACCES\fR\fR
.ad
.RS 11n
For \fBcpc_bind_event()\fR, access to the requested hypervisor event was
denied.
.RE

.sp
.ne 2
.na
\fB\fBEAGAIN\fR\fR
.ad
.RS 11n
Another process may be sampling system-wide CPU statistics. For
\fBcpc_bind_event()\fR, this implies that no new contexts can be created. For
\fBcpc_take_sample()\fR, this implies that the performance counter context has
been invalidated and must be released with \fBcpc_rele()\fR. Robust programs
should be coded to expect this behavior and recover from it by releasing the
now invalid context by calling \fBcpc_rele()\fR, sleeping for a while, then
attempting to bind and sample the event once more.
.RE

.sp
.ne 2
.na
\fB\fBEINVAL\fR\fR
.ad
.RS 11n
The \fBcpc_take_sample()\fR function has been invoked before the context is
bound.
.RE

.sp
.ne 2
.na
\fB\fBENOTSUP\fR\fR
.ad
.RS 11n
The caller has attempted an operation that is illegal or not supported on the
current platform, such as attempting to specify signal delivery on counter
overflow on a CPU that doesn't generate an interrupt on counter overflow.
.RE

.SH USAGE
.sp
.LP
Prior to calling \fBcpc_bind_event()\fR, applications should call
\fBcpc_access\fR(3CPC) to determine if the counters are accessible on the
system.
.SH EXAMPLES
.LP
\fBExample 1 \fRUse hardware performance counters to measure events in a
process.
.sp
.LP
The example below shows how a standalone program can be instrumented with the
\fBlibcpc\fR routines to use hardware performance counters to measure events in
a process.  The program performs 20 iterations of a computation, measuring the
counter values for each iteration.  By default, the example makes the counters
measure external cache references and external cache hits; these options are
only appropriate for UltraSPARC processors. By setting the \fBPERFEVENTS\fR
environment variable to other strings (a list of which can be gleaned from the
\fB-h\fR flag of the \fBcpustat\fR or \fBcputrack\fR utilities), other events
can be counted.  The \fBerror()\fR routine below is assumed to be a
user-provided routine analogous to the familiar \fBprintf\fR(3C) routine from
the C library but which also performs an \fBexit\fR(2) after printing the
message.

.sp
.in +2
.nf
\fB#include <inttypes.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <libcpc.h>

int
main(int argc, char *argv[])
{
    int cpuver, iter;
    char *setting = NULL;
    cpc_event_t event;

    /* Confirm that the application and library agree on the version. */
    if (cpc_version(CPC_VER_CURRENT) != CPC_VER_CURRENT)
        error("application:library cpc version mismatch!");

    if ((cpuver = cpc_getcpuver()) == -1)
        error("no performance counter hardware!");

    /* Default events are only appropriate for UltraSPARC. */
    if ((setting = getenv("PERFEVENTS")) == NULL)
        setting = "pic0=EC_ref,pic1=EC_hit";

    if (cpc_strtoevent(cpuver, setting, &event) != 0)
        error("can't measure '%s' on this processor", setting);
    setting = cpc_eventtostr(&event);

    if (cpc_access() == -1)
        error("can't access perf counters: %s", strerror(errno));

    if (cpc_bind_event(&event, 0) == -1)
        error("can't bind lwp%d: %s", _lwp_self(), strerror(errno));

    for (iter = 1; iter <= 20; iter++) {
        cpc_event_t before, after;

        if (cpc_take_sample(&before) == -1)
            break;

        /* ==> Computation to be measured goes here <== */

        if (cpc_take_sample(&after) == -1)
            break;
        (void) printf("%3d: %" PRId64 " %" PRId64 "\en", iter,
            after.ce_pic[0] - before.ce_pic[0],
            after.ce_pic[1] - before.ce_pic[1]);
    }

    /* iter is 21 only if all 20 iterations completed without a break. */
    if (iter != 21)
        error("can't sample '%s': %s", setting, strerror(errno));

    free(setting);
    return (0);
}\fR
.fi
.in -2

.LP
\fBExample 2 \fRWrite a signal handler to catch overflow signals.
.sp
.LP
This example builds on Example 1, but demonstrates how to write the signal
handler to catch overflow signals. The counters are preset so that counter zero
is 1000 counts short of overflowing, while counter one is set to zero. After
1000 counts on counter zero, the signal handler will be invoked.

.sp
.LP
First the signal handler:

.sp
.in +2
.nf
#define PRESET0        (UINT64_MAX - UINT64_C(999))
#define PRESET1        0

void
emt_handler(int sig, siginfo_t *sip, void *arg)
{
    ucontext_t *uap = arg;
    cpc_event_t sample;

    if (sig != SIGEMT || sip->si_code != EMT_CPCOVF) {
        psignal(sig, "example");
        psiginfo(sip, "example");
        return;
    }

    (void) printf("lwp%d - si_addr %p ucontext: %%pc %p %%sp %p\en",
        _lwp_self(), (void *)sip->si_addr,
        (void *)uap->uc_mcontext.gregs[PC],
        (void *)uap->uc_mcontext.gregs[USP]);

    if (cpc_take_sample(&sample) == -1)
        error("can't sample: %s", strerror(errno));

    (void) printf("0x%" PRIx64 " 0x%" PRIx64 "\en",
        sample.ce_pic[0], sample.ce_pic[1]);
    (void) fflush(stdout);

    sample.ce_pic[0] = PRESET0;
    sample.ce_pic[1] = PRESET1;
    if (cpc_bind_event(&sample, CPC_BIND_EMT_OVF) == -1)
        error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));
}
.fi
.in -2

.sp
.LP
and second the setup code (this can be placed after the code that selects the
event to be measured):

.sp
.in +2
.nf
\fBstruct sigaction act;
cpc_event_t event;
\&...
act.sa_sigaction = emt_handler;
bzero(&act.sa_mask, sizeof (act.sa_mask));
act.sa_flags = SA_RESTART|SA_SIGINFO;
if (sigaction(SIGEMT, &act, NULL) == -1)
    error("sigaction: %s", strerror(errno));
event.ce_pic[0] = PRESET0;
event.ce_pic[1] = PRESET1;
if (cpc_bind_event(&event, CPC_BIND_EMT_OVF) == -1)
    error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));

for (iter = 1; iter <= 20; iter++) {
    /* ==> Computation to be measured goes here <== */
}

cpc_bind_event(NULL, 0);    /* done */\fR
.fi
.in -2

.sp
.LP
Note that a more general version of the signal handler would use \fBwrite\fR(2)
directly instead of depending on the signal-unsafe semantics of \fBstderr\fR
and \fBstdout\fR. Most real signal handlers will probably do more with the
samples than just print them out.

.SH ATTRIBUTES
.sp
.LP
See \fBattributes\fR(5) for descriptions of the following attributes:
.sp

.sp
.TS
box;
c | c
l | l .
ATTRIBUTE TYPE	ATTRIBUTE VALUE
_
MT-Level	MT-Safe
_
Interface Stability	Obsolete
.TE

.SH SEE ALSO
.sp
.LP
\fBcpustat\fR(1M), \fBcputrack\fR(1), \fBwrite\fR(2), \fBcpc\fR(3CPC),
\fBcpc_access\fR(3CPC), \fBcpc_bind_curlwp\fR(3CPC),
\fBcpc_set_sample\fR(3CPC), \fBcpc_strtoevent\fR(3CPC), \fBcpc_unbind\fR(3CPC),
\fBlibcpc\fR(3LIB), \fBattributes\fR(5)
.SH NOTES
.sp
.LP
The \fBcpc_bind_event()\fR, \fBcpc_take_sample()\fR, and \fBcpc_rele()\fR
functions exist for binary compatibility only. Source containing these
functions will not compile. These functions are obsolete and might be removed
in a future release. Applications should use \fBcpc_bind_curlwp\fR(3CPC),
\fBcpc_set_sample\fR(3CPC), and \fBcpc_unbind\fR(3CPC) instead.
.sp
.LP
Sometimes, even the overhead of performing a system call will be too disruptive
to the events being measured. Once a call to \fBcpc_bind_event()\fR has been
issued, it is possible to directly access the performance hardware registers
from within the application. If the performance counter context is active, then
the counters will count on behalf of the current \fBLWP\fR.
.SS "SPARC"
.sp
.in +2
.nf
rd %pic, %r\fIN\fR        ! All UltraSPARC
wr %r\fIN\fR, %pic        ! (ditto, but see text)
.fi
.in -2

.SS "x86"
.sp
.in +2
.nf
rdpmc               ! Pentium II only
.fi
.in -2

.sp
.LP
If the counter context is not active or has been invalidated, the \fB%pic\fR
register (SPARC), and the \fBrdpmc\fR instruction (Pentium) will become
unavailable.
.sp
.LP
Note that the two 32-bit UltraSPARC performance counters are kept in the single
64-bit \fB%pic\fR register so a couple of additional instructions are required
to separate the values. Also note that when the \fB%pcr\fR register bit has
been set that configures the \fB%pic\fR register as readable by an application,
it is also writable. Any values written will be preserved by the context
switching mechanism.
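.sp
.LP
In C, separating the two counters from a value already read out of \fB%pic\fR
might look like the sketch below. The helper names are hypothetical, and the
bit layout is an assumption (counter zero in bits 31 to 0, counter one in bits
63 to 32) that should be checked against the processor manual.
.sp
.in +2
.nf
```c
#include <stdint.h>

/*
 * Assumed layout: PIC0 in bits 31..0 of %pic, PIC1 in bits 63..32.
 * Verify this against the manual for the target processor.
 */
static uint32_t
pic0_of(uint64_t pic)
{
    return ((uint32_t)(pic & UINT32_C(0xffffffff)));
}

static uint32_t
pic1_of(uint64_t pic)
{
    return ((uint32_t)(pic >> 32));
}
```
.fi
.in -2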
.sp
.LP
Pentium II processors support the non-privileged \fBrdpmc\fR instruction which
requires [5] that the counter of interest be specified in \fB%ecx\fR, and
returns a 40-bit value in the \fB%edx:%eax\fR register pair.  There is no
non-privileged access mechanism for Pentium I processors.
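.sp
.LP
Once the two register halves have been captured, they can be combined as in the
sketch below. The helper names are hypothetical, and the code that executes
\fBrdpmc\fR itself is omitted. Differences are taken modulo 2^40 to match the
40-bit counter width described above.
.sp
.in +2
.nf
```c
#include <stdint.h>

#define PMC_MASK40    ((UINT64_C(1) << 40) - 1)

/* Combine the %edx (high) and %eax (low) halves into a 40-bit value. */
static uint64_t
pmc_value40(uint32_t edx, uint32_t eax)
{
    uint64_t raw = ((uint64_t)edx << 32) | eax;

    return (raw & PMC_MASK40);
}

/* Events between two 40-bit samples, allowing for a single wrap. */
static uint64_t
pmc_delta40(uint64_t before, uint64_t after)
{
    return ((after - before) & PMC_MASK40);
}
```
.fi
.in -2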
.SS "Handling counter overflow"
.sp
.LP
As described above, when counting events, some processors allow their counter
registers to silently overflow. More recent CPUs such as UltraSPARC III and
Pentium II, however, are capable of generating an interrupt when the hardware
counter overflows. Some processors offer more control over when interrupts will
actually be generated. For example, they might allow the interrupt to be
programmed to occur when only one of the counters overflows. See
\fBcpc_strtoevent\fR(3CPC) for the syntax.
.sp
.LP
The most obvious use for this facility is to ensure that the full 64-bit
counter values are maintained without repeated sampling. However, current
hardware does not record which counter overflowed. A more subtle use for this
facility is to preset the counter to a value a little less than the maximum
value, then use the resulting interrupt to catch the counter overflow
associated with that event. The overflow can then be used as an indication of
the frequency of the occurrence of that event.
.sp
.LP
Note that the interrupt generated by the processor may not be particularly
precise.  That is, the particular instruction that caused the counter overflow
may be earlier in the instruction stream than is indicated by the program
counter value in the ucontext.
.sp
.LP
When \fBcpc_bind_event()\fR is called with the \fBCPC_BIND_EMT_OVF\fR flag
set, then as before, the control registers and counters are preset from the
64-bit values contained in \fBevent\fR. However, when the flag is set, the
kernel arranges to send the calling process a \fBSIGEMT\fR signal when the
overflow occurs, with the \fBsi_code\fR field of the corresponding
\fBsiginfo\fR structure set to \fBEMT_CPCOVF\fR, and the \fBsi_addr\fR field
set to the program counter value at the time the overflow interrupt was
delivered. Counting is disabled until the next call to \fBcpc_bind_event()\fR.
Even in a multithreaded process, during execution of the signal handler, the
thread behaves as if it is temporarily bound to the running \fBLWP\fR.
.sp
.LP
Different processors have different counter ranges available, though all
processors supported by Solaris allow at least 31 bits to be specified as a
counter preset value; thus portable preset values lie in the range
\fBUINT64_MAX\fR to \fBUINT64_MAX\fR\(mi\fBINT32_MAX\fR.
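.sp
.LP
A preset for a desired overflow interval can be derived as in the sketch below,
which follows the convention of Example 2 (a preset of
\fBUINT64_MAX\fR\(mi999 overflows after 1000 counts). The helper
\fBoverflow_preset()\fR is hypothetical and is not part of \fBlibcpc\fR. It
rejects intervals that fall outside the portable range given above.
.sp
.in +2
.nf
```c
#include <stdint.h>

/*
 * Compute a preset that makes a counter overflow after `events`
 * further counts. Stores the preset in *preset and returns 0 when
 * the result lies in the portable range UINT64_MAX - INT32_MAX ..
 * UINT64_MAX; otherwise returns -1 and leaves *preset untouched.
 */
static int
overflow_preset(uint64_t events, uint64_t *preset)
{
    if (events < 1 || events - 1 > (uint64_t)INT32_MAX)
        return (-1);
    *preset = UINT64_MAX - (events - 1);
    return (0);
}
```
.fi
.in -2
.sp
.LP
With this convention, each overflow signal stands for \fIevents\fR occurrences
of the measured event, so a larger interval means fewer interrupts.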
.sp
.LP
The appropriate preset value will often need to be determined experimentally.
Typically, it will depend on the event being measured, as well as the desire to
minimize the impact of the act of measurement on the event being measured; less
frequent interrupts and samples lead to less perturbation of the system.
.sp
.LP
If the processor cannot detect counter overflow, this call will fail
(\fBENOTSUP\fR). Specifying a null event unbinds the context from the
underlying \fBLWP\fR and disables signal delivery.  Currently, only user events
can be measured using this technique. See Example 2, above.
.SS "Inheriting events onto multiple \fBLWP\fRs"
.sp
.LP
By default, the library binds the performance counter context to the current
\fBLWP\fR only.  If the \fBCPC_BIND_LWP_INHERIT\fR flag is set, then any
subsequent \fBLWP\fRs created by that \fBLWP\fR will automatically inherit the
same performance counter context.  The counters will be initialized to 0 as if
a \fBcpc_bind_event()\fR had just been issued. This automatic inheritance
behavior can be useful when dealing with multithreaded programs to determine
aggregate statistics for the program as a whole.
.sp
.LP
If the \fBCPC_BIND_EMT_OVF\fR flag is also set, the process will immediately
dispatch a \fBSIGEMT\fR signal to the freshly created \fBLWP\fR so that it can
preset its counters appropriately on the new \fBLWP\fR. This initialization
condition can be detected using \fBcpc_take_sample()\fR to check that both
\fBce_pic\fR[] values are set to \fBUINT64_MAX\fR.