Solaris 10 Performance Tuning - Analyzing CPU

Friday, March 13, 2015

Tools: psrinfo, uptime, vmstat, sar -u, sar -q (CPU run queue, same as vmstat kthr:r), mpstat, prstat, dtrace

$ psrinfo -vp
The physical processor has 8 virtual processors (32-39)
  SPARC64-VII (portid 1056 impl 0x7 ver 0x91 clock 2400 MHz)
The physical processor has 8 virtual processors (40-47)
  SPARC64-VII (portid 1064 impl 0x7 ver 0x91 clock 2400 MHz)
The physical processor has 8 virtual processors (48-55)
  SPARC64-VII (portid 1072 impl 0x7 ver 0x91 clock 2400 MHz)
The physical processor has 8 virtual processors (56-63)
  SPARC64-VII (portid 1080 impl 0x7 ver 0x91 clock 2400 MHz)

$ uptime
 1:04pm  up 114 day(s), 14:15,  6 users,  load average: 4.41, 4.39, 3.76

The three load averages are over 1-, 5-, and 15-minute intervals.

$ vmstat 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap      free     re  mf  pi po fr de sr s0 s1 sd  sd     in     sy    cs us sy id
 0 0 0 107752976 11913152   74 296 901  2  2  0  0  2  2 -0 -349  5960 727992  5910  9  2 89
 0 0 0 106811152  8983688    8  31   2  0  0  0  0  0  0  0    2  8818 615705 13764 13  2 85
 0 0 0 106701904  8884808    0  14   0  0  0  0  0  0  0  0    2 14250 584633 24692 12  3 85
 0 0 0 106535552  8730784    9  87   0  0  0  0  0  9  9  0    0  9247 379414 14642 12  2 87
 0 0 0 106403648  8603144    0  10   0  0  0  0  0  0  0  0    0  6668 550100  9664 12  2 87
 0 0 0 106296328  8504712    0   5   0  0  0  0  0  0  0  0    0 10490 471906 17677 12  2 85
 0 0 0 106277472  8510232    1   3   6  0  0  0  0  0  0  0    0 12011 535986 20477 11  2 87
 0 0 0 106277472  8527712    3  37   0  0  0  0  0  0  0  0    0 13685 668057 23689  9  2 89

The first line is the summary since boot. vmstat reads its statistics from kstat, which maintains CPU utilization statistics for each CPU.

kthr:r    : total number of threads on the ready-to-run queues; a sustained nonzero value indicates CPU saturation
faults:in : number of interrupts/sec
faults:sy : number of system calls/sec
faults:cs : number of context switches/sec
cpu:us    : percent user time (user-mode threads)
cpu:sy    : percent system time
cpu:id    : percent idle
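Since vmstat only aggregates kstat counters, you can also read the raw per-CPU counters yourself with kstat(1M). A minimal sketch, assuming the cpu:N:sys kstat names as found on Solaris 10 (cpu_ticks_idle/user/kernel/wait); the deltas between samples are what vmstat turns into the us/sy/id percentages:

$ kstat -p -m cpu -n sys -s 'cpu_ticks*' 1 3

The -p flag prints one parseable line per statistic (module:instance:name:statistic), and the trailing "1 3" samples once a second, three times.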
sar: turn sar on if it is not already enabled.

$ svcs -a | grep sar
online         Jan_22   svc:/system/sar:default
$ svcs -l svc:/system/sar:default
fmri         svc:/system/sar:default
name         system activity reporting package
enabled      true
state        online
next_state   none
state_time   Fri Jan 22 22:08:22 2010
logfile      /var/svc/log/system-sar:default.log
restarter    svc:/system/svc/restarter:default
dependency   require_all/none svc:/system/filesystem/minimal (online)

$ sar 5 10
SunOS betapm 5.10 Generic_142900-03 sun4u    05/17/2010

13:07:48    %usr    %sys    %wio   %idle
13:07:53      13       4       0      84
13:07:58      11       4       0      85
13:08:03       9       3       0      88
13:08:08       9       3       0      88
13:08:13      11       4       0      85
13:08:18      11       3       0      86
13:08:23      10       3       0      86
13:08:28       9       3       0      88
13:08:33       9       3       0      88
13:08:38      11       3       0      86

Average       10       3       0      86

As a rule of thumb, the %usr to %sys ratio should be around 70% to 30%. %wio is always 0 in Solaris 10 (the statistic was disabled).

$ sar -q 2 10
SunOS betapm 5.10 Generic_142900-03 sun4u    05/17/2010

13:11:34  runq-sz %runocc swpq-sz %swpocc
13:11:36      1.0      50     0.0       0
13:11:38      0.0       0     0.0       0
13:11:40      0.0       0     0.0       0
13:11:42      0.0       0     0.0       0
13:11:44      0.0       0     0.0       0
13:11:46      0.0       0     0.0       0
13:11:48      0.0       0     0.0       0
13:11:50      1.0      50     0.0       0
13:11:52      0.0       0     0.0       0
13:11:54      0.0       0     0.0       0

Average       1.0      10     0.0       0

runq-sz : run queue size, same as vmstat kthr:r
%runocc : percentage of time the run queue was occupied; useful for spotting short bursts of run queue activity
swpq-sz : number of swapped-out threads
%swpocc : percentage of time there were swapped-out threads
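The SMF service only covers the reporting side; historical collection is driven from the sys user's crontab via sa1/sa2. A minimal sketch of the stock entries (shipped commented out; uncomment them with 'crontab -e sys' after 'svcadm enable system/sar'):

0 * * * 0-6 /usr/lib/sa/sa1
20,40 8-17 * * 1-5 /usr/lib/sa/sa1
5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A

sa1 snapshots the counters into /var/adm/sa/saDD and sa2 writes the daily report, so 'sar -u -f /var/adm/sa/saDD' can replay a past day's CPU usage.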
$ mpstat 3
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw  syscl usr sys  wt idl
 32    0   0   18   460  126  257    0    3   10    0   2858   0   0   0  99
 33    0   0   14   131   34  180    1    7   49    0    740   5   1   0  94
 34    0   0    5   134   45  161    1    6    3    0    879   0   0   0 100
 35    0   0   60    78   33   93    0    7   54    0    457   0   0   0 100
 36    0   0    3   199   91  204    0    4    2    0   1125   0   0   0 100
 37    0   0  105    49   13   55    0    3   17    0    164   0   0   0 100
 38    0   0    2   124   56  126    0    2    1    0   1716   0   0   0  99
 39   30   0   21     5    0    7    0    2    4    0     13   0   0   0 100
 40    0   0    1     6    2    3    0    1    0    0     50   0   0   0 100
 41    0   0    0     3    0    5    0    3    5    0      0   0   0   0 100
 42    0   0   10    22   10    0    0    0    0    0      0   0   0   0 100
 43    0   0    1     7    1   10    0    5   12    0      1   0   0   0 100
 44    0   0    0     2    0    0    0    0    0    0      0   0   0   0 100
 45    0   0    1     6    0    8    0    4    7    0      2   0   0   0 100
 46    0   0   29    68   30    0    6    0  510    0 795891  72  28   0   0
 47    0   0    2     5    1    2    0    1    4    0      0   0   0   0 100
 48    0   0   26   346  153  331    1    6    8    0   4467   1   0   0  99
 49    0   0    0     3    0    7    0    4    8    0      0   0   0   0 100
 50    0   0   53   129   92   22    0    0    3    0      8   0   0   0 100
 51    0   0   65   131   42  142    0    7   61    0    568   1   0   0  99
 52    0   0   23   878  671  382    0    4   22    0   3765   1   2   0  97
 53    0   0   43    43   32   15    0    3    9    0      0   0   0   0 100
 54    0   0    0    22    6   27    0    0    0    0     39   0   0   0 100
 55    0   0   91   237  108  241    0    6   11    0    213   0   0   0 100
 56    0   0   38   303  143  231    0    5    2    0   3360   1   1   0  99
 57    0   0    8    59   14   81    0    2   19    0    352   3   0   0  97
 58    0   0    1     5    1    3    0    1    0    0      7   0   0   0 100
 59    0   0    9    61   18   57    0    3   13    0     62   0   1   0  99
 60    0   0   23   212   96  183    1    6    2    0   1159   0   0   0  99
 61    0   0   25   134   38  169    0    5   81    0    921   5   0   0  95
 62    0   0    0    19    3   27    0    0    2    0     27   0   0   0 100
 63    0   0    6    62   22   62    0    6    7    0    271   0   0   0 100

csw  : number of context switches, both voluntary and involuntary. A voluntary switch occurs when a thread performs a blocking system call, e.g. I/O, and sleeps until the event completes.
icsw : number of involuntary context switches; indicates there were more runnable threads than available physical processors.
migr : migrations of threads between processors; the number of times the scheduler moves a ready-to-run thread to an idle processor. The OS normally keeps a thread on the last processor it ran on unless that processor is busy, in which case the thread migrates. Such migration is bad for performance because the thread loses its CPU cache warmth.
intr : number of interrupts taken on the CPU.
ithr : number of interrupts converted to real threads, as a result of inbound network packets, blocking for a mutex, or a synchronization event.
smtx : number of mutex contention events in the kernel; a high value shows up as high %sys time.
srw  : number of reader/writer lock contention events in the kernel. Excessive reader/writer lock contention results in non-scaling performance and systems that are unable to use all the available CPU resources (the symptom is idle time).

With -a, mpstat aggregates the statistics by processor set; sze is the number of CPUs in each set.

$ mpstat -ap 5 20
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw  syscl usr sys  wt idl sze
  0  183   1 1186   874  207 1104    7  158  156    0   4229   0   1   0  99  16
  1  112 104 2329  5085 2222 4805  215  710 1958  187 723869  17   4   0  80  16
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw  syscl usr sys  wt idl sze
  0  107   0 1505   995  323 1128    5   78  368    0   3755   2   1   0  97  16
  1    0   0  374  3011 1364 2399    7   68   35    0  38155   0   0   0  99  16
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw  syscl usr sys  wt idl sze
  0    4   0  244  1010  322 1170    4   74  304    0   3549   2   0   0  98  16
  1    0   0  472  2907 1313 2303    5   62   25    0  45064   0   0   0  99  16
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw  syscl usr sys  wt idl sze
  0    1   0  290   990  323 1156    4   79  322    0   3577   2   1   0  97  16
  1    0   0  265  2600 1119 2080    6   51   37    0  48610   1   0   0  99  16
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw  syscl usr sys  wt idl sze
  0   74   0  715  1074  323 1275    6   90  287    0   3816   3   1   0  96  16
  1    0   0  325  2695 1203 2077    6   67   31    0  59557   1   0   0  99  16
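When mpstat points at kernel lock contention (high smtx/srw) or heavy cross calls (xcal), the bundled lockstat and dtrace tools can attribute the events. A minimal sketch, run as root; the one-liner uses the stock DTrace sysinfo provider:

$ lockstat sleep 5
$ dtrace -n 'sysinfo:::xcalls { @[execname] = count(); }'

lockstat records kernel adaptive-mutex and reader/writer lock contention events while 'sleep 5' runs, and the DTrace one-liner counts cross calls by the process name that triggered them (press Ctrl-C to print the aggregation).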
$ prstat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 29302 sybase     11G   11G sleep   59    0 195:35:53 0.1% dataserver/259
 29347 sybase     11G   11G sleep   59    0 151:18:42 0.1% dataserver/259
 29348 sybase     11G   11G sleep   59    0 153:15:41 0.0% dataserver/259
 29340 sybase     11G   11G sleep   59    0 147:23:56 0.0% dataserver/259
 29341 sybase     11G   11G sleep   59    0 153:08:31 0.0% dataserver/259
 29342 sybase     11G   11G sleep   59    0 149:55:02 0.0% dataserver/259
 29346 sybase     11G   11G sleep   59    0 149:56:52 0.0% dataserver/259
 29334 sybase     11G   11G sleep   59    0 153:15:53 0.0% dataserver/259
 29349 sybase     11G   11G sleep   59    0 182:37:54 0.0% dataserver/259
 29343 sybase     11G   11G sleep   59    0 148:44:13 0.0% dataserver/259
 29345 sybase     11G   11G sleep   59    0 152:04:00 0.0% dataserver/259
 29333 sybase     11G   11G sleep   59    0 154:10:37 0.0% dataserver/259
 29332 sybase     11G   11G sleep   59    0 146:54:46 0.0% dataserver/259
 29335 sybase     11G   11G sleep   59    0 156:39:37 0.0% dataserver/259
 29344 sybase     11G   11G sleep   59    0 144:58:21 0.0% dataserver/259
  1540 sybase     14M   11M sleep   59    0   0:00:00 0.0% xpserver/1
 29320 sybase     10M 9152K sleep   59    0   0:00:15 0.0% backupserver/1
 28892 sybase   4296K 2136K sleep   59    0   0:00:06 0.0% SYBASE/1
 29304 sybase     10M 6872K sleep   59    0   0:00:00 0.0% backupserver/1
 28891 sybase   3440K 1880K sleep   59    0   0:00:06 0.0% lmgrd/1
Total: 20 processes, 3890 lwps, load averages: 1.02, 1.73, 2.12

Quick notes for prstat: the CPU field is thread->pr_pctcpu, and the load average is the same as that reported by 'uptime'. When the load average consistently exceeds the number of CPUs in the system (psrinfo), the system is very likely overloaded.

$ prstat -mL    (turn on microstates with '-m' and show per-thread statistics with '-L')
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 14853 sjing     21  72 0.7 0.0 0.0 0.0 6.5 0.0  73  73 78K   0 prstat/1
 12047 betapm    34 0.8 0.0 0.0 0.0 0.0  65 0.1 773  21 12K   0 MathKernel/1
 29302 sybase    22 4.5 0.0 0.0 0.0 0.0  74 0.0 227  71 .3M   9 dataserver/1
 29349 sybase    20 3.7 0.0 0.0 0.0 0.0  76 0.0 234  75 .2M   7 dataserver/1
 29335 sybase    17 3.4 0.0 0.0 0.0 0.0  79 0.0 240  67 .2M   7 dataserver/1
 29333 sybase    17 3.4 0.0 0.0 0.0 0.0  79 0.0 242  67 .2M   8 dataserver/1
 29341 sybase    17 3.2 0.0 0.0 0.0 0.0  80 0.0 244  66 .2M   9 dataserver/1
 29348 sybase    17 3.2 0.0 0.0 0.0 0.0  80 0.0 243  67 .2M   8 dataserver/1
 29334 sybase    17 3.2 0.0 0.0 0.0 0.0  80 0.0 242  67 .2M   7 dataserver/1
 29345 sybase    17 3.2 0.0 0.0 0.0 0.0  80 0.0 242  66 .2M   7 dataserver/1
 29347 sybase    17 3.1 0.0 0.0 0.0 0.0  80 0.0 243  67 .2M   8 dataserver/1
 29342 sybase    17 3.2 0.0 0.0 0.0 0.0  80 0.0 245  66 .2M  11 dataserver/1
 29346 sybase    17 3.2 0.0 0.0 0.0 0.0  80 0.0 244  67 .2M   8 dataserver/1
 29343 sybase    17 3.0 0.0 0.0 0.0 0.0  80 0.0 244  67 .2M   8 dataserver/1
 29340 sybase    17 3.1 0.0 0.0 0.0 0.0  80 0.0 245  66 .2M   8 dataserver/1
Total: 141 processes, 4479 lwps, load averages: 0.87, 1.24, 1.83

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 29302 sybase    10 4.5 0.0 0.0 0.0 0.0  85 0.0 367   9 .3M 126 dataserver/1
 29347 sybase   9.9 4.0 0.0 0.0 0.0 0.0  86 0.0 373   8 .2M 127 dataserver/1
 12047 betapm   9.1 0.4 0.0 0.0 0.0 0.0  90 0.1 536   6  8K   0 MathKernel/1
 29348 sybase   1.0 0.4 0.0 0.0 0.0 0.0  99 0.0 303   1 29K  12 dataserver/1
 29340 sybase   0.8 0.4 0.0 0.0 0.0 0.0  99 0.0 302   2 26K   9 dataserver/1
 14853 sjing    0.3 0.8 0.0 0.0 0.0 0.0  99 0.0  17   1 727   0 prstat/1
 14831 dbop     0.0 0.9 0.0 0.0 0.0 0.0  99 0.1 826   2  1K   0 find/1
 29341 sybase   0.6 0.3 0.0 0.0 0.0 0.0  99 0.0 297   1 19K   7 dataserver/1
 29342 sybase   0.6 0.2 0.0 0.0 0.0 0.0  99 0.0 298   0 17K   4 dataserver/1
 29349 sybase   0.5 0.2 0.0 0.0 0.0 0.0  99 0.0 298   1 16K   5 dataserver/1
 12063 betapm   0.2 0.4 0.0 0.0 0.0 0.0  99 0.0 505   1  6K   0 MathAddon/1
 29346 sybase   0.4 0.2 0.0 0.0 0.0 0.0  99 0.0 297   1 14K   4 dataserver/1
 29333 sybase   0.4 0.2 0.0 0.0 0.0 0.0  99 0.0 297   0 11K   3 dataserver/1
 29334 sybase   0.4 0.2 0.0 0.0 0.0 0.0  99 0.0 299   0 12K   3 dataserver/1
 29335 sybase   0.4 0.2 0.0 0.0 0.0 0.0  99 0.0 297   1  9K   2 dataserver/1
Total: 141 processes, 4479 lwps, load averages: 0.86, 1.23, 1.82

LAT (latency) is the percentage of time the thread spent waiting for a CPU; it is a direct per-thread measure of CPU saturation.
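As a closing sanity check, the "load average vs. CPU count" rule from the prstat notes above is easy to script. A minimal sketch; the awk/cut field handling assumes uptime output formatted like the example at the top of this post:

$ ncpu=`psrinfo | wc -l`
$ load1=`uptime | awk -F'load average: ' '{print $2}' | cut -d, -f1`
$ echo "1-min load: $load1 / CPUs: $ncpu"

If the 1-minute load is persistently above the CPU count, there are on average more runnable threads than processors, which should match a nonzero vmstat kthr:r and rising prstat LAT values.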
About me

I am Santosh Chalwad, a Sr. Unix Systems Engineer with 12+ years of experience in the IT industry and in-depth knowledge of various Unix operating systems, including Solaris, Linux, HP-UX, and AIX, along with exposure to VxVM, clustering, and virtualization. If you find something useful here, a comment would be appreciated to let other viewers know that the solution/method worked for you.