Unix Administration

Posts

To resolve fmd errors

June 10, 2014

After any hardware replacement, or even a plain reboot of the server, we need to check and clear the fmd errors on the system.

To check the fmd errors:

# fmdump -v
# fmadm faulty
# fmadm faulty -r
# fmadm faulty -a

To rotate the fmd errors, use the script below (run it at least twice):

#GZ: fmdump
#Clean up old entries:
for i in `/usr/sbin/fmdump|awk '{print $4}'`
do
/usr/sbin/fmadm repair $i
done
sleep 2
/usr/sbin/fmadm rotate fltlog
/usr/sbin/fmadm rotate errlog

If the errors still show up in "# fmadm faulty -a", we need to clear the fmd cache using the steps below.

Clear ereports and resource cache:

# cd /var/fm/fmd
# rm e* f* c*/eft/* r*/*

Clearing out FMA files with no reboot needed:

svcadm disable -s svc:/system/fmd:default
cd /var/fm/fmd
find /var/fm/fmd -type f -exec ls {} \;
find /var/fm/fmd -type f -exec rm {} \;
svcadm enable svc:/system/fmd:default

Then monitor the system for a few hours (or one day). If the errors come back, we need to raise an Oracle SR to get it f...
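The UUID extraction used by the cleanup loop above can be sketched as a small filter. This is only a sketch: the function name fault_uuids is hypothetical, and it assumes that fmdump prints a header line starting with TIME and that the UUID is the fourth whitespace-separated field, as the awk call in the loop implies.

```shell
#!/bin/sh
# fault_uuids (hypothetical helper, not part of fmd): read fmdump-style
# output on stdin and print the 4th field of every line except the header.
# Assumptions: the header line begins with "TIME"; the UUID is field 4.
fault_uuids() {
    awk '$1 != "TIME" && NF >= 4 { print $4 }'
}

# Possible use on a live system (left commented out; needs root and fmd):
# /usr/sbin/fmdump | fault_uuids | while read uuid; do
#     /usr/sbin/fmadm repair "$uuid"
# done
```

Skipping the header avoids handing an empty or bogus argument to fmadm repair on the first pass of the loop.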

How to Force a Crash Dump When the Solaris Operating System is Hung

June 10, 2014

First of all, you need to drop the system to the OK prompt. On older models with a Sun keyboard, press STOP+A. On newer models or a terminal, send a break key sequence, for example:

~.
#.
#~
~#

If your console is a terminal, you can type "shift-break", "ctrl-break", "ctrl-\" (ctrl-backslash), or "<enter>" followed by "~" and "ctrl-break" on Solaris SPARC.

To send a <BREAK> from Hyperterm, use <Ctrl>-<Pause> or <Alt>-<Pause>. On Hyperterminal, use Ctrl-Break.

Once the system has dropped to the OK prompt, you will see the prompt messages below:

Type 'go' to resume
ok

All you need to do is type 'sync' (without the quotes) and press Enter. The system will immediately panic. Now the hang condition has been converted into a panic, so an image of memory can be collected for later analysis. The system will attempt to reboot after the dump is complete.

Avoid filling zpools beyond 80% of their capacity

June 10, 2014

Keep pool space under 80% utilization to maintain pool performance. Currently, pool performance can degrade when a pool is very full and file systems are updated frequently, such as on a busy mail server. Full pools might cause a performance penalty, but no other issues.

If the primary workload is immutable files (write once, never remove), then you can keep a pool in the 95-96% utilization range. Keep in mind that even with mostly static content in the 95-96% range, write, read, and resilvering performance might suffer.

• Issues specific to 80% full zpool are:
  o If the zpool is fragmented and has less free space available, then it will take longer and require more CPU cycles in the kernel to find a suitable block of free space for each write. This results in lower write performance if the zpool has less than 20% free space. This issue is addressed by the following document: SunAlert: ZFS(zfs(1M)) filesystem(5) Performance May Drop Significantl...
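To keep an eye on the 80% guideline above, a small filter over the scripted output of zpool list can flag pools that have reached the threshold. This is a minimal sketch: the function name pools_over is hypothetical, it assumes a POSIX-style shell (ksh or bash rather than the old Bourne sh), and it expects input lines of the form "name capacity%" as printed by "zpool list -H -o name,capacity".

```shell
#!/bin/sh
# pools_over (hypothetical helper): read "name capacity%" lines on stdin
# and print every pool whose capacity is at or above the limit in $1.
pools_over() {
    limit=$1
    while read -r name cap; do
        pct=${cap%\%}                  # strip the trailing percent sign
        case $pct in
            ''|*[!0-9]*) continue ;;   # skip non-numeric capacity values
        esac
        if [ "$pct" -ge "$limit" ]; then
            echo "$name $cap"
        fi
    done
}

# Possible use on a live system (left commented out; needs ZFS):
# zpool list -H -o name,capacity | pools_over 80
```

For a mostly write-once workload the same filter could be called with a higher limit, such as 95.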

Important when increasing mirrored stripe/concat volumes

June 10, 2014

mirror=target: The attribute mirror=target specifies that volumes should be mirrored between identical target IDs on different controllers.

root@# vxdisk -e list | egrep 'emc2_07ce|emc2_07cf|apevmx14_0880|apevmx14_0881'
apevmx14_0880  auto:sliced  emc1_1ffc_ol1_1dbdg_m2  ol1_1dbdg  online thinrclm  c15t5000097408472964d111s2  tprclm
apevmx14_0881  auto:sliced  emc1_200c_ol1_1dbdg_m2  ol1_1dbdg  online thinrclm  c15t5000097408472964d112s2  tprclm
emc2_07ce      auto:sliced  emc1_1ffc_ol1_1dbdg_m   ol1_1dbdg  online thinrclm  c15t500009740847255Cd105s2  tprclm
emc2_07cf      auto:sliced  emc1_200c_ol1_1dbdg_m   ol1_1dbdg  online thinrclm  c15t500009740847255Cd106s2  tprclm

mirror=enclr enclr:enc1 enclr:enc2

The dis...

LiveUpgrade issue - Solution

June 10, 2014

root@# time lucreate -c s10u9 \
  -m /:/dev/md/dsk/d210:ufs,mirror -m /:/dev/md/dsk/d12:detach,attach,preserve \
  -m /var:/dev/md/dsk/d230:ufs,mirror -m /var:/dev/md/dsk/d32:detach,attach,preserve \
  -m /opt:/dev/md/dsk/d250:ufs,mirror -m /opt:/dev/md/dsk/d52:detach,attach,preserve \
  -n s10u11 -C /dev/dsk/c3t0d0s0
Determining types of file systems supported
Validating file system requests
The device name </dev/md/dsk/d210> expands to device path </dev/md/dsk/d210>
The device name </dev/md/dsk/d230> expands to device path </dev/md/dsk/d230>
The device name </dev/md/dsk/d250> expands to device path </dev/md/dsk/d250>
Preparing logical storage devices
Preparing physical storage devices
Configuring physical storage devices
Configuring logical storage devices
Analyzing system configuration.
No name for current boot environment.
Current boot environment is named <s10u9>.
Creating initial configuration for primary boot environment <s10u9>.
IN...

e1000g_LSO_BUG_6909685_description

June 10, 2014

Large Send Offload

Large Send Offload (LSO) is a hardware off-loading technology. LSO off-loads TCP Segmentation to NIC hardware to improve the network performance by reducing the workload on the CPUs. LSO is helpful for 10Gb network adoption on systems with slow CPU threads or lack of CPU resource. This feature integrates basic LSO framework in Solaris TCP/IP stack, so that any LSO-capable NIC might be enabled with LSO capability.

Bug ID                 6909685
Synopsis               TCP/LSO should imply HW cksum
State                  11-Closed:Will Not Fix (Closed)
Category:Subcategory   kernel:tcp-ip
Keywords               BOP
Responsible Engineer   Jonathan Anderson
Reported Against       s10u9_02 , s10u8_fcs , solaris_10u8
Duplicate Of
Introduced In
Commit to Fix
Fixed In
Release Fixed
Related Bugs           6838180 , 6855964 , 6908844
Submit Date            11-December-2009
Last Update Date       5-March-2010

Description

As described in 6908844, customer disabled HW cksum in IP stack by /etc/system modification:

set ip:dohwcksum=0

As result, the LSO packets are dropped by N...

Slow NFS due to FIXEDMTU After Live Upgrade

June 10, 2014

Over the last few days we faced a performance issue with an NFS share after a Live Upgrade. Later we noticed it was due to the FIXEDMTU flag on the interface. FIXEDMTU was not caused by Live Upgrade. It came from the mtu parameter in the hostname.interface file, which took effect on reboot. There is no reason to set an MTU value of 1500, because the default MTU for Ethernet is already 1500.

Example:

./hostname.e1000g3 auksvaqs-prod mtu 1500
./hostname.e1000g2 auksvaqs-nas mtu 1500

Please make sure LSO stays disabled across reboots.

A side note regarding LSO, which can cause performance degradation over the network:

Large Send Offload and Network Performance

One issue that I continually see reported by customers is slow network performance. Although there are literally a ton of issues that can affect how fast data moves to and from a server, there is one fix I've found that will resolve this 99% of the time: disable Large Send Offload on the Ethernet adapter. So what is Large Sen...
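Going back to the mtu entries in the hostname.interface files: a quick scan for them before a reboot could look like the sketch below. The function name find_fixed_mtu is hypothetical, and the default directory /etc is an assumption about where the hostname.* files live on the system being checked.

```shell
#!/bin/sh
# find_fixed_mtu (hypothetical helper): print every hostname.* file under
# the directory given as $1 (default /etc, an assumption) that contains
# an "mtu" keyword, together with the value that follows it.
find_fixed_mtu() {
    dir=${1:-/etc}
    for f in "$dir"/hostname.*; do
        [ -f "$f" ] || continue
        awk '{ for (i = 1; i < NF; i++)
                   if ($i == "mtu") print FILENAME ": mtu " $(i + 1) }' "$f"
    done
}

# Possible use (left commented out):
# find_fixed_mtu /etc
```

Any file reported with "mtu 1500" is a candidate for having the parameter removed, since 1500 is already the Ethernet default.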