Tuesday, May 5, 2020

Troubleshooting a Missing Solaris Library

Symptoms
# more  ./xyz
ld.so.1: more: fatal: libcurses.so.1: open failed: No such file or directory.

Cause
This error can have several causes:

 - the library was deleted by mistake
 - file system corruption
 - missing library package(s)
 - incorrect LD_LIBRARY_PATH variable
 - incorrect entry in the configuration files /var/ld/ld.config or /var/ld/64/ld.config
 - wrong version of a third-party package

Checks:
-------------------------------------------------------------------------------

Capture the exact error message printed by the program,
      e.g.  ld.so.1: more: fatal: libcurses.so.1: open failed: No such file or directory.
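
The library name can be pulled out of the captured message for use in the later checks. The following is a minimal sketch; the function name missing_lib is hypothetical, and it assumes the message contains "fatal: <library>: open failed:" as in the example above.

```shell
# Hypothetical helper: print the library name from an ld.so.1 "open failed" message.
# Assumes the message contains "...: fatal: <library>: open failed: <reason>".
missing_lib() {
    sed -n 's/.*: fatal: \([^:]*\): open failed: .*/\1/p'
}

# Feed the captured message on standard input.
echo 'ld.so.1: more: fatal: libcurses.so.1: open failed: No such file or directory.' | missing_lib
```

For the message above this prints libcurses.so.1. Lines that do not match the assumed layout produce no output.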
--------------------------------------------------------------------------------
 List the libraries required by a program using ldd(1), pvs(1), or dump(1)

% /usr/bin/ldd  /usr/bin/more
        libcurses.so.1 =>        /lib/libcurses.so.1
        libc.so.1 =>     /lib/libc.so.1
        libm.so.2 =>     /lib/libm.so.2
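
When a dependency cannot be resolved, ldd reports it as not found instead of printing a path. A minimal sketch for listing only those entries follows; the function name unresolved_libs is hypothetical, and it assumes unresolved lines contain the text "not found".

```shell
# Hypothetical helper: read ldd(1) output on standard input and print the
# names of libraries that could not be resolved.
# Assumes unresolved dependencies carry the text "not found".
unresolved_libs() {
    awk '/not found/ { print $1 }'
}

# Usage on a live system:
#   ldd /usr/bin/more | unresolved_libs
```

An empty result means ldd resolved every dependency in the current environment.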

% pvs /usr/bin/more
        libcurses.so.1 (SUNW_1.1, SUNWprivate_1.1);
        libc.so.1 (SUNW_1.1, SUNWprivate_1.1);

% dump -Lv /usr/bin/more
/usr/bin/more:
  **** DYNAMIC SECTION INFORMATION ****
.dynamic:
[INDEX] Tag         Value
[1]     NEEDED          libcurses.so.1
[2]     NEEDED          libc.so.1
-- lines omitted --


// another example: display the runtime search path recorded in /usr/bin/pidgin
% /usr/ccs/bin/elfdump  -d /usr/bin/pidgin | egrep PATH
      [33]  RUNPATH           0xe323              /usr/lib/gnome-private/lib:/usr/X11/lib:/usr/sfw/lib
      [34]  RPATH             0xe323              /usr/lib/gnome-private/lib:/usr/X11/lib:/usr/sfw/lib




--------------------------------------------------------------------------------


Check the LD_LIBRARY_PATH environment variable of the shell or program.
Bear in mind that some applications also require the LD_LIBRARY_PATH_64 variable.

% echo $LD_LIBRARY_PATH
% echo $LD_LIBRARY_PATH_64
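
If the variable is set, it helps to see which of its directories actually hold the library. The sketch below walks a colon-separated path list; the function name find_in_path is hypothetical, and the fallback /lib:/usr/lib is only the 32-bit system default reported by crle further down.

```shell
# Hypothetical helper: report which directories of a colon-separated
# search path contain the named library.
# Arguments: <library> <path-list>
find_in_path() {
    lib=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        [ -n "$dir" ] || continue      # this sketch skips empty path elements
        if [ -f "$dir/$lib" ]; then
            echo "found:   $dir/$lib"
        else
            echo "missing: $dir/$lib"
        fi
    done
    IFS=$old_ifs
}

# Use LD_LIBRARY_PATH if set, otherwise the assumed 32-bit default path.
find_in_path libcurses.so.1 "${LD_LIBRARY_PATH:-/lib:/usr/lib}"
```

The same function can be pointed at $LD_LIBRARY_PATH_64 for 64-bit applications.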


--------------------------------------------------------------------------------


Check the default runtime linker configuration with crle(1)

The following two crle commands only display the configuration:


# crle
Default configuration file (/var/ld/ld.config) not found
  Platform:     32-bit MSB SPARC
  Default Library Path (ELF):   /lib:/usr/lib  (system default)
  Trusted Directories (ELF):    /lib/secure:/usr/lib/secure  (system default)

# crle -64
Default configuration file (/var/ld/64/ld.config) not found
  Platform:     64-bit MSB SPARCV9
  Default Library Path (ELF):   /lib/64:/usr/lib/64  (system default)
  Trusted Directories (ELF):    /lib/secure/64:/usr/lib/secure/64  (system default)



!! Please be very careful when using crle to expand the linker's search path for a particular application. The change can affect the rest of the system, too !!

Check the existence and content of the configuration files /var/ld/ld.config and /var/ld/64/ld.config.
For details, see the crle(1) man page.

Normally these files do *not* exist, for good reasons. If they do exist, double-check whether the entries in them are *really* required.

An exception is a Solaris 9 branded zone, which has the entries
LD_PRELOAD=/usr/lib/secure/64/s9_preload.so.1     in /var/ld/sparcv9/ld.config
LD_PRELOAD=/usr/lib/secure/s9_preload.so.1          in /var/ld/ld.config
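
A quick existence check of the two configuration files can be sketched as follows; the function name check_ld_config is hypothetical, and the sketch only reports presence without interpreting the file content.

```shell
# Hypothetical helper: report whether each given runtime linker
# configuration file exists.
check_ld_config() {
    for cfg in "$@"; do
        if [ -e "$cfg" ]; then
            echo "present: $cfg"
        else
            echo "absent:  $cfg"
        fi
    done
}

check_ld_config /var/ld/ld.config /var/ld/64/ld.config
```

Any file reported as present should then be reviewed with crle as described above.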

--------------------------------------------------------------------------------


Check the full path of the missing library.

For Solaris 10 and earlier
Look up the library in the /var/sadm/install/contents file, which covers libraries provided by Solaris or by third-party packages.
For each library or binary, the file records its properties, such as permissions, owner, and group, and the package that delivers it.

# /usr/bin/grep lib/libcurses.so.1 /var/sadm/install/contents
/lib/libcurses.so.1 f none 0755 root bin 299848 31377 1153867261 SUNWcslr
/usr/ccs/lib/libcurses.so=../../../lib/libcurses.so.1 s none SUNWcsl
/usr/ccs/lib/libtermcap.so=../../../lib/libcurses.so.1 s none SUNWcsl
/usr/ccs/lib/libtermlib.so=../../../lib/libcurses.so.1 s none SUNWcsl
/usr/lib/libcurses.so=../../lib/libcurses.so.1 s none SUNWcsl
/usr/lib/libcurses.so.1=../../lib/libcurses.so.1 s none SUNWcsl
/usr/lib/libtermcap.so=../../lib/libcurses.so.1 s none SUNWcsl
/usr/lib/libtermcap.so.1=../../lib/libcurses.so.1 s none SUNWcsl
/usr/lib/libtermlib.so=../../lib/libcurses.so.1 s none SUNWcsl
/usr/lib/libtermlib.so.1=../../lib/libcurses.so.1 s none SUNWcsl
/usr/ucblib/libcurses.so.1 f none 0755 root bin 60416 27920 1106443753 SUNWscpu
/usr/xpg4/lib/libcurses.so.1 f none 0755 root bin 204168 58641 1106444532 SUNWcsl
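
To reduce this listing to the regular files and their packages, the entries can be filtered as sketched below. The function name owning_pkg is hypothetical; the sketch assumes that the package name is the last field of an "f" entry, as in the output above. On Solaris 10 it may be necessary to substitute nawk for awk if /usr/bin/awk rejects the -v option.

```shell
# Hypothetical helper: read /var/sadm/install/contents lines on standard
# input and print "<path> <package>" for regular-file entries (type "f")
# whose path ends in "/<library>".
# Assumes the package name is the last field of the entry.
owning_pkg() {
    awk -v lib="$1" '
        $2 == "f" && substr($1, length($1) - length(lib)) == "/" lib {
            print $1, $NF
        }'
}

# Usage on Solaris 10:
#   owning_pkg libcurses.so.1 < /var/sadm/install/contents
```

Symbolic-link entries (type "s") are skipped on purpose, since they only point at one of the listed files.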

For Solaris 11 and later (IPS), for example:

# pkg search  libcurses.so.1
INDEX      ACTION VALUE                               PACKAGE
-- lines omitted --
basename   file   lib/sparcv9/libcurses.so.1          pkg:/system/library@0.5.11-0.175.1.9.0.3.2
basename   file   lib/libcurses.so.1                  pkg:/system/library@0.5.11-0.175.1.9.0.3.2
-- lines omitted --


# pkg contents system/library  | egrep lib/libcurses.so.1
lib/libcurses.so.1
usr/lib/libcurses.so.1
usr/xpg4/lib/libcurses.so.1


--------------------------------------------------------------------------------


Check the package and/or the whole system to verify whether there is a problem with this library and whether there are any other errors.
For Solaris 10 and earlier

# /usr/sbin/pkgchk SUNWcsl
ERROR: /usr/lib/libcurses.so.1 pathname does not exist

# /usr/sbin/pkgchk -n
 ... some other possible error messages or warnings ...

For Solaris 11 (IPS) and later

# pkg verify system/library


--------------------------------------------------------------------------------


Start the command under truss(1) to verify the environment and system calls
(here, the truss output is limited with option '-t' to show only open() system calls, revealing which libraries and files are opened)


# truss -afeld -vall -sall -t open /usr/bin/more  /etc/motd
Base time stamp:  1376150240.6587  [ Sat Aug 10 15:57:20 UTC 2013 ]
6591/1:          0.0000 execve("/usr/bin/more", 0xF90F9B94, 0xF90F9BA0)  argc = 2
6591/1:          argv: /usr/bin/more /etc/motd
6591/1:          envp: LC_MONETARY= TERM=xterm SHELL=/usr/bin/bash
6591/1:           SSH_TTY=/dev/pts/2 LC_ALL= USER=root PAGER=/usr/bin/less -ins
6591/1:           MAIL=/var/mail/root PATH=/usr/bin:/usr/sbin LC_MESSAGES=
6591/1:           LC_COLLATE= PWD=/root LANG=C TZ=localtime SHLVL=1 HOME=/root
6591/1:           LOGNAME=root
6591/1:           LC_TIME= _=/usr/bin/truss
6591/1:          0.0073 open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
6591/1:          0.0076 open("/lib/libc.so.1", O_RDONLY)                = 3
6591/1:          0.0083 open("/lib/libcurses.so.1", O_RDONLY)           = 3
6591/1:          0.0113 open("/usr/share/lib/terminfo//x/xterm", O_RDONLY) = 3
6591/1:          0.0122 open64("/etc/motd", O_RDONLY)                   = 3
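
In a long trace, only the failed open calls are usually of interest. The following sketch filters a saved trace; the function name failed_opens and the file /tmp/truss.out are hypothetical, and the sketch assumes failed calls are marked with "Err#" as in the output above.

```shell
# Hypothetical helper: read saved truss(1) output on standard input and
# print only the open calls that failed (lines carrying an "Err#" code).
failed_opens() {
    awk '/open/ && /Err#/'
}

# Usage: write the trace to a file with truss -o, then filter it.
#   truss -f -t open -o /tmp/truss.out /usr/bin/more /etc/motd
#   failed_opens < /tmp/truss.out
```

An ENOENT for /var/ld/ld.config is expected on systems without that file, as the trace above shows.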

Start the command under truss(1) to verify the environment, system calls, and user-level function calls (option '-u')
(be aware that the output can be very verbose with option '-u')


% truss -afeld -vall -u '*'  /usr/bin/more  /etc/motd
Base time stamp:  1376719127.6085  [ Sat Aug 17 05:58:47 UTC 2013 ]
23378/1:         0.0000 execve("/usr/bin/more", 0xFFBFFC64, 0xFFBFFC70)  argc = 2
23378/1:         argv: /usr/bin/more /etc/motd
-- lines omitted --
23378/1@1:       0.6344 -> libc:atexit(0xff3b51b8, 0x28000, 0x0, 0x0)
23378/1@1:       0.6414 <- libc:atexit() = 0
23378/1@1:       0.6419 -> libc:atexit(0x1608c, 0xff3600c0, 0xff300, 0x0)
23378/1@1:       0.6477 <- libc:atexit() = 0
23378/1@1:       0.6483 -> libc:setlocale(0x6, 0x1609c, 0x0, 0x0)
23378/1@1:       0.6598 <- libc:setlocale() = 0xff29d7ee
23378/1@1:       0.6604 -> libc:getwidth(0x284ac, 0xff29d7ee, 0xff360180, 0x6)
23378/1@1:       0.6650 <- libc:getwidth() = 0x284ac
23378/1@1:       0.6655 -> libc:setlocale(0x6, 0x160a0, 0x28400, 0x1)
23378/1@1:       0.6758 <- libc:setlocale() = 0xff29d7ee
23378/1@1:       0.6763 -> libc:textdomain(0x160a4, 0xff29d7ee, 0xff360180, 0x6)
-- lines omitted --


--------------------------------------------------------------------------------


Collect open() system calls with DTrace to find the locations where the library is expected:

Example:
# devfsadm
devfsadm: dlopen failed: /usr/lib/devfsadm/linkmod/SUNW_scmd_link.so: ld.so.1:
devfsadm: fatal: libclconf.so.1: open failed: No such file or directory

Use two terminals to troubleshoot this:
terminal 1: start the DTrace one-liner or the opensnoop script from the DTraceToolkit, as shown below
terminal 2: start the failing command (e.g. devfsadm)

// simple DTrace one-liner to capture all open() system calls

# dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }'| egrep -i lib
dtrace: description 'syscall::open*:entry ' matched 2 probes
  0   5015                       open:entry devfsadm /lib/libdevinfo.so.1
  0   5015                       open:entry devfsadm /lib/libgen.so.1
  0   5015                       open:entry devfsadm /lib/libsysevent.so.1
  0   5015                       open:entry devfsadm /lib/libnvpair.so.1
  0   5015                       open:entry devfsadm /lib/libcmd.so.1
-- lines omitted  --
 10   5015                       open:entry devfsadm /usr/lib/devfsadm/linkmod/SUNW_cfg_link.so
 10   5015                       open:entry devfsadm /usr/lib/devfsadm/linkmod/SUNW_misc_link.so
 10   5015                       open:entry devfsadm /usr/lib/devfsadm/linkmod/SUNW_dtrace_link.so
 10   5015                       open:entry devfsadm /usr/lib/devfsadm/linkmod/SUNW_audio_link.so

// DTraceToolkit-0.99/opensnoop: a successful open is marked with '0' in the 'ERR' column
// opensnoop is part of the standard Solaris 11 installation (/usr/dtrace/DTT/opensnoop)
// for Solaris 10, opensnoop can be downloaded from http://www.brendangregg.com/DTrace/opensnoop

# DTraceToolkit-0.99/opensnoop -a | egrep lib
TIME           STRTIME                UID    PID  FD ERR PATH                 ARGS
-- lines omitted  --
15816934805797 2013 Aug 19 19:46:20     0   1642   3   0 /lib/libdevinfo.so.1 devfsadm -Cv\0
15816934806157 2013 Aug 19 19:46:20     0   1642   3   0 /lib/libgen.so.1     devfsadm -Cv\0
15816934806489 2013 Aug 19 19:46:20     0   1642   3   0 /lib/libsysevent.so.1 devfsadm -Cv\0
15816934806833 2013 Aug 19 19:46:20     0   1642   3   0 /lib/libnvpair.so.1  devfsadm -Cv\0
15816934807130 2013 Aug 19 19:46:20     0   1642   3   0 /lib/libcmd.so.1     devfsadm -Cv\0
15816934807461 2013 Aug 19 19:46:20     0   1642   3   0 /lib/libdoor.so.1    devfsadm -Cv\0
-- lines omitted  --
15816934832591 2013 Aug 19 19:46:20     0   1642   7   0 /lib/libmeta.so.1    devfsadm -Cv\0
15816934833059 2013 Aug 19 19:46:20     0   1642   7   0 /lib/libadm.so.1     devfsadm -Cv\0
15816934833544 2013 Aug 19 19:46:20     0   1642   7   0 /lib/libefi.so.1     devfsadm -Cv\0
15816934833884 2013 Aug 19 19:46:20     0   1642   7   0 /lib/libuuid.so.1    devfsadm -Cv\0
15816934837327 2013 Aug 19 19:46:20     0   1642   7   0 /usr/lib/devfsadm/linkmod/SUNW_cfg_link.so devfsadm -Cv\0
15816934837864 2013 Aug 19 19:46:20     0   1642   7   0 /usr/lib/devfsadm/linkmod/SUNW_misc_link.so devfsadm -Cv\0
15816934838882 2013 Aug 19 19:46:20     0   1642   7   0 /usr/lib/devfsadm/linkmod/SUNW_dtrace_link.so devfsadm -Cv\0
15816934839375 2013 Aug 19 19:46:20     0   1642   7   0 /usr/lib/devfsadm/linkmod/SUNW_audio_link.so devfsadm -Cv\0
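
Since a failed open carries a non-zero value in the ERR column, saved opensnoop output can be reduced to the failures. This is a sketch only; the function name failed_snoop is hypothetical, and it assumes the column layout of 'opensnoop -a' shown above, where ERR is the ninth whitespace-separated field and PATH the tenth.

```shell
# Hypothetical helper: read saved "opensnoop -a" output on standard input
# and print "<errno> <path>" for opens that failed.
# Assumes ERR is field 9 and PATH is field 10, as in the layout above.
failed_snoop() {
    awk 'NF >= 10 && $9 ~ /^[0-9]+$/ && $9 != 0 { print $9, $10 }'
}

# Usage:
#   DTraceToolkit-0.99/opensnoop -a > /tmp/opensnoop.out    (stop with Ctrl-C)
#   failed_snoop < /tmp/opensnoop.out
```

The header line and all successful opens are dropped, so the missing library path should stand out in the remaining lines.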

--------------------------------------------------------------------------------
If the missing library is part of third-party software or drivers,
 - check with the vendor for a recent version of the software
 - check whether the third-party software or drivers are compatible with the Solaris release and the patches applied on the system.

For Solaris 10, see MOS document 1591106.1 (Items to Consider when Adding a Solaris 10 Package to an Existing System using pkgadd)

--------------------------------------------------------------------------------

References to related blogs and documentation:


https://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the  Avoiding LD_LIBRARY_PATH: The Options
https://blogs.oracle.com/rie/entry/tt_ld_library_path_tt              LD_LIBRARY_PATH - just say no

libc(3LIB)
pvs(1)
ld(1)
ldd(1)
elf(3ELF)

Solution
Perform the appropriate corrections, depending on the results of the above commands.

Because the possible causes of this error message vary widely,
we cannot provide a generic solution.

Installing an additional package is quite easy in Solaris 11 (and later):
http://docs.oracle.com/cd/E20451_01/html/E25304/glqbr.html
      How to Manually Install Components on Oracle Solaris 11

http://docs.oracle.com/cd/E23824_01/html/E21802/gkoks.html
      Adding and Updating Oracle Solaris 11 Software Packages: Fixing Package Problems

Installing or replacing a package is much more difficult in Solaris 10 (and earlier).

 Basic rules for adding or replacing libraries in Solaris 10 (and earlier):
  - do not replace libraries on a live system
  - consider restoring from a recent backup
  - do not mix binaries and libraries from different packages/patches

https://blogs.oracle.com/patch/entry/do_not_apply_packages_from 
Do not apply packages from one Update onto a system installed with a different Update