Saturday 2 October 2021

Failed to start Oracle OHASD service error while running 18c grid patch

An opatchauto apply of an 18c grid patch failed with errors. Let's find the root cause of the failure and fix it.


Apply grid patch with opatchauto:

root@racnode01:# opatchauto apply /u01/app/19c_Software/27494830  -oh /u01/app/18.3.0.0/grid


OPatchauto session is initiated at Tue Sep 28 04:08:55 2021


System initialization log file is /u01/app/18.3.0.0/grid/cfgtoollogs/opatchautodb/systemconfig2021-09-28_04-08-58AM.log.


Session log file is /u01/app/18.3.0.0/grid/cfgtoollogs/opatchauto/opatchauto2021-09-28_04-11-20AM.log

The id for this session is 5INE


Executing OPatch prereq operations to verify patch applicability on home /u01/app/18.3.0.0/grid

Patch applicability verified successfully on home /u01/app/18.3.0.0/grid


Bringing down CRS service on home /u01/app/18.3.0.0/grid

CRS service brought down successfully on home /u01/app/18.3.0.0/grid


Start applying binary patch on home /u01/app/18.3.0.0/grid

Binary patch applied successfully on home /u01/app/18.3.0.0/grid


Starting CRS service on home /u01/app/18.3.0.0/grid

Failed to start CRS service on home /u01/app/18.3.0.0/grid


Execution of [GIStartupAction] patch action failed, check log for more details. Failures:

Patch Target : racnode01->/u01/app/18.3.0.0/grid Type[crs]

Details: [

---------------------------Patching Failed---------------------------------

Command execution failed during patching in home: /u01/app/18.3.0.0/grid, host: racnode01.

Command failed:  /u01/app/18.3.0.0/grid/perl/bin/perl -I/u01/app/18.3.0.0/grid/perl/lib -I/u01/app/18.3.0.0/grid/OPatch/auto/dbtmp/bootstrap_racnode01/patchwork/crs/install /u01/app/18.3.0.0/grid/OPatch/auto/dbtmp/bootstrap_racnode01/patchwork/crs/install/rootcrs.pl -postpatch

Command failure output: 

Using configuration parameter file: /u01/app/18.3.0.0/grid/OPatch/auto/dbtmp/bootstrap_racnode01/patchwork/crs/install/crsconfig_params

The log of current session can be found at:

  /app/grid/crsdata/racnode01/crsconfig/crs_postpatch_racnode01_2021-09-28_04-13-13AM.log

2021/09/28 04:13:37 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service'

2021/09/28 04:14:13 CLSRSC-318: Failed to start Oracle OHASD service 


After fixing the cause of failure Run opatchauto resume]

OPATCHAUTO-68061: The orchestration engine failed.

OPATCHAUTO-68061: The orchestration engine failed with return code 1

OPATCHAUTO-68061: Check the log for more details.

OPatchAuto failed.


OPatchauto session completed at Tue Sep 28 04:15:27 2021

Time taken to complete the session 6 minutes, 32 seconds


opatchauto failed with error code 42

 

From patch log file:

2021-09-28 05:47:03: The unit oracle-ohasd.service may not be installed

2021-09-28 05:47:03: isRunning: 0; isEnabled: 0

2021-09-28 05:47:03: remove service file: /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: Removing file /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: Successfully removed file: /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: SYSTEMD: Copying /u01/app/18.3.0.0/grid/crs/install/oracle-ohasd.service to /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: Executing cmd: /usr/bin/systemctl daemon-reload

2021-09-28 05:47:34: Command output:

>  Failed to execute operation: Connection timed out 

2021-09-28 05:47:34: failed to reload systemd for scanning for changed units

2021-09-28 05:47:34: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 213 'oracle-ohasd.service' '25'

2021-09-28 05:48:47: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 213 'oracle-ohasd.service' '25'

2021-09-28 05:48:47: Command output:

>  CLSRSC-213: Failure in reading file 'oracle-ohasd.service' (error: 25) 

>End Command output

2021-09-28 05:48:47: CLSRSC-213: Failure in reading file 'oracle-ohasd.service' (error: 25)

2021-09-28 05:48:47: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 318

2021-09-28 05:48:47: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 318

2021-09-28 05:48:47: Command output:

>  CLSRSC-318: Failed to start Oracle OHASD service 



The patch log shows that the systemctl daemon-reload command failed with "Connection timed out". The OHASD service could not be started after that, so this is the likely reason for the patch failure.
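In a long postpatch log, the failing step can be located by printing the last "Executing cmd:" line that precedes the first failure message. The following is a minimal sketch, assuming the log keeps the "<timestamp>: Executing cmd: <command>" layout seen in the excerpt above; the function name first_failed_cmd is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Print the last "Executing cmd:" line seen before the first failure message.
# Assumption: the log uses the same line layout as the excerpt in this post.
first_failed_cmd() {
    awk '
        /Executing cmd:/ { last = $0 }
        /Connection timed out|[Ff]ailed to/ {
            if (last != "") print last
            exit
        }
    ' "$1"
}
```

Called with the session log path printed by rootcrs.pl (for example first_failed_cmd /app/grid/crsdata/racnode01/crsconfig/crs_postpatch_racnode01_2021-09-28_04-13-13AM.log), it should point at the daemon-reload call if the log has the same shape as above.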

Let's run daemon-reload manually:

root@racnode01:# /usr/bin/systemctl daemon-reload

Failed to execute operation: Connection timed out

The same error is reported.
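This check can be done before starting a patch session, so that a hung systemd is found before CRS is brought down. A minimal sketch, assuming the coreutils timeout command is available; SYSTEMCTL and RELOAD_TIMEOUT are hypothetical variables introduced here for illustration, not opatchauto settings.

```shell
#!/bin/sh
# Pre-check sketch: does systemd answer daemon-reload within a deadline?
# SYSTEMCTL and RELOAD_TIMEOUT are illustrative knobs (assumptions).
SYSTEMCTL=${SYSTEMCTL:-/usr/bin/systemctl}
RELOAD_TIMEOUT=${RELOAD_TIMEOUT:-15}

systemd_responsive() {
    rc=0
    # timeout(1) stops the command if it is still running at the deadline.
    timeout "$RELOAD_TIMEOUT" "$SYSTEMCTL" daemon-reload >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "systemd OK"
    else
        echo "systemd NOT responding (exit $rc)"
        return 1
    fi
}
```

Running systemd_responsive as root before opatchauto apply gives an early warning; a non-zero return suggests fixing systemd first.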


Check the OS processes related to systemd:

root@racnode01:# ps -ef | grep systemd

root   1     0  0 Sep20 ?      00:01:07 /usr/lib/systemd/systemd --switched-root --system --deserialize 22


Issue:

Process 1 (systemd) has been running since Sep 20 with --switched-root and --deserialize. It looks like it is stuck reloading its previously saved state, so it no longer answers daemon-reload requests.


Solution: 

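1. Reboot the node. This clears the hung systemd state and all child processes, after which the patch can be applied successfully. This is what we did: we rebooted the node and resumed patching.

2. Without a reboot, sending SIGTERM to process 1 (kill -TERM 1) asks systemd to re-execute itself and may also help. Note that kill -9 has no effect on process 1, because the kernel does not deliver SIGKILL to init.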
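The non-reboot route can be sketched as below. This is only a sketch under stated assumptions: it relies on systemd's documented behaviour that SIGTERM sent to PID 1 makes the system manager serialize its state and re-execute itself, and it was not the method used in this post (we rebooted). DRY_RUN, run and recover_systemd are hypothetical names; with DRY_RUN=1, the default here, the commands are only printed.

```shell
#!/bin/sh
# Sketch of a no-reboot recovery attempt (assumption, not the fix used above).
# DRY_RUN=1 prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_systemd() {
    # 1. Ask systemd (PID 1) to re-execute itself via SIGTERM.
    # 2. Confirm that it answers daemon-reload again.
    # 3. Only then continue the interrupted opatchauto session.
    run kill -TERM 1 &&
    run systemctl daemon-reload &&
    run opatchauto resume
}
```

Reviewing the dry-run output first and then running DRY_RUN=0 recover_systemd as root keeps each step conditional on the previous one, so opatchauto resume is not attempted while daemon-reload still fails.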


Resume the failed patch session. It completes successfully now:

root@racnode01:# opatchauto resume

OPatchauto session is initiated at Tue Sep 28 23:17:38 2021

Session log file is /u01/app/18.3.0.0/grid/cfgtoollogs/opatchauto/opatchauto2021-09-28_11-17-40PM.log

Resuming existing session with id 5INE

Checking shared status of home.....


Starting CRS service on home /u01/app/18.3.0.0/grid

CRS service started successfully on home /u01/app/18.3.0.0/grid


OPatchAuto successful.

--------------------------------Summary--------------------------------

Patching is completed successfully. Please find the summary as follows:


Host:iadstgracdb44

CRS Home:/u01/app/18.3.0.0/grid

Version:18.0.0.0.0

Summary:


==Following patches were SUCCESSFULLY applied:


Patch: /u01/app/19c_Software/27494830/27494830

Log: /u01/app/18.3.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2021-09-28_04-12-27AM_1.log


OPatchauto session completed at Tue Sep 28 23:29:36 2021

Time taken to complete the session 11 minutes, 59 seconds

root@racnode01:# 
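After the resume, it is worth confirming that the clusterware stack is really up. The sketch below reads the output of crsctl check crs on standard input and succeeds only if every CRS-nnnn line ends with "is online". The exact line format is an assumption, and all_crs_online is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if every "CRS-nnnn:" status line on stdin reports "is online".
# Assumption: crsctl check crs prints one "CRS-nnnn: ... is online" line per
# service when the stack is healthy.
all_crs_online() {
    awk '
        /^CRS-[0-9]+:/ {
            seen++
            if ($0 !~ /is online[[:space:]]*$/) bad++
        }
        END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
    '
}
```

For example: /u01/app/18.3.0.0/grid/bin/crsctl check crs | all_crs_online && echo "CRS stack is online"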


