Thursday 21 October 2021

Oracle background processes

The Oracle database system uses several background processes to perform various tasks and manage the database efficiently. Here are the details of the most important background processes:

PMON (Process Monitor):
    PMON monitors server and background processes and performs process recovery when a process fails abnormally. It cleans up the failed process, rolls back its uncommitted work, releases the locks and other resources it held, and removes its entry from the list of active processes.
 
SMON (System Monitor):
    SMON performs instance (crash) recovery at database startup, applying redo to bring the database to a consistent state. It also coalesces contiguous free extents in dictionary-managed tablespaces and cleans up unused temporary segments, among other tasks.
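Both of these, along with the other background processes, can be listed from the instance itself. A quick sketch (rows in v$bgprocess with a non-null process address are the ones actually running):

```sql
-- List the background processes currently running in this instance
SELECT name, description
FROM   v$bgprocess
WHERE  paddr <> '00'
ORDER  BY name;
```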

DBWn (Database Writer): the number of writer processes is controlled by the db_writer_processes parameter
·Writes modified data blocks from the database buffer cache to the data files on disk.
·Scans the buffer cache and identifies dirty (modified) blocks.
·Writes dirty blocks to their respective data files, using multi-block writes to optimize disk I/O.
·Follows the write-ahead protocol: LGWR must write the corresponding redo to disk before DBWn writes a modified data block.
·Works with the least-recently-used (LRU) lists that manage the buffer cache, writing dirty buffers to make room for new data blocks.
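As a small sketch of tuning the writer count (the value 4 here is only an example; db_writer_processes is static, so the change takes effect after an instance restart):

```sql
-- Check how many DBWn processes are configured
SHOW PARAMETER db_writer_processes

-- Example: raise it to 4 writers (takes effect after restart)
ALTER SYSTEM SET db_writer_processes = 4 SCOPE = SPFILE;
```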

The DBWn process writes data from the database buffer cache to the data files under the following conditions:
  • Checkpoint
  • Free Buffer Scans
  • User Process Request
  • Log Writer (LGWR) Request
  • Full Buffer Cache
  • Database Shutdown

CKPT (Checkpoint Process):
·Initiates the checkpoint process to synchronize the database buffers with data files.
·Signals DBWn to write dirty blocks to the data files, and updates the data file headers with the checkpoint SCN.
·Updates the control file to record the latest checkpoint position.
·Ensures that all changes made up to the checkpoint are recoverable in case of a system failure.
·Helps in reducing the time required for database recovery.

When does a checkpoint occur?
  • Automatic checkpoints, driven by LOG_CHECKPOINT_INTERVAL and FAST_START_MTTR_TARGET
  • Database Shutdown
  • Manual ALTER SYSTEM CHECKPOINT
  • Tablespace or Data File Offline/Online Operations
  • Log Switch
  • Database Recovery
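Most of these triggers can be exercised or tuned from SQL*Plus; a minimal sketch (the 300-second MTTR target is only an example value):

```sql
-- Tune automatic checkpointing via the mean-time-to-recover target (seconds)
ALTER SYSTEM SET fast_start_mttr_target = 300 SCOPE = BOTH;

-- Force a manual checkpoint
ALTER SYSTEM CHECKPOINT;

-- Compare the last checkpoint SCN with the current SCN
SELECT checkpoint_change#, current_scn FROM v$database;
```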

LGWR (Log Writer):
·Writes redo log entries from the redo log buffer to the redo log files on disk.
·Server processes copy redo entries generated by user transactions and background processes into the redo log buffer; LGWR flushes that buffer to disk.
·Writes redo log entries sequentially to redo log files in a circular fashion.
·Uses the write-ahead logging (WAL) technique to ensure transaction durability.
·Writes redo log entries in a timely manner to minimize the risk of data loss during a system failure.
·Coordinates with the checkpoint process (CKPT) to perform consistent database recovery.

When does LGWR write?

  • When a user transaction commits
  • Every three seconds
  • When the redo log buffer is one-third full, or holds about 1 MB of redo
  • Before DBWn writes dirty buffers to disk (write-ahead protocol)
  • During a log switch
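A log switch, one of the triggers above, can also be forced manually; a small sketch:

```sql
-- Current redo log groups and their state
SELECT group#, sequence#, status, archived FROM v$log;

-- Force LGWR to switch to the next redo log group
ALTER SYSTEM SWITCH LOGFILE;
```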

 

ORA-00932: inconsistent datatypes: expected DATE got NUMBER

SQL> insert into table1 values(1,'Jay',2000,01-06-85,11-11-15);

insert into table1 values(1,'Jay',2000,01-06-85,11-11-15)

                                            *

ERROR at line 1:

ORA-00932: inconsistent datatypes: expected DATE got NUMBER


SQL> desc table1

 Name              Null?    Type

 ---------------- -------- ----------------------------

 SNO                            NUMBER(12)

 NAME                           VARCHAR2(32)

 SAL                            NUMBER(10)

 DOB                            DATE

 HIREDATE                       DATE   



Without quotes, 01-06-85 is evaluated as the arithmetic expression 1 - 6 - 85, which yields a NUMBER. Use the TO_DATE function (or a properly quoted date string) to supply a DATE value:


SQL> insert into table1 values(1,'Jay',2000,to_date('01-06-85','DD-MM-YY'),to_date('11-11-15','DD-MM-YY'));

1 row created.


SQL> commit;

Commit complete.
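As an alternative to TO_DATE, an ANSI DATE literal avoids depending on a format mask altogether (this assumes the same table1; the literal must be written as YYYY-MM-DD):

```sql
INSERT INTO table1
VALUES (2, 'Jay', 2000, DATE '1985-06-01', DATE '2015-11-11');
```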

Tablespace creation failed with "master key not yet set"

ORA-28361 "master key not yet set" error while creating a tablespace on an on-premises database:

SQL> create bigfile tablespace USER_DATA01 datafile '+PRD_DATA' size 5G autoextend on next 1g maxsize unlimited
*
ERROR at line 1:
ORA-28361: master key not yet set

Check db parameters:
SQL> show parameter tablespace

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
encrypt_new_tablespaces              string      ALWAYS
undo_tablespace                      string      UNDOTBS1


SQL> alter system set encrypt_new_tablespaces='' scope=both sid='*';
alter system set encrypt_new_tablespaces='' scope=both sid='*'
*
ERROR at line 1:
ORA-00096: invalid value  for parameter encrypt_new_tablespaces, must be from
among DDL, ALWAYS, CLOUD_ONLY

Change the encrypt_new_tablespaces parameter to CLOUD_ONLY. With CLOUD_ONLY, new tablespaces are encrypted only when the database runs in Oracle Cloud, so on-premises tablespace creation no longer requires a TDE master key:
SQL> alter system set encrypt_new_tablespaces='CLOUD_ONLY' scope=both sid='*';
System altered.

The tablespace can now be created without any issues:
SQL> create bigfile tablespace USER_DATA01 datafile '+PRD_DATA' size 5G autoextend on next 1g maxsize unlimited;

Tablespace created.
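To double-check the outcome, the encryption status of the new tablespace can be confirmed from the data dictionary:

```sql
-- ENCRYPTED should show NO for the unencrypted tablespace
SELECT tablespace_name, encrypted
FROM   dba_tablespaces
WHERE  tablespace_name = 'USER_DATA01';
```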

Wednesday 20 October 2021

RMAN duplicate failed with errors RMAN-05501: aborting duplication of target database, RMAN-03015, RMAN-04006, ORA-12514

The RMAN DUPLICATE command failed when multiple channels were allocated through a run block:


$ cat Proddb_standby_build.sh

connect target sys/Password@PRODDB_PRIM

connect auxiliary  sys/Password@PRODDB_OCI_STDBY

run {

allocate channel ch1 type disk;

allocate channel ch2 type disk;

allocate channel ch3 type disk;

allocate channel ch4 type disk;

allocate channel ch5 type disk;

allocate channel ch6 type disk;

allocate auxiliary channel ch7 type disk;

allocate auxiliary channel ch8 type disk;

DUPLICATE TARGET DATABASE FOR STANDBY FROM ACTIVE DATABASE DORECOVER;

}



error details:

$ tail -90f Proddb_standby_build.log


Spooling started in log file: /home/oracle/scripts/Migration/proddb/Proddb_standby_build.log


Recovery Manager 19.10.0.0.0


RMAN>

PL/SQL package SYS.DBMS_BACKUP_RESTORE version 19.08.00.00 in TARGET database is not current

PL/SQL package SYS.DBMS_RCVMAN version 19.08.00.00 in TARGET database is not current

connected to target database: PRODDB (DBID=2143832746)


RMAN>

connected to auxiliary database: PRODDB (not mounted)


RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11>

using target database control file instead of recovery catalog

allocated channel: ch1

channel ch1: SID=1565 instance=proddb1 device type=DISK


allocated channel: ch2

channel ch2: SID=1711 instance=proddb1 device type=DISK


allocated channel: ch3

channel ch3: SID=1778 instance=proddb1 device type=DISK


allocated channel: ch4

channel ch4: SID=1850 instance=proddb1 device type=DISK


allocated channel: ch5

channel ch5: SID=2134 instance=proddb1 device type=DISK


allocated channel: ch6

channel ch6: SID=2205 instance=proddb1 device type=DISK


allocated channel: ch7

channel ch7: SID=2411 device type=DISK


allocated channel: ch8

channel ch8: SID=2477 device type=DISK


Starting Duplicate Db at 19-OCT-21

current log archived


contents of Memory Script:

{

   backup as copy reuse

   passwordfile auxiliary format  '/u02/app/oracle/product/19.0.0.0/dbhome_2/dbs/orapwproddb11'   ;

}

executing Memory Script


Starting backup at 19-OCT-21

Finished backup at 19-OCT-21

duplicating Online logs to Oracle Managed File (OMF) location

duplicating Datafiles to Oracle Managed File (OMF) location


contents of Memory Script:

{

   sql clone "alter system set  control_files =

  ''+DATAC1/PRODDB_OCI/CONTROLFILE/current.328.1079736267'' comment=

 ''Set by RMAN'' scope=spfile";

   backup as copy current controlfile for standby auxiliary format  '+DATAC1/PRODDB_OCI/CONTROLFILE/current.328.1079736267';

   sql clone "alter system set  control_files =

  ''+DATAC1/PRODDB_OCI/CONTROLFILE/current.328.1079736267'' comment=

 ''Set by RMAN'' scope=spfile";

   shutdown clone immediate;

   startup clone nomount;

}

executing Memory Script


sql statement: alter system set  control_files =   ''+DATAC1/PRODDB_OCI/CONTROLFILE/current.328.1079736267'' comment= ''Set by RMAN'' scope=spfile


Starting backup at 19-OCT-21

channel ch1: starting datafile copy

copying standby control file

output file name=+DATAC1/PRODDB_OCI/CONTROLFILE/current.328.1079736267 tag=TAG20211019T015841

channel ch1: datafile copy complete, elapsed time: 00:00:03

Finished backup at 19-OCT-21


sql statement: alter system set  control_files =   ''+DATAC1/PRODDB_OCI/CONTROLFILE/current.328.1079736267'' comment= ''Set by RMAN'' scope=spfile


Oracle instance shut down


released channel: ch1

released channel: ch2

released channel: ch3

released channel: ch4

released channel: ch5

released channel: ch6

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of Duplicate Db command at 10/19/2021 01:59:21

RMAN-05501: aborting duplication of target database

RMAN-03015: error occurred in stored script Memory Script

RMAN-04006: error from auxiliary database: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor



Solution: configure device-type parallelism in RMAN instead of allocating channels manually in the run block, then rerun the duplicate.


Connect to the target and auxiliary databases in RMAN:

rman 

connect target sys/Password@PRODDB_PRIM  

connect auxiliary  sys/Password@PRODDB_OCI_STDBY


Configure parallelism:

CONFIGURE DEVICE TYPE DISK PARALLELISM 8 BACKUP TYPE TO BACKUPSET;


The duplicate now runs without errors:

DUPLICATE TARGET DATABASE FOR STANDBY FROM ACTIVE DATABASE DORECOVER;



Saturday 2 October 2021

Failed to start Oracle OHASD service error while running an 18c Grid Infrastructure patch

An 18c Grid Infrastructure opatchauto apply failed with errors; let's find the root cause of the failure and fix it.


Apply grid patch with opatchauto:

root@racnode01:# opatchauto apply /u01/app/19c_Software/27494830  -oh /u01/app/18.3.0.0/grid


OPatchauto session is initiated at Tue Sep 28 04:08:55 2021


System initialization log file is /u01/app/18.3.0.0/grid/cfgtoollogs/opatchautodb/systemconfig2021-09-28_04-08-58AM.log.


Session log file is /u01/app/18.3.0.0/grid/cfgtoollogs/opatchauto/opatchauto2021-09-28_04-11-20AM.log

The id for this session is 5INE


Executing OPatch prereq operations to verify patch applicability on home /u01/app/18.3.0.0/grid

Patch applicability verified successfully on home /u01/app/18.3.0.0/grid


Bringing down CRS service on home /u01/app/18.3.0.0/grid

CRS service brought down successfully on home /u01/app/18.3.0.0/grid


Start applying binary patch on home /u01/app/18.3.0.0/grid

Binary patch applied successfully on home /u01/app/18.3.0.0/grid


Starting CRS service on home /u01/app/18.3.0.0/grid

Failed to start CRS service on home /u01/app/18.3.0.0/grid


Execution of [GIStartupAction] patch action failed, check log for more details. Failures:

Patch Target : racnode01->/u01/app/18.3.0.0/grid Type[crs]

Details: [

---------------------------Patching Failed---------------------------------

Command execution failed during patching in home: /u01/app/18.3.0.0/grid, host: racnode01.

Command failed:  /u01/app/18.3.0.0/grid/perl/bin/perl -I/u01/app/18.3.0.0/grid/perl/lib -I/u01/app/18.3.0.0/grid/OPatch/auto/dbtmp/bootstrap_racnode01/patchwork/crs/install /u01/app/18.3.0.0/grid/OPatch/auto/dbtmp/bootstrap_racnode01/patchwork/crs/install/rootcrs.pl -postpatch

Command failure output: 

Using configuration parameter file: /u01/app/18.3.0.0/grid/OPatch/auto/dbtmp/bootstrap_racnode01/patchwork/crs/install/crsconfig_params

The log of current session can be found at:

  /app/grid/crsdata/racnode01/crsconfig/crs_postpatch_racnode01_2021-09-28_04-13-13AM.log

2021/09/28 04:13:37 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service'

2021/09/28 04:14:13 CLSRSC-318: Failed to start Oracle OHASD service 


After fixing the cause of failure Run opatchauto resume]

OPATCHAUTO-68061: The orchestration engine failed.

OPATCHAUTO-68061: The orchestration engine failed with return code 1

OPATCHAUTO-68061: Check the log for more details.

OPatchAuto failed.


OPatchauto session completed at Tue Sep 28 04:15:27 2021

Time taken to complete the session 6 minutes, 32 seconds


opatchauto failed with error code 42

 

From patch log file:

2021-09-28 05:47:03: The unit oracle-ohasd.service may not be installed

2021-09-28 05:47:03: isRunning: 0; isEnabled: 0

2021-09-28 05:47:03: remove service file: /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: Removing file /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: Successfully removed file: /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: SYSTEMD: Copying /u01/app/18.3.0.0/grid/crs/install/oracle-ohasd.service to /etc/systemd/system/oracle-ohasd.service

2021-09-28 05:47:03: Executing cmd: /usr/bin/systemctl daemon-reload

2021-09-28 05:47:34: Command output:

>  Failed to execute operation: Connection timed out 

2021-09-28 05:47:34: failed to reload systemd for scanning for changed units

2021-09-28 05:47:34: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 213 'oracle-ohasd.service' '25'

2021-09-28 05:48:47: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 213 'oracle-ohasd.service' '25'

2021-09-28 05:48:47: Command output:

>  CLSRSC-213: Failure in reading file 'oracle-ohasd.service' (error: 25) 

>End Command output

2021-09-28 05:48:47: CLSRSC-213: Failure in reading file 'oracle-ohasd.service' (error: 25)

2021-09-28 05:48:47: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 318

2021-09-28 05:48:47: Executing cmd: /u01/app/18.3.0.0/grid/bin/clsecho -p has -f clsrsc -m 318

2021-09-28 05:48:47: Command output:

>  CLSRSC-318: Failed to start Oracle OHASD service 



The patch log shows "Connection timed out" errors while running the systemctl daemon-reload command, which is the likely reason for the patch failure.

Let's try daemon-reload manually:

root@racnode01:# /usr/bin/systemctl daemon-reload

Failed to execute operation: Connection timed out -- same error reported


Check the OS processes related to systemd:

root@racnode01:# ps -ef | grep systemd

root   1     0  0 Sep20 ?      00:01:07 /usr/lib/systemd/systemd --switched-root --system --deserialize 22


Issue:

Process 1 (systemd) was started with --switched-root and --deserialize; it appears to be stuck trying to reload from a previously saved state.


Solution:

1. A system reboot clears the stale systemd state and allows the patch to apply successfully -- we rebooted the node and resumed patching.

2. Re-executing systemd (systemctl daemon-reexec) may also clear the stale state without a full reboot; note that kill -9 has no effect on PID 1, since the kernel blocks SIGKILL to init.


Resume the failed patch; it now completes successfully:

root@racnode01:# opatchauto resume

OPatchauto session is initiated at Tue Sep 28 23:17:38 2021

Session log file is /u01/app/18.3.0.0/grid/cfgtoollogs/opatchauto/opatchauto2021-09-28_11-17-40PM.log

Resuming existing session with id 5INE

Checking shared status of home.....


Starting CRS service on home /u01/app/18.3.0.0/grid

CRS service started successfully on home /u01/app/18.3.0.0/grid


OPatchAuto successful.

--------------------------------Summary--------------------------------

Patching is completed successfully. Please find the summary as follows:


Host:iadstgracdb44

CRS Home:/u01/app/18.3.0.0/grid

Version:18.0.0.0.0

Summary:


==Following patches were SUCCESSFULLY applied:


Patch: /u01/app/19c_Software/27494830/27494830

Log: /u01/app/18.3.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2021-09-28_04-12-27AM_1.log


OPatchauto session completed at Tue Sep 28 23:29:36 2021

Time taken to complete the session 11 minutes, 59 seconds

root@racnode01:#