Sunday, 10 August 2025

DML Error Logging

What Is DBMS_ERRLOG?
DBMS_ERRLOG is a built-in Oracle package that helps you manage errors during DML operations (INSERT, UPDATE, DELETE, or MERGE). Its core function is to create and use an error logging table. Instead of failing the entire statement when it encounters a bad row, the database simply logs the error and moves on to the next row. 

How Does It Work?
The process is simple and involves two main steps:
    Create an Error Logging Table: Use the DBMS_ERRLOG.CREATE_ERROR_LOG procedure to create a table that will store information about any rows that fail your DML statement. You only need to do this once for each target table.
    Enable Error Logging in Your DML Statement: Add the LOG ERRORS clause to your INSERT, UPDATE, or DELETE statement. This tells the database to use the error logging table you created.

Let's try it with a sample table:
SQL> CREATE TABLE contacts (
    contact_id NUMBER PRIMARY KEY,
    first_name VARCHAR2(50),
    last_name VARCHAR2(50)
);  

Table created.
Create an error log table for the contacts table using the
DBMS_ERRLOG.CREATE_ERROR_LOG procedure. This only needs to be done once per target table.
SQL> BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG('contacts', 'contacts_err_log');
END;
/  
PL/SQL procedure successfully completed.


Insert data into the contacts table:
SQL> INSERT INTO contacts (contact_id, first_name, last_name) VALUES (1, 'John', 'Doe') LOG ERRORS INTO contacts_err_log ('bulk_insert_test') REJECT LIMIT UNLIMITED;
1 row created.

SQL> INSERT INTO contacts (contact_id, first_name, last_name) VALUES (2, NULL, 'Smith') LOG ERRORS INTO contacts_err_log ('bulk_insert_test') REJECT LIMIT UNLIMITED;
1 row created.

SQL> INSERT INTO contacts (contact_id, first_name, last_name) VALUES (1, 'Jane', 'Doe') LOG ERRORS INTO contacts_err_log ('bulk_insert_test') REJECT LIMIT UNLIMITED;
0 rows created.

SQL> INSERT INTO contacts (contact_id, first_name, last_name) VALUES (3, 'Peter', 'Jones') LOG ERRORS INTO contacts_err_log ('bulk_insert_test') REJECT LIMIT UNLIMITED;
1 row created.

SQL> commit;
Commit complete.

Let's check the error details in the error log table:

SQL> SELECT *  FROM contacts_err_log;
   ORA_ERR_NUMBER$ ORA_ERR_MESG$                                                            ORA_ERR_ROWID$         OR ORA_ERR_TAG$                 CONTACT_ID               FIRST_NAME     LAST_NAME
------------------ ------------------------------------------------------------------------ ---------------------- -- ---------------------------- ------------------------ -------------- ----------------
                 1 ORA-00001: unique constraint (SYS.SYS_C008882) violated                                         I  bulk_insert_test             1                        Jane           Doe


The output will show you the specific error codes and messages for each failed row, allowing you to easily identify and correct the data issues.
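The same clause shines in set-based loads, where a single bad row would otherwise roll back thousands of good ones. A hedged sketch (the staging table, tag name, and cleanup step are illustrative, not from the example above):

```sql
-- Hypothetical set-based load: good rows land in CONTACTS,
-- bad rows land in CONTACTS_ERR_LOG tagged 'nightly_load'.
INSERT INTO contacts (contact_id, first_name, last_name)
SELECT contact_id, first_name, last_name
FROM   contacts_staging
LOG ERRORS INTO contacts_err_log ('nightly_load')
REJECT LIMIT UNLIMITED;

-- Review the failures for this load only...
SELECT ora_err_number$, ora_err_mesg$, contact_id
FROM   contacts_err_log
WHERE  ora_err_tag$ = 'nightly_load';

-- ...and clear them once the source data is corrected.
DELETE FROM contacts_err_log WHERE ora_err_tag$ = 'nightly_load';
```

The tag makes it easy to separate failures from different batch runs sharing one error log table.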


Sunday, 20 July 2025

Why Local Indexes Stay Usable in Oracle Partitioned/Subpartitioned Tables

Understanding the Players: Partitioning and Indexes
Before we jump into TRUNCATE, let's quickly recap the two stars of our show:
    Partitioning: Imagine your massive database table as a giant book. Partitioning is like dividing that book into smaller, more manageable chapters (partitions) and even sections within those chapters (subpartitions). This helps with performance, maintenance, and data lifecycle management.
    Indexes: Indexes are like the index at the back of our book. They help Oracle quickly find specific rows without scanning the entire table.
Now, when you have a partitioned table, you can have two main types of indexes:
    Global Indexes: These are like a single, comprehensive index for the entire book. They don't care about your chapters; they just list every keyword and its page number, no matter which chapter it's in.
    Local Indexes: These are the game-changers for partitioned tables. With a local index, every chapter (or sub-chapter) gets its own mini-index: "Chapter 1 Index," "Chapter 2 Index," and so on. Critically, the local index is partitioned exactly the same way as its table. This means the index entries for data in Chapter 1 reside only in the Chapter 1 Index.

Power of TRUNCATE
The TRUNCATE command is a Data Definition Language (DDL) operation. Unlike DELETE (which logs every row deletion and generates undo), TRUNCATE rapidly deallocates all space used by the data in a table or partition/subpartition. It essentially "resets" the high-water mark, making it incredibly efficient for emptying large segments.

Why TRUNCATE Keeps Local Indexes Usable
This is where the magic happens! When you TRUNCATE a subpartition (or a partition):
 Direct Correspondence: Because a local index subpartition is explicitly linked, one-to-one, with its corresponding table subpartition, Oracle knows exactly which part of the index needs to be cleared.
 Atomic Operation: When you issue TRUNCATE SUBPARTITION, Oracle doesn't just clear the data. It simultaneously and atomically truncates the associated local index subpartition. The index entries belonging to that specific data segment are removed right alongside the data.
 No Inconsistency: Since the index entries are removed from their respective local index subpartition, there are no "dangling pointers" or inconsistencies. The local index remains valid and perfectly synchronized with its remaining data. Other local index subpartitions (for untouched data) are completely unaffected.
 No Rebuild Needed: This is the best part! Because the local index subpartition is reset and ready, there's no need for a costly ALTER INDEX REBUILD operation. Your index is immediately usable for any new data inserted into the truncated subpartition.

Contrast this with Global Indexes: If you TRUNCATE a subpartition when you have a global index, that global index will become UNUSABLE. Why? Because it contains entries from all subpartitions. When one goes away, the global index is left with pointers to non-existent data, making it inconsistent and requiring a full rebuild.
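The contrast can be sketched with a small hypothetical table (all names here are illustrative):

```sql
-- Hypothetical list-partitioned table with one local and one global index.
CREATE TABLE sales (
    sale_id NUMBER,
    region  VARCHAR2(10),
    amount  NUMBER
)
PARTITION BY LIST (region) (
    PARTITION p_east VALUES ('EAST'),
    PARTITION p_west VALUES ('WEST')
);

CREATE INDEX sales_region_ix ON sales (region) LOCAL;
CREATE INDEX sales_amount_ix ON sales (amount);   -- global, non-partitioned

ALTER TABLE sales TRUNCATE PARTITION p_east;

-- The local index partition for P_EAST stays USABLE, while the global
-- index goes UNUSABLE (unless UPDATE INDEXES was added to the TRUNCATE).
SELECT index_name, status
FROM   user_indexes
WHERE  table_name = 'SALES';
```

Adding `UPDATE INDEXES` to the TRUNCATE keeps the global index usable at the cost of extra work during the operation.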

Sunday, 13 July 2025

What is "enq: TX - index contention"?

At its core, "enq: TX - index contention" signifies that multiple sessions are trying to acquire a Transaction (TX) enqueue on the same index block concurrently. An enqueue is a locking mechanism used by Oracle to protect shared resources. When sessions are waiting on a TX enqueue for an index, it means they are effectively "stuck" trying to modify (insert, update, or delete) rows that would impact the same small portion of an index.  

Think of it like this: Imagine a very popular book in a library. Many people want to borrow it, but there's only one copy. Everyone lines up, waiting for their turn. In our database analogy, the "book" is a specific index leaf block, and the "people" are concurrent transactions trying to write to it. The "line" is the "enq: TX - index contention" wait.

The Root Causes: Why Does This Happen?
"enq: TX - index contention" isn't random; it stems from specific access patterns that create "hot spots" within your indexes.

1. The Monotonically Increasing Key (The Most Common Culprit)
This is by far the leading cause. When you have an index built on a column whose values are always increasing, like:
    Sequence-generated Primary Keys (e.g., ID NUMBER DEFAULT MY_SEQ.NEXTVAL)
    DATE or TIMESTAMP columns using SYSDATE / SYSTIMESTAMP
    Application-generated unique identifiers that increment sequentially

Every new INSERT operation will attempt to add a new entry to the "right-hand side" or the highest value end of the index. If you have many sessions inserting concurrently, they all converge on the same few index leaf blocks at the "end" of the index. This creates a severe bottleneck as sessions contend for exclusive access to these blocks, leading to:
    Frequent Block Splits: As the hot block fills up, Oracle has to split it into two blocks to make room for new entries. This is an expensive operation and exacerbates contention.
    Increased Buffer Cache Activity: The hot block is constantly being read into and written from the buffer cache.
    High Interconnect Traffic (in RAC): In a RAC environment, these hot blocks are constantly "ping-ponging" between instances as different nodes try to insert into them, leading to high "gc buffer busy" waits in addition to "enq: TX" waits.

Example Scenario:
Imagine an ORDER table with ORDER_ID as a primary key, populated by ORDER_SEQ.NEXTVAL. In a busy e-commerce application, hundreds of orders are placed per second. Each INSERT INTO ORDERS (ORDER_ID, ...) attempts to write a new, higher ORDER_ID into the ORDER_PK index. All these inserts bottleneck on the same few index leaf blocks, resulting in "enq: TX - index contention."
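A minimal sketch of that scenario (table and sequence names are hypothetical; a sequence as a column DEFAULT assumes Oracle 12c or later):

```sql
-- Every concurrent session inserts ORDER_SEQ.NEXTVAL, so all of them
-- converge on the rightmost leaf block of the ORDERS_PK index.
CREATE SEQUENCE order_seq CACHE 20;

CREATE TABLE orders (
    order_id   NUMBER DEFAULT order_seq.NEXTVAL,
    created_at TIMESTAMP DEFAULT SYSTIMESTAMP,
    CONSTRAINT orders_pk PRIMARY KEY (order_id)
);

-- Run this from many sessions simultaneously to reproduce the hot block:
INSERT INTO orders (order_id) VALUES (order_seq.NEXTVAL);
COMMIT;
```

Under load, AWR would show "enq: TX - index contention" waits concentrated on the ORDERS_PK segment.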

2. Heavy Deletes Followed by Inserts (Index Fragmentation)
While less common than monotonic keys, this can also contribute. If you perform large-scale DELETE operations on a table, the index entries are marked as deleted but the space isn't immediately reclaimed. Subsequent INSERT operations might then try to reuse this freed space. If many concurrent sessions are attempting to insert into these "fragmented" or "sparse" index blocks, it can lead to contention as they fight for ITL (Interested Transaction List) slots or space within those blocks.

3. Low PCT_FREE for the Index
PCT_FREE specifies the minimum percentage of free space reserved in each data block for future updates. While more critical for data blocks, a very low PCT_FREE on an index could theoretically contribute to contention if updates to existing index entries cause the blocks to become excessively full, leading to more frequent splits or difficulty in allocating ITL slots for concurrent transactions. This is a less frequent direct cause but can be a contributing factor.

The Impact: How Does It Hurt Performance?
The consequences of "enq: TX - index contention" are significant:
    Reduced Throughput: Transactions wait, slowing down overall DML operations.
    Increased Response Times: User experience suffers as applications become sluggish.
    High CPU Utilization: Index block splits, consistent read generation, and frequent buffer cache operations consume more CPU.
    Increased I/O (Indirectly): While primarily a CPU/contention issue, increased block splits can indirectly lead to more I/O.
    RAC-Specific Woes: In RAC, this contention translates directly to high "gc buffer busy" waits, as instances spend excessive time transferring the same hot blocks back and forth across the interconnect.

The Fixes: Solutions to Alleviate Index Contention
Once you've identified "enq: TX - index contention" as a problem (typically via AWR reports showing high waits for this event and pinpointing the exact index), here are the most effective strategies:

Solution 1: Reverse Key Indexes (The Go-To for Monotonic Keys)
    A reverse key index physically reverses the byte order of the column's values before storing them in the index. This effectively scrambles the sequence of monotonically increasing keys, distributing them randomly across the index's leaf blocks instead of concentrating them at one end.
    When to Use: Ideal for indexes on sequence-generated primary keys or other monotonically increasing values where range scans are not critical.
    -- Drop the original index if it exists (if it backs a primary key
    -- constraint, drop or disable the constraint first)
    DROP INDEX my_table_pk;
    -- Create a reverse key index
    CREATE UNIQUE INDEX my_table_pk ON my_table (id) REVERSE;
    Pros: Highly effective at eliminating hot blocks and reducing contention for inserts.
    Cons: Significantly degrades performance for index range scans (e.g., WHERE ID BETWEEN 100 AND 200). This is because the values are no longer stored in logical order. You would need to perform a full index scan or a full table scan for such queries, which can be much slower.
    Consideration: Use this only if your application primarily performs direct lookups (WHERE ID = 123) and high-volume inserts. If range scans are frequent, look at other solutions first.

Example:
If ORDER_ID is a sequence, creating CREATE UNIQUE INDEX ORDER_PK ON ORDERS (ORDER_ID) REVERSE; would distribute new order inserts across the index. Queries like SELECT * FROM ORDERS WHERE ORDER_ID = 54321; would still be efficient, but SELECT * FROM ORDERS WHERE ORDER_ID BETWEEN 1000 AND 2000; would likely be slow.

Solution 2: Hash Partitioned Global Indexes (Requires Partitioning License)
    Concept: If you have the Oracle Partitioning option, you can create a Global Hash Partitioned Index. Oracle applies a hash function to the index key, distributing the index entries across multiple partitions. This effectively spreads out the hot spot caused by monotonic keys over multiple physical index segments.
    When to Use: When you need the benefits of distributing contention but also require efficient range scans (which reverse key indexes sacrifice).

    -- Assuming MY_TABLE is already hash partitioned by some column,
    -- or you just want to partition the index itself.
    CREATE UNIQUE INDEX my_table_pk ON my_table (id)
    GLOBAL PARTITION BY HASH (id)
    (
        PARTITION p1,
        PARTITION p2,
        PARTITION p3,
        PARTITION p4
    );

    Pros: Excellent for distributing inserts and maintaining good range scan performance.
    Cons: Requires the Oracle Partitioning option (additional license). Can be more complex to manage than simple indexes.

Solution 3: Increase Sequence CACHE Size
    If your index key is derived from a sequence (e.g., MY_SEQ.NEXTVAL), increasing the CACHE size for that sequence can help. Instead of fetching one number at a time, each session (or instance in RAC) can fetch a block of numbers (e.g., 100 or 1000). This reduces the contention on the sequence object itself. More importantly, in RAC, using CACHE with NOORDER allows different instances to use different ranges of sequence numbers simultaneously, indirectly distributing the inserts across the index.
    When to Use: Always consider this if sequences are generating your contended keys.
    ALTER SEQUENCE my_schema.my_sequence CACHE 1000 NOORDER;
    (Adjust 1000 based on your concurrency, often 10000 or more for very high volumes).
    Pros: Simple to implement, no impact on index scan performance.
    Cons: Does not directly address the index block contention but reduces pressure on the sequence and can help distribute inserts if combined with other factors like different session start times. It's more of a complementary solution.

Solution 4: Rebuild/Shrink Indexes After Large Deletes
If your contention arises after massive DELETE operations (which leave "holes" in the index), rebuilding or shrinking the index reclaims the space and reorganizes the index structure. This can reduce contention by ensuring blocks are full and contiguous, minimizing splits when subsequent inserts occur.
    When to Use: After significant data purges.

    ALTER INDEX my_schema.my_index REBUILD ONLINE; -- Rebuilds index
    -- OR
    ALTER INDEX my_schema.my_index SHRINK SPACE; -- Shrinks index (less disruptive)

    Pros: Reclaims space, improves index efficiency, can reduce future block splits.
    Cons: Rebuilds can be resource-intensive and lock the index (though ONLINE mitigates this).

Solution 5: Increase PCT_FREE for the Index (Less Common Primary Fix)
PCT_FREE reserves space within blocks. If index blocks are consistently filling up too fast, increasing PCT_FREE provides more space for new entries and ITL slots, potentially reducing immediate contention.

When to Use: If analysis shows contention is due to rapid block filling and ITL waits.
    ALTER INDEX my_schema.my_index REBUILD PCTFREE 20; -- Default is 10; for an index, PCTFREE can only be changed via REBUILD
    Pros: Simple change.
    Cons: Can increase index size (less efficient storage), and often not the primary solution for monotonic key contention.


Monday, 30 June 2025

Oracle Indexes: What BLEVEL, LEAF_BLOCKS, and CLUSTERING_FACTOR Really Tell You

BLEVEL: The GPS Navigation System
Imagine you're trying to find a book in a massive library. BLEVEL tells you how many floors you need to climb to reach your destination.

What BLEVEL really means:
- 0: You're already on the right floor (tiny indexes)
- 1: One elevator ride to the correct section (most common)
- 2+: Multiple elevator rides and stair climbs (very large tables)

Real Story:  BLEVEL is your index's "navigation depth." It counts how many levels Oracle must traverse from the root block to reach your data. Think of it as the number of "hops" your query makes through the index structure.
When to Worry:  A BLEVEL that suddenly jumps from 1 to 3 without a corresponding data increase might signal index fragmentation. But don't panic—modern Oracle versions handle this pretty well.

LEAF_BLOCKS: The Storage Bill
LEAF_BLOCKS is straightforward: it's how much disk space your index is consuming at the leaf level.
The Simple Truth:  More data = more leaf blocks = bigger index size.

Performance Reality Check:
- Single-row lookups: Don't care much about leaf blocks (you only read one)
- Range scans: Care a lot (you read multiple blocks)

Pro Tip: If your LEAF_BLOCKS is growing faster than your table data, you might be over-indexing or indexing the wrong columns.

CLUSTERING_FACTOR: The Performance Crystal Ball
This is where things get interesting. CLUSTERING_FACTOR is like a fortune teller for your query performance.
CLUSTERING_FACTOR = Number of times Oracle switches table blocks when reading index entries in order

What the Numbers Mean:
- Low CF (close to LEAF_BLOCKS): Your data is well-organized—like books sorted by author on library shelves
- High CF (close to table NUM_ROWS): Your data is scattered—like books randomly placed throughout the library

The Performance Impact: Range scans become expensive when Oracle has to jump between hundreds of different table blocks instead of reading consecutive blocks.

## Let's Build a Real Example

Step 1: Create Our Test Environment
CREATE TABLE employees_test (
    employee_id NUMBER(6) PRIMARY KEY,
    first_name VARCHAR2(20),
    last_name VARCHAR2(25),
    email VARCHAR2(25),
    phone_number VARCHAR2(20),
    hire_date DATE,
    job_id VARCHAR2(10),
    salary NUMBER(8,2),
    commission_pct NUMBER(2,2),
    manager_id NUMBER(6),
    department_id NUMBER(4)
);

-- Populate with 100,000 employees
BEGIN
    FOR i IN 1..100000 LOOP
        INSERT INTO employees_test VALUES (
            i,                                    -- Sequential employee_id (good clustering)
            'FirstName' || i,
            'LastName' || i,
            'email' || i || '@example.com',
            '555-123-' || LPAD(i, 4, '0'),
            SYSDATE - DBMS_RANDOM.VALUE(1, 365*5),
            CASE MOD(i, 5) WHEN 0 THEN 'IT_PROG' ELSE 'SA_REP' END,
            ROUND(DBMS_RANDOM.VALUE(3000, 15000), 2),
            ROUND(DBMS_RANDOM.VALUE(0, 0.4), 2),
            CASE WHEN MOD(i, 100) = 0 THEN NULL 
                 ELSE ROUND(DBMS_RANDOM.VALUE(100, 200)) END,
            CASE MOD(i, 3) WHEN 0 THEN 90 
                           WHEN 1 THEN 60 
                           ELSE 30 END            -- Scattered department_id (poor clustering)
        );
    END LOOP;
    COMMIT;
END;
/

-- Create our test index
CREATE INDEX idx_emp_dept_id ON employees_test (department_id);
-- Gather fresh statistics
EXEC DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME => USER, 
    TABNAME => 'EMPLOYEES_TEST', 
    CASCADE => TRUE
);

Step 2: The Moment of Truth
Now let's see what our indexes look like:
-- Check our table stats
SELECT table_name, blocks, num_rows  FROM dba_tables  WHERE table_name = 'EMPLOYEES_TEST';
TABLE_NAME                           BLOCKS   NUM_ROWS
-------------------------------- ---------- ----------
EMPLOYEES_TEST                         1467     100000

-- Check our index stats
SELECT index_name, table_name, CLUSTERING_FACTOR,blevel, leaf_blocks, num_rows, distinct_keys FROM user_indexes WHERE table_name = 'EMPLOYEES_TEST';
INDEX_NAME                       TABLE_NAME                       CLUSTERING_FACTOR     BLEVEL LEAF_BLOCKS   NUM_ROWS DISTINCT_KEYS
-------------------------------- -------------------------------- ----------------- ---------- ----------- ---------- -------------
SYS_C008470                      EMPLOYEES_TEST                                1464          1         187     100000        100000
IDX_EMP_DEPT_ID                  EMPLOYEES_TEST                                4392          1         196     100000             3

## Decoding the Results
Good Guy:  Primary Key Index (SYS_C008470)
BLEVEL = 1:  Perfect! One hop to reach any employee ID
LEAF_BLOCKS = 187:  Reasonable size for 100,000 unique values
CLUSTERING_FACTOR = 1464:  Excellent! Very close to table blocks (1467)
Why This Rocks:  Employee IDs were inserted sequentially (1, 2, 3...), so the physical table order matches the index order. When you query `WHERE employee_id BETWEEN 1000 AND 1050`, Oracle reads consecutive table blocks—like reading pages in a book.
Troublemaker:  Department Index (IDX_EMP_DEPT_ID)
BLEVEL = 1: Same efficient navigation
LEAF_BLOCKS = 196: Slightly larger (non-unique values)
CLUSTERING_FACTOR = 4392: Uh-oh! Much higher than table blocks
Why This Hurts: Department IDs are scattered (90, 60, 30, 90, 60, 30...). A query like `WHERE department_id = 90` will find all matching rows in the index, but then jump all over the table to fetch them—like trying to read a book where every page is in a different room.

## Performance Showdown
Let's see these differences in action with real explain plans:
### Fast Query (Good Clustering)
SELECT * FROM employees_test WHERE employee_id BETWEEN 1000 AND 1050;
Explain Plan:
Plan hash value: 240947081
------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                |    52 |  5200 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES_TEST |    52 |  5200 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SYS_C008470    |    52 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
   2 - access("EMPLOYEE_ID">=1000 AND "EMPLOYEE_ID"<=1050)

Scorecard:
- Cost: 3 - Lightning fast
- INDEX RANGE SCAN - Oracle loves this index
- 52 rows retrieved efficiently

### Slow Query (Poor Clustering)

SELECT * FROM employees_test  WHERE department_id = 90;
Explain Plan:
Plan hash value: 304096605
------------------------------------------------------------------------------------
| Id  | Operation         | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                | 33333 |  3255K|   400   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMPLOYEES_TEST | 33333 |  3255K|   400   (1)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
   1 - filter("DEPARTMENT_ID"=90)

Scorecard:
- Cost: 400 - 133x slower!
- TABLE ACCESS FULL - Oracle completely ignored the index!
- 33,333 rows processed with full table scan

Plot Twist:  Why Oracle Ignored the Index
Here's the fascinating part:  Oracle's optimizer looked at the department index and said, "No thanks, I'll do a full table scan instead."

Math Behind the Decision:
- Department 90:  ~33,333 rows (1/3 of 100,000)
- Clustering Factor:  4,392 (very high)
- Table Blocks:  1,467
Oracle calculated that using the index would mean jumping to a different table block for almost every row: roughly CLUSTERING_FACTOR × selectivity ≈ 4,392 / 3 ≈ 1,464 random single-block reads, plus the index blocks themselves. A full table scan (1,467 blocks read sequentially with efficient multiblock I/O) was actually cheaper! 

Lesson: High clustering factors can make indexes so inefficient that the optimizer prefers full table scans. This is why monitoring CLUSTERING_FACTOR is crucial.
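One way to spot such indexes proactively is a quick data dictionary query. This is a rough heuristic sketch, not a hard rule; the threshold and table filter are illustrative:

```sql
-- Flag indexes whose CLUSTERING_FACTOR is far above the table's block
-- count: the closer CF gets to NUM_ROWS, the worse range scans will be.
SELECT i.index_name,
       i.clustering_factor,
       t.blocks,
       t.num_rows,
       ROUND(i.clustering_factor / NULLIF(t.blocks, 0), 1) AS cf_per_block
FROM   user_indexes i
JOIN   user_tables  t ON t.table_name = i.table_name
WHERE  t.table_name = 'EMPLOYEES_TEST'
ORDER  BY cf_per_block DESC;
```

For our example, IDX_EMP_DEPT_ID would show a cf_per_block around 3, while the primary key index would sit near 1.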

Sunday, 15 June 2025

Global Hash-Partitioned Index

What is a Global Hash-Partitioned Index?
A global index is a single index structure that spans across all partitions of a partitioned table. It's not tied to the table's partitioning strategy.

Hash Partitioning:
The index itself is divided into a fixed number of partitions using a hash algorithm on the index key. This aims to distribute index entries evenly across these index partitions.
Independence from Table Partitioning: The partitioning of a global index does not necessarily align with the partitioning of the underlying table. You can have a range-partitioned table with a hash-partitioned global index, for example.
Supports Unique Constraints: Global indexes can enforce unique constraints across the entire table, even if the table is partitioned. This is a key advantage over local indexes in certain scenarios where the unique key doesn't include the table's partition key.

How do Global Hash-Partitioned Indexes Improve Performance?
Global hash-partitioned indexes are particularly beneficial in specific scenarios, primarily by:
Reducing Contention for "Hot Blocks" (Especially during Inserts):
In OLTP (Online Transaction Processing) environments, where there are a high number of concurrent inserts into a table, especially when the indexed column is a sequence-generated primary key (e.g., an IDENTITY column or a sequence NEXTVAL), a standard B-tree index can become a "hot spot." This is because new entries are constantly being added to the "rightmost" leaf block of the index, leading to contention for that single block.

Hash partitioning distributes these new index entries across multiple index partitions (and thus multiple physical blocks) based on the hash of the key. This spreads the insert workload, reducing contention, "buffer busy waits," and improving overall insert throughput. This is often referred to as addressing the "90-10 split problem."
Improving Point Queries:
For equality queries (e.g., WHERE index_column = :value), the database can quickly calculate the hash of the value and go directly to the specific index partition where that value's entry is located. This allows for efficient access to data, often with fewer logical I/Os compared to a non-partitioned index on a very large table, as the index tree to traverse within a partition is smaller.
Facilitating Parallelism:
While not as inherently parallel for range scans as local non-prefixed indexes, global hash-partitioned indexes can still benefit from parallel query execution for certain operations, as the workload can be distributed across the index partitions.
Managing Large Indexes:
Partitioning a very large index into smaller, more manageable pieces can simplify maintenance tasks (e.g., rebuilding a single partition instead of the entire index). Note, however, that global indexes require extra maintenance when table partitions are dropped or added: the global index may become unusable and need rebuilding, or the UPDATE INDEXES clause, to stay valid.

When to Consider Using Global Hash-Partitioned Indexes:

High-volume OLTP systems: Especially when suffering from "hot block" contention on indexes due to sequential key generation.
Indexes on non-partitioning key columns: When you need a unique constraint or fast access on columns that are not part of the table's partitioning key.
Queries primarily involving equality (=) or IN list predicates: These types of queries can leverage partition pruning on the index. Range predicates (like BETWEEN, >, <) generally won't benefit from hash partition pruning on the index side.
When even data distribution is crucial: Hash partitioning aims to distribute data evenly, which can be beneficial for load balancing. For optimal distribution, the number of hash partitions should ideally be a power of 2 (e.g., 2, 4, 8, 16, 32).
When the partitioning key column has high cardinality and is fairly unique.


Potential Drawbacks and Considerations:

No Partition Pruning for Range Scans: A significant limitation is that global hash-partitioned indexes typically do not benefit from partition pruning for range queries. If your queries frequently use BETWEEN, >, or < on the indexed column, a different index strategy (like a global range-partitioned index or a local index if applicable) might be more suitable.
Maintenance Overhead: When table partitions are dropped or truncated, global indexes (both hash and range) can become UNUSABLE and require rebuilding or an UPDATE INDEXES clause during the table partition operation. This can lead to downtime for large indexes. Local indexes are often preferred for tables where partitions are frequently added/dropped (e.g., historical data).
Choosing the Right Partitioning Key: Just like with table partitioning, selecting an appropriate partitioning key for the index is crucial for achieving even data distribution and performance benefits. A poor choice can lead to skewed data and "hot" index partitions, negating the benefits.
Complexity: Managing partitioned indexes adds a layer of complexity to database administration.
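The maintenance point above can be sketched with a partition maintenance operation (table and partition names here are hypothetical):

```sql
-- Without UPDATE INDEXES, any global index on SALES would go UNUSABLE
-- after this DROP and would need an explicit REBUILD.
ALTER TABLE sales DROP PARTITION p_2024_q1 UPDATE INDEXES;

-- Verify that the global indexes stayed valid:
SELECT index_name, status
FROM   user_indexes
WHERE  table_name = 'SALES';
```

UPDATE INDEXES keeps the indexes usable at the cost of doing the index maintenance inside the DDL, which takes longer than a bare DROP PARTITION.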

Saturday, 31 May 2025

Understanding GV$INSTANCE_PING: RAC Interconnect Performance Monitoring

In an Oracle Real Application Clusters (RAC) environment, maintaining optimal interconnect performance is crucial for database operations. The GV$INSTANCE_PING view serves as a vital tool for Database Administrators (DBAs) to monitor and analyze the latency between RAC instances. This blog post explores how this view helps DBAs ensure optimal RAC performance.

What is GV$INSTANCE_PING?
GV$INSTANCE_PING is a dynamic performance view that provides detailed information about the measured latency of the interconnect between all instances in an Oracle RAC environment. Every few seconds, the PING process of each instance checks the response time of the interconnect to all other instances in the same database.

Key Features and Measurements
The view tracks two types of messages:
1. 500-byte messages (500B): smaller test messages
2. 8-kilobyte messages (8K): larger test messages

For each message type, the view provides comprehensive metrics:
Current Performance Metrics
- CURRENT_500B and CURRENT_8K: Latest response times in microseconds
- AVERAGE_500B and AVERAGE_8K: Average response times since instance startup
- MAX_500B and MAX_8K: Maximum observed response times

Cumulative Statistics
- COUNT_500B and COUNT_8K: Number of measurements taken
- WAIT_TIME_500B and WAIT_TIME_8K: Total accumulated response times
- WAIT_TIME_SQUARED_500B and WAIT_TIME_SQUARED_8K: Sum of squared response times (useful for statistical analysis)

Let's analyze an actual output from a 4-node RAC environment:
SQL> select * from GV$INSTANCE_PING;
   INST_ID   INSTANCE CURRENT_500B AVERAGE_500B   MAX_500B COUNT_500B WAIT_TIME_500B WAIT_TIME_SQUARED_500B CURRENT_8K AVERAGE_8K     MAX_8K   COUNT_8K WAIT_TIME_8K WAIT_TIME_SQUARED_8K     CON_ID
---------- ---------- ------------ ------------ ---------- ---------- -------------- ---------------------- ---------- ---------- ---------- ---------- ------------ -------------------- ----------
         2          1            7            7        250     370458        2924730                   1253          8          8        941     370458      3380206                 2547          0
         2          2            3            3       2249     370458        1217885                   5695          4          4        551     370458      1539529                  881          0
         2          3            7            7        304     370457        2887515                   1393          8          8      14780     370457      3389103               220212          0
         2          4            7            7        349     370458        2847595                   1298         10          8        515     370458      3313461                 1551          0
         3          1            8            7        227     370454        2952602                   1285          8          8        919     370454      3370483                 2855          0
         3          2            8            7      17647     370454        2901210                 314283          8          8       2206     370454      3416102                 6594          0
         3          3            3            3        531     370454        1226169                    760          4          3        611     370454      1557469                 1527          0
         3          4            7            7      17321     370454        2922328                 326255          8          8        617     370454      3322511                 1796          0
         4          1            7            7       1967     370454        2905185                   4652          8          8       1683     370454      3344914                 4081          0
         4          2            8            7        443     370454        2877140                   1129         22          8        709     370454      3396211                 1514          0
         4          3            7            7       1478     370454        2893488                   3060          8          9        367     370454      3372591                 1441          0
         4          4            3            3        749     370455        1211323                    790          4          4        199     370455      1517458                  421          0
         1          1            4            3        458     370459        1215589                    732          4          3        569     370459      1539772                 1330          0
         1          2            7            7        208     370458        2921436                   1109         10          9        561     370458      3448708                 1981          0
         1          3           15            7        151     370458        2929392                   1054          8          8        214     370458      3420401                 1464          0
         1          4            7            7        234     370459        2902809                   1181          8          8        203     370459      3362650                 1415          0

16 rows selected.


Key Observations from the Data:
1. Instance 2's Communication Patterns
   - Current 500B latency to Instance 1: 7 microseconds
   - Current 500B latency to Instance 2: 3 microseconds (self-ping)
   - Current 500B latency to Instance 3: 7 microseconds
   - Current 500B latency to Instance 4: 7 microseconds

2. Performance Characteristics
   - Average 500B latency is consistently 7 microseconds for most connections
   - Self-ping (Instance 2 to 2) shows better performance at 3 microseconds
   - Maximum latencies show some spikes:
     * 2249 microseconds to Instance 2
     * 304 microseconds to Instance 3
     * 349 microseconds to Instance 4

3. 8K Message Performance
   - Current 8K latencies range from 4 to 10 microseconds
   - Maximum 8K latency shows significant variation:
     * 941 microseconds to Instance 1
     * 551 microseconds to Instance 2
     * 14780 microseconds to Instance 3 (notable spike)
     * 515 microseconds to Instance 4

4. Measurement Volume
   - Approximately 370,458 measurements taken for each connection
   - Consistent measurement count across instances indicates stable monitoring

Why is it Important for DBAs?
1. Performance Monitoring
   - Helps identify interconnect latency issues
   - Enables proactive detection of performance degradation
   - Provides historical data for trend analysis

2. Troubleshooting
   - Assists in diagnosing RAC performance problems
   - Helps identify instance-to-instance communication issues
   - Enables comparison of current vs. historical performance
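For routine monitoring, a simple query can flag links whose current latency is far above their running average. This is a sketch: the 3x threshold is an arbitrary example, not an Oracle recommendation.

```sql
-- Flag interconnect links whose current 500B latency is well above
-- the running average (the 3x multiplier is illustrative only)
SELECT inst_id, instance, current_500b, average_500b, max_500b
FROM   gv$instance_ping
WHERE  current_500b > 3 * average_500b
ORDER  BY inst_id, instance;
```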

Wednesday, 14 May 2025

Oracle Database 19c: Lost Write Protection Gets Smarter with AUTO

One of the most under-the-radar but important changes introduced in Oracle Database 19c Release Update (RU) 19.26 is the new default setting for the parameter DB_LOST_WRITE_PROTECT.
🔄 From NONE to AUTO: A Quiet but Powerful Shift

Previously, the default setting for DB_LOST_WRITE_PROTECT was:
DB_LOST_WRITE_PROTECT = NONE

This meant lost write detection was completely disabled by default unless explicitly turned on.

Starting with RU 19.26, Oracle has made a significant shift:
DB_LOST_WRITE_PROTECT = AUTO

This change is also reflected in Oracle Database 23ai. It’s more than just a default toggle — it's a smarter, adaptive mechanism that brings better protection with minimal performance trade-offs.
💡 What Is a Lost Write?

A Lost Write is a critical data corruption event where:
    Oracle writes a data block to storage,
    The storage subsystem acknowledges the write as successful,
    But the data block is never actually written to persistent storage.

These can be caused by memory errors, firmware bugs, OS issues, or storage controller failures. Lost writes are notoriously difficult to detect and often surface during recovery or standby failovers.

Oracle defines a lost write as:
    "A data block lost write occurs when an I/O subsystem acknowledges the completion of the block write, while in fact the write did not occur in the persistent storage."

🧠 Why AUTO Matters: Intelligent Lost Write Detection
The new AUTO value for DB_LOST_WRITE_PROTECT changes how Oracle handles lost write detection:
▶️ On a Primary Database:
    Oracle only logs buffer cache reads in the redo log if:
        Physical standby databases exist, and
        Real-time redo apply is enabled.

▶️ On a Standby Database:
    Oracle performs lost write detection only if the apply lag is within reasonable bounds.
    If apply lag exceeds:
        60 seconds, or
        2/3 of the FSFO apply lag threshold (if Fast-Start Failover is enabled),
    Then lost write detection is temporarily skipped to reduce overhead and ensure smooth role transitions.

⚙️ Comparison of DB_LOST_WRITE_PROTECT Modes
Value         Description                                                 Performance Impact
NONE          No lost write detection.                                    🔋 Low
TYPICAL       Logs reads for read/write tablespaces.                      ⚖️ Moderate
FULL          Logs reads for all tablespaces (read-only + read/write).    🛠️ High
AUTO (new)    Adapts based on real-time redo and apply lag.               ⚙️ Balanced
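If you prefer to pin a specific mode rather than rely on AUTO, the parameter can be changed dynamically. A sketch (TYPICAL is just an example value; choose the mode that fits your environment):

```sql
-- Pin lost write detection to an explicit mode on all RAC instances
ALTER SYSTEM SET DB_LOST_WRITE_PROTECT = TYPICAL SCOPE=BOTH SID='*';
```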


🚨 Real-World Impact: Why It Matters
With older RUs and the default NONE, lost write protection was disabled by default — even in Data Guard environments. This left many systems unprotected by mistake, simply because DBAs weren’t aware or didn’t configure it explicitly.

Now, AUTO intelligently enables protection only when it's needed, helping DBAs avoid the all-or-nothing trade-off between performance and data integrity.
🧪 What Should You Do?

Here are a few actions to take if you’ve recently applied Oracle 19.26 or higher:
✅ 1. Check the Current Setting
SHOW PARAMETER DB_LOST_WRITE_PROTECT;

If you see AUTO, you're on the new default.
✅ 2. Monitor Redo Generation
You may notice an increase in redo generation, especially if:
    You have a physical standby.
    Real-time redo apply is enabled.
See MOS Note: 37260974
“Database Generating Large Amount Of Redo After Applying Jan 2025 DBRU”
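One way to quantify the change is to sample redo generation before and after applying the RU, using the standard 'redo size' statistic (a sketch; compare snapshots over a fixed interval rather than the raw cumulative values):

```sql
-- Cumulative redo generated per instance since startup
SELECT inst_id, value AS redo_bytes
FROM   gv$sysstat
WHERE  name = 'redo size';
```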


✅ 3. Understand the Error ORA-00752

If you ever encounter:
ORA-00752: recovery detected a lost write of a data block
Do not attempt to fix this manually. Instead:
    Shut down the primary database.
    Failover to the standby.
    Open a support SR with Oracle immediately.


Thursday, 24 April 2025

Starting DBRU 19.27 and 23.8 Small Pages Are Not Allowed for RDBMS SGA In Exadata

If you are planning to apply the latest DBRU 19.27 or DBRU 23.8, please be aware of a significant change impacting SGA memory management. Small pages are no longer permitted for the System Global Area (SGA) on Exadata systems.

Attempting to start an Oracle instance configured to use small pages for the SGA will likely fail, and you might encounter the following error:
ORA-27138: unable to allocate large pages with current parameter setting

Your database instance alert log may also report errors related to the use_large_pages parameter, indicating that FALSE, TRUE, or AUTO are no longer supported settings for databases on Exadata.

Why this change?

DBRU 19.27 introduces this restriction to prevent performance and stability issues. Utilizing small 4k memory pages for the SGA can lead to:
    Page table bloat within the virtual machine.
    RDMA resource memory bloat on both the VM and the hypervisor.
    Potential instability affecting the database node and all instances running on it.

Resolution:
To ensure successful database startup and optimal performance after applying these DBRUs, please verify and adjust your large page configuration at the operating system level. For your SGA configuration, we strongly recommend the following:
    Remove memory_target and memory_max_target.
    Set sga_target to your desired SGA size.
    Set use_large_pages=ONLY. This is the Maximum Availability Architecture (MAA) recommended value and ensures your entire SGA resides in huge pages on Linux-based systems.
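Under those recommendations, the parameter changes might look like the following sketch (the 32G SGA size is a placeholder; size vm.nr_hugepages at the OS level to match before restarting):

```sql
ALTER SYSTEM RESET memory_target SCOPE=SPFILE SID='*';
ALTER SYSTEM RESET memory_max_target SCOPE=SPFILE SID='*';
ALTER SYSTEM SET sga_target = 32G SCOPE=SPFILE SID='*';
ALTER SYSTEM SET use_large_pages = ONLY SCOPE=SPFILE SID='*';
-- Restart the instance for the SPFILE changes to take effect
```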

📚 Further Reading & Reference Materials:
    My Oracle Support Note 401749.1: HugePages / Large Pages on Linux
    My Oracle Support Note 361323.1: Checklist for Valid HugePages Setup
    My Oracle Support Note 1392497.1: How to Calculate Recommended HugePages/LargePages Size for Oracle Database
    Oracle Documentation (Doc ID 3081878.1): Starting DBRU 19.27 and 23.8 Small Pages Are Not Allowed for RDBMS SGA In Exadata.

Please take the necessary steps to review your configurations and implement these recommendations to maintain the stability and performance of your Exadata environment.


Sunday, 13 April 2025

Smarter Patching in Oracle Database 23ai: Two-Stage Rolling Updates and Local Rolling Maintenance

Downtime is the eternal nemesis of enterprise systems, especially those powering critical workloads. With the release of Oracle Database 23ai, Oracle has introduced a set of intelligent patching and maintenance features that drastically reduce downtime and improve database availability during upgrades and patching.

Oracle RAC Two-Stage Rolling Updates
Starting with Oracle Database 23ai, the Oracle RAC two-stage rolling patches feature enables you to apply previously non-rolling patches in a rolling fashion.

Oracle RAC two-stage rolling patches are a new type of patch that you apply in a rolling fashion, in two stages. First, the patch is applied node by node: once the first node is patched, the second node is patched, and so on. Fixes applied this way are disabled by default; when all the nodes are patched, you enable them.

You can view the patches applied via this method using:
SELECT * FROM V$RAC_TWO_STAGE_ROLLING_UPDATES;

Local Rolling Database Maintenance:
Another standout feature in 23ai is Local Rolling Database Maintenance—an enhancement designed to keep node-level downtime invisible to users during rolling patching. 

What It Does
During a rolling patch, Oracle can now start a second instance on the same node and relocate sessions to it, reducing the patching impact on connected applications.
This technique:
    Enables session failover on the same node, minimizing CPU and network overhead
    Reduces or eliminates application interruptions during patching
    Works great when paired with (Transparent) Application Continuity
Requirements
    The node must have enough CPU and memory resources to run two instances simultaneously.
    DBAs need to manage new ORACLE_HOME paths and instance configurations.
To prepare and perform local rolling maintenance:
srvctl modify database -d <dbname> -o $NEW_HOME --localrolling
srvctl transfer instance -d <dbname>
This makes it easier to patch a single node while keeping user sessions online and preventing workload relocation to other nodes in the cluster.

Thursday, 10 April 2025

How to Resolve "CheckActiveFilesAndExecutables" Failure in Oracle OPatch

When applying or rolling back a patch using Oracle OPatch, you might encounter the following error:
Prerequisite check "CheckActiveFilesAndExecutables" failed.
OPatch failed with error code 73
This typically happens when some files or libraries in the Oracle Home directory are currently being used by running processes.
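You can also run this prerequisite check on its own before starting the apply or rollback, which avoids discovering the conflict mid-session. A sketch (the patch path is a placeholder; run it from the directory of the patch you intend to apply or roll back):

```shell
# Run only the active-files prerequisite check for a patch
cd /path/to/patch/35739076
$ORACLE_HOME/OPatch/opatch prereq CheckActiveFilesAndExecutables -ph ./
```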

Here is the OPatch rollback attempt:
[oracle@prdracdb01 ~]$ opatch rollback -id 35739076
Oracle Interim Patch Installer version 12.2.0.1.45
Copyright (c) 2025, Oracle Corporation.  All rights reserved.

Oracle Home       : /u01/app/oracle/product/19.0.0.0/dbhome_1
Central Inventory : /app/oraInventory
   from           : /u01/app/oracle/product/19.0.0.0/dbhome_1/oraInst.loc
OPatch version    : 12.2.0.1.45
OUI version       : 12.2.0.7.0
Log file location : /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch2025-04-10_10-47-49AM_1.log
Patches will be rolled back in the following order:
   35739076
Prerequisite check "CheckActiveFilesAndExecutables" failed.
The details are:
Following active files/executables/libs are used by ORACLE_HOME :/u01/app/oracle/product/19.0.0.0/dbhome_1
/u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1

UtilSession failed: Prerequisite check "CheckActiveFilesAndExecutables" failed.
Log file location: /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch2025-04-10_10-47-49AM_1.log

OPatch failed with error code 73

[oracle@prdracdb01 ~]$

Patching failed with UtilSession failed: Prerequisite check "CheckActiveFilesAndExecutables" failed.

Analyzing the OPatch Log:
[oracle@prdracdb01 ~]$ cat /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch2025-04-10_10-47-49AM_1.log
[Apr 10, 2025 10:47:49 AM] [INFO]   CAS Dynamic Loading :
[Apr 10, 2025 10:47:49 AM] [INFO]   CUP_LOG: Trying to load HomeOperations object
[Apr 10, 2025 10:47:49 AM] [INFO]   CUP_LOG: HomeOperations object created. CUP1.0 is enabled
[Apr 10, 2025 10:47:49 AM] [INFO]   OPatch invoked as follows: 'rollback -id 35739076 -invPtrLoc /u01/app/oracle/product/19.0.0.0/dbhome_1/oraInst.loc '
[Apr 10, 2025 10:47:49 AM] [INFO]   Runtime args: [-Xverify:none, -Xmx3072m, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch, -DCommonLog.LOG_SESSION_ID=, -DCommonLog.COMMAND_NAME=rollback, -DOPatch.ORACLE_HOME=/u01/app/oracle/product/19.0.0.0/dbhome_1, -DOPatch.DEBUG=false, -DOPatch.MAKE=false, -DOPatch.RUNNING_DIR=/u01/app/oracle/product/19.0.0.0/dbhome_1/OPatch, -DOPatch.MW_HOME=, -DOPatch.WL_HOME=, -DOPatch.COMMON_COMPONENTS_HOME=, -DOPatch.OUI_LOCATION=/u01/app/oracle/product/19.0.0.0/dbhome_1/oui, -DOPatch.FMW_COMPONENT_HOME=, -DOPatch.OPATCH_CLASSPATH=, -DOPatch.WEBLOGIC_CLASSPATH=, -DOPatch.SKIP_OUI_VERSION_CHECK=, -DOPatch.NEXTGEN_HOME_CHECK=false, -DOPatch.PARALLEL_ON_FMW_OH=]
[Apr 10, 2025 10:47:49 AM] [INFO]   Heap in use : 112 MB
                                    Total memory: 1963 MB
                                    Free memory : 1850 MB
                                    Max memory  : 2731 MB
[Apr 10, 2025 10:47:49 AM] [INFO]   Oracle Home       : /u01/app/oracle/product/19.0.0.0/dbhome_1
                                    Central Inventory : /app/oraInventory
                                       from           : /u01/app/oracle/product/19.0.0.0/dbhome_1/oraInst.loc
                                    OPatch version    : 12.2.0.1.45
                                    OUI version       : 12.2.0.7.0
                                    OUI location      : /u01/app/oracle/product/19.0.0.0/dbhome_1/oui
                                    Log file location : /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch2025-04-10_10-47-49AM_1.log
[Apr 10, 2025 10:47:49 AM] [INFO]   Patch history file: /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch_history.txt
[Apr 10, 2025 10:47:51 AM] [INFO]   [OPSR-TIME] Loading raw inventory
[Apr 10, 2025 10:47:51 AM] [INFO]   [OPSR-MEMORY] Loaded all components from inventory. Heap memory in use: 153 (MB)
[Apr 10, 2025 10:47:51 AM] [INFO]   [OPSR-MEMORY] Loaded all one offs from inventory. Heap memory in use: 174 (MB)
[Apr 10, 2025 10:47:51 AM] [INFO]   [OPSR-TIME] Raw inventory loaded successfully
[Apr 10, 2025 10:47:51 AM] [INFO]   NRollback::no CAS enabled, OPatch runs with legacy process.
[Apr 10, 2025 10:47:51 AM] [INFO]   opatch-external.jar is in /u01/app/oracle/product/19.0.0.0/dbhome_1/OPatch/jlib/opatch-external.jar
[Apr 10, 2025 10:47:53 AM] [INFO]   [OPSR-TIME] Loading cooked inventory
[Apr 10, 2025 10:47:53 AM] [INFO]   [OPSR-MEMORY] : Loading cooked one offs. Heap memory used 215 (MB)
[Apr 10, 2025 10:47:55 AM] [INFO]   [OPSR-MEMORY] : Loaded cooked oneoffs. Heap memory used : 253 (MB)
[Apr 10, 2025 10:47:55 AM] [INFO]   [OPSR-TIME] Cooked inventory loaded successfully
[Apr 10, 2025 10:48:00 AM] [INFO]   [OPSR-TIME] buildFilesConflict begins
[Apr 10, 2025 10:48:00 AM] [INFO]   [OPSR-TIME] checkFileVersionConflict begins
[Apr 10, 2025 10:48:00 AM] [INFO]   Alias feature is enable?false
[Apr 10, 2025 10:48:00 AM] [INFO]   [OPSR-TIME] checkFileVersionConflict begins
[Apr 10, 2025 10:48:00 AM] [INFO]   [OPSR-TIME] buildFilesConflict ends
[Apr 10, 2025 10:48:00 AM] [INFO]   Subset Patch 29517242 remain inactive due to active superset patch 35643107
[Apr 10, 2025 10:48:00 AM] [INFO]   Subset Patch 29585399 remain inactive due to active superset patch 35655527
[Apr 10, 2025 10:48:00 AM] [INFO]   OPatchSessionHelper::sortOnOverlay() Sorting is not needed
[Apr 10, 2025 10:48:02 AM] [INFO]   Patches will be rolled back in the following order:
                                       35739076
[Apr 10, 2025 10:48:02 AM] [INFO]   Running prerequisite checks...
[Apr 10, 2025 10:48:02 AM] [INFO]   Start fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/oracle at Thu Apr 10 10:48:02 PDT 2025
[Apr 10, 2025 10:48:02 AM] [INFO]   Finish fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/oracle at Thu Apr 10 10:48:02 PDT 2025
[Apr 10, 2025 10:48:02 AM] [INFO]   SKIP_FUSER_WARNINGS is set to true (flag was set in opatch.properties)
[Apr 10, 2025 10:48:02 AM] [INFO]   Start fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/extjob at Thu Apr 10 10:48:02 PDT 2025
[Apr 10, 2025 10:48:02 AM] [INFO]   Finish fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/extjob at Thu Apr 10 10:48:02 PDT 2025
[Apr 10, 2025 10:48:02 AM] [INFO]   SKIP_FUSER_WARNINGS is set to true (flag was set in opatch.properties)
[Apr 10, 2025 10:48:02 AM] [INFO]   Start fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/extjobo at Thu Apr 10 10:48:02 PDT 2025
[Apr 10, 2025 10:48:02 AM] [INFO]   Finish fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/extjobo at Thu Apr 10 10:48:02 PDT 2025
[Apr 10, 2025 10:48:02 AM] [INFO]   SKIP_FUSER_WARNINGS is set to true (flag was set in opatch.properties)
[Apr 10, 2025 10:48:02 AM] [INFO]   Start fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/setasmgid at Thu Apr 10 10:48:02 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   Finish fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/setasmgid at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   SKIP_FUSER_WARNINGS is set to true (flag was set in opatch.properties)
[Apr 10, 2025 10:48:03 AM] [INFO]   Start fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/kfod at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   Finish fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/kfod at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   SKIP_FUSER_WARNINGS is set to true (flag was set in opatch.properties)
[Apr 10, 2025 10:48:03 AM] [INFO]   Start fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/renamedg at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   Finish fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/bin/renamedg at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   SKIP_FUSER_WARNINGS is set to true (flag was set in opatch.properties)
[Apr 10, 2025 10:48:03 AM] [INFO]   Start fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1 at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   Finish fuser command /sbin/fuser /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1 at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   SKIP_FUSER_WARNINGS is set to true (flag was set in opatch.properties)
[Apr 10, 2025 10:48:03 AM] [INFO]   Files in use by a process: /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1 PID( 42574 92627 )
[Apr 10, 2025 10:48:03 AM] [INFO]   Printing more details of active processes:
[Apr 10, 2025 10:48:03 AM] [INFO]   START PARENT PROCESS DETAILS
                                    PID COMMAND
                                    83924 bash
                                    END PARENT PROCESS DETAILS
[Apr 10, 2025 10:48:03 AM] [INFO]   START CHILD PROCESS DETAILS FOR PARENT PROCESS: 83924
                                    PID COMMAND
                                    92627 rman
                                    END CHILD PROCESS DETAILS FOR PARENT PROCESS: 83924
[Apr 10, 2025 10:48:03 AM] [INFO]   START PARENT PROCESS DETAILS
                                    PID COMMAND
                                    42548 Standby_sync.sh
                                    END PARENT PROCESS DETAILS
[Apr 10, 2025 10:48:03 AM] [INFO]   START CHILD PROCESS DETAILS FOR PARENT PROCESS: 42548
                                    PID COMMAND
                                    42574 python
                                    END CHILD PROCESS DETAILS FOR PARENT PROCESS: 42548
[Apr 10, 2025 10:48:03 AM] [INFO]   Following active files/executables/libs are used by ORACLE_HOME :/u01/app/oracle/product/19.0.0.0/dbhome_1
                                    /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
[Apr 10, 2025 10:48:03 AM] [INFO]   Prerequisite check "CheckActiveFilesAndExecutables" failed.
                                    The details are:                      
                                    
                                    Following active files/executables/libs are used by ORACLE_HOME :/u01/app/oracle/product/19.0.0.0/dbhome_1
                                    /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
[Apr 10, 2025 10:48:03 AM] [SEVERE] OUI-67073:UtilSession failed: Prerequisite check "CheckActiveFilesAndExecutables" failed.
[Apr 10, 2025 10:48:03 AM] [INFO]   Finishing UtilSession at Thu Apr 10 10:48:03 PDT 2025
[Apr 10, 2025 10:48:03 AM] [INFO]   Log file location: /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch2025-04-10_10-47-49AM_1.log
[Apr 10, 2025 10:48:03 AM] [INFO]   Stack Description: java.lang.RuntimeException: Prerequisite check "CheckActiveFilesAndExecutables" failed.
                                        at oracle.opatch.OPatchSessionHelper.runRollbackPrereqs(OPatchSessionHelper.java:5253)
                                        at oracle.opatch.opatchutil.NRollback.legacy_process(NRollback.java:762)
                                        at oracle.opatch.opatchutil.NRollback.process(NRollback.java:217)
                                        at oracle.opatch.opatchutil.OUSession.nrollback(OUSession.java:1154)
                                        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                                        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                                        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                                        at java.lang.reflect.Method.invoke(Method.java:498)
                                        at oracle.opatch.UtilSession.process(UtilSession.java:355)
                                        at oracle.opatch.OPatchSession.process(OPatchSession.java:2640)
                                        at oracle.opatch.OPatch.process(OPatch.java:888)
                                        at oracle.opatch.OPatch.main(OPatch.java:945)
                                    Caused by: oracle.opatch.PrereqFailedException: Prerequisite check "CheckActiveFilesAndExecutables" failed.
                                        ... 12 more
[oracle@prdracdb01 ~]$

The log revealed that the rollback failed because the shared library libclntsh.so.19.1 was being used by running processes.
The log details also confirmed active processes holding the file:
Files in use by a process:
/u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1 PID( 42574 92627 )
...
PID COMMAND
92627 rman
42574 python

Identify and Kill the Processes:
Check which processes were using libclntsh.so.19.1:
[oracle@prdracdb01 bin]$ lsof | grep libclntsh.so.19.1
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
      Output information may be incomplete.
lsof: WARNING: can't stat() bpf file system /opt/sentinelone/ebpfs/bpf_mount
      Output information may be incomplete.
python    42574               oracle  mem       REG              252,0  82204624          25219363 /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
rman      92627               oracle  mem       REG              252,0  82204624          25219363 /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1

Kill both sessions
[oracle@prdracdb01 bin]$ kill -9 42574
[oracle@prdracdb01 bin]$ lsof | grep libclntsh.so.19.1
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
      Output information may be incomplete.
lsof: WARNING: can't stat() bpf file system /opt/sentinelone/ebpfs/bpf_mount
      Output information may be incomplete.
rman      92627               oracle  mem       REG              252,0  82204624          25219363 /u01/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
[oracle@prdracdb01 bin]$ ps -ef | grep rman
oracle   11618 88116  0 11:02 pts/4    00:00:00 grep --color=auto rman
oracle   92627 83924  0 Jan06 pts/1    00:00:05 rman
[oracle@prdracdb01 bin]$ kill -9 92627

With both sessions gone, I retried the OPatch rollback and it completed successfully:
[oracle@prdracdb01 OPatch]$ opatch rollback -id 35739076
Oracle Interim Patch Installer version 12.2.0.1.45
Copyright (c) 2025, Oracle Corporation.  All rights reserved.
Oracle Home       : /u01/app/oracle/product/19.0.0.0/dbhome_1
Central Inventory : /app/oraInventory
   from           : /u01/app/oracle/product/19.0.0.0/dbhome_1/oraInst.loc
OPatch version    : 12.2.0.1.45
OUI version       : 12.2.0.7.0
Log file location : /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch2025-04-10_11-03-11AM_1.log
Patches will be rolled back in the following order:
   35739076
The following patch(es) will be rolled back: 35739076

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/19.0.0.0/dbhome_1')

Is the local system ready for patching? [y|n]
y
User Responded with: Y
Rolling back patch 35739076...
RollbackSession rolling back interim patch '35739076' from OH '/u01/app/oracle/product/19.0.0.0/dbhome_1'
Patching component oracle.rdbms, 19.0.0.0.0...
Patching component oracle.rdbms.rsf, 19.0.0.0.0...
RollbackSession removing interim patch '35739076' from inventory
Log file location: /u01/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatch/opatch2025-04-10_11-03-11AM_1.log
OPatch succeeded.
[oracle@prdracdb01 OPatch]$

 

Sunday, 6 April 2025

Log Generation Rate in Azure SQL Database Hyperscale Pool

What is Log Generation Rate?
Log generation rate refers to the speed at which transaction logs are produced in a database. In Hyperscale Pool, log generation is closely monitored and regulated to prevent overloading the system. Azure implements log rate governance to ensure that log generation stays within defined limits, keeping the system stable and performing efficiently.
Log Rate Governance in Hyperscale
By default, Hyperscale databases have a log generation limit of 105 MB/s, irrespective of the compute size, and when the system is healthy log generation can run close to that cap. This limit is designed to ensure that logs are consistently processed and replicated without overwhelming system resources.
However, there may be situations where Azure needs to temporarily reduce the log generation rate. This happens when a secondary replica or page server falls behind in applying the transaction logs. The system will then throttle the log generation rate to allow the lagging components to catch up, ensuring the overall stability of the database.
When Does Log Generation Rate Get Reduced?
Log generation rate may be reduced for several reasons:
    Delayed log consumption by a page server or replica.
    A geo-secondary replica might be lagging in applying logs.
    Slow database checkpointing could delay log processing on the page server.
    Migration or reverse migration from Hyperscale to a non-Hyperscale database can also cause temporary delays in log consumption.
Monitoring Log Generation Rate with sys.dm_hs_database_log_rate
Azure provides the sys.dm_hs_database_log_rate dynamic management function (DMF) to monitor and troubleshoot log generation rates in Hyperscale. This function returns detailed information on which components are limiting the log generation rate, including:
    Current log rate limit
    Catch-up rate of components (bytes per second)
    Component-specific delays and logs that are behind
Key Columns in the DMF:
    current_max_log_rate_bps: Maximum log rate limit in bytes per second.
    catchup_rate_bps: The rate (bytes per second) at which lagging components are catching up.
    catchup_distance_bytes: The amount of log data that must be processed to catch up.
    role_desc: Describes the role of the component affecting the log rate, such as a page server, replica, or geo-replica.
This tool helps you quickly identify any components causing delays and allows you to take corrective actions if needed.
How to Check Log Generation Rate in Your Database
To check the log generation rate for a specific database, use the following query:
SELECT current_max_log_rate_bps, role_desc, catchup_rate_bps, catchup_distance_bytes
FROM sys.dm_hs_database_log_rate(DB_ID(N'YourDatabaseName'));

For databases within an elastic pool, you can use NULL to get results for all databases in the pool:
SELECT current_max_log_rate_bps, role_desc, catchup_rate_bps, catchup_distance_bytes
FROM sys.dm_hs_database_log_rate(NULL);

Wait types appear in sys.dm_os_wait_stats when the log rate is reduced:
Wait type    Reason
RBIO_RG_STORAGE    Delayed log consumption by a page server
RBIO_RG_DESTAGE    Delayed log consumption by the long-term log storage
RBIO_RG_REPLICA    Delayed log consumption by an HA secondary replica or a named replica
RBIO_RG_GEOREPLICA    Delayed log consumption by a geo-secondary replica
RBIO_RG_LOCALDESTAGE    Delayed log consumption by the log service
RBIO_RG_STORAGE_CHECKPOINT    Delayed log consumption by a page server due to slow database checkpoint
RBIO_RG_MIGRATION_TARGET    Delayed log consumption by the non-Hyperscale database during reverse migration
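To see whether log rate governance is actively throttling your workload, you can query those waits directly. A sketch:

```sql
-- Check accumulated log-rate-governance waits since the stats were last cleared
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM   sys.dm_os_wait_stats
WHERE  wait_type LIKE 'RBIO_RG%'
ORDER  BY wait_time_ms DESC;
```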

Wednesday, 2 April 2025

Understanding IPv4 and IPv6

Have you ever wondered how your computer knows where to send that email or how Netflix finds your device to stream your favorite show? 
The answer lies in something called Internet Protocol addresses - the postal system of the digital world.

What Are IP Addresses?
Think of IP addresses like street addresses for the internet. Just as your home needs a unique address for mail delivery, every device connected to the internet needs a unique identifier so data can find its way to the right destination.

IPv4: The Current Standard
IPv4 (Internet Protocol version 4) has been the backbone of internet communication since the 1980s. Here's what makes it tick:

The Format
IPv4 addresses look like this: 192.168.1.1 or 8.8.8.8. These are four numbers (each between 0-255) separated by dots. Behind the scenes, your computer sees this as a 32-bit binary number.

The Problem
Here's where things get interesting - and problematic. IPv4 can only create about 4.3 billion unique addresses. That sounds like a lot, but with billions of smartphones, computers, smart TVs, and IoT devices, we've essentially run out of new IPv4 addresses.


Types You Might Recognize
Public IP: Your internet-facing address (like 74.125.224.72)
Private IP: Your device's address on your home network (usually starts with 192.168)
Localhost: 127.0.0.1 - your computer talking to itself

IPv6: The Future is Here
IPv6 (Internet Protocol version 6) is the solution to IPv4's limitations, and it's already being rolled out worldwide.

The Format
IPv6 addresses are much longer: 2001:0db8:85a3:0000:0000:8a2e:0370:7334. They use hexadecimal (0-9, a-f) and are separated by colons instead of dots.

The Solution
IPv6 provides approximately 340 undecillion addresses. To put that in perspective, that's enough addresses to give every grain of sand on Earth its own IP address - with plenty left over!

What's Better About IPv6?
Vastly more addresses - address exhaustion is no longer a practical concern
Better security - built-in support for IPsec
Faster routing - simplified packet headers
Auto-configuration - devices can set up their own addresses

The Transition Challenge
So why aren't we all using IPv6 yet? 
The transition is like trying to upgrade every road in the world simultaneously while traffic keeps flowing. Most networks now run "dual-stack" - supporting both IPv4 and IPv6 - allowing for a gradual transition.

What This Means for You
As an everyday internet user, this transition is mostly invisible. Your devices likely already support both protocols, and your internet service provider is handling the complexity behind the scenes.
However, understanding these concepts helps you appreciate the incredible engineering that makes our connected world possible. Every time you send a message, stream a video, or browse the web, you're participating in one of humanity's greatest collaborative achievements - the internet.

Looking Forward
IPv6 isn't just about solving today's address shortage; it's about enabling tomorrow's innovations. From smart cities with millions of connected sensors to personal devices we haven't even imagined yet, IPv6 provides the foundation for our digital future.
The next time you effortlessly connect to the internet, remember the elegant addressing system working behind the scenes - and marvel at how 32 bits evolved into 128 bits to keep our world connected.

Sunday, 30 March 2025

VECTOR_DISTANCE

What is VECTOR_DISTANCE?
The VECTOR_DISTANCE function calculates the distance between two vectors (represented as expr1 and expr2). Depending on the context, the vectors can represent various types of data, such as images, text, or numbers.
Key Points:
    Purpose: Calculates the distance between two vectors.
    Optional Metric: You can specify a distance metric as a third argument. If you don't, Cosine distance is used by default for most vector types, and Hamming distance for binary vectors.
Shorthand Functions for Common Distance Metrics
To make it easier to calculate distances, the VECTOR_DISTANCE function comes with shorthand functions for common distance metrics. These are equivalent to the more detailed functions, providing a more compact way to express vector distance calculations.
Here are the shorthand functions:
    L1_DISTANCE: Manhattan (L1) distance.
    L2_DISTANCE: Euclidean (L2) distance.
    COSINE_DISTANCE: Cosine distance.
    INNER_PRODUCT: Dot product of two vectors (equivalent to -1 * VECTOR_DISTANCE(expr1, expr2, DOT)).
    HAMMING_DISTANCE: Hamming distance for binary vectors.
    JACCARD_DISTANCE: Jaccard distance for binary vectors.
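
As a quick illustration, the shorthand functions return the same result as the spelled-out VECTOR_DISTANCE call. This sketch uses literal vectors built with TO_VECTOR (available in Oracle Database 23ai) so it needs no sample table:

SQL> SELECT VECTOR_DISTANCE(TO_VECTOR('[1, 2, 3]'), TO_VECTOR('[4, 5, 6]'), EUCLIDEAN) AS dist_long,
            L2_DISTANCE(TO_VECTOR('[1, 2, 3]'), TO_VECTOR('[4, 5, 6]')) AS dist_short
     FROM dual;

Both columns return the same Euclidean distance, since L2_DISTANCE is just shorthand for VECTOR_DISTANCE with the EUCLIDEAN metric.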

Distance Metrics Available:
    COSINE: Measures the cosine of the angle between two vectors, useful for high-dimensional data like text.
    DOT: Calculates the negated dot product of two vectors, useful for measuring similarity.
    EUCLIDEAN: Measures the straight-line (L2) distance between two vectors, commonly used in spatial data.
    EUCLIDEAN_SQUARED: Euclidean distance without taking the square root, often used in optimization tasks.
    HAMMING: Counts the number of differing dimensions between two binary vectors, typically used in error correction.
    MANHATTAN: Also known as L1 distance, calculates the sum of absolute differences between vector components, useful for grid-based problems.
    JACCARD: Measures dissimilarity between binary vectors based on the ratio of the intersection to the union of the vectors.
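
To see how the choice of metric changes the result, you can compare several metrics over the same pair of vectors. A minimal sketch using literal vectors:

SQL> SELECT VECTOR_DISTANCE(TO_VECTOR('[1, 0]'), TO_VECTOR('[0, 1]'), COSINE)    AS cosine_d,
            VECTOR_DISTANCE(TO_VECTOR('[1, 0]'), TO_VECTOR('[0, 1]'), MANHATTAN) AS manhattan_d,
            VECTOR_DISTANCE(TO_VECTOR('[1, 0]'), TO_VECTOR('[0, 1]'), EUCLIDEAN) AS euclidean_d
     FROM dual;

Each column applies a different metric to the same orthogonal pair of vectors, so the three distances differ even though the inputs are identical.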
    
Shorthand Operators for Distance Metrics:
Instead of specifying the distance metric explicitly, you can use shorthand operators for quicker calculations. These are especially handy when writing queries or performing similarity searches:
    <->: Equivalent to L2_DISTANCE (Euclidean distance).
    <=>: Equivalent to COSINE_DISTANCE (Cosine similarity).
    <#>: Equivalent to -1 * INNER_PRODUCT (Negative dot product).
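
These operators are most useful in similarity searches. As a sketch, assuming a hypothetical products table with a VECTOR column named embedding, the <=> operator ranks rows by cosine distance from a query vector:

SQL> SELECT product_name
     FROM products
     ORDER BY embedding <=> TO_VECTOR('[0.1, 0.4, 0.9]')
     FETCH FIRST 5 ROWS ONLY;

This returns the five rows whose embeddings are closest (by cosine distance) to the query vector, and is equivalent to ordering by COSINE_DISTANCE(embedding, ...).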