Thursday, 27 November 2025

Java Garbage Collection (GC): How Modern JVM GC Works, Evolves, and Scales

Garbage Collection is the JVM’s silent guardian. It quietly reclaims memory from objects your application no longer needs—no manual freeing, no memory leaks (well, mostly), no pointer nightmares.

But as applications scale and heap sizes grow into gigabytes or even terabytes, those tiny moments when GC stops your application (known as Stop-The-World pauses) can become the single biggest threat to performance.

To understand why GC pauses happen—and how modern collectors like G1, ZGC, and Shenandoah nearly eliminate them—we need to start with the basics: how Java organises memory.

Java Heap: Where Objects Live and Die
The design of the Java Heap is based on one powerful, observed truth: the "Weak Generational Hypothesis"—that is, most objects die very young. This insight led to Generational Garbage Collection, where the heap is strategically partitioned based on an object's expected lifespan.


GC Roots
This is the starting line for the GC process. An object is only considered "live" if the GC can trace a path to it from one of these roots. They are the application's solid reference points, the objects that must absolutely not be collected:
    Local variables on your thread stacks.
    Static fields of loaded classes.
    Active threads and native JNI references.

Eden:
Every new object you create with new is born here. This is the most volatile area, constantly being collected by the Minor GC. It acts like a nursery where the majority of objects (≈90% or more) are created and die almost instantly, never leaving this space.

Survivor Spaces (S0 / S1):
Objects that managed to survive their first encounter with the Minor GC in Eden are moved here. They ping-pong back and forth between the two small spaces (S0 and S1). Each time an object survives this trip, its "age" counter ticks up, proving its longevity.

Old Generation: 
Objects that survive enough minor collections to cross an age threshold (15 by default in HotSpot, tunable with -XX:MaxTenuringThreshold, and often lower in practice because the JVM adapts it) are considered long-lived and are promoted to the Old Generation. This area contains the stable, long-term residents, and consequently, it is collected much less often, by a Major GC or Full GC.
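Here is a minimal sketch of the ageing-and-promotion rule described above. It is a toy model, not HotSpot code: the class name TenuringSketch, the Obj type, and the fixed threshold of 15 are assumptions made for illustration, and the real JVM may promote earlier when the survivor spaces fill up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of object ageing in the young generation (illustrative only). */
final class TenuringSketch {
    // Assumed threshold, mirroring the common -XX:MaxTenuringThreshold default.
    static final int MAX_TENURING_THRESHOLD = 15;

    static final class Obj {
        final String name;
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>(); // Eden + survivor spaces
    final List<Obj> old = new ArrayList<>();   // Old Generation

    /** One minor GC: dead objects vanish, survivors age, old-enough survivors are promoted. */
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!isLive.test(o)) continue;  // unreachable: simply not copied
            o.age++;
            if (o.age >= MAX_TENURING_THRESHOLD) old.add(o);
            else survivors.add(o);
        }
        young.clear();
        young.addAll(survivors);
    }

    public static void main(String[] args) {
        TenuringSketch heap = new TenuringSketch();
        Obj cache = new Obj("cache");
        heap.young.add(cache);
        for (int gc = 1; gc <= 15; gc++) {
            heap.young.add(new Obj("temp-" + gc)); // short-lived garbage
            heap.minorGc(o -> o == cache);         // only the cache stays reachable
        }
        System.out.println("old generation size: " + heap.old.size());     // 1
        System.out.println("young generation size: " + heap.young.size()); // 0
    }
}
```

The temporary objects never survive a single collection, while the long-lived cache is promoted on its fifteenth survival.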

Metaspace: 
This area is technically outside the Heap in native system memory. Since Java 8 (it replaced the old PermGen), Metaspace holds the metadata about the classes your application loads—the structure, names, and methods. It's the blueprint archive for your application's code. 

GC Mechanisms: How Garbage is Found and Removed
How does the JVM actually clean up? There are three primary mechanisms that all GCs use in some combination.
Mark & Sweep:
Mark Phase: The GC walks the object graph starting from the GC Roots and marks everything reachable (live).
Sweep Phase: The GC scans the heap and reclaims memory from unmarked (garbage) objects.
The catch? This leaves the heap with Swiss-cheese-like holes, known as fragmentation. This fragmentation can lead to a dreaded Full GC when the JVM can't find a contiguous space large enough for a new object, even if there is technically enough free memory overall.
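The two phases can be sketched on a toy object graph. This is an illustration under simplifying assumptions (a heap modelled as a list of hypothetical Node objects, a boolean mark bit), not how HotSpot lays out memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy mark-and-sweep over an in-memory object graph (illustrative only). */
final class MarkSweepSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    /** Mark phase: walk the graph from the roots with a worklist and mark everything reachable. */
    static void mark(Collection<Node> roots) {
        Deque<Node> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (n.marked) continue;
            n.marked = true;
            work.addAll(n.refs);
        }
    }

    /** Sweep phase: remove unmarked nodes, reset marks on survivors, return reclaimed names. */
    static List<String> sweep(List<Node> heap) {
        List<String> reclaimed = new ArrayList<>();
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;       // ready for the next cycle
            } else {
                reclaimed.add(n.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        Node d = new Node("d"), e = new Node("e");
        a.refs.add(b);                  // a -> b is reachable from the root
        d.refs.add(e); e.refs.add(d);   // d <-> e is an unreachable cycle
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d, e));
        mark(List.of(a));
        System.out.println(sweep(heap)); // [c, d, e]
    }
}
```

Note that the unreachable cycle d <-> e is collected: tracing from roots does not care how many references garbage objects hold to each other.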

Mark–Sweep–Compact: Solving the Fragmentation Problem
To fix fragmentation, a third step is added:
    Compact Phase: All live objects are shuffled to one side of the heap, leaving the free space as one large, clean block. This is great for allocation, but compaction takes time, adding significantly to the STW pause.
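To see why compaction helps allocation, here is a small sketch that models the heap as an array of cells, where null means free. The names (CompactSketch, compact, largestFreeRun) are hypothetical, and a real collector must also update every reference to a moved object, which this toy skips.

```java
import java.util.Arrays;

/** Toy sliding compaction over an array-based heap (illustrative only). */
final class CompactSketch {
    /** Slides live cells to the left; returns the index of the first free cell. */
    static int compact(String[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                if (i != free) {
                    heap[free] = heap[i];
                    heap[i] = null;
                }
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of free cells, i.e. the biggest object that still fits. */
    static int largestFreeRun(String[] heap) {
        int best = 0, run = 0;
        for (String cell : heap) {
            run = (cell == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "B", null, null, "C", null, null};
        System.out.println("largest free run before: " + largestFreeRun(heap)); // 2
        compact(heap);
        System.out.println(Arrays.toString(heap));
        System.out.println("largest free run after: " + largestFreeRun(heap));  // 5
    }
}
```

Before compaction the heap has five free cells but cannot fit anything larger than two; afterwards the same five cells form one block.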

The Copying Algorithm:
In the Young Generation, the JVM uses a far faster trick: copying. Instead of marking, sweeping, and compacting, it simply copies live objects from the active spaces (Eden + S0) into the empty space (S1). It then wipes the old spaces clean. Copying is naturally compacting and lightning-fast—this is why Minor GCs are usually so quick.
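A rough sketch of the copying idea, in the spirit of Cheney's algorithm: the to-space doubles as the worklist, and a forwarding map remembers where each object went. The class and method names are assumptions for illustration, and objects are modelled as plain Java instances rather than raw memory.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy copying collection: evacuate reachable objects into a fresh space (illustrative only). */
final class CopyingSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Copies everything reachable from the roots into a new to-space and returns it. */
    static List<Obj> evacuate(List<Obj> roots) {
        Map<Obj, Obj> forwarding = new IdentityHashMap<>();
        List<Obj> toSpace = new ArrayList<>();
        for (Obj root : roots) forward(root, forwarding, toSpace);
        // Scan pointer chases the allocation pointer: the to-space is also the worklist.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj moved = toSpace.get(scan);
            for (int i = 0; i < moved.refs.size(); i++) {
                moved.refs.set(i, forward(moved.refs.get(i), forwarding, toSpace));
            }
        }
        return toSpace; // anything not copied is garbage; the old space is simply dropped
    }

    /** Returns the to-space copy of an object, creating it on first visit. */
    private static Obj forward(Obj original, Map<Obj, Obj> forwarding, List<Obj> toSpace) {
        Obj existing = forwarding.get(original);
        if (existing != null) return existing;
        Obj copy = new Obj(original.name);
        copy.refs.addAll(original.refs); // still from-space references; fixed during the scan
        forwarding.put(original, copy);
        toSpace.add(copy);
        return copy;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), garbage = new Obj("garbage");
        a.refs.add(b);
        b.refs.add(a);
        garbage.refs.add(a); // garbage may point at live objects; nothing points at it
        List<Obj> survivors = evacuate(List.of(a));
        System.out.println("survivors: " + survivors.size()); // 2
    }
}
```

The cost is proportional to the live objects only. Dead objects are never touched, which is why this works so well when most objects die young.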

Tri-Color Marking:
For modern GCs (G1, ZGC, Shenandoah) to work concurrently—meaning the application runs while the GC cleans—they use Tri-Color Marking. This helps the GC understand the current state of objects even as application threads (Mutators) are busy changing references.
    White: Unvisited (suspected garbage).
    Gray: Visited, but its object references have not yet been scanned.
    Black: Visited, and all of its references have been scanned (known-live).

To prevent the application from accidentally hiding a live object (for example, by storing a reference to a White object into a Black one that will never be rescanned, which violates the "tri-color invariant"), these GCs use write barriers or load barriers: tiny, quick bits of code inserted by the JIT compiler that run whenever the application stores or loads a reference.
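The colors and the barrier can be sketched together. The example below uses an insertion-style (Dijkstra) write barrier, chosen only because it is the shortest to show; G1 and Shenandoah use a snapshot-at-the-beginning scheme and ZGC relies on load barriers. All names here are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy incremental tri-color marking with an insertion write barrier (illustrative only). */
final class TriColorSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Color color = Color.WHITE;
        Obj(String name) { this.name = name; }
    }

    private final Deque<Obj> grayQueue = new ArrayDeque<>();

    /** WHITE -> GRAY: the object is known to be live but not yet scanned. */
    void shade(Obj o) {
        if (o.color == Color.WHITE) {
            o.color = Color.GRAY;
            grayQueue.add(o);
        }
    }

    /** One marking step: scan a gray object and turn it black. Returns false when done. */
    boolean markStep() {
        Obj o = grayQueue.poll();
        if (o == null) return false;
        for (Obj ref : o.refs) shade(ref);
        o.color = Color.BLACK;
        return true;
    }

    /** Mutator store. The barrier shades the target so no black object points at a white one. */
    void writeRef(Obj holder, Obj target) {
        holder.refs.add(target);
        shade(target);
    }

    public static void main(String[] args) {
        TriColorSketch gc = new TriColorSketch();
        Obj root = new Obj("root"), a = new Obj("a");
        root.refs.add(a);
        gc.shade(root);
        gc.markStep();               // root is now BLACK, a is GRAY

        Obj b = new Obj("b");        // allocated while marking is in progress
        gc.writeRef(root, b);        // without the barrier, b would stay WHITE and be lost

        while (gc.markStep()) { }    // finish marking
        System.out.println(b.name + " is " + b.color); // b is BLACK
    }
}
```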

GC Evolution & Timelines:

Java Version    Collector                    Notes
Java 1.3        Serial GC                    First simple GC
Java 1.4        Parallel GC                  Multithreaded, throughput-focused
Java 5          CMS                          First low-pause GC
Java 7          G1                           Region-based innovation (fully supported from 7u4)
Java 9          G1 default                   CMS deprecated
Java 11         ZGC (experimental)           Low-pause design; sub-millisecond pauses from Java 16
Java 12         Shenandoah (experimental)    Ultra-low latency
Java 14         CMS removed                  End of an era
Java 15         ZGC and Shenandoah GA        Production-ready


Serial GC:
Think of Serial GC as a single janitor who locks the doors before cleaning.
    The Vibe: Simple and sequential. It uses a single thread for all collection work.
    The Cost: This is the definition of a Stop-The-World (STW) pause. Every single application thread must halt for both Young and Old generation collections.
    Best For: Tiny stuff. We're talking small clients, embedded systems, or containers with heaps well under 100MB. If you have plenty of CPU cores, don't use this.
    Enable: -XX:+UseSerialGC

Parallel GC:
Parallel GC is the natural evolution of Serial: "If one thread is slow, use ten!"
    The Goal: It’s nicknamed the Throughput Collector because its mission is to maximize the total amount of work your application gets done. It does this by using multiple GC threads to speed up the collection phase.
    The Tradeoff: It still pauses the world (it’s an STW collector), but the pauses are much shorter than Serial. However, on multi-gigabyte heaps, these pauses can still be noticeable—sometimes hitting the half-second or even one-second mark.
    Mechanism: Multiple GC threads run a copying collection in the Young Generation and Mark–Sweep–Compact in the Old Generation.
    Enable: -XX:+UseParallelGC

CMS (Concurrent Mark Sweep):
CMS was Java's first serious attempt at achieving low latency. It was a game-changer but came with baggage.
    The Breakthrough: It figured out how to do most of the marking concurrently—meaning the GC was tracking objects while your application threads were still running. This dramatically minimized the longest STW pauses.
    The Flaw: CMS was a non-compacting collector. Over time, the heap became terribly fragmented (Swiss cheese holes!). Eventually, the JVM would fail to find a large enough contiguous block for a new object and fall back to a single-threaded STW Full GC to compact everything, a pause that could last many seconds or even minutes on a large heap.
    Status: Due to its complexity and fragmentation issues, CMS is considered legacy—it was deprecated in Java 9 and removed entirely in Java 14.
    Enable: -XX:+UseConcMarkSweepGC (JDK 13 and earlier only; newer JVMs ignore the flag with a warning)
    
G1 GC (Garbage-First):
G1 is the modern standard, a massive leap forward that shifted the focus from the whole heap to manageable regions.
    Core Idea: Instead of treating the heap as a few fixed, contiguous blocks (Eden/Survivor/Old), G1 carves it up into equal-sized regions, aiming for roughly 2048 of them. Each region can dynamically take on a role (Eden, Survivor, Old, or Humongous for very large objects) as needed.
    Pause Prediction: G1 tracks which regions hold the most garbage (the best "return on investment"). It follows the Garbage-First principle, collecting those regions first and sizing each collection to fit your pause time goal (set with -XX:MaxGCPauseMillis, 200 ms by default). The goal is a soft target that G1 tries to meet, not a hard guarantee.
    Collection: It uses Evacuation (copying) to move live objects out of the selected regions. This means it compacts memory as it cleans, eliminating the fragmentation nightmare that plagued CMS. G1 is the default collector since Java 9 for a reason: it's a great all-around performer.
    Enable: -XX:+UseG1GC
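The Garbage-First idea can be sketched as a greedy selection: rank regions by reclaimable bytes per millisecond of predicted work, then take regions until the pause budget runs out. This is a simplification with hypothetical names and made-up numbers; G1's real policy also handles young regions, remembered sets, and predictions learned from earlier collections.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy "garbage first" collection-set selection (illustrative only). */
final class GarbageFirstSketch {
    static final class Region {
        final String id;
        final long garbageBytes;      // reclaimable space in this region
        final double predictedCostMs; // assumed estimate of the time to evacuate it
        Region(String id, long garbageBytes, double predictedCostMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedCostMs = predictedCostMs;
        }
        double efficiency() { return garbageBytes / predictedCostMs; }
    }

    /** Picks the most efficient regions whose predicted cost fits in the pause goal. */
    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseGoalMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(Region::efficiency).reversed());
        List<Region> chosen = new ArrayList<>();
        double budget = pauseGoalMs;
        for (Region r : sorted) {
            if (r.predictedCostMs <= budget) {
                chosen.add(r);
                budget -= r.predictedCostMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
            new Region("A", 900, 50),  // 18 bytes reclaimed per ms
            new Region("B", 300, 100), // 3 per ms
            new Region("C", 800, 80),  // 10 per ms
            new Region("D", 100, 90)); // about 1.1 per ms
        for (Region r : chooseCollectionSet(regions, 200)) {
            System.out.println("collect region " + r.id); // A, then C
        }
    }
}
```

With a 200 ms goal, regions A and C use 130 ms of the budget; B and D no longer fit and wait for a later cycle.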

ZGC (Ultra-Low Latency):
ZGC is the future. Its design goal was radical: pause times must be independent of the heap size. You can run a TB-sized heap, and your application will pause for the same fraction of a millisecond as a 1GB heap.
    Concurrent Everything: It does marking, relocation, and reference processing all concurrently with the application.
    The Magic: It achieves this via Colored Pointers and Load Barriers. The GC can literally move an object while your application is running. When your code loads a reference to that object, the Load Barrier briefly intercepts the load, corrects the stale pointer to the object's new location, and lets the application continue. This fix-up costs the loading thread a few extra instructions; it is not a Stop-The-World pause.
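    Pause Time: Not a hard guarantee, but pauses are typically below a millisecond on recent JDKs (the original design target was under 10 ms), whatever the heap size. This is the choice for extreme low-latency and massive memory systems.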
    Enable: -XX:+UseZGC
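A toy model of the colored-pointer and load-barrier interplay is sketched below. Everything in it is an assumption made for illustration: pointers are plain long values, a single "remapped" bit stands in for ZGC's several metadata bits, and maps stand in for memory and forwarding tables. Real ZGC also changes which color counts as "good" in each phase, which this sketch leaves out.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy colored pointer + self-healing load barrier (illustrative only, not ZGC's layout). */
final class LoadBarrierSketch {
    static final long REMAPPED_BIT = 1L << 40;       // assumed metadata bit in the pointer
    static final long ADDRESS_MASK = (1L << 32) - 1; // low bits hold the address

    final Map<Long, String> memory = new HashMap<>();   // address -> object payload
    final Map<Long, Long> forwarding = new HashMap<>(); // old address -> new address

    /** GC side: move the object and remember where it went. */
    void relocate(long from, long to) {
        memory.put(to, memory.remove(from));
        forwarding.put(from, to);
    }

    /** Load barrier: returns a pointer that is safe to dereference. */
    long loadBarrier(long pointer) {
        if ((pointer & REMAPPED_BIT) != 0) return pointer;      // fast path: already good
        long address = pointer & ADDRESS_MASK;
        long current = forwarding.getOrDefault(address, address);
        return current | REMAPPED_BIT;                          // slow path: fix the pointer
    }

    /** Application-side read of a reference slot (slot[0] models a field). */
    String read(long[] slot) {
        slot[0] = loadBarrier(slot[0]); // self-healing: store the corrected pointer back
        return memory.get(slot[0] & ADDRESS_MASK);
    }

    public static void main(String[] args) {
        LoadBarrierSketch heap = new LoadBarrierSketch();
        heap.memory.put(0x1000L, "order-42");
        long[] field = {0x1000L};           // the application holds a reference

        heap.relocate(0x1000L, 0x2000L);    // the GC moves the object concurrently

        System.out.println(heap.read(field));                          // order-42
        System.out.println(Long.toHexString(field[0] & ADDRESS_MASK)); // 2000
    }
}
```

After the first read the field itself holds the corrected pointer, so later reads take the fast path.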

Shenandoah:
Developed by Red Hat (now part of OpenJDK), Shenandoah shares ZGC's goal of achieving ultra-low pause times independent of heap size.
    Similarities: It is also region-based and uses a concurrent approach.
    Distinction: Like ZGC, Shenandoah compacts the heap concurrently, so the heap stays compact and healthy without long STW events. The difference is in the mechanism: instead of colored pointers, Shenandoah tracks moved objects through forwarding pointers stored with each object, combined with its own barriers.
    Best For: Scenarios similar to ZGC—very large heaps and demanding latency requirements.
    Enable: -XX:+UseShenandoahGC
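After picking a flag, it helps to confirm which collector the JVM actually started with. The small program below uses the standard java.lang.management API; the class name WhichCollector is just a placeholder. The bean names depend on the collector and JDK version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Prints the garbage collectors active in the current JVM. */
final class WhichCollector {
    /** Names of the collector beans the running JVM exposes. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                + ": " + gc.getCollectionCount() + " collections, "
                + gc.getCollectionTime() + " ms total");
        }
    }
}
```

Run it with the flag you want to check, for example java -XX:+UseG1GC WhichCollector.java, and compare the printed names with the collector you expected.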

Sunday, 9 November 2025

OCI Ops Insights: Turning Data Into Proactive Intelligence

 What Is OCI Ops Insights? 

Ops Insights is Oracle’s intelligent observability and analytics service that provides comprehensive visibility into resource usage, capacity, and SQL performance across databases and hosts — whether they run on OCI, on-premises, or in hybrid environments.

Think of it as your command center for operational intelligence — combining analytics, automation, and AI-driven recommendations to keep your systems optimized and predictable.

Core Capabilities

 

1. Database Insights
Gain complete visibility into the performance and health of your databases.
SQL Insights – Analyze SQL performance trends, find inefficient queries, and identify tuning opportunities.
Database Performance – Track database-level metrics and diagnose bottlenecks before they impact users.
ADDM Spotlight & AWR Hub – Access Automatic Workload Repository data across your entire fleet for unified analysis.

2. Capacity Planning
Forecast capacity issues before they happen.
Monitor CPU and storage utilization across databases, hosts, and Exadata systems.
Predict growth trends to plan for future expansion or cost optimization.
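To make the idea of trend-based forecasting concrete, here is a minimal sketch that fits a straight line to daily utilization samples and estimates when a threshold would be crossed. It is only an illustration with hypothetical names and sample data; it does not represent the forecasting models Ops Insights uses.

```java
/** Toy linear-trend capacity forecast (illustrative only). */
final class CapacityTrendSketch {
    /** Least-squares line through (day index, utilization) samples: returns {slope, intercept}. */
    static double[] fitLine(double[] samples) {
        int n = samples.length; // assumes at least two samples
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (double y : samples) meanY += y / n;
        double sxy = 0, sxx = 0;
        for (int x = 0; x < n; x++) {
            sxy += (x - meanX) * (samples[x] - meanY);
            sxx += (x - meanX) * (x - meanX);
        }
        double slope = sxy / sxx;
        return new double[] {slope, meanY - slope * meanX};
    }

    /** Day index at which the fitted line reaches the threshold, or -1 if usage is not growing. */
    static double dayReaching(double[] samples, double threshold) {
        double[] line = fitLine(samples);
        return line[0] <= 0 ? -1 : (threshold - line[1]) / line[0];
    }

    public static void main(String[] args) {
        double[] storageUsedPercent = {50, 52, 54, 56, 58}; // made-up daily samples
        double day = dayReaching(storageUsedPercent, 80);
        System.out.println("80% reached around day " + Math.round(day)); // day 15
    }
}
```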

3. Exadata Insights
Get specialized performance and capacity visibility for Exadata infrastructure.
Analyze workloads with Exadata Warehouse.
Explore data with Exadata Explorer to pinpoint system-level trends.

4. Dashboards & Reporting
Visualize and communicate insights effectively:
Create custom dashboards using out-of-box widgets or saved searches.
Generate news-style reports to share operational summaries with teams and management.
Use the AWR Explorer and Data Object Explorer for deep performance exploration.

5. Administration & Configuration
Seamlessly manage your monitored environment:
Configure agent-managed and Enterprise Manager-managed resources.
Enable Autonomous AI Database Full Feature for advanced analytics.
Manage endpoints, AWR Hubs, and collection configurations with ease.  


Saturday, 1 November 2025

Securing Oracle Databases with Oracle Data Safe

 

 
What Is Oracle Data Safe?
Oracle Data Safe is a cloud-based, unified security control center designed specifically for Oracle Databases — whether they reside in Oracle Cloud Infrastructure (OCI), Autonomous Database, or on-premises deployments.

It simplifies the complex, manual tasks involved in securing databases and meeting compliance requirements. With a few clicks, you can evaluate risks, analyze user privileges, discover sensitive data, apply masking policies, and audit activities.
 
Features of Oracle Data Safe:
 
 
 
🔍 1. Security Assessment
The Security Assessment feature evaluates the security posture of your Oracle Databases.
It reviews configurations, user accounts, and security controls, then provides detailed findings with actionable recommendations to reduce or mitigate risks.

Key aspects:
  • Analyzes configuration settings, user privileges, and security parameters.
  • Compares against industry frameworks like STIG, CIS Benchmarks, EU GDPR, and Oracle best practices.
  • Generates an overall Security Score and a prioritized list of vulnerabilities.
This ensures your databases consistently align with compliance standards and internal security policies.

👥 2. User Assessment
User Assessment identifies users and accounts that may pose security risks due to excessive privileges, weak authentication, or poor password practices.
It analyzes user data stored in the database dictionary and assigns a risk score to each user.

Capabilities include:
  • Identifies highly privileged or inactive accounts.
  • Evaluates password policies, authentication types, and password change frequency.
  • Links directly to related audit trail entries for deeper investigation.
This enables DBAs and security teams to implement least-privilege access controls and strengthen user governance.

🧭 3. Data Discovery

Data Discovery automates the identification of sensitive data within your Oracle Databases.
It scans both data and metadata to locate information that could fall under privacy or compliance regulations.

Highlights:
  • Detects data across multiple sensitivity categories — personal, financial, healthcare, employment, academic, and more.
  • Offers default discovery templates or lets you define custom data models to fit your organization’s classification standards.
  • Produces clear reports listing schemas, tables, and columns containing sensitive data.
With Data Discovery, you know exactly where your critical data resides, a foundational step toward compliance and data protection.

🧩 4. Data Masking
The Data Masking feature helps organizations protect sensitive data when replicating or sharing databases for development, testing, or analytics.
It replaces real values with realistic but fictitious data, maintaining referential integrity while ensuring privacy.

Key benefits:
  • Supports multiple masking formats — randomization, substitution, nullification, and lookup-based.
  • Integrates seamlessly with Data Discovery results for consistent masking policies.
  • Enables safe use of production-like data in non-production environments.
This reduces the risk of data exposure and helps organizations comply with data privacy regulations.

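To illustrate what the masking formats listed above do to a value, here are a few plain Java helpers. They are hypothetical stand-ins written for this post, not Data Safe's API; in Data Safe itself, masking is configured through masking policies rather than application code.

```java
import java.util.List;
import java.util.Random;

/** Toy versions of common masking formats (illustrative only, not Data Safe code). */
final class MaskingSketch {
    /** Nullification: the sensitive value is simply removed. */
    static String nullify(String value) {
        return null;
    }

    /** Lookup-based substitution: the same input always maps to the same fake value,
     *  so joins between masked tables still line up. */
    static String substitute(String value, List<String> lookup) {
        return lookup.get(Math.floorMod(value.hashCode(), lookup.size()));
    }

    /** Randomization: every digit is replaced, while the format (dashes, length) is kept. */
    static String randomizeDigits(String value, Random rng) {
        StringBuilder out = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            out.append(Character.isDigit(c) ? (char) ('0' + rng.nextInt(10)) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> fakeNames = List.of("Alex Doe", "Sam Roe", "Kim Poe", "Lee Moe");
        String first = substitute("Alice Smith", fakeNames);
        String second = substitute("Alice Smith", fakeNames);
        System.out.println("consistent substitution: " + first.equals(second)); // true
        System.out.println(randomizeDigits("555-12-3456", new Random()));       // same shape, new digits
        System.out.println(nullify("secret"));                                  // null
    }
}
```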
📜 5. Activity Auditing
Activity Auditing provides continuous visibility into who is doing what in your databases.
It captures user activities — from logins and schema changes to data queries and privilege modifications.

Capabilities:
  • Monitors database activity in real time.
  • Generates audit reports for compliance and governance reviews.
  • Detects unusual or unauthorized access patterns.
Auditing is crucial for incident investigation, accountability, and regulatory compliance.

⚡ 6. Alerts
Alerts keep you informed of unusual or high-risk database activities as they occur.
You can define custom thresholds or use predefined alert templates to detect anomalies in user behavior or database operations.
With proactive alerting, teams can respond faster to threats, minimizing potential damage and downtime.

🧱 7. SQL Firewall (introduced with Oracle Database 23ai, now Oracle AI Database 26ai)
The SQL Firewall introduces an advanced layer of protection directly at the SQL level, helping safeguard databases from SQL injection attacks, compromised accounts, and unauthorized queries.
Oracle Data Safe acts as the central management hub for SQL Firewall policies across all connected databases.

Capabilities:
  • Collects and baselines authorized SQL activities for each user.
  • Generates allowlist-based firewall policies that define approved SQL statements and connection paths.
  • Monitors and reports SQL Firewall violations in real time across your entire database fleet.
This feature enables a zero-trust approach to database access, ensuring only verified SQL statements are executed against your most sensitive systems.
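The allowlist idea (learn what each user normally runs, then reject anything else) can be sketched in a few lines. This is a conceptual toy with hypothetical names and a deliberately naive normalization step; it is not how Oracle's SQL Firewall is implemented or configured.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy per-user SQL allowlist: learn a baseline, then enforce it (illustrative only). */
final class AllowListSketch {
    private final Map<String, Set<String>> allowed = new HashMap<>();

    /** Naive normalization: literals become '?', whitespace and case are unified. */
    static String normalize(String sql) {
        return sql.trim()
                  .replaceAll("'[^']*'", "?")    // string literals
                  .replaceAll("\\b\\d+\\b", "?") // numeric literals
                  .replaceAll("\\s+", " ")
                  .toLowerCase(Locale.ROOT);
    }

    /** Learning phase: record a statement shape as approved for this user. */
    void learn(String user, String sql) {
        allowed.computeIfAbsent(user, u -> new HashSet<>()).add(normalize(sql));
    }

    /** Enforcement phase: only statement shapes seen during learning pass. */
    boolean isAllowed(String user, String sql) {
        return allowed.getOrDefault(user, Set.of()).contains(normalize(sql));
    }

    public static void main(String[] args) {
        AllowListSketch firewall = new AllowListSketch();
        firewall.learn("app_user", "SELECT name FROM emp WHERE id = 7");

        // Same shape with a different literal: allowed.
        System.out.println(firewall.isAllowed("app_user", "select name from emp where id = 42")); // true
        // Injected predicate changes the shape: blocked.
        System.out.println(firewall.isAllowed("app_user", "SELECT name FROM emp WHERE id = 7 OR 1=1")); // false
    }
}
```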
 
Step-by-Step Configuration Guide:
Step 1: Open Data Safe in the OCI Console
  • Sign in to your OCI Console with appropriate privileges (Security Administrator or tenancy-level admin).
  • In the left navigation menu, go to Oracle AI Database → Data Safe - Database Security.
 

  
Step 2: Register Your Database
Before you can run any assessments or audits, your database needs to be registered with Data Safe.

Supported Target Databases:
  • On-Premises Oracle AI Database
  • Oracle Autonomous AI Database on Dedicated Exadata Infrastructure 
  • Oracle Autonomous AI Database on Exadata Cloud@Customer
  • Oracle Autonomous AI Database Serverless
  • Oracle Base Database Service
  • Oracle AI Database on a compute instance in Oracle Cloud Infrastructure
  • Oracle Exadata Database Service on Cloud@Customer
  • Oracle Exadata Database Service on Dedicated Infrastructure
  • Oracle Exadata Database Service on Exascale Infrastructure
  • Amazon RDS for Oracle
  • Oracle Database@AWS
  • Oracle Database@Azure
  • Oracle Database@Google Cloud
Let's register an Autonomous Database.
In the OCI Console, navigate to Data Safe → Targets → Register Target Database.
 

 
For Database Type, select Autonomous Database.
Under Data Safe Target Information:
  • Choose the Compartment where your database resides.
  • Select your database from the drop-down list of available Autonomous Databases.
  • Enter a Display Name for your Data Safe target.
  • (Optional) Add a Description to help identify the purpose or environment of this database (e.g., “Data Safe practice environment”).
  • Choose a Compartment for the target registration and (Optional) apply Tags for easier management and automation.
  • Review the connection details to ensure the selected database and compartment information are correct. 

 

Click Register to complete the process.
 
 
Step 3: Explore the Data Safe Dashboard

After completing the registration, your target database will now appear in the Targets list with an Active status — confirming a successful connection to Oracle Data Safe.



Now, let’s move to the Oracle Data Safe Dashboard, the central console where you can view, monitor, and manage all your database security operations.
 
In the OCI Console, navigate to Oracle AI Database → Data Safe - Database Security → Dashboard.

This will take you to the Data Safe → Security Center → Dashboard, where you can view an integrated overview of your database security posture — including assessments, user risks, sensitive data discovery, and audit summaries across all registered databases.

 

 

You can view quick summaries such as:

Security assessment:

 

User assessment:

 

From this dashboard, you can easily navigate to each of the key features:
Assessments – Run or view Security and User Assessments
Data Discovery & Masking – Identify and protect sensitive data
Auditing – Monitor and analyze database activities
SQL Firewall & Alerts – Manage SQL protection and incident notifications

This blog covers the high-level steps to set up Oracle Data Safe.
In the next post, I will share more detailed insights and advanced configurations to get the most out of Data Safe.