Oracle (Active) Data Guard 19c


Oracle (Active) Data Guard 19c
Real-Time Data Protection and Availability
WHITE PAPER / MARCH 7, 2019

TABLE OF CONTENTS
Introduction
Oracle Active Data Guard – An Overview
Data Guard 19c New Features
How Data Guard Synchronizes Standby Database(s)
Protection Modes
Managing a Data Guard Configuration
Using Data Guard to Reduce Planned Downtime
Active Data Guard
Conclusion
Appendix A: Summary of (Active) Data Guard Features by Version
Appendix B: Transient Logical Database Rolling Upgrade

WHITE PAPER / Oracle (Active) Data Guard 19c

INTRODUCTION
Successful high availability (HA) architectures prevent downtime and data loss by using redundant systems and software to eliminate single points of failure. The same principle applies to mission critical databases. Administrator error, data corruption caused by system or software faults, or complete site failures can affect the availability of a database. Even a clustered database running on multiple servers using shared storage can be exposed to single points of failure if not adequately protected.

The only way to avoid being impacted by single points of failure is to have a completely independent copy of a production database already running on a different system, ideally deployed at a second location, which can be quickly accessed if the production database becomes unavailable for any reason.

Oracle Active Data Guard is the most comprehensive solution available to eliminate single points of failure for mission critical Oracle Databases. It prevents data loss and downtime in the simplest and most economical manner by maintaining a synchronized physical replica of a production database at a remote location. If the production database is unavailable for any reason, client connections can quickly, and in some configurations transparently, fail over to the synchronized replica to restore service. Active Data Guard eliminates the high cost of idle redundancy by allowing reporting applications, ad-hoc queries, and data extracts to be offloaded to read-only copies of the production database. Active Data Guard’s deep integration with Oracle Database and complete focus on real-time data protection and availability avoids compromises found in storage remote mirroring or other host-based replication solutions.

This paper describes both Active Data Guard (a licensed option) and Data Guard (included in Oracle Database Enterprise Edition) in detail. It is tailored to IT managers, database administrators, and technical staff who are evaluating different alternatives to protect against data loss and database downtime.

ORACLE ACTIVE DATA GUARD – AN OVERVIEW
Oracle (Active) Data Guard capabilities in Oracle Database 19c further enhance its strategic objective of preventing data loss, providing high availability, eliminating risk, and increasing return on investment by enabling highly functional active disaster recovery systems that are simple to deploy and manage. It accomplishes this by providing the management, monitoring, and automation software infrastructure to create and maintain one or more synchronized standby databases that protect Oracle data from failures, data corruption, human error, and disasters.

Figure 1: Oracle Active Data Guard Architecture Overview

Active Data Guard uses the simplicity of physical replication, but its deep integration with Oracle Database provides unique isolation between primary and standby databases to deliver the highest level of protection against data loss. Active Data Guard supports both synchronous (guaranteed zero data loss) and asynchronous (near-zero data loss) protection. To maintain high availability for mission critical applications, database administrators can choose either manual or automatic failover to a standby should the primary system become unavailable for any reason.

Active Data Guard is a licensed option for Oracle Database Enterprise Edition. All capabilities described in the following sections that are explicitly referred to as being ‘Active Data Guard’ require an Active Data Guard license. All capabilities that are explicitly referred to as ‘Data Guard’ are included with Oracle Enterprise Edition; no option license is required. Active Data Guard is a superset of Data Guard and thus inherits all Data Guard capabilities.

A major advantage of Active Data Guard 19c is its improved ability to offload read-intensive applications to the standby. It is now also possible to issue occasional DML against the standby database, which makes the standby a fully functional reporting database. This improves return on investment: the primary database is relieved of reporting work, and the resources of the DR system are put to productive use.

DATA GUARD 19C NEW FEATURES

ACTIVE DATA GUARD DML REDIRECT
This is an Active Data Guard only feature. It enables DML operations on the standby database to be redirected to the primary database, so that reporting applications that make infrequent writes can run on the Active Data Guard standby database.

Figure 2: DML Redirect

Applying DML on the standby database takes five steps:
1. The user issues DML against the open standby database.
2. This DML is redirected to the primary database.
3. The DML is then applied in the primary database.
4. The redo generated by the change is streamed back to the standby database.
5. Applying that redo on the standby completes the DML redirect.

DML Redirect can be configured for all sessions connecting to the standby database by setting the system initialization parameter ADG_REDIRECT_DML to TRUE. Alternatively, and to override the system parameter, ADG_REDIRECT_DML can be used in an ALTER SESSION command to enable DML redirect for the current session only:

    ALTER SESSION ENABLE ADG_REDIRECT_DML;

In either case, DML Redirect should mainly be used for read-mostly applications that make only occasional updates.

FAST-START FAILOVER
Fast-Start Failover (FSFO) is a feature of the Oracle Data Guard Broker that enables an automatic failover to the standby database upon failure of the primary. FSFO can be configured in either an active or an observe-only mode. The benefit of observe-only mode is that it allows users to track the behavior of the Data Guard Broker and see the interaction that would have occurred during normal production processing.

This allows the user to tune the FSFO properties more precisely and to discover under what circumstances an automatic failover would have occurred in their environment. This makes it easier to justify using automatic failovers in order to reduce the recovery time.

The fast-start failover target (active or observe-only) can be changed dynamically, without disabling fast-start failover and without impacting the current environment. This allows users to test how fast-start failover will work by using observe-only mode as needed.

NEW PARAMETERS FOR TUNING AUTOMATIC OUTAGE RESOLUTION WITH DATA GUARD
Oracle Data Guard has several processes on the primary and standby databases that handle redo transport and archiving and that communicate with each other over the network. In certain failure situations (network hangs, disconnects, and disk I/O issues), these processes can hang, potentially causing delays in redo transport and gap resolution. Data Guard has an internal mechanism to detect these hung processes and terminate them, allowing normal outage resolution to occur.

The following parameters allow the wait times to be tuned for a specific Data Guard configuration based on the user's network and disk I/O behavior:

- DATA_GUARD_MAX_IO_TIME
  This parameter sets the maximum number of seconds that can elapse before a process is considered hung while performing a regular I/O operation in an Oracle Data Guard environment. Regular I/O operations include read, write, and status operations.
- DATA_GUARD_MAX_LONGIO_TIME
  This parameter sets the maximum number of seconds that can elapse before a process is considered hung while performing a long I/O operation in an Oracle Data Guard environment. Long I/O operations include open and close operations.

SIMPLIFIED DATABASE PARAMETER MANAGEMENT IN A BROKER CONFIGURATION
Users can now manage all Data Guard related parameter settings using SQL*Plus ALTER SYSTEM commands or, in DGMGRL, the new EDIT DATABASE ... SET PARAMETER command. Parameter changes made in the DGMGRL interface are immediately executed on the target database. In addition, this new capability allows the user to modify a parameter on all databases in a Data Guard configuration using the ALL qualifier. This eliminates the need to attach to each database and execute an ALTER SYSTEM command, or to set a Broker property for each database with multiple EDIT PROPERTY commands.

The SHOW command has also been updated to show the current setting of a parameter in the target database.

SUPPORT FOR MULTI-SHARD QUERY COORDINATORS ON SHARD CATALOG STANDBY DATABASES
Before Oracle Database 19c, only the primary shard catalog database could be used as the multi-shard query coordinator. In Oracle Database 19c, you can also enable the multi-shard query coordinator on the shard catalog's Oracle Active Data Guard standby databases.
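The initialization parameters introduced earlier in this section are set like any other dynamic parameter. The statements below are a minimal sketch rather than a recommended configuration: the timeout values (60 and 120 seconds) are illustrative assumptions, and SCOPE = BOTH assumes the instance is started from a server parameter file.

```sql
-- Sketch only: values are illustrative assumptions, not tuning advice.

-- On the Active Data Guard standby: redirect DML for every session.
ALTER SYSTEM SET ADG_REDIRECT_DML = TRUE SCOPE = BOTH;

-- Or enable it for the current session only (overrides the system setting).
ALTER SESSION ENABLE ADG_REDIRECT_DML;

-- Seconds before a process performing regular I/O (read, write, status)
-- is considered hung.
ALTER SYSTEM SET DATA_GUARD_MAX_IO_TIME = 60 SCOPE = BOTH;

-- Seconds before a process performing long I/O (open, close)
-- is considered hung.
ALTER SYSTEM SET DATA_GUARD_MAX_LONGIO_TIME = 120 SCOPE = BOTH;
```

In a broker-managed configuration the same settings could instead be made through DGMGRL, as described under SIMPLIFIED DATABASE PARAMETER MANAGEMENT IN A BROKER CONFIGURATION.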

RESTORE POINT REPLICATION
The process of flashing back a physical standby to a point in time that was captured on the primary is simplified by automatically replicating restore points from the primary to the standby. These restore points are called replicated restore points. Irrespective of whether a restore point on the primary database is a guaranteed restore point or a normal restore point, the corresponding replicated restore point is always a normal restore point.

The replication of restore points depends on two conditions:
1. The COMPATIBLE initialization parameter for both the primary database and the standby database is set to 19.0.0 or higher.
2. The primary database is open. A restore point that is created on a primary database when the primary is in mount mode is not replicated. This restriction exists because the restore point information is replicated through the redo.

Replicated restore points can be identified by the suffix “_PRIMARY” appended to the original name and are displayed in V$RESTORE_POINT. This view has been updated with a new column, REPLICATED. When you delete a restore point on the primary, the corresponding replicated restore point on the standby is also deleted.

The managed redo process (MRP) manages the creation and maintenance of replicated restore points. If restore points are created on the primary database when MRP is not running, these restore points are replicated to the standby database after MRP is started.

PHYSICAL STANDBY RECOVERY
When flashback or point-in-time recovery is performed on the primary database, a standby that is in mounted mode can automatically follow the same recovery procedure performed on the primary. If the standby database was in mount mode at the time of the recovery operation on the primary database, no user intervention is needed.

If the standby database was open, it must be restarted in mount mode and recovery must be restarted. Recovery then automatically flashes back the standby database when necessary, restarts itself, and follows the primary database. For this to succeed, the parameter DB_FLASHBACK_RETENTION_TARGET must be set to a sufficiently high value so that the standby database can perform these operations.
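As a rough illustration of the two subsections above, the statements below create a restore point on the primary, check on the standby that it was replicated, and raise the flashback retention target. The restore point name before_batch and the 2880-minute retention are hypothetical values chosen only for this example.

```sql
-- On the primary (it must be open for the restore point to replicate).
CREATE RESTORE POINT before_batch;

-- On the standby: replicated restore points carry the _PRIMARY suffix
-- and are flagged in the REPLICATED column.
SELECT name, scn, time, replicated
FROM   v$restore_point
ORDER  BY time;

-- On the standby: keep enough flashback history (value is in minutes)
-- for the standby to follow a flashback or point-in-time recovery
-- performed on the primary.
ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET = 2880 SCOPE = BOTH;
```

How large the retention target needs to be depends on how far back the primary may be flashed back; the value shown is an assumption, not a recommendation.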

HOW DATA GUARD SYNCHRONIZES STANDBY DATABASE(S)
A Data Guard configuration includes a production database, referred to as the primary database, and up to 30 directly connected replicas, referred to as standby databases. Primary and standby databases connect over TCP/IP using Oracle Net Services. There are no restrictions on where the databases are physically located provided they can communicate with each other. A standby database is created from a backup of the primary database without requiring any downtime of the production application or database. Once a standby database has been created and configured, Data Guard automatically synchronizes the primary and standby databases by transmitting the primary database redo (the change vector information used by every Oracle Database to protect transactions) as it is generated at the primary database and applying it to the standby database.

REDO TRANSPORT SERVICE
Data Guard redo transport services handle all aspects of transmitting redo from a primary to one or more standby databases. As users commit transactions at a primary database, redo records are generated and written to a local online log file. Data Guard transport services simultaneously transmit the same redo directly from the primary database log buffer (memory allocated within the system global area) to the standby database(s), where it is written to a standby redo log file. Data Guard redo transport is very efficient for the following reasons:

- Data Guard’s direct transmission from memory avoids disk I/O overhead on a primary database. This is different from how other host-based replication solutions increase I/O on a primary database by reading data from disk and writing captured data back to disk in special-purpose files utilized by their replication processes.
- Data Guard transmits only database redo. This is in stark contrast to storage remote-mirroring, which must transmit every changed block of every file in order to maintain real-time synchronization. Oracle tests have shown that storage remote-mirroring transmits up to 7 times more network volume, and 27 times more network I/O operations, than Data Guard.
- Data Guard physical standby also avoids the I/O overhead of supplemental logging at the primary database required by logical replication solutions. The advantages of physical replication in minimizing I/O impact also extend to the standby database where, unlike logical replication, the Data Guard apply process does not generate local redo that must be written and archived to disk.

Data Guard offers two choices of transport services: synchronous and asynchronous.

SYNCHRONOUS REDO TRANSPORT
Synchronous redo transport requires a primary database to wait for confirmation from the standby that redo has been received and written to disk (a standby redo log file) before commit success is signaled to the application. Synchronous transport, combined with the deep understanding of transaction semantics by Data Guard apply services, provides a guarantee of zero data loss if the primary database suddenly fails.
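For concreteness, a synchronous redo destination might be defined on the primary along the lines sketched below. The database names chicago (primary) and boston (standby), the Oracle Net service name boston, and the 30-second NET_TIMEOUT are placeholder assumptions. A real configuration, particularly one managed by the Data Guard broker, may use different attributes.

```sql
-- Run on the primary. All names and values are hypothetical.

-- List the databases that belong to this Data Guard configuration.
ALTER SYSTEM SET LOG_ARCHIVE_CONFIG = 'DG_CONFIG=(chicago,boston)' SCOPE = BOTH;

-- SYNC   : commit waits for the standby to acknowledge the redo.
-- AFFIRM : the standby acknowledges only after writing the redo to disk.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=boston SYNC AFFIRM NET_TIMEOUT=30 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston' SCOPE = BOTH;

-- Make sure the destination is enabled.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = ENABLE SCOPE = BOTH;
```

Replacing SYNC AFFIRM with ASYNC in the destination string would give asynchronous transport instead.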

Although there is no physical limit to the distance between primary and standby sites, there is a practical limit to the distance that can be supported. As distance increases, the amount of time that the primary must wait to receive standby acknowledgement also increases, directly impacting application response time and throughput. Two synchronous transport options introduced in Oracle Database 12c Release 1 address this performance concern:

- Fast Sync provides an easy way of improving performance in synchronous zero data loss configurations. Fast Sync allows a standby to acknowledge the primary database as soon as it receives redo in memory, without waiting for disk I/O to a standby redo log file (SYNC NOAFFIRM). This reduces the impact of synchronous transport on primary database performance by shortening the total round-trip time between primary and standby. Fast Sync can introduce a very small exposure to data loss should simultaneous failures impact both primary and standby databases before the standby I/O completes. The time interval, however, is so brief (both failures must occur within milliseconds of each other) and the circumstances so unusual that this is very unlikely to occur. Fast Sync is included with Data Guard.
- Far Sync enables a zero data loss failover to a remote standby database even if it is located thousands of miles away, without affecting primary database performance or materially increasing cost or complexity. Far Sync is included with Active Data Guard (see the Active Data Guard section of this paper for more details).

ASYNCHRONOUS REDO TRANSPORT
Asynchronous redo transport avoids any impact to primary database performance by acknowledging commit success to the application as soon as the local log-file write is complete; it never waits for the standby database to acknowledge receipt. This performance benefit comes with the potential for a small amount of data loss because there can be no guarantee that at any moment in time all redo for committed transactions has been received by the standby.

DATA GUARD TRANSPORT AND MULTI-STANDBY CONFIGURATIONS
A multi-standby configuration with both local and remote standby databases provides the following benefits:

- Best data protection. The close proximity of the local Data Guard standby enables zero data loss failover with minimal impact to database performance. Data Guard Fast-Start Failover can also be used to automatically fail over to the local standby without manual intervention.
- Highest availability. Client database connections can rapidly and transparently fail over to the local standby using Transparent Application Failover and Fast Connection Failover. In-flight transactions also fail over transparently using Application Continuity, new with Oracle Database 12c Release 1 and included with Active Data Guard or Oracle RAC.

- Simple operation with continuous data protection. Following a failover to the local standby, the remote standby database automatically recognizes that failover has occurred and begins receiving redo from the new primary database, maintaining DR protection at all times.
- Cost effective and flexible. While always ready to serve as the production database in case of a failure, the standby databases can be multi-purposed to function as a test system using Data Guard Snapshot Standby. In addition, they can be used to offload read-only workloads from the primary database, to offload fast incremental backups, or to perform database rolling upgrades using Active Data Guard.

AUTOMATIC GAP RESOLUTION
In cases where primary and standby databases become disconnected (network failures or standby server failures), redo stops being shipped to that standby database. The primary database continues to process transactions and accumulates a backlog of redo until a new connection to the standby database has been established. This disconnected period is reported as an archive log gap and measured as transport lag. While in this state, Data Guard monitors the status of the disconnected standby database, detects when the connection is re-established, and automatically reconnects and resynchronizes the standby database with the primary by sending the archive log files generated during the disconnected period. Note that in Maximum Protection mode, if the disconnected standby database is the last remaining synchronous redo destination, there cannot be a redo gap, because the primary database will abort itself to guarantee zero data loss. For more detail, see Protection Modes later in this paper.

REDO APPLY SERVICES
Redo Apply services run on a physical standby database. Redo Apply reads redo records from a standby redo log file, performs Oracle validation to ensure that redo is not corrupt, and then applies those redo changes to the standby database. Redo Apply functions independently of redo transport to ensure that primary database performance and data protection (Recovery Point Objective, RPO) are not affected by apply performance at the standby database. Even in the extreme case where apply services have been stopped, Data Guard transport continues to protect primary data by transmitting redo to the standby, where it is archived for later use when the apply process is restarted. By default, the Redo Apply processes run on one node of the physical standby system even if there are multiple nodes in the standby cluster. Starting with Oracle Database 12c Release 2, the Redo Apply services can be spread across multiple nodes, referred to as “Multi-Instance Redo Apply”, increasing the apply rate almost linearly.

CONTINUOUS ORACLE DATA VALIDATION
Data Guard uses Oracle Database processes to continuously validate redo before it is applied to the standby database. Redo is completely isolated from I/O corruptions on the primary because it is shipped directly from the primary log buffer, the equivalent of a memory copy (memcpy) function across the network. Knowledge of the Oracle block format is used by the Oracle Database to enable corruption-detection checks at several key interfaces during redo transport and apply, ensuring both physical and logical intra-block consistency. The software code-path executed on a standby database is also fundamentally different from that of the primary, effectively isolating the standby database from firmware and software errors that can affect a primary database.

Data Guard also detects silent corruption caused by lost writes. A lost write occurs when an I/O subsystem acknowledges the completion of a write that did not actually occur in persistent storage. On a subsequent block read, the I/O subsystem returns the stale version of the data block, which can be used to update other blocks of the database, thereby spreading corruption. Data Guard prevents this by performing lost-write validation at the standby database (relieving the primary database of this overhead). Data Guard detects lost-write corruption whether it occurs at the primary or at the standby.

As of Oracle Database 19c, lost write detection (called shadow lost write protection) can be implemented on a production database to detect lost writes even if a standby is not configured or is not applying redo at that time. Shadow lost write protection detects a lost write before it can result in a major data corruption. You can enable shadow lost write protection for a database, a tablespace, or a data file without requiring an Oracle Data Guard standby database. Shadow lost write protection provides fast detection and immediate response to a lost write, thus minimizing the data loss that can occur in a database due to data corruption.

PROTECTION MODES
Data Guard provides three different modes to balance cost, availability, performance, and data protection, shown in Table 1. Each mode uses a specific redo transport method and defines the behavior of the Data Guard configuration if a primary database loses contact with its standby.

Maximum Availability    Maximum Performance    Maximum Protection
AFFIRM                  NOAFFIRM               AFFIRM
SYNC                    ASYNC                  SYNC

Table 1

These values are configured in the SERVICE descriptor of the LOG_ARCHIVE_DEST_n parameter for redo transport.

MANAGING A DATA GUARD CONFIGURATION
You can use SQL*Plus to manage primary and standby databases and their various interactions. Data Guard also offers a distributed management framework called the Data Guard broker, which automates and centralizes the creation, maintenance, and monitoring of a Data Guard configuration. Note that the actual creation of the standby database is performed outside the broker using one of the prescribed methods: Enterprise Manager Cloud Control, RMAN duplicate commands, or the Database Configuration Assistant (DBCA), which is new in Oracle Database 19c.

Database Administrators (DBAs) interact with the broker using either the broker’s command-line interface or Oracle Enterprise Manager Cloud Control. Enterprise Manager includes wizards that further simplify the creation of a Data Guard configuration and its standby databases. Key Data Guard metrics such as apply lag, transport lag, redo rate, and configuration status are displayed on both the Data Guard management page (see Figure 3) and on the consolidated HA Console. Enterprise Manager also enables automatic notification should any metric exceed pre-configured threshold values.
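The lag metrics mentioned above can also be read directly with SQL on a standby database. The query below is a small sketch against the V$DATAGUARD_STATS view; any thresholds or alerting built on top of it are left to the reader.

```sql
-- Run on the standby database.
-- VALUE is reported as an interval string; TIME_COMPUTED shows when
-- the figure was last calculated.
SELECT name, value, unit, time_computed
FROM   v$dataguard_stats
WHERE  name IN ('transport lag', 'apply lag');
```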

Figure 3: Data Guard Management in Enterprise Manager Cloud Control

ROLE MANAGEMENT SERVICES – SWITCHOVER AND FAILOVER
Data Guard role management services quickly transition a designated standby database to the primary role. A switchover is a planned event used to reduce downtime during planned maintenance, such as operating system or hardware upgrades, rolling upgrades of Oracle Database, and other database maintenance. Maintenance is first performed at a standby database, and a switchover then moves production from the primary to the standby operating at the new version. A switchover is always a zero data loss operation regardless of the transport method or protection mode used.

A failover brings a standby online as the new primary during an unplanned outage of the original primary database. A failover does not require the standby database to be restarted in order to assume the primary role. Also, as long as the original primary database can be mounted and its files are intact, it can be quickly reinstated and resynchronized as a standby database using Flashback Database; there is no need to restore from a backup.

Manual failover is initiated by the DBA using the Oracle Enterprise Manager GUI, the Data Guard broker’s command line interface, or SQL*Plus. Optionally, Data Guard can perform automatic failover using the broker’s Fast-Start Failover (FSFO).

FAST-START FAILOVER
The Data Guard Broker’s Fast-Start Failover allows Data Guard to automatically fail over to a previously chosen standby database without requiring manual intervention to invoke the failover. Data Guard continuously monitors the status of the configuration and initiates a failover if needed. Fast-Start Failover has built-in controls to prevent split-brain (a condition where more than one database believes it is the primary at the same time). This simple yet tightly controlled architecture makes fast-start failover ideal when both HA and DR are required.
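For configurations that are not managed by the broker, the manual role transitions described under ROLE MANAGEMENT SERVICES can be issued from SQL*Plus. The sketch below assumes Oracle Database 12c or later and a hypothetical standby whose DB_UNIQUE_NAME is boston. It is an outline of the statements involved, not a complete procedure.

```sql
-- Run on the primary. 'boston' is a hypothetical standby DB_UNIQUE_NAME.

-- Check readiness first; problems are reported without changing roles.
ALTER DATABASE SWITCHOVER TO boston VERIFY;

-- Planned role change: boston becomes the new primary.
ALTER DATABASE SWITCHOVER TO boston;

-- Unplanned outage: run on the standby itself (here, boston) instead.
-- ALTER DATABASE FAILOVER TO boston;
```

After the switchover statement completes, the new primary still has to be opened and the former primary restarted as a standby; those steps are omitted here. In a broker-managed configuration, role changes should be made through the broker rather than with these statements.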

AUTOMATING CLIENT FAILOVER
The ability to quickly perform a database failover is only the first requirement for HA. Applications must also be able to quickly drop their connections to a failed primary database and quickly reconnect to the new primary database.

Effective client failover in a Data Guard context has three components:
- Fast database failover
- Fast start of database services on the new primary database
- Fast notification of clients and reconnection to the new primary database

Role transitions managed by the Data Guard broker can automatically transition a standby database to the primary role, start database services appropriate for the primary role, notify application clients to disconnect from the failed primary (breaking them out of TCP time-out), and direct them to the new primary database, all without manual intervention. Data Guard role change events can also be used to automate cases where a global load balancer and DNS failover are used to redirect user connections to a new middle-tier.

Application Continuity is a new capability for Oracle Database 12c Release 1 and beyond that enables transactions that are in-flight when a database failover occurs to complete without needing a rollback of the transaction and resubmitting it at the new primary database. Application Continuity is included with Active Data Guard.

Global Data Services (GDS) is a new capability for Oracle Database 12c Release 1 and beyond that extends intelligent load balancing and client failover concepts to globally distributed environments in which there are two or more failover targets that can be used to maintain availability. The multi-standby Data Guard configuration described earlier would be an example of such an environment. GDS is included with Active Data Guard.

USING DATA GUARD TO REDUCE PLANNED DOWNTIME
Data Guard can be used to reduce downtime and risk for many kinds of planned maintenance. The general approach is to first implement changes on a standby database, test, and then switch over. The production applications run unaffected on the primary database while maintenance is being performed at the standby database. Downtime is limited to the time required to switch production processing to the upgraded standby database. Specific details of the process used depend upon the type of maintenance being performed.

PLATFORM/CLOUD MIGRATION, HARDWARE AND O.S. MAINTENANCE, DATA CENTER MOVES
Data Guard Redo Apply offers some flexibility for primary and standby databases to run on systems with different operating systems or hardware architectures. See My Oracle Support Note 413484.1 for details on mixed platform combinations supported in a Data Guard configuration. Redo Apply can be used to facilitate migration of On-Premise Production databases to Oracle’s Cloud, pe
