Oracle Rollback pending transaction

Bei 2PC (two phase commit) transaktionen kann es zu penndings kommen.
In der Alert Log sind folgende Einträge zu finden, oder auch ein ORA-03113 Fehler.

Wed Jun 14 08:17:54 2017
DISTRIB TRAN 00000001.31346132316432372D353063392D313165372D393139622D396562313431323433393531
  is local tran 1.10.248284 (hex=01.0a.3c9dc))
  delete pending committed tran, scn= (hex=0.00000000)

Apfragen der Transacktioen kann man in der View DBA_2PC_PENDING.

SELECT LOCAL_TRAN_ID, GLOBAL_TRAN_ID, STATE, FAIL_TIME, RETRY_TIME FROM DBA_2PC_PENDING;

LOCAL_TRAN_ID          GLOBAL_TRAN_ID                                                                   STATE            FAIL_TIM RETRY_TI
---------------------- -------------------------------------------------------------------------------- ---------------- -------- --------
7.6.262199             1.61633666323865302D346637342D313165372D626432662D633636613666343361396335       prepared         12.06.17 14.06.17

Die Transaktion befindet sich im Status prepared.

ROLLBACK FORCE '7.6.262199' ;

-- alternative mit dem Package DBMS_TRANSACTION
execute SYS.DBMS_TRANSACTION.ROLLBACK_FORCE ( '7.6.262199' );

commit;
execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('7.6.262199') ;
commit;

Um sicher zustellen das keine Remote Transaktion mehr vorhanden ist, sollte auch noch die View DBA_2PC_NEIGHBORS kontrolliert werden.

SELECT DATABASE,LOCAL_TRAN_ID,DBID,SESS#,BRANCH FROM DBA_2PC_NEIGHBORS;

[spoiler title=’Two Phase Commit 2PC Tips – by Donald Burleson Consulting’ style=’default’ collapse_link=’true’]
Two Phase Commit 2PC Tips – by Donald Burleson Consulting

Two Phase Commit 2PC Tips

Oracle Database Tips by Donald BurlesonConsulting

June 2, 2015 – Updated January 21, 2016

Question:  What is an Oracle two phase commit?  I have a distributed update and I want to understand how Oracle prevents partial updates with the two phase commit mechanism.

Answer:  Oracle developed the two phase commit to allow t]for the controlling SQL to ensure that all remote sites have committed their data before issuing a local commit.

While data integrity is managed very effectively within a single database with row locking, deadlock detection, and roll-back features, distributed data integrity is far more complex. Recovery in a distributed database environment involves ensuring that the entire transaction has completed successfully before issuing a COMMIT to each of the subcomponents in the overall transaction. This can often be a cumbersome chore, and it is the idea behind the the two-phase commit.

One popular alternative to the two-phase commit is replicating information and relying on asynchronous replication techniques to enforce the data integrity. Asynchronous replication refers to Oracle snapshots and requires a master-slave type of configuration, whereby a master database relays updates to the slave database on a periodic basis (using Oracle snapshots to create master-slave replication). The snapshot approach makes sense when an overall system does not require instant referential integrity.

Managing Two-Phase Commits (2PCs) with SQL*Net

When a distributed update (or delete) has finished processing, SQL*Net will coordinate COMMIT processing, which means that the entire transaction will roll back if any portion of the transaction fails.

The first phase of this process is a prepare phase to each node, followed by the COMMIT, and then terminated by a forget phase.

If a distributed update is in the process of issuing the 2PC and a network connection breaks, Oracle will place an entry in the dba_2pc_pending table. The recovery background process (RECO) will then roll back or commit the good node to match the state of the disconnected node to ensure consistency. You can activate RECO via the ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY command.

The dba_2pc_pending table contains an ADVISE column that directs the database to either commit or roll back the pending item. You can use the ALTER SESSION ADVISE syntax to direct the 2PC mechanism. For example, to force the completion of an INSERT, you could enter the following:

ALTER SESSION ADVISE COMMIT;
INSERT INTO PAYROLL@LONDON . . . ;

When a 2PC transaction fails, you can query the dba_2pc_pending table to check the STATE column. Oracle notes:

Before RECO recovers an in-doubt transaction, the transaction appears in DBA_2PC_PENDING STATE as COLLECTING, COMMITTED, or PREPARED.

If you force an in-doubt transaction using COMMIT FORCE or ROLLBACK FORCE, then the states FORCED COMMIT or FORCED ROLLBACK may appear.
Automatic recovery normally deletes entries in these states.

The only exception is when recovery discovers a forced transaction that is in a state inconsistent with other sites in the transaction. In this case, the entry can be left in the table and the MIXED column in DBA_2PC_PENDING has a value of YES.

These entries can be cleaned up with the DBMS_TRANSACTION.PURGE_MIXED procedure.

See these notes on using dbms_transaction.

If you do this, the row will disappear from dba_2pc_pending after the transaction has been resolved. If you force the transaction the wrong way (for example, roll back when other nodes committed), RECO will detect the problem, set the MIXED column to yes, and the row will remain in the dba_2pc_pending table.

Internally, Oracle examines the init.ora parameters to determine the rank that the commit processing will take. The commit_point_strength init.ora parameter determines which of the distributed databases is to be the commit point site. In a distributed update, the database with the largest value of commit_point_strength will be the commit point site.

Viewing pending two phase commit transactions

The commit point site is the database that must successfully complete before the transaction is updated at the other databases. Conversely, if a transaction fails at the commit point site, the entire transaction will be rolled back at all of the other databases. In general, the commit point site should be the database that contains the most critical data. Below is a script that will identify a two-phase commit transaction that has failed to complete.

Two Phase commit  pending.sql reports on any pending distributed transactions.

set pagesize 999;
set feedback off;
set wrap on;

column local_tran_id  format a22 heading 'Local Txn Id'
column global_tran_id format a50 heading 'Global Txn Id'
column state          format a16 heading 'State'
column mixed          format a5  heading 'Mixed'
column advice         format a5  heading 'Advice'

select
local_tran_id,
global_tran_id,
state,mixed,advice
from
dba_2pc_pending
order
by local_tran_id;

Managing two phase commits

When a distributed update (or delete) has finished processing, SQL*Net will coordinate commit processing so that if any portion of the transaction fails, the entire transaction will roll back. The first phase of this process is a “prepare phase” to each node, followed by the commit, and then terminated by a “forget phase.”

If a distributed update is in the process of issuing the 2PC and a network connection breaks, Oracle will place an entry in the dba_2pc_pending table, and the recovery background process, reco, will roll back or commit the good node to match the state of the disconnected node, thereby ensuring consistency.

You can activate reco via the alter system enable distributed recovery command. The reco process is applicable only when the distributed option is installed in Oracle, and is only used to manage 2PCs. If your system does not perform cross-database synchronization, you may want to disable reco.

The dba_2pc_pending table contains an “advise” column that directs the database to either commit or rollback the pending item. You can use the alter session advise syntax to direct the 2PC mechanism. For example, to force the completion of an insert, you can enter the following:

ALTER SESSION ADVISE COMMIT;INSERT INTO PAYROLL@LONDON . . . ;

When a 2PC transaction fails, you can query the dba_2pc_pending table to check the “state” column. You can enter SQL*DBA and use the “recover in-doubt transaction” dialog box to force either a rollback or a commit of the pending transaction. If you do this, the row will disappear from dba_2pc_pending after the transaction has been resolved.

If you force the transaction the wrong way (for example, to roll back when other nodes have already committed), reco will detect the problem, set the mixed column to “yes,” and the row will remain in dba_2pc_pending table.

Two Phase commit errors

Here are common errors related to the two phase commit:

ORA-24756:Transaction does not exist
Cause: An invalid transaction identifier or context was used or the transaction has completed.
Action: Supply a valid identifier if the transaction has not completed and retry the call

ORA-02019: “connection description for remote database not found”
Cause: The user attempted to connect or log in to a remote database using a connection description that could not be found.
Action: Specify an existing database link.  Query the data dictionary to see all existing database links.  See your operating system-specific SQL*Net documentation for valid connection descriptors.

ORA-02058: “no prepared transaction found with ID %s”
Cause: A COMMIT FORCE was attempted on a transaction, but the transaction with LOCAL_TRAN_ID or GLOBAL_TRAN_ID was not found in the DBA_2PC_INDOUBT table in prepared state.
Action: Check the DBA_2PC_INDOUBT table to ensure the proper transaction ID is used and attempt the commit again.

ORA-02068: “following severe error from %s%s
Cause: A severe error (disconnect, fatal Oracle error) received from the indicated database link.  See following error text.
Action: Contact the remote system administrator.

ORA-02050: “transaction %s rolled back, some remote DBs may be in-doubt”
Cause: Network or remote failure in 2PC.
Action: Notify operations; remote DBs will automatically re-sync when the failure is repaired.

ORA-02053:  Transaction string committed, some remote DBs may be in-question
Cause: Network or remote failure in 2PC.  (Two Phase commit)
Action:
Notify operations; remote DBs will automatically re-sync when the failure is repaired.

The ORA-02050 is for distributed updates, over a database link, and it’s often the result of losing connectivity during an update.  The root cause of this ORA-02050 error is a network failure, not Oracle.

First, you need to clean any in-flight transaction that may have failed by querying dba_2pc_pending:

select * from dba_2pc_pending;

Since the root cause is a network failure, get your Network Administrator to monitor the network with netstat, looking for lost packets or other network errors.
[/spoiler]