Even a transaction timeout does not kill or time out any action performed by the resources that are enlisted in the transaction. The actions run for as long as they take, without interruption. A transaction timeout only sets a flag on the transaction that marks it as rollback only, so that any subsequent request to commit this transaction fails with a TimedOutException or RollbackException. However, as mentioned above, long-running JDBC calls can block WebLogic Server execute threads, which can finally lead to a hanging instance if all threads are blocked and no execute thread remains available for handling incoming requests.
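The behavior described above can be pictured with a small simulation. This is only a sketch under stated assumptions: RollbackOnlySketch, startTimeout() and commit() are hypothetical names, not the JTA or WebLogic Server API, and an IllegalStateException stands in for the RollbackException. It shows that the timeout sets a flag while the simulated long-running work still runs to completion, and that the later commit is what fails.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a transaction; not the JTA or WebLogic API. */
class RollbackOnlySketch {
    private final AtomicBoolean rollbackOnly = new AtomicBoolean(false);

    /** The timeout only flips a flag; it does not touch the working thread. */
    void startTimeout(ScheduledExecutorService timer, long millis) {
        timer.schedule(() -> rollbackOnly.set(true), millis, TimeUnit.MILLISECONDS);
    }

    /** Commit fails once the flag is set (stand-in for RollbackException). */
    void commit() {
        if (rollbackOnly.get()) {
            throw new IllegalStateException("transaction marked rollback only");
        }
    }

    /** Returns {workFinished, commitFailed} for work that outlives the timeout. */
    static boolean[] demo() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            RollbackOnlySketch tx = new RollbackOnlySketch();
            tx.startTimeout(timer, 50);
            boolean workFinished = true;
            try {
                Thread.sleep(200); // simulated long JDBC call, longer than the timeout
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                workFinished = false; // would mean something interrupted the work
            }
            boolean commitFailed = false;
            try {
                tx.commit();
            } catch (IllegalStateException e) {
                commitFailed = true;
            }
            return new boolean[] { workFinished, commitFailed };
        } finally {
            timer.shutdownNow();
        }
    }
}
```

In this simulation the work always finishes and only the commit is rejected, which is why a transaction timeout alone does not free a blocked execute thread.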

More recent WebLogic Server versions have a health check functionality that regularly checks whether a thread has not reacted for a certain period of time (the default is 600 seconds). If this happens, a message similar to the <000337> example shown further below is written to your log file.

The time interval for the health check functionality is configurable. Check the StuckThreadMaxTime property in the <Server> tag of your config.xml file, or the "Detecting stuck threads" section in the WebLogic Server administration console help.

The following are possible reasons why JDBC calls can lead to a hanging WebLogic Server instance:


  • Use of DriverManager.getConnection() in your JDBC code.
  • SQL queries issued to the database take an unexpectedly long time to return.
  • The database for which the JDBC connection pool is configured hangs and does not return from calls in a timely manner.
  • A slow or overloaded network causes database calls to slow down or hang.
  • A deadlock causes all execute threads to hang and wait forever.
  • The RefreshMinutes or TestFrequencySeconds property in the JDBC connection pool causes hang periods in WebLogic Server.
  • JDBC connection pool shrinking and re-creation of database connections cause long response times.

There are some best practices, both in the development of JDBC code and in the configuration of JDBC connection pools, that help to avoid common problems and optimize resource usage so that hanging server instances do not occur.

Problem Troubleshooting

Different programming techniques or JDBC connection pool configurations can lead to deadlocks or long-running JDBC calls that cause WebLogic Server instances to hang. This pattern addresses JDBC calls causing a server hang and other well-known JDBC-related causes of hanging WebLogic Server instances.

Troubleshooting Steps

Why does the problem occur?

Synchronized DriverManager.getConnection()

Older JDBC application code sometimes uses DriverManager.getConnection() calls to retrieve a database connection using a certain driver. This technique is not recommended, as it can cause deadlocks or at least relatively low performance for your connection requests. The reason is that all DriverManager calls are class-synchronized, meaning that one DriverManager call in one thread blocks all other DriverManager calls in any other thread inside one WebLogic Server instance. In addition, the constructor for a SQLException makes a DriverManager call, and most drivers have DriverManager.println() calls for logging, so any of these can block all other threads that issue a DriverManager call.

DriverManager.getConnection() can take a relatively long time until it returns with the physical connection created to the database. Even if no deadlock occurs, all other calls need to wait until that one thread gets its connection. This is not a best practice in a multi-threaded system like WebLogic Server.

Oracle documentation clearly states that DriverManager.getConnection() should not be used. If you prefer to use JDBC connections in your JDBC code, you should use a WebLogic Server JDBC connection pool, define a DataSource for it, and get the connection from the DataSource. This gives you all the advantages of a pool (resource sharing, connection reuse, connection refresh if a database was down, and so on). It also helps you avoid the deadlocks that may happen with DriverManager calls.
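To make the effect of class-level synchronization concrete, here is a minimal simulation. It does not call java.sql.DriverManager at all; ClassLockSketch and slowConnect() are hypothetical stand-ins for an API whose methods share one class-wide lock, and the 50 ms sleep is an assumed cost for opening a physical connection.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Simulation only: a stand-in for a class-synchronized API, not java.sql.DriverManager. */
class ClassLockSketch {
    private static final AtomicInteger inside = new AtomicInteger();
    private static final AtomicInteger maxInside = new AtomicInteger();

    /** static synchronized means one lock for the whole class, shared by all callers. */
    static synchronized void slowConnect() {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(50); // pretend to open a physical connection
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            inside.decrementAndGet();
        }
    }

    /** Starts several callers at once and returns the total elapsed milliseconds. */
    static long runCallers(int callers) {
        Thread[] threads = new Thread[callers];
        long start = System.nanoTime();
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(ClassLockSketch::slowConnect);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Highest number of threads ever observed inside slowConnect() at once. */
    static int maxConcurrentCallers() {
        return maxInside.get();
    }
}
```

Although the callers start together, only one of them is ever inside slowConnect(), so the total time grows with the number of callers. A pooled DataSource avoids this because most requests are served from connections that are already open.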


Long-Running SQL Queries

Long-running SQL queries block execute threads until they return their result to the calling application. This means that a WebLogic Server instance must be configured to handle as many simultaneous calls as the application load requests. The limiting factors are the number of execute threads and the number of connections in the JDBC connection pools. A general rule of thumb is to set the number of connections in the pool equal to the number of execute threads to enable optimal resource utilization. If JTS is used, some more connections should be available in the pools, because connections may be reserved for transactions that are not actually active. A thread hanging in a long-running SQL call shows a stack in a thread dump that is very similar to the one for a hanging database. See the next section for details.
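The way slow queries starve other requests can be reproduced with a plain thread pool. The sketch below is a simulation under stated assumptions: a fixed java.util.concurrent thread pool stands in for the execute queue, a latch stands in for the database that has not answered yet, and ExecuteThreadSketch is a hypothetical name. It does not use any WebLogic Server API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Simulation only: a fixed thread pool stands in for the execute queue. */
class ExecuteThreadSketch {

    /**
     * Fills every pool thread with a "slow query", then submits one quick request.
     * Returns {quickRequestStarvedWhileBlocked, quickRequestServedAfterRelease}.
     */
    static boolean[] demo(int executeThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(executeThreads);
        CountDownLatch databaseAnswers = new CountDownLatch(1);
        try {
            for (int i = 0; i < executeThreads; i++) {
                pool.submit(() -> {
                    try {
                        databaseAnswers.await(); // thread is held until the "database" returns
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            Future<String> quick = pool.submit(() -> "served");

            boolean starved;
            try {
                quick.get(200, TimeUnit.MILLISECONDS);
                starved = false;
            } catch (TimeoutException e) {
                starved = true; // no free thread: the request just sits in the queue
            }

            databaseAnswers.countDown(); // slow queries return, threads are released
            boolean served = "served".equals(quick.get(5, TimeUnit.SECONDS));
            return new boolean[] { starved, served };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The quick request needs almost no work, yet it cannot start until one of the blocked threads is released. From the outside this looks like a hanging server.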

Hanging Database

Good database performance is key for the performance of an application that relies on this database. Consequently, a hanging database can block many or all available execute threads in a WebLogic Server instance and finally lead to a hanging server. To diagnose this, take 5 to 10 thread dumps from your hanging WebLogic Server instance and check your execute threads (in the default queue or your application thread queue) to see if they are currently in SQL calls and waiting for a result from the database. A typical stack trace for a thread that is currently issuing a SQL query looks similar to the "ExecuteThread: '4'" example near the end of this document, where the thread is reading from the socket inside the Oracle driver. The thread will be in running state.

Compare the threads in your different thread dumps to see whether they receive the return from the SQL call in a timely manner or whether they hang in the same call for a longer period of time. If the thread dumps seem to imply long response times from SQL calls, check the corresponding database logs to see if problems in the database cause this slow performance or hang situation. You can use tools like Samurai or Thread Dump Analyzer to review multiple thread dumps.
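The comparison of successive thread dumps can also be sketched in code. The example below is an illustration only, assuming it runs inside the JVM to be inspected: it uses the standard java.lang.management.ThreadMXBean API rather than external thread dumps, and StuckFrameSketch with its methods is a hypothetical helper. It reports threads whose top stack frame is identical in two snapshots.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: find threads whose top stack frame is identical in two in-JVM snapshots. */
class StuckFrameSketch {

    /** Maps thread id to "name @ topFrame" for every live thread with a stack. */
    static Map<Long, String> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, String> frames = new HashMap<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            if (stack.length > 0) {
                frames.put(info.getThreadId(), info.getThreadName() + " @ " + stack[0]);
            }
        }
        return frames;
    }

    /** Returns entries that are present and unchanged in both snapshots. */
    static Map<Long, String> unchanged(Map<Long, String> first, Map<Long, String> second) {
        Map<Long, String> same = new TreeMap<>();
        for (Map.Entry<Long, String> entry : first.entrySet()) {
            if (entry.getValue().equals(second.get(entry.getKey()))) {
                same.put(entry.getKey(), entry.getValue());
            }
        }
        return same;
    }

    /** Takes two snapshots separated by a delay and returns the threads that did not move. */
    static Map<Long, String> compare(long delayMillis) {
        Map<Long, String> first = snapshot();
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return unchanged(first, snapshot());
    }
}
```

Idle threads that are simply waiting for work also show up as unchanged, so the frames still have to be read: an execute thread that stays in a driver socket read across several snapshots is the pattern to look for.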

Slow Network

Communication between WebLogic Server and the database relies on a well-performing and reliable network in order to serve requests in a timely manner. Slow network performance can therefore lead to hanging or blocked execute threads waiting for the results of SQL queries. The related stack traces look similar to the example referred to in the Hanging Database section. It is not possible to find the root cause of hanging or slow SQL queries solely by analyzing the WebLogic Server thread dumps. They give the first hint that something is wrong with the performance of the SQL calls. The next step is to check whether a database or network problem causes the poorly performing SQL calls.

Deadlock

Both an application-level deadlock and a deadlock at the database level can lead to hanging threads. Check your thread dumps to see if there is an application-level deadlock. A database deadlock can be detected either in the database log or by the SQL exception that can be found in the WebLogic Server log file. An example of a related SQL exception is "ORA-00060: deadlock detected while waiting for resource"; its full stack trace appears at the end of this document. As it generally can take some time until a database detects a deadlock and resolves it by rolling back one or more of the transactions that cause it, one or more execute threads will be blocked until the rollback has finished.

RefreshMinutes or TestFrequencySeconds

If you see recurring periods of low database performance, slow SQL calls, or connection peaks, the setting of the RefreshMinutes or TestFrequencySeconds configuration property in your JDBC connection pools could be the reason. Unless you have a firewall between your WebLogic Server instance and your database, you should disable this functionality.

Pool Shrinking

Physical connections to a database are resources that should be opened once and kept open as long as possible, as a new connection request is a considerable resource overhead for the database, the operating system kernel, and WebLogic Server. Consequently, pool shrinking should be disabled on production systems in order to keep this overhead at a minimum. If pool shrinking is enabled, idle pool connections are closed and have to be reopened when connection requests to the pool cannot otherwise be satisfied. As these activities can take some time, the related application requests may take an unexpectedly long time, which can lead users to assume that the system hangs.


Analysis of a hanging WebLogic Server instance

In most cases it is helpful to start by taking thread dumps from the hanging system in order to find out what is going on, for example what the different threads are doing and why they hang. Generally, thread dumps can be taken on production systems; however, caution is necessary for very old versions of the JVM (<1.3.1_09), as they may crash during thread dumps. Also, if the WebLogic Server instance has a huge number of threads, the thread dump will take a while to complete, and the other threads are blocked in the meantime.

Take more than one thread dump (5 to 10) with a delay of some seconds in between. This lets you check the progress of the different threads. It also shows whether the system actually hangs (no progress at all) or whether the throughput is extremely slow, which can look like a hanging system. See also http://download.oracle.com/docs/cd/E13222_01/wls/docs81/cluster/trouble.html.

If, for example, all your threads are in a DriverManager method like getConnection(), you have identified the root cause and need to change your application to use a DataSource or Driver.connect() instead of DriverManager.getConnection(). A very useful tool, Samurai, can be used to analyze thread dumps and to monitor the progress of threads between different thread dumps.
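Thread dumps are also where an application-level deadlock becomes visible. As a complement to reading the dumps by hand, the following sketch asks the JVM directly through the standard java.lang.management.ThreadMXBean API. DeadlockSketch is a hypothetical helper name, the code has to run inside the JVM being inspected, and it only sees Java-level deadlocks; a deadlock inside the database is not visible this way.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Sketch: report Java-level deadlocks in the current JVM. */
class DeadlockSketch {

    /** Returns one line per deadlocked thread, or an empty list if none are found. */
    static List<String> findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {
                report.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }
}
```

An empty result does not rule out a hang. Threads that wait on the database or the network are not deadlocked from the JVM's point of view.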

JDBC Programming

In order to optimize resource usage in WebLogic Server and conserve database resources, you should use JDBC connection pools for your application's JDBC calls. Connections created and destroyed in your application code generate an unnecessary overhead which should be avoided. For generic documentation on JDBC programming, see: http://download.oracle.com/docs/cd/E13222_01/wls/docs81/jdbc/rmidriver.html#1028977. Also details on JDBC performance tuning are at: http://download.oracle.com/docs/cd/E13222_01/wls/docs81/jdbc/performance.html#1027791.
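A minimal sketch of the recommended pattern follows: the application looks up a DataSource and borrows a pooled connection for the duration of one unit of work. The JNDI name jdbc/myDataSource, the class PooledConnectionSketch and the ConnectionWork interface are assumptions made for illustration; use the JNDI name configured for your own DataSource. Only standard javax.sql, javax.naming and java.sql types are used.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

/** Sketch: borrow a pooled connection from a DataSource instead of DriverManager. */
class PooledConnectionSketch {

    /** Hypothetical JNDI name; use the name configured for your DataSource. */
    static final String JNDI_NAME = "jdbc/myDataSource";

    /** Unit of work that needs a connection. */
    interface ConnectionWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    /** Looks up the DataSource; only works where JNDI_NAME is actually bound. */
    static DataSource lookup() throws NamingException {
        return (DataSource) new InitialContext().lookup(JNDI_NAME);
    }

    /** Runs the work on a pooled connection and always hands the connection back. */
    static <T> T withConnection(DataSource dataSource, ConnectionWork<T> work) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            return work.apply(connection); // close() returns the connection to the pool
        }
    }
}
```

Inside the server, a call could look like withConnection(lookup(), c -> ...). The try-with-resources block matters as much as the lookup: a connection that is never closed is never returned to the pool, and the pool eventually runs dry.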

A JDBC connection which is used by an application or by WebLogic Server itself will block one WebLogic Server execute thread for the complete duration of the calls that are made via this connection. The JVM will ensure that the CPU is given to runnable threads by its thread scheduling mechanism, while the thread that blocks on a SQL query needs to wait. However, the thread occupied by the JDBC call will be reserved and used for the application until the call returns from the SQL query.

####<Nov 6, 2009 1:42:30 PM EST> <Warning> <WebLogicServer> <mydomain> <myserver> <CoreHealthMonitor> <kernel identity> <> <000337> <ExecuteThread: '64' for queue: 'default' has been busy for "740" seconds working on the request "Scheduled Trigger", which is more than the configured time (StuckThreadMaxTime) of "600" seconds.>


This does not interrupt the thread, as this is just a notification for the administrator. The only way a stuck thread becomes unstuck again is when the request it is handling finishes. In this case, you will find a message similar to the following in your WebLogic Server's log file:

####<Nov 7, 2009 4:17:34 PM EST> <Info> <WebLogicServer> <mydomain> <myserver> <ExecuteThread: '66' for queue: 'default'> <kernel identity> <> <000339> <ExecuteThread: '66' for queue: 'default' has become "unstuck".>

Example of a thread waiting for the DriverManager class lock (see Synchronized DriverManager.getConnection()):

"ExecuteThread-39" daemon prio=5 tid=0x401660 nid=0x33 waiting for monitor entry [0xd247f000..0xd247fc68]
    at java.sql.DriverManager.getConnection(DriverManager.java:188)
    at com.bla.updateDataInDatabase(MyClass.java:296)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:865)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:120)
    at weblogic.servlet.internal.ServletContextImpl.invokeServlet(ServletContextImpl.java:945)
    at weblogic.servlet.internal.ServletContextImpl.invokeServlet(ServletContextImpl.java:909)
    at weblogic.servlet.internal.ServletContextManager.invokeServlet(ServletContextManager.java:269)
    at weblogic.socket.MuxableSocketHTTP.invokeServlet(MuxableSocketHTTP.java:392)
    at weblogic.socket.MuxableSocketHTTP.execute(MuxableSocketHTTP.java:274)
    at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:130)

Example of a thread waiting for the result of a SQL query (see Hanging Database):

"ExecuteThread: '4' for queue: 'weblogic.kernel.Default'" daemon prio=5 tid=0x8e93c8 nid=0x19 runnable [e137f000..e13819bc]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at oracle.net.ns.Packet.receive(Unknown Source)
    at oracle.net.ns.DataPacket.receive(Unknown Source)
    at oracle.net.ns.NetInputStream.getNextPacket(Unknown Source)
    at oracle.net.ns.NetInputStream.read(Unknown Source)
    at oracle.net.ns.NetInputStream.read(Unknown Source)
    at oracle.net.ns.NetInputStream.read(Unknown Source)
    at oracle.jdbc.ttc7.MAREngine.unmarshalUB1(MAREngine.java:931)
    at oracle.jdbc.ttc7.MAREngine.unmarshalSB1(MAREngine.java:893)
    at oracle.jdbc.ttc7.Oall7.receive(Oall7.java:375)
    at oracle.jdbc.ttc7.TTC7Protocol.doOall7(TTC7Protocol.java:1983)
    at oracle.jdbc.ttc7.TTC7Protocol.fetch(TTC7Protocol.java:1250)
    - locked <e8c68f00> (a oracle.jdbc.ttc7.TTC7Protocol)
    at oracle.jdbc.driver.OracleStatement.doExecuteQuery(OracleStatement.java:2529)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:2857)
    at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:608)
    - locked <e5cc44d0> (a oracle.jdbc.driver.OraclePreparedStatement)
    - locked <e8c544c8> (a oracle.jdbc.driver.OracleConnection)
    at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:536)
    - locked <e5cc44d0> (a oracle.jdbc.driver.OraclePreparedStatement)
    - locked <e8c544c8> (a oracle.jdbc.driver.OracleConnection)
    at weblogic.jdbc.wrapper.PreparedStatement.executeQuery(PreparedStatement.java:80)
    at myPackage.query.getAnalysis(MyClass.java:94)
    at jsp_servlet._jsp._jspService(__jspService.java:242)
    at weblogic.servlet.jsp.JspBase.service(JspBase.java:33)
    at weblogic.servlet.internal.ServletStubImpl$ServletInvocationAction.run(ServletStubImpl.java:971)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:402)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:305)
    at weblogic.servlet.internal.RequestDispatcherImpl.include(RequestDispatcherImpl.java:607)
    at weblogic.servlet.internal.RequestDispatcherImpl.include(RequestDispatcherImpl.java:400)
    at weblogic.servlet.jsp.PageContextImpl.include(PageContextImpl.java:154)
    at jsp_servlet._jsp.__mf1924jq._jspService(__mf1924jq.java:563)
    at weblogic.servlet.jsp.JspBase.service(JspBase.java:33)
    at weblogic.servlet.internal.ServletStubImpl$ServletInvocationAction.run(ServletStubImpl.java:971)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:402)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:305)
    at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:6350)
    at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:317)
    at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:118)
    at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:3635)
    at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2585)
    at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:197)
    at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:170)

Example of a SQLException raised for a database deadlock (see Deadlock):

java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
    at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:170)
    at oracle.jdbc.oci8.OCIDBAccess.check_error(OCIDBAccess.java:1614)
    at oracle.jdbc.oci8.OCIDBAccess.executeFetch(OCIDBAccess.java:1225)
    at oracle.jdbc.oci8.OCIDBAccess.parseExecuteFetch(OCIDBAccess.java:1338)
    at oracle.jdbc.driver.OracleStatement.executeNonQuery(OracleStatement.java:1722)
    at oracle.jdbc.driver.OracleStatement.doExecuteOther(OracleStatement.java:1647)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:2167)
    at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:404)
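Application code that wants to log or count database deadlocks like the ORA-00060 exception above can test for them explicitly. The following is a sketch under an assumption that should be verified against your driver: that the Oracle driver reports the ORA number (60) as the vendor error code returned by SQLException.getErrorCode(). The message text is checked as a fallback, and OracleDeadlockSketch is a hypothetical helper name.

```java
import java.sql.SQLException;

/** Sketch: recognize an Oracle deadlock error anywhere in a SQLException chain. */
class OracleDeadlockSketch {

    /** ORA-00060; assumes the driver reports the ORA number as the vendor error code. */
    static final int ORA_DEADLOCK = 60;

    /** Walks the chain linked by getNextException() and checks each entry. */
    static boolean isDeadlock(SQLException error) {
        for (SQLException e = error; e != null; e = e.getNextException()) {
            String message = e.getMessage();
            if (e.getErrorCode() == ORA_DEADLOCK
                    || (message != null && message.contains("ORA-00060"))) {
                return true;
            }
        }
        return false;
    }
}
```

The check only identifies the error after the database has already resolved the deadlock by rolling back a statement or transaction. It does not shorten the time the execute thread was blocked.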
