閱讀338 返回首頁    go 阿裏雲 go 技術社區[雲棲]


JDK的sql設計不合理導致的驅動類初始化死鎖問題

問題描述
當我們一個係統既需要mysql驅動,也需要oracle驅動的時候,在並發加載初始化這些驅動類的過程中產生死鎖的可能性非常大,下麵是一個模擬的例子,對於Thread2的實現其實是jdk裏java.sql.DriverService的邏輯,也是我們第一次調用java.sql.DriverManager.registerDriver注冊一個驅動實例要走的邏輯(jdk1.6下),不過這篇文章是使用我們生產環境的一個係統的線程dump和內存dump為基礎進行分析展開的。

01 import java.util.Iterator;
02  
03 import sun.misc.Service;
04  
05 public class Main {
06     public static void main(String[] args) throws ClassNotFoundException {
07         Thread1 thread1 = new Thread1();
08         Thread2 thread2 = new Thread2();
09         thread1.start();
10         thread2.start();
11     }
12 }
13  
14 class Thread1 extends Thread {
15     public void run() {
16         try {
17             Class clazz = Class.forName("com.mysql.jdbc.Driver"true, Thread.currentThread()
18                 .getContextClassLoader());
19             System.out.println(clazz);
20         catch (ClassNotFoundException e) {
21         }
22     }
23 }
24  
25 class Thread2 extends Thread {
26     public void run() {
27         Iterator ps = Service.providers(java.sql.Driver.class);
28         try {
29             while (ps.hasNext()) {
30                 System.out.println(ps.next());
31             }
32         catch (Throwable t) {
33  
34         }
35     }
36 }

 

如果以上代碼運行過程中發現有線程一直卡死在Class.forName的調用裏,那麼說明問題已經重現了。

先上兩張圖
thread_2
heap_thread_1

下麵是問題進程線程dump的分析”thread dump”,還有這個進程的內存dump分析”heap dump”,這個後麵展開分析的基礎

存疑點
仔細看看上麵的線程dump分析和內存dump分析裏的線程分析模塊,您可能會有如下兩個疑惑:

【為什麼線程”Thread-0″一直卡在Class.forName的位置】:這有點出乎意料,做一個類加載要麼找不到拋出ClassNotFoundException,要麼找到直接返回,為什麼會一直卡在這個位置呢?

【明明”Thread-0″注冊的是mysql驅動為什麼會去加載Odbc的驅動類】:通過”Thread-0″在棧上看倒數第二幀展開看到傳入Class.forName的參數是com.mysql.jdbc.Driver,然後展開棧上順序第二幀,看到傳入的參數是sun.jdbc.odbc.JdbcOdbcDriver,這意味著在對mysql驅動類做加載初始化的過程中又觸發了JdbcOdbc驅動類的加載

疑惑點解釋
疑惑二:
第一個疑惑我們先留著,先解釋下第二個疑惑,大家可以對照堆棧通過反編譯rt.jar還有ojdbc6-11.2.0.3.0.jar看具體的代碼

驅動類加載過程簡要介紹:

當要注冊某個sql驅動的時候是通過調用java.sql.DriverManager.registerDriver來實現的(注意這個方法加了synchronized關鍵字,後麵解釋第一個疑惑的時候是關鍵),而這個方法在第一次執行過程中,會在當前線程classloader的classpath下尋找所有/META-INF/services/java.sql.Driver文件,這個文件在mysql和oracle驅動jar裏都有,裏麵寫的是對應的驅動實現類名,這種機製是jdk提供的spi實現,找到這些文件之後,依次使用Class.forName(driverClassName, true, this.loader)來對這些驅動類進行加載,其中第二個參數是true,意味著不僅僅做一次loadClass的動作,還會初始化該類,即調用包含靜態塊的< clinit >方法,執行完之後才會返回,這樣就解釋了第二個疑惑,在mysql驅動注冊過程中還會對odbc驅動類進行加載並初始化

感想:

其實我覺得這種設計有點傻,為什麼要幹和自己不相關的事情呢,畫蛇添足的設計,首先類初始化的開銷是否放到一起做並沒有多大區別,其次正由於這種設計導致了今天這個死鎖的發生

疑惑一:
現在來說第一個疑惑,為什麼會一直卡在Class.forName呢,到底卡在哪裏,於是再通過jstack -m 命令將jvm裏的堆棧也打印出來,如下所示

01 ----------------- 5738 -----------------
02 0x003f67a2      _dl_sysinfo_int80 + 0x2
03 0xb79a71ae      _ZN2os13PlatformEvent4parkEv + 0xee
04 0xb7997acb      _ZN13ObjectMonitor4waitExbP6Thread + 0x5fb
05 0xb7a73c53      _ZN18ObjectSynchronizer19waitUninterruptiblyE6HandlexP6Thread + 0x53
06 0xb777eb34      _ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread + 0x74
07 0xb777e288      _ZN13instanceKlass10initializeEP6Thread + 0x58
08 0xb7821ad9      _Z28find_class_from_class_loaderP7JNIEnv_12symbolHandleh6HandleS2_hP6Thread + 0xb9
09 0xb7807d99      JVM_FindClassFromClassLoader + 0x269
10 0xb734c236      Java_java_lang_Class_forName0 + 0x116
11 0xb433064a      * java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader) bci:0(Interpreted frame)
12 0xb4328fa7      * java.lang.Class.forName(java.lang.String, boolean, java.lang.ClassLoader) bci:32line:247 (Interpreted frame)
13 0xb4328fa7      * sun.misc.Service$LazyIterator.next() bci:31 line:271 (Interpreted frame)
14 0xb4329483      * java.sql.DriverService.run() bci:26 line:664 (Interpreted frame)
15 0xb43263e6
16 0xb77a4e31      _ZN9JavaCalls11call_helperEP9JavaValueP12methodHandleP17JavaCallArgumentsP6Thread + 0x1c1
17 0xb79a6418      _ZN2os20os_exception_wrapperEPFvP9JavaValueP12methodHandleP17JavaCallArgumentsP6ThreadES1_S3_S5_S7_ + 0x18
18 0xb77a4c5f      _ZN9JavaCalls4callEP9JavaValue12methodHandleP17JavaCallArgumentsP6Thread + 0x2f
19 0xb780aace      JVM_DoPrivileged + 0x40e
20 0xb734b95d      Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedAction_2 + 0x3d
21 0xb433064a      * java.security.AccessController.doPrivileged(java.security.PrivilegedAction) bci:0(Interpreted frame)
22 0xb4328fa7      * java.sql.DriverManager.loadInitialDrivers() bci:31 line:506 (Interpreted frame)
23 0xb432910d      * java.sql.DriverManager.initialize() bci:11 line:612 (Interpreted frame)
24 0xb432910d      * java.sql.DriverManager.registerDriver(java.sql.Driver) bci:6 line:281 (Interpreted frame)
25 0xb432910d      * com.mysql.jdbc.Driver.() bci:7 line:65 (Interpreted frame)
26 0xb43263e6
27 0xb77a4e31      _ZN9JavaCalls11call_helperEP9JavaValueP12methodHandleP17JavaCallArgumentsP6Thread + 0x1c1
28 0xb79a6418      _ZN2os20os_exception_wrapperEPFvP9JavaValueP12methodHandleP17JavaCallArgumentsP6ThreadES1_S3_S5_S7_ + 0x18
29 0xb77a4c5f      _ZN9JavaCalls4callEP9JavaValue12methodHandleP17JavaCallArgumentsP6Thread + 0x2f
30 0xb77800c1      _ZN13instanceKlass27call_class_initializer_implE19instanceKlassHandleP6Thread + 0xa1
31 0xb777ed8e      _ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread + 0x2ce
32 0xb777e288      _ZN13instanceKlass10initializeEP6Thread + 0x58
33 0xb7821ad9      _Z28find_class_from_class_loaderP7JNIEnv_12symbolHandleh6HandleS2_hP6Thread + 0xb9
34 0xb7807d99      JVM_FindClassFromClassLoader + 0x269
35 0xb734c236      Java_java_lang_Class_forName0 + 0x116
36 0xb433064a      * java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader) bci:0(Interpreted frame)
37 0xb4328fa7      * java.lang.Class.forName(java.lang.String, boolean, java.lang.ClassLoader) bci:32line:247 (Interpreted frame)
38 0xb4328fa7      * Thread1.run() bci:9 line:17 (Interpreted frame)

我們看到其實正在做類的初始化動作,並且線程正在調用ObjectSynchronizer::waitUninterruptibly一直沒返回,在看這方法的調用者instanceKlass1::initialize_impl,我們找到源碼位置如下:

01 void instanceKlass::initialize_impl(instanceKlassHandle this_oop, TRAPS) {
02   // Make sure klass is linked (verified) before initialization
03   // A class could already be verified, since it has been reflected upon.
04   this_oop->link_class(CHECK);
05  
06   DTRACE_CLASSINIT_PROBE(required, instanceKlass::cast(this_oop()), -1);
07  
08   bool wait = false;
09  
10   // refer to the JVM book page 47 for description of steps
11   // Step 1
12   { ObjectLocker ol(this_oop, THREAD);
13  
14     Thread *self = THREAD; // it's passed the current thread
15  
16     // Step 2
17     // If we were to use wait() instead of waitInterruptibly() then
18     // we might end up throwing IE from link/symbol resolution sites
19     // that aren't expected to throw.  This would wreak havoc.  See 6320309.
20     while(this_oop->is_being_initialized() && !this_oop->is_reentrant_initialization(self)) {
21         wait = true;
22       ol.waitUninterruptibly(CHECK);
23     }
24  
25     // Step 3
26     if (this_oop->is_being_initialized() && this_oop->is_reentrant_initialization(self)) {
27       DTRACE_CLASSINIT_PROBE_WAIT(recursive, instanceKlass::cast(this_oop()), -1,wait);
28       return;
29     }
30  
31     // Step 4
32     if (this_oop->is_initialized()) {
33       DTRACE_CLASSINIT_PROBE_WAIT(concurrent, instanceKlass::cast(this_oop()), -1,wait);
34       return;
35     }
36  
37     // Step 5
38     if (this_oop->is_in_error_state()) {
39       DTRACE_CLASSINIT_PROBE_WAIT(erroneous, instanceKlass::cast(this_oop()), -1,wait);
40       ResourceMark rm(THREAD);
41       const char* desc = "Could not initialize class ";
42       const char* className = this_oop->external_name();
43       size_t msglen = strlen(desc) + strlen(className) + 1;
44       char* message = NEW_RESOURCE_ARRAY(char, msglen);
45       if (NULL == message) {
46         // Out of memory: can't create detailed error message
47         THROW_MSG(vmSymbols::java_lang_NoClassDefFoundError(), className);
48       else {
49         jio_snprintf(message, msglen, "%s%s", desc, className);
50         THROW_MSG(vmSymbols::java_lang_NoClassDefFoundError(), message);
51       }
52     }
53  
54     // Step 6
55     this_oop->set_init_state(being_initialized);
56     this_oop->set_init_thread(self);
57   }
58   ...
59 }

類的初始化過程:

當某個線程獲得機會對某個類進行初始化的時候(請看上麵的Step 6),會設置這個類的init_state屬性為being_initialized(如果初始化好了會設置為fully_initialized,異常的話會設置為initialization_error),還會設置init_thread屬性為當前線程,在這個設置過程中是有針對這個類提供了一把互斥鎖的,因此當有別的線程進來的時候會被攔截在外麵,如果設置完了,這把互斥鎖也釋放了,但是因為這個類的狀態被設置了,因此並發問題也得到了解決,當另外一個線程也嚐試初始化這個類的時候會判斷這個類的狀態是不是being_initialized,並且其init_thread不是當前線程,那麼就會一直卡在那裏,也就是此次線程dump的線程所處的狀態,正在初始化類的線程會調用< clinit >方法,如果正常結束了,那麼就設置其狀態為fully_initialized,並且通知之前卡在那裏等待初始化完成的線程,然他們繼續往下走(下一個動作就是再判斷下狀態,發現完成了就直接return了)

猜想:

在了解了上麵的過程之後,於是我們猜測兩種可能

第一,這個類的狀態還是being_intialized,還在while循環裏沒有跳出來
第二,事件通知機製出現了問題,也就是pthread_cond_wait和pthread_cond_signal之間的通信過程出現了問題。
不過第二種可能性非常小,比較linux久經考驗了,那接下來我們驗證其實是第一個猜想

驗證

我們通過GDB attach的方式連到了問題機器上(好在機器沒有掛),首先我們要找到具體的問題線程,我們通過上麵的jstack -m命令看到了線程ID是5738,然後通過info threads找到對應的線程,並得到它的序號14

01 (gdb) info threads
02 17 process 5724 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
03 16 process 6878 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
04 15 process 5739 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
05 14 process 5738 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
06 13 process 5737 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
07 12 process 5736 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
08 11 process 5735 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
09 10 process 5734 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
10 9 process 5733 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
11 8 process 5732 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
12 7 process 5731 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
13 6 process 5730 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
14 5 process 5729 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
15 4 process 5728 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
16 3 process 5727 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
17 2 process 5726 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
18 1 process 5725 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2

然後通過thread 14切換到對應的線程,並通過bt看到了如下的堆棧,正如我們想象的那樣,正在做類的初始化,一直卡在那裏

01 (gdb) thread 14
02 [Switching to thread 14 (process 5738)]#0 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
03 (gdb) bt
04 #0 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
05 #1 0x005e0d76 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/i686/nosegneg/libpthread.so.0
06 #2 0x005e13ee in pthread_cond_wait@GLIBC_2.0 () from /lib/tls/i686/nosegneg/libpthread.so.0
07 #3 0xb79a71ae in os::PlatformEvent::park () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
08 #4 0xb7997acb in ObjectMonitor::wait () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
09 #5 0xb7a73c53 in ObjectSynchronizer::waitUninterruptibly () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
10 #6 0xb777eb34 in instanceKlass::initialize_impl () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
11 #7 0xb777e288 in instanceKlass::initialize () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
12 #8 0xb7821ad9 in find_class_from_class_loader () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
13 #9 0xb7807d99 in JVM_FindClassFromClassLoader () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
14 #10 0xb734c236 in Java_java_lang_Class_forName0 () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/libjava.so
15 #11 0xb433064a in ?? ()
16 #12 0x0813b120 in ?? ()
17 #13 0x70aaa690 in ?? ()
18 #14 0x70aaa6a0 in ?? ()
19 #15 0x00000001 in ?? ()
20 #16 0x70aaa698 in ?? ()
21 #17 0x00000000 in ?? ()

我們通過f 6選擇第7幀,在通過disassemble反匯編該幀,也就是對instanceKlass::initialize_impl ()這個方法反匯編

01 0xb777eaed <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+45>:  lea    0xfffffff4(%ebp),%esp    //將%ebp偏移0xfffffff4位置的值存到%esp棧頂,然後下麵的pop操作存到%ebx
02 0xb777eaf0 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+48>:  pop    %ebx
03 0xb777eaf1 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+49>:  pop    %esi
04 0xb777eaf2 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+50>:  pop    %edi
05 0xb777eaf3 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+51>:  pop    %ebp
06 0xb777eaf4 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+52>:  ret
07 0xb777eaf5 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+53>:  push   $0x1
08 0xb777eaf7 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+55>:  lea    0xffffffd8(%ebp),%edx
09 0xb777eafa <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+58>:  push   %esi
10 0xb777eafb <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+59>:  push   %ebx
11 0xb777eafc <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+60>:  push   %edx
12 0xb777eafd <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+61>:  call   0xb7a73a80<_ZN12ObjectLockerC1E6HandleP6Threadb>
13 0xb777eb02 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+66>:  add    $0x10,%esp
14 0xb777eb05 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+69>:  xor    %eax,%eax
15 0xb777eb07 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+71>:  test   %ebx,%ebx
16 0xb777eb09 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+73>:  je     0xb777eb0d<_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+77>
17 0xb777eb0b <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+75>:  mov    (%ebx),%eax      //將%ebx的值移到%eax
18 0xb777eb0d <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+77>:  cmpl   $0x4,0xe0(%eax)  //對比%eax偏移0xe0位置的值和0x4(這個值其實就是上麵提到的being_initialized狀態,這就說明了%eax偏移0xe0位置其實存的就是初始化類的初始化狀態)
19 0xb777eb14 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+84>:  jne    0xb777eb4f<_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+143>

從上麵的注釋我們其實得出了,我們要看當前類的初始化狀態,那就是看eax寄存器偏移0xe0的位置的值,而eax其實就是ebp寄存器偏移0xfffffff4位置的值,於是我們通過如下地址內存查到得到是4

1 (gdb) x $ebp + 0xfffffff4
2 0x70aaa45c0x71af2180
3 (gdb) x/3w 0x71af2180 0xe0
4 0x71af22600x00000004  0x0813c800  0x0000001a

而4其實代表的就是being_initialized這個狀態,代碼如下

1 enum ClassState {
2 unparsable_by_gc = 0,
3 allocated,
4 loaded,
5 linked,
6 being_initialized,
7 fully_initialized,
8 initialization_error
9 };

從這於是我們驗證了第一個猜想,其實是狀態一直沒有變更,因此一直卡在那裏,為了更進一步確認這個問題,要是我們能找到該類的init_thread線程id就更清楚了,拿到這個ID我們就能看到這個線程棧,就知道它在幹什麼了,但是很遺憾,這個很難獲取到,至少我一直沒有找到辦法,因為線程ID在線程對象裏一直沒有存,都是調用的os函數來獲取的,得換個思路。

突然發現instanceKlass.hpp代碼中得知兩個屬性原來是相鄰的(init_state和init_thread),於是斷定下一個地址的值就代表是這個線程對象了,但是其屬性何其多,找到想要的太不易了,最主要的是還擔心自己看的代碼和服務器上的jvm代碼不一致,這樣更蛋疼了,於是繼續查看Thread.hpp中的JavaThread類,找到個關鍵字0xDEAD-2=0xDEAB,這個有可能是volatile TerminatedTypes _terminated屬性的值,於是把線程對象打印出來,果然查到了關鍵字0xDEAB

01 (gdb) x/100w 0x0813c800
02 0x813c8000xb7bc06e8 0x00000000 0x00000000 0x00000000
03 0x813c8100x0813c488 0x0813d2c8 0x00000000 0x00000000
04 0x813c8200x080f9bf8 0x080f8b50 0x70a59b60 0x00000000
05 0x813c8300x00000000 0x00000000 0x00000000 0x00000000
06 0x813c8400x00014148 0x00000505 0x00000000 0x00000000
07 0x813c8500x00000000 0x00000000 0x00000000 0x3f800021
08 0x813c8600x00000001 0x00000023 0x3f800021 0x0001b530
09 0x813c8700x00000000 0x00000000 0x00000000 0x080ffdc0
10 0x813c8800x00000001 0x00000000 0x080ffe24 0x00000014
11 0x813c8900x00000031 0x00000000 0x00000000 0x0813dab0
12 0x813c8a00x0813c428 0x0813ce98 0x70a5b000 0x00051000
13 0x813c8b00x00000000 0xffffffff 0x00000000 0x080ffdc0
14 0x813c8c00x00002bad 0x0813d400 0x0813d500 0x0813d700
15 0x813c8d00x0813d800 0x00000000 0x00000000 0x104aa1ad
16 0x813c8e00x544a5ab2 0x32378fc7 0x00008767 0x00000000
17 0x813c8f00x00000000 0x00000000 0x0ee9547d 0x00000000
18 0x813c9000x00000000 0x00000000 0x0813b000 0x75878760
19 0x813c9100x70a59a94 0x00000000 0x70a59abc 0xb7829020
20 0x813c9200xb7bb7100 0x00000000 0x00000000 0x00000000
21 0x813c9300x00000000 0x00000000 0x00000000 0x00000000
22 0x813c9400x00000000 0x00000000 0x00000000 0x00000000
23 0x813c9500x00000000 0x00000000 0x00000000 0x0000000a
24 0x813c9600x0813da98 0x00000000 0x0000deab 0x00000001
25 0x813c9700x00000000 0x00000000 0x00000002 0x00000000
26 0x813c9800x00000000 0x00000000 0x00000000 0x00000000

因此順著這個屬性繼續往上找,找到了_thread_state表示線程狀態的值(向上偏移三個字),0x0000000a,即10,然後查看代碼知道原來線程是出於block狀態

01 public:
02 volatile JavaThreadState _thread_state;
03 private:
04 ThreadSafepointState *_safepoint_state;
05 address _saved_exception_pc;
06 volatile TerminatedTypes _terminated;
07 enum JavaThreadState {
08 _thread_uninitialized = 0// should never happen (missing initialization)
09 _thread_new = 2// just starting up, i.e., in process of being initialized
10 _thread_new_trans = 3// corresponding transition state (not used, included for completness)
11 _thread_in_native = 4// running in native code
12 _thread_in_native_trans = 5// corresponding transition state
13 _thread_in_vm = 6// running in VM
14 _thread_in_vm_trans = 7// corresponding transition state
15 _thread_in_Java = 8// running in Java or in stub code
16 _thread_in_Java_trans = 9// corresponding transition state (not used, included for completness)
17 _thread_blocked = 10// blocked in vm
18 _thread_blocked_trans = 11// corresponding transition state
19 _thread_max_state = 12 // maximum thread state+1 - used for statistics allocation
20 };

這樣一來查看下線程dump,發現”Thread-1″正好處於BLOCKED狀態,也就是說Thread-1就是那個正在對mysql驅動類做初始化的線程,這說明Thread-0和Thread-1成功互鎖了

於是我們展開【Thread-1】,看到- waiting to lock (a java.lang.Class for java.sql.DriverManager),該線程正在等待java.sql.DriverManager類型鎖,而blocked在那裏,而這個類型鎖是被【Thread-0】線程持有的,從【Thread-1】這個線程堆棧來看它其實也是在做Class.forName動作,並且通過”Thread-1″,展開第四幀我們可以看到其正在對加載sun.jdbc.odbc.JdbcOdbcDriver

問題現場遐想:

於是我們大膽設想一個場景,【Thread-1】先獲取到初始化sun.jdbc.odbc.JdbcOdbcDriver的機會,然後在執行sun.jdbc.odbc.JdbcOdbcDriver這個類的靜態塊的時候調用DriverManager.registerDriver(new Driver());,而該方法之前已經提到了是會加同步鎖的,再想象一下,在這個這個靜態塊之前,並且設置了sun.jdbc.odbc.JdbcOdbcDriver類的初始化狀態為being_initialized之後,【Thread-0】這個線程執行到了卡在的那個位置,並且我們從其堆棧可以看出它已經持有了java.sql.DriverManager這個類型的鎖,因此這兩個線程陷入了互鎖狀態

解決方案
解決方案目前想到的是將驅動類的加載過程變成單線程加載,不存在並發情況就沒問題了。

最後更新:2017-05-23 18:02:54

  上一篇:go  Bug:StampedLock鐨勪腑鏂棶棰樺鑷碈PU鐖嗘弧-鍗氬-浜戞爾紺懼尯-闃塊噷浜?
  下一篇:go  AlphaGo 其實挺“笨”的