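Java Threads: Things Worth Knowing
This post sat in my drafts folder for a long time. Recently some colleagues posted a few articles about Java threads in our company's internal tech community, and I realized there was a lot left to dig into, so I took the time to finish it. This post focuses only on the thread model at the JVM level and does not tie it to the thread model of the underlying operating system. (Because space is limited, it covers neither the structure of a thread dump nor the combined use of tools during tuning, such as ps, perf top, iostat, jstack, the TDA plugin, and Thread Inspector. If you have questions, feel free to leave a comment.) Later I plan to take an in-depth look at the thread/process models on Unix-like platforms and Windows, and I hope you will enjoy that too. OK, back to the topic. Here is the diagram: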

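A Java thread has six states in total: NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, and TERMINATED. Here we focus on BLOCKED and TIMED_WAITING.
The BLOCKED state: a thread usually enters this state for one of two reasons: it is waiting for a monitor (intrinsic or external) in order to enter a synchronized block, or it is waiting to re-enter a synchronized block after calling Object.wait(). Before going further, let's look at the two queues in the Java thread model, as shown in the figure: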
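To make the states concrete, here is a minimal sketch that lists the values of the Thread.State enum and observes the two states that are easy to catch deterministically, NEW and TERMINATED. The class name ThreadStateLifecycle and the method observe() are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadStateLifecycle {
    /** Returns the state seen before start() and the state seen after join(). */
    static List<Thread.State> observe() {
        List<Thread.State> seen = new ArrayList<>();
        Thread t = new Thread(() -> { /* no work, finishes immediately */ }, "lifecycle-demo");
        seen.add(t.getState());            // not started yet: NEW
        t.start();
        try {
            t.join();                      // wait until the thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen.add(t.getState());            // finished: TERMINATED
        return seen;
    }

    public static void main(String[] args) {
        // Thread.State is an enum with exactly the six states named above.
        for (Thread.State s : Thread.State.values()) {
            System.out.println(s);
        }
        System.out.println(observe());
    }
}
```

The intermediate states (RUNNABLE, BLOCKED, WAITING, TIMED_WAITING) depend on timing, so observing them reliably needs some form of polling.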

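At any given moment a Monitor can be owned by only one thread, the "Active Thread". All other threads are "Waiting Threads", and they wait in one of two queues, the "Entry Set" and the "Wait Set". A thread waiting in the Entry Set is reported as "waiting for monitor entry", while a thread waiting in the Wait Set is reported as "in Object.wait()". Heavy contention on a synchronized block leaves threads stuck in BLOCKED. Note that threads queued on a ReentrantLock or ReentrantReadWriteLock park through LockSupport, so in a dump they appear as WAITING (parking) or TIMED_WAITING (parking) rather than BLOCKED, although the diagnosis is much the same. This is a situation we run into often during tuning, and the approach is straightforward: find the address of the lock being waited on and analyze whether thread starvation has occurred.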
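The difference between waiting on a monitor and waiting on a java.util.concurrent lock can be checked with a small experiment. The sketch below is only an illustration (the class and method names are hypothetical): the calling thread holds a lock, starts a second thread that contends for it, and polls the second thread's state for up to about five seconds.

```java
import java.util.concurrent.locks.ReentrantLock;

class BlockedVsWaiting {
    /** Polls until the thread reaches the expected state or about 5 s pass; returns the last state seen. */
    static Thread.State awaitState(Thread t, Thread.State expected) {
        long deadline = System.nanoTime() + 5_000_000_000L;
        Thread.State s = t.getState();
        while (s != expected && System.nanoTime() < deadline) {
            Thread.yield();
            s = t.getState();
        }
        return s;
    }

    /** The contender waits in the monitor's entry set, so it is reported as BLOCKED. */
    static Thread.State stateWhileContendingMonitor() {
        Object monitor = new Object();
        Thread t = new Thread(() -> { synchronized (monitor) { } }, "monitor-contender");
        t.setDaemon(true);
        synchronized (monitor) {           // hold the monitor while the contender starts
            t.start();
            return awaitState(t, Thread.State.BLOCKED);
        }
    }

    /** The contender parks inside ReentrantLock.lock(), so it is reported as WAITING. */
    static Thread.State stateWhileContendingReentrantLock() {
        ReentrantLock lock = new ReentrantLock();
        Thread t = new Thread(() -> { lock.lock(); lock.unlock(); }, "lock-contender");
        t.setDaemon(true);
        lock.lock();                       // hold the lock while the contender starts
        try {
            t.start();
            return awaitState(t, Thread.State.WAITING);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("synchronized:  " + stateWhileContendingMonitor());
        System.out.println("ReentrantLock: " + stateWhileContendingReentrantLock());
    }
}
```

When reading a dump, this is why "waiting for monitor entry" points at synchronized code, while "parking to wait for" an AbstractQueuedSynchronizer object points at a java.util.concurrent lock.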
As for the TIMED_WAITING state, the official documentation explains it well: a thread enters this state when it calls one of the following methods.
- Thread.sleep
- Object.wait with timeout
- Thread.join with timeout
- LockSupport.parkNanos
- LockSupport.parkUntil
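LockSupport deserves special attention here. The class provides the basic thread-blocking primitives for building locks and other synchronization classes. It is an improvement over Thread.suspend() and Thread.resume(), and it also helps avoid busy-waiting and excessive spinning (interested readers can consult reference 5 on this point).
OK, with the key thread states briefly introduced, let's look at thread stacks through a few concrete cases: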
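As a rough illustration of how parking shows up in a thread's state, the sketch below parks a helper thread with LockSupport.parkNanos and reads its state from the outside. The class name ParkNanosDemo and the flag released are invented for this example. Because park is allowed to return spuriously, the helper re-checks its flag in a loop, which is the usual way to use it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class ParkNanosDemo {
    /** Parks a helper thread with a timeout and returns the state observed from outside. */
    static Thread.State stateWhileParked() {
        AtomicBoolean released = new AtomicBoolean(false);
        Thread parker = new Thread(() -> {
            // park may return without a reason, so always re-check the condition
            while (!released.get()) {
                LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(30));
            }
        }, "parker");
        parker.setDaemon(true);
        parker.start();

        // Poll for up to about 5 s until the helper is seen parked.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        Thread.State state = parker.getState();
        while (state != Thread.State.TIMED_WAITING && System.nanoTime() < deadline) {
            Thread.yield();
            state = parker.getState();
        }

        released.set(true);
        LockSupport.unpark(parker);        // wake the helper so it can exit
        return state;
    }

    public static void main(String[] args) {
        System.out.println(stateWhileParked());
    }
}
```

Unlike Thread.suspend()/resume(), unpark targets a specific thread and is safe to call before the matching park, which avoids the lost-wakeup race.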
Case 1: The Acceptor in NIO
"qtp589745448-36 Acceptor0 SelectChannelConnector@0.0.0.0:8161" prio=10 tid=0x00007f02f8eea800 nid=0x18ee runnable [0x00007f02e70b3000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
        - locked <0x00000000ec8ffde8> (a java.lang.Object)
        at org.eclipse.jetty.server.nio.SelectChannelConnector.accept(SelectChannelConnector.java:109)
        at org.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:938)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
        - None
Let's see how this is implemented in the source code:
public void accept(int acceptorID) throws IOException
{
    ServerSocketChannel server;
    synchronized(this)
    {
        server = _acceptChannel;
    }

    if (server!=null && server.isOpen() && _manager.isStarted())
    {
        // Blocks until a connection arrives; this is SelectChannelConnector.java:109 in the stack above
        SocketChannel channel = server.accept();
        channel.configureBlocking(false);
        Socket socket = channel.socket();
        configure(socket);
        _manager.register(channel);
    }
}
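One point about thread stacks is worth stressing: nid is the native LWP ID, that is, the ID of the native lightweight process (in other words, the thread), printed in hexadecimal.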
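Since the nid is hexadecimal while OS tools such as top -H or ps -L show thread IDs in decimal, a small conversion is often needed to match the two. The helper below is a hypothetical convenience (the class NidConverter is not part of any tool), applied to the nid from the Acceptor dump above.

```java
class NidConverter {
    /** Converts a hexadecimal nid from a thread dump (e.g. "0x18ee") to a decimal LWP id. */
    static long toDecimal(String nid) {
        String hex = (nid.startsWith("0x") || nid.startsWith("0X")) ? nid.substring(2) : nid;
        return Long.parseLong(hex, 16);
    }

    public static void main(String[] args) {
        // nid of the Acceptor thread in Case 1
        System.out.println(toDecimal("0x18ee")); // prints 6382
    }
}
```

Going the other way, Long.toHexString(6382) yields "18ee", which can be searched for in the dump once a hot thread has been spotted at the OS level.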
Case 2: The Selector in NIO
"qtp589745448-35 Selector0" prio=10 tid=0x00007f02f8ee9800 nid=0x18ed runnable [0x00007f02e71b4000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ec9006f0> (a sun.nio.ch.Util$2)
        - locked <0x00000000ec9006e0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ec9004c0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.eclipse.jetty.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:569)
        at org.eclipse.jetty.io.nio.SelectorManager$1.run(SelectorManager.java:290)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
        - None
The code snippet is as follows:
// If we should wait with a select
if (wait>0)
{
    long before=now;
    selector.select(wait);   // SelectorManager.java:569 in the stack above
    now = System.currentTimeMillis();
    _timeout.setNow(now);

    // If we are monitoring for busy selector
    // and this select did not wait more than 1ms
    if (__MONITOR_PERIOD>0 && now-before <=1)
    {
        // count this as a busy select and if there have been too many this monitor cycle
        if (++_busySelects>__MAX_SELECTS)
        {
            // Start injecting pauses
            _pausing=true;

            // if this is the first pause
            if (!_paused)
            {
                // Log and dump some status
                _paused=true;
                LOG.warn("Selector {} is too busy, pausing!",this);
            }
        }
    }
}
Case 3: The Handler for the MQTT protocol in ActiveMQ
"ActiveMQ Transport Server Thread Handler: mqtt://0.0.0.0:1883?maximumConnections=1000&wireFormat.maxFrameSize=104857600" daemon prio=10 tid=0x00007f02f8ba6000 nid=0x18dc waiting on condition [0x00007f02ec824000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000faad0458> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at org.apache.activemq.transport.tcp.TcpTransportServer$1.run(TcpTransportServer.java:373)
        at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
        - None

Code snippet:
@Override
protected void doStart() throws Exception {
    if (useQueueForAccept) {
        Runnable run = new Runnable() {
            @Override
            public void run() {
                try {
                    while (!isStopped() && !isStopping()) {
                        Socket sock = socketQueue.poll(1, TimeUnit.SECONDS);
                        if (sock != null) {
                            handleSocket(sock);
                        }
                    }
                } catch (InterruptedException e) {
                    LOG.info("socketQueue interuppted - stopping");
                    if (!isStopping()) {
                        onAcceptError(e);
                    }
                }
            }
        };
        socketHandlerThread = new Thread(null, run, "ActiveMQ Transport Server Thread Handler: " + toString(), getStackSize());
        socketHandlerThread.setDaemon(true);
        socketHandlerThread.setPriority(ThreadPriorities.BROKER_MANAGEMENT - 1);
        socketHandlerThread.start();
    }
    super.doStart();
}
Case 5: Simulating a bank transfer and deposit
"withdraw" prio=10 tid=0x00007f3428110800 nid=0x2b6b waiting for monitor entry [0x00007f34155bb000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.von.thread.research.DeadThread.depositMoney(DeadThread.java:13)
        - waiting to lock <0x00000000d7fae540> (a java.lang.Object)
        - locked <0x00000000d7fae530> (a java.lang.Object)
        at com.von.thread.research.DeadThread.run(DeadThread.java:28)
        at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
        - None

"deposit" prio=10 tid=0x00007f342810f000 nid=0x2b6a waiting for monitor entry [0x00007f34156bc000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.von.thread.research.DeadThread.withdrawMoney(DeadThread.java:21)
        - waiting to lock <0x00000000d7fae530> (a java.lang.Object)
        - locked <0x00000000d7fae540> (a java.lang.Object)
        at com.von.thread.research.DeadThread.run(DeadThread.java:29)
        at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
        - None
Found one Java-level deadlock:
=============================
"withdraw":
  waiting to lock monitor 0x00007f3400003620 (object 0x00000000d7fae540, a java.lang.Object),
  which is held by "deposit"
"deposit":
  waiting to lock monitor 0x00007f3400004b20 (object 0x00000000d7fae530, a java.lang.Object),
  which is held by "withdraw"

Java stack information for the threads listed above:
===================================================
"withdraw":
        at com.von.thread.research.DeadThread.depositMoney(DeadThread.java:13)
        - waiting to lock <0x00000000d7fae540> (a java.lang.Object)
        - locked <0x00000000d7fae530> (a java.lang.Object)
        at com.von.thread.research.DeadThread.run(DeadThread.java:28)
        at java.lang.Thread.run(Thread.java:724)
"deposit":
        at com.von.thread.research.DeadThread.withdrawMoney(DeadThread.java:21)
        - waiting to lock <0x00000000d7fae530> (a java.lang.Object)
        - locked <0x00000000d7fae540> (a java.lang.Object)
        at com.von.thread.research.DeadThread.run(DeadThread.java:29)
        at java.lang.Thread.run(Thread.java:724)

Found 1 deadlock.
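This is a deadlock caused by acquiring locks in an inconsistent order: each thread holds one of the two locks and waits for the other.
That's about it. To wrap up, here are the three kinds of situations to watch most closely during tuning (grep java.lang.Thread.State dump.bin | awk '{print $2$3$4$5}' | sort | uniq -c):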
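The usual remedy is to make every thread take the locks in the same global order. The source of DeadThread is not shown here, so the sketch below does not reproduce it; it is a hypothetical account model (the Account class, its id field, and transfer() are assumptions for illustration) in which the lower id is always locked first.

```java
class OrderedTransfer {
    static final class Account {
        final int id;        // unique id, used only to define a global lock order
        long balance;
        Account(int id, long balance) { this.id = id; this.balance = balance; }
    }

    /** Moves money between accounts, always locking the account with the smaller id first. */
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    /** Runs two threads that transfer in opposite directions; returns the final balances. */
    static long[] runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(a, b, 1); }, "withdraw");
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(b, a, 1); }, "deposit");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(100_000);
        System.out.println(balances[0] + " " + balances[1]);
    }
}
```

With the naive order (lock from, then to) the two threads above could each grab one account and wait forever for the other, which is the pattern visible in the dump. With a fixed order, whichever thread wins the first lock can always obtain the second.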
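1. waiting for monitor entry – thread state BLOCKED. Possible problems: deadlock (sequential deadlock, starvation deadlock, ...)
2. waiting on condition – sleeping or TIMED_WAITING. Possible problem: I/O bottleneck
3. Object.wait – WAITING or TIMED_WAITING. Be clear about the performance characteristics and limitations of wait & notifyAll; JCIP also recommends preferring, wherever possible, the higher-level concurrency utilities in JUC, which are built on AQS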
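The grep pipeline above counts states in a dump file. A similar histogram can be taken from inside a running JVM; the following is a rough sketch using Thread.getAllStackTraces(), with the class name ThreadStateHistogram chosen only for this example.

```java
import java.util.EnumMap;
import java.util.Map;

class ThreadStateHistogram {
    /** Counts the live threads of the current JVM by Thread.State. */
    static Map<Thread.State, Integer> countLiveThreadStates() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // getAllStackTraces() returns a snapshot keyed by every live thread
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countLiveThreadStates().forEach((state, n) -> System.out.println(n + " " + state));
    }
}
```

A sudden growth in the BLOCKED or TIMED_WAITING counts between two snapshots is a hint to take a full dump with jstack and look at the corresponding stacks.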
References:
1. https://architects.dzone.com/articles/how-analyze-java-thread-dumps
2. https://stackoverflow.com/questions/7698861/simple-java-example-runs-with-14-threads-why
3. https://www.slideshare.net/Byungwook/analysis-bottleneck-in-j2ee-application
4. https://docs.oracle.com/javase/1.5.0/docs/guide/misc/threadPrimitiveDeprecation.html
5. Java Concurrency in Practice
6. https://stackoverflow.com/questions/37026/java-notify-vs-notifyall-all-over-again
Last updated: 2017-04-03 12:53:45