The Vanishing Deadlock

Problem description
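When a deadlock occurs at the Java level, running the jstack command dumps the deadlock information. Near the end of the dump there is a marker such as Found one Java-level deadlock:, followed by the stacks of the deadlocked threads and the monitors they are contending for. We recently ran into this kind of problem on one of our systems, but with a twist. The dump did report the following deadlock: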
Found one Java-level deadlock:
=============================
"worker-1-thread-121":
waiting to lock monitor 0x00007f3758209dc8 (object 0x0000000764cd2b20, a java.util.concurrent.ConcurrentHashMap),
which is held by "HSFBizProcessor-4-thread-4"
"HSFBizProcessor-4-thread-4":
waiting to lock monitor 0x00007f3758289260 (object 0x000000076073ddc8, a com.rjb.test.extensions.equinox.KernelBundleClassLoader),
which is held by "HSFBizProcessor-4-thread-5"
"HSFBizProcessor-4-thread-5":
waiting to lock monitor 0x00007f3758253420 (object 0x00000007608e6fc8, a com.rjb.test.extensions.equinox.KernelBundleClassLoader),
which is held by "HSFBizProcessor-4-thread-4"
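However, when we searched the stacks for the corresponding lock, we could not find it. The dump claims:
object 0x00000007608e6fc8 which is held by "HSFBizProcessor-4-thread-4"
yet the stack of HSFBizProcessor-4-thread-4 shows no sign of that thread holding this lock.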
The full deadlock section of the thread dump follows:
Found one Java-level deadlock:
=============================
"worker-1-thread-121":
waiting to lock monitor 0x00007f3758209dc8 (object 0x0000000764cd2b20, a java.util.concurrent.ConcurrentHashMap),
which is held by "HSFBizProcessor-4-thread-4"
"HSFBizProcessor-4-thread-4":
waiting to lock monitor 0x00007f3758289260 (object 0x000000076073ddc8, a com.rjb.test.extensions.equinox.KernelBundleClassLoader),
which is held by "HSFBizProcessor-4-thread-5"
"HSFBizProcessor-4-thread-5":
waiting to lock monitor 0x00007f3758253420 (object 0x00000007608e6fc8, a com.rjb.test.extensions.equinox.KernelBundleClassLoader),
which is held by "HSFBizProcessor-4-thread-4"
Java stack information for the threads listed above:
===================================================
"worker-1-thread-121":
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:180)
- waiting to lock <0x0000000764cd2b20> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.beans.factory.support.AbstractBeanFactory.isTypeMatch(AbstractBeanFactory.java:455)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeanNamesForType(DefaultListableBeanFactory.java:317)
......
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"HSFBizProcessor-4-thread-4":
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLoadedClass(Unknown Source)
- waiting to lock <0x000000076073ddc8> (a com.rjb.test.extensions.equinox.KernelBundleClassLoader)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.SingleSourcePackage.loadClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(Unknown Source)
at com.rjb.test.extensions.equinox.KernelBundleClassLoader.loadClass(KernelBundleClassLoader.java:121)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.springframework.scripting.groovy.GroovyScriptFactory.executeScript(GroovyScriptFactory.java:238)
......
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"HSFBizProcessor-4-thread-5":
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLoadedClass(Unknown Source)
- waiting to lock <0x00000007608e6fc8> (a com.rjb.test.extensions.equinox.KernelBundleClassLoader)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.buddy.DependentPolicy.loadClass(Unknown Source)
at org.eclipse.osgi.internal.loader.buddy.PolicyHandler.doBuddyClassLoading(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(Unknown Source)
at com.rjb.test.extensions.equinox.KernelBundleClassLoader.loadClass(KernelBundleClassLoader.java:121)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at groovy.lang.MetaClassRegistry$MetaClassCreationHandle.createWithCustomLookup(MetaClassRegistry.java:127)
at groovy.lang.MetaClassRegistry$MetaClassCreationHandle.create(MetaClassRegistry.java:122)
......
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Found 1 deadlock.
A class loading problem?
Class loading is the first suspect, because the object the threads are deadlocked on is a ClassLoader:
waiting to lock monitor 0x00007f3758289260 (object 0x000000076073ddc8, a com.rjb.test.extensions.equinox.KernelBundleClassLoader)
Now let us analyze the stacks.
HSFBizProcessor-4-thread-4
"HSFBizProcessor-4-thread-4":
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLoadedClass(Unknown Source)
- waiting to lock <0x000000076073ddc8> (a com.rjb.test.extensions.equinox.KernelBundleClassLoader)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.SingleSourcePackage.loadClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(Unknown Source)
at com.rjb.test.extensions.equinox.KernelBundleClassLoader.loadClass(KernelBundleClassLoader.java:121)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.springframework.scripting.groovy.GroovyScriptFactory.executeScript(GroovyScriptFactory.java:238)
at org.springframework.scripting.groovy.GroovyScriptFactory.getScriptedObject(GroovyScriptFactory.java:185)
Only the key frames are shown here. The top of the stack shows that the thread is waiting for a lock:
- waiting to lock <0x000000076073ddc8> (a com.rjb.test.extensions.equinox.KernelBundleClassLoader)
The lock object is a ClassLoader. Looking at the corresponding code, there is indeed a synchronized block:
private Class<?> findLoadedClass(String classname) {
    if ((LOCK_CLASSNAME) || (this.isParallelClassLoader)) {
        boolean initialLock = lockClassName(classname);
        try {
            return this.classloader.publicFindLoaded(classname);
        } finally {
            if (initialLock)
                unlockClassName(classname);
        }
    }
    synchronized (this.classloader) {
        return this.classloader.publicFindLoaded(classname);
    }
}
We also know that the thread is in the middle of a loadClass call, and that the call originates from Spring's GroovyScriptFactory. Here is that code:
protected Object executeScript(ScriptSource scriptSource, Class scriptClass)
        throws ScriptCompilationException
{
    try
    {
        GroovyObject goo = (GroovyObject)scriptClass.newInstance();//line 238
        if (this.groovyObjectCustomizer != null)
        {
            this.groovyObjectCustomizer.customize(goo);
        }
        if ((goo instanceof Script))
        {
            return ((Script)goo).run();
        }
        return goo;
    }
    catch (InstantiationException ex)
    {
        throw new ScriptCompilationException(
            scriptSource, "Could not instantiate Groovy script class: " + scriptClass.getName(), ex);
    }
    catch (IllegalAccessException ex) {
        throw new ScriptCompilationException(
            scriptSource, "Could not access Groovy script constructor: " + scriptClass.getName(), ex);
    }
}
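When execution reaches line 238
GroovyObject goo = (GroovyObject)scriptClass.newInstance();//line 238
we unexpectedly find a call to
java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Yet all that line 238 does is instantiate an object and then cast it. Here is the corresponding bytecode: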
0: aload_2
1: invokevirtual #164 // Method java/lang/Class.newInstance:()Ljava/lang/Object;
4: checkcast #168 // class groovy/lang/GroovyObject
7: astore_3
These few bytecode instructions are all there is. Inside the JVM, however, executing the checkcast instruction can trigger class loading when the target class has not been resolved yet:
void TemplateTable::checkcast() {
  ...
  call_VM(rax, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
  ...
}

IRT_ENTRY(void, InterpreterRuntime::quicken_io_cc(JavaThread* thread))
  // Force resolving; quicken the bytecode
  int which = get_index_u2(thread, Bytecodes::_checkcast);
  constantPoolOop cpool = method(thread)->constants();
  // We'd expect to assert that we're only here to quicken bytecodes, but in a multithreaded
  // program we might have seen an unquick'd bytecode in the interpreter but have another
  // thread quicken the bytecode before we get here.
  // assert( cpool->tag_at(which).is_unresolved_klass(), "should only come here to quicken bytecodes" );
  klassOop klass = cpool->klass_at(which, CHECK);
  thread->set_vm_result(klass);
IRT_END

klassOop klass_at(int which, TRAPS) {
  constantPoolHandle h_this(THREAD, this);
  return klass_at_impl(h_this, which, CHECK_NULL);
}

klassOop constantPoolOopDesc::klass_at_impl(constantPoolHandle this_oop, int which, TRAPS) {
  ...
  klassOop k_oop = SystemDictionary::resolve_or_fail(name, loader, h_prot, true, THREAD);
  ...
}

// SystemDictionary::resolve_or_fail eventually calls the method below
klassOop SystemDictionary::resolve_instance_class_or_null(Symbol* name, Handle class_loader, Handle protection_domain, TRAPS) {
  ...
  // Class is not in SystemDictionary so we have to do loading.
  // Make sure we are synchronized on the class loader before we proceed
  Handle lockObject = compute_loader_lock_object(class_loader, THREAD);
  check_loader_lock_contention(lockObject, THREAD);
  ObjectLocker ol(lockObject, THREAD, DoObjectLock);
  ...
  // At this point ClassLoader.loadClass is called to load the class
  ...
}

Handle SystemDictionary::compute_loader_lock_object(Handle class_loader, TRAPS) {
  // If class_loader is NULL we synchronize on _system_loader_lock_obj
  if (class_loader.is_null()) {
    return Handle(THREAD, _system_loader_lock_obj);
  } else {
    return class_loader;
  }
}
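SystemDictionary::resolve_instance_class_or_null is the key method here. It acquires a lock through ObjectLocker, which is the VM-side equivalent of the synchronized keyword in Java code, and the object being locked is lockObject, returned by SystemDictionary::compute_loader_lock_object above. As the code shows, for any class that is not loaded by the bootstrap class loader, the lock object is the current ClassLoader itself. In other words, while a class is being loaded the thread holds the lock of the current class loader, and only after acquiring that lock does the VM call ClassLoader.loadClass. This explains why HSFBizProcessor-4-thread-4 holds the lock of the class loader
object 0x00000007608e6fc8, a com.rjb.test.extensions.equinox.KernelBundleClassLoader
Unfortunately, because this lock is taken inside the VM rather than explicitly in Java code, it does not show up anywhere in the jstack thread dump.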
HSFBizProcessor-4-thread-5
First, the stack:
"HSFBizProcessor-4-thread-5":
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLoadedClass(Unknown Source)
- waiting to lock <0x00000007608e6fc8> (a com.rjb.test.extensions.equinox.KernelBundleClassLoader)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findLocalClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.buddy.DependentPolicy.loadClass(Unknown Source)
at org.eclipse.osgi.internal.loader.buddy.PolicyHandler.doBuddyClassLoading(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(Unknown Source)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(Unknown Source)
at com.rjb.test.extensions.equinox.KernelBundleClassLoader.loadClass(KernelBundleClassLoader.java:121)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
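This stack is much like the previous thread's. Only the lock it is waiting for differs, and here class loading is triggered by Class.forName. As you may have guessed, a class loader lock is likewise acquired between the following two frames: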
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
I will not paste the code for this path. The JVM method that is eventually called is the same, and so is the locking logic.
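The hidden lock can be probed from plain Java. The following is a minimal sketch, not code from the affected system: ProbingLoader and LoaderLockProbe are hypothetical names, and the loader is deliberately not registered as parallel capable. It records whether the loader's own monitor is already held when loadClass is entered, once for a direct call from Java code and once for a call driven by the VM through Class.forName. On a HotSpot build that follows the resolve_instance_class_or_null logic shown above, the VM-driven call would be expected to find the monitor held, but the result depends on the JVM in use, so no output is claimed here.

```java
public class LoaderLockProbe {

    /** Hypothetical loader (not parallel capable) that records whether its monitor is held on entry. */
    static class ProbingLoader extends ClassLoader {
        volatile boolean monitorHeldOnEntry;

        ProbingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        public Class<?> loadClass(String name) throws ClassNotFoundException {
            // Checked before any Java-level synchronization inside ClassLoader runs,
            // so "true" here means the caller (possibly the VM) already locked this loader.
            monitorHeldOnEntry = Thread.holdsLock(this);
            return super.loadClass(name);
        }
    }

    /** Calls loadClass directly from Java code; nothing has locked the loader beforehand. */
    static boolean heldWhenCalledDirectly() {
        ProbingLoader loader = new ProbingLoader(LoaderLockProbe.class.getClassLoader());
        try {
            loader.loadClass("java.util.ArrayList");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return loader.monitorHeldOnEntry;
    }

    /** Lets the VM drive the load via Class.forName, the same trigger as in the thread-5 stack. */
    static boolean heldWhenCalledByVm() {
        ProbingLoader loader = new ProbingLoader(LoaderLockProbe.class.getClassLoader());
        try {
            Class.forName("java.util.ArrayList", false, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return loader.monitorHeldOnEntry;
    }

    public static void main(String[] args) {
        System.out.println("monitor held on direct call:    " + heldWhenCalledDirectly());
        System.out.println("monitor held on VM-driven call: " + heldWhenCalledByVm());
    }
}
```

If the second line prints true, the lock was taken before any Java frame of the loader ran, which is the situation in which a stack trace has no Java frame to attach a "locked" entry to.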
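Summary
Picture this scenario: two threads each use a different class loader to load a class. Because of how OSGi class loading works, a loadClass call may delegate to another class loader. Here, each thread first acquired the lock of its own class loader (inside the VM, as described above) and then delegated to the other thread's class loader. In the findLoadedClass method shown earlier, the class loader being synchronized on is exactly the other thread's loader, and that is what produced the deadlock.
Last updated: 2017-04-11 19:32:01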
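The lock ordering in this scenario can be reproduced in miniature. The sketch below is only an illustration under stated assumptions: two plain objects stand in for the two class loaders, the class and thread names are hypothetical, and both locks are taken with ordinary synchronized blocks. In the real incident the first lock was taken by the VM, which is why it was missing from the stacks; in this sketch both locks are Java-level and would be visible in a dump. Detection uses the standard ThreadMXBean.findMonitorDeadlockedThreads call.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CrossDelegationDeadlock {

    /** Starts two daemon threads that deadlock, then returns the sorted names the JVM reports. */
    static List<String> reproduce() {
        final Object loaderA = new Object(); // stand-in for the first class loader
        final Object loaderB = new Object(); // stand-in for the second class loader
        final CountDownLatch bothHoldOwnLock = new CountDownLatch(2);

        Thread t4 = new Thread(() -> loadVia(loaderA, loaderB, bothHoldOwnLock), "demo-thread-4");
        Thread t5 = new Thread(() -> loadVia(loaderB, loaderA, bothHoldOwnLock), "demo-thread-5");
        t4.setDaemon(true); // daemon threads, so the stuck pair does not keep the JVM alive
        t5.setDaemon(true);
        t4.start();
        t5.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        // Poll until both demo threads are reported as deadlocked, or give up after 5 seconds.
        while (names.size() < 2 && System.nanoTime() < deadline) {
            names.clear();
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null && info.getThreadName().startsWith("demo-")) {
                        names.add(info.getThreadName());
                    }
                }
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Locks "own" first, waits until the peer has done the same, then tries to lock "other". */
    static void loadVia(Object own, Object other, CountDownLatch bothHoldOwnLock) {
        synchronized (own) {
            bothHoldOwnLock.countDown();
            try {
                bothHoldOwnLock.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (other) {
                // Never reached: the peer thread holds "other" and is waiting for "own".
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + reproduce());
    }
}
```

The latch only makes the interleaving deterministic: each thread is guaranteed to hold its own lock before either one reaches for the other's, which mirrors the two threads that each held their own class loader before delegating.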