X11 Qt 5.6.2: application hangs / stops refreshing after running for a while (_XReply)
Hardware:
Forlinx (飞凌) i.MX6DL development board
Software:
Linux 4.1.15, X11, Qt 5.6.2
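Symptoms:
After running for a while, the Qt application hangs (the UI stops refreshing). At that point top shows that the process is still alive with 0% CPU usage (the process is named Impella), the kernel looks healthy, memory usage looks normal, and touchscreen events are still being captured. Other Qt programs started afterwards work fine; only the Impella instance that has been running for a while is stuck.
The time to reproduce is not fixed: it can take two to three hours, or one to two days.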
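Troubleshooting:
I debugged this for a long time. Because I trusted the board vendor, I first suspected the Qt program itself: a deadlock, a thread stuck sleeping, an infinite loop, and so on.
Testing showed that a deadlock or a sleep matches the symptom: the process hangs and CPU usage is 0.
An infinite loop also hangs the program, but CPU usage is then very high, which does not match.
After going through the code I concluded that a deadlock, a sleep, or an infinite loop was very unlikely. The code uses no semaphores, sleep calls, or the like.
Next I suspected that some slot function was blocking and causing an apparent freeze. I spent a while searching on Baidu; the answers online were all over the place, and none of them solved my problem.
Finally I thought of using gdb to inspect the program's state and call stack at the moment of the hang.
Once the program has hung, attach gdb to the program's pid (gdb attach <pid>),
then dump the call stack with bt full: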
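The "blocked with 0% CPU" versus "spinning with high CPU" distinction can also be checked without top, by reading /proc/<pid>/stat twice a few seconds apart. The following is only a minimal sketch: the struct and function names are made up for illustration, and the two heuristics are my own assumptions, not something from Qt or the BSP. Note that the state field describes the main thread, while the CPU counters cover the whole process, so a busy worker thread would mask a stuck GUI thread.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Snapshot of the fields we care about from one /proc/<pid>/stat line.
struct ProcSnapshot {
    bool ok = false;             // true if the line parsed cleanly
    char state = '?';            // field 3: R (running), S (sleeping), D, T, Z, ...
    unsigned long cpuTicks = 0;  // fields 14 + 15: utime + stime, in clock ticks
};

// Parse one line in /proc/<pid>/stat format. The process name (field 2) is
// wrapped in parentheses and may itself contain spaces, so parsing starts
// after the last ')'.
ProcSnapshot parseStatLine(const std::string& line) {
    ProcSnapshot snap;
    const std::size_t close = line.rfind(')');
    if (close == std::string::npos) return snap;

    std::istringstream rest(line.substr(close + 1));
    if (!(rest >> snap.state)) return snap;

    // Skip fields 4..13 (ppid ... cmajflt); some can be negative, so read as text.
    std::string skipped;
    for (int i = 0; i < 10; ++i) {
        if (!(rest >> skipped)) return snap;
    }

    unsigned long utime = 0, stime = 0;
    if (!(rest >> utime >> stime)) return snap;
    snap.cpuTicks = utime + stime;
    snap.ok = true;
    return snap;
}

// Hypothetical heuristic: not runnable and no CPU time consumed between the
// two snapshots suggests a deadlock or a wait (the symptom in this post).
bool looksBlocked(const ProcSnapshot& before, const ProcSnapshot& after) {
    return before.ok && after.ok && after.state != 'R' &&
           after.cpuTicks == before.cpuTicks;
}

// Hypothetical heuristic: runnable and CPU time growing suggests a busy loop.
bool looksSpinning(const ProcSnapshot& before, const ProcSnapshot& after) {
    return before.ok && after.ok && after.state == 'R' &&
           after.cpuTicks > before.cpuTicks;
}
```

In use, one would read the single line of /proc/<pid>/stat with std::ifstream and std::getline, wait a few seconds, read it again, and pass both parsed snapshots to looksBlocked or looksSpinning.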
(gdb) bt full
#0 0x75e5fa0c in pthread_cond_wait () from /lib/libpthread.so.0
No symbol table info available.
#1 0x75856b3c in _XReply () from /usr/lib/libX11.so.6
No symbol table info available.
#2 0x758593e4 in ?? () from /usr/lib/libX11.so.6
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
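The log above shows that, reading the stack from the bottom up, the program went from _XReply() into pthread_cond_wait() when it hung
and ended up waiting there forever. So libX11 is the culprit.
Looking through libX11's issue history, I found that upstream fixed this problem in the releases after 1.6.9: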
https://gitlab.freedesktop.org/xorg/lib/libx11/-/merge_requests/13/diffs
Avoid recursing through _XError due to sequence adjustment
This patch is based on research done by Dmitry Osipenko to uncover the cause of a large class of Xlib lockups.
_XError must unlock and re-lock the display around the call to the user error handler function. When re-locking the display, two functions are called to ensure that the display is ready to generate a request:
_XIDHandler(dpy);
_XSeqSyncFunction(dpy);
The first ensures that there is at least one XID available to use (possibly calling _xcb_generate_id to do so). The second makes sure a reply is received at least every 65535 requests to keep sequence numbers in sync (possibly generating a GetInputFocus request and synchronously awaiting the reply).
If the second of these does generate a GetInputFocus request and wait for the reply, then a pending error will cause recursion into _XError, which deadlocks the display.
One seemingly easy fix is to have _XError avoid those calls by invoking InternalLockDisplay instead of LockDisplay…
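To make the recursion described in the commit message easier to follow, here is a toy model of it. It is only a sketch under heavy assumptions: ToyDisplay and the toy* functions are invented for illustration and are not Xlib code, and the syncOnRelock switch merely mimics the "seemingly easy fix" of skipping the sequence sync on re-lock that the message mentions. It does not claim to reproduce the fix that was finally merged.

```cpp
#include <string>
#include <vector>

// Toy stand-in for the per-display state; NOT the real Xlib Display struct.
struct ToyDisplay {
    unsigned requestsSinceReply = 0;  // requests sent since the last reply
    bool errorPending = false;        // an X error is queued for delivery
    bool replyInProgress = false;     // this thread is already waiting for a reply
    bool wouldDeadlock = false;       // set when the model re-enters the reply wait
    std::vector<std::string> trace;   // call order, for inspection
};

void toyReply(ToyDisplay& dpy, bool syncOnRelock);

// Models the sequence sync step: force a round trip once 65535 requests
// have gone by without a reply.
void toySeqSync(ToyDisplay& dpy, bool syncOnRelock) {
    dpy.trace.push_back("seqSync");
    if (dpy.requestsSinceReply >= 65535) {
        toyReply(dpy, syncOnRelock);
    }
}

// Models the error path: the user handler runs with the display unlocked,
// then the display is re-locked. With syncOnRelock set, re-locking also
// runs the sequence sync.
void toyError(ToyDisplay& dpy, bool syncOnRelock) {
    dpy.trace.push_back("error");
    if (syncOnRelock) {
        toySeqSync(dpy, syncOnRelock);
    }
}

// Models a synchronous wait for a reply.
void toyReply(ToyDisplay& dpy, bool syncOnRelock) {
    dpy.trace.push_back("reply");
    if (dpy.replyInProgress) {
        // A real program would block here, waiting on itself.
        dpy.wouldDeadlock = true;
        return;
    }
    dpy.replyInProgress = true;
    if (dpy.errorPending) {
        dpy.errorPending = false;
        toyError(dpy, syncOnRelock);
    }
    dpy.replyInProgress = false;
    dpy.requestsSinceReply = 0;  // reply received, sequence numbers back in sync
}

// Start from the state the commit message describes: the request counter
// has reached 65535, optionally with an error already pending.
ToyDisplay simulate(bool errorPending, bool syncOnRelock) {
    ToyDisplay dpy;
    dpy.requestsSinceReply = 65535;
    dpy.errorPending = errorPending;
    toySeqSync(dpy, syncOnRelock);
    return dpy;
}
```

In this model, simulate(true, true) re-enters the reply wait and sets wouldDeadlock, while simulate(true, false) and simulate(false, true) finish normally. That matches the intermittent nature of the hang: it needs both the 65535-request boundary and a pending error at the same moment.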
Version 1.7.0 adjusted this again in the following commit, whose message is the same as the one quoted above:
https://gitlab.freedesktop.org/xorg/lib/libx11/-/commit/30ccef3a48029bf4fc31d4abda2d2778d0ad6277
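Solution:
The libX11 in my current BSP is version 1.6.3,
so upgrading libX11 to a version newer than 1.6.9 resolves the problem.
In the end I upgraded to the latest version at the time, 1.7.2.
libX11 source download: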
https://www.x.org/releases/individual/lib/
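To check whether a given BSP is affected, the installed libX11 version can be compared against the 1.6.9 baseline named above. The sketch below assumes the version string has already been obtained, for example from pkg-config --modversion x11 if pkg-config and x11.pc are available on the target. The function names are hypothetical, and the parser assumes plain numeric major.minor.patch components.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>

// Parse "major.minor.patch" into three integers; missing parts count as 0.
// Assumes each part is a plain decimal number (std::stoi throws otherwise).
std::array<int, 3> parseVersion(const std::string& text) {
    std::array<int, 3> parts{{0, 0, 0}};
    std::istringstream in(text);
    std::string piece;
    for (std::size_t i = 0; i < parts.size() && std::getline(in, piece, '.'); ++i) {
        parts[i] = std::stoi(piece);
    }
    return parts;
}

// True if `installed` is strictly newer than `baseline`. Components are
// compared numerically, so 1.6.10 counts as newer than 1.6.9.
bool isNewerThan(const std::string& installed, const std::string& baseline) {
    return parseVersion(installed) > parseVersion(baseline);
}
```

With the versions from this post, isNewerThan("1.6.3", "1.6.9") is false (the original BSP) and isNewerThan("1.7.2", "1.6.9") is true (the upgraded library).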
I generally trust the platform, but when something like this goes wrong it is a real killer. I hope this write-up helps someone else avoid the same pitfall.