How Much Performance Does Locking a Non-Critical Section with a Mutex Cost?


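Overview

This test measures the performance cost of using a mutex to lock a non-critical section.

When writing code, we can sometimes design the logic so that, more than 80% of the time, a critical section is never accessed by two threads at once. In theory, though, it can still become a real critical section in extreme or very low-probability cases, so for the sake of program stability it still has to be locked.

However, while reading the disruptor documentation [1] recently, I came across this claim:

Even when the resource is not actually contended, merely calling the lock drastically reduces performance.
In my own project code, I have always tried to design the logic so that fewer threads compete for the same lock. Was that wasted effort?

The document uses a simple test that performs 500 million ++ operations. Since that test is written in Java, I implement one in C here and let practice check the result.

Real contention for a resource will certainly degrade performance, so the comparison here focuses on entering a "fake critical section", one that is locked but never actually contended.

Test code: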


#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <unistd.h>
#include <string.h>
#include <pthread.h>

unsigned long gtimes = 2 * 1000 * 1000 * 1000; // loop iterations per test
unsigned long i;                               // global loop counter
struct timeval startTime, endTime;
pthread_mutex_t gmutex; // global, to make sure it is not a stack variable

// Record the wall-clock time at the start of a measurement.
void start_time()
{
    gettimeofday(&startTime, NULL);
}

// Record the wall-clock time at the end of a measurement.
void end_time()
{
    gettimeofday(&endTime, NULL);
}

// Elapsed time between start_time() and end_time(), in milliseconds.
double spend_time()
{
    return 1000 * (endTime.tv_sec - startTime.tv_sec) +
           (endTime.tv_usec - startTime.tv_usec) / 1000.0;
}

// Baseline: run the countdown loop without any lock.
void* test_thread(void* argv)
{
    i = gtimes;
    start_time();
    while (i--);
    end_time();
    printf(" a thread cost time: %.2f ms\n", spend_time());
    return NULL;
}

// Same loop, wrapped in a single uncontended lock/unlock pair.
void* test_lockthread(void* argv)
{
    i = gtimes;
    pthread_mutex_init(&gmutex, NULL);
    start_time();
    pthread_mutex_lock(&gmutex);
    while (i--);
    pthread_mutex_unlock(&gmutex);
    end_time();
    pthread_mutex_destroy(&gmutex);
    printf(" a thread with a pthread_mutex, cost time: %.2f ms\n", spend_time());
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t pid;
    //pthread_create(&pid, NULL, test_thread, NULL);
    // Locked loop in a separate thread, then both loops in the main thread.
    pthread_create(&pid, NULL, test_lockthread, NULL);
    pthread_join(pid, NULL);
    test_thread(NULL);
    test_lockthread(NULL);
    return 0;
}

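Test results (all times in ms):

Run	No lock (ms)	With lock (ms)	Relative difference	Absolute difference (ms)	With lock, in a separate thread (ms)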
1	990.86	1007.29	1.66%	16.43	987.79
2	996.13	997.04	0.09%	0.91	1001.21
3	988.47	989.19	0.07%	0.72	982.72
4	993.6	992.02	-0.16%	-1.58	986.94
5	984.85	984.57	-0.03%	-0.28	989.66
6	991.59	986.75	-0.49%	-4.84	992.94
7	986.68	986.72	0.00%	0.04	983.4
8	989.16	991.17	0.20%	2.01	987.69
9	987.22	1001.31	1.43%	14.09	985.03
10	986.27	984.09	-0.22%	-2.18	987.14

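The table shows the following.
When the two cases run in different threads, the results are not comparable: the difference between them is not consistent from run to run.
Once process scheduling is taken into account, a fairly large deviation is also expected in theory.

When both cases run in the same thread, the gap is small in every run except runs 1 and 9:
The largest deviation is under 5 ms, a deviation rate under 0.5%.
Half of these runs deviate by less than 1 ms, a deviation rate under 0.1%.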

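So where does the error in the other two runs come from? Again, the time slice. The Linux time slice is taken to be 10 ms here.
In the program, the two functions run back to back. If the time slice expires after the first function has finished and just after the second function has called start_time, the second measurement absorbs one extra time slice.
If we subtract one time slice from the second function's time, the result falls roughly within the acceptable range (16.43 ms and 14.09 ms become 6.43 ms and 4.09 ms), although in practice there will also be at least two thread switches.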