understanding exploits


felix
06-01-2004, 04:28 PM
Aloha!

I'm reading about how exploits work and trying to understand them. One paper uses a technique called "ALIGN", and I don't know what it is trying to align! :(

The code is this:

for (i = 0; i < howlong; i += 4)
{
    buffer[i+ALIGN]   = (retaddr & 0x000000ff);
    buffer[i+ALIGN+1] = (retaddr & 0x0000ff00) >> 8;
    buffer[i+ALIGN+2] = (retaddr & 0x00ff0000) >> 16;
    buffer[i+ALIGN+3] = (retaddr & 0xff000000) >> 24;
}


Where:

i = integer variable used in the loop
howlong = the size of the buffer to be overflowed
ALIGN = an integer initialized to ZERO.
retaddr = the stack pointer minus an offset (in this case ZERO).

I understood that it is a right bit shift, but why? What for? What are these 0xyyyyyyyy values? Can someone help me?

Thkz

RobSeace
06-01-2004, 07:39 PM
Well, on some systems, code and data must be properly aligned
on the stack, usually so that it starts on a native word address
boundary... So, presumably, that "ALIGN" value would be set to
the offset needed to reach the proper alignment... *shrug*

As for the bit-twiddling and hex numbers, all that's doing is splitting
the retaddr into its component bytes, and stuffing each one
individually into the appropriate spot in the buffer... Though,
really, those 4 assignments could be replaced by a simple:


memcpy (buffer + i + ALIGN, &retaddr, 4);


But, some people like to do things the hard way... ;-)
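
A minimal sketch of the memcpy() variant written as a bounded loop, with ALIGN acting as nothing more than a starting offset into the buffer. The helper name fill_pattern and its parameters are hypothetical (not from the paper), and the value is treated as an arbitrary 32-bit number:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes `value` into `buf` every 4 bytes, starting
 * at offset `align`, and never writes past `len`.
 * Returns how many copies were written. */
size_t fill_pattern(unsigned char *buf, size_t len, size_t align, uint32_t value)
{
    size_t copies = 0;
    size_t i;

    for (i = align; i + sizeof value <= len; i += sizeof value) {
        /* memcpy keeps the bytes exactly as the host stores them */
        memcpy(buf + i, &value, sizeof value);
        copies++;
    }
    return copies;
}
```

Unlike the original loop, this one stops before the last partial slot, so changing the offset can never push a write past the end of the buffer.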

Nope
06-02-2004, 03:31 AM
No Rob, a simple memcpy isn't the same. To copy what the original code
does you'd have to perform a htonl prior to the memcpy. I do the same
somewhere in my code to "normalise" some ints as I once had problems
to get htonl to work properly with int instead of unsigned long.

I am not sure why they bother to add the Align thing. The compiler is
supposed to take care of those problems after all.

Well Felix. The 0xyyyyyyyy is the hex way to represent a 32 bit integer.
The FF you see in there represents the case that 8 bits are set to 1. The
'&' performs a bitwise "and" operation, making in this case sure that all
other bits beside those under the FF are set to zero. The buffer is a char
array and a char is supposed to be an 8 bit variable. By masking the other
bits out they make sure that the type conversion from int to char goes right.
By shifting the value by 0,8,16,24 bits they put the value they want to
store in the lowest 8bits of the int so that it can be taken over by the char.

"192.168.0.15" is C0A8000F in hex.

C0A8000F & 000000FF is 0000000F, right shift by 0  is 000000 0F
C0A8000F & 0000FF00 is 00000000, right shift by 8  is 000000 00
C0A8000F & 00FF0000 is 00A80000, right shift by 16 is 000000 A8
C0A8000F & FF000000 is C0000000, right shift by 24 is 000000 C0
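
The four rows above can be reproduced with one small helper. This is only a sketch; the name extract_byte is made up for illustration, and n is assumed to be in the range 0 to 3:

```c
#include <stdint.h>

/* Hypothetical helper: returns byte number `n` of `value`, where n = 0 is
 * the least significant byte and n = 3 the most significant one. */
unsigned char extract_byte(uint32_t value, unsigned n)
{
    /* Build 0x000000ff, 0x0000ff00, 0x00ff0000 or 0xff000000 */
    uint32_t mask = (uint32_t)0xff << (8 * n);

    /* Mask out the other bytes, then shift the wanted one down to bits 0-7 */
    return (unsigned char)((value & mask) >> (8 * n));
}
```

Calling it with 0xC0A8000F and n = 0, 1, 2, 3 walks through the same values as the table, lowest byte first.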

Now integers are stored differently on different systems. So the memcpy
that Rob proposes would reverse the byte order on an i386 based system.
On other systems, like the ones with a G5 processor, the byte order would
be correct. In that case a union (int and 4 byte char array) would be faster
than a memcpy. That's why we have to normalise values on the net to ensure
that all computers get the same numbers to work with. It gets worse on
some older 16 bit systems that represent a 32bit value as a pair of 2 16bit
values.

on a G5: C0A8000F
on i386: 0F00A8C0
old 16bit: 000F C0A8 or 0F00 A8C0 (I've seen both)
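
A sketch of what "normalising" a value for exchange between systems can look like, assuming the agreed wire format is most-significant byte first (the G5 layout above). The names store_be32 and load_be32 are illustrative only; on POSIX systems htonl() and ntohl() serve the same purpose:

```c
#include <stdint.h>

/* Writes `value` most-significant byte first, whatever the host order is. */
void store_be32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}

/* Rebuilds the value from bytes stored most-significant byte first. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because only shifts are used and no raw memory copy of the integer takes place, both functions give the same bytes on an i386, a G5 or anything else.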

Nope
06-02-2004, 03:36 AM
no edit in here...

of course the retaddr thingy is supposed to be already in net byte order in
the first place, so Rob is right of course. :oops:

RobSeace
06-02-2004, 01:35 PM
Heh. I was just about to write code to verify that I was correct,
and then bitch at you, too... ;-)

But, it has nothing to do with it being in network byte order; rather,
it's already in host byte order, and you want to KEEP it in host
byte order, since it needs to be interpreted by the host as a valid
address to jump to... So, regardless of your native byte ordering,
you can ALWAYS safely do a simple memcpy() into the buffer...
In fact, that's always SAFER to do than trying to do the manual
byte-splitting, since to do that, you need to make some implicit
assumptions about the native byte ordering, to determine which
byte goes where in the buffer... Just doing a raw memcpy() hides
those details from you, and everything works magically... ;-)
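
A small sketch that checks this claim on whatever machine it is compiled for. Both function names are made up for the example; the manual split is the one from the first post:

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Returns 1 if the manual byte-splitting produces the same 4 bytes
 * as a plain memcpy() of the value. */
int manual_split_matches_memcpy(uint32_t value)
{
    unsigned char manual[4], copied[4];

    manual[0] = (unsigned char)(value & 0x000000ff);
    manual[1] = (unsigned char)((value & 0x0000ff00) >> 8);
    manual[2] = (unsigned char)((value & 0x00ff0000) >> 16);
    manual[3] = (unsigned char)((value & 0xff000000) >> 24);

    memcpy(copied, &value, sizeof copied);
    return memcmp(manual, copied, sizeof manual) == 0;
}
```

For a value like 0xC0A8000F the two results agree only on a little-endian host, which is the implicit assumption baked into the manual version.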

felix
06-02-2004, 04:58 PM
Hi RobSeace and Nope! :)

Thkz once again for the help with my dummy questions.. :oops:

Nope's explanation was really interesting; I didn't know about those differences between processors like i386, G5, 16-bit processors, etc. Very cool! And it appears that playing with ASM on a G5 is easier, since I believe we can push data onto the stack in normal order (not in reverse order, like on i386).

I already know what a right shift and a logical AND operation are; my doubt is what they are used for here and why those hex numbers.

But based on your example I had a doubt. You posted:

C0A8000F & 00FF0000 is 00A80000

And if we have C0A8000F & 00A80000, what should the result be? 00A80000, 00000000 or 00110000 ??

I still have some doubts.

1 - Where can I find documentation about this "need" to "align" data on the stack? I don't understand the why of it very well yet.. :cry:

2 - Why does he do that AND operation between retaddr (the stack pointer address of his program) and these pre-defined hex numbers, moving the FF (255) along by 2 bits each time and increasing the right shift by 8 bits? Why?? To put each 2 bits of retaddr, in binary form, into each byte of "buffer"??

3 - The memcpy() really is much easier and better, but still, with memcpy (buffer + i + ALIGN, &retaddr, 4); why use size 4 as the last parameter? Because it will convert each 2 bytes (2 hex numbers from retaddr) into binary and save them in the buffer?

4 - Why do it across the whole buffer?? Why do we need to save retaddr in this form throughout the buffer and not only at the original retaddr?

ps.: When I am a big man, I want to be like RobSeace and Nope! :wink:

Regards

RobSeace
06-02-2004, 08:21 PM
Quote:
And if we have C0A8000F & 00A80000, what should the result be?


0x00a80000... Bitwise-and yields a 1 bit in the result only where
there is a 1 bit in both of the arguments of the operation... That's
why the original source was using 0xff values at each byte
position within the 32-bit int: so that the result would only contain
that singular byte's values...

1. I'm not sure where to look for a good explanation of this, but
basically some architectures require addresses to be properly
aligned on a word boundary, and if they're not, bad things are likely
to happen... (Typically a SIGBUS "Bus error"...) Intel x86 doesn't
require any specific alignment... But, some systems do...

2. I'm not sure what you mean by "2 bits"... 0xff represents 1
byte (8 bits)... Each hex digit is a nybble (4 bits), so 2 hex digits
represents a full byte... As for what the code is doing and why,
as I said before, it's extracting each individual byte of the retaddr
and storing it in the buffer... The right-shifts are necessary to
shift down the higher-order bytes to the low-order spot; ie: to put
them in the range 0 - 255... Eg: to turn your 0x00a80000 into
0x000000a8... Then, that low-order value can be directly assigned
into a single byte/char...

3. The size is 4, because that's the size of a 32-bit int... Of course,
a proper method would be to use "sizeof (void *)", since that'll
give the system's native pointer size, which is what "retaddr"
should be... But, I'm assuming it was hard-coded for 32-bit
systems; and, specifically little-endian ones, based on the original
byte-splitting code... (Of course, if this is exploit code you're
looking at, then it pretty much HAS to be hard-coded for a specific
single architecture anyway, since it needs intimate knowledge of
how that system places things in memory, and of course it'll need
to stick in some raw machine code to run...)

4. It's put into a buffer, presumably because it then later passes
that buffer somehow to the vulnerable program to get it to misbehave,
and then to have the address within it overwrite the return address
on the stack, and jump to the code (either also within the buffer,
or located elsewhere, like an envvar)...
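
For point 3, a sketch of letting the compiler supply the size instead of hard-coding 4. The helper name copy_pointer_bytes is hypothetical, and the caller is assumed to provide at least sizeof (void *) bytes of room:

```c
#include <stddef.h>
#include <string.h>

/* Copies the in-memory bytes of the pointer `p` into `out` and returns
 * how many bytes that was: 4 on a typical 32-bit system, 8 on a typical
 * 64-bit one. */
size_t copy_pointer_bytes(const void *p, unsigned char *out)
{
    memcpy(out, &p, sizeof p);   /* sizeof p == sizeof (void *) */
    return sizeof p;
}
```

Copying the bytes back into a pointer variable with another memcpy() gives the original pointer again, since nothing was reordered along the way.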

Nope
06-03-2004, 01:01 AM
Ok, ways to write a number
decimal: 192
binary: 11000000
hex: C0
octal: 300

dec 0 | hex 0 | bin 0000
dec 1 | hex 1 | bin 0001
dec 2 | hex 2 | bin 0010
dec 3 | hex 3 | bin 0011
dec 4 | hex 4 | bin 0100
dec 5 | hex 5 | bin 0101
dec 6 | hex 6 | bin 0110
dec 7 | hex 7 | bin 0111
dec 8 | hex 8 | bin 1000
dec 9 | hex 9 | bin 1001
dec 10 | hex A | bin 1010
dec 11 | hex B | bin 1011
dec 12 | hex C | bin 1100
dec 13 | hex D | bin 1101
dec 14 | hex E | bin 1110
dec 15 | hex F | bin 1111

10101010 & 11110000 = 10100000
10101010 | 11110000 = 11111010
11111111 & 00001010 = 00001010
00001000 & 00001010 = 00001000

It's easier if you write the values below each other:

00001000 &
00001010 =
00001000

As you can see now very clearly, you only get a 1 in the result when there
is a 1 in each of the values you work with.

The >> (shift right) operator always works on bits. So value>>1 shifts one
bit to the right, value>>8 shifts by 8 bits to the right. As a side note,
shifting is the fastest way to multiply or divide an integer by 2.

And
( retaddr & 0x0000ff00 )>>8
does the same as
( retaddr>>8) & 0xFF
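
The equivalence of the two forms can be written out as a pair of one-line functions. The names are invented for this sketch, and an unsigned 32-bit value is assumed:

```c
#include <stdint.h>

/* Form 1: isolate bits 8-15 first, then move them down. */
unsigned mask_then_shift(uint32_t v)
{
    return (v & 0x0000ff00u) >> 8;
}

/* Form 2: move everything down by 8 bits, then keep the lowest byte. */
unsigned shift_then_mask(uint32_t v)
{
    return (v >> 8) & 0xffu;
}
```

The second form has the small advantage that the mask is always the same 0xff, whichever byte is being extracted.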

And if I remember right, intel architecture has an alignment of data in
memory. So for example a char array has to start on a 32bit word
address. Most modern processors need such a thing to work at maximum
efficiency. That's one cause for different sizes of structures even if the
types are all the same bitsize. And on some systems you can align to
speed things up, or not align to keep the data size smaller. But in your
code the align is used more like an offset, so perhaps they intend it to be
the offset to the address field in a header structure. That would also
explain why they try to enforce a specific (net-)byte order. The exploit
would then be the exchange of the real address in the header by a made
up one.

Normally you don't need to worry about byte order in any language, not
even assembler. You will always get the same bit with a bit based
operation and your shift operations will always do it right. You only have to
think about the byte order if you want to exchange data between different
systems. The byte order is the way the processor handles and stores the
data internally.

As I said, I use a comparable code sequence like the one from the first
post to exchange data when the client might run on a different computer
type.

------------------

Ok, I've read the whole original post again. They try to overflow a
receive(?) buffer. So what they try to do is the following:

They write or send more data into a buffer than it can hold. So if the
program has no code to check if the incoming data fits into the buffer,
and in case the data is stored on the stack they can overwrite the return
address of the function. When you jump into a function, to for example
receive data, the address it has to jump back to later is stored on the
stack, that's the so called return address. If you overwrite it you can
change the way the program goes on and instead of jumping back you
jump to another function. As you don't know how long the buffer is
exactly you repeat that attack with increasing sizes of "howlong" and you
place the data for the function you try to call on proper boundaries behind
it. While an array of chars has to start at a 32bit boundary (4byte
alignment) on a 32bit system, compilers normally gather all char arrays
into one big array to keep the memory waste low. So the buffer itself
might not be properly aligned. In that case you might overwrite the return
address with a not working version (1 to 3 bytes off) only causing a crash
of the task. If that happens you change the ALIGN value until you get the
proper placement.

If you want to see whether your system has such an enforced
alignment, and which one, just look at the size of the following struct.

struct aligntesta{
    char a;
    int i;
    short s;
};

1+4+2=7 bytes, but most likely you'll get a size of 12 bytes as every
different data type has to start at a 32bit boundary. In that case this struct
will also need 12 bytes:

struct aligntestb{
    char a[3];
    int i;
    short s[2];
};

although it has a raw byte count of 3+4+4=11
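
A sketch for measuring this on your own compiler rather than guessing. The three function names are made up for the example; sizeof and offsetof are standard C:

```c
#include <stddef.h>

struct aligntesta {
    char  a;
    int   i;
    short s;
};

/* Sum of the member sizes, ignoring any padding. */
size_t raw_size_a(void)
{
    return sizeof(char) + sizeof(int) + sizeof(short);
}

/* What the compiler actually reserves for one struct. */
size_t padded_size_a(void)
{
    return sizeof(struct aligntesta);
}

/* Where the int member really starts inside the struct. */
size_t offset_of_i(void)
{
    return offsetof(struct aligntesta, i);
}
```

On a typical system with a 4-byte int these would be expected to come out as 7, 12 and 4: the padding there comes from the int wanting a 4-byte boundary plus rounding at the end of the struct, while the short by itself usually only needs a 2-byte boundary.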

mlampkin
06-03-2004, 09:13 AM
For a buffer overflow exploit... especially if you are passing raw ( machine ) code and data at the system...

The exploit software has to perform all required data alignment ( as described in the preceding messages ) by "hand"... otherwise the exploit commands would fault out instead of executing...

That would be the rationale behind the alignment var...

The byte by byte copy is also because of the above... i.e. the alignment between the host and target exploit string / commands may be different ( for any number of reasons )... so a straight assignment isn't possible...

The 0x000000FF etc. lines seem sloppy btw... the compiler should be giving warnings about potential precision loss / type mismatching if it sees that code ( and warnings are on )... since there is no explicit casting to the appropriate primitive type...

Also note that if the code were explicitly casting to the correct type... the only thing required are the shift operators ( i.e. = ( char ) ( addr >> 16 ); note the parentheses, since a cast binds tighter than a shift )... since the cast will automatically chop off the MSBs and you don't have to worry about an accidental copy of the carry flag to somewhere inappropriate since all the shifts are on 8 bit boundaries...
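
A sketch of the cast-only variant next to the masked one. The function names are invented here, and unsigned char is used instead of plain char because the signedness of plain char differs between compilers:

```c
#include <stdint.h>

/* Shift first, then let the cast drop everything above the low 8 bits. */
unsigned char byte2_cast_only(uint32_t addr)
{
    return (unsigned char)(addr >> 16);
}

/* The original style: mask, shift, then store. */
unsigned char byte2_masked(uint32_t addr)
{
    return (unsigned char)((addr & 0x00ff0000u) >> 16);
}

/* Pitfall: without the inner parentheses the cast is applied BEFORE the
 * shift, so only the lowest byte survives and shifting it by 16 leaves 0. */
unsigned char byte2_wrong_precedence(uint32_t addr)
{
    return (unsigned char)addr >> 16;
}
```

The first two agree for every input; the third is the version to watch out for when dropping the masks.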

Btw, I am guessing that is a fragment of the old UCD exploit code...


Michael

RobSeace
06-03-2004, 12:50 PM
Quote:
And if I remember right, intel architecture has an alignment of data in
memory. So for example a char array has to start on a 32bit word
address.


No, it doesn't... It's often more efficient if data is aligned properly,
but it's most definitely NOT required on x86 systems... Otherwise,
you wouldn't be able to pack structs, so they take up a minimum
amount of space, and still be able to access them perfectly well...
(Which you most definitely CAN do... I do it all the time...)

But, some systems actually REQUIRE proper alignment to function
at all, and not simply for improved efficiency...
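
For code that has to run on both kinds of systems, one common approach (an assumption of this sketch, not something stated above) is to pull the value out of the packed data with memcpy() instead of dereferencing a misaligned pointer. The name read_u32_at is hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value stored in host byte order at `p`, which may point
 * to any byte offset inside a packed record. */
uint32_t read_u32_at(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);   /* byte-wise copy, no alignment needed */
    return v;
}
```

The copy lands in a properly aligned local variable, so the same source works whether or not the CPU tolerates unaligned loads.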

Nope
06-04-2004, 03:36 AM
It is right that intel processors can access unaligned data...most of the
time. Storing data unaligned will make you lose at least one clock cycle per
read or write operation. That doesn't sound like much, but can easily slow
down the bottleneck ram by 30+%. Considering that the alignment is type
specific and that a normal compiler therefore only makes sure that the
first variable of a type is aligned properly, you lose only very little
memory to the alignment overhead. So in the spirit of best resource use
on a PC type machine I see it as a must. As I said, you have to decide
between speed and memory usage. The SSE extensions for example get
so slowed down if the data isn't in a 16byte alignment that the use is
bogus. Well, if non-aligned data reduces the efficiency, then alignment is
required in some way. The intel ones read 2 times the actual data and
then move it together into an aligned memory cell prior to starting work
with it. By doing that you lose cache memory to gain a few bytes more
regular ram. That this happens on hardware level doesn't change the
fact that the cpu core needs data alignment. The "late alignment" was just
built in to keep compatibility with older code. Other processors whose cores
have the same limitations and miss this additional compatibility mode
really need alignment.

RobSeace
06-04-2004, 02:09 PM
Quote:
So in the spirit of best resource use
on a PC type machine I see it as a must.


Well, sure, I definitely wasn't arguing AGAINST properly aligning
your data for the best efficiency... Of course, it's always a wise
thing to do, when possible... But, the simple fact is that it's not an
absolute REQUIREMENT... So, if you have some NEED to use
unaligned data (as I often do; dealing with raw ctree database
records, packed to smallest size possible, for lowest disk usage),
it'll work just fine, though perhaps a bit less efficiently...


Quote:
Well, if non-aligned data reduces the efficiency, then alignment is
required in some way.


*blink* This is obviously some strange new usage of the word
"required" that I wasn't previously aware of... ;-) I suppose then
that you also think it's a requirement to run "vi", because running
"emacs" reduces the efficiency of your system, as well? ;-) (I
actually fully support that idea, BTW... ;-))

Nope
06-06-2004, 05:55 PM
Quote:
Well, if non-aligned data reduces the efficiency, then alignment is
required in some way.


*blink* This is obviously some strange new usage of the word
"required" that I wasn't previously aware of... ;-)

Och, be fair. The core needs alignment, the memory interface does not. So it aligns the data properly in additional work cycles. 8)

And yes, I prefer vi over emacs. Longer ago (some 20 years) I wouldn't even use vi and condemned all others using it in the terminal room. 5 guys would get the system down to a crawl after all... :lol:

RobSeace
06-06-2004, 06:37 PM
Sure, but who interacts with the core?? ;-) What the programmer
sees of Intel x86 looks nothing like its actual core, anymore...
These days, it's basically a RISC core, with a CISC interpreter on
top... But, really, who cares?? ;-) Ok, I'm sure someone cares;
like those writing optimizing compilers... But, *I* don't care... ;-)
It's still just a bloated CISC chip to me, and one that'll work just fine
if I throw unaligned data/code at it... How it accomplishes this
underneath doesn't really concern me too much... (Unless I'm
trying to eke out every last bit of efficiency I possibly can, for some
bit of code, anyway... But, more often than not, lots of other things
tend to take precedence over pure maximum efficiency...)

But, yes, I was being a smart-ass, and you may be TECHNICALLY
correct about there being SOME hidden requirement (I actually am
not sure, since I haven't kept up on the modern x86 core design)
somewhere... ;-) But, the requirement is not one visible from the
outside, which is really all that matters, IMHO...

Oh, and vi is the only true editor, of course... ;-) Emacs is the
work of the devil... I mean, what sick, twisted weirdo would put a
damned LISP interpreter into his editor?!? ;-)

felix
06-07-2004, 07:46 PM
Hi,

Thkz a lot again, I can really understand it now. ;-)

Just a curiosity: I saw some of you writing about "little endian & big endian", so I searched to learn what that is, and I saw that the PowerPC can work as either little or big endian... really cool! :)

Regards.

felix