Well I’ve been having fun with a little piece of code in the office. It’s apparently written in C++. Well it seems to be written in C with just a small smattering of C++ to make it useless. It’s a true candidate for the WTF.
Another person stung by the do what I mean mentality
Ah, yes. I saw that Albert had posted the listings for how young Dave Connolly got on in the Winter Olympics skeleton (20th! Excellent for him). His previous blog entry had me cringing.
It was another case of expecting a language to perform implicit type conversion when assigning it. I was learning C when I found out about this. <float> = <int> / <int>; does not yield the correct answer if you were expecting a floating point answer.
It’s partially to do with the way the language is implemented and partially to do with the KISS principle. What you are doing is essentially performing two separate operations: a division, followed by an assignment. The division happens with the precision of the operand types, so under the rules, if neither operand is a floating point value then the operation will not use floating point arithmetic. I mean, if every integer had to be converted into a floating point value every time a mathematical operation was encountered, just in case the answer needed to be floating point, the system would end up very, very slow and would likely cause problems of other kinds.
Like attempting to make an equality test on floating point numbers. Let’s just say you should never use a direct equality comparison on floating point values, as even if you think the value is 0.0 it may in reality be 0.0000000000001. With the extra precision of floating point numbers comes less accuracy, and to steal someone else’s description, if ints are bricks then floats are more like silly putty.
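For what it’s worth, here is a minimal sketch of the usual workaround: compare against a tolerance rather than testing for equality. The name nearly_equal and the idea of passing the tolerance in are my own choices for illustration; what counts as a sensible tolerance depends entirely on the values you are working with.

```c
#include <stdbool.h>

/* Hypothetical helper: treats a and b as equal when they differ by no
   more than the caller-supplied tolerance. */
bool nearly_equal(double a, double b, double tolerance)
{
    double diff = a - b;

    if (diff < 0.0)
        diff = -diff;            /* absolute difference, without needing libm */
    return diff <= tolerance;
}
```

So nearly_equal(0.1 + 0.2, 0.3, 1e-9) holds, whereas a direct 0.1 + 0.2 == 0.3 is typically false with IEEE doubles.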
If you want to use floating point math then at least one of the elements of the operation must be a floating point value. For example:
int a = 1; int b = 2; float c = 0; c = a / b;
Means that c is 0.0, not 0.5, because everything on the right hand side of the operation is an integer.
c = (float)a / b;
Would cause c to be 0.5, which is what you wanted. As an aside, the initial float c = 0 has an implied type conversion from int to float. It should really have been c = 0.0f (a bare 0.0 is a double, so that would be converted too).
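The two cases above can be put side by side as a pair of tiny functions. This is only a sketch; the function names are mine, made up for the illustration.

```c
/* Both operands are int, so the division is an integer division and the
   (already truncated) result is only converted to float on return. */
float divide_truncating(int a, int b)
{
    return a / b;
}

/* Casting one operand to float first makes the division itself a
   floating point operation. */
float divide_floating(int a, int b)
{
    return (float)a / b;
}
```

With a = 1 and b = 2, the first gives 0.0 and the second gives 0.5.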
Of course, we can’t forget all the joyous C constant suffixes we had to remember: <value>U for unsigned, L for long, UL for unsigned long, LL for long long (64 bit, anyone?). When you’re using C in a predominantly 16-bit programming environment (like the Palm platform), you need to keep reminding yourself, because an int on the Palm is only 16 bits in size.
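A small sketch of why the suffix matters when int is only 16 bits. The function kib_to_bytes is a made-up example, not anything from the Palm SDK.

```c
/* Hypothetical helper: converts a size in KiB to bytes.
   With a plain 1024 the multiplication would be done in (unsigned) int,
   which wraps at 65536 when int is 16 bits wide. The UL suffix makes the
   constant an unsigned long, so the whole multiplication is done in
   unsigned long instead. */
unsigned long kib_to_bytes(unsigned int kib)
{
    return kib * 1024UL;
}
```

On a desktop compiler with 32-bit int you would get away without the suffix for small values, which is exactly why it is so easy to forget.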
oh the joy of it all!
Detecting debuggers
Let’s assume that we’re all using an NT based OS (2k, XP, 2003). It’s a simple matter to detect that you’re being run under a debugger: simply issue an IsDebuggerPresent call. Heck, you could even put it into a thread that intermittently polls and reacts (in)appropriately.
Of course, I’m not one for polling. Never have been. I just consider it ‘lower class’ to do something like that. This is of course why I like the next trick for detecting the arrival of a debugger.
What happens when a debugger attaches is that an external thread is injected into the process and executes the ‘DbgUiRemoteBreakin’ function, which at some point calls ‘DbgBreakPoint’, which of all things contains an int3 and a return statement (this is a kick-the-debugger operation).
Detecting the debugger consists of rewriting the function implementation.
What we want to do is invoke our own function (message, in the code below) when a debugger attaches. The following code does just that: when a debugger attaches to the process, it simply spits out a message and terminates.
    #include <windows.h>
    #include <stdio.h>

    HANDLE wh;

    /* Runs on the debugger's break-in thread in place of DbgUiRemoteBreakin. */
    void message(void)
    {
        SetEvent(wh);
        ExitThread(0);
    }

    int main(int argc, char **argv)
    {
        unsigned char *abki;
        HMODULE dlh;

        wh = CreateEvent(NULL, FALSE, FALSE, NULL);
        dlh = GetModuleHandle("ntdll.dll");
        abki = (unsigned char *)GetProcAddress(dlh, "DbgUiRemoteBreakin");
        if (abki != NULL) {
            DWORD olp;

            if (VirtualProtect(abki, 20, PAGE_EXECUTE_READWRITE, &olp)) {
                /* 32-bit x86 only: overwrite the start of the function
                   with "mov eax, message; call eax". */
                *abki++ = 0xb8;                  // mov eax, imm32
                *(DWORD *)abki = (DWORD)message;
                abki += 4;
                *(WORD *)abki = 0xd0ff;          // call eax (bytes ff d0)
            } else {
                printf("Page Protection Failed\n");
                return (1);
            }
        }
        WaitForSingleObject(wh, INFINITE);
        printf("Debugger\n");
        return (0);
    }
This is the beginning of something bigger.
Assembly language, functions and misoriented parameters
I had a minor complaint some time back about the lack of consistency amongst the various Windows APIs: they seemed to be written by people who chose one mechanism one day and another the next. The reality is that in a company as large as Microsoft, the different groups were each consistent amongst themselves; the problem was that they were not consistent with each other.
This brings me to the rant du jour. When one finds oneself reading/writing assembly language, the code is platform consistent. For example, in Intel’s x86 syntax the format is: operator destination, source. So mov eax, 0xffffffff means put the value 0xffffffff into the register eax. On SPARC the source and destination are reversed, so mov 0x11, %l0 means put the value 0x11 into the register l0. It’s quite easy to tell one from the other because you are aware of what platform you are on. Sun’s x86 assembler, for reasons best known to themselves, uses the reversed (AT&T-style) order, as in mov $0xffffffff, %eax, probably to make it more like the Solaris ‘native’ (SPARC) format and easier for their developers to follow.
All very fine and well, Sun are entitled to confuse native x86 developers all they want, and besides which, because of the consistency, it is a really easy switch.
My mini bugbear is, of course, the bcopy function. It performs a block copy from a source to a destination. It’s part of libc, it’s simple, it’s easy to use; the only problem is that its arguments (src, dest) are in the reverse order from all the string routines (dest, src) and from memmove (dest, src). If you look up the definition of bcopy it generally says that it is implemented in terms of memmove, and if you look at most implementations you find that bcopy just swaps the in and out parameters and then invokes memmove (I believe the exception is SPARC, where it is the other way round). This is why bcopy is officially off my list of functions to use. It’s simply the one lone voice of dissent amongst all the consistency that libc affords.
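To make the swap concrete, here is a sketch of what such a wrapper looks like. my_bcopy is a hypothetical stand-in of my own, not the source of any particular libc, and both_orders_agree exists only to show the two call conventions next to each other.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for bcopy: takes (src, dest, n) and forwards to
   memmove, which takes (dest, src, n). */
void my_bcopy(const void *src, void *dest, size_t n)
{
    memmove(dest, src, n);
}

/* Copies the same string using both conventions and returns 1 if the two
   destination buffers end up identical and correct. */
int both_orders_agree(void)
{
    const char text[] = "block copy";
    char via_bcopy[sizeof text];
    char via_memmove[sizeof text];

    my_bcopy(text, via_bcopy, sizeof text);     /* source first */
    memmove(via_memmove, text, sizeof text);    /* destination first */

    return memcmp(via_bcopy, via_memmove, sizeof text) == 0
        && strcmp(via_bcopy, "block copy") == 0;
}
```

Get the first two arguments the wrong way round and, since both are just pointers, the compiler may well let it through (at most a const warning), which is the whole problem.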
Who made bcopy then?
Who’s living in what apartment?
It’s the COM apartment models. They’re related to the threads that make use of COM objects. What happens is that when you initialize COM for a specific thread you declare that it’s either Apartment Threaded (AKA Single Threaded Apartment) or Multi Threaded.
When you use the Apartment threading model, it means that the COM object is isolated within the thread that created it. The most important piece of information about this model is that you should never use that object directly from another thread; unless the interface pointer has been marshaled across, it causes brokenness.
When you use the multi-threaded model, what you’re pretty much saying is: I’m probably going to use this COM object from several threads. The way it works is that, in the multi-threaded model, the context is shared by those threads within the process.
The model you support also puts extra complications on you, the creator of the object. COM objects with a declared MT support must use some synchronization to protect shared information within the object, otherwise you’ll suffer from data corruption due to threads walking over the data. You don’t have any of these considerations in a Single threaded model – you’re guaranteed safe and sane interactions.
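Additionally, when you’re in COM land, remember never to block with a plain WaitFor* call; use the MsgWaitFor* equivalents (such as MsgWaitForMultipleObjects) and keep dispatching messages while you wait. This also applies to using DDE. This is because the Apartment model uses Windows messages under the hood.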
Least Significant 1 Bit
This can be useful for extracting the lowest numbered element of a bit set. Given a 2’s complement binary integer value x, (x&-x) is the least significant 1 bit. The reason this works is that it is equivalent to (x & ((~x) + 1)); any trailing zero bits in x become ones in ~x, adding 1 to that carries into the following bit, and AND with x yields only the flipped bit… the original position of the least significant 1 bit.
Alternatively, since (x&(x-1)) is actually x stripped of its least significant 1 bit, the least significant 1 bit is also (x^(x&(x-1))).
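Both forms can be sketched in C as follows. I have used unsigned int here, which is my choice rather than anything the description requires: unsigned negation is defined modulo 2^N in C, so it gives the same bit pattern as 2’s complement negation without relying on signed overflow behaviour. The function names are made up.

```c
/* x & -x, written as 0u - x to keep the negation in unsigned arithmetic. */
unsigned int lowest_set_bit(unsigned int x)
{
    return x & (0u - x);
}

/* x & (x - 1) strips the least significant 1 bit; XOR with x recovers it. */
unsigned int lowest_set_bit_alt(unsigned int x)
{
    return x ^ (x & (x - 1u));
}
```

For example, 12 is 1100 in binary, so both functions return 4; for an input of 0 both return 0.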
Integer Selection
A branchless, lookup-free, alternative to code like if (a<b) x=c; else x=d; is ((((a-b) >> (WORDBITS-1)) & (c^d)) ^ d). This code assumes that the shift is signed, which, of course, C does not promise.
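Spelled out as a function, the expression looks like this. It is a sketch under two assumptions: that right-shifting a negative int is an arithmetic shift on your compiler (implementation-defined in C), and that a - b does not overflow. WORDBITS is defined here from sizeof(int), and select_if_less is a name I made up.

```c
#include <limits.h>

#define WORDBITS ((int)(sizeof(int) * CHAR_BIT))

/* Branchless equivalent of: if (a < b) x = c; else x = d; */
int select_if_less(int a, int b, int c, int d)
{
    /* Assumes an arithmetic shift: all ones when a - b is negative
       (that is, a < b), zero otherwise. */
    int mask = (a - b) >> (WORDBITS - 1);

    /* mask all ones: (c ^ d) ^ d == c.  mask zero: 0 ^ d == d. */
    return (mask & (c ^ d)) ^ d;
}
```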
Integer Minimum or Maximum
Given 2’s complement integer values x and y, the minimum can be computed without any branches as x+(((y-x)>>(WORDBITS-1))&(y-x)). Logically, this works because the shift by (WORDBITS-1) replicates the sign bit to create a mask — be aware, however, that the C language does not require that shifts are signed even if their operands are signed, so there is a potential portability problem. Additionally, one might think that a shift by any number greater than or equal to WORDBITS would have the same effect, but many instruction sets have shifts that behave strangely when such shift distances are specified.
Of course, maximum can be computed using the same trick: x-(((x-y)>>(WORDBITS-1))&(x-y)).
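The same two formulas as C functions, again only as a sketch: it assumes an arithmetic right shift for negative values and that the subtraction does not overflow. WORDBITS is derived from sizeof(int), and the function names are mine.

```c
#include <limits.h>

#define WORDBITS ((int)(sizeof(int) * CHAR_BIT))

/* x + (((y - x) >> (WORDBITS - 1)) & (y - x)) */
int branchless_min(int x, int y)
{
    int d = y - x;                      /* negative exactly when y < x */

    return x + ((d >> (WORDBITS - 1)) & d);
}

/* x - (((x - y) >> (WORDBITS - 1)) & (x - y)) */
int branchless_max(int x, int y)
{
    int d = x - y;                      /* negative exactly when x < y */

    return x - ((d >> (WORDBITS - 1)) & d);
}
```

Taking branchless_min(7, 3) as a worked case: d is -4, the mask is all ones, so the result is 7 + (-4) = 3.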
Dual-Linked List with One Pointer Field
Normally, a dual-linked circular list would contain both previous and next pointer fields and the current position in the list would be identified by a single pointer. By using two current pointers, one to the node in question and the other to the one just before/after it, it becomes possible to store only a single pointer value in each node. The value stored in each node is the XOR of the next and previous pointers that normally would have been stored in each node. Decoding is straightforward: XOR the stored value with the address of the neighbour you arrived from, and what is left is the address of the neighbour on the other side.
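Unfortunately, using this trick in C is awkward because the XOR operation is not defined for pointers; the pointers have to be converted to an integer type (uintptr_t, where it is available) and back again.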
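Here is a minimal sketch of how it can be done anyway, by going through uintptr_t. It assumes that converting a pointer to uintptr_t and back gives you the original pointer, which holds on the usual platforms but is implementation-defined. All of the names (struct node, xor_step, xor_link_ring, xor_walk, xor_ring_demo) are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct node {
    int value;
    uintptr_t link;     /* (uintptr_t)prev ^ (uintptr_t)next */
};

/* Given the neighbour we came from, return the neighbour on the other side. */
static struct node *xor_step(const struct node *from, const struct node *cur)
{
    return (struct node *)(cur->link ^ (uintptr_t)from);
}

/* Link an array of n nodes into a circular list. */
void xor_link_ring(struct node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct node *prev = &nodes[(i + n - 1) % n];
        struct node *next = &nodes[(i + 1) % n];

        nodes[i].link = (uintptr_t)prev ^ (uintptr_t)next;
    }
}

/* Move `steps` nodes away from `from`, starting at `cur`; this is the
   "two current pointers" idea. Returns the value at the node reached. */
int xor_walk(struct node *from, struct node *cur, size_t steps)
{
    while (steps-- > 0) {
        struct node *other = xor_step(from, cur);

        from = cur;
        cur = other;
    }
    return cur->value;
}

/* Usage sketch: a 4-node ring holding 0..3, walked from node 0. */
int xor_ring_demo(size_t steps, int backwards)
{
    struct node ring[4] = { {0, 0}, {1, 0}, {2, 0}, {3, 0} };

    xor_link_ring(ring, 4);
    /* Coming "from" node 3 walks forwards; coming "from" node 1 walks
       backwards. The direction is decided purely by the second pointer. */
    return xor_walk(backwards ? &ring[1] : &ring[3], &ring[0], steps);
}
```

Note that the same xor_walk traverses the list in either direction; which way you go depends only on which neighbour you claim to have come from.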