A simple C dtrace aggregation consumer

This is a little bit of gluing from other sources, but it’s a simple piece of code that consumes the contents of an aggregation from dtrace using the `libdtrace` library. I’ve tested it on Mac OS X 10.9 and it works, with the usual caveats: it won’t work all the time, and you need privileged access to get it to run:

#include <assert.h>
#include <dtrace.h>
#include <stdio.h>

// This is the program.
static const char *progstr =
"syscall::recvfrom:return \
{ @agg[execname,pid] = sum(arg0); }";

// This is the aggregation walk function
static int
aggfun(const dtrace_aggdata_t *data, void *gndn __attribute__((unused)))
{
    dtrace_aggdesc_t *aggdesc = data->dtada_desc;
    dtrace_recdesc_t *name_rec, *pid_rec, *sum_rec;
    char *name;
    int32_t *ppid;
    int64_t *psum;
    static const dtrace_aggdata_t *count;

    // Remember the first element walked and skip it
    if (count == NULL) {
        count = data;
        return (DTRACE_AGGWALK_NEXT);
    }

    // Our aggregation has 4 records (id, execname, pid, sum)
    assert(aggdesc->dtagd_nrecs == 4);

    name_rec = &aggdesc->dtagd_rec[1];
    pid_rec = &aggdesc->dtagd_rec[2];
    sum_rec = &aggdesc->dtagd_rec[3];

    // Each record lives at its own offset inside the data buffer
    name = data->dtada_data + name_rec->dtrd_offset;
    assert(pid_rec->dtrd_size == sizeof(pid_t));
    ppid = (int32_t *)(data->dtada_data + pid_rec->dtrd_offset);
    assert(sum_rec->dtrd_size == sizeof(int64_t));
    psum = (int64_t *)(data->dtada_data + sum_rec->dtrd_offset);

    printf("%1$-30s %2$-20d %3$-20ld\n", name, *ppid, (long)*psum);

    return (DTRACE_AGGWALK_NEXT);
}

// set the option, otherwise print an error & return -1
static int
set_opt(dtrace_hdl_t *dtp, const char *opt, const char *value)
{
    if (-1 == dtrace_setopt(dtp, opt, value)) {
        fprintf(stderr, "Failed to set '%1$s' to '%2$s'.\n", opt, value);
        return (-1);
    }
    return (0);
}

// set all the options, otherwise return an error
// (each set_opt returns 0 or -1, so OR-ing them yields -1 if any failed)
static int
set_opts(dtrace_hdl_t *dtp)
{
    return (set_opt(dtp, "strsize", "4096")
        | set_opt(dtp, "bufsize", "1m")
        | set_opt(dtp, "aggsize", "1m")
        | set_opt(dtp, "aggrate", "2msec")
        | set_opt(dtp, "arch", "x86_64"));
}

int
main(int argc __attribute__((unused)), char **argv __attribute__((unused)))
{
    int err;
    dtrace_proginfo_t info;
    dtrace_hdl_t *dtp;
    dtrace_prog_t *prog;

    dtp = dtrace_open(DTRACE_VERSION, DTRACE_O_LP64, &err);

    if (dtp == 0) {
        fprintf(stderr, "Failed to dtrace_open.\n");
        return (1);
    }
    if (-1 == set_opts(dtp))
        return (1);

    prog = dtrace_program_strcompile(dtp, progstr, DTRACE_PROBESPEC_NAME, 0, 0, NULL);
    if (prog == 0) {
        fprintf(stderr, "dtrace_program_strcompile failed\n");
        return (1);
    }
    if (-1 == dtrace_program_exec(dtp, prog, &info)) {
        fprintf(stderr, "Failed to dtrace exec.\n");
        return (1);
    }
    if (-1 == dtrace_go(dtp)) {
        fprintf(stderr, "Failed to dtrace_go.\n");
        return (1);
    }

    while (1) {
        int status;

        // Wait for the next wakeup rather than spinning
        dtrace_sleep(dtp);
        status = dtrace_status(dtp);
        if (status == DTRACE_STATUS_OKAY) {
            // Snapshot the aggregation buffers, then walk them
            dtrace_aggregate_snap(dtp);
            dtrace_aggregate_walk(dtp, aggfun, 0);
        } else if (status != DTRACE_STATUS_NONE) {
            break;
        }
    }

    dtrace_close(dtp);
    return (0);
}

What a pointless ‘auto updater’

Another morning, another Adobe Flash update.

A big-ass dialog tells me that I have an update and links me to their website, where I have to uncheck the dumb-ass add-on options for (depending on the week) McAfee and the Google toolbar.

Several months ago Adobe were pushing/advertising their auto-updating technology. They were pretty much saying ‘completely automated updates’. To me it seems more like manual updating. You’re wasting my time with the prompts, the dialogs, and the small download which subsequently downloads the update. Each stub is tied to the version of Flash you’re downloading, so what’s the damned point in having a stub downloader for each version?

If you want to see a proper auto-updater you should look to Google. It downloads updates silently in the background, applies them (as much as possible) in the background, and if you need to restart your browser it mentions this in a prompt. The downloads are tiny thanks to their differential compression algorithm, which keeps the updates small enough to download in the background without interfering with your normal use of the system. At the same time they’re not pushing a bunch of extra third-party software on you.

My Heart bleeds, truly…

The world has ended! OpenSSL has this terrible bug which allows an attacker to receive a pot-luck of 64k portions of the memory address space of the SSL server. Things that are up for grabs include usernames, passwords and SSL private keys; i.e. a veritable grab-bag of things that can be obtained from the server.
You should really change your password.

Actually, don’t bother; it’s not like it actually matters in the long run. You’re probably going to change your password from bunny2 to bunny3 anyway and it’s not like that was the most likely point of escape of your password in the first place. It’s probably that toolbar that installed when you installed that video player that was needed to watch that movie; you know the one? The one with the kittens? The one that you never got to see because you got distracted.

It’s been a really bad few months for security, and while people have been pointing fingers in various directions, deploying a Nelson-style ‘Ha Ha’, the only lesson that has been found in this case is that the bible has a great thing to say about this:

Do not gloat when your enemy falls; when they stumble, do not let your heart rejoice, or the Lord will see and disapprove and turn his wrath away from them.

… and then you’re in trouble.

As for me, I’m going to wait a few weeks and then change the passwords that are in any way connected to money. It would be nice if I could get some form of toolbar that would look up sites in a database as I visit them, reveal whether they were subject to the bug, and prompt me to change the password once they’ve been declared clear. Someone get on that; it sounds like a fun project.


Code to change assert behaviour when running under a debugger

On Linux, you can create an __assert_fail routine which will be triggered when an assert is encountered in code compiled with asserts enabled. Using some debugger detection, you can have the code behave differently in that situation. The code to accomplish detection at load-time looks like:

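#include <assert.h>
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>

static int ptrace_failed;

/* Returns non-zero if a child could not ptrace-attach to us,
 * which is what happens when a debugger already holds that slot. */
static int
detect_ptrace(void)
{
    int status, waitrc;

    pid_t child = fork();
    if (child == -1) return -1;
    if (child == 0) {
        /* child: try to attach to the parent */
        if (ptrace(PT_ATTACH, getppid(), 0, 0))
            exit(1);
        do {
            waitrc = waitpid(getppid(), &status, 0);
        } while (waitrc == -1 && errno == EINTR);
        ptrace(PT_DETACH, getppid(), (caddr_t)1, SIGCONT);
        exit(0);
    }
    /* parent: the child's exit status carries the result */
    do {
        waitrc = waitpid(child, &status, 0);
    } while (waitrc == -1 && errno == EINTR);
    return WEXITSTATUS(status);
}

/* Runs before main() when the program is loaded */
static void __attribute__((constructor))
detect_debugger(void)
{
    ptrace_failed = detect_ptrace();
}

void __assert_fail(const char * assertion, const char * file, unsigned int line, const char * function) {
    fprintf(stderr, "Assert: %s failed at %s:%u in function %s\n", assertion, file, line, function);
    if (ptrace_failed)
        raise(SIGTRAP); /* assumed action: trap into the attached debugger */
    abort();
}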

This code, because it uses an __attribute__((constructor)), will detect a program being started under a debugger. It will not notice a debugger that attaches later on.

You could move the detect_ptrace() call into the __assert_fail function, which provides full run-time detection of the debugger and behaviour alteration. On the assumption that not a lot of asserts get triggered, the cost of the extra fork is negligible, and you have an effective behaviour alteration of your code when run under a debugger. In this case the __assert_fail looks like:

void __assert_fail(const char * assertion, const char * file, unsigned int line, const char * function) {
    fprintf(stderr, "Assert: %s failed at %s:%u in function %s\n", assertion, file, line, function);
    if (detect_ptrace())
        raise(SIGTRAP); /* assumed action: trap into the attached debugger */
    abort();
}
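
As an aside, Linux offers a cheaper check that avoids the fork: the kernel reports the PID of any tracer on the `TracerPid:` line of /proc/self/status. The following is a minimal sketch of that alternative, not the ptrace technique above. The function names `tracer_pid_from_status` and `debugger_attached` are made up for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Parse the "TracerPid:" line out of text in /proc/<pid>/status format.
// Returns the tracer's pid, 0 if nothing is tracing, or -1 if the field
// is missing or unreadable.
long tracer_pid_from_status(std::istream &status)
{
    const std::string key = "TracerPid:";
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, key.size(), key) == 0) {
            std::istringstream value(line.substr(key.size()));
            long pid = -1;
            value >> pid;
            return value ? pid : -1;
        }
    }
    return -1;
}

// True when some process (normally a debugger) is tracing us right now.
// On systems without /proc the open fails and this simply reports false.
bool debugger_attached()
{
    std::ifstream status("/proc/self/status");
    return tracer_pid_from_status(status) > 0;
}
```

Because the parser takes any stream, it can be exercised with canned text; only `debugger_attached()` touches the real file.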

ascii progress bars

Sometimes you want to display a progress routine in C/C++, without resorting to termcap/curses. The simplest mechanism you can use is:

for (i = 0; i < 10000; i++){
    printf("\rIn Progress: %d", i/100);
    fflush(stdout); /* no newline is printed, so flush explicitly */
}

This yields flickering on the display as the cursor moves back and forth over the entire line reprinting it. This gets worse with larger amounts of display data. If your progress routine is simple like this, then you can use another mechanism: the backspace \b, rather than the carriage return \r. You take advantage of printf returning the number of characters it printed, and then only emit that number of backspace characters.

So for example:

printf("In Progress: ");
int lastprinted = 0;
for (i = 0; i < 10000; i++){
    /* back up over the number printed last time round */
    while (lastprinted-- > 0) printf("\b");
    lastprinted = printf("%d", i/100);
    fflush(stdout);
}

This way you get to only print a few characters every refresh, which minimizes the flicker.
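
The same idea can be wrapped in a small C++ helper. This is only a sketch: the names `redraw_percent` and `run_with_progress` are invented here, and writing to a `std::ostream` rather than straight to stdout is my choice so that the exact bytes emitted can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>   // std::ostringstream, handy for capturing the output
#include <string>

// Erase the previously written number with backspaces, then write the new
// percentage. `last_width` is the width returned by the previous call
// (0 on the first call). Returns the width just written.
std::size_t redraw_percent(std::ostream &out, int done, int total, std::size_t last_width)
{
    assert(total > 0 && done >= 0 && done <= total);
    const std::string text = std::to_string(static_cast<long long>(done) * 100 / total);
    out << std::string(last_width, '\b') << text;
    out.flush();
    return text.size();
}

// Example driver: print the label once, then redraw only the number.
void run_with_progress(std::ostream &out, int total)
{
    out << "In Progress: ";
    std::size_t width = 0;
    for (int i = 0; i <= total; i++)
        width = redraw_percent(out, i, total, width);
    out << '\n';
}
```

Calling `run_with_progress(std::cout, 10000)` gives the same display as the loop above. Note that backspace only moves the cursor; this works because a rising percentage never gets narrower, so the new number always covers the old one.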

Launching a JVM from C++ on Mac OS X

I’d been playing around with starting up a JVM from C++ code on Windows. I was futzing around between MSVC and cygwin/mingw and it all worked well. Then I decided to do the same under Mac OS X.

Firstly, if you’re using the OS X VM, when you try to compile some simple code under clang, you’re going to see this awesome warning:

warning: 'JNI_CreateJavaVM' is deprecated [-Wdeprecated-declarations]

Apple really, really don’t want you using the JavaVM framework; in general, if you want to use their framework, you have to ignore the warnings and soldier on.

Secondly, if you want a GUI, you can’t instantiate the VM on the main thread. If you do that then your process will deadlock, and you won’t be able to see any of the GUI.

With that in mind, we hive off the creation of the VM and main class to a separate thread. With some crufty code, we make a struct of the VM initialization arguments (this code is not clean: it contains strdups and news, and nothing is ever freed):

struct start_args {
    JavaVMInitArgs vm_args;
    const char *launch_class;

    start_args(const char **args, const char *classname) {
        vm_args.version = JNI_VERSION_1_6;
        vm_args.ignoreUnrecognized = JNI_TRUE;

        /* count the NULL-terminated argument list */
        int arg_count = 0;
        const char **atarg = args;
        while (*atarg++) arg_count++;
        JavaVMOption *options = new JavaVMOption[arg_count];
        vm_args.nOptions = arg_count;
        vm_args.options = options;

        /* copy each argument into its own option slot */
        while (*args) {
            options->optionString = strdup(*args);
            options++;
            args++;
        }
        launch_class = strdup(classname);
    }
};

Next we have the thread function that launches the VM. This is a standard posix thread routine, so there’s no magic there.

void *
start_java(void *start_args)
{
    struct start_args *args = (struct start_args *)start_args;
    int res;
    JavaVM *jvm;
    JNIEnv *env;

    res = JNI_CreateJavaVM(&jvm, (void**)&env, &args->vm_args);
    if (res < 0) exit(1);
    /* load the launch class */
    jclass main_class;
    jmethodID main_method_id;
    main_class = env->FindClass(args->launch_class);
    if (main_class == NULL) {
        jvm->DestroyJavaVM();
        exit(1);
    }
    /* get main method */
    main_method_id = env->GetStaticMethodID(main_class, "main", "([Ljava/lang/String;)V");
    if (main_method_id == NULL) {
        jvm->DestroyJavaVM();
        exit(1);
    }

    /* make the initial argument */
    jobject empty_args = env->NewObjectArray(0, env->FindClass("java/lang/String"), NULL);
    /* call the method */
    env->CallStaticVoidMethod(main_class, main_method_id, empty_args);
    /* Don't forget to destroy the JVM at the end */
    jvm->DestroyJavaVM();
    return (0);
}

What this code does is create the Java VM (the short piece at the start). Then it finds and invokes the public static void main(String args[]) of the class that’s passed in. At the end, it destroys that Java VM. You’re supposed to do that, for memory allocation’s sake.

Next we have the main routine, which creates the thread and invokes the run loop:

int main(int argc, char **argv)
{
    const char *vm_arglist[] = { "-Djava.class.path=.", 0 };
    struct start_args args(vm_arglist, "launch");
    pthread_t thr;
    pthread_create(&thr, NULL, start_java, &args);
    CFRunLoopRun();
}

The trick is the CFRunLoopRun() at the end. This starts the CoreFoundation main application run-loop, which allows the application to pump messages for all the other run-loops that are created by the Java UI framework.

The next thing is an example java file that creates a window.

import java.awt.BorderLayout;
import javax.swing.*;

public class launch extends JFrame {
    JLabel emptyLabel;

    public launch() {
        emptyLabel = new JLabel("Hello World");

        setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        getContentPane().add(emptyLabel, BorderLayout.CENTER);
        pack();
    }

    public static void main(String args[]) {
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                launch l = new launch();
                l.setVisible(true);
            }
        });
    }
}
Because it’s got an EXIT_ON_CLOSE option, when you close the window the application will terminate.

Gluing this into a makefile involves the following; it’s got conditional sections for building with either the 1.6 or 1.7 VM, and there are a couple of changes that are needed in the cpp code to get this to work:


JHOME:=$(shell /usr/libexec/java_home -v 1.$(JAVA_VERSION))
SYSTEM:=$(shell uname -s)

ifeq ($(JAVA_VERSION),7)
LDFLAGS += -L$(VM_DIR)/server -Wl,-rpath,$(VM_DIR) -Wl,-rpath,$(VM_DIR)/server
CXXFLAGS += -I$(JHOME)/include -I$(JHOME)/include/$(SYSTEM)
LDLIBS += -ljvm
else
CXXFLAGS += -framework JavaVM
endif

CXXFLAGS += -framework CoreFoundation

all: launch launch.class

launch.class: launch.java
	/usr/libexec/java_home -v 1.$(JAVA_VERSION) --exec javac launch.java

clean:
	rm -rf launch *.class *.dSYM

For the C++, you need to change the include depending on whether you’re building for the 1.6 or 1.7 VM. Use one of the following:

#include <JavaVM/jni.h>   // 1.6: Apple's JavaVM framework
#include <jni.h>          // 1.7: the JDK's own header


You can’t really use the code that was compiled against the 1.6 VM on the 1.7 environment, as it uses the JavaVM framework in 1.6 vs. libjvm.dylib on the 1.7 VM, but I would consider this a given across all JVM variants.

This source can be cloned from the Repository on BitBucket.

RAII, or why C++ doesn’t have a finally clause

One of the most common idioms I see in a delphi program looks like the following:

foo := TObject.Create;
try
    // Do something with foo
finally
    foo.Free;
end;

It’s primarily because you always create objects on the heap, and everything involving an object is, essentially, a pointer. This makes for a bit of a memory management issue. You have to remember to destroy objects after you’ve created them, and because something can go wrong in between, that destruction needs to take place in a finally block. Having it there keeps you safe from exceptions: if an exception is triggered, it always passes through the finally block on its way back up the stack. This gives you the ability to clean up temporary objects as needed.

C++ uses the RAII idiom, which means that objects that are defined at a certain scope are always going to be destroyed once that scope is exited. What this means is that if you define an object X in a function Y, once Y is returned from then X will be destructed/destroyed. As an example:

std::stringstream streamer;
// do something with streamer

There’s no awkward streamer.create call, and once you return from the function streamer is appropriately tidied up.

But wait, you say, they are not the same, what you are doing in Delphi is creating an object on the heap, while in C++ you are creating it on the stack, so of course during the process of unwinding said stack, you will destroy the object. The more equivalent code in C++ would have been:

std::stringstream *streamer = new std::stringstream();
// Do something with streamer
delete streamer;

Hah you say, no try finally means that if an exception is triggered in the ‘do something’ piece of code, you leak a streamer object on the heap.

To which I respond, silly rabbit, that’s why you didn’t create a pointer in the first place with the first piece of code. If you want to perform something like this, then you should use a smart pointer, which takes care of the destruction of the object once the smart pointer exits scope, like so:

std::unique_ptr<std::stringstream> streamer(new std::stringstream);
// Do something with streamer

But really, if you were just going to create an entity for the duration of a function, it’s far easier to just create it in-place without such complications.
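
To make the comparison with try/finally concrete, here is a minimal sketch of scope-based cleanup surviving an exception. The `Tracked` class and the function names are hypothetical, invented purely to log what happens and in which order.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resource that logs its own lifetime.
class Tracked {
public:
    Tracked(std::vector<std::string> &log, const std::string &name)
        : log_(log), name_(name) { log_.push_back("create " + name_); }
    ~Tracked() { log_.push_back("destroy " + name_); }
    Tracked(const Tracked &) = delete;
    Tracked &operator=(const Tracked &) = delete;
private:
    std::vector<std::string> &log_;
    std::string name_;
};

// Throws part-way through. There is no try/finally here, yet `foo` is
// still destroyed while the stack unwinds.
void work_that_throws(std::vector<std::string> &log)
{
    Tracked foo(log, "foo");
    throw std::runtime_error("something went wrong");
}

// Runs the work and returns the sequence of events that took place.
std::vector<std::string> run_demo()
{
    std::vector<std::string> log;
    try {
        work_that_throws(log);
    } catch (const std::runtime_error &e) {
        log.push_back(std::string("caught: ") + e.what());
    }
    return log;
}
```

The log comes back as "create foo", "destroy foo", "caught: something went wrong": the destructor has already run by the time the handler sees the exception, which is the job the finally block does in the Delphi version.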

This leads to a little gotcha that regularly catches non-C++ programmers when they are creating methods. They typically come from a pointer-based economy (e.g. Delphi, Java), so when they create a method:

function doSomething(object : TObject) : integer

What they’re doing is actually passing in a reference to TObject (as it’s just a pointer), and because it’s pass-by-value in this case, what they’re really just passing in is the value of the pointer. In C++ it’s a little bit different. When you pass in an object using the form:

int do_something(std::stringstream streamer)

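What actually happens is that a copy is made of the item being passed, and it’s that which ends up in the function, not the actual object that you’re passing in. (Streams such as std::stringstream are not copyable at all, so with this particular type the compiler rejects the call outright when you pass a named object; with a copyable type such as std::string you silently get the copy.) If you want to pass in a reference to the object, then you need to use the reference passing semantic: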

int do_something(std::stringstream &streamer)

You can use the const modifier if the method you’re invoking is not going to modify the passed in reference, which allows you to restrict the things you can do with the reference. In this form you don’t need to perform any indirection on the object (e.g. getting a pointer to it) in order to pass it in. This makes for slightly tidier code, which isn’t strewn with &’s on the way in, and var->’s in the method itself.
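
A small sketch of the three forms side by side. It uses std::string rather than a stream because a string can actually be copied, and the function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// By value: `text` is a private copy, so the caller's string is untouched.
std::size_t append_by_value(std::string text)
{
    text += " (modified)";
    return text.size();
}

// By reference: `text` is the caller's own object, so the change is
// visible after the call returns.
std::size_t append_by_reference(std::string &text)
{
    text += " (modified)";
    return text.size();
}

// By const reference: no copy is made, and the compiler refuses any
// attempt to modify the argument.
std::size_t length_of(const std::string &text)
{
    return text.size();
}
```

Given `std::string s = "hello";`, calling `append_by_value(s)` leaves `s` as it was, while `append_by_reference(s)` changes it to "hello (modified)". All three are called the same way, with no & at the call site.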

And for those Delphi haters out there; the reason I picked Delphi rather than Java is because Delphi is, unless you’re using the .NET variant, a non garbage collected language, and as such requires the free, otherwise you get memory leaks.

Objective C is another kettle of fish. Between the original model of retain/release, the GC model that was available on OS X from 10.5, and now the totally shiny ARC mechanism, it makes some people cry.


So I found this little security clanger in the manual page for dlopen on Mac OS X, where it states:

When path does not contain a slash character (i.e. it is just a leaf name), dlopen() searches the following until it finds a compatible Mach-O file: $LD_LIBRARY_PATH, $DYLD_LIBRARY_PATH, current working directory, $DYLD_FALLBACK_LIBRARY_PATH.

Yes, current working directory, one of the classic vulnerability injection mechanisms. This is as epically bad a security clanger as Microsoft Windows’ LoadLibrary call but, apparently, nobody cares! Linux and Solaris have a far more sensible mechanism: they actually enumerate the places they look for the library, and unless you really, horrendously eff it up, they won’t look in the current working directory.

I nearly did a spit-take when I saw this explicitly called out. In this day and age, it’s an embarrassment.

The Disappointment of New Features

I’m reading articles on new features in the CSS media queries level 4 spec. Items such as luminosity, which allows you to adjust the styling on your app depending on three grades of environmental brightness. This means you could tone down that bright white as it gets darker, so that it doesn’t blind someone who’s trying to read it in a darkened room (I had this experience this morning when the auto-brightness setting on my Nexus decided that full-on-bright was what I needed while triaging my email at 6am, with the lights off).

It’s a pretty nifty feature, and once people start using it we’ll probably all reap the benefit.

The problem is that as of now, it’s pretty much only in a limited set of web browsers. Even though I have a laptop with an ambient lighting sensor, I’ll never see this work properly anytime in the near future.

The next thing I was reading was about making non-rectangular clipping areas for text so that it would flow around images. Looks pretty awesome, and makes things look more like a desktop publishing environment. Only available in Chrome Canary (which means, at the moment, the most bleeding-edge version of Chrome). Which makes it another feature that we have to wait for.

C++11 introduced some nice features such as lambdas, which allow you to define the work to be done on something in the same place as the request to perform the work. It’s pretty nice as you can in-line work quite easily, whereas previously you relied on an external function, typically with pointers to a data blob… the whole thing was quite tedious and led to difficult-to-understand code. Again, you need a modern compiler that understands the C++11 syntax, but once you have it, it’s plain sailing. You ever tried to compile gcc… it’s fun times for all 😉
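
For anyone who hasn’t seen one, a quick sketch of what that looks like. The helper names are made up, and it needs a compiler in C++11 mode.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Sort names by length, shortest first. The comparison is written inline
// as a lambda instead of a separate function defined somewhere else.
std::vector<std::string> sorted_by_length(std::vector<std::string> names)
{
    std::stable_sort(names.begin(), names.end(),
                     [](const std::string &a, const std::string &b) {
                         return a.size() < b.size();
                     });
    return names;
}

// Count the values above a threshold. The lambda captures `limit` from the
// enclosing scope, replacing the old "function pointer plus data blob"
// arrangement.
long count_above(const std::vector<int> &values, int limit)
{
    return std::count_if(values.begin(), values.end(),
                         [limit](int v) { return v > limit; });
}
```

The work sits right next to the call that requests it, and the captured `limit` travels along without any hand-rolled context struct.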

Again, a new feature, but it generally comes with a whole bunch of things that have to change to support it.

This is where the disappointment comes in. All these shiny features are available on the shiniest of newest systems. As developers, we like having the newest stuff – from operating systems to development environments, to programming languages. They all provide us with the ability to do our jobs better, and in a more efficient manner. It also allows us to royally screw things up much more rapidly, and then fix it so you almost don’t notice that it happened.

That’s not where most of the world lies. Most folks are living in the ‘it got installed, I’m not touching it’ world. It makes things difficult for us developers as we have to match up our work to what functions in their environment. That means we can’t use the newest version of X, because that’s not going to be present on the end-user’s system.

There is a sliver of bright light in the form of the automatic update. If you’re using Google Chrome, or any recent version of Firefox then unless you change something, it will always be silently updating to the newest version behind your back. This means that the next time you start it up, you’ve got the latest and greatest available. All the features are present. Unfortunately, this also means that the changes can trigger failures. This can be caused by a lack of testing, or a lack of backwards compatibility.

When it happens because of a lack of backwards compatibility, then people get genuinely angry – it used to work and now it simply doesn’t, and for no reason whatsoever. On Internet Explorer we have the ‘do the wrong thing’ switch, which causes the browser to act in the old, bad way, so that a user’s experience does not change when they install the newer browser.

I don’t think this is really going anywhere, so I’ll leave it as-is then.