Why does a simple piece of C generate a SIGSEGV on Linux as opposed to a SIGBUS on OSX

One of the lads in the office has the following piece of code:


When compiled and run on Linux it generates a SIGSEGV, but when run on OSX it generates a SIGBUS.


POSIX defines SIGBUS as `Access to an undefined portion of a memory object.` and SIGSEGV as `Invalid memory reference.`

The difference arises because OSX translates a native Mach exception into a POSIX signal. The route is through a translation layer, the code of which contains:

case EXC_BAD_ACCESS:
    if (code == KERN_INVALID_ADDRESS)
        *ux_signal = SIGSEGV;
    else
        *ux_signal = SIGBUS;
    break;

This indicates that if you crash on an address that is mapped into the process, but that can’t be used in the way you asked, you’ll get a SIGBUS; if you crash on an address that isn’t mapped into the process at all, you’ll get a SIGSEGV. In other words, all unmapped addresses generate SIGSEGV, while all addresses that are mapped but aren’t usable in the way you ask generate SIGBUS.
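To make the rule concrete, here is a minimal C sketch of that decision. The enum and the function name are hypothetical stand-ins of my own, not the kernel’s identifiers; only SIGSEGV and SIGBUS come from the real `<signal.h>`.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical classification of a bad memory access (not a kernel type). */
enum bad_access_kind {
    ACCESS_UNMAPPED_ADDRESS,   /* nothing is mapped at the address */
    ACCESS_PROTECTION_FAILURE  /* mapped, but not usable this way (e.g. not executable) */
};

/* Sketch of the translation described above:
 * unmapped address -> SIGSEGV, anything else -> SIGBUS. */
int signal_for_bad_access(enum bad_access_kind kind)
{
    assert(kind == ACCESS_UNMAPPED_ADDRESS || kind == ACCESS_PROTECTION_FAILURE);
    return (kind == ACCESS_UNMAPPED_ADDRESS) ? SIGSEGV : SIGBUS;
}
```

Under this model, jumping to a data page is a protection failure (SIGBUS), while jumping to address 0 is an unmapped address (SIGSEGV).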

There is some other code:

if (machine_exception(exception, code, subcode, ux_signal, ux_code))

which deals with general protection faults e.g. pages that are simply not mapped.

When you compile the .c file, because you declared main as a variable, it gets default storage/allocation, which means it goes into the DATA segment of the application, and nothing in that segment may be executed as code.

When you try to run the program on OSX, it tries to execute the ‘code’ at main, which is data, the exception it gets at this point is bad access. The memory is available, but it isn’t mapped as executable, so you get a SIGBUS.

When you try to run this program on Linux, you get a SIGSEGV. The reason is simply that you’re trying to use memory for a purpose it isn’t mapped for (execution), and Linux reports that as a SIGSEGV.

When you change the code, to invoke a simple null pointer:

int main(int argc, char **argv) {
    void (*crash)(void) = 0;
    crash();   // call through the null function pointer
    return 0;
}

You get a SIGSEGV. This is because there is no page mapped at address 0 that is visible to the user (you can use vmmap on processes to see what’s mapped/not mapped in this case).

.NET Date and Time Handling

So I started out reading the documentation for .NET date and time handling and I felt I needed a drink. If this is what .NET developers have to deal with, then no wonder so many of them have issues.

It starts with the DateTime structure. Its purpose is to represent a date and time; however, the structure holds both the timestamp and a flag saying whether it’s a UTC time (i.e. no TZ information whatsoever) or a local time.

Obtaining a value for ‘now’ involves using the static properties DateTime.Now or DateTime.UtcNow. Once you’ve instantiated a DateTime value you can’t change it, so, for example, DateTime.Now.ToUniversalTime() doesn’t alter the structure; it returns a new DateTime converted using the system’s current timezone settings, which means you can get an incorrect value when the time straddles a daylight savings change.

There is an entire Microsoft document on best practices for dealing with DateTime values. It’s a bit tricky. Firstly, try to only deal with dates and times in UTC where possible. There is a major gotcha in the XmlSerializer code in the .NET 1.0/1.1 framework, where it always treats the DateTime as local time, even if you’ve passed in a UTC time. This seems to have been fixed in later revisions of .NET, so hopefully you won’t encounter it.

Obtaining a DateTime object from a String is a matter of using DateTime.Parse(); you just have to ensure that the format being parsed matches the input string value. The default Parse method uses the thread’s current Culture, which means you have to pay attention to things like the pesky mm/dd/yyyy vs dd/mm/yyyy conventions when you’re parsing. If you want to specify the formatting that you’re trying to parse in a manual fashion, you need to use the Parse(String s, IFormatProvider provider) overload, which allows you to use another Culture’s parsing format, or define your own.

If you’re planning on performing operations on a DateTime item, such as adding a few hours, days, etc to the original value bear in mind that in order to be safe, you should perform the operation with the timezone set to UTC (which doesn’t have any DST adjustments), and then swap it back to local time. This double trip is intended to avoid errors around ‘spring forward’ and ‘fall back’ times. Even if you don’t think you’re going to encounter those times it’s better safe than sorry.

Because a DateTime only stores a time in either ‘local’ or UTC, when you choose to persist this information you are best off storing it in UTC and keeping an adjacent element for the timezone, should you actually choose to store it. The simplest way to store it is as the value of the Ticks property. This value is a 64-bit integer counting ten-millionths of a second (100 ns units) since 12:00 midnight, January 1, 0001 A.D. (C.E.) in the Gregorian calendar.
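If the stored ticks ever have to be read by something that isn’t .NET, the conversion is plain arithmetic. Here is a small C sketch; the function and macro names are my own, and the only fact it relies on is that 1970-01-01T00:00:00 UTC corresponds to 621355968000000000 ticks.

```c
#include <assert.h>
#include <stdint.h>

/* .NET ticks are 100 ns units counted from 0001-01-01T00:00:00. */
#define TICKS_PER_SECOND    INT64_C(10000000)
/* Tick count at 1970-01-01T00:00:00 UTC (719162 days * 86400 s * 10^7). */
#define TICKS_AT_UNIX_EPOCH INT64_C(621355968000000000)

/* Convert a UTC tick count to whole seconds since the Unix epoch.
 * This sketch only handles instants at or after 1970. */
int64_t ticks_to_unix_seconds(int64_t ticks)
{
    assert(ticks >= TICKS_AT_UNIX_EPOCH);
    return (ticks - TICKS_AT_UNIX_EPOCH) / TICKS_PER_SECOND;
}

/* Convert whole seconds since the Unix epoch back to a UTC tick count. */
int64_t unix_seconds_to_ticks(int64_t seconds)
{
    assert(seconds >= 0);
    return seconds * TICKS_PER_SECOND + TICKS_AT_UNIX_EPOCH;
}
```

Note that this only makes sense for ticks taken from a UTC DateTime; ticks from a ‘local’ value carry the local offset baked in.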

The DateTime structure has the ability to extract the Year, Month, Day, Hour, Minute, Second and Milliseconds from the value. It allows the extraction of the ‘Ticks’ value, but this is an absolute value, not an extraction. All these elements are in terms of the Gregorian calendar. It doesn’t matter if your current culture does not use the Gregorian calendar, it always returns the data in terms of the Gregorian calendar. I don’t know the reason for this choice, considering that it understands the current timezone; but it’s most likely due to the culture being a property of the current thread rather than a more ‘systemmy’ property.

If you want to extract based on a different calendar, such as the Persian calendar, you need to extract them via an instance of the calendar. The Persian calendar is easily instanced from System.Globalization.PersianCalendar, but if you need the calendar for a specific region, you should instantiate a CultureInfo using new System.Globalization.CultureInfo(String name, bool useUserOverride), and then extract its Calendar property. In this case I can do:

var date = DateTime.UtcNow;
CultureInfo cinfo = new System.Globalization.CultureInfo("en-IE", false); // get the English-speaking, Ireland culture
int year = cinfo.Calendar.GetYear(date);
// This year is based on the calendar from Ireland (Gregorian), but you get the idea as to how to use it for other calendars.

The Calendar class deals with the breaking of a date and time into pieces. Those pieces are defined in terms of the calendar, which can be confusing if you’re only used to a single calendar, as many people are. I live in a region that uses the Gregorian calendar exclusively, so I deal with dates and times in those terms. The Gregorian calendar was introduced by Pope Gregory XIII in 1582 (by that calendar’s reckoning) as a correction to the Julian calendar. You can, in fact, get dates and times in terms of the Julian calendar, which drifted because it insisted on a leap year every 4 years with no exceptions. The Gregorian calendar adds two exceptions to that rule: years divisible by 100 are not leap years, unless they are also divisible by 400. So, for example, 1900 was not a leap year, 2000 was, and 2100 will not be.
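The two leap-year rules are short enough to write down. This is a sketch in C rather than anything from the .NET libraries, and the function names are mine:

```c
#include <assert.h>
#include <stdbool.h>

/* Julian rule: every fourth year is a leap year, no exceptions. */
bool is_julian_leap_year(int year)
{
    assert(year > 0);
    return year % 4 == 0;
}

/* Gregorian rule: every fourth year, except century years,
 * unless the century year is also divisible by 400. */
bool is_gregorian_leap_year(int year)
{
    assert(year > 0);
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}
```

The two functions disagree only on century years such as 1900 and 2100, which is exactly where the Julian calendar picks up its extra days.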

So with the .NET environment we have the DateTime, which represents a point in time, and we then have the Calendar, which is a representation of the DateTime in terms of a particular calendar, i.e. its representation in terms of the year, month, day and day of week. There can be more delineations of the date and time; it’s up to the calendar to define them.

There are many ways to make an instance of a DateTime. Possibly the simplest is to use the Now static property, but you can also construct them explicitly using a variety of constructors. The simplest is DateTime(ticks), which makes one in terms of the number of ticks since 0-time. You can also construct them from just the year, month and day; from those combined with an hour, minute and second; or with a millisecond value on top of that. These can be specified as ‘local’ or UTC, and can also be specified in terms of a calendar. Once you’ve constructed the DateTime instance, it is no longer represented in terms of that calendar, but in terms of the Gregorian calendar; e.g.

var cal = new System.Globalization.JapaneseCalendar();
DateTime aDateTime = new DateTime(2014, 1, 1, cal);
System.Diagnostics.Debug.Assert(aDateTime.Year == 2014);

This assertion will trigger, because it’s no longer in terms of the Japanese Calendar.

Parsing of DateTime values can be done using DateTime.Parse(), which takes a string and turns it into a DateTime. It uses the current Culture unless the string can be parsed as ISO 8601, so when I’m here in little old Ireland, if I put in a string of 1/2/2015 it tells me February 1, 2015, but if I’m in the US it represents January 2, 2015. So you have to be careful when parsing DateTime values. This is where the IFormatProvider parameter comes in; this interface is generally not implemented by the user, but is instead obtained from an instance of a culture, e.g.

IFormatProvider if_us = new System.Globalization.CultureInfo("en-US", false);
IFormatProvider if_ie = new System.Globalization.CultureInfo("en-IE", false);
DateTime ieDate = DateTime.Parse("1/2/2015", if_ie);
DateTime usDate = DateTime.Parse("1/2/2015", if_us);
System.Diagnostics.Debug.Assert(ieDate.Year == usDate.Year);
System.Diagnostics.Debug.Assert(ieDate.Month != usDate.Month);
System.Diagnostics.Debug.Assert(ieDate.Day != usDate.Day);
System.Diagnostics.Debug.Assert(ieDate.Month == usDate.Day);
System.Diagnostics.Debug.Assert(ieDate.Day == usDate.Month);

None of these assertions trip, because we know/understand how the string is parsed by these cultures.
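The ambiguity isn’t specific to .NET; it’s in the string itself. As a language-neutral illustration, here is a C sketch that parses the same text under both conventions. The struct, enum and function are hypothetical helpers of mine, and the range check is deliberately crude (it doesn’t know how long each month is).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct simple_date { int year, month, day; };

/* Which field comes first in an a/b/yyyy string. */
enum field_order { DAY_FIRST, MONTH_FIRST };

/* Parse "a/b/yyyy" according to the given convention.
 * Returns false if the text doesn't match or the fields are out of range. */
bool parse_slash_date(const char *text, enum field_order order,
                      struct simple_date *out)
{
    int first, second, year;

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%d/%d/%d", &first, &second, &year) != 3)
        return false;

    out->year = year;
    out->day = (order == DAY_FIRST) ? first : second;
    out->month = (order == DAY_FIRST) ? second : first;
    return out->month >= 1 && out->month <= 12 &&
           out->day >= 1 && out->day <= 31;
}
```

Parsing "1/2/2015" with DAY_FIRST gives 1 February, and with MONTH_FIRST gives 2 January, which is the same split the two cultures produce above.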

The next thing you generally need to do is display times in different time zones. You already have a DateTime instance, which is either local or UTC, and you want to turn it into the local time somewhere else. What you need is a TimeZoneInfo, which you use to convert to a new DateTime in that zone:

TimeZoneInfo irelandtz = TimeZoneInfo.FindSystemTimeZoneById("GMT Standard Time");
TimeZoneInfo newyorktz = TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
var now = DateTime.UtcNow;
DateTime irelandnow = TimeZoneInfo.ConvertTimeFromUtc(now, irelandtz);
DateTime newyorknow = TimeZoneInfo.ConvertTimeFromUtc(now, newyorktz);
Console.WriteLine(" {0} {1}", irelandnow, irelandtz.IsDaylightSavingTime(irelandnow) ? irelandtz.DaylightName : irelandtz.StandardName);
Console.WriteLine(" {0} {1}", newyorknow, newyorktz.IsDaylightSavingTime(newyorknow) ? newyorktz.DaylightName : newyorktz.StandardName);

Now this code is awful: you are not picking the timezone by location (e.g. Europe/Dublin or America/New_York), you’re picking it by the name of the timezone. There is a much better alternative for dealing with DateTime values, and that’s to use NodaTime, which gives a somewhat better experience:

Instant now = SystemClock.Instance.Now;
DateTimeZone dublin = DateTimeZoneProviders.Tzdb["Europe/Dublin"];
DateTimeZone newyork = DateTimeZoneProviders.Tzdb["America/New_York"];
Console.WriteLine(new ZonedDateTime(now, dublin).ToString());
//2015-02-07T20:48:20 Europe/Dublin (+00)
Console.WriteLine(new ZonedDateTime(now, newyork).ToString());
//2015-02-07T15:48:20 America/New_York (-05)

Dealing with time on computers is so much fun

When a computer boots up, the operating system typically takes its current idea of what time it is from the hardware clock. On shutting down, it typically pushes its own idea of what time it is back to the BIOS, so that the clock is up-to-date relative to what time the operating system thinks it is.

The operating system while running tries to keep the concept of what time it is by having a counter it increments based on the kicking of some interval timer (on newer operating systems this is not strictly how it works, but the idea is similar).

The problem is, of course, that the thing that generates the timer isn’t completely accurate. As a result, over time, the operating system clock drifts in relation to what the real time is. This drift depends on the quality of the hardware going into the computer. Some are more expensive and thus don’t drift as much, some are cheap and drift quite a lot (some are so bad as to cause a drift of many minutes over the course of a day).

This is generally undesirable; I want to be reminded that I have a meeting before it actually starts so I can load up on a cup of coffee beforehand. How do we fix this?

The answer is to ask another computer and hope that it has a better idea of what time it is than this current one.

The most common protocol used for getting time over a network is the Network Time Protocol (NTP). In its simplest mode, you ask the remote system what time it is and then shift your time to match, adjusting the local clock gradually (to prevent confusing jumps where it goes from 9am to 8.15 instantly, confusing you no end). The fancier methods involve asking multiple systems for the time, constructing a model of how crap your clock is relative to their clock and using that to constantly adjust your time to try to keep it in sync with the remote system. You periodically ask the remote systems for the time and use it to update your model. With this you can typically keep your clock within ~100ms of the time that’s considered ‘now’.
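For the curious, the ‘ask what time it is’ step boils down to four timestamps and a little arithmetic. This C sketch shows the usual offset and round-trip delay calculation; the struct and function names are my own, and real NTP uses fixed-point timestamps plus filtering across many samples, which is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* The four timestamps of one request/response exchange, in seconds. */
struct ntp_sample {
    double client_send;    /* T1: request leaves, by the client's clock */
    double server_receive; /* T2: request arrives, by the server's clock */
    double server_send;    /* T3: reply leaves, by the server's clock */
    double client_receive; /* T4: reply arrives, by the client's clock */
};

/* Estimated amount the client clock must be shifted to match the server,
 * assuming the network delay is the same in both directions. */
double ntp_clock_offset(const struct ntp_sample *s)
{
    assert(s != NULL);
    return ((s->server_receive - s->client_send) +
            (s->server_send - s->client_receive)) / 2.0;
}

/* Time spent on the network, excluding the server's processing time. */
double ntp_round_trip_delay(const struct ntp_sample *s)
{
    assert(s != NULL);
    return (s->client_receive - s->client_send) -
           (s->server_send - s->server_receive);
}
```

A client would then slew its clock towards the offset gradually rather than jumping, for the reason given above.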

This is fine, we now have a relatively accurate clock which we can ask what time it is. The only problem is that depending on where you are in the world, your idea of what time it is has been altered by the wonderful concept of the timezone.

A properly written operating system should keep time without caring about what timezone you belong to. When you ask for a time it gives you a number which is unambiguous. This is vitally important when it comes to trying to perform operations on the time. If it’s unambiguous, then you can use it for mathematical operations such as determining the difference between two times. If it’s ambiguous then you’ll have trouble trying to do anything with it.

When you introduce the possibility of ambiguity into the representation of time, problems arise. For example, if I ask you how many minutes there are between 01:30 and 01:35, the obvious answer of 5 minutes may be incorrect, because 01:30 is BST and 01:35 is GMT, which means the difference is 65 minutes. And if I just showed you that time on the day after a daylight savings transition, without telling you what timezone it was in, you wouldn’t be able to tell.
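The 5-versus-65 example can be checked with a few lines of C. This is only a sketch: the struct and function names are mine, and it assumes both times fall on the same UTC day.

```c
#include <assert.h>

/* A wall-clock time plus the offset that makes it unambiguous. */
struct wall_time {
    int hour;
    int minute;
    int utc_offset_minutes; /* +60 for BST, 0 for GMT */
};

/* Minutes since midnight UTC for a wall-clock time. */
static int minutes_since_utc_midnight(struct wall_time t)
{
    return t.hour * 60 + t.minute - t.utc_offset_minutes;
}

/* The 'obvious' answer: subtract the displayed times and ignore the offsets. */
int naive_minutes_between(struct wall_time from, struct wall_time to)
{
    return (to.hour * 60 + to.minute) - (from.hour * 60 + from.minute);
}

/* The real answer: convert both to UTC first, then subtract. */
int actual_minutes_between(struct wall_time from, struct wall_time to)
{
    return minutes_since_utc_midnight(to) - minutes_since_utc_midnight(from);
}
```

With 01:30 BST and 01:35 GMT the naive version returns 5 and the UTC-based version returns 65.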

But what does this actually mean? For most developers, you should follow two cardinal rules:

  • Avoid changing the date from its unambiguous form until the last possible moment.
  • Always store the date in its unambiguous form.

If you have to store a date with a timezone, store the timezone as separate information to the actual unambiguous date, that way you can easily perform operations on the unambiguous time, and display the original time relative to the timezone it was supplied.

This doesn’t even get into the complexities of calendars. While the West generally uses the Gregorian calendar, many other places in the world don’t use that calendar (there are a lot of them).

Standard C (POSIX) is poorly supplied with an API for dealing with dates and times, especially in relation to timezones. You can generally set the timezone of the application, but it’s a thread-unsafe abomination that really should DIAF.

Most of the newer languages have significantly better date/time support. I’ll go through them in a series of posts, starting with C#.

A simple C dtrace aggregation consumer

This is a little bit of gluing from other sources, but it’s a simple piece of code to consume the content of an aggregation from dtrace using the `libdtrace` library. I’ve tested it on Mac OSX 10.9 and it works, with the usual caveats that it won’t work all the time, and that you need privileged access to get it to run:

#include <assert.h>
#include <dtrace.h>
#include <stdio.h>

// This is the program.
static const char *progstr =
"syscall::recvfrom:return \
{ @agg[execname,pid] = sum(arg0); }";

// This is the aggregation walk function
static int
aggfun(const dtrace_aggdata_t *data, void *gndn __attribute__((unused)))
{
    dtrace_aggdesc_t *aggdesc = data->dtada_desc;
    dtrace_recdesc_t *name_rec, *pid_rec, *sum_rec;
    char *name;
    int32_t *ppid;
    int64_t *psum;
    static const dtrace_aggdata_t *count;

    // The first element walked is remembered and skipped rather than printed
    if (count == NULL) {
        count = data;
        return (DTRACE_AGGWALK_NEXT);
    }

    // Our aggregation has 4 records (id, execname, pid, sum)
    assert(aggdesc->dtagd_nrecs == 4);

    name_rec = &aggdesc->dtagd_rec[1];
    pid_rec = &aggdesc->dtagd_rec[2];
    sum_rec = &aggdesc->dtagd_rec[3];

    // Each record describes where its value lives inside the data buffer
    name = data->dtada_data + name_rec->dtrd_offset;
    assert(pid_rec->dtrd_size == sizeof(pid_t));
    ppid = (int32_t *)(data->dtada_data + pid_rec->dtrd_offset);
    assert(sum_rec->dtrd_size == sizeof(int64_t));
    psum = (int64_t *)(data->dtada_data + sum_rec->dtrd_offset);

    printf("%1$-30s %2$-20d %3$-20ld\n", name, *ppid, (long)*psum);
    return (DTRACE_AGGWALK_NEXT);
}

// set the option, otherwise print an error & return -1
static int
set_opt(dtrace_hdl_t *dtp, const char *opt, const char *value)
{
    if (-1 == dtrace_setopt(dtp, opt, value)) {
        fprintf(stderr, "Failed to set '%1$s' to '%2$s'.\n", opt, value);
        return (-1);
    }
    return (0);
}

// set all the options, otherwise return an error
// (each call returns 0 or -1, so OR-ing them yields -1 if any failed)
static int
set_opts(dtrace_hdl_t *dtp)
{
    return (set_opt(dtp, "strsize", "4096")
        | set_opt(dtp, "bufsize", "1m")
        | set_opt(dtp, "aggsize", "1m")
        | set_opt(dtp, "aggrate", "2msec")
        | set_opt(dtp, "arch", "x86_64"));
}

int
main(int argc __attribute__((unused)), char **argv __attribute__((unused)))
{
    int err;
    dtrace_proginfo_t info;
    dtrace_hdl_t *dtp;
    dtrace_prog_t *prog;

    dtp = dtrace_open(DTRACE_VERSION, DTRACE_O_LP64, &err);

    if (dtp == 0) {
        fprintf(stderr, "Failed to dtrace_open: %s\n", dtrace_errmsg(NULL, err));
        return (1);
    }
    if (-1 == set_opts(dtp))
        return (1);

    prog = dtrace_program_strcompile(dtp, progstr, DTRACE_PROBESPEC_NAME, 0, 0, NULL);
    if (prog == 0) {
        printf("dtrace_program_compile failed\n");
        return (1);
    }
    if (-1 == dtrace_program_exec(dtp, prog, &info)) {
        printf("Failed to dtrace exec.\n");
        return (1);
    }
    if (-1 == dtrace_go(dtp)) {
        fprintf(stderr, "Failed to dtrace_go.\n");
        return (1);
    }

    while (1) {
        int status = dtrace_status(dtp);
        if (status == DTRACE_STATUS_OKAY) {
            // Snapshot the aggregation data from the kernel, then walk it
            dtrace_aggregate_snap(dtp);
            dtrace_aggregate_walk(dtp, aggfun, 0);
        } else if (status != DTRACE_STATUS_NONE) {
            break;
        }
        // Wait until the next status/aggregation interval is due
        dtrace_sleep(dtp);
    }

    dtrace_stop(dtp);
    dtrace_close(dtp);
    return (0);
}

What a pointless ‘auto updater’

Another morning, another Adobe Flash update.

Big ass dialog telling me that I have an update which links me to their website where I have to uncheck the dumb-ass add-on options for (depending on the week) McAfee and the Google toolbar.

Several months ago Adobe were pushing/advertising their auto-updating technology. They were pretty much saying ‘completely automated updates’. To me it seems more like manual updating. You’re wasting my time with the prompts, the dialogs, and the small download which subsequently downloads the update (each one is tied to the version of Flash you’re downloading, so what’s the damned point in having a stub downloader for each version?).

If you want to see a proper auto-updater you should look to Google. It downloads updates silently in the background, applies them (as much as possible) in the background, and if you need to restart your browser it mentions this in a prompt. The downloads are tiny thanks to their differential compression algorithm, which keeps the updates small enough to download in the background without interfering with your normal use of the system. At the same time they’re not pushing a bunch of extra third-party software on you.

My Heart bleeds, truly…

The world has ended! OpenSSL has this terrible bug which allows an attacker to receive a pot-luck of 64k portions of the memory address space of the SSL server. Things that are up for grabs include usernames, passwords and SSL private keys; i.e. a veritable grab-bag of things that can be obtained from the server.
You should really change your password.

Actually, don’t bother; it’s not like it actually matters in the long run. You’re probably going to change your password from bunny2 to bunny3 anyway and it’s not like that was the most likely point of escape of your password in the first place. It’s probably that toolbar that installed when you installed that video player that was needed to watch that movie; you know the one? The one with the kittens? The one that you never got to see because you got distracted.

It’s been a really bad few months for security, and while people have been pointing fingers in various directions, deploying a Nelson-style ‘Ha Ha’, the only lesson that has been found in this case is that the Bible has a great thing to say about this:

Do not gloat when your enemy falls; when they stumble, do not let your heart rejoice, or the Lord will see and disapprove and turn his wrath away from them.

… and then you’re in trouble.

As for me, I’m going to wait a few weeks and then change the passwords that are in any way connected to money. It would be nice if I could get some form of toolbar that would check sites against a database as I go to them, reveal if they were subject to the bug, and prompt me to change the password once they’ve been declared clear. Someone get on that; it sounds like a fun project.


Code to change assert behaviour when running under a debugger

In Linux, you can create a __assert_fail routine which will be triggered when an assert is encountered in code compiled with asserts enabled. Using some debugger detection, you can have the code behave differently in that situation. The code to accomplish detection at load-time looks like:

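#include <assert.h>
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>

static int ptrace_failed;

// Returns non-zero if a child could not ptrace-attach to this process,
// which is what happens when a debugger is already attached.
static int
detect_ptrace(void)
{
    int status, waitrc;

    pid_t child = fork();
    if (child == -1) return -1;
    if (child == 0) {
        // Child: try to attach to the parent; failure means it is being traced
        if (ptrace(PT_ATTACH, getppid(), 0, 0))
            exit(1);
        do {
            waitrc = waitpid(getppid(), &status, 0);
        } while (waitrc == -1 && errno == EINTR);
        ptrace(PT_DETACH, getppid(), (caddr_t)1, SIGCONT);
        exit(0);
    }
    // Parent: the child's exit status is the detection result
    do {
        waitrc = waitpid(child, &status, 0);
    } while (waitrc == -1 && errno == EINTR);
    return WEXITSTATUS(status);
}

static void __attribute__((constructor))
detect_debugger(void)
{
    ptrace_failed = detect_ptrace();
}

void __assert_fail(const char * assertion, const char * file, unsigned int line, const char * function) {
    fprintf(stderr, "Assert: %s failed at %s:%d in function %s\n", assertion, file, line, function);
    if (ptrace_failed)
        raise(SIGTRAP); // one possible choice: stop in the debugger at the failing assert
    abort();            // __assert_fail must not return
}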

This code, because it uses an __attribute__((constructor)), runs when the program is loaded, so it will detect a program being started under a debugger.

You could move the detect_debugger() call into the __assert_fail function, and this provides full run-time detection of the debugger and behaviour alteration. On the assumption that not a lot of asserts() get triggered, you have an effective behaviour alteration of your code when run under a debugger. In this case the __assert_fail looks like:

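void __assert_fail(const char * assertion, const char * file, unsigned int line, const char * function) {
    fprintf(stderr, "Assert: %s failed at %s:%d in function %s\n", assertion, file, line, function);
    if (detect_ptrace())
        raise(SIGTRAP); // one possible choice: stop in the debugger at the failing assert
    abort();            // __assert_fail must not return
}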

ascii progress bars

Sometimes you want to display a progress routine in C/C++, without resorting to termcap/curses. The simplest mechanism you can use is:

for (i = 0; i < 10000; i++){
    printf("\rIn Progress: %d", i/100);
    fflush(stdout); // no newline is printed, so flush to make the update visible
}

This yields flickering on the display as the cursor moves back and forth over the entire line reprinting it. This gets worse with larger amounts of display data. If your progress routine is simple like this, then you can use another mechanism: the backspace \b, rather than the carriage return \r. You take advantage of printf returning the number of characters it printed, and then only emit the required number of backspace characters.

So for example:

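printf("In Progress: ");
int lastprinted = 0;
for (i = 0; i < 10000; i++){
    // back up over the digits printed last time, then print the new value
    while (lastprinted-- > 0) printf("\b");
    lastprinted = printf("%d", i/100);
    fflush(stdout); // no newline is printed, so flush to make the update visible
}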

This way you get to only print a few characters every refresh, which minimizes the flicker.
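If you want the same trick in a form you can unit test, you can build the update into a buffer instead of printing it directly. This is a sketch of mine rather than part of the snippets above; `format_progress_update` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write into 'out' the characters needed to replace a previously printed
 * number that was 'previous_width' characters wide with 'value':
 * that many backspaces followed by the new digits.
 * Returns the width of the new number, to pass in on the next call. */
int format_progress_update(char *out, size_t out_size,
                           int previous_width, int value)
{
    assert(out != NULL && previous_width >= 0);
    /* room for the backspaces, the longest possible int, and the NUL */
    assert(out_size >= (size_t)previous_width + 12);

    memset(out, '\b', (size_t)previous_width);
    return snprintf(out + previous_width,
                    out_size - (size_t)previous_width, "%d", value);
}
```

The caller would fputs() the buffer and fflush(stdout) each time round the loop. As with the original, this assumes the number never gets narrower; if it did, you’d need to overwrite the leftover digits with spaces.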

Launching a JVM from C++ on Mac OS X

I’d been playing around with starting up a JVM from C++ code on Windows. I was futzing around between MSVC and cygwin/mingw and it all worked well. Then I decided to do the same under Mac OS X.

Firstly, if you’re using the OS X VM, when you try to compile some simple code under clang, you’re going to see this awesome warning:

warning: 'JNI_CreateJavaVM' is deprecated [-Wdeprecated-declarations]

Apple really, really don’t want you using the JavaVM framework; in general if you want to use their framework, you have to ignore the warnings and soldier on.

Secondly, if you want a GUI, you can’t instantiate the VM on the main thread. If you do that then your process will deadlock, and you won’t be able to see any of the GUI.

With that in mind, we hive off the creation of the VM and main class to a separate thread. With some crufty code, we make a struct of the VM initialization arguments (this code is not clean: it contains strdups and news, and there’s no cleanup or free):

struct start_args {
    JavaVMInitArgs vm_args;
    const char *launch_class;

    start_args(const char **args, const char *classname) {
        vm_args.version = JNI_VERSION_1_6;
        vm_args.ignoreUnrecognized = JNI_TRUE;

        // count the NULL-terminated list of VM options
        int arg_count = 0;
        const char **atarg = args;
        while (*atarg++) arg_count++;
        JavaVMOption *options = new JavaVMOption[arg_count];
        vm_args.nOptions = arg_count;
        vm_args.options = options;

        // copy each option string into the options array
        while (*args) {
            options->optionString = strdup(*args);
            options++;
            args++;
        }
        launch_class = strdup(classname);
    }
};

Next we have the thread function that launches the VM. This is a standard posix thread routine, so there’s no magic there.

void *
start_java(void *start_args)
{
    struct start_args *args = (struct start_args *)start_args;
    int res;
    JavaVM *jvm;
    JNIEnv *env;

    res = JNI_CreateJavaVM(&jvm, (void**)&env, &args->vm_args);
    if (res < 0) exit(1);
    /* load the launch class */
    jclass main_class;
    jmethodID main_method_id;
    main_class = env->FindClass(args->launch_class);
    if (main_class == NULL) {
        jvm->DestroyJavaVM();
        exit(1);
    }
    /* get main method */
    main_method_id = env->GetStaticMethodID(main_class, "main", "([Ljava/lang/String;)V");
    if (main_method_id == NULL) {
        jvm->DestroyJavaVM();
        exit(1);
    }

    /* make the initial argument */
    jobject empty_args = env->NewObjectArray(0, env->FindClass("java/lang/String"), NULL);
    /* call the method */
    env->CallStaticVoidMethod(main_class, main_method_id, empty_args);
    /* Don't forget to destroy the JVM at the end */
    jvm->DestroyJavaVM();
    return (0);
}

What this code does is create the Java VM (short piece at the start). Then it finds and invokes the public static void main(String args[]) of the class that’s passed in. At the end, it destroys that Java VM. You’re supposed to do that, for memory allocation’s sake.

Next we have the main routine, which creates the thread and invokes the run loop:

int main(int argc, char **argv)
{
    const char *vm_arglist[] = { "-Djava.class.path=.", 0 };
    struct start_args args(vm_arglist, "launch");
    pthread_t thr;
    pthread_create(&thr, NULL, start_java, &args);
    CFRunLoopRun();
}

The trick is the CFRunLoopRun() at the end. This starts the CoreFoundation main application run-loop, which allows the application to pump messages for all the other run-loops that are created by the Java UI framework.

The next thing is an example java file that creates a window.

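import java.awt.BorderLayout;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

public class launch extends JFrame {
    JLabel emptyLabel;

    public launch() {
        // quit the whole application when the window is closed
        setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        emptyLabel = new JLabel("Hello World");

        getContentPane().add(emptyLabel, BorderLayout.CENTER);
        pack();
    }

    public static void main(String args[]) {
        // create and show the window on the Swing event thread
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                launch l = new launch();
                l.setVisible(true);
            }
        });
    }
}

Because it’s got an EXIT_ON_CLOSE option, when you close the window the application will terminate.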

Gluing this into a makefile involves the following; it’s got conditional sections for building with either the 1.6 or 1.7 VM, and there are a couple of changes that are needed in the cpp code to get this to work:


# VM to build against: 6 or 7 (override with make JAVA_VERSION=6)
JAVA_VERSION ?= 7
JHOME:=$(shell /usr/libexec/java_home -v 1.$(JAVA_VERSION))
SYSTEM:=$(shell uname -s)

ifeq ($(JAVA_VERSION),7)
# JRE library directory (assumed layout of a 1.7 JDK on OS X)
VM_DIR := $(JHOME)/jre/lib
LDFLAGS += -L$(VM_DIR)/server -Wl,-rpath,$(VM_DIR) -Wl,-rpath,$(VM_DIR)/server
CXXFLAGS += -I$(JHOME)/include -I$(JHOME)/include/$(SYSTEM)
LDLIBS += -ljvm
else
CXXFLAGS += -framework JavaVM
endif

CXXFLAGS += -framework CoreFoundation

all: launch launch.class

launch.class: launch.java
	/usr/libexec/java_home -v 1.$(JAVA_VERSION) --exec javac launch.java

clean:
	rm -rf launch *.class *.dSYM

For the C++, you need to change the include path depending on whether you’re building for the 1.6 or 1.7 VM.

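// Building against the 1.6 VM (the JavaVM framework):
#include <JavaVM/jni.h>
// Building against the 1.7 VM (the JDK's include directory):
#include <jni.h>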


You can’t really use the code that was compiled against the 1.6 VM on the 1.7 environment, as it uses the JavaVM framework in 1.6, vs. the libjvm.dylib on the 1.7 VM, but I would consider this a given across all JVM variants.

This source can be cloned from the Repository on BitBucket.

RAII, or why C++ doesn’t have a finally clause

One of the most common idioms I see in a Delphi program looks like the following:

foo := TObject.Create;
try
    // Do something with foo
finally
    foo.Free;
end;

It’s primarily because you always create objects on the heap, and everything involving an object is, essentially, a pointer. This makes for a little bit of a memory management issue. You have to remember to destroy objects after you’ve created them, and because something might go wrong in between, that destruction needs to take place in a finally block. Having it take place in a finally block keeps you safe from exceptions: if an exception is triggered, it always passes through the finally block on its way back up the stack. This gives you the ability to clean up temporary objects as needed.

C++ uses the RAII idiom, which means that objects that are defined at a certain scope are always going to be destroyed once that scope is exited. What this means is that if you define an object X in a function Y, once Y is returned from then X will be destructed/destroyed. As an example:

std::stringstream streamer;
// do something with streamer

There’s no awkward streamer.create call, and once you return from the function streamer is appropriately tidied up.
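Plain C has nothing like this built in, but GCC and Clang offer a `cleanup` variable attribute that gives a feel for what the C++ compiler is doing on your behalf. The sketch below relies on that compiler extension (it is not standard C), and the function and counter names are hypothetical ones of mine.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts how many times the cleanup routine has run (for illustration only). */
static int buffers_released = 0;

/* Invoked automatically with the address of the variable leaving scope. */
static void release_buffer(char **buffer)
{
    free(*buffer);
    buffers_released++;
}

/* Whichever return is taken, release_buffer runs as 'buffer' leaves scope. */
int use_scoped_buffer(int fail_early)
{
    char *buffer __attribute__((cleanup(release_buffer))) = malloc(64);

    if (buffer == NULL || fail_early)
        return -1;          /* early exit: buffer is still released */
    buffer[0] = '\0';
    return 0;               /* normal exit: buffer is released here too */
}
```

That is the essence of RAII: the cleanup is tied to the scope, not to the programmer remembering to write it on every exit path.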

But wait, you say, they are not the same, what you are doing in Delphi is creating an object on the heap, while in C++ you are creating it on the stack, so of course during the process of unwinding said stack, you will destroy the object. The more equivalent code in C++ would have been:

std::stringstream *streamer = new std::stringstream();
// Do something with streamer
delete streamer;

Hah, you say, no try/finally means that if an exception is triggered in the ‘do something’ piece of code, you leak a streamer object on the heap.

To which I respond, silly rabbit, that’s why you didn’t create a pointer in the first place with the first piece of code. If you want to perform something like this, then you should use a smart pointer, which takes care of the destruction of the object once the smart pointer exits scope, like so:

std::unique_ptr<std::stringstream> streamer(new std::stringstream);
// Do something with streamer

But really, if you were just going to create an entity for the duration of a function, it’s far easier to just create it in-place without such complications.

This leads to a little gotcha that regularly catches non-C++ programmers when they are creating methods. As they typically come from a pointer-based economy (e.g. Delphi, Java), when they create a method:

function doSomething(obj : TObject) : integer;

What they’re doing is actually passing in a reference to TObject (as it’s just a pointer), and because it’s pass-by-value in this case, what they’re really just passing in is the value of the pointer. In C++ it’s a little bit different. When you pass in an object using the form:

int do_something(std::stringstream streamer)

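What actually happens is that a copy is made of the item being passed, and it’s that which ends up in the function, not the actual object that you’re passing in. (Streams are a slightly unfortunate example, as they can’t be copied and the compiler will reject the call; with any copyable type, such as std::string, a copy is what you get.) If you want to pass in a reference to the object, then you need to use the reference passing semantic: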

int do_something(std::stringstream &streamer)

You can use the const modifier if the method you’re invoking is not going to modify the passed in reference, which allows you to restrict the things you can do with the reference. In this form you don’t need to perform any indirection on the object (e.g. getting a pointer to it) in order to pass it in. This makes for slightly tidier code, which isn’t strewn with &’s on the way in, and var->’s in the method itself.
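C has the same copy-on-pass behaviour for structs, but without references, so it shows the difference (and the &’s and ->’s that C++ references save you from) quite plainly. This is a sketch with made-up names:

```c
#include <assert.h>
#include <stddef.h>

struct counter { int value; };

/* Receives a copy: the increment is lost when the function returns. */
void bump_copy(struct counter c)
{
    c.value++;
}

/* Receives the address: the caller's struct is modified. */
void bump_in_place(struct counter *c)
{
    assert(c != NULL);
    c->value++;
}

/* const pointer: may look at the caller's struct but not change it. */
int peek(const struct counter *c)
{
    assert(c != NULL);
    return c->value;
}
```

Calling bump_copy(c) leaves the caller’s c untouched, while bump_in_place(&c) changes it; a C++ reference parameter behaves like the second form while reading like the first.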

And for those Delphi haters out there; the reason I picked Delphi rather than Java is because Delphi is, unless you’re using the .NET variant, a non garbage collected language, and as such requires the free, otherwise you get memory leaks.

Objective C is another kettle of fish. Between the original model of retain/release, the GC model that was available on OS X from 10.5, and now the totally shiny ARC mechanism, it makes some people cry.