Quote for the day

Mahatma Gandhi:

A ‘No’ uttered from the deepest conviction is better than a ‘Yes’ merely uttered to please, or worse, to avoid trouble.

Quote heard on the ATP

Just a small quote I heard this morning:

You can’t judge someone on what you think they’re supposed to be doing.

Why does a simple piece of C generate a SIGSEGV on Linux as opposed to a SIGBUS on OSX?

One of the lads in the office has the following piece of code:

main;

When compiled and run on Linux, it generates a SIGSEGV, but when run on OSX, it generates a SIGBUS.

Why?

The definition of SIGBUS is `Access to an undefined portion of a memory object.`, the definition of `SIGSEGV` is that it’s `Invalid memory reference.`.

The issue is that OSX translates a native Mach exception into a POSIX signal. On OSX, the route is through a translation layer, the code of which indicates:


case EXC_BAD_ACCESS:
if (code == KERN_INVALID_ADDRESS)
*ux_signal = SIGSEGV;
else
*ux_signal = SIGBUS;
break;

This indicates that if you crash on an address that is mapped into the process but can’t be used in the way you asked, you’ll get a SIGBUS; if you crash on an address that isn’t mapped into the process at all (KERN_INVALID_ADDRESS), you’ll get a SIGSEGV. In other words, all unmapped addresses generate SIGSEGV, while all addresses that are mapped, but aren’t usable in the way you ask, generate SIGBUS.

There is some other code:

if (machine_exception(exception, code, subcode, ux_signal, ux_code))
return;

which deals with general protection faults e.g. pages that are simply not mapped.

When you compile the .c file, because you declared main as a variable, it gets default storage/allocation, which means it goes into the DATA segment of the application, and nothing in that segment is allowed to be executed as code.

When you try to run the program on OSX, it tries to execute the ‘code’ at main, which is data, the exception it gets at this point is bad access. The memory is available, but it isn’t mapped as executable, so you get a SIGBUS.

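When you try to run this program on Linux, you get a SEGV. Linux makes no such distinction here: trying to use memory in a way its mapping doesn’t permit (such as executing a non-executable page) is reported as SIGSEGV, just like touching memory that isn’t mapped at all.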

When you change the code, to invoke a simple null pointer:


int main(int argc, char **argv) {
void (*crash)(void) = 0;
crash();
}

You get a SEGV. This happens because there is no page mapped at address 0 that is visible to the user (you can use vmmap on a process to see what is and isn’t mapped in this case).
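
To see which signal each kind of bad jump produces on your own machine, here is a minimal sketch that runs the crash in a child process and reports the signal that killed it. The helper names (crash_signal, call_null, call_data, report_crashes) are made up for this example, and casting a data pointer to a function pointer is not strictly portable C; it is assumed to work on the usual POSIX platforms.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Plain data: it lives in a writable, non-executable page. */
static int not_code;

/* Jump through a null function pointer (nothing is mapped at 0). */
void call_null(void)
{
    void (*volatile crash)(void) = 0;
    crash();
}

/* Jump into the data page, much like the one-line `main;` program. */
void call_data(void)
{
    void (*volatile crash)(void) = (void (*)(void))&not_code;
    crash();
}

/* Run fn in a child process. Returns the signal that killed the child,
 * 0 if it exited normally, or -1 if fork/waitpid failed. */
int crash_signal(void (*fn)(void))
{
    int status;
    pid_t pid;

    assert(fn != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core);   /* don't leave core files behind */
        fn();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Print the signal number for both kinds of bad jump. */
void report_crashes(void)
{
    printf("null pointer call: signal %d\n", crash_signal(call_null));
    printf("data page call:    signal %d\n", crash_signal(call_data));
}
```

Calling report_crashes() from a main function prints the two signal numbers; compare them against SIGSEGV and SIGBUS in signal.h for your platform.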

.NET Date and Time Handling

So I started out reading the documentation for .NET date and time handling and I felt I needed a drink. If this is what .NET developers have to deal with, then no wonder so many of them have issues.

It starts with the DateTime structure. Its purpose is to represent a Date and Time; however, this structure possesses both the timestamp and the fact that it’s either a UTC time (i.e. no TZ information whatsoever), or a local time.

Obtaining a value for ‘now’ involves using the static properties DateTime.Now or DateTime.UtcNow. Once you’ve instantiated a DateTime value, you can’t actually change it, so for example doing DateTime.Now.ToUniversalTime() won’t actually change the structure, it returns a new instance of DateTime, which is converted based on the system’s current timezone settings, which means that you can actually get an incorrect value when the time straddles a daylight savings change.

There is an entire Microsoft document on best practices for dealing with DateTime values. It’s a bit tricky. Firstly, try to only deal with dates and times in UTC where possible. There is a major gotcha in the XmlSerializer code in the .NET 1.0/1.1 framework where it always treats the DateTime as localtime, even if you’ve passed in a UtcTime. This seems to have been fixed in later revisions of .NET; so hopefully you won’t encounter it.

Obtaining a DateTime object from a String is a matter of using DateTime.Parse(); you just have to ensure that the expected format matches the input string value. The default Parse method uses the thread’s current Culture, which means you have to pay attention to things like the pesky mm/dd/yyyy vs dd/mm/yyyy conventions when you’re parsing. If you want to specify the formatting that you’re trying to parse in a manual fashion, you need to use the Parse(String s, IFormatProvider provider) overload, which allows you to use another Culture’s parsing format, or define your own.

If you’re planning on performing operations on a DateTime item, such as adding a few hours, days, etc to the original value bear in mind that in order to be safe, you should perform the operation with the timezone set to UTC (which doesn’t have any DST adjustments), and then swap it back to local time. This double trip is intended to avoid errors around ‘spring forward’ and ‘fall back’ times. Even if you don’t think you’re going to encounter those times it’s better safe than sorry.
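
Here is a minimal sketch of that round trip in C, using the POSIX time functions rather than DateTime. The function names are made up for this example, and the zone is described with a POSIX TZ rule string (labelled GMT/BST here, with the EU transition rules), which is an assumption chosen so the example doesn’t depend on an installed zone database.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Add `hours` of elapsed time to a local wall-clock time.
 * The local time is converted to seconds since the epoch (which has no DST),
 * the arithmetic is done on that value, and the result is converted back. */
struct tm add_hours_via_utc(struct tm local, int hours)
{
    struct tm result;
    time_t instant;

    local.tm_isdst = -1;              /* let the C library work out whether DST applies */
    instant = mktime(&local);         /* local wall clock -> seconds since the epoch */
    assert(instant != (time_t)-1);    /* mktime signals failure with -1 */
    instant += (time_t)hours * 3600;  /* arithmetic on the unambiguous value */
    localtime_r(&instant, &result);   /* back to the local wall clock */
    return result;
}

/* Noon on the day before the 2015 spring-forward, plus 24 elapsed hours.
 * Returns the resulting local hour. */
int demo_spring_forward_hour(void)
{
    struct tm start = {0};
    struct tm end;

    setenv("TZ", "GMT0BST,M3.5.0/1,M10.5.0", 1);  /* process-wide setting */
    tzset();

    start.tm_year = 2015 - 1900;
    start.tm_mon = 2;                 /* March (months count from 0) */
    start.tm_mday = 28;
    start.tm_hour = 12;

    end = add_hours_via_utc(start, 24);
    printf("%04d-%02d-%02d %02d:%02d\n", end.tm_year + 1900, end.tm_mon + 1,
           end.tm_mday, end.tm_hour, end.tm_min);
    return end.tm_hour;
}
```

Twenty-four elapsed hours after 12:00 on 28 March 2015 is 13:00 on 29 March in local terms, because the clocks went forward overnight; adding one to the day field directly would have produced 12:00 instead.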

Because a DateTime only stores a time in either ‘local’ or UTC, when you choose to persist this information you are best off storing the information in UTC and keeping an adjacent element for the timezone, should you actually choose to store it. The simplest way to store it is as the result of the Ticks property. This value is an integer, and is in ten-millionths of a second since 12:00 midnight, January 1, 0001 A.D. (C.E.) in the Gregorian calendar.
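
As a worked example of the arithmetic, here is a small C sketch that converts between Unix time (seconds since 1970-01-01 UTC) and a Ticks-style value. The function and macro names are made up for illustration; the offset of 62,135,596,800 seconds is the 719,162 days between 0001-01-01 and 1970-01-01 multiplied by 86,400.

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_SECOND      10000000LL     /* one tick = 100 nanoseconds */
#define UNIX_EPOCH_IN_SECONDS 62135596800LL  /* 0001-01-01 up to 1970-01-01 */

/* Seconds since 1970-01-01 UTC -> ticks since 0001-01-01. */
int64_t unix_seconds_to_ticks(int64_t unix_seconds)
{
    return (unix_seconds + UNIX_EPOCH_IN_SECONDS) * TICKS_PER_SECOND;
}

/* Ticks since 0001-01-01 -> whole seconds since 1970-01-01 UTC. */
int64_t ticks_to_unix_seconds(int64_t ticks)
{
    assert(ticks >= 0);
    return ticks / TICKS_PER_SECOND - UNIX_EPOCH_IN_SECONDS;
}
```

With these, the Unix epoch itself comes out as 621355968000000000 ticks, which is a handy value to sanity-check stored data against.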

The DateTime structure has the ability to extract the Year, Month, Day, Hour, Minute, Second and Milliseconds from the value. It allows the extraction of the ‘Ticks’ value, but this is an absolute value, not an extraction. All these elements are in terms of the Gregorian calendar. It doesn’t matter if your current culture does not use the Gregorian calendar, it always returns the data in terms of the Gregorian calendar. I don’t know the reason for this choice, considering that it understands the current timezone; but it’s most likely due to the culture being a property of the current thread rather than a more ‘systemmy’ property.

If you want to extract based on a different calendar, such as the Persian calendar, you need to extract them via an instance of the calendar. The Persian Calendar is easily instanced from System.Globalization.PersianCalendar, but if you need the calendar for a specific region, you should instantiate a CultureInfo using new System.Globalization.CultureInfo(String name, bool useUserOverride), and then extract its Calendar property. In this case I can do:

var date = DateTime.UtcNow;
CultureInfo cinfo = new System.Globalization.CultureInfo("EN-ie", false); // get the English-speaking, Ireland culture
int year = cinfo.Calendar.GetYear(date);
// This year is based on the calendar from Ireland (Gregorian), but you get the idea as to how to use it for other calendars.

The Calendar class deals with the breaking of a Date & Time into pieces. Those pieces are defined in terms of the calendar – this can be confusing if you’re only used to a single calendar, as many people are. I live in a region that uses the Gregorian calendar exclusively, so I deal with dates and times in those terms. The Gregorian calendar was formalized by Pope Gregory in 1582 (by that calendar’s reckoning), which was, in itself a correction to the Julian calendar. You can, in fact, get dates and times in terms of the Julian calendar, which was off because it insisted on a leap year every 4 years, with no exceptions – bear in mind that the Gregorian calendar has two exceptions to the rule – no leap years on years divisible by 100, but it does have a leap year if it’s also divisible by 400 – so, for example 1900 was not a leap year, but 2000 was a leap year, while 2100 will not be a leap year.
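
The two leap-year rules are short enough to write out. This is a small C sketch with made-up function names, not anything taken from the .NET Calendar classes:

```c
#include <assert.h>
#include <stdbool.h>

/* Julian rule: every fourth year is a leap year, no exceptions. */
bool is_leap_julian(int year)
{
    assert(year > 0);
    return year % 4 == 0;
}

/* Gregorian rule: every fourth year, except century years,
 * unless the century year is also divisible by 400. */
bool is_leap_gregorian(int year)
{
    assert(year > 0);
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}
```

The two functions only disagree on century years that aren’t divisible by 400, such as 1900 and 2100, which is exactly where the Julian calendar picks up its extra days.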

So with the .NET environment we have the DateTime, which represents a point in time, and we then have the Calendar, which is a representation of the DateTime in terms of a particular calendar, i.e. its representation in terms of the year, month, day, day of week. There can be more delineations of the date & time – it’s up to the calendar to define them.

There are many ways to make an instance of a DateTime, possibly the simplest of which is to use the Now static property, but you can also construct them explicitly, using a variety of constructors. The simplest is DateTime(ticks), which makes one in terms of the number of ticks since 0-time. But you can construct them from just the year, month and day; combined with an hour, minute and second; or even subsequently combined with a millisecond value. These can be specified as ‘local’ or UTC, and can also be specified in terms of a calendar. Once you’ve constructed the DateTime instance, it will no longer be represented in terms of that calendar, but will be represented in terms of the Gregorian calendar; e.g.

var cal = new System.Globalization.JapaneseCalendar();
DateTime aDateTime = new DateTime(2014, 1, 1, cal);
System.Diagnostics.Debug.Assert(aDateTime.Year == 2014);

This assertion will trigger, because it’s no longer in terms of the Japanese Calendar.

Parsing of DateTime values can be done using DateTime.Parse(), which takes a string and turns it into a DateTime. It uses the current Culture, unless the string can be parsed as ISO 8601 – so when I’m here in little old Ireland, if I put in a string of 1/2/2015, it tells me February 1, 2015, but if I’m in the US, it represents January 2, 2015. So you have to be careful when parsing DateTime values. This is where the IFormatProvider parameter comes in – this interface is generally not manually implemented by the user, but is instead obtained from an instance of a culture, e.g.

IFormatProvider if_us = new System.Globalization.CultureInfo("EN-us", false);
IFormatProvider if_ie = new System.Globalization.CultureInfo("EN-ie", false);
DateTime ieDate = DateTime.Parse("1/2/2015", if_ie);
DateTime usDate = DateTime.Parse("1/2/2015", if_us);
System.Diagnostics.Debug.Assert(ieDate.Year == usDate.Year);
System.Diagnostics.Debug.Assert(ieDate.Month != usDate.Month);
System.Diagnostics.Debug.Assert(ieDate.Day != usDate.Day);
System.Diagnostics.Debug.Assert(ieDate.Month == usDate.Day);
System.Diagnostics.Debug.Assert(ieDate.Day == usDate.Month);

None of these assertions trip, because we know/understand how the string is parsed by these cultures.

The next thing you generally need to do is deal with displaying times in different time zones. You already have a DateTime instance, which is either local or UTC, and you want to turn it into the local time of some other zone. What you need is a TimeZoneInfo, which you use to convert to a new DateTime expressed in that zone:

TimeZoneInfo irelandtz = TimeZoneInfo.FindSystemTimeZoneById("GMT Standard Time");
TimeZoneInfo newyorktz = TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
var now = DateTime.UtcNow;
DateTime irelandnow = TimeZoneInfo.ConvertTimeFromUtc(now, irelandtz);
DateTime newyorknow = TimeZoneInfo.ConvertTimeFromUtc(now, newyorktz);
Console.WriteLine(" {0} {1}", irelandnow, irelandtz.IsDaylightSavingTime(irelandnow) ? irelandtz.DaylightName : irelandtz.StandardName);
Console.WriteLine(" {0} {1}", newyorknow, newyorktz.IsDaylightSavingTime(newyorknow) ? newyorktz.DaylightName : newyorktz.StandardName);

Now this code is awful: you are not actually picking the timezone by location, but by the name of a timezone, rather than by the location that determines the timezone, e.g. Europe/Dublin and America/New_York. There is a much better alternative for dealing with DateTime values, and that’s to use NodaTime, which gives a somewhat better experience:

Instant now = SystemClock.Instance.Now;
DateTimeZone dublin = DateTimeZoneProviders.Tzdb["Europe/Dublin"];
DateTimeZone newyork = DateTimeZoneProviders.Tzdb["America/New_York"];
Console.WriteLine(new ZonedDateTime(now, dublin).ToString());
//2015-02-07T20:48:20 Europe/Dublin (+00)
Console.WriteLine(new ZonedDateTime(now, newyork).ToString());
//2015-02-07T15:48:20 America/New_York (-05)

Dealing with time on computers is so much fun

When a computer boots up, the operating system typically takes its current idea of what time it is from the hardware clock. On shutting down, it typically pushes its own idea of what time it is back to the BIOS so that the clock is up-to-date relative to what time the operating system thinks it is.

The operating system while running tries to keep the concept of what time it is by having a counter it increments based on the kicking of some interval timer (on newer operating systems this is not strictly how it works, but the idea is similar).

The problem is, of course, that the thing that generates the timer isn’t completely accurate. As a result, over time, the operating system clock drifts in relation to what the real time is. This drift depends on the quality of the hardware going into the computer. Some are more expensive and thus don’t drift as much, some are cheap and drift quite a lot (some are so bad as to cause a drift of many minutes over the course of a day).

This is generally undesirable, I want to be reminded that I have a meeting before it actually starts so I can load up on a cup of coffee beforehand. How do we fix this?

The answer is to ask another computer and hope that it has a better idea of what time it is than this current one.

The most common protocol used for getting time over a network is the Network Time Protocol (NTP). In its simplest mode, you ask the remote system what time it is and then shift your time to match the time from the remote system by adjusting the time of your local system gradually (to prevent jumps in time where it goes from 9am to 8:15 instantly, confusing you no end). The fancier methods involve asking multiple systems for the time, constructing a model of how crap your clock is relative to their clock and using that to constantly adjust your time to try to keep it in sync with the remote system. You periodically ask the remote systems for the time and use it to update your model. With this you can typically keep your clock within ~100ms of the time that’s considered ‘now’.
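
The core arithmetic behind a single NTP exchange can be sketched in a few lines of C. The four timestamps are the ones the protocol records (client send, server receive, server send, client receive); the use of plain double seconds, the function names, and the idea of capping each correction at max_step are simplifications assumed for this example, not how a real NTP daemon is implemented.

```c
#include <assert.h>

/* Estimated amount the local clock is behind the server, in seconds.
 * t0 = client send, t1 = server receive, t2 = server send, t3 = client receive.
 * t0 and t3 are read from the local clock, t1 and t2 from the server's. */
double ntp_offset(double t0, double t1, double t2, double t3)
{
    return ((t1 - t0) + (t2 - t3)) / 2.0;
}

/* Round-trip network delay, excluding the server's processing time. */
double ntp_delay(double t0, double t1, double t2, double t3)
{
    return (t3 - t0) - (t2 - t1);
}

/* Slewing: apply at most max_step seconds of correction per adjustment,
 * so the clock is nudged towards the right time instead of jumping. */
double slew_step(double offset, double max_step)
{
    assert(max_step > 0.0);
    if (offset > max_step)
        return max_step;
    if (offset < -max_step)
        return -max_step;
    return offset;
}
```

For example, with t0 = 0.0, t1 = 10.5, t2 = 10.75 and t3 = 1.25, the offset works out to 10 seconds and the delay to 1 second; repeatedly applying slew_step then walks the clock towards the server’s time in small increments.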

This is fine, we now have a relatively accurate clock which we can ask what time it is. The only problem is that depending on where you are in the world, your idea of what time it is has been altered by the wonderful concept of the timezone.

A properly written operating system should keep time without caring about what timezone you belong to. When you ask for a time it gives you a number which is unambiguous. This is vitally important when it comes to trying to perform operations on the time. If it’s unambiguous, then you can use it for mathematical operations such as determining the difference between two times. If it’s ambiguous then you’ll have trouble trying to do anything with it.

When you introduce the possibility of ambiguity into the representation of time, problems arise. For example, if I ask you how many minutes there are between 01:30 and 01:35, the obvious answer of 5 minutes may be incorrect, because 01:30 is BST and 01:35 is GMT, which means the difference is 65 minutes. What’s more, if I just showed you that time the day after a daylight savings transition without telling you what timezone it was in, you wouldn’t be able to tell.
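
The 65-minute answer falls out as soon as each wall-clock time carries its UTC offset. A minimal C sketch, with a made-up struct and function names, assuming both times fall on the same calendar day:

```c
#include <assert.h>

/* A wall-clock time plus the offset that makes it unambiguous. */
struct wall_time {
    int hour;
    int minute;
    int utc_offset_minutes;   /* +60 for BST, 0 for GMT */
};

/* Minutes since midnight UTC for the given wall-clock time. */
int minutes_since_utc_midnight(struct wall_time t)
{
    assert(t.hour >= 0 && t.hour < 24 && t.minute >= 0 && t.minute < 60);
    return t.hour * 60 + t.minute - t.utc_offset_minutes;
}

/* Elapsed minutes from one wall-clock time to another on the same day. */
int minutes_between(struct wall_time from, struct wall_time to)
{
    return minutes_since_utc_midnight(to) - minutes_since_utc_midnight(from);
}
```

Passing {1, 30, 60} (01:30 BST) and {1, 35, 0} (01:35 GMT) gives 65, while the same two clock readings with equal offsets give 5. Without the third field there is no way to tell which answer is right.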

But what does this actually mean? For most developers, you should follow two cardinal rules:

  • Avoid changing the date from its unambiguous form until the last possible moment.
  • Always store the date in its unambiguous form.

If you have to store a date with a timezone, store the timezone as separate information to the actual unambiguous date, that way you can easily perform operations on the unambiguous time, and display the original time relative to the timezone it was supplied.

This doesn’t even get into the complexities of calendars. While the West generally uses the Gregorian calendar, many other places in the world don’t use that calendar (there are a lot of them).

Standard C (POSIX) is poorly supplied with an API for dealing with dates and times, especially in relation to timezones. You can generally set the timezone of the application, but it’s a thread-unsafe abomination that really should DIAF.
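
To show what that looks like in practice, here is a sketch of the usual workaround for formatting a time in a zone other than the process default: temporarily overwrite the TZ environment variable. The function name format_in_zone is made up for this example, and the zone strings in the usage note are bare POSIX TZ strings chosen so that no zone database is needed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Format `when` as "YYYY-MM-DD HH:MM" in the zone described by `tz`.
 * Returns 0 on success, -1 on failure.
 * NOT thread-safe: TZ is process-wide, so any other thread formatting a
 * time while this runs may see the wrong zone. */
int format_in_zone(time_t when, const char *tz, char *out, size_t outlen)
{
    char saved[128];
    const char *old = getenv("TZ");
    int had_old = (old != NULL);
    struct tm parts;
    size_t written = 0;

    assert(tz != NULL && out != NULL);

    /* Copy the old value first: setenv may invalidate the getenv pointer. */
    if (had_old) {
        if (strlen(old) >= sizeof saved)
            return -1;
        strcpy(saved, old);
    }

    setenv("TZ", tz, 1);
    tzset();                              /* make the C library re-read TZ */
    if (localtime_r(&when, &parts) != NULL)
        written = strftime(out, outlen, "%Y-%m-%d %H:%M", &parts);

    /* Put the process back the way we found it. */
    if (had_old)
        setenv("TZ", saved, 1);
    else
        unsetenv("TZ");
    tzset();

    return written > 0 ? 0 : -1;
}
```

Calling format_in_zone(0, "UTC0", buf, sizeof buf) yields 1970-01-01 00:00, and the same instant with "EST5" yields 1969-12-31 19:00. Every call mutates and restores global state, which is exactly the problem being complained about.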

Most of the newer languages have significantly better date/time support. I’ll go through them in a series of posts, starting with C#.

A simple C dtrace aggregation consumer

This is a little bit of gluing from other sources, but it’s a simple piece of code to consume the content of an aggregation from dtrace using the `libdtrace` library. I’ve tested it on Mac OSX 10.9 and it works; the usual caveats apply: it won’t work all the time, and you need privileged access to get it to run:

#include <assert.h>
#include <dtrace.h>
#include <stdio.h>

// This is the program.
static const char *progstr =
"syscall::recvfrom:return \
{ @agg[execname,pid] = sum(arg0); }";

// This is the aggregation walk function
static int
aggfun(const dtrace_aggdata_t *data, void *gndn __attribute__((unused)))
{
    dtrace_aggdesc_t *aggdesc = data->dtada_desc;
    dtrace_recdesc_t *name_rec, *pid_rec, *sum_rec;
    char *name;
    int32_t *ppid;
    int64_t *psum;
    static const dtrace_aggdata_t *count;

    if (count == NULL) {
        count = data;
        return (DTRACE_AGGWALK_NEXT);
    }

    // Our aggregation has 4 records (id, execname, pid, sum)
    assert(aggdesc->dtagd_nrecs == 4);

    name_rec = &aggdesc->dtagd_rec[1];
    pid_rec = &aggdesc->dtagd_rec[2];
    sum_rec = &aggdesc->dtagd_rec[3];

    name = data->dtada_data + name_rec->dtrd_offset;
    assert(pid_rec->dtrd_size == sizeof(pid_t));
    ppid = (int32_t *)(data->dtada_data + pid_rec->dtrd_offset);
    assert(sum_rec->dtrd_size == sizeof(int64_t));
    psum = (int64_t *)(data->dtada_data + sum_rec->dtrd_offset);

    printf("%1$-30s %2$-20d %3$-20ld\n", name, *ppid, (long)*psum);
    return (DTRACE_AGGWALK_NEXT);
}

// set the option, otherwise print an error & return -1
int
set_opt(dtrace_hdl_t *dtp, const char *opt, const char *value)
{
    if (-1 == dtrace_setopt(dtp, opt, value)) {
        fprintf(stderr, "Failed to set '%1$s' to '%2$s'.\n", opt, value);
        return (-1);
    }
    return (0);
}

// set all the options, otherwise return an error
int
set_opts(dtrace_hdl_t *dtp)
{
    return (set_opt(dtp, "strsize", "4096")
        | set_opt(dtp, "bufsize", "1m")
        | set_opt(dtp, "aggsize", "1m")
        | set_opt(dtp, "aggrate", "2msec")
        | set_opt(dtp, "arch", "x86_64"));
}

int
main(int argc __attribute__((unused)), char **argv __attribute__((unused)))
{
    int err;
    dtrace_proginfo_t info;
    dtrace_hdl_t *dtp;
    dtrace_prog_t *prog;

    dtp = dtrace_open(DTRACE_VERSION, DTRACE_O_LP64, &err);

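    if (dtp == 0) {
        // dtrace_open reports its failure through err, so use dtrace_errmsg
        fprintf(stderr, "dtrace_open: %s\n", dtrace_errmsg(NULL, err));
        return (1);
    }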
    if (-1 == set_opts(dtp))
        return (1);

    prog = dtrace_program_strcompile(dtp, progstr, DTRACE_PROBESPEC_NAME, 0, 0, NULL);
    if (prog == 0) {
        printf("dtrace_program_compile failed\n");
        return (1);
    }
    if (-1 == dtrace_program_exec(dtp, prog, &info)) {
        printf("Failed to dtrace exec.\n");
        return (1);
    }
    if (-1 == dtrace_go(dtp)) {
        fprintf(stderr, "Failed to dtrace_go.\n");
        return (1);
    }

    while(1) {
        int status = dtrace_status(dtp);
        if (status == DTRACE_STATUS_OKAY) {
            dtrace_aggregate_snap(dtp);
            dtrace_aggregate_walk(dtp, aggfun, 0);
        } else if (status != DTRACE_STATUS_NONE) {
            break;
        }
        dtrace_sleep(dtp);
    }

    dtrace_stop(dtp);
    dtrace_close(dtp);
    return (0);
}

What a pointless ‘auto updater’

Another morning, another Adobe Flash update.

Big ass dialog telling me that I have an update which links me to their website where I have to uncheck the dumb-ass add-on options for (depending on the week) McAfee and the Google toolbar.

Several months ago Adobe were pushing/advertising their auto-updating technology. They were pretty much saying ‘completely automated updates’. To me it seems more like manual updating. You’re wasting my time with the prompts, the dialogs, the small download which subsequently downloads the update (each one is tied to the version of Flash you’re downloading, so what’s the damned point in having a stub downloader for each version?).

If you want to see a proper auto-updater you should look to Google. It downloads updates silently in the background, applies them (as much as possible) in the background and if you need to restart your browser it mentions this in a prompt. The downloads are tiny, thanks to their differential compression algorithm, which keeps the updates small to the point of being downloadable in the background while not interfering with your normal use of the system. At the same time they’re not pushing a bunch of extra third-party software on you.

My Heart bleeds, truly…

The world has ended! OpenSSL has this terrible bug which allows an attacker to receive a pot-luck of 64k portions of the memory address space of the SSL server. Things that are up for grabs include usernames, passwords and SSL private keys; i.e. a veritable grab-bag of things that can be obtained from the server.
You should really change your password.

Actually, don’t bother; it’s not like it actually matters in the long run. You’re probably going to change your password from bunny2 to bunny3 anyway and it’s not like that was the most likely point of escape of your password in the first place. It’s probably that toolbar that installed when you installed that video player that was needed to watch that movie; you know the one? The one with the kittens? The one that you never got to see because you got distracted.

It’s been a really bad few months for security, and while people have been pointing fingers in various directions, deploying a Nelson-style ‘Ha Ha’, the only lesson that has been found in this case is that the bible has a great thing to say about this:

Do not gloat when your enemy falls; when they stumble, do not let your heart rejoice, or the Lord will see and disapprove and turn his wrath away from them.

… and then you’re in trouble.

As for me, I’m going to wait a few weeks and then change the passwords that are in any way connected to money. It would be nice if I could get some form of toolbar that would check sites as I go to them in a database and reveal if they were subject to the bug and prompt me to change the password once they’ve been declared clear. Someone get on that; it sounds like a fun project.