Fragmentation percentage FAIL

[Image: defrag]
That’s a huge amount of fragmentation. I think I may not be able to survive that percentage. It took < 5 seconds to defrag. Please, people, percentages are supposed to mean something 😉

automatically sending text messages from email messages

Using the natty little webtext script (updated to use a $HOME/.webtextrc file for the username/password) and a little bit of procmail magic, we can now send a text message containing the subject line of a specific email message once it has been received.
Firstly, there’s the .procmailrc file. The recipe I’m using looks like:

# send text messages
:0 W
* ^subject: IM:
| $HOME/bin/sendim

The script sendim is a simple bash script that checks that the subject line contains “IM: ” and then sends on the remainder of the subject line as a text message to my mobile phone.
As a security precaution, I’ve added a special header ‘X-apikey’ which is checked by sendim. If the key doesn’t match, the message is not sent. You should replace the XXX item with your own value generated using echo <some text here> | sha1sum. Because the key check lives in sendim rather than in the .procmailrc file, messages that don’t have the correct key are quietly dropped instead of being kept in your inbox.

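#!/bin/bash -p

export PATH=$HOME/bin:$PATH

subject=
apikey=
apikey_c="XXX"    # expected key; replace with your own sha1sum output

# Read the header block only; the first empty line ends the headers.
while IFS= read -r line; do
    [[ -z $line ]] && break
    # Keep the first match of each header; later lines cannot override it.
    subject=${subject:-$(printf '%s\n' "$line" | sed -n "s/^[sS]ubject: IM: //p")}
    apikey=${apikey:-$(printf '%s\n' "$line" | sed -n "s/^X-apikey: //p")}
done

# Send only when there is a subject and the key matches exactly.
if [[ -n $subject && $apikey == "$apikey_c" ]]; then
    webtext -t YYY "$subject" >/dev/null 2>&1
fi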

Source of sendim. Don’t forget to replace the XXX and YYY with your chosen items.
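
For what it’s worth, generating the key and the matching header might look something like the following. This is only a sketch: make_apikey is a made-up helper name and the passphrase is a placeholder, so pick your own.

```shell
#!/bin/bash
# make_apikey is a hypothetical helper, not part of sendim or webtext.
# It hashes a passphrase the same way as: echo <some text here> | sha1sum
make_apikey() {
    # sha1sum prints "<digest>  -"; keep only the 40-character hex digest
    echo "$1" | sha1sum | cut -d' ' -f1
}

key=$(make_apikey "some text here")   # placeholder passphrase
echo "apikey_c=\"$key\""              # line to use in sendim in place of XXX
echo "X-apikey: $key"                 # header the sending side has to add
```

The same digest has to appear in both places: once in sendim and once as a header on any mail that should trigger a text.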

Consistency checking a block device

I’ve been testing the resizing of the drives located on a Dell MD3000, and I’ve seen errors when resizing past the 2TB mark. This is on the new firmware, which supports > 2TB logical drives. I wrote a script to write to random locations of a block device. It can then read them back and verify that they’re still the same as what was written. Rather than writing to the entire device I use random sampling, plus a few fixed points on the block device. I get pretty much consistent failures: if I feed the failed locations into the next write run, they fail again in the subsequent read run. That makes resizing a dangerous operation, even though it is stated to be non-destructive.
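
The write-then-verify idea can be tried out safely on a scratch file before pointing anything at a real device. The sketch below is not the Perl script itself, just a small shell illustration of the same approach. The names write_block and verify_log, the 1024-block size and the eight random samples are all assumptions made for the example.

```shell
#!/bin/bash
# Sketch of the write-then-verify idea on a scratch file (NOT a real device).
dev=$(mktemp)    # stand-in for the block device
log=$(mktemp)    # run log: one "offset=md5" line per block written
trap 'rm -f "$dev" "$log"' EXIT
blocks=1024      # assumed size of the stand-in, in 1 KiB blocks
dd if=/dev/zero of="$dev" bs=1024 count="$blocks" 2>/dev/null

# Write one random 1 KiB block at the given block offset and log its checksum.
write_block() {
    local off=$1 buf
    buf=$(mktemp)
    head -c 1024 /dev/urandom > "$buf"
    dd if="$buf" of="$dev" bs=1024 seek="$off" conv=notrunc 2>/dev/null
    echo "$off=$(md5sum < "$buf" | cut -d' ' -f1)" >> "$log"
    rm -f "$buf"
}

# Re-read every logged block and print how many no longer match.
verify_log() {
    local failures=0 off want got
    while IFS='=' read -r off want; do
        got=$(dd if="$dev" bs=1024 skip="$off" count=1 2>/dev/null | md5sum | cut -d' ' -f1)
        [ "$got" = "$want" ] || failures=$((failures + 1))
    done < "$log"
    echo "$failures"
}

write_block 0                    # fixed points: first and last block
write_block $((blocks - 1))
n=0
while [ "$n" -lt 8 ]; do         # plus a handful of random samples
    off=$(( $(od -An -N2 -tu2 /dev/urandom) % blocks ))
    grep -q "^$off=" "$log" && continue   # never write the same block twice
    write_block "$off"
    n=$((n + 1))
done
echo "failures after write: $(verify_log)"

# Simulate damage by zeroing block 0, then verify again.
dd if=/dev/zero of="$dev" bs=1024 count=1 conv=notrunc 2>/dev/null
echo "failures after damage: $(verify_log)"
```

The first verify pass should report no failures and the second should report exactly one, for the block that was deliberately zeroed.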

I realize that the array is nothing more than a rebrand of another device, but it would be great if it were tested in a lab before something this bad got out to the customers.

#! /usr/bin/perl -w

use strict;
use Getopt::Long;
use Digest::MD5 qw(md5_hex);
use File::Basename;

my $fs;
my $readfile;
my $writefile;

my $numpatterns = 2048;
my $seed = undef;
my $size;
my $real_size;
my $help;

my %vars;
my @def_offsets = (0);

sub usage($) {
        print <<EOM;
Usage: $0 --fs=<filesystem> --read=<file>|--write=<file>
        [--num=<number of blocks>] [--offset=<offset to test>]
        [--seed=<random number seed>]
EOM
        exit ($_[0]);
}

my $result = GetOptions( "fs=s" => \$fs,
        "num=i" => \$numpatterns,
        "seed=i" => \$seed,
        "read=s" => \$readfile,
        "offset=i" => \@def_offsets,
        "write=s" => \$writefile,
        "h|help" => \$help);

usage(0) if defined($help);
warn "Need file system to use" if (!defined($fs));
warn "Need either a read or write file" if (!(defined($readfile) || defined($writefile)));

usage (1) if (!defined($fs) || !(defined($readfile) || defined($writefile)));
my $base = basename($fs);

open (IN, "</proc/partitions") || die "Could not load partition tables";
while (<IN>) {
        chomp();
        my ($major, $minor, $blocks, $name) = m/(\w*)\s+(\w*)\s+(\w*)\s+(\w*)$/;
        next if (!defined($major));
        if ($name eq $base) {
                $real_size = $blocks;
                last;
        }
}
close(IN);

die "Could not get size" if (!defined($real_size));

# Write to the offset in blocks
sub write_to_offset($$) {
        my ($offset, $buffer) = @_;
        sysseek(INFS, $offset * 1024, 0);
        my $write = syswrite(INFS, $buffer, 1024);
        if (!defined($write) || $write != 1024) {
                warn "Failed to write: $offset $!\n";
        } else {
                $vars{$offset} = md5_hex($buffer);
        }
}

sub read_from_offset($) {
        my ($offset) = @_;
        my $buffer;
        sysseek(INFS, $offset * 1024, 0);
        my $read = sysread(INFS, $buffer, 1024);
        if (!defined($read) || $read != 1024) {
                warn "Could not read 1024 bytes at $offset $!";
                return (1);
        }
        if (md5_hex($buffer) ne $vars{$offset}) {
                warn "Data at offset $offset was not the same as expected";
                return (1);
        }
        return (0);
}

sub get_buffer {
        my $i = 0;
        my $buffer = "";
        while ($i++ < 256) {
                my $randval = int(rand(255 * 255 * 255 * 255));
                $buffer .= chr($randval >> 24) . chr(($randval >> 16) & 255) .
                        chr(($randval >> 8) & 255) . chr($randval & 255);
        }
        (length($buffer) == 1024) || die "Buffer was " . length($buffer);
        return $buffer;
}

if (defined($readfile)) {
        # reading from previous file
        open (INPUT, "<$readfile") || die "Could not open previous run log";
        while(<INPUT>) {
                chomp();
                my ($key, $value) = m/(.*)=(.*)/;
                if ($key eq "patterncount") {
                        $numpatterns = $value;
                        next;
                }
                if ($key eq "size") {
                        $size = $value;
                        next;
                }
                if ($key eq "seed") {
                        $seed = $value;
                        next;
                }
                $vars{$key} = $value;
        }
        close(INPUT);
} else {
        $seed = time ^ $$ ^ unpack "%L*", `ls -l /proc/ | gzip -f` if (!defined($seed));
        $size = $real_size if (!defined($size));
        open (OUTPUT, ">$writefile") || die "Could not open new run log";
        print OUTPUT "patterncount=$numpatterns\n" .
                "size=$size\n" .
                "seed=$seed\n";
}

print "Size: $real_size [$size] Seed: $seed\n";
srand($seed);

my $mode = "<";
$mode = "+<" if ($writefile);
open(INFS, "$mode$fs") || die "Could not open raw device";

if ($writefile) {
        map { write_to_offset($_, get_buffer()) } @def_offsets;
        write_to_offset($size - 1, get_buffer());
        while($numpatterns > 0) {
                my $offset = int(rand($size));
                print "Writing pattern: $numpatterns           \r";
                next if defined($vars{$offset});
                write_to_offset($offset, get_buffer());
                $numpatterns--;
        }
        map { print OUTPUT "$_=" . $vars{$_} . "\n" } keys(%vars);
        close(OUTPUT);
} else {
        my $failcount = 0;
        my $tocount = scalar(keys(%vars));
        map { $failcount += read_from_offset($_); printf("To Count: %0.7d\r", $tocount--); } sort(keys(%vars));
        print "Count difference: $failcount\n";
}


consistency.pl.txt

signal versus sigaction

The use of signal(int signum, void (*handler)(int)) is a smidgin dangerous on various operating systems. Under Solaris, for example, once the signal has been delivered to the process the signal handler is reset, so a typical piece of code that wants to reuse the signal handler will set it again when receiving the signal. This leads to a minor race condition: between receipt of the signal and the re-setting of the handler, the process can receive another copy of the same signal, which is then handled in the default manner. Some of these defaults cause Bad Things to happen, such as the stopping of the process (SIGTSTP, for example). Linux keeps the signal handler in place, so you need have no fear of a second signal triggering an unwanted default action.
The manual page for signal under Linux makes it clear that the call is deprecated in favour of the much more functional sigaction(int sig, const struct sigaction *restrict act, struct sigaction *restrict oact) call, which keeps signal handlers in place unless you set the SA_RESETHAND flag in the sa_flags member of the sigaction structure. So you get to explicitly choose to accept a signal once, and then have the system deal with it in the default manner afterwards.
Signals are, of course, a real pain in the ass when dealing with sub-processes. For example, the use of ptrace to perform profiling works well until you fork. If another SIGPROF signal arrives before the child can install its signal handler, the child process is terminated, as that’s the default behaviour in that situation.
Under Solaris (and Leopard) you can make use of dtrace to perform profiling on a set of processes without needing to deal with the vagaries of signal handling, making this a non-issue. For those of you stuck in LD_PRELOAD land, probably the only thing that can be done is to set the signal disposition to be ignored before execing the new process. You have a small window where the profiling is missing, but the overall stability of the application is improved by preventing it from accidentally being terminated by a profiling signal received too soon. I know the accuracy nuts would hate that, but it’s part of the price of dealing with standards.
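
The "ignore before exec" trick is easy to see from the shell, since an ignored disposition is inherited across exec while a caught handler is reset to the default. The sketch below is a shell-level illustration only. In an LD_PRELOAD library the equivalent would be setting SIGPROF to SIG_IGN in C before the exec. The name run_ignoring_prof is made up for the example.

```shell
#!/bin/bash
# run_ignoring_prof is a hypothetical helper name.
# It runs a command with SIGPROF ignored; the ignored disposition
# survives the exec, so an early SIGPROF cannot kill the new process.
run_ignoring_prof() {
    (
        trap '' PROF    # an empty trap action means "ignore this signal"
        exec "$@"
    )
}

# Demo: the child sends itself SIGPROF straight away. It only gets as far
# as the echo because the signal was already ignored when it started.
run_ignoring_prof bash -c 'kill -PROF $$; echo "survived SIGPROF"'
```

Without the trap line, the default action for SIGPROF applies and the child dies before reaching the echo.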

Important! Must install! You will die without it!

[Image: CreativeWhine]
Oh get over yourself! I do not need to install the music management software on my computer, and not having it installed is not the end of the world. It’s almost as bad as the Apple updater suggesting you install Safari. Mind you, it’s nowhere near as annoying about it, and it doesn’t suggest that the world will end if you don’t download it (but, you know, it just might…)

Touch typists of the world unite

Vista’s built-in search box on the start menu is a boon for launching applications. The general accessibility of Windows applications to users of the keyboard is a huge boon to those of us who try to keep our grubby little fingers on the keyboard.
This does not seem to be quite the case on the Mac. I’m probably unaware of all the keyboard accelerators that are available; after all, I’ve been using Windows for a lot longer. A lot of my use of the keyboard was prompted by a long use of FVWM while in Sun, where practically everything was usable without having to stray to the mouse. Mind you, on laptops the location of the touchpad is a lot better in this regard: you just drag your claw-like thumb over the tracking surface.

Cheap and cheerful pwait for linux

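#!/bin/bash -p
# Block until the process with the given PID has exited.
if [ $# -eq 0 ]; then
    echo "Usage: $(basename "$0") <pid>" 1>&2
    exit 1
fi
# /proc/<pid> exists for as long as the process does.
while [ -d "/proc/$1" ]; do sleep 0.5; done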

If I implemented it using inotify, I presume I can get rid of the sleep, but that entails compiled code.
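
Another way to drop the explicit loop without resorting to compiled code is to lean on GNU tail, which has a --pid option that makes it exit once the given process dies. This is a sketch, assuming GNU coreutils is installed. The name pwait_tail is made up, and tail still polls internally (the -s interval), so it is tidier but not event-driven.

```shell
#!/bin/bash
# pwait_tail is a hypothetical helper name.
# It blocks until the given PID exits, using GNU tail's --pid option.
pwait_tail() {
    local pid=$1
    # Follow /dev/null (which never produces data) until $pid is gone,
    # checking every 0.1 seconds.
    tail --pid="$pid" -s 0.1 -f /dev/null
}

# Demo: wait for a short-lived background process.
sleep 0.3 &
pwait_tail $!
echo "process finished"
```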

shorten Irish links with url.ie

I like supporting Irish websites, so I tend to use url.ie for links. The algorithm for generating the link seems to be sequential, so I was happy yesterday when my link for the perfect coffee went to http://url.ie/pdc, or as Dale Cooper would (hopefully) say: perfect damn coffee.

The great data recovery ‘challenge’

I have to laugh when I see the great data recovery ‘challenge’. Let’s be honest here folks, businesses are in it to make some level of profit from their efforts. To that end they have facilities in place to recover data from drives damaged by a variety of problems, from simple surface-level damage all the way through to failed drive electronics (swapping out logic boards).
The price quoted is generally based on the amount of effort that needs to be gone through. Accidental erasure is probably the cheapest. Simple disk-level damage (e.g. a few dodgy sectors) can be resolved using tools like Steve Gibson’s Spinrite, which is pretty much a good example of what these companies would be doing. Drive electronics failures would cost more; for example, they may need to disassemble the drive in a protected atmosphere to replace something. Large-scale physical damage to the drive may entail extracting it from the original housing and essentially replicating the internals of the drive in order to read the data from it. This would be very expensive, but would succeed in the face of quite significant damage.
Recovery after intentional erasure of the data using utilities like dd is pretty much a non-starter. For a start, you need insanely expensive specialist equipment, the rate of data recovery is slow (we’re probably talking in the order of bits per second), and the chances of actually recovering anything useful on a typical hard drive are nil.
For any typical person trying to wipe their data, any of the secure erasure utilities available for purchase or for free is more than adequate to prevent the data being recovered by any agency.

Not a lot of font choice

[Image: Adobe Buzzword Font List]
This is the list of typefaces available in Adobe’s new Buzzword. It is really, really pretty and implemented in Flash, but when it comes to using it we discover that the two main fonts are missing: Times & Helvetica (or Times New Roman & Arial for ‘softies).
All the online offerings from the Adobe Beta are pretty nice, and cover the most fundamental of things along with some of the more useful features, like change tracking in Buzzword. It’s all Flash, so I have the fear that it will crash my browser.
It’s yet to happen to me on the Mac, though, even though I keep losing the browser on Linux.