Reverse a Binary Stream Using Busybox

Today I needed to reverse a binary stream using only bash and commonly available command-line utilities. Feeding the raw bytes to tac, sed, or rev was out, since those are all line-oriented utilities that work best on ASCII data. I needed something that I could trust with binary data. This is what I came up with. Feel free to point out my weakness.

The first round was this:

reverse() {
    local i=0
    cat | xxd -c 1 | awk '{print $2}' | tac | \
        while read F; do
            printf "%06x: %s\n" $i $F; i=$((i+1))
        done | xxd -c 1 -r
}

I wasn't a huge fan of the while loop that prefixes each line with an address for 'xxd -r'. The streams that I am using this for are only several kB at most, so efficiency was not my first goal, but why not try to make it faster if you have the option? Some reading reveals that 'tac' is not available on every Unix platform, and 'xxd' is only available if you have vim installed. I swapped in 'hexdump' for 'xxd', but hexdump does not have a reverse mode, so I had to find another way to turn the dump back into bytes. This is where awk comes into play, doing an integer-to-character conversion for each line, while sed takes over the line reversal from tac. This happens to run about 6 times faster than the original version and uses only tools that even busybox has.
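To see the integer-to-character step on its own, here is a small sketch that feeds awk two decimal values by hand. The function name dec_to_bytes is hypothetical, and the "+ 0" is an addition that forces awk to treat the line as a number rather than a string; it is not part of the function in this post.

```shell
# dec_to_bytes: hypothetical helper, for illustration only.
# Reads one decimal value per line and writes the character with that code.
dec_to_bytes() {
    # "$0 + 0" forces a numeric value, so %c prints the character
    # with that code instead of the first character of the string.
    awk '{ printf "%c", $0 + 0 }'
}

# 72 is "H" and 105 is "i" in ASCII, so this prints: Hi
printf '72\n105\n' | dec_to_bytes
```

Values above 127 may depend on the awk implementation and the locale, so treat this as a demonstration for ASCII values only.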

My final version was this:

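reverse() {
    # Dump stdin as one decimal byte value per line, reverse the line
    # order with sed (standing in for tac), then turn each value back
    # into a single character with awk.
    hexdump -v -e '/1 "%d\n"' | \
        sed -e '1!G;h;$!d' | \
        awk '{printf "%c", $0}'
}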
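
The sed expression is the least obvious part of that pipeline, so here is a minimal sketch of it in isolation. The name tac_sed is hypothetical. On every line but the first, G appends the hold space (everything seen so far, already reversed) after the current line; h saves the result back to the hold space; and $!d suppresses output until the last line.

```shell
# tac_sed: hypothetical name, for illustration only.
# Prints its input lines in reverse order, like tac.
tac_sed() {
    sed -e '1!G;h;$!d'
}

# Prints 3, 2 and 1 on separate lines.
printf '1\n2\n3\n' | tac_sed
```

Since sed accumulates the whole input in its hold space, this is probably only sensible for small inputs such as the few-kB streams mentioned above.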

You might use it like this:

$ reverse < file > file.reversed
# or
$ command -in -a | pipeline | reverse | process | reverse > some_output

Problem with null bytes

At least on my busybox, this part of the pipeline:

awk '{printf "%c", $0}'

doesn't output anything when the argument is 0. So it reverses the input, but also strips null bytes. Sigh.

Curse those null bytes

Yup, that be a bug in busybox. The string gets built properly in awk_printf(), but then the OC_PRINTF callback uses fputs() with the C string that awk_printf() returns. No length information is returned, and fputs() stops at the first NUL byte. If you are only concerned with the printf case (not sprintf), then the fix is pretty trivial: pass a size_t * into awk_printf() so it can return the length as well as the printed string, and then call fwrite() instead of fputs(). I say trivial, but I realize not everyone likes to edit and recompile their programs. :)
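
For anyone who would rather not recompile, one possible workaround is to keep awk away from raw bytes altogether: let it emit only text, in the form of \ooo octal escapes, and leave the byte output to the shell's printf. The following is a minimal sketch, not a tested busybox fix. The name reverse_nul_safe is hypothetical, and od is swapped in for hexdump only because POSIX specifies the -A, -v and -t options used here. It also assumes that od prints each byte as a three-digit octal number and that the shell's printf understands \ooo escapes in its format string.

```shell
# reverse_nul_safe: hypothetical name, a sketch under the assumptions above.
# Reverses stdin byte by byte without ever making awk print a raw NUL.
reverse_nul_safe() {
    # od prints every input byte as octal text (-v: no "*" shortcuts for
    # repeated lines, -An: no address column). awk collects the values
    # and prints them last to first, each prefixed with a backslash.
    # The result is plain text such as \142\000\141, which printf then
    # expands into the actual bytes.
    printf "$(od -An -v -to1 | awk '
        { for (i = 1; i <= NF; i++) byte[++n] = $i }
        END { for (i = n; i >= 1; i--) printf "%s", "\\" byte[i] }')"
}
```

It can be used exactly like reverse:

```shell
reverse_nul_safe < file > file.reversed
```

The whole escaped stream is held in a command substitution, so like the versions above it is aimed at small inputs.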