Code and Hacks

Stuff I've stumbled on or figured out... mostly Perl, Linux, Mac and Cygwin.

Location: CA, United States

Perl hacker, investor and entrepreneur.

Wednesday, October 14, 2009

Moving to a Mac... which Perl?

Well, after neglecting this blog for quite some time, I'm now back. I had to swap my laptop during the summer, and I decided to give one of the MacBook Pros a try. So I'll be adding Perl on the Mac, and the Mac in general, to the topics covered here. My first dilemma with the new Mac was which perl to use.

  • Leopard only had 5.8 installed, and I've been hooked on 5.10 for a while now. (Snow Leopard has added 5.10, but by the time I got the upgrade I was committed to the ideal of keeping the system perl separate from my development perl.)
  • Having come from Arch Linux, I stumbled upon and really liked Arch OS/X. Unfortunately, it appears that it isn't as well tested as MacPorts. In order to build any Perl modules that use XS with the Arch OS/X perl, I needed to use:
    $ perl Makefile.PL \
        LDDLFLAGS="-arch x86_64 -arch i386 -arch ppc \
                 -bundle -undefined dynamic_lookup -L/usr/local/lib" \
        LDFLAGS="-arch x86_64 -arch i386 -arch ppc -L/usr/local/lib" \
        CCFLAGS="-arch x86_64 -arch i386 -arch ppc -g -pipe \
                 -fno-common -DPERL_DARWIN -fno-strict-aliasing \
                 -I/usr/local/include -I." \
        OPTIMIZE="-Os"
    
    Ummm... I don't think so! While I created a bash alias for it, cpan/cpanp were requiring constant tweaks. I assume I could have exported those variables from my bashrc, but I would rather avoid global changes like that.
  • Next I tried compiling my own perl. I ended up doing it several times as I learned where to put it, and realized I had forgotten to enable things like threads. This really seems to be the best way to go, but I would rather someone else keep up with security patches, new versions, etc.
  • So finally I tried MacPorts. So far so good. I have had trouble remembering to check the variants (port variants <port-file>), but otherwise thumbs up.

One thing I realized I want is a record of the ports that I have deliberately installed (not a list of every installed port, just those I asked for myself). So I wrote a short bash script that I stuck in ~/bin/port to keep a log:

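#!/bin/bash

# Log state-changing port commands, then hand off to the real port.
case "$1" in
  install|uninstall|upgrade|activate)
      echo "$(date) $*" >> ~/.macports.log
      ;;
esac

# Quote "$@" so arguments containing spaces survive intact.
exec /opt/local/bin/port "$@"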

Now anytime I run port install perl5.10 +shared +threads it is added to a log file. Rebuilding the system should be a snap. (I'm sure I could have gotten this by grepping for sudo and port install from the /var/log/system.log* files, but I like having it all in one place and not worrying about log files being rotated out.)
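As a rough sketch of the "rebuilding should be a snap" part, something like the following could turn the log back into a list of install commands. The function name logged_port_commands is made up for illustration, and it assumes each log line starts with the default date output (six whitespace-separated fields, such as "Wed Oct 14 09:15:02 PDT 2009"), which may not hold in other locales.

```shell
#!/bin/bash
# Print the install commands recorded by the wrapper, oldest first.
# ASSUMPTION: every log line is the default `date` output (six
# whitespace-separated fields) followed by the arguments given to port.
# logged_port_commands is a hypothetical helper, not part of MacPorts.
logged_port_commands() {
  local log=${1:-$HOME/.macports.log}
  local dow mon day time zone year args
  # The first six variables soak up the timestamp; args gets the rest.
  while read -r dow mon day time zone year args; do
    case "$args" in
      install\ *) echo "port $args" ;;   # only installs matter for a rebuild
    esac
  done < "$log"
}
```

Running logged_port_commands with no argument reads ~/.macports.log. It only prints the commands, so the output can be reviewed (and anything later uninstalled can be pruned by hand) before running any of it with sudo.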

One other tweak I needed to make was for CPANPLUS. I wanted to be able to install modules in either the system perl (by running /usr/bin/cpanp) or the MacPorts perl (/opt/local/bin/cpanp), but both of those read my user config file (~/.cpanplus/lib/CPANPLUS/Config/User.pm), which needs a full path for perlwrapper => '/usr/bin/cpanp-run-perl'. So I moved just that part of the config to the system config file by running the following in each cpanp:

$ s save system
$ s edit system

Then I removed everything but the perlwrapper configuration from the system config, and finally took the perlwrapper configuration out of my User.pm file.

One other thing I needed to do was make 5.10 the default perl. MacPorts defaults to perl5.8, but the following took care of that:

$ cd /opt/local/bin
$ sudo mv perl perl.bak
$ sudo cp perl-5.10 perl
# make cpanp -> cpanp-5.10, etc.
$ for i in *-5.10 ; do x=${i%%-5.10} ; sudo mv "$x" "$x-5.8" ; sudo ln -s "$i" "$x" ; done
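
Before running a loop like that with sudo, it can help to preview what it would do. The sketch below only prints the rename and symlink steps instead of executing them. The name plan_perl_switch is hypothetical, and it assumes the versioned tools are named like cpanp-5.10, as in the loop above.

```shell
#!/bin/bash
# Print (but do not run) the rename/symlink steps for every
# "<tool>-<new version>" file in a directory.
# ASSUMPTION: versioned tools are named like cpanp-5.10;
# plan_perl_switch is a hypothetical helper name.
plan_perl_switch() {
  local dir=$1 new=$2 old=$3
  local path versioned tool
  for path in "$dir"/*-"$new"; do
    [ -e "$path" ] || continue          # the glob matched nothing
    versioned=${path##*/}               # e.g. cpanp-5.10
    tool=${versioned%-"$new"}           # e.g. cpanp
    # Keep the current default under a versioned name, if there is one.
    [ -e "$dir/$tool" ] && echo "mv $tool $tool-$old"
    echo "ln -s $versioned $tool"
  done
}
```

For example, plan_perl_switch /opt/local/bin 5.10 5.8 lists the steps for that directory without touching anything.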

I see Python has a python_select port-file. Maybe we need something like that for Perl.


Wednesday, May 27, 2009

Easy Access to Your Minicpan Repository

I am a big fan of CPANPLUS and minicpan. I like the plugin structure and power of CPANPLUS. ([Warning: shameless plug follows] I have written a simple plugin that allows you to see/install the prereqs for a module with commands like cpanp /prereqs show or cpanp /prereqs install.) And minicpan is great for getting work done on an airplane or when I am away from the net.

One thing I have struggled with in the past is getting cpanp to use either my local minicpan mirror or a mirror other than my default. Editing the config file is not that hard, but it is far too permanent for what I am trying to do. So I wrote two simple scripts (basically tweaked versions of /usr/bin/cpanp) that change the mirror to my local minicpan or to a mirror passed on the command line: cpanp-local and cpanp-mirror. Both could be significantly improved and documented, and they should probably be combined, but they get the job done.


Wednesday, April 29, 2009

C-Like Pointers In Perl...Oh No!

Tuesday night David Lowe gave a very interesting talk at SF.pm on pack/unpack and some of the awful things you can do with them.[1] We ended the meeting talking about whether you could use the pack format "P" (which packs and unpacks "a pointer to a structure (fixed-length string)") to force poor Perl to do C-like pointer arithmetic.

David is using unpack to do a binary search of fixed-width blobs of data in order to avoid unserializing them. His current (minor) bottleneck is creating the pack format string dynamically for each step in the binary search (i.e., 'x' . ($record_size * $record + 1)). The math is fast; the string concatenation is relatively slow. I wondered if you could use the "P" format to avoid creating the format string on each pass and stick with simple integer arithmetic.

After a bit of hacking, it turns out this can be done. Instead of David's very complicated:

# Create an unpack format to skip the first $record * $record_size 
# bytes, then return the next 100 byte null padded string
my $format  = 'x' . ( $record_size * $record ) . 'Z100';
# Unpack from our binary blob
my $element = unpack( $format, ${$frozen_haystack_ref} );

You get the nearly unfathomable:

 
# Use pointer arithmetic to calculate where the record is in memory
# and convert the Perl integer into an unsigned long integer
my $ptr     = pack( 'L!', $ptr_to_base + $record_size * $record );
# Pull 100 bytes from that spot in memory
my $element = unpack( 'P100', $ptr );

And voila, Perl is doing pointer arithmetic and accessing structures just like C. Unfortunately, unpack("P") won't take a native Perl integer as an argument. You need to use pack("L!") to turn a Perl integer into a native unsigned long. So we trade the string concatenation in David's code for a pack("L!") in this code. And even worse, string concatenation is about 20% faster than unpack.

So, while this doesn't appear to help David speed up his already cheetah-like code, it does prove that you can have pointers in Perl. Of course, you should never ever do anything like this. It is fraught with potential bugs and will drive anyone stuck maintaining your code insane.

Feel free to take apart my ugly benchmarking code. Maybe someone who knows this better can actually save David a few clock-cycles.

--

By the way, thanks to Matt Trout who got me motivated to (re)start blogging about Perl. In the past, I have gotten bogged down by setting up a site rather than focusing on adding content[2]. This time I decided to let Google do the work for me and focus on the content. Hopefully, this will result in more regular (and interesting?) posts. Feedback is very welcome.

Footnotes:
1. David actually has good reasons to do these horrible things, given some of the performance demands of his code; for the rest of us this is just fun^H^H^Hwrong.
2. Either putting together my own TT based blog/site or trying to get MT to work the way I want.
