perl

You Did It

Thu, 2014-10-23 10:59

Working on more ways to turn perfectly good audio into messy noise.

movie year histogram

Sat, 2011-01-01 21:01

I try to keep a log of every movie I watch. I've done this for the last seven years, and I was wondering: what is the distribution of years of the movies I've watched? Here it is:

[Image: histogram of movie years]

I was shocked to see the large spike in the 2000-2005 range. I tend to feel like I watch a lot of older movies, and I do, so I was surprised to see that this range had the most films in it. Quartile-wise, my first quartile was 1943, the median was 1962, and the third quartile was 1990. So I've watched three times as many movies from before 1990 as from after it. I thought the ratio would be higher.

In case you are wondering how I wrangled this data, here's how. My log is a plain html file. Movie years are enclosed in td tags. So I wrote a little perl script, and ran it on the file:

while (<>) {
    my($line) = $_;
    # a year is four digits right after an opening td tag
    if ($line =~ /<td>(\d\d\d\d{1})/) {
        print $1,"\n";
    }
}

(That {1} is probably not needed.)

I sent the output to a file and got a list of all the years, one per line. I fed that into R, and it made the histogram for me.
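The binning that R did for the histogram can also be roughed out in Perl. This is only a sketch of the idea, not the R code I ran: the sub name bin_years, the five-year bin width, and the sample years are all made up for illustration.

```perl
use strict;
use warnings;

# Count how many years fall into each bin of $width years.
# Returns a hash ref mapping the first year of each bin to its count.
sub bin_years {
    my ($width, @years) = @_;
    my %count;
    for my $year (@years) {
        my $start = $year - ($year % $width);    # e.g. 1943 -> 1940 for width 5
        $count{$start}++;
    }
    return \%count;
}

# Made-up sample data; the real input would be the one-year-per-line file.
my $width = 5;
my @years = (1943, 1941, 1962, 1990, 2001, 2003, 2004);
my $bins  = bin_years($width, @years);

# Print a crude text histogram, one row per non-empty bin.
for my $start (sort { $a <=> $b } keys %$bins) {
    printf "%d-%d %s\n", $start, $start + $width - 1, '*' x $bins->{$start};
}
```

Empty bins are simply skipped here, so a gap like 1990-1995 would show up as a missing row rather than a row with no stars.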

In the interest of fairness, I probably ought to watch some movies from 1990-1995, the most underrepresented five year period in the chart. Any suggestions for movies from those years I may not have seen?

lwp::simple

Tue, 2009-10-13 22:35

Okay, so I'm trying to write a little script to check TCM's schedule a couple times a month and let me know if there is anything on it that I might want to check out.

I looked into doing this with Perl, and I found the LWP and LWP::Simple modules, but they seem not to work as advertised.

For instance, this:

>perl -MLWP::Simple -e 'getprint "http://www.sn.no/"'

is supposed to return the source of the index file at www.sn.no. Instead, it returns nothing.

Now I know things are actually working pretty well, since if I do the same with a text file (rather than an html file) like this:

>perl -MLWP::Simple -e 'getprint "http://www.madandmoonly.com/doctormatt/mathematics/squares/rezz01.txt"'

it spews that text file all over standard out.

I'm stumped.
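One way to dig further might be to ask for the HTTP status line instead of just the page body. The sketch below assumes LWP::UserAgent is available (it ships in the same libwww-perl distribution as LWP::Simple). The sub name fetch_with_status and the 30-second timeout are my own choices for illustration, and none of this is a diagnosis of what actually went wrong above.

```perl
use strict;
use warnings;

# Fetch $url and return (content, status line).
# Content is undef when the request did not succeed.
# $ua is optional; when omitted, a fresh LWP::UserAgent is created.
sub fetch_with_status {
    my ($url, $ua) = @_;
    unless ($ua) {
        require LWP::UserAgent;    # loaded only when actually needed
        $ua = LWP::UserAgent->new(timeout => 30);
    }
    my $res = $ua->get($url);
    return ($res->is_success ? $res->decoded_content : undef, $res->status_line);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my ($body, $status) = fetch_with_status($ARGV[0]);
    if (defined $body) {
        print $body;
    }
    else {
        warn "fetch failed: $status\n";
    }
}
```

Saved as, say, fetch.pl (a hypothetical name), it would be run as perl fetch.pl followed by the URL. On failure the status line at least says whether the server refused the request, redirected it, or could not be reached at all.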

schedule watcher perl script?

Sun, 2009-10-04 22:16

I watch a lot of movies on TCM (Turner Classic Movies). I love a lot of movies from the 30s and 40s, and there are lots on TCM. Often, I run across an actor, actress or director of whom I would like to see more. It would be awesome if I had a way to automatically be informed when movies by certain persons (or particular movies I know I'd like to see) are coming up.

TCM has a nice schedule page, updated each month, here. It ought to be simple to write a little perl script to grab this page and search it for names in a file I would maintain. I could have the script run automatically once a month, and somehow let me know the results (email?).
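The searching half of that plan could look something like the sketch below. It assumes the page text is already in a string and the names are already in a list; the sub name names_on_schedule and the sample data are invented for illustration, and the fetching, scheduling, and emailing parts are left out entirely.

```perl
use strict;
use warnings;

# Return the names from @$names that occur in $page, ignoring case.
# \Q...\E keeps any punctuation in a name from being read as regex syntax.
sub names_on_schedule {
    my ($page, $names) = @_;
    return grep { $page =~ /\b\Q$_\E\b/i } @$names;
}

# Invented stand-ins for a scrap of schedule text and a watch list.
my $page  = "8:00 PM The Lady Eve (1941) Barbara Stanwyck, Henry Fonda. Dir: Preston Sturges.";
my @names = ('Barbara Stanwyck', 'Ernst Lubitsch', 'preston sturges');

print "$_\n" for names_on_schedule($page, \@names);
```

Matching whole names against the raw page is crude: it would miss a name split across html tags, so stripping the markup first might be needed on the real schedule page.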

That sounds like a fun little project. I've never written any code that grabs stuff off the web, though I know people do it all the time, so how hard could it be?

postscript line width doubler

Wed, 2009-08-26 21:04

I maintain the text that is used for the Math 120 (Precalculus) course at the University of Washington. (You can check it out here). The book has many figures (1500?), which all exist as postscript files in the text's archive. Now, over the last number of years, I've been trying to improve the book, and this has included improving the figures. Many figures were made with line widths that are too thin, and they sometimes don't print well. One way to fix this is to open the figure's xfig file, edit the line widths there, and export to postscript again. I wanted a quicker way, and had previously found a script online which doubled the line widths in a postscript file. But, yesterday I couldn't find it, so I wrote one myself.

The thing is, these postscript files are generated by xfig, so they have what I think is a peculiarity. Near the head of the file, in all the postscript files I've looked at in the archive, is this line:

/slw {setlinewidth} bind def

This "aliases" slw to stand for setlinewidth. Then, throughout the file are lines like

7.500 slw

consisting of nothing but a number (the line width) and slw. This makes things easy to manipulate.

So, I wrote this little perl script. It moves each small line width halfway toward twice the new minimum width; that is, a width w below 2*$minWidth becomes 0.5*w + $minWidth.

For instance, if $minWidth is set to 15, then line weights of 5 get increased to 17.5, line weights of 10 get increased to 20, line weights of 15 get increased to 22.5, 20 to 25, etc. Only weights below twice $minWidth (in this case 30) get increased.

This works well, I think.


#!/usr/bin/perl
use strict;
use warnings;

my $minWidth = 15;    # all widths will be at least this much after processing

while (my $line = <>) {    # read the file one line at a time
    # if the line contains a number followed by whitespace and the string "slw",
    # and that number is small enough to need to be increased
    if ($line =~ /[+-]?(\d+\.\d+|\d+\.|\.\d+)\s+slw/ && $1 < 2*$minWidth) {
        print 0.5*$1 + $minWidth, " slw\n";    # increase the line width
    }
    else {
        print $line;    # otherwise no change
    }
}

To use this code, put it in a file called, say, lineWidths.pl. Then execute a command like

perl lineWidths.pl < original.ps > new.ps

and new.ps will be just like original.ps except the small line widths will now be somewhat thicker.

Something similar should work for most any postscript file, but this aliasing business makes me wonder how simple a completely general-purpose script would be to create.
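As a first step in that direction, here is a rough sketch that rewrites both the xfig alias slw and a literal setlinewidth, wherever a plain number precedes either one on a line. It is nowhere near general-purpose: it assumes the width is written as a literal number (not computed on the stack), it knows nothing about other aliases or scaling, and it would also rewrite matching text inside comments or strings. The sub names new_width and widen_line and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

my $MIN_WIDTH = 15;    # plays the same role as $minWidth in the script above

# The widening rule: widths below twice the minimum move halfway toward
# that limit; anything else is returned untouched.
sub new_width {
    my ($w) = @_;
    return $w < 2 * $MIN_WIDTH ? 0.5 * $w + $MIN_WIDTH : $w;
}

# Rewrite one line of postscript. The lookbehind keeps the match from
# starting in the middle of a longer number or name.
sub widen_line {
    my ($line) = @_;
    $line =~ s{(?<![\w.])(\d+\.?\d*|\.\d+)(\s+)(slw|setlinewidth)\b}
              {new_width($1) . $2 . $3}ge;
    return $line;
}

# Made-up sample lines, standing in for a real postscript file.
my @sample = (
    "/slw {setlinewidth} bind def\n",
    "7.500 slw\n",
    "1 0 0 setrgbcolor 2 setlinewidth\n",
    "40.000 slw\n",
);
print widen_line($_) for @sample;
```

Swapping the sample loop for a loop over the lines of standard input would turn this into a filter that is used the same way as lineWidths.pl.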