I try to keep a log of every movie I watch. I've done this for the last seven years, and I was wondering: what is the distribution of years of the movies I've watched? Here it is:

I was shocked to see the large spike in the 2000-2005 range. I tend to feel like I watch a lot of older movies, and I do, so I was surprised to see that this range had the most films in it. Quartile-wise, my first quartile was 1943, the median was 1962, and the third quartile was 1990. So I've watched about three times as many movies from before 1990 as from after it. I thought the ratio would be higher.
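The "three times as many" figure follows directly from the third quartile: if three quarters of the list falls at or before 1990, one quarter falls after it. Here is a minimal perl sketch of that arithmetic. The list of years is made up for illustration (it is not the real log), and the quantile helper is a hypothetical name using the simple nearest-rank rule; R's default quantile method interpolates, so its numbers can differ slightly on real data.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Nearest-rank quantile: the smallest value that has at least a
# fraction $p of the data at or below it.
# $sorted must be a reference to an array sorted in ascending order.
sub quantile {
    my ($sorted, $p) = @_;
    my $rank = ceil($p * @$sorted);   # 1-based rank
    $rank = 1 if $rank < 1;
    return $sorted->[$rank - 1];
}

# Made-up sample years, standing in for the real list.
my @years = sort { $a <=> $b }
    (1934, 1941, 1943, 1950, 1958, 1962, 1975, 1983, 1990, 2001, 2003, 2004);

my $q1     = quantile(\@years, 0.25);
my $median = quantile(\@years, 0.50);
my $q3     = quantile(\@years, 0.75);

# Count the movies on each side of the third quartile.
my $older = grep { $_ <= $q3 } @years;
my $newer = grep { $_ >  $q3 } @years;

printf "Q1 %d, median %d, Q3 %d\n", $q1, $median, $q3;
printf "%d at or before %d, %d after: ratio %.1f\n",
    $older, $q3, $newer, $older / $newer;
```

With real data the ratio will only be roughly 3, since ties at the quartile year and the choice of quantile rule both shift the counts a little.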
In case you are wondering how I wrangled this data, here's how. My log is a plain html file. Movie years are enclosed in td tags. So I wrote a little perl script, and ran it on the file:

    while (<>) {
        my($line) = $_;
        # The tag part of this pattern did not survive in this copy;
        # a bare <td> is assumed here, based on the description above.
        if ($line =~ /<td>(\d\d\d\d{1})/) {
            print $1,"\n";
        }
    }

(That {1} is probably not needed.)

I sent the output to a file and got a list of all the years, one per line. I fed that into R, and it made the histogram for me.

In the interest of fairness, I probably ought to watch some movies from 1990-1995, the most underrepresented five year period in the chart. Any suggestions for movies from those years I may not have seen?

lwp::simple
Tue, 2009-10-13 22:35

Okay, so I'm trying to write a little script to check TCM's schedule a couple of times a month and let me know if there is anything on it that I might want to check out. I looked into doing this with Perl, and I found the LWP and LWP::Simple modules, but they seem not to work as advertised. For instance, this:

    >perl -MLWP::Simple -e 'getprint "http://www.sn.no/"'

is supposed to return the source of the index file at www.sn.no. Instead, it returns nothing. Now I know things are actually working pretty well, since if I do the same with a text file (rather than an html file) like this:

    >perl -MLWP::Simple -e 'getprint "http://www.madandmoonly.com/doctormatt/mathematics/squares/rezz01.txt"'

it spews that text file all over standard out. I'm stumped.

schedule watcher perl script?
Sun, 2009-10-04 22:16

I watch a lot of movies on TCM (Turner Classic Movies). I love a lot of movies from the 30s and 40s, and there are lots on TCM. Often, I run across an actor, actress or director of whom I would like to see more. It would be awesome if I had a way to be informed automatically when movies by certain persons (or particular movies I know I'd like to see) are coming up.

TCM has a nice schedule page, updated each month, here. It ought to be simple to write a little perl script to grab this page and search it for names in a file I would maintain. I could have the script run automatically once a month, and somehow let me know the results (email?). That sounds like a fun little project. I've never written any code that grabs stuff off the web, though I know people do it all the time, so how hard could it be?

postscript line width doubler
Wed, 2009-08-26 21:04

I maintain the text that is used for the Math 120 (Precalculus) course at the University of Washington. (You can check it out here.) The book has many figures (1500?), which all exist as postscript files in the text's archive. Over the last number of years, I've been trying to improve the book, and this has included improving the figures. Many figures were made with line widths that are too thin, and sometimes they don't print well. One way to fix this is to open the figure's xfig file, edit the line widths there, and export to postscript again. I wanted a quicker way, and had previously found a script online which doubled the line widths in a postscript file. But yesterday I couldn't find it, so I wrote one myself.

The thing is, these postscript files are generated by xfig, so they have what I think is a peculiarity. Near the head of the file, in all the postscript files I've looked at in the archive, is this line:

    /slw {setlinewidth} bind def

This "aliases" slw to stand for setlinewidth. Then, throughout the file are lines like

    7.500 slw

consisting of nothing but a number (the line width) and slw. This makes things easy to manipulate.

So, I wrote this little perl script. It halves the difference between the current line width and twice the new minimum width. For instance, if $minWidth is set to 15, then line weights of 5 get increased to 17.5, line weights of 10 get increased to 20, line weights of 15 get increased to 22.5, 20 to 25, etc. Only weights below twice the $minWidth (in this case 30) get increased. This works well, I think.

    #!/bin/perl
    $minWidth = 15; # all widths will be at least this much after processing
    while(<>) { # read the file one line at a time
        $line=$_; # call the line $line
        if ($line =~ /[+-]?(\d+\.\d+|\d+\.|\.\d+)\s+slw/ ) { # if the line contains a number followed
                                                             # by whitespace and the string "slw", do this
            $num=$1; # set $num to be the number bit
            if ($num<2*$minWidth) { # if $num is small enough to need to be increased
                print 0.5*$num+$minWidth," slw \n"; # increase the line width
            }
            else {
                print $line; # otherwise no change
            }
        }
        else {
            print $line; # otherwise no change
        }
    }

To use this code, put it in a file called, say, lineWidths.pl. Then execute a command like

    perl lineWidths.pl < original.ps > new.ps

and new.ps will be just like original.ps except the small line widths will now be somewhat thicker.

Something similar should work for most any postscript file, but this aliasing business makes me wonder how simple a completely general-purpose script would be to create.
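As a small step toward something more general, here is a hedged sketch of the same width rule pulled out into a function and applied to either spelling of the operator, slw or setlinewidth. The names thicken and process_line are made up for this example, and it only handles widths written as literal numbers right before the operator. Postscript that computes its widths, or that uses some other alias, would need more than this.

```perl
use strict;
use warnings;

# Hypothetical helper: the same rule as the script above.
# Widths at or above twice the minimum are returned untouched;
# smaller ones are moved halfway toward twice the minimum.
sub thicken {
    my ($width, $min_width) = @_;
    return $width if $width >= 2 * $min_width;
    return 0.5 * $width + $min_width;
}

# Hypothetical helper: rewrite every "<number> slw" or
# "<number> setlinewidth" in a line, leaving everything else alone.
# The lookbehind keeps us from grabbing digits out of the middle of a name.
sub process_line {
    my ($line, $min_width) = @_;
    $line =~ s{(?<![\w.-])(\d+\.?\d*|\.\d+)(\s+)(slw|setlinewidth)\b}{thicken($1, $min_width) . $2 . $3}ge;
    return $line;
}

# A few sample lines (made up) to show the effect with a minimum of 15.
for my $sample ("7.500 slw\n", "40.000 slw\n", "2 setlinewidth\n", "n 100 200 m\n") {
    print process_line($sample, 15);
}
```

Unlike the original script, this version substitutes within the line rather than replacing the whole line, so anything else on the same line is kept. It also accepts whole-number widths such as 2, which the original pattern skips because it requires a decimal point.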