<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
   <channel>
      <title>mark</title>
      <link>http://huangfamily.com/mark/</link>
      <description></description>
      <language>en</language>
      <copyright>Copyright 2007</copyright>
      <lastBuildDate>Sun, 20 May 2007 18:19:59 -0500</lastBuildDate>
      <generator>http://www.sixapart.com/movabletype/?v=3.2</generator>
      <docs>http://blogs.law.harvard.edu/tech/rss</docs> 

            <item>
         <title>Spam filtering with Gmail</title>
         <description><![CDATA[<p>Pair's default <a href="http://www.pair.com/support/knowledge_base/e-mail/junk_e-mail_filtering_overview.html">spam filtering</a> options are pathetic. <a href="http://spamassassin.apache.org/">SpamAssassin</a> is wholly inadequate, and after 2 months of trying, I still couldn't get Bayes to work. So I finally gave up, disabled spam filtering, and am now using a Gmail account as the spam filter for all of my Pair aliases. I configured the Gmail account to forward messages to <tt>gmail@huangfamily.com</tt>. The idea is basically:</p>

<p>
Ham &rarr; @huangfamily.com &rarr; huangfamily.com@gmail.com &rarr; gmail@huangfamily.com &rarr; Deliver
</p>
<p>
Spam &rarr; @huangfamily.com &rarr; huangfamily.com@gmail.com &rarr; Gmail Spam folder
</p>

<p>At the top of my <tt>.procmailrc</tt> file (which handles all mail sent to <tt>@huangfamily.com</tt>), I added the following rule:</p>

<textarea cols="60" rows="10">
# Forward all mail to Gmail for filtering. Gmail will forward clean
# messages back to us.
:0
* ! ^X-Forwarded-For: huangfamily\.com@gmail\.com gmail@huangfamily\.com
{
  :0
  * ^X-Envelope-To: gmail@huangfamily\.com
  /dev/null

  :0 w
  ! huangfamily.com@gmail.com
}
</textarea>

<p>The rule basically says: send all new messages to Gmail, and if they return, deliver them. Since Gmail doesn't forward spam, only clean messages will return. To prevent loops, the rule checks for the <tt>X-Forwarded-For</tt> header that Gmail sets when it forwards messages. Messages sent directly to <tt>gmail@huangfamily.com</tt> without that header are dropped.</p>
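
<p>As a rough illustration, the decision the rule makes can be sketched in Python. This is not part of the <tt>.procmailrc</tt>; the function and constant names are made up for the example, and "deliver" simply means the message falls through to the rest of the <tt>.procmailrc</tt>.</p>

```python
# Hypothetical names; the addresses are the ones used in the procmail rule.
GMAIL_ADDRESS = "huangfamily.com@gmail.com"
RETURN_ADDRESS = "gmail@huangfamily.com"
# Value Gmail is assumed to put in X-Forwarded-For when it forwards a message.
FORWARD_MARK = GMAIL_ADDRESS + " " + RETURN_ADDRESS


def route(headers):
    """Decide what the rule does with a message, given a dict of its headers."""
    # Already went through Gmail: skip the forwarding block and deliver.
    if FORWARD_MARK in headers.get("X-Forwarded-For", ""):
        return "deliver"
    # Sent straight to the return address without passing through Gmail.
    if RETURN_ADDRESS in headers.get("X-Envelope-To", ""):
        return "drop"
    # Everything else is new mail that still needs filtering.
    return "forward to " + GMAIL_ADDRESS
```

<p>For example, <tt>route({"X-Envelope-To": "gmail@huangfamily.com"})</tt> returns <tt>"drop"</tt>, while the same message carrying Gmail's <tt>X-Forwarded-For</tt> value is delivered.</p>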

<p>I suppose that a smart spam bot could fake the <tt>X-Forwarded-For</tt> header. It can't fake the <tt>Received</tt> headers added by the servers that actually relayed the message, though, so if this starts happening, I could replace the condition with one that checks that the message was relayed by a <tt>google.com</tt> server.</p>]]></description>
         <link>http://huangfamily.com/mark/2007/05/spam_filtering_with_gmail.html</link>
         <guid>http://huangfamily.com/mark/2007/05/spam_filtering_with_gmail.html</guid>
         <category>Pair</category>
         <pubDate>Sun, 20 May 2007 18:19:59 -0500</pubDate>
      </item>
            <item>
         <title>Renaming the comment script</title>
         <description><![CDATA[<p>Bots saved the name of the CGI script used for posting comments on Craftlog. So now it changes every day as part of my daily MT maintenance script. Make sure that the real script <tt>mt-comments.cgi</tt> is <b>not</b> executable:</p>

<textarea cols="60" rows="10" readonly="true">
#!/usr/local/bin/bash

# Change comment script name
MT="$HOME/public_html/mt"
cd "$MT" || exit 1

rm -f mt-comments.*.cgi

tmp=$(mktemp -u mt-comments.XXXXXXXXXX)
sed -i -e "s/^CommentScript .*/CommentScript $tmp.cgi/" mt-config.cgi
chmod a-x mt-comments.cgi
install -m 755 mt-comments.cgi $tmp.cgi
</textarea>]]></description>
         <link>http://huangfamily.com/mark/2007/05/renaming_the_comment_script.html</link>
         <guid>http://huangfamily.com/mark/2007/05/renaming_the_comment_script.html</guid>
         <category>Movable Type</category>
         <pubDate>Sun, 20 May 2007 12:59:29 -0500</pubDate>
      </item>
            <item>
         <title>Daily DB cleaning</title>
         <description><![CDATA[<p>MT 3.2 only deletes junk comments from your blog when you click on the name of your blog from the Main Menu, presumably because they didn't want to add periodic triggers throughout the code and thought that most people accessed their blogs every day from the Main Menu. In addition, the activity log can really slow down the DB. So I configured a cron job to clean out the DB every night:</p>

<textarea cols="60" rows="10" readonly="true">
#!/usr/local/bin/bash

# Set these to appropriate values
db_name=${USER}_XXXXXX
db_username=XXXXXX
db_password=XXXXXX
db_server=dbXXX.pair.com

# Delete junk comments
yesterday=$(/usr/local/bin/gdate -d yesterday +%Y-%m-%d)
/usr/local/bin/mysql \
-u "$db_username" \
-p"$db_password" \
-h "$db_server" \
-e "DELETE FROM mt_comment WHERE comment_junk_status < 0 AND comment_modified_on < '$yesterday'" \
$db_name

# Trim activity log
one_month_ago=$(/usr/local/bin/gdate -d '1 month ago' +%Y-%m-%d)
/usr/local/bin/mysql \
-u "$db_username" \
-p"$db_password" \
-h "$db_server" \
-e "DELETE FROM mt_log WHERE log_modified_on < '$one_month_ago'" \
$db_name
</textarea>]]></description>
         <link>http://huangfamily.com/mark/2007/05/daily_db_cleaning.html</link>
         <guid>http://huangfamily.com/mark/2007/05/daily_db_cleaning.html</guid>
         <category>Movable Type</category>
         <pubDate>Sun, 20 May 2007 12:44:05 -0500</pubDate>
      </item>
            <item>
         <title>Akismet</title>
         <description><![CDATA[<p><a href="http://akismet.com/">Akismet</a> rocks: zero comment spam today. MT-Akismet was a snap to install. Be sure that your blog's Junk Score Threshold is 1 or less (MT default is 0), or else everything will get junked.</p>]]></description>
         <link>http://huangfamily.com/mark/2006/11/akismet.html</link>
         <guid>http://huangfamily.com/mark/2006/11/akismet.html</guid>
         <category>Movable Type</category>
         <pubDate>Sat, 04 Nov 2006 22:39:54 -0500</pubDate>
      </item>
            <item>
         <title>64-bit xine-lib fix</title>
         <description><![CDATA[<p>amaroK built for x86_64 was randomly crashing on me when playing AAC files through xine-lib. I tracked down the problem to an implicit pointer conversion in xine-lib (moral: always declare your functions), but someone had already found it:</p>

<ul>
<li><a href="http://sourceforge.net/mailarchive/message.php?msg_id=15405006">[dannf@debian.org: Bug#360003: xine-lib: implicit pointer conversion]</a></li>
</ul>

<p>xine-lib built from CVS should work.</p>]]></description>
         <link>http://huangfamily.com/mark/2006/07/64bit_xinelib_fix.html</link>
         <guid>http://huangfamily.com/mark/2006/07/64bit_xinelib_fix.html</guid>
         <category>Software</category>
         <pubDate>Tue, 11 Jul 2006 23:13:39 -0500</pubDate>
      </item>
            <item>
         <title>CGI.pm bug</title>
         <description><![CDATA[<p>I've tried this on a number of Movable Type 3.2 blogs out there, and it doesn't happen on theirs. On mine, though, when you pass a badly formed query string to a CGI script, such as the query string in the <b>Select a Design using StyleCatcher</b> link at the bottom of the <b>Templates</b> page, <tt>extlib/CGI.pm</tt> croaks with the error:</p>

<pre>
Use of uninitialized value in hash element at /mt/extlib/CGI.pm line 554
</pre>

<p>That's if you're lucky and you're debugging with <a href="http://www.pair.com/support/knowledge_base/authoring_development/system_cgi_cgiwrap.html">cgiwrapd</a>.  Otherwise you will probably just get an <a href="http://www.w3.org/Protocols/HTTP/HTRESP.html">HTTP 500 Internal Server Error</a>.</p>

<p>The offending link looks like this:</p>

<pre>
/mt/plugins/StyleCatcher/stylecatcher.cgi?;from=list_templates;blog_id=1
</pre>

<p>But I was also able to make <tt>mt.cgi</tt> die by just passing it</p>

<pre>
/mt/mt.cgi?;foo
</pre>
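
<p>To see why a leading <tt>;</tt> is trouble, here is a small Python sketch of the same kind of parsing. It only approximates what <tt>CGI.pm</tt> does, assuming the query string is split on <tt>;</tt> and <tt>&</tt> and each piece is then split on the first <tt>=</tt>; the function name is made up for the example. The leading separator produces an empty first piece, which has no parameter name at all.</p>

```python
import re


def parse_query(query):
    """Split a query string into (name, value) pairs, skipping empty pieces."""
    pairs = []
    for piece in re.split("[;&]", query):
        if piece == "":
            # An empty piece has no name to use as a key, so skip it.
            continue
        name, _, value = piece.partition("=")
        pairs.append((name, value))
    return pairs


# The leading ";" yields an empty first piece before "foo".
pieces = re.split("[;&]", ";foo")
```

<p>Here <tt>pieces</tt> is <tt>['', 'foo']</tt>, and <tt>parse_query(";foo")</tt> returns <tt>[('foo', '')]</tt> because the empty piece is skipped.</p>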

<p>Here's a patch which fixes the problem for me.</p>

<pre>
--- MT-3.2-en_US/extlib/CGI.pm  Wed Oct 12 08:55:56 2005
+++ mt/extlib/CGI.pm    Tue Jul 11 19:18:56 2006
@@ -546,6 +546,7 @@
     my($param,$value);
     foreach (@pairs) {
        ($param,$value) = split('=',$_,2);
+       next if not defined $param;
        next if $NO_UNDEF_PARAMS and not defined $value;
        $value = '' unless defined $value;
        $param = unescape($param);
</pre>]]></description>
         <link>http://huangfamily.com/mark/2006/07/cgipm_bug.html</link>
         <guid>http://huangfamily.com/mark/2006/07/cgipm_bug.html</guid>
         <category>Movable Type</category>
         <pubDate>Tue, 11 Jul 2006 22:55:48 -0500</pubDate>
      </item>
            <item>
         <title>MTEntryExcerpt</title>
         <description><![CDATA[The Movable Type <a href="http://www.sixapart.com/movabletype/docs/3.2/a_template_tag_reference/entry/mtentryexcerpt.html">MTEntryExcerpt</a> tag is annoying in that it adds an ellipsis (...) even if the entry body is short enough that it doesn't need to be truncated. Here's a patch that fixes this behavior.

<pre>
--- MT-3.2-en_US/lib/MT/Template/ContextHandlers.pm     Tue Oct  4 17:01:01 2005
+++ mt/lib/MT/Template/ContextHandlers.pm       Sun Feb 12 00:14:12 2006
@@ -1045,7 +1045,11 @@
     $words = 40 unless defined $words && $words ne '';
     my $excerpt = _hdlr_entry_body($ctx, { words => $words, %$args });
     return '' unless $excerpt;
-    $excerpt . '...';
+    my $body = _hdlr_entry_body($ctx, { words => ($words + 1), %$args});
+    if ($excerpt ne $body) {
+       $excerpt .= '...';
+    }
+    $excerpt;
 }
 sub _hdlr_entry_keywords {
     my $e = $_[0]->stash('entry')
</pre>]]></description>
         <link>http://huangfamily.com/mark/2006/07/mtentryexcerpt.html</link>
         <guid>http://huangfamily.com/mark/2006/07/mtentryexcerpt.html</guid>
         <category>Movable Type</category>
         <pubDate>Tue, 11 Jul 2006 22:20:59 -0500</pubDate>
      </item>
            <item>
         <title>pair Networks</title>
         <description><![CDATA[<p>Maitreya and I host several domains and <a href="http://www.sixapart.com/movabletype/">Movable Type</a> blogs with a single <a href="http://www.pair.com/services/web_hosting/advanced.html">Advanced Account</a> at <a href="http://www.pair.com/">pair Networks</a>. pair is a little more expensive than their competitors, like <a href="http://www.dreamhost.com/">DreamHost</a>, but I like them because they're based in Pittsburgh, don't oversubscribe, never have downtime, and provide easy-to-use configuration tools. They've also never implemented stupid policies like counting your CPU minutes. They've been in business a long time and understand the business of shared hosting.</p>

<p>Here's a trick I use to host several different websites on <a href="http://huangfamily.com/">huangfamily.com</a> without having to pay for additional virtual domains each month. I had to pay one-time vanity domain charges for each site, but I use the magic of <a href="http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html">mod_rewrite</a> in an <tt>.htaccess</tt> file to provide the illusion that each site is hosted separately.</p>

<pre>
RewriteEngine on

RewriteCond %{SERVER_NAME} .*craftlog\.org$
RewriteRule ^index.html$ craftlog/ [L]

RewriteCond %{SERVER_NAME} .*princetonjudo\.(org|net)$
RewriteRule ^index.html$ princetonjudo/ [L]

RewriteCond %{SERVER_NAME} .*oomny\.net$
RewriteCond %{REQUEST_URI} !^/oomny/.*$
RewriteRule ^(.*)$ oomny/$1 [L]
</pre>
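
<p>The effect of the rules above can be sketched in Python. This is a simplified, single-pass model rather than what Apache really executes: the function name is made up, and <tt>path</tt> is assumed to be the request path relative to the document root with no leading slash, which is how <tt>.htaccess</tt> rules see it.</p>

```python
import re


def rewrite(server_name, path):
    """Return the path a request is rewritten to under the rules above."""
    # Front page of craftlog.org goes to its subdirectory.
    if re.search(r".*craftlog\.org$", server_name) and re.search(r"^index.html$", path):
        return "craftlog/"
    # Same for princetonjudo.org and princetonjudo.net.
    if re.search(r".*princetonjudo\.(org|net)$", server_name) and re.search(r"^index.html$", path):
        return "princetonjudo/"
    # Every oomny.net request gets the oomny/ prefix unless it already has it.
    request_uri = "/" + path
    if re.search(r".*oomny\.net$", server_name) and not re.search(r"^/oomny/.*$", request_uri):
        return "oomny/" + path
    # No rule matched: the path is left alone.
    return path
```

<p>For example, <tt>rewrite("www.oomny.net", "art/pic.html")</tt> gives <tt>"oomny/art/pic.html"</tt>, and a path that already starts with <tt>oomny/</tt> is returned unchanged, which is what keeps the last rule from looping.</p>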

<p><a href="http://craftlog.org/">craftlog.org</a> and <a href="http://princetonjudo.org/">princetonjudo.org</a> are both Movable Type blogs stored in subdirectories of the <a href="http://huangfamily.com/">huangfamily.com</a> document root. Requests for the front pages of each site are rewritten to requests for their subdirectories. <a href="http://oomny.net/">oomny.net</a> is Ariel's website hand-coded in WordPad. Requests to Ariel's site are silently prepended with <tt>oomny/</tt>. I gave her full FTP access to the <tt>oomny/</tt> subdirectory and just told her to code as if it were the document root of her server.</p>]]></description>
         <link>http://huangfamily.com/mark/2006/07/pair_networks.html</link>
         <guid>http://huangfamily.com/mark/2006/07/pair_networks.html</guid>
         <category>Pair</category>
         <pubDate>Tue, 11 Jul 2006 19:54:17 -0500</pubDate>
      </item>
            <item>
         <title>iTunes XML to M3U Converter</title>
         <description><![CDATA[<p>I couldn't get any of the various scripts floating out there to work.</p>

<ul>
<li><a href="http://homepage.mac.com/beryrinaldo/AudioTron/Export_Playlist_to_M3U.html">Export iTunes Playlist to M3U</a></li>
<li><a href="http://homepage.mac.com/beryrinaldo/AudioTron/Export_Playlist_to_M3U/archos.html">Using Export Playlist to M3U with an Archos Jukebox</a></li>
<li><a href="http://chimpen.com/things/archives/001403.php">things: iTunes2m3u update</a></li>
<li><a href="http://www.xml.com/pub/a/2004/11/03/itunes.html?page=2">XML.com: Hacking iTunes</a></li>
</ul>

<p>Specifically, I couldn't get any of them to convert my music collection to M3U for import into <a href="http://amarok.kde.org/">amaroK</a>. Either the script didn't work, ran only on a Mac, didn't convert 8-bit filenames correctly, couldn't handle UTF-8 encoded track names, or all of the above. So I wrote my own Python script to do it.</p>

<p>My music resides on an NTFS partition that is obviously not mounted as <tt>C:/My Documents/My Music/iTunes/iTunes Music</tt> on my Linux partition. The script takes an option for automatically replacing the path to the Music Folder with an alternate path, e.g., <tt>/mnt/Music/</tt> or something similar. It takes an iTunes XML file as input and writes the playlists to <tt>.m3u</tt> files.</p>
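
<p>The core idea can be sketched in a few lines of modern Python. This is not the script linked below, just a minimal illustration under some assumptions: the library file is a property list with <tt>Music Folder</tt>, <tt>Tracks</tt> and <tt>Playlists</tt> keys, track locations are percent-encoded UTF-8 URLs, and the function names are made up for the example.</p>

```python
import plistlib
from urllib.parse import unquote


def load_library(path):
    """Parse an iTunes library XML file into a dict."""
    with open(path, "rb") as fp:
        return plistlib.load(fp)


def playlists_to_m3u(library, new_root):
    """Return {playlist name: M3U text} for a parsed library dict.

    new_root replaces the library's "Music Folder" prefix in each
    track location, for example "/mnt/Music/".
    """
    music_folder = library["Music Folder"]
    tracks = library["Tracks"]
    result = {}
    for playlist in library.get("Playlists", []):
        lines = ["#EXTM3U"]
        for item in playlist.get("Playlist Items", []):
            # Tracks are keyed by the track ID as a string.
            track = tracks.get(str(item["Track ID"]))
            if track is None or "Location" not in track:
                continue
            location = track["Location"]
            if location.startswith(music_folder):
                location = new_root + location[len(music_folder):]
            # Decode percent-escapes so the entry is a plain path.
            lines.append(unquote(location))
        result[playlist["Name"]] = "\n".join(lines) + "\n"
    return result
```

<p>Each value in the returned dict could then be written to a file named after the playlist with an <tt>.m3u</tt> extension, opened with UTF-8 encoding.</p>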

<ul>
<li><a href="/mark/software/itunes2m3u.py">itunes2m3u.py</a></li>
</ul>]]></description>
         <link>http://huangfamily.com/mark/2006/07/itunes_xml_to_m3u_converter.html</link>
         <guid>http://huangfamily.com/mark/2006/07/itunes_xml_to_m3u_converter.html</guid>
         <category>Software</category>
         <pubDate>Wed, 05 Jul 2006 01:26:05 -0500</pubDate>
      </item>
      
   </channel>
</rss>
