Bryan's thoughts on web design and development

How to auto-remove old files/directories from linux

I usually just keep old backups, perhaps for some undefined historical purposes, but more likely just out of laziness. But clearly, this method (or lack thereof) can eat up disk space rather quickly.

I recently created a simple bash script that will find and remove all files or directories that are older than a specified number of days (as determined by modification dates). This helps keep Xobni’s backup directories small without having to manually go in there every few months, which is even more lazy-friendly. Awesome.

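#!/bin/bash
# removes old directories to clear space
# 7 = number of days to keep directories/files
# -type d matches directories only; drop it to match files as well
# -mindepth 1 keeps find from removing the backup directory itself
# -prune stops find from descending into a directory it is about to delete

find /var/backup/ -mindepth 1 -type d -mtime +7 -prune -exec rm -rf {} \;
find /var/backup/something -mindepth 1 -type d -mtime +7 -prune -exec rm -rf {} \;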

I have this script running via cron every week and it works like a charm.
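
Before trusting a cleanup like this to cron, it can help to see what it would delete. Here is a minimal sketch of that idea, not the script I actually run: the function name prune_old and the DRY_RUN variable are made up for illustration, and unlike the script above it only looks at top-level entries (files and directories alike) under the given directory.

```shell
#!/bin/bash
# Sketch: list (or delete) top-level entries under a directory that are
# older than a given number of days. prune_old and DRY_RUN are
# hypothetical names, not part of the original script.
prune_old() {
    local dir=$1 days=$2
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run (the default): only print what would be removed
        find "$dir" -mindepth 1 -maxdepth 1 -mtime +"$days" -print
    else
        # Real run: remove the matching entries
        find "$dir" -mindepth 1 -maxdepth 1 -mtime +"$days" -exec rm -rf -- {} +
    fi
}
```

Calling `prune_old /var/backup 7` prints the candidates; `DRY_RUN=0 prune_old /var/backup 7` actually removes them.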

Copy tables from one mysql database to another


At Xobni, I’ve developed a simple testing and deployment platform that helps us ensure that new web code is good before it’s pushed out the door. Part of this means maintaining two separate databases, one for development and one for live mode (Ruby on Rails developers will find this concept familiar). That way, we can test our database-enabled pages without adversely affecting the live database.

But this means the development database gets stale over time and must be refreshed. Ideally it should contain data from the day before or some reasonable approximation. In the past, I relied on the “copy-database” command in phpMyAdmin to do this. But being a manual process, and being fairly intensive on the server to boot, it rarely got done.

So I set out to create a PHP script that would mirror the two databases on a regular schedule: essentially, copy one database’s tables to another, erasing the old data in the development database.

It’s not rocket science, as it turns out. Here it goes:

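// mysqli replaces the old mysql_* functions, which were removed in PHP 7
// connection details are placeholders; fill in your own
$db = new mysqli("localhost", "user", "password");

$todb = "development";
$fromdb = "live";

// list the tables in the source database
$result = $db->query("SHOW TABLES FROM `$fromdb`");
while ($row = $result->fetch_row()) {
    $table = $row[0];
    // quote the database and table names separately
    $db->query("DROP TABLE IF EXISTS `$todb`.`$table`");
    $db->query("CREATE TABLE `$todb`.`$table` LIKE `$fromdb`.`$table`");
    $db->query("INSERT INTO `$todb`.`$table` SELECT * FROM `$fromdb`.`$table`");
}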

The script just loops through each table in your $fromdb and copies it to new tables in the $todb. I have it set to run every 24 hours via cron.

The LIKE syntax is fairly new in MySQL (as of 4.1) and permits a full copy of a table’s structure, including its indexes (foreign key constraints are not carried over). The rest is all pretty standard stuff.

Be sure to fill in the appropriate mysql connection strings, and you’re good to go. And do a dry run or two before you run back crying that your deployment database got erased and you don’t have any backups.
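
One way to do that dry run is to print the SQL instead of executing it. The sketch below is only an illustration: copy_table_sql is a hypothetical helper, and the table name users is made up. It emits the same three statements the loop runs for a single table so you can read them over first.

```shell
#!/bin/bash
# Sketch: print the SQL the copy would run for one table, without
# executing anything. copy_table_sql is a hypothetical helper name.
copy_table_sql() {
    local todb=$1 fromdb=$2 table=$3
    printf 'DROP TABLE IF EXISTS `%s`.`%s`;\n' "$todb" "$table"
    printf 'CREATE TABLE `%s`.`%s` LIKE `%s`.`%s`;\n' "$todb" "$table" "$fromdb" "$table"
    printf 'INSERT INTO `%s`.`%s` SELECT * FROM `%s`.`%s`;\n' "$todb" "$table" "$fromdb" "$table"
}

# Example with a made-up table name
copy_table_sql development live users
```

If the output looks right, double-check that the first argument really is the development database before running anything for real.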

I hope this was helpful.

Windows Tools: TortoiseSVN

If you’re on a windows machine and you use SVN, I’ll bet you already know about TortoiseSVN. If not, it’s seriously the most important application in my arsenal. It makes svn management a snap in windows, where before I had to adopt a whole clunky editor (Eclipse) to do the job.

If you don’t know the magic of TortoiseSVN yet, try it!

Windows Tools: 7-Zip

There are a lot of Windows unzip utilities out there, but 7-zip outshines the rest by far. The shell commands (right click a file to zip or extract), the multitude of supported formats, and the speed of extraction make it a star. And it definitely beats out XP’s standard unzipper for UI.

Deploy PHP like a Starfleet Commander


Ruby on Rails brought with it a lot of good practices in regards to code structure and maintenance. At least for me, my experience with RoR two years ago was the first time I’d used subversion and deployed my code (via capistrano).

Even though I use PHP on my day-to-day, there’s no reason why PHP can’t be deployed like RoR. So let’s learn a thing or two from the RoR community and create a deploy script for PHP.

What does it mean to ‘deploy’ your code? Deployment is more than just a fancy name for syncing your svn repository with your web root. Deploying your code lets you:

  • exclude certain files (e.g. those pesky .svn folders),
  • put up a ‘maintenance’ page while your large upload is running,
  • set permissions automagically,
  • leave only ssh (and www) access open on your live server,
  • deploy to multiple locations at once across servers (for testing, clustering, etc),
  • and perhaps even compress your code on the fly.

Besides, saying ‘deploy’ every day kind of makes me feel like the commander of a starship battle group!

Unfortunately, deployment options for PHP seem pretty meager, so I figured I’d give you a peek into the custom way we do it, in case it’s helpful to you in your work.

I’m assuming you’re already using subversion. If you’re not, you should. Even if you’re on windows, using TortoiseSVN makes using svn as easy as pie. Its long-term benefits far outweigh the short setup time. Just use it.

I also assume you are running a dedicated server. It will be hard to implement this on a shared server due to the probable lack of rsync.

Here at Xobni, our process is pretty simple:

  • Webdevs test on their local machines using a custom VMware virtual machine running Debian, and do a subversion checkin when everything looks good.
  • The deploy script is called on our intranet server with a (-t) switch, which svn-ups the deploy directory there, and then rsyncs it to a test directory on our primary www box:
    cd /var/www/deploy/scripts/
    ./deploy.sh -t

  • If all looks dandy on our test url, we push it live by running the deploy script again with the (-d) switch which does the same thing as before, but the rsync is to the root www directory. And whammo! we’re live.
  • If something goes wrong, we can always roll back with the -r switch. But I’ve never needed it. No, not me.

The deploy script

#!/bin/bash

# folders in wwwroot you want to include in deploy
sites=( domain.com common scripts )

# check for required root or sudo access
if [ "$USER" != "root" ]; then
    echo "Please run as root"
    exit 1
fi

Make sure to replace the elements in the sites array with the various folders you want to deploy to your www root. Also, the script checks for the root access it will need to svn up and such. Alternatively, you could just run it as a privileged user.

Now we need to check to make sure our deploy script has not been updated. Since the deploy script is itself in the svn repository, we want to make sure we’re using the very latest.

# check for updated version of this script
# (no sudo needed here: we already checked that we are root)
MODDATEF=$(stat -c %Y /var/www/deploy/scripts/deploy.sh)
svn up /var/www/deploy/scripts
MODDATEL=$(stat -c %Y /var/www/deploy/scripts/deploy.sh)
chmod -R 755 /var/www/deploy/scripts
if [ "$MODDATEF" != "$MODDATEL" ]; then
    echo "Deploy script updated. Please run again."
    exit 1
fi
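
The check above compares modification times. A variation, sketched here under the assumption that you would rather compare file contents, is to take a checksum before and after the update. The helper name changed_by is hypothetical and is not part of our deploy script.

```shell
#!/bin/bash
# Sketch: report whether running a command changed a file, by comparing
# checksums before and after. changed_by is a hypothetical helper.
changed_by() {
    local file=$1
    shift
    local before after
    before=$(cksum < "$file")
    "$@" >/dev/null 2>&1        # run the command (e.g. svn up)
    after=$(cksum < "$file")
    [ "$before" != "$after" ]   # succeeds only if the contents changed
}

# Possible use in the deploy script:
# if changed_by /var/www/deploy/scripts/deploy.sh svn up /var/www/deploy/scripts; then
#     echo "Deploy script updated. Please run again."
#     exit 1
# fi
```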

Next we need to get the parameters from the script call to see what the user wants us to do, and then run an svn up:

TEST=0
# GET PARAMETERS
while getopts ":dtr:" optname
do
    case "$optname" in
        "d")
            echo "*** Will DEPLOY! ***"
            DEPLOY=1
            ;;
        "r")
            REVISION=$OPTARG
            ;;
        "t")
            TEST=1
            DEPLOY=1
            echo "TEST deploy"
            ;;
        *)
            echo "error: unexpected parameter"
            exit 1
            ;;
    esac
done

# inform the user if we are rolling back
# (-z is true when REVISION is empty, i.e. no -r was given)
if [ -z "$REVISION" ]; then
    echo "Not rolling back, -r NUM to rollback"
else
    echo "ROLLBACK to rev num $REVISION"
fi

# SVN UP (or down) SITES
for site in "${sites[@]}"
do
    if [ -n "$REVISION" ]; then
        #echo "Rolling back to rev num $REVISION"
        svn up "/var/www/deploy/$site" -r "$REVISION"
    else
        #echo "svn to latest revision. run with -r NUM to rollback"
        svn up "/var/www/deploy/$site"
    fi
done
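
If you want to try the option handling without touching svn, the parsing can be pulled into a function and poked at from a shell. This is a sketch only: parse_deploy_args is a hypothetical name, and it returns an error rather than exiting so it is safe to experiment with.

```shell
#!/bin/bash
# Sketch: the deploy script's option parsing wrapped in a function.
# parse_deploy_args is a hypothetical name. Sets TEST, DEPLOY, REVISION.
parse_deploy_args() {
    TEST=0
    DEPLOY=""
    REVISION=""
    OPTIND=1   # reset so the function can be called more than once
    local optname
    while getopts ":dtr:" optname; do
        case "$optname" in
            d) DEPLOY=1 ;;
            t) TEST=1; DEPLOY=1 ;;
            r) REVISION=$OPTARG ;;
            *) echo "error: unexpected parameter" >&2; return 1 ;;
        esac
    done
}

parse_deploy_args -d -r 42
echo "TEST=$TEST DEPLOY=$DEPLOY REVISION=$REVISION"
```

The leading colon in ":dtr:" puts getopts in silent mode, so both an unknown switch and a -r with no number fall through to the catch-all branch.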

Now that we’ve svn updated our root, we’re ready to deploy if the right parameter has been specified (-t or -d). Note that we set up passwordless ssh between our servers, which I recommend if you want a single-command deploy.

deploysite="/var/wwwroot"
predeployloc="/var/www/deploy"
# rsync options to use
deployoptions="--compress --delete --progress --exclude .svn --archive -e 'ssh -p 22'"
# note: deploys to test dir always, and live dir only if deploy

if [ -n "$DEPLOY" ]; then
    for site in "${sites[@]}"
    do
        # WWW
        su - privledgeduser -c "rsync $deployoptions $predeployloc/$site privledgeduser@265.265.265.265:${deploysite}-test"
        echo "TEST - WWW deploy completed ($site)"
        if [ "$TEST" != 1 ]; then
            su - privledgeduser -c "rsync $deployoptions $predeployloc/$site privledgeduser@265.265.265.265:$deploysite"
            echo "LIVE - WWW deploy completed! ($site)"
        fi
    done
    echo "Completed!"
else
    echo "Did not deploy: run with -d to deploy"
fi

Be sure to change ‘privledgeduser’ to the passwordless user, and 265.265.265.265 to your remote server’s ip address.
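
Because the options include --delete, it is worth previewing a sync before pointing it at a live directory. rsync has a --dry-run flag for exactly that. The sketch below just builds and prints such a command for one site; build_rsync_cmd, deployuser and example.com are hypothetical placeholders, not values from our setup.

```shell
#!/bin/bash
# Sketch: build the rsync command for one site so it can be reviewed
# first. --dry-run makes rsync report what it would do without copying
# or deleting anything. build_rsync_cmd, deployuser and example.com are
# hypothetical placeholders.
build_rsync_cmd() {
    local site=$1 target=$2
    local options="--compress --delete --archive --exclude .svn --dry-run"
    echo "rsync $options /var/www/deploy/$site deployuser@example.com:$target"
}

build_rsync_cmd domain.com /var/wwwroot-test
```

Once the printed command looks right, drop --dry-run to do the real transfer.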

Now, when you run deploy.sh, with one line you’ll be taking care of syncing your site to the very latest in your svn repository!

Ok, so maybe commanding ships into battle would be more fun, but you have to admit it’s a thrill when your infrastructure works so seamlessly. One of these days I’m going to convince the guys to install a flashing red light above my desk that I can activate when I deploy. To the stars we go!

Track spam with your gmail account

As you know, gmail has an awesome spam filter. It’s the reason I started using it for personal email in the first place.

They also have a cool feature that is not at all well documented, which can help you find out where your spam is coming from. The premise is simple - you can attach ANY string to your email address with a plus sign like so:

yourname+spammersitecom@gmail.com

The email will still make it to your gmail account, and you’ll be able to see that string.
If you follow this format when signing up on web sites, you can track who’s selling your information and essentially arrive at the source of the spam. Since most spam results from your address spreading from one source to many, tracking down that source is what matters. If they’re scrupulous, you can just ask to be removed.
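
If you do this a lot, a couple of tiny shell helpers can build the tagged address and pull the tag back out of one. This is a sketch with hypothetical function names; stripping the tag down to letters and digits is my own assumption to keep it safe for picky signup forms, not a Gmail requirement.

```shell
#!/bin/bash
# Sketch: build a tagged Gmail address for a site, and recover the tag
# from an address. Both function names are hypothetical.
tagged_address() {
    local user=$1 site=$2
    local tag
    # Keep only letters and digits (an assumption, to suit strict forms)
    tag=$(printf '%s' "$site" | tr -cd '[:alnum:]')
    printf '%s+%s@gmail.com\n' "$user" "$tag"
}

address_tag() {
    local addr=$1
    local local_part=${addr%%@*}                  # text before the @
    case "$local_part" in
        *+*) printf '%s\n' "${local_part#*+}" ;;  # text after the first +
        *)   printf '\n' ;;                       # no tag present
    esac
}

tagged_address yourname spammersite.com   # yourname+spammersitecom@gmail.com
```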

In one instance, by doing this, I found out that a web backup service I signed up for had shared my email with some advertisers, and I was able to opt out at the source before it went out any further.

You might try this on web sites that you post your email to as well - that’s where most of my spam comes from, and it’s enlightening to see which of my sites draws in the most spambots. It helps me decide where I should implement web forms instead of just listing my email address.

This is very similar to an idea I launched a few years back called spamtree.com (I’ve since taken the domain down). It was pretty basic: you’d sign up for a flexible email address (e.g. bryan-applecom@spamtree.com), and if they ever emailed you or sold your name, you could track that on a graph (and set up permissions that, for example, only allowed them to email you once). Presumably, the more your email was passed around, the more the graph would look like a tree with branches and nodes, hence the name.

It never caught on, probably because it was so nerdy (like me!), but I’m glad to see that gmail has introduced the backbone of this concept. I don’t even know if they *meant* it for tracking spam - but that’s what’s so fun about unadvertised features: no one is forcing you into a particular frame of mind, so you can be inventive in their use.

Saving your Wordpress posts from tyranny (or, How to Disable the Wordpress WYSIWYG editor)

Even in today’s modern version 2.3 era, Wordpress’s WYSIWYG editor is still the evil, overbearing dictator it always was: “You couldn’t possibly have wanted to put that tag there - I’ll go ahead and remove it for you!” How helpfully MS-Word-smart-paste-paper-clip of you!

Thankfully, it’s easier to remove this most egregious of components than a real dictator! Just follow these easy steps:
1) Go to ‘Users’
2) Click ‘edit’ next to your account (you’ll have to do this for every user who doesn’t want WYSIWYG)
3) Uncheck ‘Use the visual editor when writing’
4) Revel in the newfound power!

A better programmer’s font

For years I’ve used Courier New or whatever font my IDE chose for me. On a whim, I decided to search for a better programmer’s font: one that’s not only easier on the eyes, but also more compact and efficient. I think I’ve found it: ProggyFonts

The Proggy font collection is composed of several free monospaced fonts designed for programmers, and emphasizes what hackers really want: easy readability and concise presentation. It’s so much better than what I was using this morning that I can’t believe I lived with Courier for so long.

I know it’s probably a sign of deep rooted psychological instability to get this excited over a font, but I can’t help it! Yay!

IMAP + Gmail + Outlook + Xobni = Bliss

In case you haven’t heard, Google has finally released IMAP support for Gmail. This means that you can access Gmail from Outlook (or any email client for that matter). The clear implication here is that I can use Xobni on my gmail inbox, finally!

Setting up Gmail on Outlook

Memory Leak in Google Desktop

The last few days I’ve been noticing my system lagging pretty badly at the end of the day.

On a whim, I took a look at mem usage in the task manager, and noticed the largest memory hog was not firefox as is the custom, but google desktop…

As I watched, google gobbled up memory at a rate of over 50K per second!

Auuga! I only have a few extensions installed - time, google calendar, weather (22C in SF, no way!), and my skype list. Indexing disabled. But considering it starts up with 7.5 MB, that’s a pretty huge increase. What’s up, Google? Didn’t have enough memory in the ol’ datacenter and had to use mine?
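
To put that rate in perspective, here is a back-of-the-envelope sketch. It assumes “50K” means 50 KB and that the leak stays constant, neither of which I measured carefully; leak_mb is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: project memory growth from a steady leak rate.
# Assumes "50K" means 50 KB and that the rate stays constant.
leak_mb() {
    local kb_per_sec=$1 hours=$2
    # KB per second -> KB over the period -> MB (integer division)
    echo $(( kb_per_sec * 3600 * hours / 1024 ))
}

leak_mb 50 1   # prints 175 (MB after one hour)
leak_mb 50 8   # prints 1406 (MB after an eight-hour day)
```

That would go a long way toward explaining the end-of-day lag.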

UPDATE: It appears to be related to the Skype Widget. After removing it, Google Desktop is no longer hogging all the memory. It’s a difficult position to be in for Google - they are putting their good name on the line with programmers that may or may not be very good. I wonder if OpenSocial will suffer the same sorts of issues.