OpenGeo Suite 3.0 on a micro AWS

The Problem: I want to run the latest 3.0 version of OpenGeo Suite on a free (or really cheap) micro instance on Amazon Web Services

OpenGeo announced the release of version 3 of the OpenGeo Suite on Monday (Oct. 3). I’ve been using the 3.0-beta1 Linux version since it was announced on July 27. There are some interesting improvements to the Suite, which is one reason I made the jump before the final release came out. It now includes PostgreSQL 9.2 and PostGIS 2.0, both of which I wanted to look into.

I had been using previous versions of OpenGeo Suite on a micro instance AWS Ubuntu server. This configuration was obviously not optimal. Redraws in GeoExplorer were slow, and I could tell the system was struggling at times. CPU usage went up to 100% quite often, but it did work. Performance was acceptable enough for the kind of experimenting and testing I wanted to do.

With the 3.0 upgrade, however, something pushed it over the edge. Everything installed OK. I was able to upload my usual test data, and get a website with a web-map up and running. However, it would not last. It just wasn’t as stable as previous versions. Zooming and panning the map would crash the Tomcat servlet within minutes. Even just letting it run with no interaction would lead to a crash within a few hours.

A few pointers from the folks at OpenGeo, and some investigation of the logs, led me to believe it was a memory issue. AWS Micro instances only have 613MB of memory.

The Answer: Add a swap file to overcome the memory limitations of a micro instance

AWS micro instance Ubuntu servers do not come set up with any swap space. Fortunately, it’s fairly easy to add a swap file to the server, and use that as your swap space. Here are the steps:

1. Create a storage file (Adjust the “count=” line to your liking. This example will make a 1 GB swap file)

sudo dd if=/dev/zero of=/swapfile bs=1024 count=1048576

2. Turn this new file into a swap area

sudo mkswap /swapfile

3. Set the file permissions appropriately

sudo chown root:root /swapfile

sudo chmod 0600 /swapfile

4. Activate the swap area every time the system reboots by adding the following line to the end of the “/etc/fstab” file:
(use your text editor of choice. vi works for me.)

/swapfile swap swap defaults 0 0

5. Reboot the server

6. Verify the swap file is activated

free -m
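The six steps above can be sketched as one small script. This is only a sketch: it prints the commands rather than running them, so nothing on the server changes until you copy them out. The function names and the size argument are my own illustration, and the final `swapon` line is an assumption on my part, a way to switch the swap file on without waiting for the reboot in step 5.

```shell
#!/bin/sh
# Sketch only: prints the swap-file commands instead of running them.
# swap_count_for_mb and print_swap_steps are illustrative names.

swap_count_for_mb() {
    # dd writes "count" blocks of "bs" bytes; with bs=1024, one MB is 1024 blocks.
    echo $(( $1 * 1024 ))
}

print_swap_steps() {
    size_mb=$1
    path=${2:-/swapfile}
    count=$(swap_count_for_mb "$size_mb")
    echo "sudo dd if=/dev/zero of=$path bs=1024 count=$count"
    echo "sudo mkswap $path"
    echo "sudo chown root:root $path"
    echo "sudo chmod 0600 $path"
    # Appends the fstab entry from step 4 so the swap file survives reboots.
    echo "echo '$path swap swap defaults 0 0' | sudo tee -a /etc/fstab"
    # Assumption: activates the swap file immediately, without a reboot.
    echo "sudo swapon $path"
}

# A 1 GB swap file, matching the example in step 1.
print_swap_steps 1024
```

Calling `print_swap_steps 512` instead would print the same steps for a 512 MB file (count=524288).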


I’ve had my OpenGeo Suite test box running 24/7 for nearly two months now, with nary a crash. And I can honestly say, it is surprisingly perky.

Resizing my Ubuntu Server AWS Boot Disk

AKA: Building a Bigger GeoSandbox

(Note: This article has been updated to make it clear that expanded EBS volumes will incur additional charges from AWS, something that is not clearly stated in the AWS documentation.)
If you’ve been reading my last few blog posts, you know I’ve been experimenting with various Ubuntu server configurations using Amazon Web Services (AWS) to serve web-maps and spatial data. As my procedures have evolved, the micro-instances I started working with have outgrown their usefulness. Lately, I’ve been testing GeoWebCache, and seeing how that works with GeoServer and the rest of the OpenGeo Suite. As anyone who’s ever delved into the map-tile world knows, tile caches start gobbling up disk space pretty quick once you start seeding layers at larger scales. I had to figure out a way to expand my storage space if I wanted to really test out GeoWebCache’s capabilities without bringing my server to its knees.
The Ubuntu AMIs I’ve been using all start out with an 8GB EBS volume as the root drive with an additional instance-store volume that can be used for “ephemeral” storage. That “ephemeral” storage means, whatever is in there is lost every time the instance is stopped. Supposedly, a reboot will not clear out the ephemeral storage, but a stop and then start, will. There are procedures you can set up to save whatever is in the ephemeral instance-store volume before you stop it, but I was looking for something a bit easier.
A medium instance AMI includes a 400GB instance-store volume, but it still starts out with the same 8GB root drive that a micro instance has. So, what to do? How do I expand that 8GB disk so I can save more data without losing it every time I stop the system?
A little searching led to a couple of articles that described what I wanted to do. As usual, though, I ran into a couple of glitches. So, for my future reference and in case it might be of some help to others, the following paragraphs describe my procedure.
The two articles this post was compiled/aggregated/paraphrased from are:

The standard “Out of the Box” Ubuntu AMI disk configuration

First, connect to the server using WinSCP, SecPanel, or some other means as described in one of my previous posts. Then open a terminal (or PuTTY) window, and enter:
df -h
You should see something like this:

The first line (/dev/xvda1) is the EBS root disk, and it shows 8.0 GB, with about 3.1 GB being used. The last line (/dev/xvdb) is the instance-store “ephemeral” space that’s wiped clean on every stop.

Note: The Ubuntu AMIs use “xvda1” and “xvdb” as device identifiers for the attached disks and storage space, while the AWS console uses “sda1” and “sdb”. In this case, “xvda1” equals “sda1”. Keep this in mind as you’re navigating back and forth between the two.
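A tiny helper makes the naming difference concrete. This is a sketch that assumes the simple "sd" to "xvd" prefix swap described in the note. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: translate an AWS console device name (sdX) into the name the
# Ubuntu instance reports (xvdX). The function name is illustrative.
console_to_instance_dev() {
    echo "$1" | sed 's|^/dev/sd|/dev/xvd|'
}

console_to_instance_dev /dev/sda1   # prints /dev/xvda1
console_to_instance_dev /dev/sdb    # prints /dev/xvdb
```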

Step One: Shut It Down

First, look in the AWS console, and make a note of what availability zone your server is running in. You will need to know this later on. The one I’m working on is in “us-east-1d”. Then, using the AWS console, stop the EC2 instance (do not terminate it, or you will wind up rebuilding your server from scratch). Then move to the “Volumes” window, choose the 8GB volume that’s attached to your server, and under the “More…” drop-down button, choose “Detach Volume”. It will take some time for the detach action to complete.

Step Two: Make A Copy

Next, with the same volume chosen, and using the same “More…” button, create a “Snapshot” of the volume. I recommend you give this (and all your volumes) descriptive names so they’re easier to keep track of.

Step Three: Make It Bigger

Once the snapshot is done processing, it will show up in the “Snapshot” window. Again, giving the snapshot a name tag helps tremendously with organization. Choose this snapshot, and then click on the “Create Volume” button.

In the Create Volume dialog, enter the size you want the new root disk to be. Here, I’ve entered 100 GB, but I could enter anything up to the nearly 400GB of storage space I have left in my Medium Instance. Also in this dialog, choose the availability zone to create the volume in. Remember earlier in this post when I said to note the availability zone your server is running in? This is where that little piece of information comes into play. You MUST use the same availability zone for this new, larger volume as your original server volume used. Click the “Yes, Create” button, and a new larger volume will be placed in your list of volumes.

Step Four: We Can Rebuild It

Next, attach the new larger EBS volume to the original Ubuntu server instance. Go back to the Volume window, choose the newly created larger volume, click the “More…” button, and choose “Attach Volume”.

In this dialog box, make sure the correct instance is showing in the “Instance” drop-down. In the “Device” text box, make sure the device is listed as it is shown above. It should be “/dev/sda1”. Note: This will not be the default when the dialog opens. You must change it!
Clicking on the “Yes, Attach” button will begin the attachment process, which will take a minute or two to complete. Once it’s done, you can spin up the server with the new root drive and test it out.

Step Five: Start It Up Again

Choose the server, and under “Instance Actions”, choose “Start”. Once started, connect to the server using your preferred client. Open a terminal or PuTTY window, and once again enter:
df -h
You should now see something like this:

Notice the differences from the first df command. Now the root disk (/dev/xvda1) will show a size of 99GB, or whatever size you might have made your new volume.
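Everything in steps one through five was done by clicking through the AWS console. For anyone who would rather script it, here is a rough sketch of the same sequence using the AWS command line tools, which this post does not otherwise use. It only prints the commands. The instance ID, volume ID, and the bracketed placeholders are hypothetical, and the snapshot and new volume IDs have to be copied from the output of the earlier commands.

```shell
#!/bin/sh
# Sketch only: prints an AWS CLI version of the console steps above.
# All IDs are hypothetical placeholders; nothing is sent to AWS.

print_resize_plan() {
    instance=$1
    old_volume=$2
    zone=$3
    size_gb=$4
    # Step One: stop (not terminate) the instance and detach the root volume.
    echo "aws ec2 stop-instances --instance-ids $instance"
    echo "aws ec2 detach-volume --volume-id $old_volume"
    # Step Two: snapshot the old root volume.
    echo "aws ec2 create-snapshot --volume-id $old_volume --description 'root disk before resize'"
    # Step Three: create a larger volume from the snapshot, in the SAME zone.
    echo "aws ec2 create-volume --snapshot-id <snapshot-id> --size $size_gb --availability-zone $zone"
    # Step Four: attach the new volume as the root device.
    echo "aws ec2 attach-volume --volume-id <new-volume-id> --instance-id $instance --device /dev/sda1"
    # Step Five: start the instance again.
    echo "aws ec2 start-instances --instance-ids $instance"
}

print_resize_plan i-EXAMPLE vol-EXAMPLE us-east-1d 100
```

Each step needs to finish before the next one starts (the detach and the snapshot both take a while), so in practice these would be run one at a time rather than pasted in as a batch.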

More Room To Play

Now I can adjust my root disk size to suit the task at hand. I can store more spatial data in my GeoServer data directory, and seed my map tiles down to ever larger scales. Knowing how to shuffle and adjust these volumes opens up a slew of other possibilities, too. I can imagine setting up a separate volume just to hold spatial data and/or tiles, and using that to copy or move that data back and forth between test servers.
Be mindful though, this extra space is not free. The larger EBS volume does not replace the space on the ephemeral instance-store volume, it is an addition to it. There will be additional charges to your AWS account for the larger EBS volume based on its size. This fact is not made clear in the AWS documentation. So, I recommend you increase the size of the EBS root disk as much as you need, but no more.
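To get a feel for what a bigger disk might cost, a quick back-of-the-envelope calculation helps. This sketch assumes a simple size-times-rate model. The 0.10 per GB-month figure is a made-up placeholder, not an actual AWS price, so look up the current rate for your region before trusting any number it prints.

```shell
#!/bin/sh
# Sketch: estimate a monthly EBS storage charge as size x rate.
# The rate passed in below is a hypothetical placeholder, not a real price.
ebs_monthly_cost() {
    # $1 = volume size in GB, $2 = price per GB-month
    LC_ALL=C awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", size * rate }'
}

ebs_monthly_cost 100 0.10   # the 100 GB volume from this post
ebs_monthly_cost 8 0.10     # the original 8 GB root disk
```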
Oh the possibilities…

Serving Maps – in the Cloud – for Free (part 3)

It was not my intention to make this a 3-part blog post series, but here it is anyway.
(If you want to catch up, you can read Part 1 and Part 2 first).

As I continued to work on, and tweak my new AWS Ubuntu server, I decided I might as well add website serving capabilities to it as well. That would allow me to embed my new web-maps into a customizable web page, allowing a more interactive experience, and a more professional appearance to anyone visiting them. The first step in that direction is to:

Install Apache Server

This is the easy part. Connect to the server with WinSCP/PuTTY or SecPanel/FileZilla as I explained in part 1, and enter this command:
sudo apt-get install lamp-server^
That’s it. Just follow the prompts, and enter a password when it asks. When it’s done installing, there will be a new directory called /var/www on the server. Just copy the server’s AWS Public DNS string into a web browser address bar and hit enter. You should see the famous Apache default index.html file:

Voilà. A real cloud based web server, just like the big boys.
Now, how do I connect to this one? It’s possible to use the same procedure as I did with OpenGeo/GeoServer. However, I really want to make things easier on the webmaster (aka, Me). I want to be able to use regular old FTP to access the website, which will allow me to use a wider variety of tools, like DreamWeaver (Yes, I said it. DreamWeaver) to edit and manage the website files.

Enable Password Authorization

The default setting for the AWS Ubuntu AMIs (and I believe, all AMIs) is to require key pairs for authenticating users. Password authentication is turned off. To turn it on, the /etc/ssh/sshd_config file has to be edited. The easiest way to do that, is to use VI. VI is scary. It runs in the terminal window. It has a black screen, with multi-colored text that makes the text look like code. I’m not going to try to teach anybody how to use VI because, well, I just learned how to use it yesterday myself, and I only know about 5 commands. However, if you want to follow along, I’ll outline the exact steps I took to edit the sshd_config file in order to allow users to login using passwords.

In the terminal or PuTTY window, open the sshd_config file by entering:
sudo vi /etc/ssh/sshd_config

  • Enter INSERT mode by typing a (Yes, that’s the lower case letter a)
  • Using the arrow keys on the keyboard, scroll down to the line that reads
    PasswordAuthentication no
  • Right arrow over to the end of the line and backspace twice to erase no
  • Type yes
  • press the escape key on the keyboard (ESC. This exits edit mode, and allows typing in commands)
  • Type :w and then enter (Yes, that’s a colon before the w. This saves the file)
  • Type :q and then enter (Again, a colon before the q. This exits VI)

That’s it. Passwords are allowed for login now. However, when I tried to apply a password to the default ubuntu user, it did not work. There might be a way around this, but I haven’t found one yet.
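If VI is still too scary, the same one-line change can be sketched with sed. This is not what I did above, just an alternative. The function name is made up, and the demo below runs against a throwaway copy rather than the real /etc/ssh/sshd_config.

```shell
#!/bin/sh
# Sketch: flip "PasswordAuthentication no" to "yes" without opening an editor.
# enable_password_auth is an illustrative name, not part of any tool.
enable_password_auth() {
    config=$1
    tmp=$(mktemp) || return 1
    # Rewrite only the exact "PasswordAuthentication no" line.
    if sed 's/^PasswordAuthentication[[:space:]]\{1,\}no[[:space:]]*$/PasswordAuthentication yes/' "$config" > "$tmp"; then
        # cat back into place so the original file keeps its owner and mode.
        cat "$tmp" > "$config"
    fi
    rm -f "$tmp"
}

# Demo on a temporary file standing in for sshd_config.
demo=$(mktemp)
printf 'Port 22\nPasswordAuthentication no\n' > "$demo"
enable_password_auth "$demo"
grep PasswordAuthentication "$demo"   # prints: PasswordAuthentication yes
rm -f "$demo"
```

On the real server the function would have to be run as root against /etc/ssh/sshd_config, and it is worth keeping a backup copy of that file first. The change takes effect once the SSH service restarts, which the reboot in the next section takes care of.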
What to do?

Add a New User

Back in the Terminal/PuTTY window, type:
sudo adduser NewUser
Where NewUser is whatever you want it to be. Enter a password, and fill in the other information if you want to. Everything but the password is optional. Restart Ubuntu, either by entering
sudo reboot
in terminal, or by using the AWS Management Console.
Now, that allows the NewUser to login using the AWS Public DNS string, and his/her password using regular old FTP (actually, SFTP on port 22 if you have the security settings set as in Part 1). In FileZilla:

NewUser can now add and delete folders, and move files back and forth in the /home/NewUser directory. But the whole purpose of adding this new user is to enable uploading and editing in the /var/www folder, where the website files are stored. So…

Give NewUser Access to the www Folder

To give NewUser access to the website’s root folder, enter this command in the PuTTY/Terminal window:
sudo chgrp NewUser /var/www
Then, to give NewUser the ability to add, delete, and edit folders and files in the website’s root folder, enter this command in the Terminal/PuTTY window:
sudo chmod 775 /var/www
CAVEAT: I am not a professional systems administrator. I have done a little bit of research into how the root folder of a website should be set up, and what level of access should be granted to various types of users. And I can tell you, there is no definitive answer. All I know is, these settings work for me. How you set your permissions for various users on your web server are completely up to you.
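Before settling on a number like 775, it helps to see what it actually grants. The following sketch decodes an octal mode into the familiar rwx string. The function name is my own, and it only looks at the last three digits, so any leading digit (the 0 in 0755, for example) is ignored.

```shell
#!/bin/sh
# Sketch: decode an octal permission mode (e.g. 775) into rwx notation.
# mode_to_rwx is an illustrative name; only the last three digits are used.
mode_to_rwx() {
    mode=$1
    digits=${mode#"${mode%???}"}   # keep the last three characters
    out=""
    for d in $(echo "$digits" | sed 's/./& /g'); do
        case $d in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
            *) echo "not an octal mode: $1" >&2; return 1 ;;
        esac
    done
    echo "$out"
}

mode_to_rwx 775    # prints rwxrwxr-x (owner and group can write, others read only)
mode_to_rwx 0755   # prints rwxr-xr-x (only the owner can write)
```

So with 775, anyone in the folder's group (NewUser, after the chgrp above) can add and delete files in /var/www, while everyone else can only read and list it.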

One Last Tip

Through this entire 3 part blog series, I’ve been using the AWS Public DNS string to access the AWS server, and that works just fine. However, it’s a bit cumbersome to continually open up the AWS console, copy the Public DNS, and paste it into a web browser. Plus, if you ever terminate a server and spin up a new instance, the Public DNS changes. So that means any links you’ve posted leading to it are now broken.
The answer? Elastic IP

The best thing about Elastic IPs is, they’re FREE. They’re also very easy to set up. Just click on the Elastic IPs link on the left side of the AWS Management Console (EC2 tab), and click the Allocate New Address button. Then Associate the new IP address to your server, and you’re good to go.
Now, what used to look like this:
Looks like this:
Just remember to Release the address if you ever disassociate it from your server. The Elastic IPs are free if you use them. If you don’t use them, Amazon charges you for them.

GeoSandbox – In the Cloud

So, after about 5 days of work, and 3 days of blogging (a record for me) I now have what I was after. A custom web map served from a cloud-based geo-web-server. You can check it out at:
Now I’ve got a real sandbox to play in.

Serving Maps – in the Cloud – for Free (part 2)

(Note: This is the second part of a 3-part blog post about setting up the OpenGeo Suite on an AWS Ubuntu server. Links to the other parts are at the bottom of this post)

Starting Fresh with a New AMI

At the end of my last post, I had my AWS Ubuntu-micro-server running smoothly, but the OpenGeo GeoExplorer was not very stable. It was crashing often, and for no apparent reason. I followed up with a few suggestions about data directory permissions, and swap-file space, but to no avail (Thank you @spara and @jeffbarr). I had been tweaking things quite a bit on that server, (The whole purpose of this exercise is to learn how things work, right?) so I decided to wipe the slate clean and start from scratch.
I began by looking for a different AMI. A bit of searching led me to the Ubuntu Cloud Portal – AMI Locator, which facilitates searching and filtering all of the Ubuntu AMIs available. At the bottom of the table, I chose “Zone: us-east-1”, and “Name: oneiric”.
I then clicked on the ami-a562a9cc link (a 32-bit EBS server), which then opened up the Request Instances Wizard that I talked about in the last post.
Following everything I outlined in part-1, I wound up with a shiny new Ubuntu server connected to my Windows machine through WinSCP and PuTTY.
In the PuTTY window, I entered the following commands to make sure the new server was up to date:
sudo apt-get update
sudo apt-get upgrade
Here’s a hint: The PuTTY window does not have any menus or toolbars, and control-v does not work for pasting text. If you copy the above commands, and then simply right-click in the PuTTY window, the commands will be pasted in. Hitting enter will then run them.

Install the OpenGeo Suite

Next up, is getting the OpenGeo Suite installed. I’ve described this process in other posts, but here it is in short form. Just remember to substitute <YourAWSPublicDNS> with your actual Public DNS string, which looks something like this:

  • In the PuTTY window (or terminal if you’re using some form of Linux), sudo to root:

sudo su

  • Then enter these commands. I’ve found they work best if they’re entered one at a time:

wget -qO- | apt-key add -
echo "deb lucid main" >> /etc/apt/sources.list
apt-get update
apt-cache search opengeo
apt-get install opengeo-suite

  • Back in the AWS Management Console, choose the server instance, go up to the “Instance Actions” button, and click Reboot
  • Once it’s finished rebooting, test the OpenGeo Suite
    • In a browser window, go to: http://<YourAWSPublicDNS>:8080/dashboard/
    • Launch GeoExplorer
    • Click the Login button on the right end of the toolbar.
      • Default Login credentials are User: admin, Password: geoserver
    • Make any changes to the map you want
    • Save the map (There is a save map button on the toolbar)
    • …and exit GeoExplorer

The map should now be publicly viewable at:
Here’s what mine looks like:
Now I have a real cloud-based web-map server up and running. But wait. There’s more. The next step to making this a truly useful map server is to add some custom data to it.

Upload some Data

Using WinSCP, I added a new folder under the /home/ubuntu directory.

  • Travel to the “/home/ubuntu” directory on the remote side
  • Right click > New > Directory…
  • Name the new folder, and make sure permissions are set to
    Owner: RWX, Group: R-X, and Other: R-X, (Octal: 0755), otherwise, upload and GeoServer access will not work


    • In the Local panel, I made my way to where I store GIS data on my workstation lappy. This particular folder holds all the shapefiles I plan on using with any of my OpenGeo Suite/GeoServer boxes, and they’re all in Web Mercator projection (EPSG: 3857).
    • Highlighting the files I want to upload on the Local side, I then drag and drop them into the new remote folder
    • Upload promptly ensues

Next up, is…

Loading this new data into GeoServer

  • Open up the OpenGeo Suite dashboard once more at: http://<YourAWSPublicDNS>:8080/dashboard/
  • Click on the GeoServer link, and Login

Loading data into GeoServer is another complicated process, so I won’t go into those details here. The process for importing data into a PostGIS database is well documented on the OpenGeo website. Importing shapefiles is not much different.
Now I have some custom data on my server. I can add styles to it, set up a new map using GeoExplorer, and post it for the world to see.
Here’s a look at a map I put together just for testing purposes:

And the link:
I’m pretty happy with the way this turned out. Everything seems to be working OK so far. The new instance is much more stable than my first try. It hasn’t crashed once, even though I felt like I was pushing it to the limit with all the uploading, styling, and layout editing I was doing in GeoExplorer.
Now, if it were only 5 o’clock, I’d be able to celebrate with a beer. What’s that? It’s 4:30?
Close enough! 🙂
Link to part 1
Link to part 3

Serving Maps – in the Cloud – for Free (part 1)

My latest personal project (still in progress) is to get a true cloud-based map server up and running, posting maps from a free-tier Amazon Web Services (AWS) Ubuntu server. This has not been easy. I’ve looked at AWS a number of times over the last year, and a few things have made me shy away from trying it out. Mainly, it’s incredibly hard to decipher all the jargon on the AWS website. And it’s not your everyday jargon. It’s jargon that’s unique to the AWS website. It’s jargon². Amazon has been sending me multiple emails the last few weeks warning me that my free-tier account status is about to expire. That, and a few days free of pressing work spurred me on to dive in and give it a try. I knew this was going to be a complicated process, so I wanted to document it for future reference. That’s what led to this post.
As the title says, this is part 1 of what will most likely be a 2 part post. (Update: It wound up being a 3 part series) At this point I have the server up and running. I’m able to download, edit, and upload files to the directories I need to. I have an Apache server running on the instance, and the OpenGeo Suite installed. However, I am having some problems with the OpenGeo Suite. As soon as I get them ironed out, I’ll either update this post, or add a part 2.
So, here we go…
(If you’re already familiar with the AWS management console and AMIs, you can scroll down to the “How do I connect to this thing…” section)

Wading through the AWS setup

The first step in the process is to sign up for an AWS account which allows you to run a free Amazon EC2 Micro Instance for one year. These free-tier instances are limited to Linux operating systems. You can see the details and sign up here:
The next thing I did was to sign into the AWS Management Console and take a look around.
Gobbledygook. I needed some help translating this foreign language into something closer to English.
There are a lot of websites out there that try to explain what’s what in AWS, and how to use it. One such example is “Absolute First Step Tutorial for Amazon Web Services”, and what follows here is largely based on what I found there. The easiest way to get started is by using an “AMI”, which is a pre-built operating system image that can be copied and used as a new instance. A little more searching ensued, and I found a set of Ubuntu server AMIs at alestic. The tabs along the top let me choose the region to run the new server from (for me, us-east-1). I picked an Ubuntu release (Ubuntu 11.10 Oneiric), made sure it was an “EBS boot” AMI, and chose a 64-bit server.
This brought up the Amazon Management Console – Request Instances Wizard. The first screen held the details about the AMI I was about to use.
(You can enlarge any of the following screen shots by clicking on them)

  • I made sure the instance type was set to Micro (t1.micro, 613 MB) and clicked continue.
  • I kept all the defaults on the Advanced Options page and clicked continue.
  • I added a value to the “Name” tag to make it easier to keep track of the new instance and clicked continue.
  • I chose “Create a new Key Pair” using the same name for the key pair as I used for the instance.
  • I clicked “Create & Download your Key Pair”, and saved it in an easy to get to place.

There are some differences in where you should save this key depending on what operating system you’re using, which I’ll explain later in this post.

On the next screen, I chose “Create a new Security Group”, again naming it the same as I did the instance. Under Inbound Rules, I chose the ports to open:

  • 22 (SSH)
  • 80 (HTTP)
  • 443 (HTTPS)
  • 8080 (HTTP)

…clicking “Add Rule” to add each one, one at a time. If you’re following along, it should look something like this:

The last screen showed a summary of all of the settings, and a button to finally launch the instance.

Once launched, it shows up in the AWS Management Console, under the EC2 tab.
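The four inbound rules listed above can also be written out as AWS command line calls, for anyone who prefers a script to clicking “Add Rule” four times. This is a sketch that only prints the commands. The group name is hypothetical, and opening each port to 0.0.0.0/0 (everyone) is my assumption, since the wizard screens above don't show which source addresses were allowed.

```shell
#!/bin/sh
# Sketch only: prints one AWS CLI command per inbound port.
# The group name and the wide-open 0.0.0.0/0 source are assumptions.
print_ingress_rules() {
    group=$1
    shift
    for port in "$@"; do
        echo "aws ec2 authorize-security-group-ingress --group-name $group --protocol tcp --port $port --cidr 0.0.0.0/0"
    done
}

# SSH, HTTP, HTTPS, and the port 8080 used for the OpenGeo Suite dashboard.
print_ingress_rules my-geo-server 22 80 443 8080
```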

The good news: After all that, I finally have a real cloud-based server running Ubuntu on AWS.
The bad news: That was the easy part.
Now the question is:

How do I connect to this thing, and get some real work done?

The default settings on AWS lock things down pretty tight. And that’s how it should be for any server, really. The thing is, this is more of a test-bed than a production server. I want to be able to easily navigate around, experiment with settings, and see how things work. Having some kind of a GUI really helps me out when I want to learn where things are, and how they work together. Long story short – I settled on setting up an FTP client to view the directory structure and files on the AWS server, and used command line commands to change settings, permissions, and perform some editing of files (Yes, I’m talking VI). It’s a bit harder to find info on how to set things up on a Linux box, so I’ll start there. Windows will follow.

For Linux (Ubuntu/Mint) users

If you’re an experienced, or even a novice Linux user, you’re familiar with Secure Shell (SSH), or have at least heard the term before. Most websites explaining how to access a new Ubuntu AWS instance from a Linux box suggest using SSH, and tell you to put the downloaded key file in the ~/.ssh folder, or the /etc/ssh folder, and then change its permissions so it’s not publicly viewable by running the following command in terminal:
sudo chmod 400 ~/.ssh/<yourkeyfilename>.pem
If you’re going to be doing all your work through the command line using only SSH, that is the way to go. However, I wanted to connect to my new cloud server through FTP so I can upload, download, and otherwise manage files with some kind of GUI. After many hours of searching and testing and beating my head against the wall, I settled on using SecPanel and FileZilla.
The major hurdle I had to overcome in order to use FTP on a Linux (Ubuntu/Mint) box to connect to my AWS server, is AWS’s use of Key Pairs instead of passwords. There are no ftp clients that I could find that allow using key pairs for authentication. Yes, I vaguely remember managing to set up an SSH tunnel at one point, but that seemed overly complicated to me, and not something I want to go through every time I have to update a webpage. To get around this, I used two pieces of software: SecPanel, and FileZilla. If you’re familiar with FTP at all, you should be familiar with FileZilla, so I won’t explain how to use it here, except to reiterate, it does not allow using key pairs to authenticate user sign-in to a server. To get around that, SecPanel comes to the rescue. The problem with SecPanel? There is absolutely no documentation on the website, nor any help file in the software. Needless to say, much hacking ensued.
To get right to the point, here’s what I did to get things working:

  • I copied my key file out of the hidden folder (~/.ssh) and into a new “/home/<user>/ssh” folder, keeping the same “400” file permissions.
  • In SecPanel, I entered the following values in the configuration screen:
  • Entered a Profile Name and a Title in the appropriate boxes.
  • Copied the Public DNS string from the AWS management console
    (which looks something like “”)
    and pasted that into the “Host:” box.
  • Entered User: “ubuntu” and Port: “22”
  • Entered the complete path to my key file into the “Identity:” box
  • Everything else I kept at the default settings.
  • Clicked on the “Save” button

Here’s what it looks like:

Going back to the Main screen in SecPanel, there should be a profile listed that links to the profile just set up. Highlighting that profile, and clicking on the SFTP button then starts up FileZilla, and connects to the AWS server, allowing FTP transfers… as long as the folders and files being managed have access permission by the user entered in SecPanel.

So, how do we allow the “ubuntu” user to copy, edit, upload, and download all the files and folders necessary for maintaining the server?

  • Open a terminal window and SSH into the Ubuntu server
    (sudo ssh -i <PathToKeyFile>.pem ubuntu@<UniqueAWSinstance> ).
  • Get to know the chown, chgrp, and chmod commands.
  • Use them in Terminal.
  • Make them your friend.

You can also perform all the other server maintenance tasks using this terminal window, e.g. apt-get update, apt-get upgrade, apt-get autoclean, and installing whatever other software you want to use on the new server.

Really, it’s not that hard once you dive into it. And, the fact that you can now SEE the files you’re modifying, SEE the paths that lead to them, and SEE what the permissions are before and after changing them, makes things a whole lot easier. For example, the following command:
sudo chgrp ubuntu /var/www
will change the /var/www “Group” to “ubuntu”, which will then allow the ubuntu user (you) to upload files to that directory using FTP, provided the group also has write permission on it.

For Windows Users

Windows access was much easier to set up than it was in Ubuntu/Mint. For this I used PuTTY and WinSCP. As in Linux, I copied the Key File to a new SSH folder under my user name. A couple of differences here: there are no access permissions to worry about in Windows, however, the Key File has to be converted to a different format before WinSCP and PuTTY can use it. Both the WinSCP and PuTTY downloads include the PuTTYgen Key Generator that can convert the <keyname>.pem file to the appropriate <keyname>.ppk format. In PuTTYgen, click on “Load”, set the file type to “*” to see all files, and make your way to the <keyname>.pem file. Once it’s loaded in PuTTYgen, click the “Save private key” button, and save the file to wherever you want. I saved mine to my new SSH folder (without adding a passphrase).

Next it’s just a matter of opening WinSCP, setting the “Host name:” to the AWS Public DNS string, “Port number:” to 22, “User name:” to “ubuntu”, “Private key file:” to the path to the key file, and “File protocol:” to SFTP.

Clicking the “Save…” button will save these settings so they don’t have to be entered every time you want to log in. The “Login” button will open an FTP like window where files and folders can be managed.

And, there’s an “Open session in PuTTY” button on the toolbar that will open a PuTTY terminal where commands can be entered just like an Ubuntu terminal window.

File permissions can be set by entering chown, chgrp, and chmod commands in PuTTY just like using SSH in Ubuntu.

Next up, getting my OpenGeo Suite running

As I said at the beginning of this post, I have the OpenGeo Suite installed, and have been able to serve maps from it for short periods of time. However, I still need to iron out some wrinkles. It’s been suggested that my problems might be due to the lack of swap space on AWS micro instances. It might not even be possible to run the entire suite on a micro instance, I don’t know. If that’s the case, I might have to strip it down to just running GeoServer. But that will have to wait for another day.
Update – 12/21/2011
Link to part 2
Update – 12/22/2011
Link to part 3

Not Just Another MacBook

An account of my recent adventures in repairing and setting up a tri-boot MacBook

Step one – Obtain the MacBook

Dead Mac
I do not recommend obtaining a MacBook the way I did (or wish it upon anyone), but I do believe in making lemonade from lemons whenever possible. Last week I got a call from one of my daughters explaining that her sister, who attends the same college as her, had spilled “something” on her computer, rendering it inoperable. Luckily my girls attend a college that’s not far from home, so the next day was spent bouncing between the Apple store and Best Buy analyzing our options. I’ll leave out the gory details here, but the end result was: I chipped in the cost of a replacement PC, the daughters split the cost of an upgrade to a new MacBook Pro, and I wound up with the old, wet, borked MacBook to use however I wanted.
Here are the two culprits in the dead-Mac fiasco. Daughter on the right is the former dead-Mac owner. Daughter on the left is the Mac-killer.
The Two Culprits

Step Two – Rebuild the MacBook

The tech at the Apple store said there was a significant amount of “liquid” inside the Mac when he opened it up. The estimate for repairs was $750, with no guarantees that it would work properly in the end. He offered us some info for an off-site computer repair person that might be able to fix it for less. But his recommendation was to sell it for parts. With this in mind, I decided to take a chance, open it up, and see what I could do with it. I had nothing to lose.
Upon removing the base plate, it was obvious that the “liquid” spilled was something a bit stronger than water. My guess was some combination of beer and margarita mix, but daughter 1 insisted it was nothing stronger than wine.  I read in a few places that it’s possible to actually wash the logic board in various ways (using water, alcohol, and a few other concoctions) but that seemed a bit extreme to me. Since the “liquid” involved did not contain a lot of extraneous solids and/or sugary substances, I decided a simple drying out was the best route to take.
Mac disassembled
A quick search led me to the fantastic ifixit site that led me through the entire disassembly process. I found the Mac a lot easier to take apart without ruining anything than I have the few PC laptops I’ve worked on. I disassembled the entire bottom half of the Mac, wiped off the “water” droplets, hit everything with compressed air, heated it up with a hair dryer, and put everything back together again. This whole process took about 5 hours. No one was more surprised than I was when I plugged the thing in and it started right up. That felt good.

Step Three – Improve the MacBook

Both my daughters use Macs, so I’ve had a chance to work on them a few times. I’ve never been a fan of the Mac operating system. There’s something about the user interface, with all those expanding and bouncing icons that just turns me off. That and a few other things that I won’t go into here, leave me preferring Windows over OSx. Still, I saw this as a chance to give the Mac another try. One of the best computing decisions I ever made was to turn my work PC into a Win7/Ubuntu dual boot machine. I’ve tried virtual machines in the past, but I think if you’re serious about trying out and learning a new OS, it deserves to be installed on a high quality computer, and run natively on its own partition. Since I did that, I’ve found I enjoy learning and using Ubuntu much more.
So, this time around, I decided to up the ante, and go for three. The big three: Mac OSx, Windows 7, and Ubuntu 11.04.


My process follows very closely the steps outlined in the Ubuntu Community Documentation MacBookTripleBoot page. This page was created using a 2008 MacBook, so I’ve pointed out a few specific steps below where I had problems, or deviated from that process.


First thing to do is get the Mac Bootcamp drivers on a disc or USB drive. I highly recommend getting the Bootcamp drivers assembled before doing anything else. I did not use Bootcamp to set up the partitions, but the drivers are needed after installing Windows in order to use all of the Mac hardware from within Windows. It’s mentioned in a few forum posts that the bootcamp drivers are available from within the Boot Camp Assistant application itself. For example:

Solution: Control-click on the Boot Camp Assistant program, choose Show Package Contents, and then navigate into Contents » Resources. Inside there you’ll find DiskImage.dmg. Just burn that disk image, and you’ll have your Windows drivers CD.

However, I was unsuccessful in finding these on my system. You can also download the drivers by running the bootcamp application. However, you must do this BEFORE partitioning your hard drive, or it won’t work. Something I learned the hard way. A third option I did not get to try, is simply using your original Mac OSx disk. Of course the two culprits in this Mac-fiasco could not find said disks, so I was SOL there. I wound up waiting for daughter 2 to come home with her new MacBook Pro, and I downloaded the bootcamp drivers from that, using the burn directly to disk option.
One last note: When you get to the point in the Bootcamp app where it asks you to partition the hard drive, cancel out. You want to use the Mac Disk Utility app to do that instead.


The next step is to download, install, and run rEFit. This is a pretty straightforward process, but I will point out one thing. I suggest you run the rEFit partitioning tool once before actually partitioning the hard drive. This syncs up your partition tables, and it fixed some errors I was getting before I did that.

Disk Partitions

Use the Mac OSx disk utility to create partitions for the three operating systems. As outlined in the MacBookTripleBoot page, I used the command line, and entered:
diskutil resizeVolume disk0s2 100G "JHFS+" 4-Linux 60G "MS-DOS FAT32" 4-Windows 0b
…in order to divide my drive up into a 100GB OSx partition, a 60GB Ubuntu partition, and a 90GB Windows partition. (The 0b for the Windows partition assigns whatever’s left to that partition). Then use rEFit to sync the new partition tables again.

Install the OSs

Install Windows on the last partition, as outlined in the MacBookTripleBoot page. I can confirm that using an XP disk and a Win7 upgrade disc does work just fine. No need for a single full install disk.
Install Ubuntu on the middle (third) partition as outlined in the MacBookTripleBoot page. The installation screens appear to be different in the 11.04 edition of Ubuntu. Just make sure you use the correct partition for the root mount point, and install the grub boot loader to that partition’s boot sector, not MBR.

Clean Up

Once you have Windows installed, use the Bootcamp setup program to install the necessary Windows drivers so you can use all the Mac hardware. I had to do this in order to get my wifi, bluetooth, ethernet, and a few other things working.
Ubuntu was able to recognize most of the Mac hardware, but I had to use an ethernet cable to hook up to my router in order to access the internet. Ubuntu then automatically identified the drivers needed for the Mac’s wireless and graphics adapters.

That’s It

Everything seems to be working smooth as silk. If it weren’t for the white case and MacBook logo staring me in the face, I wouldn’t know I was using a Mac. I made one change to the rEFit configuration file. I went into the refit.conf file and changed the timeout setting to 0 in order to disable automatic booting. This loads up the rEFit boot menu, letting me choose whichever OS I feel like using at that time. The only glitches I’ve noticed are that Ubuntu sometimes hangs during a restart, requiring a push on the power button, and that Windows occasionally wants to reboot back into Windows instead of to the rEFit boot menu. That’s easily fixed by holding down the Option key during the reboot.

ArcGIS vs QGIS Clipping Contest Rematch

Round 2 in which ArcGIS throws in the towel.

(Please note: This post is about clipping in ArcGIS version 10.0. The functionality has been improved, and problems mentioned have been fixed in later versions of ArcGIS)
This is a follow-up to my previous post where I matched up ArcGIS and QGIS in a clipping contest. One of the commenters on that post expressed some concern that there might be “…something else going on…” with my test, and I agreed. It was unfathomable to me that an ESRI product could be out-done by such a wide margin. Knowing that ArcGIS often has problems processing geometries that are not squeaky clean, I began my investigation there. I ran the original contour layer through ArcToolbox’s Check Geometry routine, and sure enough, came up with 5 “null” geometries. I deleted those bad boys, and ran it through ArcToolbox’s “Repair Geometry” routine, and then ET GeoWizard’s “Fix Geometry” routine for good measure (These may or may not be identical tools, I do not know). No new problems were found with either tool.
I wanted to give ArcGIS a fighting chance in this next round, but also wanted to level the playing field a bit. I did a restart of my Dell m2400 (see the specs in the previous post), exited out of all my desktop widgets, and turned off every background process I could find. I also turned off Background Processing in the Geoprocessing Options box. The only thing running on this machine was ArcGIS 10, and the only layers loaded were the contour lines and the feature I wanted to clip them to. I ran the “Arc Toolbox > Analysis Tools > Extract > Clip” tool and watched as it took 1 hour 35 minutes and 42 seconds for ArcGIS to go through the clipping process before ending with the message:
ERROR 999999: Error executing function
Invalid Topology [Topoengine error.]
Failed to execute (Clip)

Now granted, this is much better than the 12 hours it took the first time I ran it, but still, no cigar in the end.
Giving QGIS a chance to show its stuff, I used Windows version 1.5.0 to run a clip on the same files, on the same machine. QGIS took all of 6 minutes and 27 seconds to produce a new, clean contour layer.
QGIS - Contours v2
I ran this through the same geometry checks as the original contour layer, and came up with no problems.
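To put the two timings from this round side by side, here is a small Python sketch that converts them to seconds and works out the ratio. The helper name is my own for illustration. Keep in mind the ArcGIS figure is the time it ran before failing, so the ratio understates the real gap.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert an hours/minutes/seconds reading to total seconds."""
    return hours * 3600 + minutes * 60 + seconds

arcgis_s = to_seconds(hours=1, minutes=35, seconds=42)  # run ended in an error
qgis_s = to_seconds(minutes=6, seconds=27)              # run completed cleanly
ratio = arcgis_s / qgis_s

print(f"ArcGIS: {arcgis_s} s, QGIS: {qgis_s} s, ratio: {ratio:.1f}x")
```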
My goal here is not to jump all over ESRI and do a dance in the end zone. I would really like to figure out what’s going on. As I’ve said before, I’ve had problems in the past with ArcGIS producing bad geometries with its Clipping process (and other tools, too). But the fact that another product can handle the same set of circumstances with such ease baffles me.
I’ve put about as much time as I can into this test, and taken it as far as I can. If you would like to give it a go, feel free to download the files I used through this link:
(Note: This is an 878MB file, and is not completely uploaded as of this posting. Check back later if the link does not work for you right now)
If any of you have better results than I did, or find any faults with my files or process, please let me know and I WILL make a note of them here. Thank you.

ArcGIS–QGIS Faceoff

Is QGIS a viable alternative to ArcGIS?

(Please note: This post is about clipping in ArcGIS version 10.0. The functionality has been improved, and problems mentioned have been fixed in later versions of ArcGIS)
I’ve never enjoyed working with contours. They seem to bog down my system more than any other layer type I work with. However, most of my clients are so used to looking at USGS Topo maps they expect to see them on at least one of the maps I produce for them. I recently worked on a project covering a five-town area in the Catskill Mountain region. The large area covered, and the ruggedness of the topography was proving exceptionally troublesome in processing their contours. So much so that I decided to look at other options to get the work done. I’ve used a variety of GIS tools over the years, but do most of my paying work exclusively in ArcGIS. It’s what I’m most familiar with, it does (nearly) everything I need it to do, and therefore provides my clients with the most efficient use of my time. However, in this situation that was not the case.
The one geoprocessing operation that frustrates me most often (in ArcGIS) is the Clip operation. It seems to take more time than most other geoprocessing tools, and often results in bad geometries. This happens so often, I usually resort to doing a union, and then deleting the unwanted areas of the Union results. For some reason this works much faster, and with more reliable results than doing a Clip.
Since what I wanted to do here was a clip on a contour layer, I was in for double trouble. Yes, I could have clipped the original DEM I wanted to produce the contours from first, then generated contours from the clipped DEM. But that wouldn’t have led to anything to write about. So, here’s a short comparison of how ArcGIS handled the process versus QGIS:

The hardware and software used:

ArcGIS 10, SP2

  • Windows 7, 64 bit
  • Dell Precision m2400 laptop
  • Intel Core 2 Duo CPU, 3.06GHz
  • 8 GB RAM

QGIS 1.4.0

  • Ubuntu 11.04
  • Dell Inspiron 600m laptop
  • Intel Pentium M CPU, 1.60 GHz
  • 1GB RAM

A fair fight?

I started out with ArcGIS, and loaded up my 20’ contour lines and a 1 mile buffer of the study area to which I wanted to clip them. I began the clip operation 3 times. The first two times I had to cancel it because it was taking too long, and I needed to get some real work done. Curious to see how long it would really take, I let the process run overnight. The progress bar kept chugging away “Clip…Clip…Clip…Clip…”, and the Geoprocessing results window kept updating me with its progress, so I assumed it would complete eventually. In the morning, I looked in the Geoprocessing results window and found it had run for over 12 hours before throwing an error, never completing the clip operation. The error message said something about a bad geometry in the  output. Really, no surprise there.
ArcGIS - ClipContour1
(Yes, those are lines in the picture above, not polygons. They’re very densely packed)

QGIS gets to play

The next day I decided to give QGIS a shot at it. I copied the two shapefiles over to my 6 year old lappy (the contour.shp file was 1.3GB), fired up QGIS, and ran the Clip operation on the two files.
QGIS - Contours Screenshot-Clip
This time it took all of 17 minutes and 21 seconds to get a new contour layer.
Clip Results - QGIS
So, who’s the winner here? Was it a fair contest?
My take-away is, ESRI really needs to do some work on its Clip geoprocessing tool. As I said earlier, it is slow, and results in bad geometries more often than any of their other geoprocessing tools I use.
Addendum June 11, 2011: See the follow-up post here:

Find duplicate field values in ArcGIS using Python

As ESRI is making its move away from VB Script and towards Python, I’m also slowly updating my stash of code snippets along the way. One of those little pieces of code I use quite often is one that identifies duplicate field values in a layer’s attribute table. I find this particularly helpful when I’m cleaning up tax parcel data, looking for duplicate parcel-ID numbers or SBL strings. Since I’ve been working a lot with parcel data lately, I figured it was time to move this code over to Python, too. So, here it is in step-by-step fashion…

1 – Add a new “Short Integer” type field to your attribute table (I usually call mine "Dup").

2 – Open the field calculator for the new field

3 – Choose "Python" as the Parser

4 – Check "Show Codeblock"

5 – Insert the following code into the "Pre-Logic Script Code:" text box making sure you preserve the indents:

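uniqueList = []  # values seen so far
def isDuplicate(inValue):
  if inValue in uniqueList:
    return 1  # seen before, so flag it as a duplicate
  else:
    uniqueList.append(inValue)  # remember the first occurrence
    return 0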

6 – In the lower text box, insert the following text, replacing "!InsertFieldToCheckHere!" with whichever field you want to check the duplicates for:

isDuplicate( !InsertFieldToCheckHere! )

This will populate your new field with 0’s and 1’s, with the 1’s identifying those records whose value has already appeared in an earlier record. The first occurrence of each value gets a 0.
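If you want to try the duplicate check outside of ArcGIS, here is a minimal standalone Python sketch. The parcel IDs are made up, and both function names are my own for illustration; neither is part of the Field Calculator. The first function flags only the repeats, as the steps above are meant to do. The second is a hypothetical variant that flags every record sharing a value, in case you want the first occurrence marked too.

```python
from collections import Counter

def flag_repeats(values):
    """Return 0 the first time a value is seen and 1 for every later repeat."""
    seen = set()
    flags = []
    for value in values:
        if value in seen:
            flags.append(1)
        else:
            seen.add(value)
            flags.append(0)
    return flags

def flag_all_duplicates(values):
    """Hypothetical variant: return 1 for every value that occurs more than once."""
    counts = Counter(values)
    return [1 if counts[value] > 1 else 0 for value in values]

# Made-up parcel IDs, for illustration only
parcel_ids = ["12.-1-4", "12.-1-5", "12.-1-4", "13.-2-1", "12.-1-5", "12.-1-4"]

repeat_flags = flag_repeats(parcel_ids)
all_flags = flag_all_duplicates(parcel_ids)
print(repeat_flags)
print(all_flags)
```

The variant needs to see all the values before it can answer, so it works on a whole list rather than one record at a time.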



Generating Vertical Buffers

One of the more popular analyses I’m asked to perform for my clients is a viewshed analysis. Beyond simply identifying what areas of a town are visible from roads or other public viewpoints, I’m often asked to help identify, and sometimes rank, areas that are most worthy of protection. One way to help a town identify and evaluate these high priority vistas, is to identify prominent ridgelines and the areas around them that are susceptible to inappropriate development.
One way to mitigate the impact of development on highly visible ridgelines is to make sure new buildings do not break the horizon line – the point where the ridge visibly meets the sky. Since most local zoning codes restrict building heights to 35-40 feet (in my client towns, anyway), producing a vertical buffer of 40 feet helps to identify the areas susceptible to such an intrusion.
The steps to produce such a vertical buffer are not overly complex, but I have not found them readily available online. So, for your benefit (and my easy reference) I outline the process here.
(Note: I outline these steps specifically using ArcGIS 10. I’m sure it’s possible using other tools, but this is what I use most often in my daily work)
Begin with:

A DEM of your study area

This DEM must be an integer raster for one of the following steps, so start off by using the raster calculator.
(Arc Toolbox > Spatial Analyst Tools > Map Algebra > Raster Calculator)
Use the INT( “DEM” ) expression, where “DEM” is the elevation raster that you want to convert from a floating point to an integer raster.

Prominent ridgelines

Ridgelines are often generated using watershed boundaries, with extensive field checking to identify the more prominent features. For these calculations, the ridgelines must be in a raster format that includes the elevation of each raster cell. To convert a line-type ridgeline to a raster, first buffer the ridgeline to produce a polygon feature (I typically use a 20 foot buffer) and then use Extract by Mask.
(Arc Toolbox > Spatial Analyst Tools > Extraction > Extract by Mask)
Use the Integer DEM produced in the first step as the input raster, and the polygon buffer of the ridgelines for the feature mask data.

Generate the Euclidean Allocation of the Ridgeline Raster

This is where the need for an integer raster comes into play.
(Arc Toolbox > Spatial Analyst Tools > Distance > Euclidean Allocation)
Simply use the Ridgeline raster generated in the previous step as the input raster, and choose the elevation value as the Source field. This process will generate a new raster that covers the entire study area, with each cell holding the elevation value of the ridgeline raster that is nearest to it.
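To illustrate what the allocation step produces, here is a toy Python sketch. It is a brute-force stand-in under my own assumptions, not how ArcGIS implements the tool: the grid and elevations are made up, and None stands in for NoData. Every cell in the output takes the elevation of whichever ridgeline cell is closest to it.

```python
import math

def euclidean_allocation(ridge):
    """Give each cell the value of its nearest non-None source cell (brute force)."""
    sources = [
        (r, c, value)
        for r, row in enumerate(ridge)
        for c, value in enumerate(row)
        if value is not None
    ]
    result = []
    for r, row in enumerate(ridge):
        out_row = []
        for c in range(len(row)):
            # Pick the source cell with the smallest straight-line distance
            nearest = min(sources, key=lambda s: math.hypot(s[0] - r, s[1] - c))
            out_row.append(nearest[2])
        result.append(out_row)
    return result

# Toy ridgeline raster: None = NoData, numbers = ridge elevation in meters (made up)
ridge = [
    [None, None, None, None],
    [500,  None, None, 620],
    [None, None, None, None],
]

alloc = euclidean_allocation(ridge)
for row in alloc:
    print(row)
```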

Generate the Vertical Buffer

Use the Raster Calculator to generate a vertical buffer of the ridgeline.
(Arc Toolbox > Spatial Analyst Tools > Map Algebra > Raster Calculator)
The expression will look something like this: “IntegerDEM” >= (“RidgelineEucoAllo”-12) where “IntegerDEM” is the DEM produced in step 1, “RidgelineEucoAllo” is the Euclidean Allocation raster produced in step 3, and “12” is the height of the buffer you want to produce. In this case, the raster measurements are in meters, so using a buffer value of 12 results in about a 39 foot vertical buffer. This allows me to identify areas where there is the potential for a new building built to the maximum allowable height to break the horizon line.
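The same comparison can be sketched in plain Python on a toy grid. The elevations, the function name, and the conversion constant are my own assumptions for illustration; in ArcGIS the Raster Calculator does this cell by cell for you. A cell gets a 1 when the ground there is no more than the buffer height below the nearest ridgeline elevation.

```python
METERS_TO_FEET = 3.28084

def vertical_buffer(dem, alloc, height):
    """Return 1 where ground >= nearest ridge elevation minus height, else 0."""
    return [
        [1 if ground >= ridge - height else 0
         for ground, ridge in zip(dem_row, alloc_row)]
        for dem_row, alloc_row in zip(dem, alloc)
    ]

# Made-up integer elevations in meters
dem = [
    [480, 489, 495, 500],
    [470, 488, 612, 620],
]
# Made-up allocation raster: elevation of the nearest ridgeline cell
alloc = [
    [500, 500, 500, 500],
    [500, 500, 620, 620],
]

buffer_m = 12
mask = vertical_buffer(dem, alloc, buffer_m)
buffer_ft = round(buffer_m * METERS_TO_FEET, 1)
print(mask)
print(buffer_ft)
```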
Once the buffer is generated, there is usually some cleanup required. As you can see from these results, buffers are generated in areas not connected to the ridgeline areas we want to focus on, so I’ll delete these before moving on to the next steps in this town’s viewshed analysis.
I hope this is helpful, and as always, I welcome any comments and feedback.