Mapping foreclosures

ArcView is a handy program for mapping information, particularly when you want to spot visual relationships between disparate sets of data.

Recently, I worked with another Star reporter to analyze foreclosure trends in Tucson and compare them with the incidence of high-risk loans. We used two sets of data: The first, from NICAR, contained information on every mortgage application in the United States for 2005 and 2006 (when risky lending was generally considered to be the most widespread). The second set was a list of most 2007 foreclosures that my colleague, Christie Smythe, obtained from RealtyTrac.

If you’re ever interested in figuring out trends via mapping, here’s a step-by-step list that could come in handy. Some of it can be dense and technical, but the general ideas alone could help. (Here’s the map, which Kori Rumore made visually spectacular, if you want an idea of the finished product.)

Step 1: Preparing the data

  • The RealtyTrac addresses, in Excel form, needed “geocoding,” the process that adds X and Y (longitude and latitude) coordinates to most addresses. I did that via batchgeocode.com.
  • Next, I needed to calculate lending trends by census tract. The Star obtained two years’ worth of data from NICAR, which I pulled into Microsoft Access. From there, I ran two SELECT queries with GROUP BY clauses:
    • one that counted, per census tract, the number of risky mortgages (those with a rate spread greater than 3 percent), and
    • a second that counted the total number of approved loans per census tract.

With the two counts, we were able to calculate the share of high-risk (and possibly subprime) loans among all approved loans in each census tract.
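For readers without Access, the same per-tract calculation can be sketched in Python with the built-in sqlite3 module. This is only an illustration, not the queries we actually ran: the table name `loans`, the column names `tract`, `approved` and `rate_spread`, and the sample rows are all made up, and the two counts are folded into a single GROUP BY query for brevity.

```python
import sqlite3

# Hypothetical sample rows: (census tract, approved flag, rate spread).
# A rate spread of None means no spread was reported for that loan.
SAMPLE_LOANS = [
    ("0001.00", 1, 4.2),
    ("0001.00", 1, None),
    ("0001.00", 1, 3.5),
    ("0001.00", 0, 5.0),  # not approved, so excluded from both counts
    ("0002.00", 1, 1.0),
    ("0002.00", 1, 3.1),
    ("0002.00", 1, None),
    ("0002.00", 1, 2.0),
]


def high_risk_share_by_tract(rows):
    """Return {tract: (high_risk_count, approved_count, percent_high_risk)}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE loans (tract TEXT, approved INTEGER, rate_spread REAL)"
    )
    conn.executemany("INSERT INTO loans VALUES (?, ?, ?)", rows)
    query = """
        SELECT tract,
               SUM(CASE WHEN rate_spread > 3 THEN 1 ELSE 0 END) AS high_risk,
               COUNT(*) AS approved_total
        FROM loans
        WHERE approved = 1
        GROUP BY tract
        ORDER BY tract
    """
    result = {}
    for tract, high_risk, total in conn.execute(query):
        result[tract] = (high_risk, total, 100.0 * high_risk / total)
    conn.close()
    return result


if __name__ == "__main__":
    for tract, (risky, total, pct) in high_risk_share_by_tract(SAMPLE_LOANS).items():
        print(f"{tract}: {risky} of {total} approved loans high-risk ({pct:.1f}%)")
```

Loans with no reported rate spread fall out of the high-risk count but still count toward the approved total, which is the behavior the two separate counts would give as well.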

Step 2: Mapping the data

  • Armed with a spreadsheet with columns showing 1) census tract name and 2) percentage of high-risk loans, I was able to import it into ArcView. I already had a shapefile (a map overlay, if you will) of census tracts; the next step was to do a join on the field the two data sets had in common (the census tract number).
  • ArcView then allowed me to change the “symbology” to shade the different tracts based on percentage. We chose three different shades of color: one for 20 to 30 percent high-risk mortgages per tract, a second for 30 to 40 percent, and a third for more than 40 percent.
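The join and the shading are both point-and-click operations in ArcView, but the logic behind them can be sketched in a few lines of Python. Treat this as a rough illustration under assumptions: the field names `tract`, `pct_high_risk` and `shade` are hypothetical, and so is the handling of the class edges (exactly 30 percent goes in the middle class here, and exactly 40 percent stays there too, since the darkest class is "more than 40").

```python
def shade_for(percent):
    """Pick one of three shades for a tract, or None if it is left unshaded.

    Edge handling is an assumption: 20 <= p < 30 is light,
    30 <= p <= 40 is medium, and p > 40 is dark.
    """
    if percent is None or percent < 20:
        return None
    if percent > 40:
        return "dark"
    if percent >= 30:
        return "medium"
    return "light"


def join_and_classify(tract_records, percent_by_tract):
    """Attribute join: attach each tract's percentage by its tract number.

    tract_records stands in for the shapefile's attribute table (a list of
    dicts with a hypothetical "tract" field); percent_by_tract stands in for
    the spreadsheet (tract number -> percent high-risk loans).
    Tracts with no match in the spreadsheet get None for both new fields.
    """
    joined = []
    for record in tract_records:
        percent = percent_by_tract.get(record["tract"])
        row = dict(record)  # copy, so the input table is not modified
        row["pct_high_risk"] = percent
        row["shade"] = shade_for(percent)
        joined.append(row)
    return joined


if __name__ == "__main__":
    tracts = [{"tract": "0001.00"}, {"tract": "0002.00"}, {"tract": "0003.00"}]
    percents = {"0001.00": 25.0, "0002.00": 45.5}
    for row in join_and_classify(tracts, percents):
        print(row)
```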

Step 3: Making sense of the data

Now, back to our geocoded addresses. ArcView has a function that allows us to plot points based on X and Y coordinates. That created a new “layer” with all of the foreclosures plotted.

The results: The foreclosure points were spread out all over Tucson, with more in the darker-shaded census tracts. In other words, areas with greater concentrations of high-risk loans had more foreclosed homes the following year.

Here comes the magic. Census tracts are useful for journalists, but not necessarily for ordinary people. What if we were able to tell our readers which neighborhoods had the highest number of foreclosures? To do that:

  • We first selected the layer with our foreclosure points. Right-clicking it gave us an option to do a “join.”
  • We then told the program to join the data to another layer (the neighborhoods layer), a function known as a “spatial” join.
    • These layers are available from Pima County via its very own GIS repository. The best part is, they’re free.
  • We then executed the spatial join. When it finished, we were left with a table that contained not only each neighborhood, but also the number of points (foreclosure locations) that fell inside that neighborhood’s boundaries. At last, we had a list of neighborhoods and how many foreclosures were inside their borders for most of 2007.
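If you’re curious what a spatial join of this kind does under the hood, here is a minimal Python sketch of the idea: test each point against each neighborhood boundary and tally the hits. It is not how ArcView implements it. The point-in-polygon test used here is the common ray-casting method, the neighborhood names and coordinates are invented, boundaries are simplified to single rings with no holes, and a point is credited to the first neighborhood that contains it.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon?

    polygon is a list of (x, y) vertices in order. Points lying exactly on
    a boundary may land on either side; that is fine for a rough count.
    """
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does this edge straddle the horizontal line through the point?
        if (yi > y) != (yj > y):
            # X coordinate where the edge crosses that horizontal line.
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def count_points_by_area(points, areas):
    """Count how many points fall inside each named polygon."""
    counts = {name: 0 for name in areas}
    for x, y in points:
        for name, polygon in areas.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break  # credit each point to one area only
    return counts


def rank_by_count(counts):
    """Sort (name, count) pairs with the greatest count first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical neighborhoods (two adjacent squares) and foreclosure points.
    neighborhoods = {
        "Neighborhood A": [(0, 0), (2, 0), (2, 2), (0, 2)],
        "Neighborhood B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    }
    foreclosures = [(1, 1), (3, 1), (3.5, 0.5), (5, 5)]
    print(rank_by_count(count_points_by_area(foreclosures, neighborhoods)))
```

The point at (5, 5) falls outside both squares and is simply not counted, which mirrors what happens to foreclosure addresses that sit outside any mapped neighborhood.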

From here, the rest was straightforward. We opened the table in Excel and sorted it by number of foreclosures, greatest first. Christie then called up the heads of some neighborhood associations for some thoughts:

Midvale Park Neighborhood Association President Joe Miller said the reason might be that Midvale Park was seen as a more desirable place to live than some of the surrounding neighborhoods. Some buyers may have stretched their finances to get in, he said.

“They were probably overly optimistic,” he said.