Ian Dees is making it easier for people to find and use all sorts of geodata. He is a member of the OpenStreetMap US board, founder of OpenAddresses and All The Places, and is always looking for new data to explore and share.
Ian was interviewed for GeoHipster by Mike Dolbow, about a month before the announcement that Mapzen is shutting down operations. We at GeoHipster wish everyone from the Mapzen team the best of luck in finding their next adventures. –Ed
Q: You’re part of a team at Mapzen. Tell us how you got started with them.
I joined Mapzen because I was excited to focus on maps and map data as part of my day job. I was also excited to work with the team at Mapzen, who have built some great services based on the data that I’ve enjoyed building for the last decade of my life. I ended up on the Tiles team working on the map data that makes up the Tilezen service but I also spent time working on other data systems like Terrain Tiles, All the Places, and OpenAddresses.
Q: What are some cool projects you’ve been working on lately?
I’ve been working with Seth Fitzsimmons on updating Mapzen’s Terrain Tiles dataset for the last few months and we finally got it out the door earlier this month. I enjoyed updating the Terrain Tiles using newer AWS products like Batch that allowed me to spin up tens of thousands of CPUs to regenerate every tile in the world using newer and higher resolution elevation data. It was quite a thrill playing around with that setup!
I’ve also been working on adding data to Mapzen Places through a web scraping project called All the Places. Mapzen Places is a dataset with hundreds of millions of “venues” or points of interest from around the world and a system to link them into a hierarchy of administrative boundaries. All the Places will scrape location data from websites and output GeoJSON that will then be matched with existing Mapzen Places entries to add details like phone number, opening hours, improved location, and more.
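To make the scrape-then-match idea above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the actual All the Places or Mapzen Places code: the property names (`phone`, `opening_hours`), the helper functions, the sample venues, and the "same name within 100 meters" matching rule are all hypothetical. The only fixed detail is the GeoJSON convention that point coordinates are ordered longitude, then latitude.

```python
import json
import math


def to_feature(name, lon, lat, **props):
    """Wrap one scraped location as a GeoJSON Feature (property names are hypothetical)."""
    return {
        "type": "Feature",
        # GeoJSON orders point coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name, **props},
    }


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def match_venue(feature, venues, max_distance_m=100.0):
    """Return the nearest venue with the same name within max_distance_m, else None.

    This name-plus-distance rule is an assumed stand-in for a real matcher.
    """
    lon, lat = feature["geometry"]["coordinates"]
    name = feature["properties"]["name"].strip().lower()
    best, best_distance = None, max_distance_m
    for venue in venues:
        if venue["name"].strip().lower() != name:
            continue
        distance = haversine_m(lon, lat, venue["lon"], venue["lat"])
        if distance <= best_distance:
            best, best_distance = venue, distance
    return best


# Hypothetical scraped record and hypothetical existing venues.
scraped = to_feature(
    "Example Coffee", -93.2650, 44.9778,
    phone="+1-555-0100", opening_hours="Mo-Fr 07:00-18:00",
)
venues = [
    {"id": 1, "name": "Example Coffee", "lon": -93.2652, "lat": 44.9779},
    {"id": 2, "name": "Other Shop", "lon": -93.2650, "lat": 44.9778},
]
match = match_venue(scraped, venues)

print(json.dumps(scraped, indent=2))
print("matched venue id:", match["id"] if match else None)
```

Once a scraped feature is matched, its extra properties (phone number, opening hours, a better location) could be copied onto the existing entry. A production matcher would need fuzzier name comparison and a spatial index rather than a linear scan.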
Q: You’re the founder of OpenAddresses, correct? What’s the story behind how that effort began?
OpenAddresses began when I finished importing the buildings and addresses in Chicago. I was looking around for more data to import into OpenStreetMap while also dealing with the recently-formed Data Working Group’s import guidelines. I decided that taking the time to go through the import process for the hundreds of datasets I was finding wasn’t a good use of my time so I collected the sources into a spreadsheet so others could import them if they wanted.
At some point Nick Ingalls from Mapbox came along and helped me move this spreadsheet into GitHub along with a system to download and merge the data together. After lots of help from hundreds of contributors (like yourself – thanks!) we have over 500 million address points collected, and the data is used by Mapzen and Mapbox’s geocoders to provide extremely accurate and up-to-date search results.
Q: I have to admit, OpenAddresses is one of the GitHub repos I contribute the most to. I think of it kind of like a treasure hunt, finding county data sources for addresses, particularly in my home state. Did you ever think it would grow to over 100 contributors?
Absolutely not! It was a thrill to see the community grow so rapidly. I think the thing that really kicked it into gear was Mike Migurski’s addition of continuous integration builds that generated previews of changed sources in a pull request. It made it clear to the contributor what was getting added and what the data looked like. The instant gratification that those maps provided really made people excited to spend a bit more time looking for more data to add.
Q: I have to admit that is a really cool feature – almost like earning badges in an app or something. But also, I’ve done enough painstaking geocoding that if I’m helping someone, somewhere have an easier time at that, it seems a noble cause. Are there any other contributors expressing that kind of desire? Or, conversely, has there been any backlash from a source that didn’t know its data was being used?
I think there are two types of contributors: one like you who can see where this data ends up being used and is excited for where their contribution ends up downstream. The other is a “casual contributor” that somehow finds the repository and is able to quickly and easily add a data source. This doesn’t happen as much anymore because we cover so many places already, but they get quick confirmation that their contribution is helpful and we can more easily offer fixes if something is wrong.
We have received one or two requests to stop using a data source because we misinterpreted the licensing information, but the vast majority of requests we get are to point to a better source of data or to offer newer or more complete information directly to us. To help with both of these situations we’re working with Portland’s TriMet team to build a tool for data providers to submit data directly to us.
Q: You also were on the team behind Census Reporter. Census data – and interpreting it – has long been the bane of the digital geographer’s existence. I can’t believe it took the geo community that long to have something dedicated to making it easier to use. I refer people to the site all the time. What was the most challenging obstacle to overcome for that team?
It was a blast working with the small team of Ryan Pitts, Sara Schnadt, and Joe Germuska on Census Reporter. Joe had built similar systems before and wanted to make them better, so it was great to build on top of his vision and work with Sara and Ryan to build innovative interfaces on top of the data that I pulled together. The most challenging part of that project was building something that handled the depth of the data that the Census Bureau provides while also making it approachable and searchable for reporters who didn’t have the time to completely understand how Census had organized the data. I think one of the most successful parts of Census Reporter is that it’s simple enough to use and update that it only takes a few hours of maintenance each time Census releases new data, which happens twice a year. Otherwise it runs itself!
Q: You’re also on the Board for OpenStreetMap US. Other than, uh…spraying an occasional fire extinguisher on the mailing list, what are your duties there?
I’ve been the treasurer on the OpenStreetMap US board of directors for several years. That involves making sure bills get paid and taxes get filed, and coordinating sponsorships for State of the Map US. One of the trickiest parts of being active with OpenStreetMap US is figuring out how to spend our money in a responsible way that improves the community of mappers in the United States. We’re constantly looking for ways to do that and if anyone out there has suggestions, please let us know.
Q: Does any of this work compare to being on the Obama 2012 election campaign?
I built some of the strongest relationships I’ve ever had in the 8 months or so I was on the Obama 2012 campaign. Going through that experience made me re-organize my career priorities entirely: before 2012 I was concerned about how I would be able to move up the ladder into a challenging position while not transitioning into management. After 2012 I realized that an extremely important part of having a job is being emotionally happy and producing something that affected others in a positive way. Working with and learning from amazingly talented people also became very important to me, and I think both of these things had a huge effect on what I chose to do in my career afterwards.
Q: You recently moved back to Minnesota after several years away. What brought you back?
My wife and I grew up in the Midwest, and went to college in the Midwest, so when we spent 3 years in Virginia at her teaching job we felt out of our element. We definitely wanted to get back to the Midwest as soon as possible and a job in Minneapolis opened up at the perfect time. She took it and we moved as soon as we could. It feels great to be back in our element now!
Q: You describe yourself as a “Map Nerd” on various social media outlets. Does that equate to being a geohipster?
When I think of a “nerd” I think of someone who enjoys obsessively figuring out a problem or topic. I call myself a map nerd because I enjoy all aspects of maps: everything from understanding where you are while wandering outside to finding the right command line incantation to reproject a shapefile. I’ve always spent way too much time on a computer and focusing on maps, map data, and the systems around it gives me somewhere to focus when I want to learn new things. I suppose focusing on these geo topics makes me a geohipster 🙂