Harel Dan is a GIS and Remote Sensing analyst based in Israel, and the GIS Coordinator at HaMaarag – Israel’s National Nature Assessment Program. Twitter / Website
Harel was interviewed for GeoHipster by Amy Smith.
Q: You’re the GIS Coordinator at HaMaarag, Israel’s National Nature Assessment Program. What is HaMaarag, and how does GIS factor into the program?
A: HaMaarag is a consortium of organizations that manage open landscapes, set up to provide evidence-based knowledge to managers and decision makers. We run several long-term projects all over the country, in varying biomes and their ecotones, from evergreen sclerophyllous woodlands to hyper-arid shrublands, monitoring groups such as mammals, birds, and reptiles, as well as vegetation. As such, the entire process of planning, sampling, and analysing the data depends on location, whether it's precisely measuring monitoring-plot corner pegs with GPS or designing spatially balanced sampling schemes. My job also entails collecting and processing spatial data from other organizations, each with its own peculiarities and errors.
Q: You do a mix of technical work, coordination with other agencies, and field work. That sounds like an interesting mix – could you describe a typical day in the life?
A: 6:00, Phone rings, ornithologist on the line, asks me to explain how to load the background layer into the Fulcrum monitoring app. 8:30, Log on to the computer, answer an email from the chief scientist of the Nature and Parks Authority. 10:00, Run the script that scrapes data from that website. 11:45, Finish that map and send it to graphic design. 13:37, Coffee. 14:03, Back in the office after wandering around the labs in the Steinhardt Museum of Natural History, where our offices are. 15:00, Finish a call with the Open Landscapes head at the Ministry of Environmental Protection. 16:00, Send drone ortho segmentation results to the botanist for assessment. 17:30, Put the kids to sleep. 19:00, Goof around on whatever personal project distracts me these days.
Q: Based on your Twitter account and website, it seems you also take on a good amount of personal projects. What do you look for in a personal project? Any favorites you’d be willing to share?
A: My personal projects are a mix of disciplines and topics that on the one hand interest me, and on the other can serve as an excuse to delve into something new: a concept, a programming language, a tool, etc. Furthermore, as a geographer, I think I can bridge the gap between the analytical aspect and the human story it tells. For instance, over the summer I made and published a constantly updated map of fire damage in the south. I saw a disconnect between news reports and the scale of the damage, which was creating misconceptions and a lack of understanding. So telling this story was a chance to try out new internet tools to help streamline the work and make it easy for the general public to read and comprehend.
Q: What inspired you to publish your analysis of SAR data to identify military radars? Were you nervous at all about the sensitivity of the subject matter?
A: I was intrigued by a peculiar image artifact when I was trying to incorporate Sentinel-1 data into my landcover classification mapping, which happened to appear mostly over broadleaf and coniferous forests. After tweaking a Google Earth Engine script I noticed that these artifacts converged on a single constant source, so I figured out what they were. After a year or so of hesitance, asking around about the preferred course of action, and actually getting in touch with the Army, I had a job interview with a company that does SAR analysis, so I knew this would be the perfect time to publish the story. So with a tongue-in-cheek image alluding to some issues with publicising the location of radars in my country (it was a PNG image I made in MS Paint that read [REDACTED]; you wouldn't believe how many people over-analysed it), I posted my findings on social media.
I got the job, by the way, but declined to take it, as the conditions weren't manageable from my perspective.
Q: You’ve successfully had your work featured in multiple publications. What advice do you have for other geohipsters out there looking to get more exposure?
A: Hustle. Made something interesting? Think you’re onto something? Post it on social media. If your career is not dependent on the number of publications in peer-reviewed journals, there’s no reason not to share your work and ideas with the geo community, no matter how half-baked they are.
Q: What do you do in your spare time? Any hobbies?
A: I have a garden with some fruit trees that I tend when it's not too hot, but other than that, I'm wholly immersed in being a full-time parent to two small kids. Whatever spare time I have is used to wind down and relax with techie reading material, or to go on Twitter, see what others are up to, engage in the war on Shapefile, and banter about that other GIS software.
Q: Are you a geohipster? Why or why not?
A: I tick about a dozen or so results in the GeoHipster poll tally, so I guess I'm on the geohipster spectrum, even though I never got into the laptop stickers and pin badges fad. Besides, the back of my laptop screen has velcro strips which I use to firmly attach dongles, chargers, and an external drive full of hoarded geodata to reduce desktop clutter; this way I have room to place old printed atlases, a working sextant, a PostGIS cheatsheet… OH MY GOD, I've just realised I'm a geohipster.
Q: Any final words of wisdom for our global readership?
A: Don’t use Twitter’s Bing-based translation tool, it’s horrendous.
Ari Gesher, Matt Gordon and Julia Chmyz work at Kairos Aerospace, a Bay Area-based company specializing in aerospace solutions for environmental surveying and digital mapping. Ari, Matt and Julia were interviewed in person by Amy Smith during the 2018 Mapbox Locate Conference in San Francisco.
Describe Kairos Aerospace.
Ari: Kairos applies the notions of faster, cheaper, iterative cycles of technology to aerospace. Specifically, with the mission of building sensors to spot very large leaks of methane.
Julia: A less high-level description of Kairos: we deploy aerial sensors (spectrometers, optical cameras, and thermal cameras) to conduct large-scale surveys of assets belonging to oil and gas companies, to discover things about them.
Matt: Kairos is a bunch of physicists and engineers who care about health and safety and climate change. We fly sensors and sell data about environmental pollutants (specifically methane) to oil and gas producers.
What led you each to Kairos?
Ari: I ended up at Kairos because the two original founders, Steve Deiker and Brian Jones, both worked at Lockheed for a long time, and they decided to start their own company. Steve's wife worked with me at Palantir, and they knew that everything they were going to do was going to require a lot of heavy data processing, and that was not an area of expertise for them. They approached me for advice around what it would take to build a team with that kind of ability. That was late 2014. I was instantly interested; it sounded really, really cool… But, for reasons of childbirth, I was not about to switch jobs; I ended up being the original angel investor. Two years later I came on board as the director of software engineering.
Julia: Brian’s wife worked with the woman who was married to my grandfather. And so, my grandfather was actually another one of those original investors — This was 2015 — and he was saying to me, “Julia, there’s this great new company.” And I’m like, “Okay, Grandpa… I’m sure. That’s cool.”
Grandpa says, “They’re so great! They’re so great! You gotta send ‘em your resumé.” I was in school at the time (I’m a year out of college now), and I said, “Okay, fine grandpa, I’ll send ‘em my resumé.”
I hadn’t really looked into it, I just didn’t really want to work at this company my grandpa thought was so cool. But I sent my resumé, and I was really clear about this, I was like, “My grandpa’s really excited about this, but I’m not sure it’s such a good fit.” — expecting to give them an easy way out.
And instead, they wrote back and said, “We’re really interested! Your resumé looks great, we’d really love to have you on board.” So I came in and talked, and actually got to see for myself. And I was like, this looks really great. So I was an intern in the summer of 2016, when we were a third of the size we are now. And then I came back full-time a year ago.
Matt: There’s a lot of funny history between Ari and me, which I won’t go into. I had just done my postdoc at Stanford in physics, and Ari recruited me to go work at Palantir. Then, about six years later, I quit and I was bumming around a bit, and making fire art.
Making what?
Matt: Making fire art… yeah… and I thought I would go get a real job. Ari, at that point, was an angel investor, and he tried to recruit me into his current job.
Ari: That’s right, I tried to hire Matt for my current job.
Matt: And I turned him down to go start my own company, to develop online treatment for substance use disorders. Which, let’s say, the world was not ready for… [Polite chuckles] Mark my words: you’re going to see it.
And then about a year after doing that, Ari saw I was on the job market again, and asked me to come work at Kairos, on a team of four people: two full-timers, an intern, and a couple of physicists who committed code to our code base (for better or for worse).
How many people are there now?
Group: 18.
So it’s grown quite a bit?
Matt: Yeah. It’s moving.
Ari: Yeah, there were sort of two different phases. The first two years, Brian and Steve quit their jobs and were literally in their garage in Los Altos, developing the hardware that is the heart of the methane sensor (which is the imaging spectrometer). And there are pictures: one of them is across the street, positioning a methane cell in the light path of a heliostat, while the other is at the laptop with the original Mark-1 spectrometer, making sure it worked.
Do they still have that?
Ari: They do — it sits on a shelf, and looks like a broken projector or something. [chuckles] So, the first two years was just validating that the hardware would work, and at the end of that, they had the design for what is today our production spectrometer, and the first production-designed unit (although we’re probably going to throw that one out pretty soon.)
The next two years have been developing both the operational side (How do we hook this thing up to a computer, and fly it, and collect data?), and also the software pipelines that sit behind it (How do we take that data off the instrument once it’s done? How do we upload it to the cloud, and develop the algorithms, from scratch, that turn that spectrographic data into the plume images that we have?).
Walk me through the process, from going out and sensing an area through to a final product. And what does that final product look like?
Ari: The way that this works is that we’re given an area, a spot on the ground — the job we’re working on now is about 1,300 square miles?
Matt: We’re given a shapefile.
Ari: Right, we’re given a shapefile, and if we’re lucky, we’re also given a list of assets (another shapefile that tells us where all their wells and storage tanks and things are, so we can identify things once we find a plume over them). We then draw up flight plans to go fly over that area… like, if you look at it, you see the plane going back and forth like a lawn mower. And then, that data goes through the processing pipeline.
Example of a flight path
What comes out the other end is a stack of rasters that show us various measures of what the spectrometer has picked up. At a very rough level, what we’re actually sensing is a methane anomaly. Methane is everywhere in the atmosphere at some level; so it’s not “Is there methane here or is there no methane?”, but “Is there elevated methane?”
We use the large survey area, or chunks of it, to develop what we think the background levels of methane are in that area of the atmosphere. And then, we look for places in the data where there are elevated levels, and use that to interpolate a plume shape.
Example of a plume
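The anomaly-over-background idea Ari describes can be sketched in a few lines of Python. This is a toy illustration, not Kairos's actual pipeline: the median background, the robust spread estimate, and the threshold are all assumptions for the sake of the example.

```python
import statistics

def plume_mask(raster, threshold=3.0):
    """Flag cells whose methane reading is anomalously high.

    `raster` is a 2-D grid of methane measurements. The background is
    estimated as the median over the whole survey area, and a cell is
    a plume candidate if it exceeds the background by more than
    `threshold` scaled absolute deviations (a toy robust z-score).
    """
    values = [v for row in raster for v in row]
    background = statistics.median(values)
    # Median absolute deviation as a robust estimate of spread
    mad = statistics.median(abs(v - background) for v in values) or 1.0
    return [[(v - background) / mad > threshold for v in row] for row in raster]
```

The key point from the interview survives even in the toy: the question asked of each pixel is not "is methane present" but "is methane elevated relative to the locally estimated background".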
One of the things we like to do at GeoHipster is geek out about the tools that people use; tell me about your day-to-day.
Ari: We’re mostly a Python shop. Very large amounts of effort dedicated to making GDAL install and compile correctly.
Matt: I do a lot of the GIS stuff at Kairos. There’s all the code for taking remote sensing data and GPS, and figuring out where that was placed on the ground. Then, taking all of that and creating GeoTIFFs out of that, with all the different metrics that we’re interested in.
Ari: And that’s all custom software, we don’t even use GDAL very much. We use GDAL to open the dataset that we write, but how we figure out what goes into each pixel is all ours.
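The core of the ground-placement problem Ari and Matt describe is the mapping between world coordinates and pixel indices. A minimal sketch of the GDAL-style geotransform convention, assuming a north-up image with no rotation terms (the coordinates and resolution below are made up):

```python
def world_to_pixel(gt, x, y):
    """Invert a GDAL-style geotransform (north-up, no rotation).

    gt = (origin_x, pixel_width, 0, origin_y, 0, pixel_height),
    where pixel_height is negative for north-up images.
    Returns integer (col, row) indices for world coordinates (x, y).
    """
    origin_x, pixel_width, _, origin_y, _, pixel_height = gt
    col = int((x - origin_x) / pixel_width)
    row = int((y - origin_y) / pixel_height)
    return col, row

# Hypothetical 10 m resolution image with top-left corner at (500000, 4100000)
gt = (500000.0, 10.0, 0.0, 4100000.0, 0.0, -10.0)
```

The hard part they allude to, deciding which sensed sample actually belongs in which pixel given aircraft motion and IMU data, sits on top of this simple affine relationship.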
Matt: Yeah, the ground placement of remote-sensed data is an art form… it’s interesting how much we’ve built from scratch. I think people with a lot of background in this probably know a lot of tricks and tools (and I’ve heard tell that there’s a book, but I’ve been unable to find it).
In terms of GIS nerdery: we used to do a lot of ad-hoc analysis in QGIS, and as we were increasing the number of reports we wanted to produce for customers, we wrote a QGIS plugin. It’s custom, and it’s not published anywhere because it’s specific to our workflow and our data, and it gives people summary information.
Anyone who has used QGIS will know that it’s like, incredibly powerful and can be incredibly frustrating. And if anyone from QGIS is reading this, I want them to know that I really appreciate the tool. We love it, and we would use something else if we thought it was better, and we don’t. There’s nothing else better.
Julia, you work on the tools that pilots use when they’re out collecting data. Can you tell us a bit about those?
Julia: There’s the feed that the flight operator sees in the plane, and the spectrometer frames that are being taken. There’s also all the IMU data that’s used for path stuff and all the later calculations… and this is our flight-monitoring app, built with Mapbox and Leaflet. The back end is built in Python, and the front end is in React.
Matt: Ari’s contribution was the X-Wing fighter.
Julia: The point of this is to make everything work as smoothly as possible — so the flight operators don’t have to spend their time staring at multiple log files, which is what they were doing before this.
Matt: So imagine a terminal, and just watching lines of term logs scroll past… in an airplane. In a very small plane.
Ari: Well, now that they use this, they say that they get kind of bored on the plane, because it gives them everything they need. In fact, we built this tool not just to feed information to the operator; it also ingests all the raw data coming off the instrument, and we have a bunch of agents that watch that data for different conditions and control the instruments.
It’s called R2CH4 as an homage to R2D2, who’s an astromech repair droid — and its primary job is not to save the universe, its primary job is just to make the X-Wing go.
I wouldn’t have caught that reference.
Well, CH4 is methane, sooooo… [makes the “ba-dum-tssssss” joke sound]
What do you do when you’re not at work – any hobbies? Matt, I heard about yours a little already: I know you’re a fire artist and you hang-glide?
Matt: I don’t hang-glide anymore, but yeah, I build weird Burner kinetic fire art. I’m making a fire Skee-Ball machine right now, where the balls are on fire. You get to wear big, fireproof kevlar gloves. I was going to bring it to Precompression, which is the pre-Burning Man party they do in SF, but the SF fire department nixed it.
Ari: I dabble in home automation. That’s kind of my tinkering hobby currently. I mean, I’ve had really good hobbies, but now my hobbies are basically my two children. But, you know… I used to be a DJ for a little while. I swear I used to have better hobbies — but I’ve really just been well-employed for like twelve years.
Julia: I spend most of my free time either outside, like hiking, or reading — real books with paper.
Ari: I thought that was illegal now?
Julia: It is here.
Just one last question for you.
Ari: 4-3-2-6! I’m glad you asked — it’s my favorite coordinate system.
Matt: 3-8-5-7 is way better, man.
Julia: …
Are you a geohipster? Why or why not?
Ari: Oh, absolutely. It’s interesting that all of us came to Kairos, not completely illiterate in the ways of GIS, but certainly not as well-steeped. And I was actually thinking about this on the way home: we have non-GIS operational data about what we do, but the core of what we do — everything is geo data. Like, there’s no non-geo data. And, what we’re trying to build is: taking a novel stream of data about the earth, and then running it through very, very modern software pipelines, to automate its processing, its production, all of that, in a way that requires understanding the bleeding edge of technology and blending that with GIS. And that’s what we spend all day doing.
Matt: I am a geohipster because I make artisanal geodata. And I’m opinionated about it. And I’m obnoxious. So, here’s a thing I do which is super geohipster: we produce a lot of stuff internally at the company in WGS84, which is not a projected coordinate system, it’s a geographic coordinate system, and I constantly complain about this. That we are producing GeoTIFFs in 4326 when we should be producing them in a projected coordinate system.
Julia: …And I want to tell you, we were doing all this way before it was cool.
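Matt's complaint has a concrete basis: in EPSG:4326 a pixel is sized in degrees, and a degree of longitude shrinks with latitude, so pixel ground footprints are not uniform. A quick back-of-the-envelope check, assuming a spherical Earth:

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean spherical radius, an approximation

def metres_per_degree_lon(lat_deg):
    """Ground distance spanned by one degree of longitude at a given latitude."""
    return (math.pi / 180) * EARTH_RADIUS_M * math.cos(math.radians(lat_deg))

# At the equator a degree of longitude spans roughly 111 km; at 60° latitude
# only about half that, so a "square" 4326 pixel there is roughly twice as
# tall on the ground as it is wide.
```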
Ari: One last thing: we use us-west-2 as our AWS data center, because it’s carbon-neutral (it runs entirely on hydropower), so it fits in well with our overall mission.
Eric Fischer works on data visualization and analysis tools at Mapbox. He was previously an artist in residence at the Exploratorium and before that was on the Android team at Google. He is best known for “big data” projects using geotagged photos and tweets, but has also spent a lot of time in libraries over the years searching through old plans and reports trying to understand how the world got to be the way it is. Eric was interviewed for GeoHipster by Amy Smith.
Q: You’re coming up on four years at Mapbox, is that right? What do you do there?
A: I still feel like I must be pretty new there, but it actually has been a long time, and the company has grown tremendously since I started. My most important work at Mapbox has been Tippecanoe, an open-source tool whose goal is to be able to ingest just about any kind of geographic data, from continents to parcels to individual GPS readings, numbering into the hundreds of millions of features, and to create appropriate vector tiles from them for visualization and analysis at any scale. (The name is a joke on “Tippecanoe and Tyler Too,” the 1840 US Presidential campaign song, because it makes tiles, so it’s a Tyler.)
Q: I read that you’re working on improving the accuracy of the OpenStreetMap base map. Can you describe that process? I’m guessing one would need to figure out how accurate it is in the first place?
A: I should probably update my bio, because that was originally a reference to a project from long ago: to figure out whether it would be possible to automatically apply all the changes that the US Census had made to their TIGER/Line base map of the United States since it was imported into OpenStreetMap in 2006, without overriding or creating conflicts with any of the millions of edits that had already been made directly to OpenStreetMap. Automated updates proved to be too ambitious, and the project was scaled back to identifying areas where TIGER and OpenStreetMap differed substantially so they could be reconciled manually.
But the work continues. These days, TIGER is valuable to OpenStreetMap mostly as a source of street names and political boundaries, while missing and misaligned streets are now identified mostly through anonymized GPS data. Tile-count is an open source tool that I wrote a few months ago for accumulating, normalizing, and visualizing the density of these GPS tracks so they can be used to find streets and trails that are missing from OpenStreetMap.
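Accumulating GPS density the way tile-count does starts with binning each fix into a web map tile. The standard slippy-map formula is sketched below; this is an illustration of the binning step, not tile-count's actual code, and the zoom level is arbitrary:

```python
import math
from collections import Counter

def lonlat_to_tile(lon, lat, zoom):
    """Convert WGS84 lon/lat (degrees) to slippy-map tile (x, y) at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

def tile_counts(points, zoom):
    """Count GPS fixes per tile; persistently dense tiles hint at real streets."""
    return Counter(lonlat_to_tile(lon, lat, zoom) for lon, lat in points)
```

Comparing such a density surface against OpenStreetMap's road network is one way to surface streets and trails that people clearly travel but that no one has mapped yet.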
Q: In the professional mapping world, I’ve noticed there’s a nervousness around datasets that aren’t time-tested, clearly documented, and from an authoritative source such as the US Census. These official datasets are great resources of course, but there’s a growing amount of data at our fingertips that’s not always so clean or complete. You’ve been successful at getting others to see that there’s a lot to learn about cities and people with dynamic (and sometimes messy) data that comes from many different sources. Do you have any advice on warming people up to thinking creatively and constructively with unconventional datasets?
A: I think the key thing to be aware of is that all data has errors, just varying in type and degree. I don’t think you can spend very much time working with Census data from before 2010 without discovering that a lot of features on the TIGER base map were missing or don’t really exist or are tagged with the wrong name or mapped at the wrong location. TIGER is much better now, but a lot of cases still stand out where Census counts are assigned to the wrong block, either by mistake or for privacy reasons. The big difference isn’t that the Census is necessarily correct, but that it tries to be comprehensive and systematic. With other data sets whose compilers don’t or can’t make that effort, the accuracy might be better or it might be worse, but you have to figure out for yourself where the gaps and biases are and how much noise there is mixed in with the signal. If you learn something interesting from it, it’s worth putting in that extra effort.
Q: Speaking of unconventional data: you maintain a GitHub repository with traffic count data scraped from old planning documents. For those who may not be familiar, traffic counts are usually collected for specific studies or benchmarks, put into a model or summarized in a report… and then rarely revisited. But you’ve brought them back from the grave for many cities and put them in handy easy-to-use-and-access formats, such as these ones from San Francisco. Are you using them for a particular project? How do you anticipate/hope that others will use them?
A: The traffic count repository began as a way of working through my own anxieties about what unconventional datasets really represent. I could refer to clusters of geotagged photos as “interesting” and clusters of geotagged tweets as “popular” without being challenged, but the lack of rigor made it hard to draw any solid conclusions about these places.
And I wanted solid conclusions because I wasn’t making these maps in a vacuum for their own sake. I wanted to know what places were interesting and popular so that I could ask the follow-up questions: What do these places have in common? What are the necessary and sufficient characteristics of their surroundings? What existing regulations prevent, and what different regulations would encourage, making more places like them? What else would be sacrificed if we made these changes? Or is the concentration of all sparks of life into a handful of neighborhoods in a handful of metro areas the inevitable consequence of a 150-year-long cycle of adoption of transportation technology?
So it was a relief to discover Toronto’s traffic count data and that the tweet counts near intersections correlated reasonably well with the pedestrian counts. Instead of handwaving about “popularity” I could relate the tweet counts to a directly observable phenomenon.
And in fact the pedestrian counts seemed to be closer than tweet counts to what I was really looking for in the first place: an indicator of where people prefer to spend time and where they prefer to avoid. Tweets are reflective of this, but also capture lots of places where people are enduring long waits (airport terminals being the most blatant case) rather than choosing to be present. Not every pedestrian street crossing is by choice either, but even when people don’t control the origin and destination of their trips, they do generally have flexibility to choose the most pleasant route in between.
That was enough to get me fixated on the idea that high pedestrian volume was the key to everything and that I should find as many public sources of pedestrian counts as possible so I could understand what the numbers look like and where they come from. Ironically, a lot of these reports that I downloaded were collecting pedestrian counts so they could calculate Pedestrian Level of Service, which assumes that high crossing volumes are bad, because if volumes are very high, people are crowded. But the numbers are still valid even if the conclusions being drawn from them are the opposite.
What I got out of it was, first of all, basic numeracy about the typical magnitudes of pedestrian volumes in different contexts and over the course of each day. Second, I was able to make a model to predict pedestrian volumes from surrounding residential and employment density, convincing myself that proximity to retail and restaurants is almost solely responsible for the number, and that streetscape design and traffic engineering are secondary concerns. Third, I disproved my original premise, because the data showed me that there are places with very similar pedestrian volumes that I feel very differently about.
If “revealed preference” measured by people crossing the street doesn’t actually reveal my own preferences, what does? The ratio of pedestrians to vehicles is still a kind of revealed preference, of mode choice, but the best fit between that and my “stated preference” opinions, while better than pedestrian volume alone, requires an exponent of 1.5 on the vehicle count, which puts it back into the realm of modeling, not measuring. There may yet be an objective measure of the goodness of places, but I haven’t found it yet.
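The fit Eric describes can be written down directly. A toy version of the score: the 1.5 exponent on the vehicle count is his reported empirical finding, while the function shape and the numbers in the comment are purely illustrative.

```python
def preference_score(pedestrians, vehicles):
    """Toy 'revealed preference' score for a street crossing.

    Pedestrian volume relative to vehicle volume raised to the 1.5
    power that reportedly gave the best fit to stated preferences.
    """
    return pedestrians / (vehicles ** 1.5)

# Two intersections with equal foot traffic but different car volumes score
# very differently, which raw pedestrian volume alone would fail to capture.
```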
Why did I put the data on GitHub? Because of a general hope that if data is useful to me, it might also be useful to someone else. The National Bicycle and Pedestrian Documentation Project is supposedly collecting this same sort of data for general benefit, but as far as I can tell has not made any of it available. Portland State University has another pedestrian data collection project with no public data. Someday someone may come up with the perfect data portal and maybe even release some data into it, but in the meantime, pushing out CSVs gets the data that actually exists but has previously been scattered across hundreds of unrelated reports into a form that is accessible and usable.
Q: What tools do you use the most these days to work with spatial data (including any tools you’ve created — by the way, thanks for sharing your geotools on Github)?
A: My current processes are usually very Mapbox-centric: Turf.js or ad hoc scripts for data analysis, Tippecanoe for simplification and tiling, MBView for previewing, and Mapbox Studio for styling. Sometimes I still generate PostScript files instead of web maps. The tool from outside the Mapbox world that I use most frequently is ogr2ogr for reprojection and file format conversion. It is still a constant struggle to try to make myself use GeoJSON for everything instead of inventing new file formats all the time, and to use Node and standard packages instead of writing one-of-a-kind tools in Perl or C++.
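The reprojection step that ogr2ogr handles can be illustrated with the forward formulas for spherical Web Mercator (EPSG:4326 degrees to EPSG:3857 metres). This is a sketch of the underlying math, not a substitute for ogr2ogr, which delegates to a full projection library:

```python
import math

# Web Mercator uses the WGS84 semi-major axis but treats the Earth as a sphere
R = 6_378_137.0

def to_web_mercator(lon, lat):
    """Project WGS84 lon/lat (degrees) to EPSG:3857 (x, y) in metres."""
    x = math.radians(lon) * R
    y = math.log(math.tan(math.pi / 4 + math.radians(lat) / 2)) * R
    return x, y
```

The equivalent command-line invocation would be along the lines of `ogr2ogr -t_srs EPSG:3857 out.geojson in.shp`, which also handles datum shifts and format conversion.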
Q: You’re prolific on Twitter. What do you like about it, and what do you wish was better?
A: I was an early enough adopter of Twitter to get a three-letter username, but it wasn’t until the start of 2011 that I started really using it. Now it is my main source of news and conversation about maps, data, housing policy, transportation planning, history, and the latest catastrophes of national politics, and a place to share discoveries and things to read. I’ve also used long reply-to-myself Twitter threads as a way of taking notes in public as I’ve read through the scientific literature on colorblindness and then a century of San Francisco Chronicle articles revealing the shifting power structures of city planning.
That said, the Twitter timeline interface has become increasingly unusable as they have pulled tweets out of sequence into “in case you missed it” sections and polluted the remainder of the feed with a barrage of tweets that other people marked as favorites. I recently gave up entirely on the timeline and started reading Twitter only through a list, the interface for which still keeps the old promise that it will show you exactly what you subscribed to, in order.
Q: If you could go back in time, what data would you collect, from when, and where?
A: I would love to have pedestrian (and animal) intersection crossing volume data from the days before cars took over. Was the median pedestrian trip length substantially longer then, or can the changes in pedestrian volumes since motorization all be attributed to changes in population and employment density?
Speaking of which, I wish comprehensive block-level or even tract-level population and employment data went back more than a few decades, and had been collected more frequently. So much of the story of 20th century suburbanization, urban and small-town decline, and reconsolidation can only be told through infrequent, coarse snapshots.
And I wish I had been carrying a GPS receiver around with me (or that it had even been possible to do so) for longer, so that I could understand my own historic travel patterns better. I faintly remember walking to school as a kid and wondering, if I don’t remember this walk, did it really happen? Now my perspective is, if there is no GPS track, did it really happen?
Q: Are you a geohipster? Why or why not?
A: I think the most hipster thing I’ve got going on is a conviction that I’m going to find a hidden gem in a pile of forgotten old songs, except that I’m doing my searching in promo copies of 70-year-old sheet music instead of in the used record stores.
Nate Smith is technical project manager for the Humanitarian OpenStreetMap Team. He leads the OpenAerialMap project and dives into all things technical across HOT’s operations. Originally from Nebraska, he is now based in Lisbon, Portugal, slowly learning Portuguese and attempting to learn to surf.
Nate was interviewed for GeoHipster by Amy Smith.
Q: We met at State of the Map Asia in Manila! What was it that brought you to the conference?
A: I came to State of the Map Asia through my role in two projects with the Humanitarian OpenStreetMap Team: OpenAerialMap and a new project called Healthsites. I had the chance to give short presentations about the projects, and I wanted to connect with the OpenStreetMap community in Asia to get feedback and input on their direction.
Q: Tell us about the Humanitarian OpenStreetMap Team (HOT) and how you got involved.
A: I’ve been involved in HOT in one way or another since 2011. At the time I had just joined Development Seed in Washington DC. I began to get involved in any way I could, mostly through trainings about Mapbox tools or collaborating on projects. Initially that revolved around helping identify data that could be useful in an activation, or joining in tracing. Over the years I gradually got more involved in the working groups, which are the best place to get involved beyond contributing time to mapping. I’ve since joined HOT as a technical project manager to help build and manage projects around some of our core tools, like OpenAerialMap and OSM Analytics.
Q: For those who may not be familiar with HOT, “activation” is kind of like bringing people together to participate in disaster mapping or a similarly geographically-focused humanitarian mapping effort, did I get that right?
A: Right, a HOT activation in the traditional sense is exactly that. It is an official declaration that the community is coming together to aggressively map an area for a disaster response. The Activation Working Group is one of several working groups where anyone can get involved, and they define the protocols, monitor situations, and are in contact with many OSM communities and humanitarian partners around the world.
Disaster mapping is a core part of the work HOT does; not everything, but still a big part. If you’re interested in helping shape activation protocols or want to help organize during an activation, come join and volunteer your time to support the work.
Q: What are some interesting projects you’re working on?
A: I’ve been actively working on two interesting projects: OpenAerialMap and, for lack of a better name at the moment, the Field Campaigner app. OpenAerialMap launched two years ago, and we’ve been slowly rolling out new features and working with partners on integrating new data ever since. What’s interesting is the work we’re doing this summer: we’re rolling out user accounts, provider pages, and better data management tools. This is exciting because it lowers the barrier to start collecting imagery and contributing to the commons.
The second project is our new Field Campaigner app. It has a generic name at the moment, but it’s part of a move toward better tools for managing data collection in the field. The majority of the work the global HOT community does is remote mapping. While that work is critical and extremely helpful for people on the ground, there is a gap in how work is organized on the ground itself. This project aims to improve the way field data collection is organized and coordinated; we want field mapping in OpenStreetMap to be well distributed and well organized. It also overlaps with similar work happening across the board in this area: Mapbox is working on analyzing changesets for vandalism, and a team from Development Seed and Digital Democracy, through a World Bank project, is working on an improved mobile OSM data collection app.
Q: How easy/hard is it to build these tools? Once they’re out in the world, what are some ways that people find and learn how to use them?
A: It’s not easy building tools that meet a lot of needs. A core ingredient for success is often dogfooding your own work. We’re building tools that serve a wider audience, but at the core we’re testing the tool and helping spread the word about it because we use it ourselves.
But just because it’s not easy doesn’t mean people shouldn’t be trying. The more we experiment building tools to do better and faster mapping, whether it is remote or in the field, the more information we will have to improve and address the challenges many communities face.
Q: It looks like your job is fairly technical, but also involves outreach. Is there a particular aspect of your work that you enjoy the most?
A: I think the mix of technical work and outreach is what I love most. Spending part of my day diving into code and the other part talking and strategizing with organizations is what I’ve had the chance to do over the last six years, first at Development Seed and now at HOT. I enjoy trying to be that translation person: connecting tools, or ways of using data, to real-world problems. One of the things I enjoy most is the chance to help build products or use data with real-world impact. Being able to support MSF staff responding to an Ebola outbreak while working with world-class designers and developers is pretty great.
Q: Looking at your Twitter feed, you seem to travel a lot. What’s your favorite / least favorite thing about traveling? Favorite place you’ve been? Any pro travel tips?
A: I traveled a bit while living in DC, but now that I’m living in Lisbon, Portugal, I’ve had the chance to do more personal travel throughout Europe, which has been great. This past year I’ve also traveled through Asia for HOT-related projects. My favorite part of traveling is the chance to meet people and experience new cultures and places. There are some incredible geo and OSM communities around the world, and it’s been awesome to meet and work with many of them. Least favorite: awkwardly long layovers, when you can’t get out.
I think my favorite spots have been Bangkok and Jakarta. I find that I enjoy big cities that have great food options. As for tips, I would say pack light and do laundry when you’re traveling, and always make time for good local food.
Q: Would you consider yourself a geohipster? If so, why, and if not, why not?
A: Heh, that is a great question. I think I’ve become less geohipster since moving to Portugal. I drink light European beer, I don’t bike because there are too many hills, and I drink too much Nespresso. But I’m still a Mapbox junkie, I work at a cowork space in my neighborhood, and I love open source, so maybe I still lean geohipster. 🙂
Q: In closing, any words of wisdom for our global readership?
A: Get out and visit a new place in the world if you can. And while you’re at it, reach out to the OSM communities there and meet them in person. You’ll meet some incredible and passionate people.
Amy Smith is a Geospatial Data and Technology Specialist with Fehr & Peers in San Francisco. She’s had some great opportunities working with geographic information systems in a variety of fields, including environmental studies, satellite imagery analysis, water resources, and transportation planning. Amy currently spends her days working with an amazing group of people focused on improving transportation in our communities. In her free time she enjoys exploring the hills of San Francisco.
Q: A few years back we used to work together, but I don’t actually remember how you got into GIS. You have a master’s in it, right?
A: I do! I have a Master’s in Geography and Regional Studies. I got into GIS through a chance encounter with a geography professor whom I passed in a hallway on campus. Somehow we started talking about geography. I was undeclared at the time. I was trying to decide between GIS and intro to computer science for a general requirement. He told me a bit about GIS and geography, and that really won me over. Who knows, I could have been a computer scientist if I hadn’t met him!
Q: You didn’t leave computers entirely though, you’re pretty slick with code… In fact, you’re a great promoter of Python. Which came first – GIS or coding?
A: My first programming/scripting language was Matlab. I learned it while I was working on my master’s studying space-based synthetic aperture radar data in the Florida Everglades. Through learning Matlab, I learned the basics of programming logic. When I started using desktop GIS every day for work, it got me thinking about ways I could be using programming for spatial analysis, which led me down the path to Python. Since then, I’ve used it almost every day, and not just for spatial analysis.
Q: What other tasks do you use Python for?
A: Lately I’ve been using it to prepare transit data for travel demand models. Since many of the inputs of the models are text-based, Python lends itself well to these types of tasks. It can also come in handy for automating things you’d rather not do manually. For example, I had an Excel spreadsheet with multiple worksheets that needed to be saved as individual CSVs. Instead of exporting them one by one, I wrote a script to iterate through each worksheet and save it as a CSV. Kind of a mundane example, but it’s this type of thing that I think can save lots of time at the end of the day.
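The worksheet-to-CSV task Amy describes can be sketched in a few lines of Python. This is a minimal illustration using pandas (not her actual script); the filename `report.xlsx` is a hypothetical placeholder:

```python
# Save each worksheet of an Excel workbook as its own CSV file.
# A minimal sketch assuming pandas (with an Excel engine such as
# openpyxl) is installed; "report.xlsx" is a hypothetical filename.
import pandas as pd

def export_sheets_to_csv(xlsx_path):
    """Write every worksheet in xlsx_path to <sheet name>.csv."""
    # sheet_name=None loads all worksheets into a dict of DataFrames,
    # keyed by worksheet name.
    sheets = pd.read_excel(xlsx_path, sheet_name=None)
    for name, df in sheets.items():
        df.to_csv(f"{name}.csv", index=False)

if __name__ == "__main__":
    export_sheets_to_csv("report.xlsx")
```

Because `sheet_name=None` reads every worksheet in one call, the loop body stays a one-liner, which is exactly the kind of mundane-but-time-saving automation described above.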
Q: Speaking of time, you did a transportation study and saved time by scripting a node-based analysis relating road segments to bicycle accident occurrences. I saw your talk at the Esri UC where you talked about becoming one of the points. Do you know if that study has been reviewed by any of the traffic safety folks out in your area? Has it helped any?
A: That was one of the first projects where I got a chance to develop a custom script tool for ArcGIS. The tool uses a roadway network and collision data to pinpoint high-incident collision areas that might need attention. It was applied most recently by Placer County here in California to run a collision analysis of their entire county-maintained roadway network, which used to be a manual review process. They used some of the results to apply for, and receive, several grants funding highway safety projects. Another benefit is that the county can continue to use the tool with new data in their safety programs.
Q: You’ve taught workshops on Python and even done some online workshops. Do you have any more in the future, or are you branching out to something different?
A: I’m planning some internal Python training here at Fehr & Peers for our planners and engineers who’d like to learn more about it. I’m always happy to talk with others about Python, so I hope there are more opportunities out there for workshops. I’m still learning too, so I’m always on the lookout for workshops and meetups others are hosting. In terms of branching out, I’ve recently been diving into JavaScript. There’s a library I’ve been learning about called D3 that has some great spatial as well as non-spatial capabilities. I’m still in the “stumbling through it” phase, but luckily there’s a great user community online and here in the Bay Area that’s eager to share knowledge.
Q: A few months back you attended my first #maptimeSF with me; now that you’ve moved out to San Francisco I see you get to go to #maptimeSF more often. For someone who is thinking about attending their first Maptime, how do you think it helped you as an advanced GISer?
A: Maptime is a meetup that’s happening in many cities around the world where folks can get together, learn about maps, make maps, talk about maps, or maybe just hang out with friends. One of the things I love about Maptime is that it’s open to all skill levels and backgrounds. People are encouraged to ask questions and learn from each other. It’s a very welcoming environment. I’ve learned a lot about how others outside of my industry are using geospatial data and technologies. It’s also encouraging to see a thriving interest and enthusiasm for maps.
Q: Hearing about your work in transportation is really interesting. The water side still misses you. What are you up to at Fehr & Peers? Any interesting projects you can share with us?
A: I have so many great memories from my time with the Department of Water Resources in West Sacramento — it’s where I really started to get my feet wet (pun intended) with Python! It’s also where I learned to drive a boat. I definitely miss the field work collecting bathymetry data in the Sacramento-San Joaquin delta (picture below for proof), and of course the people too!
Amy Smith collecting bathymetry data in the Sacramento-San Joaquin delta
One of the great things about GIS is that it’s applicable in many different industries, and transportation planning has a lot of great uses for it. One of the more recent projects I worked on focused on improving cyclist and pedestrian access to transit stations. The project had a large data organization component that involved gathering available spatial data and organizing it in a consistent way so that we could use it in a series of network analyses. We looked at some of the ways a well-connected network might help improve access to transit, making it easier for people to walk and bike to stations.

I’m currently working on another transit-related project that involves improving transit in an area that doesn’t have much existing service. It can be a challenge to anticipate how new facilities will affect travel in an area when you don’t have many observations of how people currently use transit. In this case, we’re identifying places that have developed transit networks and share similar characteristics with the study area that’s considering improving or expanding its system.

Both of these projects are very much rooted in spatial analysis, but they also require local knowledge. Another fun part of my job is getting to know new areas and talking with people to learn about qualities specific to their region that might not be obvious from just looking at the data.
Q: If people are looking to check out some of your cool stuff, where can you be found online?
A: I tweet about spatial topics, transportation, and my endless appetite for spinach @wolfmapper.
Q: Geohipster Amy Smith is awesome! How do you feel about being part of spreading the geohipster gospel?
A: I’m a big fan of GeoHipster! I was trying to disguise myself a bit by using a serifed font, but I think you found me out anyway. 🙂
Q: Speaking of transportation, I’ve got to wrap this interview up so I can cycle to work. Is there anything else you would like to share with #geohipster readers?
A: I recently learned about a spatial data format called TopoJSON that’s one of my new favorite things. I found out about it at a recent Maptime on D3 and have been reading more about it on Mike Bostock’s wiki. Also, I’ll be helping host a webinar on transit planning with Code for America next month. Tune in if you’re interested!