Jim Hughes is a mathematician at Commonwealth Computer Research, Inc. in Charlottesville, Virginia. He is a core committer for GeoMesa, which leverages Accumulo and other distributed database systems to provide distributed computation and query engines. He is a LocationTech committer for GeoMesa, JTS, and SFCurve, and serves as a mentor for other LocationTech projects. He serves on the LocationTech Project Management Committee and Steering Committee. Through work with LocationTech and OSGeo projects like GeoTools and GeoServer, he works to build end-to-end solutions for big spatio-temporal problems. He holds a PhD in algebraic topology from the University of Virginia.
Jim was interviewed for GeoHipster by Todd Barr.
Q: Your background is that of a mathematician. How did you find geospatial, or did it find you?
A: Geospatial definitely found me! I started at CCRi in the summer of 2012. That fall, I was working on the code base that eventually became GeoMesa, our Hadoop-based open-source geospatial database. I liked the project enough that I requested to work on it more, and after a few other rotations, I made it back to the project and have been working on it ever since. During my time with GeoMesa, I’ve had a chance to participate in the OSGeo and LocationTech open source communities at code sprints, and also to attend conferences like FOSS4G NA. The conferences and code sprints have been a great way to learn the ecosystem while meeting a bunch of great people!
Q: One of your projects is GeoMesa, one of geospatial’s first real applications to deal with Big Data. How did it evolve? Was it a grouping of client requests, or something created in house?
A: CCRi focuses on solving interesting data science and machine learning problems. A customer asked us to transition one of our spatio-temporal analytics from a single server infrastructure to the cloud. We had been using PostGIS for our geospatial data management and processing, and we asked if that (or an analogue) was available in the cloud. When the customer said that the only database was Accumulo, and that it didn’t do geospatial, we wrote the code to make it do just that. After a few months, we realized that this software was a compelling product on its own. From there, GeoMesa has evolved in response to direct use cases.
Q: Most of us in the geo community have ideas about what we want to bring to market. From your experience on GeoMesa, do you have any lessons learned or warnings for those of us who want to do this?
A: Have great documentation and demos! Standing up a distributed database is tough, and adding software to that can be challenging. We’ve used a simple ‘quickstart’ project to show how to use the GeoTools DataStore API to write and read with GeoMesa. The documentation also explains how to set up GeoServer. When sample code and docs aren’t enough, be ready to respond to questions from users. We field questions on mailing lists, Gitter, and Stack Overflow. From those questions, you can get a sense of what folks are using your product for. Recently, I had some great questions about one of GeoMesa’s less well-known features. Those questions can drive simplification of deployments and improvements to documentation. If users can see what they are getting, see it work for them, and get help along the way, they are going to be happy.
Q: Over the past few years there has been an increased presence of the federal government at FOSS4G. What is your take on the adoption of open source spatial technologies both widely across the federal government and with your clients?
A: Having the US federal government involved is great! At that level of government, they have ‘big data’ and a vision to fund and drive innovative work for storing and processing the data. NGA and other agencies are definitely ‘getting it’ by funding and fostering open source technologies. When our work can be shared publicly, many organizations benefit; everyone can get the same code up, running, and working together to achieve more than what we were able to previously.
Q: As you sit on the board of LocationTech, who recently announced LocationCon, is this the first move of LocationTech to finally leave the shadows and become a driving force in the FOSS4G community?
A: I’d say that LocationTech has been moving the geospatial software community forward for a few years now. Some of that has admittedly been ‘behind the scenes’… As the logistics organizer for FOSS4G NA 2015 and 2016, it increased inclusivity for women at the conferences through both a code of conduct and a scholarship program.
Also, behind the scenes, LocationTech reviews its projects’ dependencies. Through that process, LocationTech projects like uDig, GeoMesa, and GeoGig have had lots of GeoTools code reviewed and, in a number of places, those project teams have worked with the GeoTools team to clarify licensing and clean up code.
In 2016 and 2017 GeoMesa, GeoTrellis, and GeoGig all completed incubation. These projects represent complex libraries and products which address several areas of innovation in geospatial software. On the basic library level, Spatial4J was the first project to incubate, and JTS is close to graduating. Those two projects are libraries that are widely used: Spatial4J came out of Lucene’s spatial indexing needs, and JTS has been foundational to a number of Java projects. LocationTech is home to basic libraries (like JTS, Spatial4J, and SFCurve) and complex products such as GeoMesa, GeoTrellis, and GeoGig.
Q: CCRi is known for their “Friday afternoon parking lot BBQ.” What is your favorite style of barbeque?
A: Finally, an easy question! I am a big fan of pulled pork and pork ribs. For sauce, I favor spicy and sweet options over vinegar and mustard-based ones.
Q: How do you define a “geohipster” and do you consider yourself one?
A: I suppose a ‘geohipster’ is a geo/GIS professional or enthusiast who is up on the new, trendy, cool technologies (perhaps bearded and wearing plaid?) in the geo-domain. I’ve been working on big-geo and streaming geo data for the past few years, so if that’s en vogue, then sure, I can be a ‘geohipster’. Given my interest in many of the low-level libraries and the math/geometry behind the field, the moniker ‘geonerd’ might be better.