Episode 8: How AI is Helping Map the Coastline

In episode 8, Ron sits down with Jeff Perry, a prominent engineering scientist in the Department of Aerospace Engineering and Engineering Mechanics at The University of Texas at Austin. Together, Ron and Jeff talk about how AI is used in mapping coastlines, using data from ICESat-2 (the Ice, Cloud and land Elevation Satellite-2) to map the ocean floor, and much more.

Resources:

3D Geospatial Laboratory website: https://magruder3dgl.com/

UT Austin's Center for Space Research: https://www.csr.utexas.edu/

SlideRule Earth: https://www.slideruleearth.io/web/

SlideRule Earth's GitHub repositories: https://github.com/ICESat2-SlideRule

Ron Green: Welcome to Hidden Layers, where we explore the tech and the people behind artificial intelligence. I'm your host, Ron Green, and I'm excited to be joined today by Jeff Perry to discuss how he and his team of scientists are using advances in AI, and computer vision specifically, to map the ocean coastline in unprecedented detail. Jeff is a senior engineering scientist in the Department of Aerospace Engineering and Engineering Mechanics at the University of Texas at Austin. He is currently the technical lead for the 3D Geospatial Laboratory at the Center for Space Research and has a background in computer vision, machine learning, and software engineering. Prior to the Center for Space Research, Jeff was a technical lead at the Applied Research Laboratories and at the Center for Perceptual Systems at the University of Texas. He has also worked as a developer in the software industry, where he specialized in image processing, software optimization, and video compression. Welcome, Jeff.

Jeff Perry: Thanks for having me.

Ron Green: Well, I'm excited to dig into this. I don't know much about bathymetry; I believe that's the right term for essentially mapping the topography of the ocean floor. So maybe let's kick off. Could you tell us a little bit about your work currently at the Center for Space Research?

Jeff Perry: So our lab, the 3D Geospatial Laboratory, deals primarily with 3D data, which looks a little different from most of the data we're used to dealing with. It can come from many places: from satellites, from terrestrially mounted systems, from drones. We've seen a lot of that lately. And it generally comes in a couple of flavors. One of those is LiDAR. And more commonly now, we see a lot of what we call EO, electro-optical imagery, photogrammetric data.

Ron Green: Okay, is that data coming from LiDAR systems mounted on satellites, drones, et cetera?

Jeff Perry: Yeah, LiDAR systems can be mounted in lots of ways. A lot of them are mounted on satellites.

Ron Green: Really quickly, can you explain what LiDAR is for everybody?

Jeff Perry: Sure. LiDAR is a laser optical system, what we call laser telemetry. It's a way to measure distance using a laser. It basically sends a pulse of millions or billions of photons, and it measures the time of flight: the time it takes for the pulse to leave the device, bounce off of something, and come back. If it knows that time, and it knows the speed of light, it can measure extremely accurately how far away something is.
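To make that arithmetic concrete, here is a minimal sketch of the time-of-flight calculation Jeff describes; the round-trip time below is illustrative, not a figure from the episode.

```python
# Time-of-flight ranging: distance = (speed of light * round-trip time) / 2.
C = 299_792_458.0  # speed of light in m/s

def range_from_time_of_flight(round_trip_seconds: float) -> float:
    """Distance to the target, given the photon round-trip time."""
    return C * round_trip_seconds / 2.0

# A satellite ~300 miles (~480 km) up sees a round trip of roughly 3.2 ms.
print(range_from_time_of_flight(3.2e-3))  # ~479,668 m, i.e. about 480 km
```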

Ron Green: And my understanding of LiDAR is that it allows us to do things we really couldn't do before, when we were dealing with just visible light. For example, we can penetrate rainforest canopies and things like that. Is that correct?

Jeff Perry: That's absolutely correct.

Ron Green: Okay. And so most of this data that I understand you're operating on these days comes from a satellite. Can you tell us a little bit about that?

Jeff Perry: Yes. It comes from all different sources, but one in particular, the ICESat-2 satellite, flies 300 miles above the Earth. It's traveling around the Earth from the North Pole down and back again, and it takes 90 minutes to complete one revolution. So it's going very quickly.

Ron Green: And it's moving around the Earth, I believe, in a way that allows it to scan every part of the Earth. So it's literally looking at every square mile of the Earth.

Jeff Perry: Well, yes, over time. The laser beam is extremely small when it leaves the satellite, and it spreads out as it goes down, so the footprint is about 11 meters. But as the satellite goes around the Earth, it changes course a little so that it'll cover every part of the Earth. It takes about 90 days to get a full mapping of the Earth, and then it starts over.

Ron Green: It just starts over. Okay, that's incredible. You're saying this thing is 300 miles above the Earth?

Jeff Perry: Yes.

Ron Green: And the surface area of the projection, what was it, 11 meters?

Jeff Perry: Yes.

Ron Green: That's just unfathomable. So the light is getting back to the satellite with enough fidelity that you're able to use it to map the ocean coastline. Let's talk about that a little bit. How does that work?

Jeff Perry: So it's ICESat, and one of the purposes of the original mission was to measure changes in the cryosphere, the polar regions. They already knew it penetrates forests, and they knew it's measuring the whole Earth. It never turns off, right? It's constantly collecting data in real time. But what they were surprised to learn is not just that it penetrates water, but that it goes down pretty deep. In some cases it can go 50 or 60 meters. And the coastal regions, down to, say, 200 meters, that's where 90% of marine life lives.

Ron Green: Okay, that makes sense. So even though it can't see near the bottom of the ocean, we're not that interested in that, generally, from a transportation or a sea life perspective, right? Most of our focus is along the coastline, where it works pretty well.

Jeff Perry: Yeah, it works well. So recently there's been this huge increase in the amount of data available because of this. And if you think about it, the coastal regions were hard to map until this technology, because in shallow water you can't send a ship into a lot of these areas. It's too shallow. So it's exposing areas that might have been too dangerous to send a boat into, or that you couldn't wade out and measure. It's exposing these very interesting areas of the world that previously we couldn't really see.

Ron Green: That's fantastic. So let's talk a little bit about the data before we transition into talking about computer vision and how you're leveraging it. The data coming from the ICESat-2 satellite, I believe it's point cloud data? Can you talk about what the data looks like, and what the volume you're dealing with is, from a data perspective?

Jeff Perry: Sure. So, generally, the volume is enormous. Again, this thing's constantly collecting data.

Ron Green: And is that data publicly available?

Jeff Perry: It is publicly available, yes. There's lots of data out there, and NASA publishes the data from this particular instrument; I can talk more about that later. So, point cloud data. A lot of the data we deal with is two-dimensional imagery or things of that nature, and this data looks different. One analogy I like to make to describe it: imagine you had a machine gun, and you stood there and fired it, and I don't recommend doing this, but imagine that wherever a bullet hits, you could magically know the X, Y, Z coordinates of that bullet in 3D space, relative to where you're standing. Each data point represents a point on a surface out in the world. The important point, though, and why I say machine gun, is that the points are randomly spaced. Imagery is a nice regular grid, where the position of the data is implied; it's part of the format, and the data just expresses the values at those locations. This is what we call a sparse representation, where you only represent certain points, arranged in a way that's more or less random.
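To illustrate the distinction Jeff is drawing, here is a toy comparison between a dense grid, where positions are implied by the format, and a sparse point cloud, where every point carries its own coordinates; the arrays are made up for illustration.

```python
import numpy as np

# Dense 2D imagery: positions are implied by row/column indices,
# so the format only stores the value at each grid cell.
image = np.zeros((480, 640), dtype=np.float32)
image[100, 200] = 1.0  # "the pixel at row 100, column 200 has value 1.0"

# Sparse point cloud: each row is an explicit (x, y, z) coordinate,
# irregularly spaced, in no particular order -- like Jeff's bullet holes.
points = np.array([
    [101.32, -43.75, 12.01],
    [101.35, -43.71, 11.87],
    [105.90, -44.02,  9.43],
], dtype=np.float64)

# There is no grid to index into; spatial queries mean searching coordinates.
query = np.array([101.3, -43.7])
nearest = points[np.argmin(np.linalg.norm(points[:, :2] - query, axis=1))]
print(nearest)
```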

Ron Green: Right. I would imagine the atmosphere, cloud cover, maybe even rough seas at a particular time and location, does all of that affect the quality of the LiDAR signal you're getting back?

Jeff Perry: Yes. 3D data that comes from a LiDAR system in general has the problem of occlusion: whatever your vantage point is, things that are blocked from it don't get represented. And the data coming from the ICESat-2 satellite specifically is especially challenging because of what you just said. With cloud cover, and I have examples of this, you'll have continuous data and then all of a sudden a break in it. That's because a cloud is going over; it occludes what's beneath it. And there are all these other factors in ICESat's case that affect the quality of the data. Turbidity of the water affects it, because those photons are having to travel through water. Aerosols in the air, as I already mentioned, not just cloud cover but any type of aerosols. And the device is counting photons that bounce back. If it's daylight, there are a lot of stray photons out there; if it's nighttime, there aren't as many. So in some examples of the data I have, you can see this background scatter and you know, oh, this was collected during the day, because there's all this noise in the background. At nighttime, not so much.
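As a rough illustration of separating signal from that background scatter, here is a toy density filter over photon elevations: signal photons pile up near the true surface return, while daytime solar background is spread thinly across elevation. This is a simplified stand-in, not the actual ICESat-2 processing chain.

```python
import numpy as np

def density_filter(elev: np.ndarray, window: float = 1.0,
                   min_neighbors: int = 5) -> np.ndarray:
    """Keep photons whose elevation neighborhood is dense enough.

    Returns a boolean mask: True for photons kept as signal.
    """
    elev = np.asarray(elev)
    # For each photon, count how many others fall within +/- window meters.
    counts = np.array([(np.abs(elev - e) <= window).sum() - 1 for e in elev])
    return counts >= min_neighbors

# Toy data: 200 signal photons near a -12 m surface, 100 daytime strays.
rng = np.random.default_rng(0)
signal = rng.normal(-12.0, 0.3, 200)
background = rng.uniform(-60.0, 40.0, 100)
photons = np.concatenate([signal, background])

mask = density_filter(photons)
print(mask.sum(), "of", photons.size, "photons kept as signal")
```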

Ron Green: Oh, that makes perfect sense. Okay, and you've mentioned that you've got some images. We were looking at those before we started recording. So for anybody who's listening to this, we will put these images into the video on YouTube. So at any point, if you want to understand better what Jeff's talking about, check out that, check out those videos. So one more question on the data collection. So if the satellite is moving across the globe and covering every area on a 90-day basis, does that mean if you have really strong cloud cover in an area, you basically have to wait three months to get that data again?

Jeff Perry: Well, yeah. And I should be clear: it doesn't really cover every area of the globe. The flight path of the satellite repeats after 90 days, and it actually has six lasers on it, so it draws six different, we always say, pencil lines. But you can direct the satellite; you can change its attitude and point it differently as it flies over. Sorry, I lost the thread of your question.

Ron Green: No, no, that's perfect. That's exactly what I was curious about. Okay, so you've been working in this field for a really long time. I'd love to segue to talk about how AI and advances in computer vision are making your life easier and allowing us to really understand this complex data better. Before we jump into that, maybe describe the old techniques. If you were dealing with this type of data, let's say, 10 years ago, what was the normal approach, and what were the challenges with it?

Jeff Perry: So back then, we would just hand-select the features, do feature engineering, and use random forests, XGBoost, or something like that. But the time-consuming part was picking out those features and then determining feature importances. Does this feature work? Is this one not doing anything? Things like that.
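For listeners who didn't live through that era, this is roughly what the workflow looked like: hand-crafted features fed to a random forest, then feature importances inspected to see what was pulling its weight. The feature names and data here are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical hand-engineered features, one row per photon window.
feature_names = ["local_density", "elev_variance", "slope", "return_intensity"]
rng = np.random.default_rng(0)
X = rng.random((1000, len(feature_names)))  # stand-in feature matrix
y = rng.integers(0, 2, 1000)                # stand-in labels: signal vs. noise

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# The time-consuming loop: inspect importances, drop dead features, repeat.
for name, importance in zip(feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```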

Ron Green: Right. Yeah. You're describing the old days perfectly. You would use your intuition to try to craft features for a model, but then very frequently those features weren't very robust, right? And so I imagine that now, with modern computer vision architectures, you're much more able to almost just throw the raw data into the system and let it handle feature extraction, et cetera. Let's talk about that a little bit.

Jeff Perry: Sure. So again, the data we're looking at are very complicated, and it gets even more complicated when we're looking at things of this non-stationary nature.

Ron Green: And what does that mean just for our listeners?

Jeff Perry: It means that the statistical properties of the data change, often dramatically, as you move through space: as you go from this kilometer to the next kilometer, or even from this meter to that meter, the statistical properties of, say, the noise might change. That makes it very challenging, because people have their favorite data sets and they say, oh, it works well here, it works well there, and then you try it somewhere else and the whole thing falls apart. So feature engineering doesn't work as well, especially at this scale. We're trying to create a global product; that's really one of the key points here. And I just feel like it wouldn't be possible without machine learning; if you had to do feature engineering, it would just slow you down. Now what we can do is build images from the data and feed the image directly in. You talked about CNNs, and the work I do with bathymetry is CNN-based: typically I use a residual network, build up an image, feed that in, and let the machine find the features itself. But on the 3D data, we're also developing techniques, usually with transformers, that operate on the 3D data itself. We don't even have to build up 2D imagery; we're designing algorithms that operate on the points, on this sparse point cloud, if that makes sense.
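A minimal sketch of the image-based pipeline Jeff describes: rasterize a photon point cloud into a 2D density image, then feed it to a small residual CNN. This is a toy stand-in, not his actual model; the image size, task, and class count are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

def photons_to_image(along_track_m, elevation_m, bins=(128, 128)):
    """Rasterize a photon point cloud into a 2D density image."""
    img, _, _ = np.histogram2d(along_track_m, elevation_m, bins=bins)
    return torch.from_numpy(img).float().unsqueeze(0).unsqueeze(0)  # (1,1,H,W)

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # The skip connection is what makes the network "residual".
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

class TinyResNet(nn.Module):
    """Toy classifier, e.g. seafloor return present vs. absent (assumed task)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.stem = nn.Conv2d(1, 16, 3, padding=1)
        self.blocks = nn.Sequential(ResidualBlock(16), ResidualBlock(16))
        self.head = nn.Linear(16, n_classes)

    def forward(self, x):
        x = self.blocks(self.stem(x))
        return self.head(x.mean(dim=(2, 3)))  # global average pool -> logits

# Fake photons: 5,000 points along 100 m of track, elevations in meters.
x = photons_to_image(np.random.rand(5000) * 100, np.random.randn(5000) * 5)
print(TinyResNet()(x).shape)  # torch.Size([1, 2])
```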

Ron Green: Absolutely, that's fascinating. So you're even removing the step of almost any type of pre-processing, right? If you're using some type of transformer-based computer vision model, you just essentially need to tokenize or patchify the input data, and then you let the model do its thing. I love it. Getting labeled data is pretty expensive, and you're dealing with an enormous amount of data. What are some of the things you're doing there around either augmentation or simulation to handle that?

Jeff Perry: Yes, that's a great question. Augmentation we usually do as a matter of course for most data, because mirroring, scaling, and things of that nature make the models more robust. And then semi-supervised approaches, where we have labeled data but then let the model help us label more data. At the very minimum, we can use a model to label data and then go back and fix it, though that might be a little biased. Or, better yet, we can have the model tell us: these areas I'm very uncertain about, you need a human to do them, and this area, I'm pretty certain I've got.

Ron Green: You're looking at the model's confidence levels to figure out where its uncertainty is maximized, and focusing on having humans label that data.
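A sketch of the uncertainty-driven triage Ron just summarized: score unlabeled samples by predictive entropy and send only the most uncertain ones to a human. The probabilities and threshold below are placeholders.

```python
import numpy as np

def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Entropy of each row of class probabilities; higher means less certain."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

# probs: (n_samples, n_classes) softmax outputs from any trained model.
probs = np.array([
    [0.98, 0.02],  # confident -> auto-label
    [0.55, 0.45],  # uncertain -> route to a human annotator
    [0.85, 0.15],  # confident enough -> auto-label
])
needs_human = predictive_entropy(probs) > 0.6  # threshold is a tunable placeholder
print(needs_human)  # [False  True False]
```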

Jeff Perry: Yeah. Or another example, and this one is more self-supervised. I have colleagues, Forrest Corcoran and Chris Parrish, at Oregon State University, who've developed a self-supervised model. As I was saying, clouds obscure things, right? And if you know what inpainting or data imputation is, inpainting is a type of data imputation. They've developed an imputation model where they take the data and just go remove patches from it. But they already know what's there, and so that's how they label it; they already have the labels. That way, they can then apply the model to areas where there's cloud cover and fill in what would be there.
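A toy version of the masked-patch idea Jeff describes, my simplified reading of it rather than Corcoran and Parrish's actual model: hide patches of data you already have, train a network to reconstruct them, and the labels come for free.

```python
import torch
import torch.nn as nn

def mask_random_patches(x, patch=16, n_patches=4):
    """Zero out random square patches; return the masked input and the mask."""
    masked, mask = x.clone(), torch.zeros_like(x)
    _, _, h, w = x.shape
    for _ in range(n_patches):
        i = torch.randint(0, h - patch, (1,)).item()
        j = torch.randint(0, w - patch, (1,)).item()
        masked[:, :, i:i + patch, j:j + patch] = 0.0
        mask[:, :, i:i + patch, j:j + patch] = 1.0
    return masked, mask

# A small convolutional network as the inpainter (placeholder architecture).
net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.randn(8, 1, 64, 64)            # stand-in for clear-sky data tiles
masked, mask = mask_random_patches(x)    # the "clouds" we made ourselves
loss = ((net(masked) - x) ** 2 * mask).mean()  # score only the hidden patches
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```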

Ron Green: Brilliant. Okay, I wasn't sure where you were going there. At KUNGFU.AI, we do a lot of augmentation and self-supervised learning, very frequently as a pre-training method, right, to get a model self-trained on a domain. But you're actually using this technique for that and also for inpainting areas where you might have cloud cover, so you can fill in missing data points automatically. That's brilliant. I love it. In what ways are you collaborating with other disciplines out there, like oceanographers?

Jeff Perry: SlideRule Earth is an AWS-hosted site developed by NASA, and it's like rapid-turnaround delivery of science data. It's an interactive tool: you can get on it, select a region of the Earth, and say, give me data from that region. And it's scalable, too. You don't have to use the web interface; you can use an API, like we typically do, and there's even a Python module for it, so it's fully programmable. Again, ICESat-2 is continuously publishing data, right? It gets updated constantly, and this is kind of a window into that. All the different products it produces, you can programmatically interact with SlideRule to get the kind of data you want in the area that you want. It's brilliant.
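For the programmatically inclined, here is a short sketch following the pattern in SlideRule's documentation. Treat the parameter names and values as assumptions to verify against the current docs; the polygon coordinates are made up.

```python
# Querying SlideRule Earth's ICESat-2 products from Python.
# Requires the `sliderule` client package (see the GitHub repos above).
from sliderule import icesat2

icesat2.init("slideruleearth.io")  # connect to the public SlideRule service

# A made-up region of interest: a small polygon off a coastline.
region = [
    {"lon": -80.30, "lat": 25.10},
    {"lon": -80.20, "lat": 25.10},
    {"lon": -80.20, "lat": 25.20},
    {"lon": -80.30, "lat": 25.20},
    {"lon": -80.30, "lat": 25.10},  # closed ring
]

parms = {
    "poly": region,
    "srt": icesat2.SRT_OCEAN,         # surface type: ocean
    "cnf": icesat2.CNF_SURFACE_HIGH,  # keep high-confidence photons
    "len": 40.0,                      # along-track segment length (m)
    "res": 20.0,                      # segment step (m)
}

# Elevations come back as a GeoDataFrame, computed on demand in AWS.
gdf = icesat2.atl06p(parms)
print(gdf.head())
```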

Ron Green: Oh, that sounds amazing. In what ways are the results of your research being leveraged by oceanographers or other disciplines?

Jeff Perry: People are using these data from ICESat-2, say, together with machine learning, to study coastal habitats, like seagrass growth or coverage, or coral reefs, and how they're changing. And again, the laser penetrates foliage, right? So they can see beneath the forest canopy and say, oh, there's a lot of fuel here that could potentially cause fires.

Ron Green: A lot of undergrowth.

Jeff Perry: Yeah. And then, back to the coastal waterways: some of the maps that people are using to navigate coastal regions might be decades old, right? They haven't been updated. So now this is opening up whole new opportunities for people to find, say, new routes through these waterways.

Ron Green: All right, so we like to wrap up with a fun question, which is if you could use artificial intelligence somehow in your daily life to make your life better, what would it be, Jeff?

Jeff Perry: So, yeah, this is a silly example, but it would probably improve my life a lot if someone could just tell me how to meal plan. Just tell me what to buy at the grocery store and what to cook that night. I have a real problem with that.

Ron Green: I love it. I love it.

Jeff Perry: My family would really appreciate that.

Ron Green: That's terrific. That's terrific. Well, this has been a fascinating conversation. I want to thank you so much for coming on.

Jeff Perry: Oh, thank you for having me.