About a year ago, we encountered our first Geiger-mode lidar files. Geiger-mode lidar is a very high-density collection technology that has been in military use for more than a decade but is now making its way into commercial and civilian mapping applications. The technique is cool and worth reading up on. For us, the datasets exhibited a new level of point density and noise and seemed like a probable fit for MrSID compression.
Along with its particular ancestry, Geiger-mode lidar came with a file format that was just as unfamiliar to us, the Binary Point File (BPF). As its name suggests, this is a binary format that was originally developed for speed and space improvements over an ASCII-based predecessor. It has had a couple of significant revisions to become more streamable, more compressible, and to carry more metadata. The file format definition is fairly short (around 40 pages), at least compared to some other military format definitions I can think of. Still, it is a bit more complex than the LAS definition, and it carries some extra weight because the format preserves compatibility with some of its earlier (and less well-defined) revisions. We were not excited about the prospect of rolling our own implementation.
Naturally, we looked for third-party implementations but found a surprising dearth. In fact, we only found a single API framework that supported the format. PDAL (Point Data Abstraction Library) is a rich open-source (BSD) toolkit for reading, writing and processing point cloud data, and it included a driver for BPF that was already in active use and even had two years of community maintenance behind it.
The straightforward integration would be to write a bridge class, translating PDAL’s read/write API and data structures to our internal ones. But the twist came as we dug into the details and realized we were facing a conflict of architectural assumptions. PDAL is able to construct and run full (and potentially very complex) lidar processing pipelines. These can be defined in a declarative JSON syntax, using building blocks from a wide array of available transform operations and file drivers. Building these pipelines and managing the flow and buffering of points are a big part of what PDAL is good at. However, our internal compression pipeline was already established and had its own way of pulling and processing points. In short, we both wanted to drive.
This could have been an insurmountable obstacle with a closed-source API. But since the source is open, we were able to work around the problem directly. Making a handful of private methods on pdal::BpfReader public (namely ready(), processOne() and done()) enabled us to simulate enough of PDAL’s point pump to use the BPF format driver as a point iterator within our own framework.
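To make the idea concrete, here is a minimal sketch of that kind of adapter. It is not our production code and it does not use PDAL itself: MockReader is a hypothetical stand-in whose ready(), processOne() and done() methods only borrow their names from the ones mentioned above, with simplified signatures that are assumptions. Point, PointIterator and countPoints are likewise invented for illustration. It assumes a C++17 compiler for std::optional.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical point record; real lidar points carry many more dimensions.
struct Point {
    double x, y, z;
};

// Hypothetical stand-in for a format driver whose stepping methods are public.
// This is NOT pdal::BpfReader; the method names are borrowed, the signatures
// are simplified assumptions.
class MockReader {
public:
    explicit MockReader(std::vector<Point> pts) : points_(std::move(pts)) {}

    // Prepare for reading from the first point.
    void ready() { next_ = 0; }

    // Produce one point; returns false when the source is exhausted.
    bool processOne(Point& out) {
        if (next_ >= points_.size()) return false;
        out = points_[next_++];
        return true;
    }

    // Release the source; further processOne() calls return false.
    void done() { next_ = points_.size(); }

private:
    std::vector<Point> points_;
    std::size_t next_ = 0;
};

// Pull-style adapter: the caller decides when each point is read, so an
// existing pipeline can stay in charge of the flow of points.
class PointIterator {
public:
    explicit PointIterator(MockReader& reader) : reader_(reader) {
        reader_.ready();
    }
    ~PointIterator() { finish(); }

    PointIterator(const PointIterator&) = delete;
    PointIterator& operator=(const PointIterator&) = delete;

    // Returns the next point, or std::nullopt once the reader is exhausted.
    std::optional<Point> next() {
        if (finished_) return std::nullopt;
        Point p{};
        if (reader_.processOne(p)) return p;
        finish();
        return std::nullopt;
    }

private:
    // Calls done() exactly once, whether on exhaustion or destruction.
    void finish() {
        if (!finished_) {
            reader_.done();
            finished_ = true;
        }
    }

    MockReader& reader_;
    bool finished_ = false;
};

// Example consumer: pulls points one at a time and counts them.
std::size_t countPoints(MockReader& reader) {
    PointIterator it(reader);
    std::size_t n = 0;
    while (it.next()) ++n;
    return n;
}
```

A consumer constructs a PointIterator over the reader and calls next() until it returns an empty optional, as countPoints does. The design choice worth noting is that the adapter owns the ready()/done() bracketing, so the calling pipeline never has to know about the reader's lifecycle.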
In the end, this experiment in compressing Geiger-mode lidar was fruitful, and the BPF implementation described above evolved into released functionality as Geiger-mode support in GeoExpress. PDAL is a uniquely enabling, foundational piece of that support.