Monday, February 9, 2009

Benefits of 64-bit Architecture in Geospatial Imaging

I asked Hammad Khan, an ERDAS IMAGINE Software Engineer, to write a discussion of 32- and 64-bit processing in ERDAS IMAGINE. Thank you, Hammad.
--

Recently, we have noticed a lot of discussion in our community centered on the benefits of 64-bit computing. Having discussed 64-bit processor architecture with customers and product managers here at ERDAS, I felt it may help to share some thoughts on how 64-bit computing impacts our work in the geospatial imaging world.

Compared with their more common 32-bit counterparts, 64-bit processors have two major advantages: the ability to address more memory, and an increased number of General Purpose Registers (GPRs). Memory is used by programs to store the data needed for calculations. Registers, by contrast, hold specific values (a single number, for example) for very quick access during calculations; data is moved from memory into registers before calculations are performed. Another advantage of 64-bit processors worth mentioning is the introduction of additional SSE/SSE2 registers, which are used for highly optimized, repetitive calculations that apply the same operation to many values at once.
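To make that concrete, here is a minimal sketch (illustrative only, not ERDAS IMAGINE code) of the kind of per-pixel inner loop where registers come into play. When targeting x86-64, the compiler has sixteen general-purpose and sixteen SSE registers available rather than eight of each, and SSE2 is always present, so a loop like this can be vectorized with fewer spills of intermediate values back to memory.

    // Illustrative sketch: apply a linear radiometric stretch to a pixel buffer.
    // The body is one multiply-add per pixel, exactly the kind of work the
    // compiler maps onto SSE/SSE2 registers on a 64-bit target.
    #include <cstddef>

    void linear_stretch(const float* in, float* out, std::size_t count,
                        float gain, float offset)
    {
        for (std::size_t i = 0; i < count; ++i)
            out[i] = gain * in[i] + offset;
    }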

What does this mean for software designed for geospatial analysis? By nature, geospatial analysis algorithms must run on imagery, and those datasets are large and getting larger. As such, geospatial analysis algorithms rarely hold an entire dataset in memory; they are almost always written using techniques such as tiled data access so that the program can run to completion regardless of image size.

For example, an ADS-40 sensor will frequently generate data that is hundreds of gigabytes in size. Further, we are beginning to see multiple-terabyte images pop up, such as the data used by the Oregon Imagery Explorer project at Oregon State University. Processing data of this magnitude will require tiled access or another clever data-access technique; regardless of how many address bits are available, the entire dataset is unlikely to fit in memory.
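Some rough arithmetic makes the point (the 300 GB figure below is simply a hypothetical round number for a large ADS-40 block):

    2^32 bytes =  4 GB    (the entire address space of a 32-bit process)
    2^64 bytes = 16 EB    (the theoretical address space of a 64-bit process)
    300 GB / 4 GB = 75    (one block can be tens of times larger than everything
                           a 32-bit process could even address)

Even with 64-bit addressing, the physical memory in a typical workstation remains far smaller than a dataset of this size, so tiled access is still required.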

To illustrate, most traditional classification algorithms may be performed by loading a tile of the image into memory, processing the tile, writing the output, and then loading the next tile. This approach may be (and, in ERDAS IMAGINE, is) optimized by loading the next tile to be worked on while the current tile is being processed. A notable exception to this tile-based data-access approach is terrain generation, where building a TIN cannot be neatly broken up: features in one tile may, and frequently do, influence features in adjacent tiles. This limitation can be circumvented by using a spatial index (say, a quad-tree) to intelligently access the appropriate tiles; regardless, loading the entire image into memory is not a recommended option.
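Here is a minimal sketch of that read-ahead optimization, written in modern C++ for brevity; the tile I/O and classification routines are hypothetical stand-ins, not the ERDAS IMAGINE API. While tile i is being classified and written, the read of tile i+1 is already under way on another thread.

    #include <future>
    #include <vector>

    typedef std::vector<float> Tile;

    // Stand-in routines so the sketch compiles; real code would read from and
    // write to an image file and run an actual classifier.
    Tile read_tile(int index)        { return Tile(512 * 512, float(index)); }
    Tile classify(const Tile& tile)  { return tile; }
    void write_tile(int /*index*/, const Tile& /*result*/) {}

    void classify_image(int tile_count)
    {
        if (tile_count <= 0)
            return;
        // Begin reading the first tile before the loop starts.
        std::future<Tile> next = std::async(std::launch::async, read_tile, 0);
        for (int i = 0; i < tile_count; ++i) {
            Tile current = next.get();                 // wait for tile i
            if (i + 1 < tile_count)                    // start reading tile i+1 ...
                next = std::async(std::launch::async, read_tile, i + 1);
            write_tile(i, classify(current));          // ... while tile i is processed
        }
    }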

Since computation has been our focus so far, does this imply that geospatial analysis will benefit from the computational speedup offered by the additional registers available on 64-bit processors? It is true that having more data immediately at hand will speed up calculations. In practice, however, the processor is frequently not the bottleneck in executing these algorithms; disk access speeds, and data transport speeds over the associated buses, are. Note that the limiting speed here is primarily that of reading data from disk into memory, not of moving data from memory into processor registers.
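A rough, hypothetical comparison shows why (these are ballpark throughput figures, not measurements from ERDAS IMAGINE): reading a 300 GB dataset from a single disk at around 80 MB/s costs roughly an hour of pure I/O, while streaming the same 300 GB through main memory at a few GB/s takes only a couple of minutes. No number of additional registers changes that ratio; the gains come from hiding the I/O behind the computation, or from faster storage.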

A number of techniques are available to speed up algorithms in the 32-bit world. Examples include pipelining data reads and processing, where the next tile is read while the current one is being processed; multi-threading, where the algorithm itself is split so that parts of it run in parallel; and using multiple processes wherever possible (say, simultaneously processing multiple independent frames of an RPF dataset) to take advantage of dual- and quad-core architectures. These represent just some of the many tools available to optimize geospatial algorithms. Moving to a 64-bit architecture is another available tool.
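As a concrete illustration of that last idea, here is a minimal sketch, again in modern C++, using a pool of worker threads rather than separate processes (process_frame is a hypothetical stand-in for the real per-frame work): a small pool of workers, sized to the number of cores, pulls frame indices from a shared counter.

    #include <atomic>
    #include <thread>
    #include <vector>

    // Stand-in for the real per-frame work; each frame is independent.
    void process_frame(int /*frame*/) {}

    void process_frames(int frame_count)
    {
        std::atomic<int> next_frame(0);
        unsigned workers = std::thread::hardware_concurrency();
        if (workers == 0)
            workers = 2;                     // fall back if the core count is unknown

        std::vector<std::thread> pool;
        for (unsigned w = 0; w < workers; ++w)
            pool.emplace_back([&]() {
                // Each worker claims the next unprocessed frame until none remain.
                for (int f = next_frame++; f < frame_count; f = next_frame++)
                    process_frame(f);
            });
        for (std::thread& t : pool)
            t.join();
    }

Because the frames are independent, no locking is needed beyond the shared frame counter, and the same idea scales from dual- and quad-core machines to however many cores are available.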

This is not to say that moving to a 64-bit architecture will have no benefit whatsoever. Optimization requires a holistic view of the algorithm, and the appropriate tools must then be selected to address the bottlenecks specific to the process in question. As such, the benefits of a 64-bit architecture must be kept in perspective.

There is no doubt that in the near future we will move to a 64-bit processing environment, and personally, I look forward to the day when the 32-bit architecture is the primary limitation on our processes. Today, however, we find that our bottlenecks lie elsewhere. As such, certainly within our problem space, other optimization techniques offer more bang for the proverbial buck.

We continue to work to make ERDAS IMAGINE the strongest package available for working with geospatial imagery. We aggressively remove bottlenecks as we find them; in addition to the significant performance improvements over the previous release, scalability across computing resources will be especially notable in the upcoming release. In fact, of all the releases I have been involved with, I am most excited about this one. We hope to demo some of the new features at ASPRS in March, and we hope you find them as exciting as we do.