Friday, May 23, 2008

Native and fast lossless compression within ERDAS IMAGINE

While ERDAS IMAGINE supports lossless compression through several different means (MrSID, JPEG2000, Packbits, etc.), this discussion focuses on IMAGINE's native run length encoding (RLE) compression. Typical RLE algorithms compress adjacent duplicate pixel values and are a common form of lossless compression. The implementation of RLE in ERDAS IMAGINE uses a dynamic range run length compression (DR-RLE) rather than the typical RLE. DR-RLE not only compresses adjacent duplicate pixel values but also compresses unused pixel values within the data’s defined data range. This has been the case since the earliest days of .img for both athematic (continuous) and thematic data.
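To make the "adjacent duplicate pixel values" part concrete, here is a minimal sketch of plain run length encoding in Python. It is an illustration of the general technique only, not the IMAGINE implementation; the function names and the (count, value) pair layout are assumptions made for the example.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of equal adjacent values into a (count, value) pair."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original pixel sequence."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out


row = [0, 0, 0, 0, 7, 7, 3, 3, 3, 0]
runs = rle_encode(row)           # ten pixels become four runs
assert rle_decode(runs) == row   # lossless: decoding restores every pixel
```

Long runs of a single value (background, water, a large thematic class) shrink the most; a row where every pixel differs from its neighbor gains nothing from this step alone.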

What do we mean when we say it compresses the dynamic range of the data? An example provides a good description. Many modern datasets are collected as 11-bit or 12-bit data, yet we must store them as 16-bit data. If we have an 11-bit dataset, the pixel values from 2,048 (the unsigned 11-bit maximum plus one) through 65,535 (the unsigned 16-bit maximum) are never used by the dataset, yet the room reserved for them still takes up disk space. When we store these 11-bit data in the .img format using DR-RLE compression, the unused pixel values are compressed to a small fraction of their initial 16-bit size.
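The arithmetic behind the dynamic range idea can be sketched as follows. This assumes a simple scheme in which values are stored as offsets from the block minimum, using only as many bits as the range requires; the exact bit widths and layout IMAGINE uses are not described in this post, so treat the functions below as hypothetical.

```python
def bits_needed(values):
    """Return (minimum, bits) where bits can hold (value - minimum) for all values."""
    lo, hi = min(values), max(values)
    return lo, max(1, (hi - lo).bit_length())


def packed_size_bytes(values, container_bits=16):
    """Compare the raw container size with a range-packed size, in bytes."""
    _, nbits = bits_needed(values)
    raw = len(values) * container_bits // 8
    packed = (len(values) * nbits + 7) // 8   # round up to whole bytes
    return raw, packed


# An 11-bit sensor only ever produces values 0..2047.
print(bits_needed([0, 2047]))     # 11 bits are enough, 5 of the 16 are never used
# Elevations between 300 and 555 span only 256 distinct values.
print(bits_needed([300, 555]))    # 8 bits once the minimum (300) is stored separately
```

Under these assumptions the saving comes from two places: the high bits that the data never touches, and any offset between zero and the smallest value actually present.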

Elevation data can also benefit from the .img lossless compression because few DEMs need values anywhere near the 16-bit maximum value of 65,535. Thematic data can benefit as well. Thematic datasets often have 100 or fewer categories (classes), yet they must be stored as 8-bit data. When encoded using DR-RLE, the unused values are compressed.

One gotcha, discovered by Donn Rodekohr at Auburn Univ.: floating-point data stored in .img DR-RLE should not be processed with intensive math functions in the modeler. Using these floating-point data in complicated models has been shown to degrade processing speed somewhat. We will address this issue in future versions of IMAGINE. So, keep your DR-RLE data storage limited to integer data for the time being.

Below are a few size comparisons. From my previous GeoTIFF post (http://field-guide.blogspot.com/2008/05/what-is-wrong-with-my-geotiff.html) you will recall that Packbits is TIFF’s RLE compression. Packbits compression is not a DR-RLE compression.

DR-RLE comparison using a 512 x 512 single band .img file:
271KB 8-bit uncompressed (with pyramid layers)
271KB 8-bit RLE compressed (with pyramid layers)
527KB rescaled to 16-bit uncompressed (with pyramid layers)
272KB rescaled to 16-bit RLE compressed (with pyramid layers)

Packbits comparison using a 512 x 512 single band .tif file:
258KB 8-bit uncompressed (without pyramid layers)
258KB 8-bit packbits compressed (without pyramid layers)
514KB rescaled to 16-bit uncompressed (without pyramid layers)
514KB rescaled to 16-bit packbits compressed (without pyramid layers)
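
As a reference point for the figures above, the raw pixel payload of a 512 x 512 single band image can be worked out directly. The helper below is a small illustrative calculation (the function name is made up for this example) and ignores headers, metadata and pyramid layers.

```python
def raw_size_kb(rows, cols, bits_per_pixel, bands=1):
    """Uncompressed pixel payload in KB (1 KB = 1024 bytes), excluding any overhead."""
    return rows * cols * bands * bits_per_pixel / 8 / 1024


print(raw_size_kb(512, 512, 8))    # 256.0
print(raw_size_kb(512, 512, 16))   # 512.0
```

Against those raw sizes, the 258KB and 514KB TIFF figures sit about 2KB over the pixel payload in both the uncompressed and the Packbits cases, while the 16-bit RLE compressed .img (272KB) is roughly half the size of its uncompressed counterpart (527KB).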

But what about the speed? When ERDAS implemented DR-RLE back in 1993, we focused on making it efficient. Because of that effort, and because DR-RLE is so simple to compress and decompress, we see some speed gains from DR-RLE compressed .img data over uncompressed .img data. This remains the case on today’s processors. An efficient access mechanism, combined with a small data footprint stored in an ultra-simple compression and decompression format, helps processing speeds to disk or to screen.

When an .img file is >2.1GB, IMAGINE rolls the data over into an .ige file, and the .img file stores metadata, spatial indices and so forth. The DR-RLE compression is not supported within .ige files. The .ige was originally designed for simple and fast access to very large uncompressed data files, and that design does not allow for any file compression whatsoever. (See: http://field-guide.blogspot.com/2010/04/when-does-img-image-roll-over-to-ige.html)
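A rough way to anticipate whether a dataset is in .ige territory is to estimate its uncompressed size before writing it. The sketch below assumes the "2.1GB" figure corresponds to the signed 32-bit maximum (2^31 - 1 bytes) and ignores pyramid layers and metadata; the precise rollover rule IMAGINE applies is not stated here, so the threshold and function names are assumptions.

```python
# Assumption: "2.1GB" refers to the signed 32-bit limit of 2,147,483,647 bytes.
LIMIT_BYTES = 2**31 - 1


def uncompressed_bytes(rows, cols, bands, bits_per_pixel):
    """Pixel payload only; pyramid layers and metadata are not counted."""
    return rows * cols * bands * bits_per_pixel // 8


def likely_exceeds_limit(rows, cols, bands, bits_per_pixel):
    """True when the uncompressed payload alone is larger than the assumed limit."""
    return uncompressed_bytes(rows, cols, bands, bits_per_pixel) > LIMIT_BYTES


# A 20000 x 20000, 4-band, 16-bit image is 3.2 billion bytes of pixels.
print(likely_exceeds_limit(20000, 20000, 4, 16))   # True
# A 512 x 512 single band 8-bit image is nowhere near the limit.
print(likely_exceeds_limit(512, 512, 1, 8))        # False
```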

For files >2.1GB which can stand a little value loss, we suggest our own ECW format or LizardTech’s MrSID format. We have added an ECW read capability to IMAGINE Essentials 9.2 and an encoding capability to IMAGINE Professional 9.2. While IMAGINE supports MrSID Generation 2 and Generation 3 (MG2 & MG3 respectively), we must use MG3 for MrSID files >2.1GB and for lossless compression. ECW does not support lossless compression.

Are there any side effects (good or bad) when pulling these .img DR-RLE data into ArcGIS? Not that I have seen in 15 years. In fact, ESRI and ERDAS both use the same DR-RLE algorithm: ESRI uses it in GRID, ERDAS in .img. ERDAS also provides (writes) many of the raster data objects (RDO) in ESRI products, and .img support is one of them (RLE being part of .img support).

Any positive side effects in ArcGIS? Where ESRI uses the ERDAS / ESRI RDO within their product lines, the user will see the same performance improvements in ArcGIS. In other words, other than the issue with floating-point data, rock-n-roll in ArcGIS.

How do you take advantage of the .img dynamic range run length encoding in ERDAS IMAGINE?
Set “RLE” as your “Default Compression” in the "IMAGINE Image Files (Native)" preference category.
Set “Default” as your “Data Compression” in the “Spatial Modeler” preference category. This will cover Import, Save As, Subset (which is currently a modeler function), Mosaic Tool, MosaicPro, Spatial Modeler and the rest of IMAGINE, for files less than 2.1GB. We are planning to make RLE 'on' in the preferences in 9.3, if I hear a demand growing from the user community. Comments?

IMAGINE now supports encoding and decoding DR-RLE, JPEG, LZW, TIFF Packbits, MrSID, JPEG2000, ECW, et al. What other compressions do you folks wish to see within the ERDAS IMAGINE suite?

20 comments:

saurabh said...

Hi,
I am working on QuickBird MSS and PAN data. My area is 4600 sq. km and I need to generate a PANSHARPENED image. I am facing difficulties, since the file size is very heavy.
<1> So, what should the minimum hardware requirement be?
<2> I am facing a unique problem. When I subset the tiles I have received into workable tiles and run the PANSHARPEN process, some output tile images get a unique band colour, which makes it difficult to produce an FCC.
The green color of an FCC gets turned into a blue shade.
Lastly, <3> please suggest a data compression format and its source so that I can use the image in AutoCAD for image interpretation. Note: my requirement is that the compressed image be displayed in an FCC combination, so that image interpretation is plausible.
Thank You

Paul said...

Often satellite data vendors use odd blocking sizes, 1 x 2800 rather than 512 x 512. These odd blocking factors allow them to write fast, but are horrible for reading and processing.

Also, you will need pyramid layers (rrds, subsampled images) so that when you zoom out the CPU will not have to calculate these every time. Calculate these once and store them on disk.

If you use ECW or SID, these contain rrds inside the data. IMG can as well, but the rrds are often stored alongside the file. TIFF can do this as well, but most products do not write them (as many do not read them).

I have been in contact with AutoDesk about supporting images created in ERDAS IMAGINE better. They are working on it. If you drop them a line (and have a few friends do the same) we may be able to speed up the work.

Amy said...

Hi - In a search for info on *.ige files I came across your blog. I'm trying to pull in the new Alaska NLCD data which comes in an *.img/*.ige combo (~8 gb). I can pull the img file into ArcMap (9.3), but can't do anything with it because (as far as I can tell) Arc can't read the ige file. I've worked with img files before with no problems in Arc, but this ige file is just not working. Ideas? Pull it back into IMAGINE and try a different format? My next best idea is to call the NLCD folks and whine.

Thanks!!

Paul said...

I do not like the .img / .ige solution. It was chosen to allow the customers to break the 32-bit limits so that the .img format can go far beyond 2.1GB. In effect, the .img file acts as an index and holds the image info metadata. The .ige file contains the actual data. Yet, customers try to use the .ige without the .img file. This will not work. There are some bumps in the road with ESRI moving from the ERDAS/ESRI developed raster data objects (RDO) to GDAL. The GDAL implementation of .img is just not as robust as the tools ERDAS writes.

Cutberto Paredes said...

Hi Paul,

I'm trying to process a bunch of small IMGs using a model, but it seems that it cannot read ESRI RLE IMGs (created with ArcMap 9.2). I tried changing the settings as you suggest in this post but it didn't work. Any suggestion? Thanks! Cutberto.

Paul said...

Cutberto, if the images can be displayed in the Viewer then the issue may be something else. In 9.2 ESRI started moving from the ERDAS Raster Data Objects (RDO) to GDAL. GDAL is not as robust as RDO, and I am not sure why the move, as ERDAS has given ESRI RDO and they can now use it for free....

Test to see if the Viewer can display the images and please report back.

Cutberto Paredes said...

Hi again. The images can be displayed in the viewer, but first I had to recalculate statistics and change the bin function from default to something else (I chose linear), otherwise the viewer would not display them. Thanks!

Paul said...

I select Direct Binning. Default uses the binning existing in the file already. Well, it seems ESRI is not writing the binning correctly. Let me ask a few questions.
Is the projection OK?
Do you have an original file from ESRI? Did it come with a .aux file? If yes, rename the .aux file and open the image in the Viewer or GLT. Does it display correctly? I know ESRI is not creating .aux files correctly, and I want to see if that is the issue. Actually, .img files do not need .aux files. .aux files are for non-.img files.

Paul said...

Also, you might want to go to ERDAS Communities for more expansive help from all the ERDAS users.

http://community.erdas.com/forums/

Cutberto Paredes said...

The files were created by ArcMap with a metadata (XML) file. I tried renaming the file; the viewer did display the file since I had previously changed the binning function, but the model is still not reading it correctly. The projection is read properly by ImageInfo (UTM WGS84, 14N).

The model I’m trying to run is to compute the global mean and save it to a text (tbl) file. When I run the model I get -2.379481275965044e+036 as output (I get the correct mean with other files). I opened the model with a text editor and in the raster declaration, it is declared as uncompressed “COMPRESSION UNCOMPRESSED;”. Can I change this value to declare it as RLE?

I posted this in the forum today. I am still waiting to see if someone else has had this problem before.
Many thanks!

Paul said...

The problem is not likely in the RLE Compression. I believe the binning is a major part of the problem. Use Image Command Tool and re-calc stats using direct binning. Do not do anything else; then try that image in the model.

Cutberto Paredes said...

Well, it seems that the problem is the NoData value used by ArcMap (-340282346638528860000000000000000000000.000000). I changed this manually to 0 and the model computed the global mean correctly. I also changed the binning function, but without changing the NoData value it output a wrong mean.

There is no way of changing the NoData value in batch mode, right? I have a few thousands of files...

Cutberto Paredes said...

Just to make clear, the NoData value is not present in the file, all pixels have valid values. The problem is the value set as NoData.

Paul said...

Try to ignore the value in stats calculation. There is no way to batch change no data.

Have you spoken with ESRI about this problem?

Cutberto Paredes said...

I tried ignoring the value in stats calculation, and also tried to declare the value as background in the script generated from the model (SET DEFAULT STATISTICS...) and it didn't work.

It seems that the only way is to read the file with gdal and write it again setting the mvFlag = 0. I did this in R (rgdal) and it worked.

I will contact ESRI to let them know about this. Thanks a lot for your help.

Paul said...

Cutberto, the ERDAS IMAGINE 2010 Image Command Tool will have an option to clear, set, or reset the 'NoData Values' in an image in a single, batch, or parallel batch mode. ERDAS IMAGINE 2010 will be released before Christmas 2009.

Cutberto Paredes said...

That's good news! It's even more interesting to read the word 'parallel'

I hope my university (Leicester) will update soon to it.

Thanks!

ashokvardhan said...

Hi,

Could anyone explain to me how to perform Change Vector Analysis in ERDAS 9.1? I have IRS data for the years 1998 & 2004. I need to perform change detection on these data, using change vector analysis.

This would help me complete my M.Tech project. I have only a week to close my project.

Jarlath O'Neil-Dunne said...

I would really like to see a 64-bit version of img that would eliminate the .ige file.

Tarun Punetha said...

Hello, I want to do change vector analysis in ERDAS IMAGINE for a forestry project. Can anyone tell me the basic steps of the CVA technique?