Sunday, April 25, 2010

Single core vs. multi-core image processing (vol 2)

As discussed in Single core vs. multi-core image processing (vol 1), I want to present the times for ECW and JPEG2000 compression in ERDAS IMAGINE 2010 v10.1. I tested using the Export to ECW and Export to JPEG2000 commands, running in batch mode.

As you may recall, with the release of ERDAS IMAGINE 2010 this past December 2009, each IMAGINE Advantage license provides the customer the capability to batch process up to four simultaneous processes. If you have more IMAGINE Advantage licenses brokered using a floating license manager, each floating IMAGINE Advantage license can add four more simultaneous processes. In this experiment, I tested with one IMAGINE Advantage for 1 – 4 processes and another floating IMAGINE Advantage license for 5 – 8 processes.

ERDAS IMAGINE 2010 v10.1 uses the latest version of the ECW JPEG2000 Codec SDK, v4.1. This new version of the SDK has focused on JPEG2000 improvements and will be made available to the market soon.

As noted in vol 1, thanks go to Dell, Inc. for providing the test system and to SAM Inc. for inspiring the software performance upgrades. The test system is configured as follows:

Dell Precision T7500
Processor: Dual Quad Core, 2.13GHz with 4MB Cache
System RAM: 4GB 1066MHz
Internal Controller: C4 SATA Non-RAID
Internal Disks: Four 1.5TB 7200 RPM SATA with 3.0Gbps, 16MB Data Burst Cache

External Storage: 15 Bay SAS/SATA Array
Array Controller: PERC 6/E SAS RAID, 256MB Memory
Array Disks: Ten 750GB 7200 RPM 3.0Gbps (note above, array can hold 15 disks)
Configuration: RAID 0

Operating System: Microsoft Windows Server 2008 x64, R2

I ran tests using the 2095 uncompressed strip GeoTIFFs of 5000 x 5000 x 3-bands x 8-bit data (each file is 73,282KB). The target compression ratio for both ECW and JPEG2000 was 15:1.
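As a quick sanity check on the data volume, the raw pixel payload can be worked out from the image dimensions. This is only a minimal sketch: it counts pixel bytes alone (no TIFF header or tag overhead), and the variable names are my own.

```python
# Raw pixel payload for the test data set (ignores TIFF header/tag overhead).
WIDTH, HEIGHT, BANDS = 5000, 5000, 3
BYTES_PER_SAMPLE = 1          # 8-bit data = 1 byte per sample
FILE_COUNT = 2095
TARGET_RATIO = 15             # 15:1 target compression ratio

bytes_per_file = WIDTH * HEIGHT * BANDS * BYTES_PER_SAMPLE
total_bytes = bytes_per_file * FILE_COUNT
target_bytes_per_file = bytes_per_file / TARGET_RATIO

print(f"Per file:  {bytes_per_file:,} bytes ({bytes_per_file / 1024:,.0f} KB of pixels)")
print(f"All files: {total_bytes / 1e9:.1f} GB (decimal)")
print(f"At 15:1:   {target_bytes_per_file / 1e6:.1f} MB per compressed file")
```

The point of the arithmetic is scale: the batch reads on the order of 150GB of input, so disk bandwidth matters as much as CPU.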

When processing one image at a time, there was a lot of unused processing capacity. For example, when compressing one ECW file at a time 700MB of total RAM was used; 25% of all eight cores were used; and 20% of the disk bandwidth was used.

By using this extra capacity with multiple simultaneous compressions we more effectively use our total available computing power. When compressing eight ECW files simultaneously 1.3GB of total RAM was used; 75% of all eight cores were used; and 90% of the disk bandwidth was used.

While the ERDAS IMAGINE batch engine does not limit its multi-core processing to the number of cores on the system, I stopped at eight because the disk capacity was filling up. I am convinced I could have pushed the number of simultaneous processes to 10 (or more) if I had added five more disks to fill the array container (but these additional five disks were not available).

ERDAS ER Mapper users will have slightly faster times when compressing one image. But when running a batch of many images, and especially when running simultaneous processes, ERDAS IMAGINE will be significantly faster. In the future, we will incorporate into ERDAS IMAGINE what makes ERDAS ER Mapper faster when processing one image. Below are the times:





# Processes    ECW Time    JPEG2000 Time
1              2:51:15     5:46:44
2              1:29:27     2:44:44
3              0:43:48     1:58:24
4              0:41:19     1:47:03
5              0:41:37     1:06:55
6              0:35:49     0:55:03
7              0:33:01     0:45:51
8              0:31:53     0:43:14


Looking at these numbers, we can see that a new day is dawning for the remote sensing and GIS user community. Think for a moment: one ECW image was being completed every 0.9 seconds when running eight processes, and one JPEG2000 image was being completed every 1.2 seconds. This means you can complete a large compression task in less time than it takes you to go eat lunch.
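The per-image figures can be rechecked from the elapsed times in the table. This is a minimal sketch, not part of the test procedure; the helper names are my own, and "speedup" here simply means the one-process time divided by the N-process time.

```python
# Elapsed times (H:MM:SS) from the table, keyed by number of processes.
IMAGE_COUNT = 2095
ECW = {1: "2:51:15", 2: "1:29:27", 3: "0:43:48", 4: "0:41:19",
       5: "0:41:37", 6: "0:35:49", 7: "0:33:01", 8: "0:31:53"}
JP2 = {1: "5:46:44", 2: "2:44:44", 3: "1:58:24", 4: "1:47:03",
       5: "1:06:55", 6: "0:55:03", 7: "0:45:51", 8: "0:43:14"}

def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def summarize(times: dict) -> list:
    """Return (processes, seconds per image, speedup vs. one process) rows."""
    base = to_seconds(times[1])
    rows = []
    for n, hms in sorted(times.items()):
        secs = to_seconds(hms)
        rows.append((n, secs / IMAGE_COUNT, base / secs))
    return rows

for name, times in (("ECW", ECW), ("JPEG2000", JP2)):
    for n, per_image, speedup in summarize(times):
        print(f"{name:9s} {n} proc: {per_image:5.2f} s/image, {speedup:4.1f}x vs 1 proc")
```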

If you need a reason for your boss to spend money on an updated system, in the long term a fast processing configuration like the one I have tested is likely the least expensive productivity enhancement they can buy. All they need to do is determine how fast solutions are needed. Is compressing 2095 images over lunch fast enough?

Volume 1 Post Reference: http://field-guide.blogspot.com/2010/04/single-core-versus-multi-core-image.html

Wednesday, April 21, 2010

Single core vs multi-core image processing (vol 1)

For many years the remote sensing and photogrammetry community has demanded more performance in processing of image data. (I still remember how excited I was when upgrading from an IBM XT with a 10MB hard disk to a Compaq 386 with a 300MB hard disk.) While some functions continue to be limited by raw processing performance, most applications can receive massive benefit from a higher performance disk configuration. The next few posts will discuss this topic.

Thanks to Dell, Inc. for the test system they provided. This is a cost-effective, yet solid performance server configuration. The test system is as follows:

Dell Precision T7500
Processor: Dual Quad Core, 2.13GHz with 4MB Cache
System RAM: 4GB 1066MHz
Internal Controller: C4 SATA Non-RAID
Internal Disks: Four 1.5TB 7200 RPM SATA with 3.0Gbps, 16MB Data Burst Cache

External Storage: 15 Bay SAS/SATA Array
Array Controller: PERC 6/E SAS RAID, 256MB Memory
Array Disks: Ten 750GB 7200 RPM 3.0Gbps
Configuration: RAID 0

Operating System: Microsoft Windows Server 2008 x64, R2

Since some people will read this blog months after these posts, I will not post prices. Notwithstanding my reluctance to mention prices, I will say that the performance boost will pay for itself. I suggest you look for the price of this system on Dell’s website.

This effort actually began a few years ago over the Christmas Holidays when Chuck Patterson (SAM Inc. IT), Dell, Haiyan Qu (ERDAS Customer Support) and I got together to understand the performance of Dell’s EqualLogic Storage Area Network (SAN) device running ERDAS IMAGINE processes. That discussion was in the middle of ERDAS’ development of multi-core batch processing (former working name BatchX). With the completion of the multi-core batch processing effort at ERDAS, Dell and I restarted the testing with standard Disk Arrays. We will expand to SANs over the next few months.

As a benchmark of the speed of the system, I ran tests using the standard copy command in a .bat command file to determine the read-write speed of the configuration. Here are the speeds recorded on the system using 2095 uncompressed GeoTIFFs of 5000 x 5000 x 3-bands x 8-bit data (each file is 73,282KB):

0:14:26 Disk Array to Disk Array
0:23:01 Internal Disk to Disk Array
0:27:20 Disk Array to Internal Disk
0:40:01 Internal Disk, different disks
0:53:56 Internal Disk, same disks
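
To compare these copy times more directly, they can be converted to an effective throughput. This is a rough sketch under one assumption of mine: each file is counted as 75,000,000 bytes of pixel data (5000 x 5000 x 3 bands x 1 byte), ignoring header overhead, so the real figures are slightly higher.

```python
# Effective copy throughput for the timings listed above.
# Assumption: 75,000,000 bytes of pixel data per file, header overhead ignored.
FILE_COUNT = 2095
BYTES_PER_FILE = 5000 * 5000 * 3

COPY_TIMES = {
    "Disk Array to Disk Array": "0:14:26",
    "Internal Disk to Disk Array": "0:23:01",
    "Disk Array to Internal Disk": "0:27:20",
    "Internal Disk, different disks": "0:40:01",
    "Internal Disk, same disks": "0:53:56",
}

def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def throughput_mb_per_s(hms: str) -> float:
    """Total bytes copied divided by elapsed time, in decimal MB/s."""
    return FILE_COUNT * BYTES_PER_FILE / to_seconds(hms) / 1e6

for label, hms in COPY_TIMES.items():
    print(f"{label:32s} {throughput_mb_per_s(hms):6.1f} MB/s")
```

Each figure covers a read and a write happening together, so it is a combined copy rate rather than a pure read or write speed.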

As expected, the RAID 0 Disk Array really boosts performance. Note that RAID 0 is not redundant, and therefore long-term data storage on this configuration is not a good idea. But processing data (especially writing) on a RAID 0 gives you a powerful performance boost.

In the next few posts, I will present file creation speeds of JPEG2000 and ECW in ERDAS IMAGINE 2010.1’s implementation of the ECW JPEG2000 Codec SDK v4.1 using this configuration. I hope you will find these posts useful.


Volume 2 post: http://field-guide.blogspot.com/2010/04/single-core-vs-multi-core-image.html

Friday, April 16, 2010

When does a .img image roll-over to a .ige image?

The HFA .img format was designed by ERDAS in the early 1990’s. At that time, Microsoft’s operating system file size limit on disk of 2,147,483,648 bytes (2.1GB) was considered huge. Today, 2.1GB is not considered huge.

In 1999, before Microsoft upgraded their OS to support a true 32-bit file size on disk (2^32 - 1 bytes, or about 4.3GB), ERDAS released a work-around (tricking the OS) using the .ige spill-over image files (an uncompressible 64-bit offset data file). ERDAS has plans (but no timetable at present) to move to a 64-bit offset .img file, eliminating the need for the .ige files (and allowing DR-RLE compression of .img files >2.1GB).

Until a 64-bit offset .img file with DR-RLE compression is available, the mapping community must deal with the question, "When will this file spill over to a .ige file?" To help the ERDAS IMAGINE community with this challenge, I want to take a moment to explain when spill-over to the .ige file occurs. This will allow you to plan for the transition from a .img to a .ige file when creating new files.

One major thing to understand: when determining whether an image needs a .ige file, ERDAS counts pixels, not bytes. This is done because, when using DR-RLE compression, ERDAS cannot know the size of the final compressed image before writing it out. Therefore, counting pixels rather than bytes is a safe method.

Below are helpful calculations for understanding the transition zone…

Note, however, that these calculations do not account for all the elements contributing to the final file size. The number of data blocks (determined by the block size) and metadata such as stats, histogram, projection, and attributes (thematic data) will also play into the final file size of a .img file. Therefore, you are not likely to store the full pixel numbers noted below before the rollover.

The formula for calculating the total number of pixels is:
Total bytes allowed / bytes per pixel = total # pixels
For 1-bit data: 2^31 / 0.125 = 17,179,869,184 pixels (17.180 gigapixels)
For 2-bit data: 2^31 / 0.25 = 8,589,934,592 pixels (8.590 gigapixels)
For 4-bit data: 2^31 / 0.5 = 4,294,967,296 pixels (4.295 gigapixels)
For 8-bit data: 2^31 / 1 = 2,147,483,648 pixels (2.147 gigapixels)
For 16-bit data: 2^31 / 2 = 1,073,741,824 pixels (1.074 gigapixels)

Take the square root to get an approximate square array size for the 1-band panchromatic or thematic (color palette) image:
Sqrt of 17,179,869,184 gives a 131,072 X 131,072 x 1-band x 1-bit image
Sqrt of 8,589,934,592 gives a 92,681 x 92,681 x 1-band x 2-bit image
Sqrt of 4,294,967,296 gives a 65,536 x 65,536 x 1-band x 4-bit image
Sqrt of 2,147,483,648 gives a 46,341 x 46,341 x 1-band x 8-bit image
Sqrt of 1,073,741,824 gives a 32,768 x 32,768 x 1-band x 16-bit image

For a 3-band image, the formula is (less than 8-bit not used for multi-layer files):
(Total bytes allowed / # bands) / bytes per pixel = pixels in each band
For 8-bit data: (2^31 / 3) / 1 = 715,827,883 pixels
For 16-bit data: (2^31 / 3) / 2 = 357,913,941 pixels

Take the square root to get an approximate array size for the 3-band image:
Sqrt of 715,827,883 gives a 26,755 x 26,755 x 3-band x 8-bit image
Sqrt of 357,913,941 gives a 18,919 x 18,919 x 3-band x 16-bit image

For a 4-band image:
For 8-bit data: sqrt ((2^31 / 4) / 1) gives a 23,170 x 23,170 x 4-band x 8-bit image
For 16-bit data: sqrt ((2^31 / 4) / 2) gives a 16,384 x 16,384 x 4-band x 16-bit image
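
The calculations above can be wrapped in a small helper for other band counts and bit depths. This is a minimal sketch of the arithmetic only, not ERDAS code: the function names are hypothetical, it ignores the block and metadata overhead mentioned earlier, and it always rounds down, so a few results come out one pixel smaller than the rounded figures in the text.

```python
import math

LIMIT_BYTES = 2**31  # the 2,147,483,648-byte limit used in the calculations above

def max_pixels_per_band(bits: int, bands: int = 1) -> int:
    """Largest pixel count per band whose raw data fits within LIMIT_BYTES."""
    return LIMIT_BYTES * 8 // (bits * bands)

def max_square_side(bits: int, bands: int = 1) -> int:
    """Side length (rounded down) of the largest square image under the limit."""
    return math.isqrt(max_pixels_per_band(bits, bands))

def likely_needs_ige(width: int, height: int, bands: int, bits: int) -> bool:
    """Rough estimate only: compares the pixel count against the raw limit."""
    return width * height > max_pixels_per_band(bits, bands)

for bands in (1, 3, 4):
    for bits in (8, 16):
        side = max_square_side(bits, bands)
        print(f"{bands}-band x {bits}-bit: about {side:,} x {side:,}")
```

For example, `likely_needs_ige(50000, 50000, 1, 8)` returns True under these assumptions, while one of the 5000 x 5000 x 3-band x 8-bit test images is far below the threshold. Because real files also carry blocks and metadata, treat the limit as an upper bound and expect the rollover somewhat earlier.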

Related Blog Post, Fast and Lossless Compression