kinchungwong/dotnet_imaging_discussion_20160215.md

## dotnet_imaging_discussion_20160215.md

      
    https://github.com/dotnet/corefx/issues/5921
Typical use cases


Intensive versus non-intensive users

One possible metric is to measure the total number of megapixels of images loaded into memory at any moment
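
As a rough illustration of this metric, the sketch below sums the pixel counts of the images held in memory. The function name and the list of `(width, height)` pairs are hypothetical; how an application tracks its loaded images is left open.

```python
def loaded_megapixels(images):
    """Sum the pixel counts of all images currently in memory, in megapixels.

    `images` is an iterable of (width, height) pairs (hypothetical bookkeeping).
    """
    return sum(w * h for w, h in images) / 1_000_000


# Two Letter-size pages scanned at 300 DPI (2550 x 3300 pixels each).
print(loaded_megapixels([(2550, 3300), (2550, 3300)]))  # 16.83
```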


Performance or throughput requirements
Library interoperability requirements
Functional requirements

2D primitives drawing
Fidelity and quality: anti-aliasing support
Hardware acceleration support


Library interoperability


For engineering purposes, there should be a list of libraries with which we would like to achieve interoperability.
From there, we would like to highlight any unusual requirements or potential difficulties.
We would also need an importance rating for each library, so that libraries that are too difficult to interoperate with, and not important enough to justify the work, can be dropped.

Relative importance of library interoperability and image file formats (image MIME types)


Several image MIME types have a de facto reference implementation, typically written in C:

JPEG: IJG JPEG, also known as libjpeg
PNG: libpng
TIFF: libtiff


The significance of reference implementations


Forward-backward compatibility.
Edge-case behavior.
Handling of out-of-spec images which were written by proprietary encoder implementations.

Keep in mind that there are many proprietary implementations of image encoders and decoders. Some of these implementations do not use any code from publicly available sources.
For example, historically there were many proprietary TIFF encoding/decoding libraries not based on libtiff. Although these are less relevant nowadays, business imaging users may have large repositories of archived TIFF images generated by these early implementations that need to remain accessible via any newer TIFF-capable imaging library.


Use of reference implementation may require dependency on unmanaged code (typically written in C).

(See: use of “unsafe code” / “unverifiable code” in managed code)


Porting a reference implementation to a pure managed implementation requires a level of discipline and resources that is typically available only with commercial support.

Example: Libtiff.NET


Types of interoperability

Deep copy (with or without conversion)


With suitable code, deep copy can convert between any memory layout and pixel format. Therefore, deep copy is considered the baseline fallback strategy for library interoperability.
In order to convert an image of size WxH from one library’s image instance into another library’s image instance, it is necessary to have both instances allocated, and then all the pixel data will be copied over or converted.

Memory usage

Memory usage will be momentarily doubled for the duration of conversion.
After the conversion, the question of whether the original image instance could be released is specific to particular use-cases.


Performance

Such conversion is typically bottlenecked by CPU-to-RAM bandwidth when executed using a single CPU core.
See also: (elementary bitmap memory operations - blitting and pixel format conversion)


On-demand slice copy (with or without conversion)


Each request involves either one row of pixels, or a 2D subarray (subrectangle) of pixels.
The consuming library can tailor on-demand slice copy to a portion of the image.
In other words, if only a small portion of the image is consumed, the amount of data copied or converted will be smaller than the full image size.
There is a method-call overhead, which is platform-dependent, language-dependent and code-dependent.
Method-call overhead can be reduced via inlining.
However, some coding styles will prevent inlining.
A tradeoff exists between the relative overhead and the data size per request (granularity).
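
A minimal sketch of a slice-copy request, assuming a row-major source bitmap addressed by a byte stride. The function name `copy_rect` and its parameters are hypothetical and not taken from any particular library.

```python
def copy_rect(src, stride, bytes_per_pixel, x, y, width, height):
    """Copy a sub-rectangle out of a row-major bitmap into a tightly packed buffer.

    src             -- bytes-like object holding the whole bitmap
    stride          -- bytes from the start of one row to the start of the next
    bytes_per_pixel -- size of one pixel in bytes
    """
    row_bytes = width * bytes_per_pixel
    out = bytearray(row_bytes * height)
    view = memoryview(src)
    for row in range(height):
        start = (y + row) * stride + x * bytes_per_pixel
        out[row * row_bytes:(row + 1) * row_bytes] = view[start:start + row_bytes]
    return out


# 4x3 single-channel image whose pixel values encode their own offset.
image = bytes(range(12))
print(list(copy_rect(image, 4, 1, x=1, y=1, width=2, height=2)))  # [5, 6, 9, 10]
```

Only `width * height` pixels are touched per request, so the per-call overhead is amortized over a larger or smaller payload depending on the rectangle size the consumer chooses.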

Shared access to managed array


Only applicable if all image manipulation code is purely managed.

Shared direct access to pinned managed array by unmanaged code


Pinning

Performance
Impact on garbage collection
Impact on software stability (memory fragmentation)


Shared direct access to unmanaged memory

Memory layout


If two image libraries use the same underlying image memory layout, it is possible to implement an image object bridge so that an image instance from one library can be passed into a consuming function in the other library, without making a deep copy.
Some code changes may be needed. The bottom line is that if the memory layouts are different, copying of data is often necessary.
Dimensions

First dimension: channel, if multi-channeled image format
Second dimension: horizontal (pixels on same row)
Third dimension: vertical (rows of pixels)


Padding

Some legacy libraries or platform interfaces require each row of pixels to have a starting address that is aligned.
In case of a library that requires 32-bit alignment, this means some Bgr24 bitmaps may need a padding of 1, 2, or 3 bytes at the end of each row, depending on the pixel width of the Bgr24 bitmap.
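
The padding arithmetic can be sketched as follows. The function name and the default 4-byte (32-bit) alignment are illustrative assumptions.

```python
def aligned_stride(width, bytes_per_pixel, alignment=4):
    """Return (stride, padding) in bytes for rows aligned to `alignment` bytes."""
    row_bytes = width * bytes_per_pixel
    stride = (row_bytes + alignment - 1) // alignment * alignment
    return stride, stride - row_bytes


# Bgr24 uses 3 bytes per pixel.
for width in (4, 5, 6, 7):
    print(width, aligned_stride(width, 3))
# 4 (12, 0)
# 5 (16, 1)
# 6 (20, 2)
# 7 (24, 3)
```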


Contiguous versus noncontiguous memory layout
Tiled, Sparse or Lazily-allocated bitmap memory

These are examples of noncontiguous memory layout.


Elementary bitmap memory operations

Blitting (memory copy)


If the pixel format is blittable (no data manipulation required), the native memory copy operation (“memcpy”, “Array.Copy”, “Buffer.BlockCopy”) can be used to achieve optimal performance.
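
A sketch of a blit between two buffers that may use different strides, written in Python for brevity. The function name `blit` is hypothetical. When both buffers are tightly packed, a single bulk copy suffices; otherwise each row is copied separately.

```python
def blit(src, src_stride, dst, dst_stride, row_bytes, height):
    """Copy `height` rows of `row_bytes` bytes between two row-major buffers."""
    if src_stride == dst_stride == row_bytes:
        # Both buffers are tightly packed: one bulk copy covers the whole image.
        dst[:row_bytes * height] = src[:row_bytes * height]
        return
    src_view = memoryview(src)
    for row in range(height):
        s = row * src_stride
        d = row * dst_stride
        dst[d:d + row_bytes] = src_view[s:s + row_bytes]


# 2 rows of 3 bytes, stored with a 4-byte stride (one padding byte per row).
padded = bytes([1, 2, 3, 0, 4, 5, 6, 0])
packed = bytearray(6)
blit(padded, 4, packed, 3, row_bytes=3, height=2)
print(list(packed))  # [1, 2, 3, 4, 5, 6]
```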

Pixel format conversion


Use of SIMD code is necessary for optimal performance, for both computational reasons (arithmetic and bit manipulation) and bandwidth reasons (movement of data between the CPU execution units and the L1 cache).
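
For illustration only, here is a Bgr24-to-Rgb24 conversion on a tightly packed buffer. Plain Python cannot express SIMD, so this sketch shows only the data movement involved; the function name is hypothetical.

```python
def bgr24_to_rgb24(src):
    """Return a copy of a tightly packed Bgr24 buffer with channels reordered to RGB."""
    if len(src) % 3:
        raise ValueError("buffer length must be a multiple of 3")
    dst = bytearray(len(src))
    dst[0::3] = src[2::3]  # red
    dst[1::3] = src[1::3]  # green
    dst[2::3] = src[0::3]  # blue
    return dst


# Two pixels stored as B, G, R.
print(list(bgr24_to_rgb24(bytes([10, 20, 30, 40, 50, 60]))))  # [30, 20, 10, 60, 50, 40]
```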

Transposition


Transposition is needed if the two image libraries have transposed memory layouts.
A lot of image libraries lay out each row of pixels sequentially; therefore, transposition is rarely a problem for the majority of image libraries.
Transposition is also involved in the 90-degree and 270-degree rotation of bitmaps.
Bitmap transposition algorithms need to be coded carefully in order to squeeze the best throughput from a given CPU-Cache-RAM configuration.
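
One common way to improve cache behavior is to transpose tile by tile, so that both the rows being read and the rows being written stay within a small working set. The sketch below assumes a tightly packed 8-bit single-channel bitmap; the tile size of 32 is an arbitrary placeholder, not a tuned value.

```python
def transpose(src, width, height, tile=32):
    """Transpose a tightly packed 8-bit single-channel bitmap, one tile at a time."""
    dst = bytearray(width * height)  # result is `height` wide and `width` tall
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            for y in range(ty, min(ty + tile, height)):
                for x in range(tx, min(tx + tile, width)):
                    dst[x * height + y] = src[y * width + x]
    return dst


# 3x2 image:  0 1 2      transposed (2x3):  0 3
#             3 4 5                         1 4
#                                           2 5
print(list(transpose(bytes(range(6)), width=3, height=2)))  # [0, 3, 1, 4, 2, 5]
```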

Use of “unsafe code” / “unverifiable code” in managed code


“Unsafe code” refers to code whose safety cannot be verified by the common language runtime (CLR), such as C# code that uses pointers.
https://msdn.microsoft.com/en-us/library/t2yzs44b.aspx
Some users and use-cases have a requirement or strong preference for verifiable code. Others might not care at all.
If unmanaged code is used (such as the reference implementations of image encoder/decoder libraries written in C), the framework as a whole typically does not qualify as verifiable.


Immutability / access control


Immutability is a desirable (“good to have”) feature, not an absolutely necessary (“must have”) feature.
A lot of image processing frameworks were implemented in C, which does not have any notion of immutability.
Immutability is important as a feature because it affects adoption.
The lack of immutability support in a framework undermines the perceived safety (dependability) of any software that is built on top of such framework.

Handle-based immutability (C++ style const-correctness)


A C++ member function can qualify its access to the instance (via the implicit “this”) as either non-const or const.
A C++ function can qualify the access to each function argument as either non-const or const.
A C++ program fails to compile (it is “ill-formed”) if a const-qualified reference is passed into a function call that requires the reference to be non-const, unless a const-cast is applied.
The combined effect of the C++ rules is that:

If method M has obtained a mutable reference to object O, it can pass object O to functions that promise not to modify O, or to functions that don’t make such a promise (i.e. functions that potentially modify O).
If method M has obtained a const-qualified reference to object O, it can only pass object O to functions that make the promise.
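
The same handle-based idea can be approximated outside C++. As a loose analogy (not C++ semantics), Python's `memoryview.toreadonly()` creates a read-only handle to memory that remains writable through the original handle. The key difference is that violations are rejected at run time rather than at compile time. The function names below are hypothetical.

```python
def average(pixels):
    """A consumer that only reads; it works with either kind of handle."""
    return sum(pixels) / len(pixels)


def invert(pixels):
    """A consumer that writes; it needs a mutable handle."""
    for i, value in enumerate(pixels):
        pixels[i] = 255 - value


buffer = bytearray([0, 64, 255])
mutable = memoryview(buffer)       # read-write handle
read_only = mutable.toreadonly()   # read-only handle to the same memory

invert(mutable)                    # allowed
print(list(buffer))                # [255, 191, 0]
print(average(read_only))          # reading through the read-only handle works

try:
    invert(read_only)              # rejected at run time
except TypeError as error:
    print("rejected:", error)
```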


Object-based immutability (freezable)


Any method M can request a freezable object O to make an irreversible transition into frozen state. Once frozen, any attempt to modify the data of O will fail.
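
A minimal sketch of the freezable pattern, using a hypothetical `FreezableBitmap` class. The immutability belongs to the object itself, so every holder of a reference sees the frozen state. (Python enforces the private `_pixels` field by convention only.)

```python
class FreezableBitmap:
    """Hypothetical bitmap wrapper with an irreversible frozen state."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self._pixels = bytearray(width * height)
        self._frozen = False

    @property
    def is_frozen(self):
        return self._frozen

    def freeze(self):
        # There is deliberately no unfreeze(): the transition is one-way.
        self._frozen = True

    def get_pixel(self, x, y):
        return self._pixels[y * self.width + x]

    def set_pixel(self, x, y, value):
        if self._frozen:
            raise RuntimeError("bitmap is frozen")
        self._pixels[y * self.width + x] = value


bitmap = FreezableBitmap(2, 2)
bitmap.set_pixel(0, 0, 200)
bitmap.freeze()
try:
    bitmap.set_pixel(1, 1, 50)
except RuntimeError as error:
    print(error)  # bitmap is frozen
print(bitmap.get_pixel(0, 0))  # 200
```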

Support for pixel formats


For business document imaging, the bitonal image format (1 bit per pixel; black-and-white) is an important pixel format because it packs 8 pixels into a single byte while still allowing fast access to individual pixels.
For multi-channel pixel formats, some imaging libraries favor one byte-ordering over another (between RGB and BGR). Byte order conversion would be required for interoperability. (See: interoperability - copy with conversion.)
A lot of imaging libraries favor the convention for the BMP file format, in which the blue channel occupies the first byte of each pixel.
However, some image formats, such as TIFF, put the red channel in the first byte. The reference implementation of the TIFF codec library thus returns RGB data in that byte order.
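
To make the bitonal packing mentioned above concrete, the sketch below reads and writes individual pixels in a packed 1-bit-per-pixel row. The MSB-first bit order (leftmost pixel in the highest bit) is an assumption made for this example, as libraries and file formats differ; the function names are hypothetical.

```python
def get_bit(row, x):
    """Return pixel x (0 or 1) from a packed 1-bit-per-pixel row (MSB first)."""
    return (row[x >> 3] >> (7 - (x & 7))) & 1


def set_bit(row, x, value):
    """Set pixel x in a packed 1-bit-per-pixel row (MSB first) to 0 or 1."""
    mask = 0x80 >> (x & 7)
    if value:
        row[x >> 3] |= mask
    else:
        row[x >> 3] &= ~mask & 0xFF


row = bytearray(2)  # 16 pixels, all 0
set_bit(row, 0, 1)
set_bit(row, 9, 1)
print([f"{b:08b}" for b in row])           # ['10000000', '01000000']
print(get_bit(row, 9), get_bit(row, 10))   # 1 0
```

Each access costs one shift and one mask, which is why per-pixel access stays cheap despite the packing.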

Support for large images


Business document images can range from normal-sized pages (A4 or Letter-size; 300 or 600 DPI) to large-format engineering drawings (36 inches by 48 inches).
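
To give a sense of scale, the following back-of-the-envelope calculation estimates uncompressed bitmap sizes. It assumes tightly packed rows with no padding; the function name is hypothetical.

```python
def bitmap_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a tightly packed bitmap (no row padding)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# A 36 x 48 inch engineering drawing scanned at 600 DPI (21600 x 28800 pixels).
for label, bpp in (("bitonal", 1), ("24-bit color", 24)):
    size = bitmap_bytes(36, 48, 600, bpp)
    print(f"{label}: {size / 1e6:.1f} MB")
# bitonal: 77.8 MB
# 24-bit color: 1866.2 MB
```

A single 24-bit copy of such a drawing approaches 2 GB, which is relevant to the 32-bit address space concerns discussed later in this document.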

Style of memory management (“tight” / “frugal” / “RAII”)


Paged memory is not a perfect memory-management solution for bitmap-intensive applications.

Because of the memory access patterns of bitmaps (every row of pixels is accessed sequentially, without skipping, as opposed to “hotspots” that favor caching), the use of paged-to-disk memory in a bitmap-intensive application will degrade its performance.
The performance bottleneck is the sequential disk read-write bandwidth.


When dealing with large images with a tight memory constraint, bitmap-intensive applications have a legitimate need to obtain a guarantee that a bitmap that is no longer needed is “completely released”, before starting a new image operation that involves large memory allocations.

The nature of the guarantee is that the application needs to make sure the new image operation will not cause “thrashing”.
This guarantee is satisfied by:

C++ RAII
C# IDisposable.Dispose, but only if the underlying array is actually freed (or the memory pressure is otherwise removed)


Additional issues for 32-bit process environments

Fragmentation of process address space (in 32-bit process environments)


Use of non-contiguous memory layout may increase fragmentation, but will make the framework more resilient in the face of memory fragmentation.
In other words, use of non-contiguous memory layout is favorable if all image manipulation code is written to take advantage of non-contiguous memory layout.
(See also: noncontiguous memory layout; tiling, sparse or lazily-allocated bitmap memory)