
GB2619496A - Non-linear image contrast enhancement - Google Patents

Non-linear image contrast enhancement

Info

Publication number
GB2619496A
Authority
GB
United Kingdom
Prior art keywords
image
pixel
value
pixel value
values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB2207969.3A
Other versions
GB202207969D0 (en)
GB2619496B (en)
Inventor
Steve Carl Jamieson Parker
Michael Alistair Porteous
Edward John Fry
Jeremy Paul Segar
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rheinmetall Electronics UK Ltd
Original Assignee
Rheinmetall Electronics UK Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rheinmetall Electronics UK Ltd filed Critical Rheinmetall Electronics UK Ltd
Priority to GB2207969.3A priority Critical patent/GB2619496B/en
Publication of GB202207969D0 publication Critical patent/GB202207969D0/en
Publication of GB2619496A publication Critical patent/GB2619496A/en
Application granted granted Critical
Publication of GB2619496B publication Critical patent/GB2619496B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/40: Image enhancement or restoration using histogram techniques
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/90: Dynamic range modification of images or parts thereof
    • G06T5/92: Dynamic range modification of images or parts thereof based on global image properties

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

Disclosed is a method for enhancing the contrast in an image. The method starts by applying a gain to a received image, multiplying the pixel value of each pixel in the image by the ratio of a target mean output pixel value to the mean pixel value of the pixels in the image 201. The distribution of pixel values of the gain-adjusted image is then adjusted by determining a pixel difference value for each pixel, calculated as the difference between the gain-applied pixel value and the target mean output pixel value 202. An output image is then determined by mapping the pixel difference values into output image pixel values, using a mapping function that expands the distribution of pixel difference values close to the target mean output pixel value and compresses the distribution of pixel values far away from the target mean output pixel value 203. The method may calculate a histogram of the values of the pixels in the image, such that the mean pixel value is calculated using the histogram.

Description

Non-linear Image Contrast Enhancement
FIELD
[0001] The present disclosure relates to methods for enhancing contrast in an image such as, but not limited to, enhancing the visibility of low contrast features in predominantly dark scenes.
BACKGROUND
[0002] A number of techniques have previously been employed to improve the contrast in images by enhancing the differences in brightness between objects in the image. Many of these techniques involve what is known as histogram equalisation. Histogram equalisation typically consists of increasing the global contrast of an image by redistributing the pixel values across the dynamic range of the possible pixel values.
[0003] Standard histogram equalisation may be modified in a number of ways, for example by implementing adaptive histogram equalisation consisting of computing several histograms, each corresponding to a distinct section of the image, and using them to redistribute the brightness values of the image. Adaptive histogram equalisation often tends to over-amplify the contrast in certain portions of the image, and so may cause overall harshness in the image. Therefore, adaptive histogram equalisation may be further modified to limit the amount of contrast amplification, resulting in a technique known as contrast limited adaptive histogram equalisation (CLAHE).
[0004] A number of prior disclosures are found in the following publications:
[0005] US 2016/0267631 A1 discloses an adaptive contrast enhancement method.
[0006] WO 2018/161078 A1 discloses an image adjustment and standardisation method.
[0007] US 8,285,040 B1 discloses a method and apparatus for adaptive contrast enhancement of image data.
[0008] US 2003/0174887 A1 discloses an apparatus and method to enhance a contrast using histogram matching.
SUMMARY
[0009] The presently taught approaches are defined in the appended set of claims.
[0010] According to a first aspect of certain examples, there is provided a method for enhancing contrast in an image, the method comprising: applying gain to a received image by multiplying the pixel value of pixels in the image by a ratio of a target mean output pixel value and a mean pixel value for the pixel values of the pixels in the image; and adjusting the dynamic range of the gain-adjusted image by: determining a pixel difference value by calculating the difference between each gain-applied pixel value and the target mean output pixel value; and determining an output image by mapping the pixel difference values into output image pixel values, using a mapping function that expands the distribution of pixel difference values close to the target mean output pixel value and compresses the distribution of pixel values far away from the target mean output pixel value.
[0011] According to a second aspect of certain examples, there is provided a computer implemented method comprising: calculating a mean pixel value of an input image, the input image comprising a plurality of pixels each having a respective pixel value; obtaining a first intermediate image by multiplying the pixel value for each of the plurality of pixels of the input image by the ratio of a target value for the mean of an output image to the mean pixel value of the input image; obtaining a second intermediate image by calculating the difference between the pixel values of the first intermediate image and the target value for the mean of the output image; obtaining each output image pixel value of the output image by: when the corresponding pixel value of the second intermediate image is greater than or equal to zero, mapping the corresponding pixel value of the second intermediate image to a value using a first mapping function and adding this value to the target value for the mean of the output image, and when the corresponding pixel value of the second intermediate image is less than zero, mapping the corresponding pixel value of the second intermediate image to a value using a second mapping function and subtracting this value from the target value for the mean of the output image; and outputting the output image comprising the output image pixel values.
[0012] According to a third aspect of certain examples, there is provided a system comprising: an image capture device configured to generate an input image; an image processing unit configured to: apply gain to a received image by multiplying the pixel value of pixels in the image by a ratio of a target mean output pixel value and a mean pixel value for the pixel values of the pixels in the image; and adjust the dynamic range of the gain-adjusted image by: determining a pixel difference value by calculating the difference between each gain-applied pixel value and the target mean output pixel value; and determining an output image by mapping the pixel difference values into output image pixel values, using a mapping function that expands the distribution of pixel difference values close to the target mean output pixel value and compresses the distribution of pixel values far away from the target mean output pixel value.
[0013] It will be appreciated that features and aspects of the presently taught approaches described above in relation to the various aspects of the invention are equally applicable to, and may be combined with, embodiments of the approaches according to other aspects as appropriate, and not just in the specific combinations described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] Embodiments of the presently taught approaches will now be described, by way of example only, with reference to the accompanying drawings, in which:
[0015] Figure 1 schematically illustrates an example computing system.
[0016] Figure 2 schematically illustrates an example workflow for applying the disclosed techniques to an input image.
[0017] Figure 3 illustrates a number of different example curves for pixel values above the mean depending on different values of the linearity factor.
[0018] Figure 4 illustrates a number of different example curves for pixel values below the mean depending on different values of the linearity factor.
[0019] Figure 5 illustrates an example of the input to output pixel mapping performed by the image processing unit on a single 8-bit input image.
[0020] Figure 6 illustrates an input image (left) and an output image (right) after applying the present teachings.
[0021] Figure 7 illustrates the use of upper and lower threshold factors for the cumulative histogram to obtain upper and lower bounds for calculating the mean of the input image.
[0022] Figure 8 illustrates a temporal filter for use in applying weights to parameters when applying the disclosed techniques to a sequence of video frames.
DETAILED DESCRIPTION
[0023] Aspects and features of certain examples and embodiments are described herein.
Some aspects and features of certain examples and embodiments may be implemented conventionally, and these are not discussed or described in detail in the interests of brevity. It will thus be appreciated that aspects and features of apparatus and methods discussed herein which are not described in detail may be implemented in accordance with any conventional techniques for implementing such aspects and features.
[0024] The present disclosure relates to image processing methods and techniques for improving the visibility of low contrast features in predominantly dark scenes in digital images.
The disclosed methods operate on digital images captured by a digital image sensor such as may be found in a digital camera (e.g. a stills camera, a video camera, a CCTV security camera or the like). Such digital images comprise an array of numbers with each number representing a brightness value (or grey-level value) for each element of the array (i.e. pixel).
The range of brightness values each pixel may take may be termed the "bit-depth", "bit-width", and/or the "dynamic range" of the image. Commonly, the dynamic range output by such a digital camera is 8-bit, such that each pixel may take 2^8 = 256 possible values in the range 0-255, where conventionally 0 is used to represent the minimum brightness (which may appear black) and 255 is used to represent maximum brightness (which in a greyscale image may appear white). It will be understood, however, that other dynamic ranges are available, for example 4-bit, 10-bit, 12-bit and 16-bit, and that the present disclosure is not limited to any particular numerical value of dynamic range. The maximum value of a given pixel for a particular bit-depth according to this convention may conveniently be represented as 2^b − 1, where b is the bit-depth of the image. It will also be understood that the methods disclosed herein are disclosed in the context of greyscale images for ease of explanation, but the disclosure is not limited in this respect and is equally applicable to colour images.
[0025] Colour images may be represented by a number of pixel arrays, commonly three in number, with each array corresponding to one of the properties used to record the colour images. One common colour image approach uses three pixel arrays, one corresponding to a brightness value for each of red, green or blue channels in the image; this is known as an RGB image and the dynamic range of an RGB image is represented by the dynamic range of each colour array (which is typically the same for each colour array). Alternative formats of representing colour images are known, such as HSV (for hue, saturation, value), wherein the "H" array represents the hue of the pixels, the "S" array represents the saturation of the pixels, and the "V" array represents the brightness of the pixels. A number of other colour formats are also available. Colour images represented in a particular colour format can be converted to other colour formats. For example, colour images in the RGB format can be converted into HSV using well-known mathematical functions.
[0026] When the present techniques are applied to colour images, the dynamic range processing is applied to the values in the image that correspond to the brightness. Thus in an RGB image the dynamic range processing would be applied to each colour array as each colour array records brightness information. On the other hand, in an HSV image the dynamic range processing would be applied to the V array as this represents brightness for the image. Thus in some implementations based on colour images, it may be appropriate to convert from a format which uses multiple arrays to record brightness (such as RGB format) to a format which uses fewer or only one array to record brightness (such as HSV format) since the image processing to enhance dynamic range can be performed only on the brightness-related array(s) in the same way as when processing a greyscale image. In such an approach, the processed brightness-related array (the V array in HSV format) may later be recombined with the other image arrays (the H and S arrays in HSV format) to form a processed colour image.
In this way, as well as others known to the skilled person, the following disclosure may be applied to both greyscale and colour images.
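Purely by way of illustration, the colour-handling route described above might be sketched in Python as follows using OpenCV: the image is converted to HSV, only the V (brightness) array is processed, and the result is recombined. The helper enhance_fn stands in for the greyscale contrast-enhancement processing described in the remainder of this disclosure; the function names here are assumptions of the sketch, not part of the disclosed method.

```python
import cv2
import numpy as np

def enhance_colour_image(bgr: np.ndarray, enhance_fn) -> np.ndarray:
    """Apply a greyscale enhancement function to the brightness channel only."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)   # separate brightness (V) from colour (H, S)
    h, s, v = cv2.split(hsv)
    v_out = enhance_fn(v).astype(v.dtype)        # process only the brightness array
    out = cv2.merge([h, s, v_out])               # recombine with the untouched H and S arrays
    return cv2.cvtColor(out, cv2.COLOR_HSV2BGR)
```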
[0027] As illustrated in Figure 1, the present approaches may be implemented using a system 100 comprising an image capture device 105, image processing unit 110, and display 125.
[0028] As explained above, an input image may be captured from any suitable image capture device 105 such as a digital camera. According to the disclosure, the input image may be sent to an image processing module or image processing unit 110. The image processing unit 110 may be implemented as part of the image capture device 105, as a standalone computing unit directly connected to the image capture device 105, or as a remote computing unit configured to receive the input image from the image capture device 105 via an intermediate connectivity arrangement such as a connectivity network. Such direct connection and/or connectivity network may include physical (wired) and/or wireless links.
[0029] Also, it should be noted that the image may be stored between a time of capture by the image capture device 105 and a time of being processed by the image processing unit 110. Such storage may be by way of caching, for example in a memory of the image capture device 105, a memory of the image processing unit 110, and/or a memory of an intermediate entity such as a network fabric element of a connecting network. Such storage may also or alternatively be by way of storage over time such as in a persistent memory such as a memory device, and such a memory device may be used to transfer the image(s) captured by the image capture device 105 to the image processing unit 110.
[0030] In one example embodiment, and as illustrated in Figure 1, the image processing unit may comprise a system-on-chip (SoC) device, the SoC device comprising field programmable gate array (FPGA) firmware 115 and a central processing unit (CPU) 120. For example, and in very particular examples, the SoC device may comprise a Xilinx Zynq chip and the CPU may comprise an ARM-based CPU. However it will be appreciated that the present teachings are not limited in this respect and other hardware based upon different technologies is also appropriate (including, for example, other Xilinx architectures, Altera architectures, x86 architectures, SPARC architectures or the like).
[0031] A display 125 may then be used to output an image captured by the image capture device. The display 125 may be implemented as part of the image processing unit 110, as a standalone display unit directly connected to the image processing unit 110, or as a remote display configured to receive the processed image from the image processing unit 110 via an intermediate connectivity arrangement such as a connectivity network. Such direct connection and/or connectivity network may include physical (wired) and/or wireless links.
[0032] Additionally or alternatively, the system 100 may include an image recognition unit 130 configured to utilise the output image for the purposes of detection, recognition and identification. This may be achieved by using any suitable image recognition algorithm, such as an AI-based algorithm or otherwise. The image recognition unit 130 may generate image metadata for consumption by another system, for example a vehicle control unit 135 used for operating or aiding in operating a vehicle. Such a vehicle control unit 135 may for example be similar to systems that use imaging-type inputs from conventional imaging, LIDAR or RADAR to control or assist with controlling a vehicle. In this case, the output image itself may not be output to display 125 for viewing by a human user, but may instead be passed to image recognition unit 130 and/or vehicle control unit 135.
[0033] As with the transfer of the captured image from the image capture device 105 to the image processing unit, the image may be stored between a time of processing by the image processing unit 110 and a time of being displayed by the display 125. Such storage may be by way of caching, for example in a memory of the image processing unit 110, a memory associated with the display 125, and/or a memory of an intermediate entity such as a network fabric element of a connecting network. Such storage may also or alternatively be by way of storage over time such as in a persistent memory such as a memory device, and such a memory device may be used to transfer the image(s) processed by the image processing unit 110 to the display 125.
[0034] The image capture device 105 may be configured to capture an input image during use. The input image may be captured as a still image or captured as a frame in a video sequence. As discussed above, the input image comprises an array of pixels. This array of pixels may be represented as ip(x, y), whereby the array ip has dimensions X by Y (both being positive, non-zero whole numbers), and x and y represent a coordinate position within the array.
[0035] After capture by the image capture device 105, the input image ip(x, y) is provided to the image processing unit 110. From a general perspective, image processing unit 110 may be configured to generate a histogram of the input image using the FPGA firmware 115 and apply a mapping function (otherwise referred to as a look-up table, or LUT) to the input image calculated by software implemented on the CPU 120. Generation of a histogram may comprise calculating a number of pixel values that fall within a particular histogram bin. In this context, a histogram bin defines a predetermined range of pixel values, the predetermined range also being conventionally known as the bin size. Thus, the full range of possible input values is segmented into a number of histogram bins of a predetermined size. The number of input pixel values falling within each histogram bin is accordingly referred to as the bin frequency for that particular bin. The size and number of bins can be selected depending upon the desired implementation. For example, a histogram with fewer, larger bins requires less memory to be stored, and may therefore be useful for managing FPGA resource usage, as well as managing the amount of data transferred between the FPGA firmware 115 and CPU 120. In the present example, this approach to distribution of functionality within the image processing unit is adopted since the determination of a histogram is based upon a pixel-by-pixel processing of the input image, and the mapping function is based upon a number of entries governed by the size of the dynamic range of the input image. This approach is particularly appropriate for the present example as the firmware 115 may be configured to act on a pixel-by-pixel level for what may typically be more than one million pixels per input image, whereas the software implemented in the CPU 120 may be configured to act on the number of entries for mapping based on the dynamic range of the input image, which may typically be 256 for an 8-bit image. Hence, as there may be three or more orders of magnitude difference between the number of pixels in the input image and the number of entries in the mapping function, efficient use of bandwidth may be achieved: the CPU software can efficiently be programmed to handle the lower processing bandwidth associated with calculation of the mapping function, and the FPGA firmware 115 can efficiently handle the much higher processing bandwidth associated with performing the histogram calculation and applying the mapping function to the input image.
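As a rough, non-authoritative illustration of this division of labour (and not a description of the actual firmware), the histogram accumulation and the histogram-based mean might be sketched in Python as follows; the function names are illustrative only.

```python
import numpy as np

def accumulate_histogram(image: np.ndarray, bit_depth: int = 8) -> np.ndarray:
    """Pixel-by-pixel histogram accumulation (the FPGA firmware's role)."""
    levels = 2 ** bit_depth
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    return hist

def mean_from_histogram(hist: np.ndarray) -> float:
    """Mean pixel value computed from bin frequencies (the CPU software's
    role): sum(value x frequency) / total pixel count."""
    values = np.arange(hist.size)
    return float((values * hist).sum() / hist.sum())
```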
[0036] As discussed above, at a general level image processing unit 110 may provide a mapping between the pixels of the input image received from the image capture device 105 and an output image for display on display 125. A number of parameters may be provided to image processing unit 110 to adjust this mapping as explained in more detail below. These parameters may be "fixed" such that they cannot be changed by a user during operation of the system 100, or may alternatively be programmable during use such that they are able to be adjusted on the fly, i.e. they may be run-time programmable. In some examples, a subset of parameters may be fixed and the remainder being run-time programmable in order to simplify operation of the system 100 and offer increased stability during use.
[0037] In some examples, the system 100 is mounted or located on or in a vehicle, thus enabling control of the vehicle in low light and/or low contrast conditions. In other examples, the system may form part of a surveillance system, whereby the image capture device 105 is mounted in a fixed location, with the image processing unit 110 and display 125 being located remotely from the image capture device 105, although it will be appreciated that the relative locations
of the image capture device 105, image processing unit 110 and the display 125 are not limited to such an arrangement.
[0038] Furthermore, the present techniques can be used in conjunction with a high bit-depth sensor in order to better extract contrast information and to redistribute the information contained in the image over the available dynamic range, allowing for a lower bit-width to be used in subsequent processing and thus optimising bandwidth usage in the subsequent processing. For example, a high bit-depth sensor may capture 10-bit, 12-bit, 16-bit, etc. images, as opposed to lower bit-depth sensors which may capture 8-bit or lower images. As display devices are commonly configured to output an 8-bit image, the present techniques are able to extract details and scene information from a high bit-depth sensor in order to display the output image on a lower bit-depth display device.
[0039] As an illustrative example, a 12-bit colour sensor may be used to capture a 36-bit RGB input image. In this example, this input image is converted into the YCbCr colour image format and subsampled into a 24-bit YCbCr 4:2:2 image. This subsampled image is further reduced to obtain a 16-bit YCbCr 4:2:2 image. Reducing the pixel representations to 16-bit or lower provides more efficient use of bandwidth when subsequently storing and transferring image data as part of the image processing technique.
[0040] However, as the input signal bit-width increases, so the memory required to hold both the histogram and the mapping LUT increases, and this can consume significant amounts of the available Block RAM on an FPGA device. In order to mitigate this situation, the bin size of the histogram may be made larger such that each bin contains not just instances of a single pixel value, but also a number of adjacent values (e.g. 2, 4, 8 etc.). This may be achieved by discarding a certain number of least significant bits (LSBs) from the pixel value when transforming the input pixel values into a corresponding histogram bin frequency. For example, the first two LSBs may be discarded when calculating the histogram, effectively combining adjacent values that differ only in those two LSBs.
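A minimal sketch of this binning scheme, assuming (as in the example above) that two least significant bits are discarded; the bit-depth and function name are assumptions of the sketch.

```python
import numpy as np

def coarse_histogram(image: np.ndarray, bit_depth: int = 12,
                     discard_lsbs: int = 2) -> np.ndarray:
    """Histogram with larger bins: dropping LSBs merges 2**discard_lsbs
    adjacent pixel values into one bin, shrinking the histogram memory."""
    n_bins = 2 ** (bit_depth - discard_lsbs)
    bin_idx = image.astype(np.uint32) >> discard_lsbs  # e.g. values 0-3 all map to bin 0
    return np.bincount(bin_idx.ravel(), minlength=n_bins)
```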
[0041] With reference to Figure 2, the operation of image processing unit 110 will now be explained in greater detail.
[0042] As explained above, a user of the system 100 may set a number of parameters for controlling the mapping. One of these parameters is a target value for the mean output pixel value of the output image, herein referred to as opMeanPixVal. The numerical value of opMeanPixVal lies within the range of allowable values, e.g. between 0 and 255 for 8-bit images. To keep the process independent of the bit-width of the images, in some examples the user may configure the image processing unit by setting a control value, referred to herein as opMeanFactor, which is defined as the fraction of opMeanPixVal compared to the full dynamic range of the output image, where {opMeanFactor ∈ ℝ | 0 ≤ opMeanFactor ≤ 1}. The value of opMeanFactor may be determined empirically; in some examples, opMeanFactor may be set to 0.4, as this has been found in practice by the inventors, through modelling with a variety of source images and practical experimentation using real systems, to give good contrast enhancement results and a natural-looking output image with high levels of detail.
[0043] As illustrated in step 201 of Figure 2, the image processing unit 110 applies a gain to the input image. In the present examples, applying a gain to the input image comprises multiplying each pixel value in the image array by a number referred to as the gain factor. According to the present examples, the gain factor is obtained by first calculating the mean pixel value ipMeanPixVal of the input image. The mean pixel value of the input image may, for example, be calculated from a histogram of the input image generated by the FPGA firmware 115, the mean being calculated from this histogram by the software implemented in the CPU 120. The gain factor is then calculated as the ratio of the target mean output pixel value to the mean input pixel value. This ratio, which in the nomenclature used herein can be represented as opMeanPixVal / ipMeanPixVal, may then be applied to the pixel values of the input image ip(x, y). In this example, the gain factor is applied to the input image to obtain an array of amplified pixel values ipAmp(x, y), which may be represented as:

ipAmp(x, y) = round[(opMeanPixVal / ipMeanPixVal) × ip(x, y)]

[0044] In this example, the "round" function converts a non-integer value to the integer value closest to it. Depending upon the magnitude of the gain factor, the pixel values of the array ipAmp(x, y) may now exceed the maximum valid output pixel value of 2^b − 1, and the lower part of the dynamic range may not be efficiently utilised. Therefore, according to the present examples, the pixel values of ipAmp(x, y) above and below the target output mean are mapped into the respective available ranges.

[0045] As illustrated in step 202 of Figure 2, in the present examples this mapping is performed by determining pixel difference values, calculating the difference between each gain-applied pixel value and the target mean output pixel value. In other words, the difference ΔipAmp(x, y) from opMeanPixVal for each pixel of the amplified input image ipAmp(x, y) may be calculated as:

ΔipAmp(x, y) = ipAmp(x, y) − opMeanPixVal

[0046] As illustrated in step 203 of Figure 2, the pixel difference values are then mapped into output image pixel values. The present examples achieve this by mapping the pixel difference values ΔipAmp(x, y) into the upper and lower valid regions of the output dynamic range using respective mapping functions that expand the distribution of output pixels close to the output mean (which may correspond to image pixels where features are more likely to be of interest), but compress the distribution at the extremes of the range, i.e. for pixel values that are close to the minima and maxima of the dynamic range.
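By way of a non-authoritative sketch, steps 201 and 202 of Figure 2 (the gain of paragraph [0043] and the difference of paragraph [0045]) could be expressed in Python as follows; the function name is an assumption of the sketch.

```python
import numpy as np

def gain_and_difference(ip: np.ndarray, op_mean_pix_val: float,
                        ip_mean_pix_val: float):
    """Apply the gain factor opMeanPixVal / ipMeanPixVal, then subtract the
    target mean to obtain the pixel difference values."""
    gain = op_mean_pix_val / ip_mean_pix_val
    ip_amp = np.round(gain * ip.astype(np.float64))  # ipAmp(x, y), may exceed 2**b - 1
    delta = ip_amp - op_mean_pix_val                 # ΔipAmp(x, y)
    return ip_amp, delta
```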
[0047] In some examples, different mapping functions may be used for positive and negative values of ΔipAmp(x, y), the functions being denoted by upperNonLinLut and lowerNonLinLut respectively. These functions may be adjusted by the user by configuring the image processing unit 110 with control parameters upperNonLinFactor and lowerNonLinFactor, which define the degree of linearity of the two mapping functions. According to this example, a lower value for each of these parameters may denote a more linear response, and a higher value may denote a less linear response.
[0048] For pixel values that fall within the range of ΔipAmp(x, y) ≥ 0, the function upperNonLinLut may take the normalised form:

normalisedLUT(x) = tanh(u · π · x) / tanh(u · π)

[0049] In this form, the parameter u represents either of the respective linearity factors upperNonLinFactor and lowerNonLinFactor, and x represents the normalised input pixel value ranging from 0 to 1. In this context, the normalised input pixel value represents the fraction of the input pixel value compared to the available range of input pixel values, ranging from the mean input pixel value to the maximum available pixel value for the upper non-linear function, and to the minimum input pixel value for the lower non-linear function. As an illustrative example, consider an 8-bit input image having a mean pixel value of 128. For the calculation of the upper non-linear function, the normalised range of 0 to 1 corresponds to the range of 128 to 255. Thus, in this example, an input pixel value of 200 would have a normalised input pixel value of (200 − 128) / (255 − 128) ≈ 0.57.
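A small sketch of this normalised curve, sampled as a look-up table; the sample count n is an assumption of the sketch.

```python
import numpy as np

def normalised_lut(u: float, n: int = 256) -> np.ndarray:
    """Sample normalisedLUT(x) = tanh(u*pi*x) / tanh(u*pi) for x in [0, 1].
    Small u gives a near-linear curve; large u expands values near x = 0
    and compresses values near x = 1."""
    x = np.linspace(0.0, 1.0, n)
    return np.tanh(u * np.pi * x) / np.tanh(u * np.pi)
```

For example, normalised_lut(0.1) is close to the identity mapping, while normalised_lut(10) saturates quickly, matching the curve families of Figures 3 and 4.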
[0050] Figure 3 illustrates a number of example curves depending on different example values of the linearity factor upperNonLinFactor of 0.1, 1, 2, and 10 for pixel values above the mean.
A corresponding approach can also be used for pixel values below the mean, as illustrated in Figure 4, where the mapping is mirrored. As can be seen in Figures 3 and 4, the curves may be approximately linear when the corresponding non-linear factor is close to zero, e.g. 0.1, thus resulting in an approximately linear mapping for either of the upper or lower non-linear functions. Therefore, according to the examples set out above, the output pixel values of the output image op(x, y) are calculated as:

op(x, y) = opMeanPixVal + upperNonLinLut[|ΔipAmp(x, y)|]  for ΔipAmp(x, y) ≥ 0
op(x, y) = opMeanPixVal − lowerNonLinLut[|ΔipAmp(x, y)|]  for ΔipAmp(x, y) < 0

[0051] Figure 5 illustrates an example of the net effect of the processing performed by the image processing unit 110 on a single 8-bit input image. In this example, lowerNonLinFactor is close to zero, denoting an approximately linear response, whereas upperNonLinFactor is larger (approximately 1) to compress samples in the range [100, 255]. This type of response may be chosen if pixels in the range [100, 255] are sparsely populated in the image, which is wasteful in terms of the information that may be conveyed to the user by the display device. In other examples, different values for these parameters may be used, for example based upon image properties such as pixel value ranges which are sparsely populated, and/or the distance from the target mean to the maximum and/or minimum possible value.
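A sketch combining the two branches above, assuming (as a simplification of this example) that the upper and lower curves are normalised over the ranges from the target mean to the maximum and minimum of the output dynamic range respectively; the interpolation-based LUT lookup is one possible realisation, not the disclosed implementation.

```python
import numpy as np

def map_to_output(delta: np.ndarray, op_mean: int, upper_lut: np.ndarray,
                  lower_lut: np.ndarray, bit_depth: int = 8) -> np.ndarray:
    """Map pixel differences through upperNonLinLut / lowerNonLinLut and
    re-centre them on the target mean, as in the op(x, y) equations above."""
    max_val = 2 ** bit_depth - 1
    up_range, lo_range = max_val - op_mean, op_mean
    mag = np.abs(delta)
    xp_u = np.linspace(0.0, 1.0, upper_lut.size)   # LUT sample positions
    xp_l = np.linspace(0.0, 1.0, lower_lut.size)
    up = op_mean + up_range * np.interp(np.clip(mag / up_range, 0, 1), xp_u, upper_lut)
    down = op_mean - lo_range * np.interp(np.clip(mag / lo_range, 0, 1), xp_l, lower_lut)
    out = np.where(delta >= 0, up, down)           # pick branch by sign of ΔipAmp
    return np.clip(np.round(out), 0, max_val).astype(np.uint16)
```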
[0052] For example, Figure 6 shows the effect of applying the non-linear mapping of Figure 5 to an input image (the left-hand part of Figure 6) to obtain an enhanced output image (the right-hand part of Figure 6). As can be seen, the contrast and detail of the image have been increased significantly while maintaining a natural look, such that the presence of a person within the image frame becomes clearly visible.
[0053] The precise shapes of the mapping functions upperNonLinLut and lowerNonLinLut may be calculated dynamically in the software implemented on the CPU 120 and are, according to some examples, governed by two factors: the non-linear parameters upperNonLinFactor and lowerNonLinFactor; and the statistical properties of the histogram of the input image. The statistical properties of the histogram may be characterised by ipMeanPixVal as described above, as well as by the additional properties cumHistLoThreshFactor and cumHistHiThreshFactor. These additional properties act to discount input pixels that are approximately equal to the minima or maxima of the dynamic range from the calculation of the mean pixel value of the input image, as explained below.
[0054] Experience has shown that, when determining the image gain by calculating ipMeanPixVal, special consideration may be given to input pixels that are approximately equal to the minima or maxima of the dynamic range. Input images that have large numbers of pixels in these ranges can adversely bias the gain factor and result in image contrast enhancement performance that may be less optimal than could otherwise be achieved. Consequently, according to some examples, upper and lower limits may be set, namely ipMeanLoThreshPixVal (lower threshold pixel value) and ipMeanHiThreshPixVal (upper threshold pixel value). Accordingly, pixels with values lower than or equal to ipMeanLoThreshPixVal or higher than or equal to ipMeanHiThreshPixVal may be discounted from the calculation of ipMeanPixVal. Default values may be fixed for the upper and lower thresholds. For example, the lower threshold may be set as approximately 1% of the dynamic range and the upper threshold as 99% of the dynamic range: ipMeanLoThreshPixVal = round(0.01 × (2^b − 1)) and ipMeanHiThreshPixVal = round(0.99 × (2^b − 1)). For example, for an 8-bit image (b = 8) these pixel threshold values would be 3 and 252 respectively, and for a 10-bit image (b = 10) they would be 10 and 1013 respectively.
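As an illustrative sketch of this exclusion (again operating on the histogram rather than the raw pixels, and with an assumed function name):

```python
import numpy as np

def mean_excluding_extremes(hist: np.ndarray, lo_thresh: int,
                            hi_thresh: int) -> float:
    """Mean pixel value with bins at or below the lower threshold and at or
    above the upper threshold discounted from the calculation."""
    values = np.arange(hist.size)
    keep = (values > lo_thresh) & (values < hi_thresh)
    return float((values[keep] * hist[keep]).sum() / hist[keep].sum())
```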
[0055] According to some embodiments, a robust way of determining the smallest and largest pixel values of the input image to be used in the calculation of the mean, i.e. ipMeanLoThreshPixVal (lower threshold pixel value) and ipMeanHiThreshPixVal (upper threshold pixel value), may be to calculate the lower and upper pixel values that correspond to two user-defined positions in a cumulative histogram of the input image, herein referred to as cumHist. For example, the user may define normalised cumulative histogram values of cumHistLoThreshFactor = 0.01 and cumHistHiThreshFactor = 0.99, which would ensure that only 1% of the image is darker and only 1% brighter than these pixel values respectively. Cumulative histogram values may be used to derive these limits because they provide more robust results than the minimum and maximum pixel values of the input image, which are easily skewed by noise in images.
[0056] According to some examples, the software implemented in the CPU 120 computes the normalised cumulative histogram cumHist and determines the two nearest pixel values cumHistHiPixVal and cumHistLoPixVal that correspond to the following conditions: cumHist(cumHistHiPixVal) ≥ cumHistHiThreshFactor; cumHist(cumHistLoPixVal) ≤ cumHistLoThreshFactor.
[0057] As would be understood by the skilled person, a cumulative histogram of an image array comprises a running total of the number of pixels that possess incrementally higher pixel values. It thus follows that a normalised cumulative histogram comprises the values of the cumulative histogram divided by the maximum value of the cumulative histogram, such that the minimum value of the normalised cumulative histogram is 0 and the maximum is 1.
[0058] An example of this calculation for an 8-bit input image is illustrated in Figure 7, which depicts the normalised standard histogram with the solid line and the normalised cumulative histogram with the dashed line. In this example, cumHistLoThreshFactor = 0.01 and cumHistHiThreshFactor = 0.98, resulting in values of cumHistLoPixVal = 9 and cumHistHiPixVal = 96. The values cumHistLoThreshFactor and cumHistHiThreshFactor may either be fixed or user-programmable. When fixed, these values may for example be provided with defaults of 0.001 and 0.999 respectively. This would allow 0.1% of the corrected image frames to be set to 0 and 0.1% to saturate. However, when user-programmable, cumHistLoThreshFactor and cumHistHiThreshFactor may be changed if these defaults are inappropriate for the images produced by the particular image sensor used in image capture device 105. The values cumHistLoPixVal and cumHistHiPixVal may be calculated in software on the CPU 120 and, in accordance with the examples set out above, are the statistical quantities used in conjunction with lowerNonLinFactor and upperNonLinFactor for governing the shapes of the mapping functions lowerNonLinLut and upperNonLinLut respectively.
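A compact sketch of this threshold search on the normalised cumulative histogram; the searchsorted-based lookup is one possible realisation under the conditions of paragraph [0056], not the disclosed implementation.

```python
import numpy as np

def cum_hist_thresholds(hist: np.ndarray, lo_factor: float = 0.001,
                        hi_factor: float = 0.999) -> tuple[int, int]:
    """Find cumHistLoPixVal and cumHistHiPixVal: the pixel values nearest
    the user-defined positions in the normalised cumulative histogram."""
    cum_hist = np.cumsum(hist) / hist.sum()             # normalised to the range 0..1
    lo_pix = int(np.searchsorted(cum_hist, lo_factor))  # first value with cumHist >= lo_factor
    hi_pix = int(np.searchsorted(cum_hist, hi_factor))  # first value with cumHist >= hi_factor
    return lo_pix, hi_pix
```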
[0059] When using image sensors with high bit-depths as discussed above, an additional processing step may be implemented based on the assumption that, across a small range, the mapping function is approximately linear. Therefore, linear interpolation may be used to find intermediate values of the mapping function for higher bit-widths. Thus, it may be possible to achieve storage gains by storing less than all values and then interpolating between the stored values. For example, implementations may provide for storing only every second, fourth, eighth etc. value of the mapping function; intermediate values may then be obtained via linear interpolation when needed.
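A sketch of this storage-saving scheme, assuming (for illustration) that every fourth LUT entry is stored:

```python
import numpy as np

def expand_sparse_lut(sparse_lut: np.ndarray, stride: int = 4) -> np.ndarray:
    """Rebuild a full-resolution mapping LUT from entries stored at every
    `stride`-th position, using linear interpolation for the gaps."""
    stored_positions = np.arange(sparse_lut.size) * stride
    full_positions = np.arange(stored_positions[-1] + 1)
    return np.interp(full_positions, stored_positions, sparse_lut)
```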
[0060] Once the mapping functions have been applied to the input image as explained above to obtain op(x,y), the contrast-enhanced output image op(x,y) may then be communicated to display 125 to be rendered and viewed by a user.
[0061] The techniques described above are not limited only to single images, but may also be applied to videos comprising a sequence of image frames as explained below.
[0062] A potential downside of applying gain-related video processing based on parameters estimated from single frames is that the temporal dynamics of the scene, or any fluctuations in noise, may result in the corrected output video experiencing sudden changes in brightness or contrast that manifest themselves as objectionable flashes or flicker. This can be particularly problematic when the images are dark and a large gain value is needed (whereby calculation of the amplified input image comprises a division by a small value).
[0063] According to some examples, image flashing may be alleviated by temporally filtering the calculated values of ipMeanPixVal, cumHistLoPixVal and cumHistHiPixVal before they are used to calculate lowerNonLinLut and upperNonLinLut. The corresponding filtered values are referred to herein as ipMeanPixValFilt, cumHistLoPixValFilt and cumHistHiPixValFilt. Temporally filtering the parameters therefore results in a trade-off between the ability to respond to sudden changes in illumination and the suppression of unwanted flicker in the output video. By applying a temporal filter, historical values are weighted depending on their age and then summed to form a temporally-filtered value for use on the newest frame.
[0064] The temporal filtering may be defined by a temporal filter profile. An example temporal filter profile is illustrated in Figure 8. As illustrated, the profile of the temporal filter may consist of a substantially flat section 810, where historical values are given an equal weighting, followed by a Gaussian section 820, where older samples are given a lower weighting. The duration of the uniform section 810 of the filter may be defined by a parameter herein referred to as uniformSectionLength, and the duration of the Gaussian section 820 may be defined by a parameter herein referred to as gaussianSectionLength. The roll-off of the Gaussian section 820 is defined by a parameter, herein referred to as σm, which is the duration in video frames for the filter response to fall to 50%. Other filter approaches may also be used, such as a filter that applies an equal weighting across a finite set of historical samples, or a filter that reduces the weight of all historical values based upon age (where the reduction may be linear or non-linear in nature, such as a Gaussian weight reduction).
[0065] The temporal filter may be adjusted to more closely represent a standard Gaussian filter by setting the uniform section 810 to a low value. This may advantageously give a smoother response when there are sudden changes in illumination.
[0066] For some applications, historic values may not be available, for example in specific testbed environments. In these instances, the temporal filter may be deactivated by setting a Boolean parameter, herein referred to as temporalFilterEnable, to false. As an alternative to including a Boolean parameter to enable or disable the temporal filter, the profile of the filter may be set to include only a flat section such that the filtered parameters are left unchanged by application of the filter. As will be appreciated, however, utilising a Boolean parameter removes the need to apply a filter at all, thus conserving processing bandwidth.
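A sketch of the filter profile of Figure 8 and its application; the conversion of the 50%-response duration to a Gaussian sigma, the default section lengths and the function names are all assumptions of this sketch.

```python
import numpy as np

def temporal_filter_weights(uniform_len: int = 10, gaussian_len: int = 50,
                            half_life: float = 15.0) -> np.ndarray:
    """Weights per frame age (0 = newest): a flat section followed by a
    Gaussian roll-off that falls to 50% after `half_life` frames."""
    ages = np.arange(uniform_len + gaussian_len, dtype=float)
    gauss_age = np.clip(ages - uniform_len, 0.0, None)  # 0 inside the flat section
    sigma = half_life / np.sqrt(2.0 * np.log(2.0))      # exp(-0.5*(t/sigma)**2) = 0.5 at t = half_life
    weights = np.exp(-0.5 * (gauss_age / sigma) ** 2)
    return weights / weights.sum()                      # normalise so the weights sum to 1

def filter_parameter(history: list[float], weights: np.ndarray) -> float:
    """Weighted sum of historical values (newest first), e.g. for ipMeanPixValFilt."""
    h = np.asarray(history[: weights.size], dtype=float)
    w = weights[: h.size]
    return float((h * w).sum() / w.sum())
```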
[0067] A summary of the various parameters discussed above for controlling the operation of image processing unit 110 is presented below in Table 1. When referring to "Min" and "Max" values for given parameters in Table 1, it will be understood that these values are not to be construed as absolute minima or maxima, but rather suggested values which may improve the performance of the techniques disclosed herein. In Table 1, the value of b is set to 10 for the sake of providing example default values, but is not limited to this value.
Table 1 (recoverable entries; b = 10 for the example defaults):

opMeanFactor: brightness control; float; min 0; max 1; default 0.4
ipMeanLoThreshPixVal: lower threshold pixel value, may be adjusted for 'corner case' operation; int; min 0; max 2^b − 1 = 1023; default round(0.01 × (2^b − 1)) = 10
ipMeanHiThreshPixVal: upper threshold pixel value, may be adjusted for 'corner case' operation; int; min 0; max 2^b − 1 = 1023; default round(0.99 × (2^b − 1)) = 1013
cumHistLoThreshFactor: lower contrast control; float; min 0; max 1; default 0.001 (0.1% of image)
cumHistHiThreshFactor: upper contrast control; float; min 0; max 1; default 0.999 (0.1% of image)
lowerNonLinFactor: lower contrast control; float; min 0; max 10
upperNonLinFactor: upper contrast control; float; min 0; max 10
uniformSectionLength: duration of uniform part of temporal filter (in frames); int
gaussianSectionLength: duration of Gaussian part of temporal filter (in frames); int
σm: roll-off of temporal filter (duration in frames for the response to fall to 50%)
temporalFilterEnable: enables or disables temporal filtering; boolean; default true

[0068] Therefore, from one perspective, there has been disclosed an approach to enable global image contrast enhancement to be implemented using real-time FPGA-based signal processing. A hybrid approach may be used which calculates image histograms using FPGA firmware and then applies the contrast adjustment in software. The system may calculate an average pixel value from an input image using a calculated image histogram, and may then use the calculated average to adjust each pixel value relative to a target mean pixel value. Finally, the system applies a non-linear value mapping to give more weight to pixels near the target mean pixel value and less weight to pixels near the limits of the possible pixel value range. Thus contrast is enhanced around the mean pixel value (at the expense of reduced contrast away from the mean pixel value).

Claims (24)

  1. A method for enhancing contrast in an image, the method comprising: applying gain to a received image by multiplying the pixel value of pixels in the image by a ratio of a target mean output pixel value and a mean pixel value for the pixel values of the pixels in the image; and adjusting the distribution of pixel values of the gain-adjusted image by: determining a pixel difference value by calculating the difference between each gain-applied pixel value and the target mean output pixel value; and determining an output image by mapping the pixel difference values into output image pixel values, using a mapping function that expands the distribution of pixel difference values close to the target mean output pixel value and compresses the distribution of pixel values far away from the target mean output pixel value.
  2. The method of claim 1, further comprising calculating a histogram of the values of the pixels in the image.
  3. The method of claim 2, wherein the mean pixel value for the pixel values of the pixels in the image is calculated using the histogram of the values of the pixels in the image.
  4. The method of any preceding claim, wherein the mapping function comprises a first mapping function for positive pixel difference values and a second mapping function for negative pixel difference values.
  5. The method of claim 4, wherein the first mapping function is different from the second mapping function.
  6. The method of any preceding claim, wherein the mapping function comprises a non-linear mapping curve for dynamic range adjustment.
  7. The method of any preceding claim, wherein the shape of the mapping function is calculated dynamically based upon one or more properties of the image including one or more of: a magnitude of the difference between the target mean output pixel value and the calculated mean pixel value for the image; and a sign of the difference between the target mean output pixel value and the calculated mean pixel value for the image.
  8. The method of claim 7, wherein the one or more properties of the image include statistical properties of a histogram of the image.
  9. The method of any preceding claim, wherein pixels of the image with a value below a lower threshold value or above an upper threshold value are discounted from the calculation of the mean pixel value for the pixel values of the pixels in the image.
  10. The method of claim 9, wherein the lower threshold and upper threshold are calculated based on a cumulative histogram of the image.
  11. The method of claim 10, wherein the lower threshold and upper threshold each comprise pixel values corresponding to values where the cumulative histogram reaches a respective predetermined value.
  12. The method of any preceding claim, wherein the image is a frame in a sequence of frames of a video.
  13. The method of claim 12, wherein a temporal filter is used to apply weights to historical values of properties of frames in the sequence.
  14. The method of claim 13, wherein the properties comprise a mean pixel value of each frame, and an upper threshold value and a lower threshold value each derived from a cumulative histogram of the respective frame.
  15. The method of claim 13 or 14, wherein the temporal filter comprises at least two sections comprising: a flat section which acts on the most recent frames; and a Gaussian section which acts on frames older than a predetermined number of frames, wherein frames in the flat section receive higher weights than those in the Gaussian section.
  16. The method of any of claims 12 to 15, wherein the sequence of frames is provided from an image capture device of a vehicle to enable controlling the vehicle in low-contrast lighting conditions.
  17. A computer implemented method comprising: calculating a mean pixel value of an input image, the input image comprising a plurality of pixels each having a respective pixel value; obtaining a first intermediate image by multiplying the pixel value for each of the plurality of pixels of the input image by the ratio of a target value for the mean of an output image to the mean pixel value of the input image; obtaining a second intermediate image by calculating the difference between the pixel values of the first intermediate image and the target value for the mean of the output image; obtaining each output image pixel value of the output image by: when the corresponding pixel value of the second intermediate image is greater than or equal to zero, mapping the corresponding pixel value of the second intermediate image to a value using a first mapping function and adding this value to the target value for the mean of the output image, and when the corresponding pixel value of the second intermediate image is less than zero, mapping the corresponding pixel value of the second intermediate image to a value using a second mapping function and subtracting this value from the target value for the mean of the output image; and outputting the output image comprising the output image pixel values.
  18. A system comprising: an image capture device configured to generate an input image; an image processing unit configured to: apply gain to a received image by multiplying the pixel value of pixels in the image by a ratio of a target mean output pixel value and a mean pixel value for the pixel values of the pixels in the image; and adjust the dynamic range of the gain-adjusted image by: determining a pixel difference value by calculating the difference between each gain-applied pixel value and the target mean output pixel value; and determining an output image by mapping the pixel difference values into output image pixel values, using a mapping function that expands the distribution of pixel difference values close to the target mean output pixel value and compresses the distribution of pixel values far away from the target mean output pixel value.
  19. The system of claim 18, further comprising a display configured to render the output image.
  20. The system of claim 18 or 19, further comprising an image recognition unit configured to generate image metadata.
  21. The system of any of claims 18 to 20, wherein the image processing unit comprises a central processing unit, CPU, and field programmable gate array, FPGA, firmware.
  22. The system of claim 21, wherein: the CPU is configured to determine the mapping function; and the FPGA firmware is configured to determine the output image by mapping the pixel difference values into output image pixel values using the mapping function.
  23. The system of any of claims 18 to 22, further comprising a vehicle, wherein the image capture device, image processing unit, and display are located on the vehicle and are configured to enable control of the vehicle.
  24. The system of any of claims 18 to 23, further configured to carry out the method of any of claims 1 to 17.
GB2207969.3A 2022-05-30 2022-05-30 Non-linear image contrast enhancement Active GB2619496B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB2207969.3A GB2619496B (en) 2022-05-30 2022-05-30 Non-linear image contrast enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2207969.3A GB2619496B (en) 2022-05-30 2022-05-30 Non-linear image contrast enhancement

Publications (3)

Publication Number Publication Date
GB202207969D0 GB202207969D0 (en) 2022-07-13
GB2619496A 2023-12-13
GB2619496B GB2619496B (en) 2024-07-10

Family

ID=82324142

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2207969.3A Active GB2619496B (en) 2022-05-30 2022-05-30 Non-linear image contrast enhancement

Country Status (1)

Country Link
GB (1) GB2619496B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018161078A1 (en) * 2017-03-03 2018-09-07 Massachusetts Eye And Ear Infirmary Image adjustment and standardization
US20190392563A1 (en) * 2018-06-20 2019-12-26 Siemens Healthcare Gmbh Method for automatically adapting an image data set obtained by an x-ray device
US20200034997A1 (en) * 2017-09-27 2020-01-30 Shenzhen China Star Optoelectronics Semiconductor Display Technology Co., Ltd An image processing method and apparatus


Also Published As

Publication number Publication date
GB202207969D0 (en) 2022-07-13
GB2619496B (en) 2024-07-10
