Remove Noise from Images in C#

Document noise consists of random, unwanted marks that usually appear as a result of scanning. Common causes are a low-quality scanner with a poor analog-to-digital converter, or a document whose quality has degraded after being printed and scanned multiple times. Regardless of the cause, this random noise makes documents difficult to read.

This article explains how to clean up noise in your documents.

Information

Don’t preprocess documents before recognizing text with OCR. The GdPicture.NET OCR engine preprocesses documents automatically with better results than manual preprocessing.

Bitonal Despeckling

Bitonal despeckling removes random dots (often called salt-and-pepper noise) from binary black and white documents. The images below show what a document looks like before and after bitonal despeckling.

[Images: before bitonal despeckling; after bitonal despeckling]

To remove salt-and-pepper noise from a document, follow the steps below.

  1. Create a GdPictureImaging object.

  2. Select the image by passing its path to the CreateGdPictureImageFromFile method of the GdPictureImaging object.

  3. Remove the noise with the FxBitonalDespeckle method or the FxBitonalDespeckleMore method of the GdPictureImaging object. These methods take the following parameters:

    1. The image ID.

    2. A Boolean value that specifies whether to fix text after removing noise. Set it to true for documents with a resolution lower than 200 dots per inch (DPI). Otherwise, set it to false.

  4. Save the output in a new image with the SaveAsPNG method of the GdPictureImaging object.

  5. Release the image resource with the ReleaseGdPictureImage method of the GdPictureImaging object.
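
The resolution rule for the second parameter can be captured in a small helper. This is a minimal sketch: `DespeckleOptions` and `ShouldFixText` are hypothetical names, not part of GdPicture.NET, and the caller is assumed to already know the resolution of the image.

```csharp
using System;

static class DespeckleOptions
{
    // Threshold taken from the guidance above: fix text below 200 DPI.
    const double FixTextThresholdDpi = 200.0;

    // Returns the value to pass as the second argument of
    // FxBitonalDespeckle or FxBitonalDespeckleMore.
    public static bool ShouldFixText(double dpi)
    {
        if (dpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(dpi), "DPI must be positive.");
        return dpi < FixTextThresholdDpi;
    }
}
```

For example, `DespeckleOptions.ShouldFixText(150)` returns true, while `DespeckleOptions.ShouldFixText(300)` returns false.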

The example below removes salt-and-pepper noise from the document:

C#

using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Load the image from a file.
int imageId = gdpictureImaging.CreateGdPictureImageFromFile(@"C:/temp/source.png");
// Remove salt-and-pepper noise.
gdpictureImaging.FxBitonalDespeckle(imageId, false);
// Save the output in a new image.
gdpictureImaging.SaveAsPNG(imageId, @"C:/temp/output.png");
// Release the image resource.
gdpictureImaging.ReleaseGdPictureImage(imageId);

VB.NET

Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
    ' Load the image from a file.
    Dim imageId As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:/temp/source.png")
    ' Remove salt-and-pepper noise.
    gdpictureImaging.FxBitonalDespeckle(imageId, False)
    ' Save the output in a new image.
    gdpictureImaging.SaveAsPNG(imageId, "C:/temp/output.png")
    ' Release the image resource.
    gdpictureImaging.ReleaseGdPictureImage(imageId)
End Using

Removing Isolated Dots

The images below show what a document looks like before and after removing isolated dots.

[Images: before removing isolated dots; after removing isolated dots]

To remove isolated dots from a document, follow the steps below.

  1. Create a GdPictureImaging object.

  2. Select the image by passing its path to the CreateGdPictureImageFromFile method of the GdPictureImaging object.

  3. Remove the isolated dots by passing the image ID to one of the following methods of the GdPictureImaging object:

    • The FxBitonalRemoveIsolatedDots2x2 method removes 4-pixel-sized isolated black dots in 8 directions (horizontally, vertically, and diagonally).

    • The FxBitonalRemoveIsolatedDots4 method removes 1-pixel-sized isolated black dots in 4 directions (horizontally and vertically).

    • The FxBitonalRemoveIsolatedDots8 method removes 1-pixel-sized isolated black dots in 8 directions (horizontally, vertically, and diagonally).

  4. Save the output in a new image with the SaveAsPNG method of the GdPictureImaging object.

  5. Release the image resource with the ReleaseGdPictureImage method of the GdPictureImaging object.
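
The difference between checking 4 and 8 directions is easiest to see on a small grid. The sketch below illustrates the idea only and is not GdPicture.NET's implementation: `IsolatedDots.Remove` is a hypothetical helper that works on a `bool[,]` in which true means a black pixel, and it assumes that a 1-pixel dot counts as isolated when none of the checked neighbors is black.

```csharp
using System;

static class IsolatedDots
{
    // Neighbor offsets (row, column) for the two modes.
    static readonly (int dr, int dc)[] Four = { (-1, 0), (1, 0), (0, -1), (0, 1) };
    static readonly (int dr, int dc)[] Eight =
    {
        (-1, -1), (-1, 0), (-1, 1),
        (0, -1),           (0, 1),
        (1, -1),  (1, 0),  (1, 1),
    };

    // Returns a copy of the grid in which every black pixel that has no
    // black neighbor in the checked directions is turned white.
    public static bool[,] Remove(bool[,] black, bool useEightDirections)
    {
        int rows = black.GetLength(0), cols = black.GetLength(1);
        var offsets = useEightDirections ? Eight : Four;
        var result = (bool[,])black.Clone();

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                if (!black[r, c]) continue;

                bool hasBlackNeighbor = false;
                foreach (var (dr, dc) in offsets)
                {
                    int nr = r + dr, nc = c + dc;
                    // Pixels outside the image are treated as white.
                    if (nr >= 0 && nr < rows && nc >= 0 && nc < cols && black[nr, nc])
                    {
                        hasBlackNeighbor = true;
                        break;
                    }
                }

                if (!hasBlackNeighbor) result[r, c] = false;
            }
        }
        return result;
    }
}
```

Under this reading, a black pixel that touches another black pixel only diagonally is removed by the 4-direction check but kept by the 8-direction check, so the 8-direction variant is the more conservative of the two.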

The example below removes isolated dots from the document:

C#

using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Load the image from a file.
int imageId = gdpictureImaging.CreateGdPictureImageFromFile(@"C:/temp/source.png");
// Remove isolated dots.
gdpictureImaging.FxBitonalRemoveIsolatedDots2x2(imageId);
// Save the output in a new image.
gdpictureImaging.SaveAsPNG(imageId, @"C:/temp/output.png");
// Release the image resource.
gdpictureImaging.ReleaseGdPictureImage(imageId);

VB.NET

Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
    ' Load the image from a file.
    Dim imageId As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:/temp/source.png")
    ' Remove isolated dots.
    gdpictureImaging.FxBitonalRemoveIsolatedDots2x2(imageId)
    ' Save the output in a new image.
    gdpictureImaging.SaveAsPNG(imageId, "C:/temp/output.png")
    ' Release the image resource.
    gdpictureImaging.ReleaseGdPictureImage(imageId)
End Using

Removing Parasite Noise and Speckle without Affecting Content

The images below show what a document looks like before and after removing parasite noise and speckle.

[Images: before removing parasite noise and speckle; after removing parasite noise and speckle]

To remove parasite noise and speckle from a document, follow the steps below.

  1. Create a GdPictureImaging object.

  2. Select the image by passing its path to the CreateGdPictureImageFromFile method of the GdPictureImaging object.

  3. Remove the parasite noise and speckle with the RemoveBlob method of the GdPictureImaging object. This method removes blobs whose size and fill percentage fall within the ranges you specify. The RemoveBlob method takes the following parameters, in this order:

    • The image ID.

    • The minimum width of blobs.

    • The minimum height of blobs.

    • The maximum width of blobs.

    • The maximum height of blobs.

    • The minimum percentage of black pixels within the blob, relative to the area of the bounding rectangle around the blob.

    • The maximum percentage of black pixels within the blob, relative to the area of the bounding rectangle around the blob.

  4. Save the output in a new image with the SaveAsPNG method of the GdPictureImaging object.

  5. Release the image resource with the ReleaseGdPictureImage method of the GdPictureImaging object.
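
To make the size and fill criteria concrete, the sketch below models how a single blob could be tested against them. It illustrates the parameters only and is not GdPicture.NET's implementation: `Blob` and `BlobFilter.Matches` are hypothetical names, and treating all range bounds as inclusive is an assumption.

```csharp
using System;

// Hypothetical description of one blob: its bounding rectangle and black pixel count.
readonly struct Blob
{
    public Blob(int width, int height, int blackPixels)
    {
        Width = width;
        Height = height;
        BlackPixels = blackPixels;
    }

    public int Width { get; }
    public int Height { get; }
    public int BlackPixels { get; }

    // Percentage of the bounding rectangle that is covered by black pixels.
    public double FillPercent => 100.0 * BlackPixels / (Width * Height);
}

static class BlobFilter
{
    // Returns true when the blob falls inside both the size range and the
    // fill range (bounds assumed inclusive).
    public static bool Matches(Blob blob,
        int minWidth, int minHeight, int maxWidth, int maxHeight,
        int minFillPercent, int maxFillPercent)
    {
        bool sizeInRange =
            blob.Width >= minWidth && blob.Width <= maxWidth &&
            blob.Height >= minHeight && blob.Height <= maxHeight;
        bool fillInRange =
            blob.FillPercent >= minFillPercent && blob.FillPercent <= maxFillPercent;
        return sizeInRange && fillInRange;
    }
}
```

With a size range of 1 to 12 and a fill range of 1 to 100, a solid 3-by-3 speck matches, while a 20-by-30 character does not because it exceeds the maximum size.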

Noise is usually small compared to text size. This means that specifying low values for blob size normally removes the noise and keeps the text. However, higher DPI means a larger image, larger content size, and larger noise. For this reason, experiment with the size values to obtain the best results.

To start, use the following parameters on a generic 200 DPI image:

  • Set the minimum width and height to 1.

  • Set the maximum width and height to 12.

  • Set the minimum fill percentage to 1 and the maximum fill percentage to 100.

If not all the noise is removed with these settings, increase the maximum height and width. If some of the text is removed, reduce the maximum size values.
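
For images at other resolutions, one possible first guess is to scale the 200 DPI starting value with the resolution. Linear scaling is an assumption of this sketch rather than documented behavior, and `BlobSizing.StartingMaxBlobSize` is a hypothetical helper, so treat its result only as a starting point for experimentation.

```csharp
using System;

static class BlobSizing
{
    // Starting values suggested above for a generic 200 DPI image.
    const double ReferenceDpi = 200.0;
    const int ReferenceMaxSize = 12;

    // Scales the maximum blob width and height linearly with the resolution
    // (an assumption of this sketch) and never returns less than 1.
    public static int StartingMaxBlobSize(double dpi)
    {
        if (dpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(dpi), "DPI must be positive.");
        return Math.Max(1, (int)Math.Round(ReferenceMaxSize * dpi / ReferenceDpi));
    }
}
```

Under this assumption, a 300 DPI image starts at a maximum size of 18 and a 600 DPI image at 36.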

The example below removes parasite noise and speckle from the document:

C#

using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Load the image from a file.
int imageId = gdpictureImaging.CreateGdPictureImageFromFile(@"C:/temp/source.png");
// Remove parasite noise and speckle.
gdpictureImaging.RemoveBlob(imageId, 1, 1, 12, 12, 1, 100);
// Save the output in a new image.
gdpictureImaging.SaveAsPNG(imageId, @"C:/temp/output.png");
// Release the image resource.
gdpictureImaging.ReleaseGdPictureImage(imageId);

VB.NET

Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
    ' Load the image from a file.
    Dim imageId As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:/temp/source.png")
    ' Remove parasite noise and speckle.
    gdpictureImaging.RemoveBlob(imageId, 1, 1, 12, 12, 1, 100)
    ' Save the output in a new image.
    gdpictureImaging.SaveAsPNG(imageId, "C:/temp/output.png")
    ' Release the image resource.
    gdpictureImaging.ReleaseGdPictureImage(imageId)
End Using

Combining Different Methods

For some images and documents, you can obtain the best results by combining all three methods explained above.

The advantage of removing isolated dots over bitonal despeckling is that it doesn’t affect the text in low-DPI images. On the other hand, it doesn’t clean as much noise.

Calling RemoveBlob removes most noise with no effect on the text, but it requires understanding how the DPI of the image and the size of the content relate to the size of the noise, which makes it trickier to use.

Removing Noise Vigorously

In some images, the amount of noise nearly exceeds the amount of data. The images below show what a document looks like before and after a vigorous cleanup.

[Images: before a vigorous cleanup; after a vigorous cleanup]

To remove noise that's extreme or sporadic and doesn't follow a consistent pattern or size, follow the steps below.

  1. Create a GdPictureImaging object.

  2. Select the image by passing its path to the CreateGdPictureImageFromFile method of the GdPictureImaging object.

  3. Remove noise vigorously with the FxBitonalVigorousDespeckle method of the GdPictureImaging object. This method cleans so vigorously that some data might be affected, but the output is likely better than the original. This method takes the following parameters:

    1. The image ID.

    2. A Boolean value that specifies whether to check for and retain the dots of the letters i and j. Setting this parameter to true retains most of these dots, but it makes processing slightly slower and leaves a little more noise around the text.

  4. Save the output in a new image with the SaveAsPNG method of the GdPictureImaging object.

  5. Release the image resource with the ReleaseGdPictureImage method of the GdPictureImaging object.

The example below vigorously removes noise from the document:

C#

using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Load the image from a file.
int imageId = gdpictureImaging.CreateGdPictureImageFromFile(@"C:/temp/source.png");
// Remove the noise and retain the dots of the letters i and j.
gdpictureImaging.FxBitonalVigorousDespeckle(imageId, true);
// Save the output in a new image.
gdpictureImaging.SaveAsPNG(imageId, @"C:/temp/output.png");
// Release the image resource.
gdpictureImaging.ReleaseGdPictureImage(imageId);

VB.NET

Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
    ' Load the image from a file.
    Dim imageId As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:/temp/source.png")
    ' Remove the noise and retain the dots of the letters i and j.
    gdpictureImaging.FxBitonalVigorousDespeckle(imageId, True)
    ' Save the output in a new image.
    gdpictureImaging.SaveAsPNG(imageId, "C:/temp/output.png")
    ' Release the image resource.
    gdpictureImaging.ReleaseGdPictureImage(imageId)
End Using