For my project I'm trying to binarize an image with OpenCV in Python. I used OpenCV's adaptive Gaussian thresholding to convert the image, with the following result:
I want to use the binary image for OCR, but it's too noisy. Is there a way to remove the noise from the binary image in Python? I already tried OpenCV's fastNlMeansDenoising, but it doesn't make a difference.
P.S. Better options for binarization are welcome as well.
Best answer
You should start by adjusting the parameters of the adaptive threshold so that it uses a larger area. That way it won't segment out the noise. Whenever your output image has more noise than the input image, you know you're doing something wrong.
As an adaptive threshold, I suggest applying a closing (on the input grey-value image) with a structuring element just large enough to remove all the text. The difference between this result and the input image is exactly the text. You can then apply a regular threshold to this difference.