
Submitted by

Saikat Das (Roll No. 13001061012)
Pallab Kundu (Roll No. 13001061033)
Abhik Misra (Roll No. 13001061027)
Swarup Ganguly (Roll No. 13001061016)
Sayan Bhowal (Roll No. 13001061022)

Under the guidance of Ms. Poulami Dutta

Submitted in fulfillment of the requirements for the degree of Bachelor of Technology in Computer Science and Engineering under WBUT

Techno India EM 4/1, Salt Lake, Sector V, Kolkata 700 091.

ACKNOWLEDGEMENT

We would like to express our sincere gratitude to Ms. Poulami Dutta of the Department of Computer Science and Engineering, whose role as project guide was invaluable for the project. We are extremely thankful for the keen interest she took in advising us, for the books and reference materials she provided, and for the moral support extended to us. Last but not the least, we convey our gratitude to all the teachers for providing us the technical skill that will always remain our asset, and to all non-teaching staff for the cordial support they offered.

Place: Techno India, Salt Lake Date:


Department of Computer Science and Engineering
Techno India, Salt Lake
Kolkata 700 091, West Bengal, India.

Approval

This is to certify that the project report entitled "Image Segmentation using Edge Detection Method", prepared under my supervision by Saikat Das (13001061012), Pallab Kundu (13001061033), Abhik Misra (13001061027), Swarup Ganguly (13001061016) and Sayan Bhowal (13001061022), may be accepted in fulfillment of the requirements for the degree of Bachelor of Technology in Computer Science and Engineering under West Bengal University of Technology. It is to be understood that by this approval, the undersigned does not necessarily endorse or approve any statement made, opinion expressed or conclusion drawn therein, but approves the report only for the purpose for which it has been submitted.

Ms. Poulami Dutta
Lecturer, Department of Computer Science and Engineering
Techno India, Salt Lake

Ms. Mousumi Bhattacharya
Sr. Lecturer and In-Charge, Department of Computer Science and Engineering
Techno India, Salt Lake

Contents

1. Introduction ........ 7
   1.1 Applications ........ 11
2. Problem Definition ........ 19
3. Planning ........ 20
   3.1 Work on Image ........ 21
       3.1.1 Digital Image ........ 21
       3.1.2 Binary Images ........ 22
       3.1.3 Gray Scale Image ........ 23
   3.2 Color Image Formats ........ 25
       3.2.1 GIF ........ 25
       3.2.2 JPEG / JFIF ........ 25
       3.2.3 PNG ........ 26
   3.3 RGB Model ........ 27
4. Design Issues ........ 28
   4.1 Platform & Software Specification ........ 28
   4.2 Proposed Algorithm ........ 28
       4.2.1 Previously Proposed Algorithm ........ 29
       4.2.2 Currently Proposed Algorithm ........ 33
5. Coding ........ 37
6. Testing ........ 43
7. Conclusion ........ 49
8. Future Scope ........ 50

List of Figures

1. Example figure showing some important properties that should be captured by a segmentation method
2. An RGB image and segmentation found by our segmentation method (regions are white bordered and averaged inside) ........ 10
3. Image of a fingerprint ........ 15
4. Brake light hue range (unshifted) ........ 17
5. Positive example ........ 18
6. Negative example ........ 19
7. Digitization of a continuous image ........ 22
8. Conversion of RGB image into gray scale image ........ 24
9. Conversion of an image ........ 24
10. Additive color mixing ........ 27
11. Original image for previous algorithm ........ 43
12. Gray scale image for previous algorithm ........ 44
13. Edge detected image for previous algorithm ........ 44
14. Segmented image for previous algorithm ........ 45
15. Squirrel.jpg ........ 46
16. Color map ........ 47
17. Segmented image ........ 48

Bibliography ........ 52

1. INTRODUCTION

The problems of image segmentation and grouping remain great challenges for computer vision. Since the time of the Gestalt movement in psychology, it has been known that perceptual grouping plays a powerful role in human visual perception. A wide range of computational vision problems could in principle make good use of segmented images, were such segmentations reliably and efficiently computable. For instance, intermediate-level vision problems such as stereo and motion estimation require an appropriate region of support for correspondence operations. Spatially non-uniform regions of support can be identified using segmentation techniques. Higher-level problems such as recognition and image indexing can also make use of segmentation results in matching, to address problems such as figure-ground separation and recognition by parts.

Our goal is to develop computational approaches to image segmentation that are broadly useful, much in the way that other low-level techniques such as edge detection are used in a wide range of computer vision tasks. In order to achieve such broad utility, we believe it is important that a segmentation method have the following properties:

1. Capture perceptually important groupings or regions, which often reflect global aspects of the image. Two central issues are to provide a precise characterization of what is perceptually important, and to be able to specify what a given segmentation technique does. We believe that there should be precise definitions of the properties of a resulting segmentation, in order to better understand the method as well as to facilitate the comparison of different approaches.
2. Be highly efficient, running in time nearly linear in the number of image pixels. In order to be of practical use, we believe that segmentation methods should run at speeds similar to edge detection or other low-level visual processing techniques, meaning nearly linear time and with low constant factors. For example, a segmentation method that runs at several frames per second can be used in video processing applications.

While the past few years have seen considerable progress in eigenvector-based methods of image segmentation, these methods are too slow to be practical for many applications. In contrast, the method described in this paper has been used in large-scale image database applications as described in. While there are other approaches to image segmentation that are highly efficient, these methods generally fail to capture perceptually important non-local properties of an image, as discussed below. The segmentation technique developed here both captures certain perceptually important non-local image characteristics and is computationally efficient (running in O(n log n) time for n image pixels and with low constant factors), and can run in practice at video rates.

Figure 1: Example figure showing some important properties that should be captured by a segmentation method.

As with certain classical clustering methods, our method is based on detecting edges in the original picture, dividing the picture into components, and showing the segments using the component numbers. It is in the development phase, since we have already discovered some issues while considering implementation which we will have to address in future. It has been established since the Gestalt movement in psychology that perceptual grouping plays a fundamental role in human perception. Even though this observation is rooted in the early part of the 20th century, the adaptation and automation of the segmentation (and, more generally, grouping) task with computers has so far remained a tantalizing and central problem for image processing. Vision is widely accepted as an inference problem, i.e., the search for what caused the observed data. In this respect, the grouping problem can be roughly presented as the transformation of the collection of pixels of an image into a visually meaningful partition of regions and objects. This postulates implicitly the existence of optimal segmentation(s) which we should aim at recovering or approximating, and this task implies casting the perceptual formulation of optimality into a formalized, well-defined problem. A prominent trend in grouping focuses on graph cuts, mapping image pixels onto graph vertices and the spatial relationships between pixels onto weighted graph edges. The objective is to minimize a cut criterion, given that any cut on this graph yields a partition of the image into (hopefully) coherent visual patterns. Cut criteria range from conventional to more sophisticated criteria tailored to grouping. These are basically global criteria; however, the strategies adopted for their minimization range through a broad spectrum, from local to global optimization through intermediate choices. Global optimization strategies have the advantage of directly tackling the problem as a whole, and may offer good approximations at possible algorithmic expense. Another mainstream approach is region growing and merging: regions are sets of pixels with homogeneous properties, and they are iteratively grown by combining smaller regions or pixels, pixels being elementary regions. Region growing/merging techniques usually work with a statistical test to decide the merging of regions. A merging predicate uses this test, and builds the segmentation on the basis of (essentially) local decisions. This locality in decisions has to preserve global properties, such as those responsible for the perceptual units of the image.


Figure 2: An RGB image and segmentation found by our segmentation method (regions are white bordered and averaged inside).

In Figure 2, the grassy region below the castle is one such unit, even when its variability is high compared to the other regions of the image. In that case, a good region merging algorithm has to find a good balance between preserving this unit and the risk of over-merging for the remaining regions. Fig. 2b shows the result of our approach. As long as the approach is greedy, two essential components participate in defining a region merging algorithm: the merging predicate and the order followed to test the merging of regions. There is a lack of theoretical results on the way these two components interact together and can benefit from each other. This might be partially due to the fact that most approaches use assumptions on distributions, more or less restrictive, which would make any theoretical insight into how region merging works restricted to such settings and, therefore, of possibly moderate interest. Our aim in this paper is to propose a path, and its milestones, from a novel model of image generation and the theoretical properties of possible segmentation approaches to a practical, readily available system of image segmentation and its extensions to miscellaneous problems related to image segmentation. First, the key idea of this model is to really formulate image segmentation as an inference problem: the reconstruction of regions on the observed image, based on an unknown theoretical (true) image whose true regions are statistical regions whose borders are defined from a simple axiom. Second, we show the existence of a particular blend of statistics and algorithmics to process observed images generated with this model, by region merging, with two statistical properties. With high probability, the algorithm suffers only one source of error for image segmentation: over-merging, that is, the fact that some observed region may contain more than one true region. The algorithm suffers neither under-merging nor the most frequent hybrid cases where observed regions may partially span several true regions. Yet, there is more: with high probability, this over-merging error is, as we show, formally small, as the algorithm achieves accuracy in segmentation close to the optimum, up to low order terms. The algorithm has some desirable features: it relies on a simple interaction between a merging predicate that is easily implementable, and an order in merging approximable in linear time. Furthermore, it can be adapted to most numerical feature description spaces (RGB, HSI, L*u*v*, etc.). Third, we provide a C-code implementation of this last algorithm, which is a few hundred lines of C, and experiments on various benchmark images, as well as comparisons with other algorithms. Last, we show how to extend the algorithm to naturally cope with hard noise and/or significantly occluded images at very affordable algorithmic complexity. Though running the algorithm does not require tuning its parameters, the control of a statistical complexity parameter makes it possible to adjust the segmentation scale in a simple manner.

Practical Applications of Segmentation

Some of the practical applications of image segmentation are listed below.

Image segmentation is crucial for multimedia applications. Multimedia databases utilize segmentation for the storage and indexing of images and video. Image segmentation is used for object tracking in the new MPEG-7 video compression standard. It is also used in video conferencing for compression and coding purposes. These are only some of the multimedia applications of image segmentation. It is usually the first task of any image analysis process, and thus subsequent tasks rely heavily on the quality of segmentation. The proposed method of color image segmentation is very effective in segmenting a multimedia-type image into regions. Pixels are first classified as either chromatic or achromatic depending on their HSI color values. Next, a seed determination algorithm finds seed pixels that are in the center of regions. These seed pixels are used in the region growing step to grow regions by comparing them to neighboring pixels using the cylindrical distance metric. Merging regions that are similar in color is a final means used for segmenting the image into even smaller regions.

Medical Imaging
o Locate tumors and other pathologies
o Measure tissue volumes
o Computer-guided surgery
o Diagnosis
o Treatment planning
o Study of anatomical structure

Tongue diagnosis is an important diagnostic method of traditional Chinese medicine. The accuracy of tongue diagnosis can be improved by tongue characterization, and tongue area segmentation is an important part of the preprocessing of tongue images. The use of computer vision technology to achieve objective tongue diagnosis in Chinese medicine is of great significance. The first task of objective analysis of tongue images is to extract the tongue area. The first step is to use a median filter to remove noise from the image. The second step transforms the image color space to the HSI color space; then dual Snake algorithms are used to obtain the accurate and complete tongue image. Through testing, the method has proved to be satisfactory for tongue image segmentation.

The anatomical variations and unpredictable nature of surgeries make visibility very important, especially to correctly diagnose problems intraoperatively. In this paper, hyperspectral imaging is proposed as a visual supporting tool to detect different organs and tissues during surgeries. This technique can aid the surgeon in finding ectopic tissues and diagnosing tissue abnormalities. Two cameras were used to capture images within the 400-1700 nm spectral range. The high-dimensional data were classified using a Support Vector Machine (SVM). This method was evaluated for the detection of the spleen, colon, small intestine, urinary bladder and peritoneum in abdominal surgeries on pigs.

Locate objects in satellite images (roads, forests, etc.): The work at Control Data Corp. took two real images as input, warped one to correspond to the other spatially, and transformed the intensity values to account for wide-area variations. Subtraction of the images indicated regions of change. This work involved the development of real-time special-purpose systems to perform the matching, warping, and differencing for change detection in a variety of imagery domains (X-ray, radar, and visible light). It also transforms regions of the image based on intensity and contrast. The basic algorithm: (1) For each point on a regular grid in the data base image, find the maximum correlation value for its neighborhood in the input image. This system assumes that the images are already approximately registered, so that the search for the exact matching point is in a limited area. The processing begins on one edge of the image and steps across the image, allowing a linkage between adjacent grid points to determine approximate matches within featureless areas. Match locations are interpolated to find the maximum correlation position with accuracy much better than one pixel. (2) Four grid points forming a square in the data base image map to four points forming a quadrilateral in the input image. The points within the quadrilateral are transformed to fit the input square by interpolation. This basic technique can be refined to find matches along the sides of the quadrilateral. (3) A two-dimensional histogram plotting the image intensity value of an individual pixel in one image versus the value in the second image (assuming that the two images are rectified spatially) should lie along the 45° axis. If the mass of points lies along a different angle, then the intensity values are adjusted. This intensity rectification is applied over local areas of the image rather than globally, to account for local, but large-scale, variations in intensity. Small anomalies will still appear, but these should correspond to true differences in the two images, and thus to changes in the scene. (4) By subtracting the rectified image from the data base image, changes between the two views become apparent. An analysis of the two-dimensional histogram, as used for the intensity rectification, indicates the type of changes that have occurred (objects added or objects removed).
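To make the subtraction step above concrete, here is a minimal MATLAB sketch of differencing two roughly registered images; the file names, the whole-frame normalization used as a stand-in for local intensity rectification, and the 0.1 threshold are illustrative assumptions, not part of the Control Data Corp. system.

% Minimal change-detection sketch (assumed RGB input files and threshold)
base  = im2double(rgb2gray(imread('base_view.png')));   % data base image (assumed file)
input = im2double(rgb2gray(imread('input_view.png')));  % input image (assumed file)

% Crude intensity rectification: match the mean and spread of the input
% image to the data base image over the whole frame.
input = (input - mean(input(:))) / std(input(:));
input = input * std(base(:)) + mean(base(:));

d = imabsdiff(base, input);      % absolute difference image
changed = d > 0.1;               % threshold picks out candidate changes (assumed value)
figure, imshow(changed);         % white pixels mark likely scene changes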

Face recognition: Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of its 3D position, orientation and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in size, shape, color and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.

Fingerprint recognition: The first phase of the fingerprint verification process is the fingerprint enrollment phase. It is very important to know the size and quality of the image that the fingerprint sensor in use takes, so we can have an idea of how we are going to preprocess it. From this image, minutiae are extracted and stored in a database. This process repeats, resulting in the generation of a live template. The measurement of our success is going to include some steps to achieve our goal: several images are going to be taken of the same fingerprint to cover various aspects of the image (position, dryness, humidity, dust, brightness, darkness, etc.). There is going to be a limited number of persons' fingerprints in the database to be recognized; if a person enters his or her finger and the fingerprint is not in the database, it has to be rejected. A threshold is going to be set for the acceptance or rejection of a specific fingerprint.


Figure 3: Image of a fingerprint

Traffic control systems:-

A method of individual vehicle detection using grayscale images acquired from a high position is proposed for guidance of incoming vehicles to vacant cells in a parking lot and other similar purposes. With the proposed method, each image region corresponding to a cell is fragmented according to density (gray level), and the distribution of segment area is analyzed to decide if a vehicle is present. Reference images taken in vacant state are not needed, hence the method can be easily applied to parking lots in continuous service. Shape features are not employed, hence detection is performed independent of car shape. The proposed method was tested on an actual outdoor parking lot during 4 days with different weather conditions from sunrise through sunset. The results confirmed the efficiency of the proposed method, with the detection rate being over 98.7%.

Brake light detection: Given a forward-facing color image, first extract pixels in a range of hue, saturation, and brightness that corresponds to tail lights. Connected pixels are grouped (segmented) into regions. Region pairs are then classified by likelihood of being the outer two tail lights of a vehicle. Another round of segmentation is performed with relaxed saturation and brightness ranges to find the central brake light, which is often dimmer and smaller than the outer tail lights. Each region pair is compared to each relaxed region for likely candidates, based on several assumptions about the geometry of brake lights. Finally, the best candidates are returned as region triplets corresponding to brake lights.

Region Segmentation by Color
Assume that the light from brake lights occupies a narrow range of hue, saturation, and brightness. Given these ranges, I construct a binary image where white corresponds to the pixels of the color image within the ranges. Since the range of hue that I'm interested in is chiefly red, and the digital representation of red is 0 (of 255), I first shift the hue of the image by 128. Given this shift, manual measurements of brake light hue yielded a hue range of [110, 170], also pictured unshifted below.

Figure 4 : Brake light hue range (unshifted).

Two separate instances of region segmentation by color are performed. First, segmentation with saturation range [160, 255] and brightness range [160, 255] produces a narrow-range image that is expected to contain the regions corresponding to the outer two tail lights of the vehicle. The second segmentation, with saturation range [96, 255] and brightness range [128, 255], produces a wide-range image that is expected to contain the region corresponding to the central brake light of the vehicle. Each segmentation is performed on the hue-shifted color image. The first segmentation is necessarily a subset of the second. With the binary image in hand, segmentation is relatively simple. First, a morphological closure (dilation and erosion with radius 1) is performed to remove any holes in regions or to connect regions that should be connected. Then, sets of 4-connected pixels are grouped together and assigned to distinct regions. For each region, I store the coordinates (x, y) of the center, the number of pixels n, and the maximum distance r from any point in the region to the center, which I define as the region's radius.
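The narrow-range steps above can be sketched in MATLAB roughly as follows; the file name is an assumption, and the hue/saturation/brightness ranges are the values quoted above, rescaled from the 0-255 convention used in the text to MATLAB's 0-1 HSV convention.

rgb = imread('rear_view.jpg');                 % assumed input image
hsv = rgb2hsv(rgb);                            % hue, saturation, value, each in [0,1]
h = mod(hsv(:,:,1) + 128/255, 1);              % shift hue by 128 (of 255) so red sits mid-range
s = hsv(:,:,2);  v = hsv(:,:,3);

% Narrow-range thresholds from the text: hue [110,170], sat [160,255], val [160,255]
mask = h >= 110/255 & h <= 170/255 & s >= 160/255 & v >= 160/255;

mask = imclose(mask, strel('disk', 1));        % morphological closure (dilation then erosion)
[L, num] = bwlabel(mask, 4);                   % group 4-connected pixels into regions
stats = regionprops(L, 'Centroid', 'Area');    % per-region center (x, y) and pixel count n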

Pairing Regions
Consider each pair of regions in the narrow-range image as a candidate for the outer two tail lights of a vehicle. These pairs are pruned by three rules based on several assumptions about the geometry of brake lights. If avgn > negn and c exists, then the region triplet a, b, c is returned as an instance of a brake light. This logic relies on the intuition that the central brake light should be the largest region in the positive area, but it should also be smaller than the outer tail lights. Also, there should not be much noise between the two outer tail lights, since there will not be lights in this part of the vehicle. In practice, the area parameters were manually fixed as 0.1, 0.25, 0.05, 0.5, and 0.1. A set of positive- and negative-example images of brake lights was collected in the University Heights area of Cleveland, Ohio. In each image, the presence and location of brake lights was correctly determined. In the following images, the positive-example images have brake lights circled, and the negative-example images are unchanged.


Figure 5: Positive example.

Figure 6: Negative example.


2. PROBLEM DEFINITION

Segmentation: In the analysis of the objects in images it is essential that we can distinguish between the objects of interest and "the rest." This latter group is also referred to as the background. The techniques that are used to find the objects of interest are usually referred to as segmentation techniques - segmenting the foreground from background.

Segmentation is an important aspect of image processing. In computer vision, segmentation refers to the process of partitioning a digital image into multiple regions (comprising sets of pixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics. In digital imaging, a pixel is the smallest piece of information in an image. Pixels are normally arranged in a regular two-dimensional grid, and are often represented using dots, squares, or rectangles. Each pixel is a sample of the original image, where more samples typically provide a more accurate representation of the original. The intensity of each pixel is variable; in a color system each pixel typically has three or four components, such as red, green, and blue, or cyan, magenta, yellow, and black.

The result of image segmentation is a set of regions that collectively cover the entire image or a set of contours extracted from the image. Each of the pixels in a region is similar with respect to some characteristic or computed property such as color, intensity or texture. Adjacent regions are significantly different with respect to the same characteristic. There is no universally applicable segmentation technique that will work for all images and no segmentation technique is perfect.
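As a toy illustration of "assigning a label to every pixel", the MATLAB snippet below labels the 4-connected components of a small binary image; it is only meant to show what a label map looks like, not the project's segmentation method.

bw = logical([1 1 0 0 0;
              1 1 0 0 1;
              0 0 0 1 1;
              0 1 0 0 0]);
L = bwlabel(bw, 4);   % 0 = background, 1..n = distinct regions
disp(L)               % pixels sharing a label belong to the same region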

3. PLANNING

3.1 Work on Image

An image defined in the "real world" is considered to be a function of two real variables, for example, a(x, y) with a as the amplitude (e.g. brightness) of the image at the real coordinate position (x, y). An image may be considered to contain sub-images sometimes referred to as regions-of-interest, ROIs, or simply regions. This concept reflects the fact that images frequently contain collections of objects each of which can be the basis for a region. In a sophisticated image processing system it should be possible to apply specific image processing operations to selected regions. Thus one part of an image (region) might be processed to suppress motion blur while another part might be processed to improve color rendition.

The amplitudes of a given image will almost always be either real numbers or integer numbers. The latter is usually a result of a quantization process that converts a continuous range (say, between 0 and 100%) to a discrete number of levels. In certain image-forming processes, however, the signal may involve photon counting, which implies that the amplitude is inherently quantized. In other image-forming procedures, such as magnetic resonance imaging, the direct physical measurement yields a complex number in the form of a real magnitude and a real phase. For the remainder of this report we will consider amplitudes as real or integer unless otherwise indicated. Image processing usually refers to digital image processing, but optical and analog image processing are also possible.

3.1.1 Digital image



A digital image a[m, n] described in a 2D discrete space is derived from an analog image a(x, y) in a 2D continuous space through a sampling process that is frequently referred to as digitization. For now we will look at some basic definitions associated with the digital image. The effect of digitization is shown in Figure 7. The 2D continuous image a(x, y) is divided into N rows and M columns. The intersection of a row and a column is termed a pixel. The value assigned to the integer coordinates [m, n] with {m = 0, 1, 2, ..., M-1} and {n = 0, 1, 2, ..., N-1} is a[m, n]. In fact, in most cases a(x, y) -- which we might consider to be the physical signal that impinges on the face of a 2D sensor -- is actually a function of many variables including depth (z), color (λ), and time (t).

Figure 7: Digitization of a continuous image.

The image shown in Figure 7 has been divided into N = 16 rows and M = 16 columns. The value assigned to every pixel is the average brightness in the pixel rounded to the nearest integer value. The process of representing the amplitude of the 2D signal at a given coordinate as an integer value with L different gray levels is usually referred to as amplitude quantization or simply quantization.
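A small MATLAB sketch of amplitude quantization, assuming the standard cameraman.tif demo image as input; it reduces an 8-bit image to L gray levels by mapping each pixel to the nearest of L evenly spaced levels.

img = imread('cameraman.tif');                 % assumed 8-bit grayscale demo image
L = 8;                                         % number of gray levels to keep
step = 256 / L;
q = floor(double(img) / step);                 % quantized level index, 0 .. L-1
q_display = uint8(q * step + step/2);          % map each level back to a displayable gray value
figure, imshow(q_display);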

3.1.2 Binary Images


A binary image is a digital image that has only two possible values for each pixel. Typically the two colors used for a binary image are black and white though any two colors can be used. The color used for the object(s) in the image is the foreground color while the rest of the image is the background color.

Binary images are also called bi-level or two-level. This means that each pixel is stored as a single bit (0 or 1). The names black-and-white, B&W, monochrome or monochromatic are often used for this concept, but may also designate any images that have only one sample per pixel, such as grayscale images. In Photoshop parlance, a binary image is the same as an image in "Bitmap" mode.

Binary images often arise in digital image processing as masks or as the result of certain operations such as segmentation, thresholding, and dithering. Some input/output devices, such as laser printers, fax machines, and bi-level computer displays, can only handle bi-level images. A binary image is usually stored in memory as a bitmap, a packed array of bits. A 640×480 binary image requires 37.5 KB of storage.
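A minimal sketch of producing a binary image by thresholding, assuming the coins.png demo image; the graythresh/im2bw pair shown here is one common way to do it in the MATLAB 7 era.

img = imread('coins.png');            % assumed grayscale demo image
level = graythresh(img);              % Otsu threshold in [0,1]
bw = im2bw(img, level);               % logical image: 1 = foreground, 0 = background
figure, imshow(bw);

% Storage estimate quoted above: one bit per pixel
bits  = 640 * 480;                    % 307,200 bits
kbyte = bits / 8 / 1024               % = 37.5 KB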

Binary images can be interpreted as subsets of the two-dimensional integer lattice Z2; the field of morphological image processing was largely inspired by this view.

3.1.3 Gray Scale Image



In photography and computing, a grayscale or greyscale digital image is an image in which the value of each pixel is a single sample, that is, it carries only intensity information. Images of this sort, also known as black-and-white, are composed exclusively of shades of gray, varying from black at the weakest intensity to white at the strongest. Grayscale images are distinct from one-bit black-and-white images, which in the context of computer imaging are images with only the two colors black and white (also called binary images); grayscale images have many shades of gray in between. Grayscale images are also often called monochromatic, denoting the absence of any chromatic variation. Grayscale images are often the result of measuring the intensity of light at each pixel in a single band of the electromagnetic spectrum (e.g. infrared, visible light, ultraviolet, etc.), and in such cases they are monochromatic proper when only a given frequency is captured. But they can also be synthesized from a full color image, as illustrated below.
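A short sketch of synthesizing a grayscale image from a full-color one, assuming the peppers.png demo image; the weighted sum uses the same luminance weights that MATLAB's rgb2gray applies.

rgb = imread('peppers.png');                      % assumed RGB demo image
gray1 = rgb2gray(rgb);                            % toolbox conversion

% Equivalent manual conversion: weighted sum of the R, G, B channels
r = double(rgb(:,:,1)); g = double(rgb(:,:,2)); b = double(rgb(:,:,3));
gray2 = uint8(0.2989*r + 0.5870*g + 0.1140*b);

figure, imshow(gray1); figure, imshow(gray2);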


Figure 8: Conversion of RGB image into Grey scale image


Figure 9: Conversion of an image (a) Color image (b) binary image (c) Grey scale image

3.2 Color image formats

Image file formats are standardized means of organizing and storing images. This entry is about digital image formats used to store photographic and other images; (for disk-image file formats see Disk image). Image files are composed of either pixel or vector (geometric) data that are rasterized to pixels when displayed (with few exceptions) in a vector graphic display. The pixels that constitute an image are ordered as a grid (columns and rows); each pixel consists of numbers representing magnitudes of brightness and color.

3.2.1 GIF

GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the GIF format suitable for storing graphics with relatively few colors such as simple diagrams, shapes, logos and cartoon style images. The GIF format supports animation and is still widely used to provide image animation effects. It also uses a lossless compression that is more effective when large areas have a single color, and ineffective for detailed images or dithered images.
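To illustrate the 256-color limitation, the sketch below converts a truecolor image to an indexed image with a 256-entry palette and writes it as a GIF; the input file name is an assumption.

rgb = imread('logo_input.png');          % assumed truecolor input image
[X, map] = rgb2ind(rgb, 256);            % quantize to an 8-bit (256-color) palette
imwrite(X, map, 'logo_output.gif');      % GIF stores the index image plus the palette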

3.2.2 JPEG / JFIF

JPEG (Joint Photographic Experts Group) is a compression method; JPEG-compressed images are usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG compression is (in most cases) lossy compression. The JPEG/JFIF filename extension in DOS is JPG (other operating systems may use JPEG). Nearly every digital camera can save images in the JPEG/JFIF format, which supports 8 bits per color (red, green, blue) for a 24-bit total, producing relatively small files. When not too great, the compression does not noticeably detract from the image's quality, but JPEG files suffer generational degradation when repeatedly edited and saved. Photographic images may be better stored in a lossless non-JPEG format if they will be re-edited, or if small "artifacts" (blemishes caused by the JPEG compression algorithm) are unacceptable.
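The generational degradation mentioned above can be observed with a small experiment like the following sketch (the file name and quality setting are assumptions): the image is repeatedly re-encoded as JPEG and the mean squared error against the original grows with each generation.

orig = imread('peppers.png');                 % assumed source image
img  = orig;
for gen = 1:10
    imwrite(img, 'tmp.jpg', 'Quality', 75);   % lossy re-encode
    img = imread('tmp.jpg');
    err = double(orig) - double(img);
    mse = mean(err(:).^2);                    % error accumulates generation after generation
    fprintf('generation %2d: MSE = %.2f\n', gen, mse);
end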

3.2.3 PNG

The PNG (Portable Network Graphics) file format was created as the free, open-source successor to the GIF. The PNG file format supports truecolor (16 million colors), while the GIF supports only 256 colors. The PNG format excels when the image has large, uniformly colored areas. The lossless PNG format is best suited for editing pictures, while lossy formats like JPG are best for the final distribution of photographic images, because JPG files are smaller than PNG files. Some older browsers do not support the PNG file format; however, with Mozilla Firefox and Internet Explorer 7, all contemporary web browsers now support all common uses of the PNG format, including full 8-bit translucency (Internet Explorer 7 may display odd colors on translucent images only when combined with IE's opacity filter). Adam7 interlacing allows an early preview, even when only a small percentage of the image data has been transmitted. PNG is an extensible file format for the lossless, portable, well-compressed storage of raster images. PNG provides a patent-free replacement for GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and truecolor images are supported, plus an optional alpha channel. PNG is designed to work well in online viewing applications, such as the World Wide Web, so it is fully streamable with a progressive display option. PNG is robust, providing both full file integrity checking and simple detection of common transmission errors. Also, PNG can store gamma and chromaticity data for improved color matching on heterogeneous platforms. Some programs do not handle PNG gamma correctly, which can cause the images to be saved or displayed darker than they should be.

3.3 RGB Model

The RGB color model utilizes the additive model in which red, green, and blue light are combined in various ways to create other colors. The very idea for the model itself and the abbreviation "RGB" come from the three primary colors in additive light models.

The RGB color model itself does not define what exactly is meant by "red", "green" and "blue", so the same RGB values can describe noticeably different colors on different devices employing this color model. While they share a common color model, their actual color spaces can vary considerably.


Figure 10: Additive color mixing: adding red to green yields yellow; adding yellow to blue yields white.

24-bit representation
When written, RGB values in 24 bpp are commonly specified using three integers between 0 and 255, each representing red, green, and blue intensities, in that order. For example:

(0, 0, 0) is black
(255, 255, 255) is white
(255, 0, 0) is red
(0, 255, 0) is green
(0, 0, 255) is blue
(255, 255, 0) is yellow
(0, 255, 255) is cyan
(255, 0, 255) is magenta

16-bit mode
There is also a 16 bpp mode, in which there are either 5 bits per color, called 555 mode, or an extra bit for green (because the eye can see more shades of green than of other colors), called 565 mode. The 24 bpp mode is typically called Truecolor, while the 16 bpp mode is called HiColor.

32-bit mode
The so-called 32 bpp mode is almost always identical in precision to the 24 bpp mode: there are still only eight bits per component, and the eight extra bits are simply not used at all (except possibly as an alpha channel). The reason for the existence of 32 bpp modes is the higher speed at which most modern hardware can access data that is aligned to byte addresses divisible by 4, compared to data not so aligned.

48-bit mode (sometimes also called 16-bit mode)
"16-bit mode" can also refer to 16 bits per component, resulting in 48 bpp. This mode makes it possible to represent 65,536 tones of each color component instead of 256. It is primarily used in professional image editing, for example in Adobe Photoshop, to maintain greater precision when a sequence of more than one image-filtering algorithm is applied to the image. With only 8 bits per component, rounding errors tend to accumulate with each filtering algorithm that is employed, distorting the end result.
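As a concrete illustration of the 16 bpp "565" mode described above, the sketch below packs an 8-bit-per-channel color into a single 16-bit word (5 bits red, 6 bits green, 5 bits blue) and unpacks it again; the example color values are arbitrary.

r = uint16(200); g = uint16(150); b = uint16(50);   % arbitrary 8-bit color components

% Pack into RGB565: keep the top 5, 6 and 5 bits of each component
r5 = bitshift(r, -3);  g6 = bitshift(g, -2);  b5 = bitshift(b, -3);
packed = bitor(bitor(bitshift(r5, 11), bitshift(g6, 5)), b5);   % single 16-bit value

% Unpack (the discarded low bits are the precision cost of 16 bpp)
r_back = bitshift(bitand(packed, hex2dec('F800')), -11) * 8;
g_back = bitshift(bitand(packed, hex2dec('07E0')),  -5) * 4;
b_back =          bitand(packed, hex2dec('001F'))        * 8;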

4. DESIGN ISSUES

4.1 Platform & Software Specification

I. Platform Used: Windows 32-bit Operating System
II. Software Used: MATLAB 7

4.2 Proposed Algorithm

Previously proposed Algorithm: Image Segmentation by Edge Detection


o We have tried to extract the objects from the image by detecting edges in the original picture. For this we have converted the image into gray scale, since the filtering operation cannot be applied to a colored image.

o After that we have separated the edges. Then we have applied a special mask after the Sobel operator to smooth the image and remove the noise as far as possible.

o Now we calculate the connected components; say there are mx of them. We can give L any value between 1 and mx, or extract all connected components in a loop.

o In this way we are able to segment the image successfully by detecting edges.

Sobel operator
The Sobel operator is used in image processing, particularly within edge detection algorithms. Technically, it is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. At each point in the image, the result of the Sobel operator is either the corresponding gradient vector or the norm of this vector. The Sobel operator is based on convolving the image with a small, separable, integer-valued filter in the horizontal and vertical directions and is therefore relatively inexpensive in terms of computation. On the other hand, the gradient approximation which it produces is relatively crude, in particular for high-frequency variations in the image. The Sobel convolution kernels take the following matrix form:


Sobel kernel for the X derivative (gx):

-1  -2  -1
 0   0   0
 1   2   1

Sobel kernel for the Y derivative (gy):

-1   0   1
-2   0   2
-1   0   1

These kernels can be represented by the following two equations:

gx = (z7 + 2*z8 + z9) - (z1 + 2*z2 + z3)
gy = (z3 + 2*z6 + z9) - (z1 + 2*z4 + z7)

where the z's are the intensity values of a 3×3 image region in the following form:

z1 z2 z3
z4 z5 z6
z7 z8 z9

Simplified description In simple terms, the operator calculates the gradient of the image intensity at each point, giving the direction of the largest possible increase from light to dark and the rate of change in that direction. The result therefore shows how "abruptly" or "smoothly" the image changes at that point and therefore how likely it is that that part of the image represents an edge, as well as how that edge is likely to be oriented. In practice, the magnitude (likelihood of an edge) calculation is more reliable and easier to interpret than the direction calculation.

Mathematically, the gradient of a two-variable function (here the image intensity function) is at each image point a 2D vector with the components given by the derivatives in the horizontal and vertical directions. At each image point, the gradient vector points in the direction of largest possible intensity increase, and the length of the gradient vector corresponds to the rate of change in that direction. This implies that the result of the Sobel operator at an image point which is in a region of constant image intensity is a zero vector and at a point on an edge is a vector which points across the edge, from darker to brighter values.

Formulation
Mathematically, the operator uses two 3×3 kernels which are convolved with the original image to calculate approximations of the derivatives - one for horizontal changes, and one for vertical. If we define A as the source image, and Gx and Gy as two images which at each point contain the horizontal and vertical derivative approximations, the computations are as follows:

Gx = [ -1  0  +1 ;  -2  0  +2 ;  -1  0  +1 ] * A
Gy = [ -1 -2  -1 ;   0  0   0 ;  +1 +2  +1 ] * A

where * here denotes the 2-dimensional convolution operation. The x-coordinate is here defined as increasing in the "right" direction, and the y-coordinate is defined as increasing in the "down" direction. At each point in the image, the resulting gradient approximations can be combined to give the gradient magnitude, using:

G = sqrt(Gx^2 + Gy^2)

Using this information, we can also calculate the gradient's direction:

Θ = atan(Gy / Gx)

where, for example, Θ is 0 for a vertical edge which is darker on the left side.
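A brief MATLAB sketch of these formulation steps, assuming the cameraman.tif demo image; imfilter is used (as in the project's imgGrad function), so the kernels are applied by correlation.

A = double(imread('cameraman.tif'));        % assumed grayscale demo image

kx = [-1 0 1; -2 0 2; -1 0 1];              % horizontal-change kernel (Gx)
ky = [-1 -2 -1; 0 0 0; 1 2 1];              % vertical-change kernel (Gy)

Gx = imfilter(A, kx, 'replicate');          % derivative approximations
Gy = imfilter(A, ky, 'replicate');

Gmag   = sqrt(Gx.^2 + Gy.^2);               % gradient magnitude
Gtheta = atan2(Gy, Gx);                     % gradient direction

figure, imshow(Gmag, []);                   % bright pixels mark likely edges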

Noise reduction
The principal sources of noise in digital images arise during image acquisition and/or transmission. The performance of imaging sensors is affected by environmental conditions during image acquisition. The quality of the sensing elements is also a major factor.

Spatial and frequency properties
Frequency properties refer to the frequency content of noise in the Fourier sense; when the Fourier spectrum of the noise is constant, the noise is usually called white noise. In our project we assume that the noise is independent of the spatial coordinates and uncorrelated with respect to the image itself.

Gaussian noise
Because of their mathematical tractability in both the spatial and frequency domains, Gaussian noise models are used frequently in practice. In fact, this convenience is such that Gaussian models are often used in situations in which they are marginally applicable at best. Among other filtering methods, we used median filtering in our project to reduce the noise in the image.

Median filter
The best known order-statistic filter is the median filter, which, as its name implies, replaces the value of a pixel by the median of the intensity levels in the neighborhood of that pixel:

f(x, y) = median{ g(s, t) : (s, t) ∈ Sxy }

The value of the pixel at (x, y) is included in the computation of the median. Median filters are quite popular because, for certain types of random noise, they provide excellent noise reduction capabilities, with considerably less blurring than linear filters of similar size. Median filters are particularly effective in the presence of both bipolar and unipolar impulse noise.
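A short sketch of median filtering salt-and-pepper (impulse) noise, assuming the cameraman.tif demo image; medfilt2 with a 3×3 neighborhood is the same call used in the project code of Section 5.

img   = imread('cameraman.tif');                 % assumed grayscale demo image
noisy = imnoise(img, 'salt & pepper', 0.05);     % add impulse (bipolar) noise
clean = medfilt2(noisy, [3 3]);                  % each pixel replaced by its 3x3 neighborhood median
figure, imshow(noisy); figure, imshow(clean);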

Currently Proposed Algorithm: Statistical Region Merging

Simplified Description
In 4-connexity, there are N < 2|I| couples of adjacent pixels. Let SI be the set of these couples. Let f(p, p′) be a real-valued function, with p and p′ pixels of I. Our segmentation algorithm, SRM (for Statistical Region Merging), is simple. We first sort the couples of SI in increasing order of f(·,·), and then traverse this order only once. For any couple of pixels (p, p′) in SI for which R(p) ≠ R(p′) (R(p) stands for the current region to which p belongs), we make the test P(R(p), R(p′)), and merge R(p) and R(p′) iff it returns true. The objective is obviously to choose f(·,·) so as to approximate the invariant A (defined in the Order in Merging section) as well as possible. The next section reviews some choices we have made for f(·,·), each of constant-time computation. Because we do not update the listing of merging tests after merging two regions, a simple ordering based on radix sorting with color differences as the keys yields a preordering time complexity O(|I| log g), linear in |I|, for our basic implementations of SRM. The merging steps afterward are space/time computationally optimal, which makes SRM optimal from both standpoints. Our basic implementation of SRM, which is not optimized, segments our largest images (512×512) in about one second on an Intel Pentium IV 2.40 GHz processor.

Formulation
For the sake of simplicity, we first state our theoretical results for a single color band (e.g., gray level). On this basis, the extension of the results to more numerical channels, such as RGB, does not require an involved analysis: it is presented below in the Color Image subsection. Recall that it is enough to give a merging predicate and an order in which to test region merging to completely define our segmentation algorithm.

Merging Predicate
Our first result is based on the following theorem.

Theorem 1 (the independent bounded difference inequality): Let X = (X1, X2, ..., Xn) be a family of n independent random variables with Xk taking values in a set Ak for each k. Suppose that the real-valued function f defined on the product of the Ak satisfies |f(x) - f(x′)| <= ck whenever the vectors x and x′ differ only in the kth coordinate. Let μ be the expected value of the random variable f(X). Then, for any τ >= 0,

Pr( f(X) - μ >= τ ) <= exp( -2τ² / Σk ck² ).


From this theorem, we obtain the following result on the deviation of observed differences between regions of I. Here, the notation E(R̄) for some arbitrary region R is the expectation, over all corresponding statistical pixels of I*, of the sum of expectations of their Q random variables for the single color band, and R̄ is the observed average of this color band.

Corollary 1: Consider a fixed couple (R, R′) of regions of I. For any 0 < δ <= 1, the probability is no more than δ that

|(R̄′ - R̄) - E(R̄′ - R̄)| >= g * sqrt( (1/(2Q)) * (1/|R| + 1/|R′|) * ln(2/δ) ).

Proof: Suppose we shift the value of the outcome of one r.v. among the Q(|R| + |R′|) possible for the couple (R, R′). The difference R̄′ - R̄ is subject to a variation of at most cR′ = g/(Q|R′|) when this modification affects region R′ (among Q|R′| possible), and at most cR = g/(Q|R|) for a change inside R (among Q|R| possible). We get

Σk (ck)² = Q(|R|(cR)² + |R′|(cR′)²) = (g²/Q)((1/|R|) + (1/|R′|)).

Using the fact that the deviation with the absolute value is at most twice that without, and using Theorem 1 (solving for δ), brings our result.

Suppose we do N merging tests in I. Then, with probability at least 1 - Nδ, all couples of regions (R, R′) whose merging is tested shall satisfy the bound of Corollary 1. Remark that N is small: for a single-pass algorithm, N < |I|². In our 4-connexity setting (each pixel is connected to its north, south, east, and west neighbors when they exist), we even have N < 2|I|. What we really need to test the merging of two observed regions R and R′ is a predicate accurate enough when the pixels of R ∪ R′ come from the same statistical region of I*. From this standpoint, using Corollary 1 to devise a merging predicate is straightforward: in this case we have E(R̄′ - R̄) = 0 and, thus, with high probability, the deviation |R̄′ - R̄| does not exceed the bound of Corollary 1. The merging predicate on two candidate regions R and R′ could thus be: merge R and R′ iff |R̄′ - R̄| does not exceed a merging threshold b(R, R′). We shall see hereafter that such a predicate is optimistic: under some assumption, it tends sometimes to favor over-merging (i.e., it does more merges than necessary to actually recover I*), but this phenomenon formally remains quantitatively small. For both theoretical and practical considerations, we are going to replace this merging predicate by one slightly more optimistic, i.e., with a larger merging threshold. This one turns out to theoretically incur the same error (up to low order terms), and it gives very good visual results. Let Rl be the set of regions with l pixels. Hereafter, we prove a quantitative bound on the error obtained with the largest of these quantities used as merging threshold: it holds for the others as well. The center quantity is the merging threshold we use. An upper bound on |Rl| makes it quite reasonable with regard to b(R, R′). Considering that a region is an unordered bag of pixels (each color channel is given 0, 1, ..., l pixels), we may fix |Rl| <= (l + 1)^min{l, g} (we have l + 1 choices for the number of pixels having each color channel, which makes |Rl| <= (l + 1)^g, and then we reduce this large upper bound by counting the duplicates for l < g). To summarize, our merging predicate is:

P(R, R′) = true iff (R̄′ - R̄)² <= b²(R) + b²(R′), with b²(R) = (g² / (2Q|R|)) * ( min(g, |R|) * ln(|R| + 1) + 2 ln(6|I|) ),

as implemented in the code of Section 5.

Order in Merging
The order in which we test the merging of regions follows a simple invariant A, which we define as follows: A = when any test between two (parts of) true regions occurs, all tests inside each of the two true regions have previously occurred. It is crucial to note that A does not postulate knowledge of the segmentation of I*. To make it clear why we should strive to fulfill A, let us first recall the three types of error a segmentation can suffer. First, under-merging represents the case where one or more regions obtained are strict subparts of true regions. Second, over-merging represents the case where some regions obtained strictly contain more than one true region. Third, there is the hybrid (and most probable) case where some regions obtained contain more than one strict subpart of true regions. We have already partially outlined this in the preceding section on the merging predicate: together with the merging predicate P, A makes it possible to control the segmentation error from both the qualitative and quantitative standpoints. The next theorem states that only over-merging occurs with high probability.

In this theorem, we define s*(I) as the set of regions of the ideal (optimal) segmentation of I (defined from I*) and s(I) as the set of regions in our segmentation of I.

Theorem 2: With probability at least 1 - O(|I|δ), the segmentation of I satisfying A is an over-merging of I*.

Proof: From Corollary 1, with probability at least 1 - Nδ = 1 - O(|I|δ), any couple of regions (R, R′) coming from the same statistical region of I*, and whose merging is tested, satisfies the bound of Corollary 1, so our merging predicate P(R, R′) would authorize the merging of R and R′. Using the fact that A holds together with this property, we first rebuild all true regions of I*, and then eventually make some more merges: the segmentation obtained is an over-merging of I* with high probability, as claimed. The next theorem shows a quantitative upper bound on the error incurred with respect to the optimal segmentation. We define this error as the weighted average of the (absolute) channel differences over all nonempty intersections of regions between s*(I) and s(I), with E denoting the expectation with associated probability measure.

Color Image
The merging predicate for the RGB setting is: merge R and R′ iff (R̄′a - R̄a)² <= b²(R) + b²(R′) for every color channel a ∈ {R, G, B}.

Here, R̄a denotes the observed average for color channel a in region R. Provided invariant A holds as in the Order in Merging section, our predicate preserves over-merging, and the same bound as that of Theorem 3 holds on the error if we measure it as the sum of errors over the three color channels.

5. Coding:

For the previously proposed algorithm corresponding to Edge Detection:

% This is a program for extracting objects from an image.
% Written for vehicle number plate segmentation and extraction.
% U can use attached test image for testing.
% Input - give the image file name as input, e.g. car3.jpg
clc;
clear all;
k=input('Enter the file name','s'); % input image; color image
im=imread(k);
im1=rgb2gray(im);
im1=medfilt2(im1,[3 3]); % Median filtering the image to remove noise
BW = edge(im1,'sobel'); % finding edges
[imx,imy]=size(BW);
msk=[0 0 0 0 0;
     0 1 1 1 0;
     0 1 1 1 0;
     0 1 1 1 0;
     0 0 0 0 0;];
B=conv2(double(BW),double(msk)); % Smoothing image to reduce the number of connected components
L = bwlabel(B,8); % Calculating connected components
mx=max(max(L)) % There will be mx connected components.
% Here U can give a value between 1 and mx for L, or in a loop you can extract all connected components.
% If you are using the attached car image, by giving 17,18,19,22,27,28 to L you can extract the number plate completely.
[r,c] = find(L==17);
rc = [r c];
[sx sy]=size(rc);
n1=zeros(imx,imy);
for i=1:sx
    x1=rc(i,1);
    y1=rc(i,2);
    n1(x1,y1)=255;
end % Storing the extracted image in an array
figure,imshow(im);
figure,imshow(im1);
figure,imshow(B);
figure,imshow(n1,[]);

For the currently proposed algorithm corresponding to Statistical Region Merging:

Part 1:

% Statistical Region Merging
%
% Saikat Das, Pallab Kundu, Sayan Bhowal, Swarup Ganguly, Abhik Misra
% Statistical Region Merging. IEEE Trans. Pattern Anal. Mach. Intell. 26, 11 (Nov. 2004), 1452-1458.
% DOI= http://dx.doi.org/10.1109/TPAMI.2004.110

% Segmentation parameter Q; Q small: few segments, Q large: many segments
Q=32;
image=double(imread('squirrel-original.jpg'));
%image=double(imread('havok.jpg'));

% Smoothing the image
h=fspecial('gaussian',[3 3],1);
image=imfilter(image,h);

% Compute size of image and number of pixels
size_image=size(image);
n_pixels=size_image(1)*size_image(2);

% Compute image gradient
[Ix,Iy]=imgGrad(image(:,:,:));

% [Y,I] = MAX(X,[],DIM) operates along the dimension DIM.
% Example: If X = [2 8 4
%                  7 3 9]
% then max(X,[],1) is [7 8 9] and max(X,[],2) is [8; 9]
Ix=max(abs(Ix),[],3);
Iy=max(abs(Iy),[],3);

% Remove the last column of Ix and the last row of Iy
Ix(:,end)=[];
Iy(end,:)=[];

[trash,index]=sort(abs([Iy(:);Ix(:)]));

map=reshape([1:n_pixels],size_image(1:2));
gap=zeros(size(map));
treerank=zeros(size_image(1:2));
size_segments=ones(size_image(1:2));
image_seg=image;

n_pairs=numel(index);
idx2=reshape(map(:,1:end-1),[],1);
idx1=reshape(map(1:end-1,:),[],1);
pairs1=[ idx1;idx2 ];
pairs2=[ idx1+1;idx2+size_image(1) ];

for i=1:n_pairs
    C1=pairs1(index(i));
    C2=pairs2(index(i));

    % Union-Find structure, here are the finds, average complexity O(1)
    while (map(C1)~=C1 )
        C1=map(C1);
    end
    while (map(C2)~=C2 )
        C2=map(C2);
    end

    % Compute the predicate, region merging test
    g=256;
    logdelta=2*log(6*n_pixels);
    dR=(image_seg(C1)-image_seg(C2))^2;
    dG=(image_seg(C1+n_pixels)-image_seg(C2+n_pixels))^2;
    dB=(image_seg(C1+2*n_pixels)-image_seg(C2+2*n_pixels))^2;
    logreg1 = min(g,size_segments(C1))*log(1.0+size_segments(C1));
    logreg2 = min(g,size_segments(C2))*log(1.0+size_segments(C2));
    dev1=((g*g)/(2.0*Q*size_segments(C1)))*(logreg1 + logdelta);
    dev2=((g*g)/(2.0*Q*size_segments(C2)))*(logreg2 + logdelta);
    dev=dev1+dev2;

    predicat=( (dR<dev) && (dG<dev) && (dB<dev) );

    if ((C1~=C2)&&predicat)
        % Find the new root for both regions
        if treerank(C1) > treerank(C2)
            map(C2) = C1; reg=C1;
        elseif treerank(C1) < treerank(C2)
            map(C1) = C2; reg=C2;
        elseif C1 ~= C2
            map(C2) = C1; reg=C1;
            treerank(C1) = treerank(C1) + 1;
        end
        if C1~=C2
            % Merge regions
            nreg=size_segments(C1)+size_segments(C2);
            tmp=(size_segments(C1)*image_seg(C1)+size_segments(C2)*image_seg(C2))/nreg;
            image_seg(reg)=tmp; % store the merged average for the first color channel
            image_seg(reg+n_pixels)=(size_segments(C1)*image_seg(C1+n_pixels)+size_segments(C2)*image_seg(C2+n_pixels))/nreg;
            image_seg(reg+2*n_pixels)=(size_segments(C1)*image_seg(C1+2*n_pixels)+size_segments(C2)*image_seg(C2+2*n_pixels))/nreg;
            size_segments(reg)=nreg;
        end
    end
end

% Done, building two result figures, figure 1 is the segmentation map,
% figure 2 is the segmentation map with the average color in each segment
while 1
    map_ = map(map) ;
    if isequal(map_,map) ; break ; end
    map = map_ ;
end
for i=1:3
    im_final(:,:,i)=image_seg(map+(i-1)*n_pixels);
end
figure(1);imagesc(map);
figure(2);imagesc(uint8(im_final))

Part 2:

function [Ix, Iy] = imgGrad( I , sigma )
% This function outputs the x-derivative and y-derivative of the
% input I. If I is 3D, then derivatives of each channel are
% available in Ix and Iy.
sob=[-1,9,-45,0,45,-9,1]/60;
u=size(I,3);
for i=1:size(I,3)
    Ix(:,:,i)=imfilter(I(:,:,i),sob,'replicate');
    Iy(:,:,i)=imfilter(I(:,:,i),sob','replicate');
end

6. TESTING

TEST RESULTS FOR PREVIOUS ALGORITHM

We have tested the code on a particular image, and it works fine only on that particular image. The smoothness of the connected components depends highly on the mask used, so it is an image-specific edge detection process. The image we tested on is an image of a number plate of a car. The output of our result is shown in the following images.

Figure 11: (Original Image)

Figure 12: (Gray scale Image)

Figure 13: (Edge detected Image)


Figure 14: (Segmented Image)

TEST RESULTS FOR CURRENT ALGORITHM

The current algorithm is flexible enough to be used with any image; we have tested it on JPG images. It also has the ability to segment color images as well as gray scale images. The segmentation parameter Q is the deciding factor for segmentation quality in this algorithm: increasing the value of Q produces finer segmentation, while decreasing it produces coarser segmentation. For our tests we have chosen Q = 32.

>> srm
Enter Segmentation Parameter Q(32 to 1024): 32
Enter Image File Name : squirrel.jpg


Figure 15:Squirrel.jpg


Figure 16: Color Map


Figure 17: Segmented Image

7. CONCLUSION

As is evident from the results of our testing, the project is still at a very rudimentary stage. Image segmentation is an essential preliminary step in most automatic pictorial pattern recognition and scene analysis problems.


Lastly, solutions to problems in the field of digital image segmentation generally require extensive experimental work involving software simulation and testing with large sets of sample images. Although algorithm development is typically based on a theoretical underpinning, the actual implementation of these algorithms almost always requires parameter estimation and, frequently, algorithm revision and comparison of candidate solutions. Thus the selection of a flexible, comprehensive and well-documented software development environment is a key factor that has important implications for the cost, development time and portability of image segmentation solutions.

To achieve these objectives we felt that two key ingredients were needed. The first was to select image segmentation material that is representative of material covered in a formal course of instruction in this field. The second was to select software tools that are well supported and documented and which have a wide range of applications in the real world.

Digital image segmentation is an area characterized by the need for extensive experimental work to establish the viability of proposed solutions to a given problem. Here we outline how a theoretical base and state of the art software can be integrated into a prototyping environment whose objective is to provide a set of well supported tools for the solution of a broad class of problems in digital image segmentation.

An important characteristic underlying the design of image segmentation systems is the significant level of testing and experimentation that normally is required before arriving at an acceptable solution. This characteristic implies that the ability to formulate approaches and quickly prototype candidate solutions generally plays a major role in reducing the cost and time required to arrive at a viable system implementation.


8. FUTURE SCOPE

There is a lot of scope for future modification and upgradation, as we have merely begun to scratch the surface. The areas of future modification include:

I. Designing and implementing a general-purpose algorithm that will be useful for segmenting any image.
II. Adapting the algorithm to work with image formats which are currently incompatible, such as the GIF and TIFF formats.
III. Designing a Graphical User Interface for the application.
IV. Experimenting with other image smoothing filters, such as the Butterworth low-pass filter, which is given by the equation H(u, v) = 1 / (1 + [D(u, v)/D0]^(2n)). A useful property of this filter is the smooth transition between low and high frequencies.

Algorithm proposed to be implemented in the future
The first region growing method was the seeded region growing method. This method takes a set of seeds as input along with the image. The seeds mark each of the objects to be segmented. The regions are iteratively grown by comparing all unallocated neighbouring pixels to the regions. The difference between a pixel's intensity value and the region's mean, d, is used as a measure of similarity. The pixel with the smallest difference measured this way is allocated to the respective region. This process continues until all pixels are allocated to a region.

Seeded region growing requires seeds as additional input. The segmentation results are dependent on the choice of seeds, and noise in the image can cause the seeds to be poorly placed. Unseeded region growing is a modified algorithm that doesn't require explicit seeds. It starts off with a single region A1; the pixel chosen here does not significantly influence the final segmentation. At each iteration it considers the neighbouring pixels in the same way as seeded region growing. It differs from seeded region growing in that if the minimum d is less than a predefined threshold T, the pixel is added to the respective region Aj. If not, then the pixel is considered significantly different from all current regions Ai and a new region A(n+1) is created with this pixel. One variant of this technique, proposed by Haralick and Shapiro (1985), is based on pixel intensities. The mean and scatter of the region and the intensity of the candidate pixel are used to compute a test statistic. If the test statistic is sufficiently small, the pixel is added to the region, and the region's mean and scatter are recomputed. Otherwise, the pixel is rejected and is used to form a new region.
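A minimal sketch of the threshold-based growing idea described above, for a single hand-picked seed on a grayscale image; the image name, seed position, and threshold T are assumptions, and this is not the Haralick and Shapiro variant.

% Seeded region growing from one seed, 4-connectivity, threshold on |pixel - region mean|
img  = double(imread('cameraman.tif'));   % assumed grayscale demo image
seed = [128 128];                         % assumed seed coordinates [row col]
T    = 20;                                % intensity-difference threshold (assumed)
[rows, cols] = size(img);
region  = false(rows, cols);              % pixels allocated to the region
region(seed(1), seed(2)) = true;
stack   = seed;                           % pixels whose neighbours still need visiting
regMean = img(seed(1), seed(2));          % running mean of the region
nPix    = 1;
offs = [-1 0; 1 0; 0 -1; 0 1];            % 4-connected neighbourhood
while ~isempty(stack)
    p = stack(end, :); stack(end, :) = [];
    for k = 1:4
        r = p(1) + offs(k,1); c = p(2) + offs(k,2);
        if r >= 1 && r <= rows && c >= 1 && c <= cols && ~region(r, c)
            d = abs(img(r, c) - regMean);                 % similarity measure d
            if d < T
                region(r, c) = true;
                stack(end+1, :) = [r c];                  %#ok<AGROW>
                nPix = nPix + 1;
                regMean = regMean + (img(r, c) - regMean) / nPix;  % update region mean
            end
        end
    end
end
figure, imshow(region);   % the grown region as a binary mask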

REFERENCES / BIBLIOGRAPHY

I. Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing.
II. Rafael C. Gonzalez, Richard E. Woods and Steven L. Eddins, Digital Image Processing Using MATLAB.
III. Richard Nock and Frank Nielsen, "Statistical Region Merging," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, No. 11, November 2004.
IV. Web Resources:
   a. http://www.mathworks.com/
   b. http://www.wolframalpha.com/
   c. http://www.citeseer.com/
   d. http://www.wikipedia.org/
