2013年4月24日水曜日

Image Segmentation By Graph Cut

in Japanese

Introduction

 So far, I have considered the image segmentations by the K-means clustering and the Gaussian mixture model(GMM).1,2,3 In this page, I show the image segmentation with the graph cut algorithm.

Formulation

 We define the following energy function: \begin{equation} E(X)=\lambda\;\sum\limits_{v\;\in\; V}\left\{1-p_{m}(v)\right\}+\kappa\;\sum\limits_{(u,v)\;\in \;C}\left(1-\delta_{m,n}\right)\;\left\{\;1-\;\left|e(u)\right|\;\right\} \label{energy} \end{equation} where $v$ is a pixel on an image and $V$ is a set of pixels. It should be noted that $v$ has a position $(x,y)$ and a color $(r,g,b)$. $m$ is a label assigned to each pixel $v \in V$, $C$ indicates a set of pairs of neighboring pixels, $p_{m}(v)$ represents a probability that the pixel $v$ has the label $m$, and $e(v)$ is a differential image. $\lambda$ and $\kappa$ are positive constants, and $\delta_{i,j}$ is the Kronecker delta. $X$ denotes all combinations of $v$ and $m$. The energy $E$ is minimized via the graph cut algorithm to obtain a good segmentation.

 The first term of eq.($\ref{energy}$) is called the data term. Before minimizing the energy, the probability $p_{m}(v)$ is calculated as follows:
  1. A user roughly specifies $M$ regions by moving mouse on the given image.
  2. We get $M$ regions $\{s_{1},\cdots,s_{M}\}$. Each region $s_{m}$ is a set of pixels.
  3. Applying the EM algorithm for each region $s_{m}$, we calculate the probability $p_{m}(v)$. Here the RGB values are used.
  4. We regard $m \in \{1,\cdots,M\}$ as the label assigned to each pixel $v$
The probability $p_{m}(v)$ that we calculate using the OpenCV can be greater than 1. To normalize it, we divide it by the maximum value among $\{p_{1}(v),\cdots,p_{M}(v)\}$. The normalization is performed for all pixels. If the label which is equal to $\mathop{\arg\,\max}\limits_m\;p_{m}(v)$ is assigned to the pixel $v$, the energy $E$ is minimized.

 The second term of eq.($\ref{energy}$) is called the smoothing term. In this study, we utilize the differential image to represent it. The differential image is calculated by applying the sobel filter to the corresponding gray image. We derive two kinds of images $e_{x}(u)$ and $e_{y}(u)$. The former is the differential image along the x direction, and the latter is that along the y direction. When $u=(x,y)$ and $v=(x+1,y)$, $e(u)$ is replaced by $e_{x}(u)$. When $u=(x,y)$ and $v=(x,y+1)$, $e(u)$ is replaced by $e_{y}(u)$. Those absolute values are divided by the maximum of them to make them equal to or less than 1. If the pair $(u,v)$ of the neighboring pixels has the same labels ($m=n$), the smoothing term has no effect on the energy. On the other hand, if the pair has the different labels ($m\neq n$), the term increases the energy. In the latter case, the pair that exists in the area with the large differential values gives rise to the small contribution on the energy. When the pair exists in the area with the small differential values, the contribution is large.

  Based on the discussion so far, to minimize the data term, we only have to assign the label $m$ to the pixel $v$ according to $\mathop{\arg\,\max}\limits_m\;p_{m}(v)$. On the other hand, the minimization of the smoothing term is accomplished by setting the different labels to the pair in the area with the large differential values and by setting the same labels to the pair in the area with the small differential values. The trade-off of them decides the optimal placement $X$.

Demos

 First we roughly specifies two regions, the background and the bird.
The results are shown below. The images of the left column are the results for the current theory, and the right for the previous GMM theory.
GraphCutGMM
The noise in the GMM's results disappears in the graph cut's ones. This is the effect of the smoothing term.

 In the next demo, we roughly specifies three regions, sky, rock, and sea.
The results are shown below.
GraphCutGMM
Also, we can see no noise in the graph cut's results. In two demos, we set $\lambda=\kappa(=100)$.

Development Environment

  1. Mac OS X 10.8.3
  2. Processor : 3.06 GHz Intel Core 2 Duo
  3. Memory : 4GB
  4. Xcode4.6.1 with Apple LLVM 4.2(C++ Language Dialect → GNU++11, C++ Standard Library → libc++)
  5. boost-1.51.0 built by the Apple LLVM 4.1. See see here
  6. opencv-2.4.3 built by the Apple LLVM 4.2. See see here.

My Source Code

Here is my source code. I tested it on the above environment.

External Library

I used the library GCO Version3 which is the implementation of the graph cut algorithm in C++.


Usage

The execution file name is ImageSegmentationWithGraphCut. Running it without any arguments, it prints the following usage statements on the standard output: To the option --ratio, we set $r=\kappa/\lambda$. Therefore, $\kappa=r\lambda$, where $\lambda$ is fixed to 100 in code. The meanings of the other arguments are the same as ones in the previous page.

The following command displays the input image given by --input. The way we specify regions is the same as one in the previous page. The number of regions, however, that we can specify is 6 from 0 to 5. Pressing "e" begins the image segmentation, pressing "c" clears all specified regions, and "q" finishes program.

References

  1. "GrabCut" — Interactive Foreground Extraction using Iterated Graph Cuts
  2. Bust out your own graphcut based image segmentation with OpenCV

1 件のコメント:

  1. このコメントは投稿者によって削除されました。

    返信削除