Programming

通过函数式编程和四叉树模型解决双值图像的色彩处理问题 Using Haskell to solve QuadTree problems

通过函数式编程和四叉树模型解决双值图像的色彩处理问题

二叉树的介绍

四叉树（Quad Tree）是一种树状数据结构，可用于表示图像。在每一个四叉树会有四个子区块。四叉树常应用于二维空间资料的分析与分类。它将资料区分成为四个象限。资料范围可以是方形或矩形或其他任意形状。这种数据结构是由拉斐尔·芬科尔(Raphael Finkel) 与 J. L. Bentley 在1974年发展出来。类似的资料分割方法也称为 Q-tree。

本博文中使用函数式编程(Functional Programming)的Haskell作为编程语言，通过递归思想，实现对由四叉树表示的的双值图像的色彩检测。

背景介绍

假设我们需要存储一张像素大小为 2^{𝑛} ∗ 2^{𝑛} 的正方形双值（黑白双色）图像，一般使用一个bit（0或1）来存储一个像素值，需要2^{𝑛} ∗ 2^{𝑛} bit的存储空间。但如果图像中有大面积的单色区域，这种存储方式可能会造成浪费。

原图像

右下角浪费

左上角的四分格中也浪费

解决方案大致思想

在这种情况下，一个简单的优化方式是将图像分为四个大小为 2^{𝑛-1} ∗ 2^{𝑛-1} 的子图像，我们将其称为“象限”（quadrants）。如果某子图像是纯色图像，我们可以用单一颜色信息（黑色或白色）来表示。但如果它包含不同的颜色（黑色和白色混掺），我们可以再次细分，并继续递归，直到我们得到只有一种颜色的子图像。（此方案理论上可行：如果我们不断递归直至子图像缩小到原始像素的比例，问题一定能解决）。我们将最终数据结构中的这些单一颜色子图像称为“单元”（Cell）。现实中，我们还要考虑图像实际大小，色彩标准，像素维度等多重复杂因素，在此我们忽略。

初始化

首先定义几种Algebraic Data Type，也就是用关键词data来定义的代数数据类型：

1 2	-- 初始化颜色（黑与白） initialise colors' data type data Color = Black \| White deriving (Eq, Show)

这里把子图像定义为四叉树，纯色图像（Cell）和混色图像（Grid）都是子图像的子集，其中混色图像可以继续细分，直至变成纯色图像

1
2

-- 实现四叉树（遵循数学中象限的定义方式：右上角为第一象限，逆时针递进，右下角为第四象限） implement QuadTree (the order of quadrant follows mathematical coordinate system)
data QuadTree = Cell Color Int | Grid QuadTree QuadTree QuadTree QuadTree deriving (Eq, Show)

一个函数 allBlack，它接受一个整数 n 并返回一个全黑的单个细胞的表示。参数 n 代表图像“大小”，但由于任何尺寸的全黑图像看起来都相同，因此可以忽略此参数！但是我没有忽略！！！（后续有用…）

1
2
3

-- returns a Black or White cell of a specified size
allBlack :: Int -> QuadTree
allBlack size = Cell Black size

一个函数 allWhite，它接受一个整数 n 并返回一个全白的单个细胞的表示。参数 n 代表图像“大小”，同上

1 2	allWhite :: Int -> QuadTree allWhite size = Cell White size

一个顺时针函数，它采用四个四叉树并返回四叉树，其四个子树是给定输入，按顺时针顺序排列。

1
2
3

-- The first quadrant is top-right!
clockwise :: QuadTree -> QuadTree -> QuadTree -> QuadTree -> QuadTree
clockwise a b c d = Grid a b c d

一个逆时针函数，它采用四个四叉树并返回四叉树，其四个子树是给定输入，按逆时针顺序排列。

1 2	anticlockwise :: QuadTree -> QuadTree -> QuadTree -> QuadTree -> QuadTree anticlockwise a b c d = Grid a d c b

顺时针排序是指在a b c d顺时针方向的四叉树中，子树b位于a顺时针移动到达的象限内，c位于b顺时针移动到达的象限内，d位于c顺时针移动到达的象限内，a位于从d顺时针移动到达的象限内。反之亦然。

重头戏

接下来我们要进行图像处理，定义一个函数：reverse，它将四叉树作为输入并返回一个新的四叉树作为输出。此函数不应该改变quadtree的结构，但应该改变里面表示黑白颜色的数据内容。
改变规则如下：
当且仅当超过一半的邻边或部分边（这里统称为“邻居”）的单元颜色与此单元自身颜色不一致时，反转此单元的颜色（假设该单元格原先为黑色的话，现在变为白色）。
例如，如果一个单元在输入q中是黑色的，那么它在输出模糊q中应该是白色的，当且仅当在q中它的邻居中白色的单元多于黑色的单元。
此处注意，宏观来看一个完整图像的边界处的单元通常具有较少的邻居。
效果1
效果2

以下思路从宏观到微观依次介绍，目的是为了增强思维连续性，便于读者理解

思路介绍：

完成这个任务有几点要考虑：
1 必须确保图像中的每一个单元都遍历到，才能完全输出最终反转后的图像，不能有遗漏，进一步来讲需要知道哪些单元已经遍历过了，哪些还没到访过。
2 遍历的出发点选在哪？何时结束遍历？
3 遍历的路线是怎样的？

出发点设在整个图像的中心，中心也就是整个图像长度、宽度的各二分之一处（也就是中点，一半的意思），这样就构成了一个坐标系，由此开始递归遍历。
以下是整个程序的start point：启动整个程序，该函数输入一个四叉树，输出处理后的二叉树

-- Original Engine of whole programme
-- use reverseMini to implement the tree
reverse :: QuadTree -> QuadTree
reverse fullTree = reverseMini fullTree (fullTree, fullSize) (fullSize / 2, fullSize / 2, fullSize)
  where fullSize = getWholeSize fullTree

接下来接棒给另一个函数：reverseMini，该函数有好多个输入参数，依次是：
当前所在的四叉树，表示完整图像的四叉树，完成图像四叉树的大小，遍历过程中目前所在的点的坐标（x和y），当前所在四叉树的大小。
输出的四叉树就是返回给reverse的最终答案，以下是细节：

-- According to neighbor's weight to decide if it should reverse the color
-- Current QuadTree -> (initial QuadTree, initial QuadTree size) -> (Current x, y, Current QuadTree size) -> output
reverseMini :: QuadTree -> (QuadTree, Double) -> (Double, Double, Double) -> QuadTree
-- for Cell
reverseMini (Cell ce cellSize) (fullTree, treeSize) (x_now, y_now, _)
  --  more than half of its neighbours have the opposite colour, reverse its colour
  | ce == Black && (neighborWeight > 0) = Cell White cellSize
  | ce == White && (neighborWeight < 0) = Cell Black cellSize
  | otherwise = Cell ce cellSize
  where
    neighborWeight = getNeighbourWeight (fullTree, treeSize) (x_now, y_now, fromIntegral cellSize)
-- for Grid, contiinue recursion
reverseMini (Grid a b c d) (fullTree, treeSize) (x_now, y_now, fullSize) =
  clockwise
    (reverseMini a (fullTree, treeSize) (x_now + offset, y_now - offset, fullSize / 2))
    (reverseMini b (fullTree, treeSize) (x_now - offset, y_now - offset, fullSize / 2))
    (reverseMini c (fullTree, treeSize) (x_now - offset, y_now + offset, fullSize / 2))
    (reverseMini d (fullTree, treeSize) (x_now + offset, y_now + offset, fullSize / 2))
  where
    offset = fullSize / 4

因为输入的四叉树可能是Grid也可能是Cell，在此分类讨论：

若输入的是Cell，那么直接判断当前颜色是否与权重匹配，匹配则Cell颜色保持不变，若不匹配则颜色反转。“匹配”的含义后面会讲。
若输入的是Grid，则进一步递归当前四叉树的四个子图像a,b,c,d，同时自身坐标将会发生改变，迁移到下一个子图像的中心，用当前所在四叉树的大小折半来表示子图像的大小（四个子图像大小相等）。

首先讲一下Cell部分的设计，这里设计了一个方法来判断是否有超过一半的邻居与自身颜色不同，此方法把黑色赋权为整数-1，白色赋权为+1。然后通过计算邻居的权重之和的正负来决定是否与之匹配。
在这里把自身颜色为白色且权重和为正数，或者自身颜色为黑色且权重和为负数都称作与之相匹配。

接下来介绍如何通过函数getNeighbourWeight来计算邻居的权重之和。
下面依次介绍输入参数：
表示完整图像的四叉树（原图像），完整图像四叉树的大小，邻居单元的x坐标和y坐标，邻居单元的规格；
输出的是当前邻居的权重之和。
这里因为递归路线的性质需要逆向探测，所以getNeighbour方向和border方向相反，也是最巧妙的地方

-- get neighbors weight and sum the total weight of a cell's all neighbors
-- (initial QuadTree, initial QuadTree size) -> (Neighbor Cell's X coordinate, Cell's Y coordinate, Cell's size) -> output
getNeighbourWeight :: (QuadTree, Double) -> (Double, Double, Double) -> Int
getNeighbourWeight (fullTree, treeSize) (x_cell, y_cell, cellSize) =
      border (getNeighbour "up" (fullTree, treeSize) (x_cell, y_cell, cellSize)) "down"
    + border (getNeighbour "down" (fullTree, treeSize) (x_cell, y_cell, cellSize)) "up"
    + border (getNeighbour "left" (fullTree, treeSize) (x_cell, y_cell, cellSize)) "right"
    + border (getNeighbour "right" (fullTree, treeSize) (x_cell, y_cell, cellSize)) "left"

此函数还需借鉴另外两个函数：getNeighbour和border。
先介绍border函数，其输入参数为：
当下需要遍历的四叉树（也可以不输入，此情况输出值为0），探测方向，
输出的是根据单个Cell颜色得出的权重
border是“落实到基层”的函数，承担了真正根据Cell颜色赋予其权重的任务：白色+1，黑色-1，如果不是最小Cell（是Grid）就继续上下左右递归直到找到相应的Cell。

-- detect one specified border of a certain Gird/Cell and sum its all cells values(weights) according to colors
-- a Quadtree (or Nothing) -> one of the four inner-borders ->
border :: Maybe QuadTree -> String -> Int
-- invalid
border Nothing _ = 0
-- calculate the weight of the Cell
border (Just (Cell ce _)) _
  | ce == White = 1 -- White cells are worth +1
  | ce == Black = -1 -- black cells are worth -1
--  continue to implement recursion for Grid if cell is not arrived
border (Just (Grid a b c d)) direction
  | direction == "up" = border (Just a) "up" + border (Just b) "up"
  | direction == "left" = border (Just b) "left" + border (Just c) "left"
  | direction == "down" = border (Just c) "down" + border (Just d) "down"
  | direction == "right" = border (Just a) "right" + border (Just d) "right"

接下来说getNeighbour，由于该函数大量调用了getGrid，因此我们需要先了解getGrid函数。
getGrid函数作为getNeighbour的辅助函数，是程序中最复杂的一个函数，它根据getNeighbour提供的坐标变化来进一步遍历。
输入参数介绍：
四叉树，四叉树的大小（起初是完整图像，后续是便利目标的子图像），当前所在点的X，Y坐标，当前所处四叉树的大小，目标四叉树（即将要遍历的）的中点坐标以及大小

该函数有几点功能：
检测当前所处Grid是否为边界Grid，若遇到边界则停止遍历，否则决定下一步遍历的移动方向。
这里的边界线探测还是用坐标的数值来推理得出，把最小的cell大小假设成1，判断过程可以理解为尝试性地迈出0.5步，如果迈出后的坐标大于临界点（即完整图像边框线）时，则停止移动。
若还可以继续移动，则需要给出下一个到访的cell的中心坐标。

-- helper function for getNeigbor: according to inputed coordinates and inputed size returns the Grid and also detect the boundary.
-- Quadtree -> its size -> (current X coordinate, Y coordinate, size) -> (target's X coordinate, Y coordinate, size) ->
getGrid :: QuadTree -> Double -> (Double, Double, Double) -> (Double, Double, Double) -> Maybe QuadTree
getGrid (Cell ce cellSize) _ _ _ = Just (Cell ce cellSize)
getGrid (Grid a b c d) treeSize (x_now, y_now, sizeNow) (grid_x, grid_y, grid_size)
  -- detect boundary
  -- if it is already near the border, stop moving!
  | grid_x < 0.0 || grid_x > (treeSize - 0.5) || grid_y < 0.0 || grid_y > (treeSize - 0.5) = Nothing
  | sizeNow == grid_size = Just (Grid a b c d)
  | x_now > grid_x && y_now > grid_y = getGrid b treeSize (x_now - offset, y_now - offset, sizeNow / 2) (grid_x, grid_y, grid_size)
  | x_now < grid_x && y_now > grid_y = getGrid a treeSize (x_now + offset, y_now - offset, sizeNow / 2) (grid_x, grid_y, grid_size)
  | x_now > grid_x && y_now < grid_y = getGrid c treeSize (x_now - offset, y_now + offset, sizeNow / 2) (grid_x, grid_y, grid_size)
  | x_now < grid_x && y_now < grid_y = getGrid d treeSize (x_now + offset, y_now + offset, sizeNow / 2) (grid_x, grid_y, grid_size)
  -- offset means how far the coordinate system should move
  where
    offset = sizeNow / 4

最后我们介绍一下getNeighbor，输入参数为：
移动方向（进一步传导给getGrid），表示完整图像的四叉树，完成图像四叉树的大小，目标点的X，Y坐标和大小。
其主要作用是如何递归地调用getGrid，巧妙决定移动方向

--  returns the Quadtree representing one of his four direct neighbors (top, bottom, left, right)
-- direction -> (initial QuadTree, initial QuadTree size) -> (Cell's X coordinate, Cell's Y coordinate, Cell's size)
getNeighbor :: String -> (QuadTree, Double) -> (Double, Double, Double) -> Maybe QuadTree
getNeighbor direction (fullTree, treeSize) (x_cell, y_cell, cellSize)
  | direction == "top" = getGrid fullTree treeSize (tree_center, tree_center, treeSize) (x_cell, y_cell - cellSize, cellSize) -- top
  | direction == "bottom" = getGrid fullTree treeSize (tree_center, tree_center, treeSize) (x_cell, y_cell + cellSize, cellSize) -- bottom
  | direction == "left" = getGrid fullTree treeSize (tree_center, tree_center, treeSize) (x_cell - cellSize, y_cell, cellSize) -- left
  | direction == "right" = getGrid fullTree treeSize (tree_center, tree_center, treeSize) (x_cell + cellSize, y_cell, cellSize) -- right
  where
    tree_center = treeSize / 2

下面补充两个辅助函数，第一个用来计算某个四叉树的大小，第二个用来作为颜色的中间转换函数

-- output the size of a QuadTree by calculating the length of one side (eg. top)
getTreeSize :: QuadTree -> Double
getTreeSize (Cell ce size) = fromIntegral size
getTreeSize (Grid a b c d) = getTreeSize a + getTreeSize b

-- detect one specified side of a certain Gird/Cell and sum its all cells values according to colors
-- a Quadtree (or Nothing) -> one of the four inside-borders ->
side :: Maybe QuadTree -> String -> Int
-- invalid
side Nothing _ = 0
-- calculate the weight of the Cell
side (Just (Cell ce _)) _
  | ce == White = 1 -- White cells are worth +1
  | ce == Black = -1 -- black cells are worth -1
--  continue to implement recursion for Grid if cell is not arrived
side (Just (Grid a b c d)) direction
  | direction == "top" = side (Just a) "top" + (side (Just b) "top")
  | direction == "bottom" = side (Just c) "bottom" + (side (Just d) "bottom")
  | direction == "left" = side (Just b) "left" + (side (Just c) "left")
  | direction == "right" = side (Just a) "right" + (side (Just d) "right")

至此，完整的代码已全部展现。

About this Post

This post is written by Rui Xu, licensed under CC BY-NC 4.0.

#Haskell #Functional Programming