604 技術社區[雲棲]

遺傳算法中幾種不同選擇算子及Python實現

前言

本文對遺傳算法中的幾種選擇策略進行了總結, 其中包括:

Proportionate Roulette Wheel Selection
Linear Ranking Selection
Exponential Ranking Selection
Tournament Selection

對於每種選擇策略我都使用Python進行了相應的實現並以內置插件的形式整合進了本人所寫的遺傳算法框架GAFT中。對需要使用遺傳算法優化問題以及學習遺傳算法的童鞋可以作為參考。

項目鏈接:

GitHub: https://github.com/PytLab/gaft
PyPI: https://pypi.python.org/pypi/gaft

遺傳算法中的選擇算子

遺傳算法(genetic algorithms, GAs)是一種自適應的啟發式搜索算法, 它模仿達爾文進化論中的“適者生存”的原則, 最終獲取優化目標的最優解。下圖描述了一個簡單的遺傳算法流程:

對於種群中需要進行雜交的物種選擇方法有很多，而且選擇一種合適的選擇策略對於遺傳算法整體性能的影響將是很大的。如果一個選擇算法選擇多樣性降低，便會導致種群過早的收斂到局部最優點而不是我們想要的全局最優點，也就是所謂的”早熟”。而選擇策略過於發散則會導致算法難以收斂到最優點。因此在這兩點中我們需要進行平衡才能使遺傳算法以一種高效的方式收斂到全局最優點。

GAFT框架中的算子插件

GAFT是我根據自己需求開發的一個遺傳算法框架，相關介紹的博客可以參見《GAFT-一個使用Python實現的遺傳算法框架》,《使用MPI並行化遺傳算法框架GAFT》。該框架提供了插件接口，用戶可以通過自定義算子以及on-the-fly分析插件來放到gaft框架中運行遺傳算法流程對目標問題進行優化。

本部分我稍微介紹下gaft關於遺傳算子相關接口規範，以及編寫能用於gaft的算子的編寫方法。

在gaft中遺傳算子的編寫都是需要繼承框架內置的基類，然後根據基類提供的接口，實現自己的算子。其中基類的定義都在/gaft/plugin_interfaces/operators/目錄下，下麵我以選擇算子為例，介紹下接口。

gaft中選擇算子的基類為GASelection，其中在遺傳算法流程中會調用該類實例的select方法，進而根據算子中的相關選擇策略完成從種群中選取一對物種作為父親和母親產生子代。基類的定義為：

class GASelection(metaclass=SelectionMeta):
'''
Class for providing an interface to easily extend the behavior of selection
operation.
'''
def select(self, population, fitness):
'''
Called when we need to select parents from a population to later breeding.
:param population: The current population.
:type population: GAPopulation
:return parents: Two selected individuals for crossover.
:type parents: Tuple of tow GAIndividual objects.
'''
raise NotImplementedError

select的方法的參數為當前種群population以及相應的適應度函數fitness，其中population需要是GAPopulation對象，fitness也必須是callable的對象。

當然，這些在Python這種動態類型語言中貌似看起來有些雞肋，但是為了能夠更加規範使用者，我利用Python的元類在實例化類對象的時候對接口的實現以及接口的參數類型加以限製。具體的實現都在/gaft/plugin_interfaces/metaclasses.py中，有興趣的童鞋可以看看實現方法。

具體自定義算子的編寫方法我將在下一部分同選擇策略一起貼出來。

不同的選擇策略

本部分我主要對四種不同的選擇策略進行總結並加以gaft插件形式的Python實現。

選擇算子決定了哪些個體將會從種群中被選擇出來用於繁衍下一代種群中的新個體。其主要的原則就是:

the better is an individual; the higher is its chance of being a parent

選擇算子在遺傳算法迭代中將適應度函數引入進來，因為適應度函數式標定一個個體是否足夠“好”的重要標準。但是選擇過程又不能僅僅完全依賴於適應度函數，因為一個種群中的最優物種並不一定是在全局最優點附近。因此我們也應該給相對來說並那麼“好”的個體一點機會讓他們繁衍後代, 避免“早熟”。

Proportionate Roulette Wheel Selection

此輪盤賭選擇策略，是最基本的選擇策略之一，種群中的個體被選中的概率與個體相應的適應度函數的值成正比。我們需要將種群中所有個體的適應度值進行累加然後歸一化，最終通過隨機數對隨機數落在的區域對應的個體進行選取，類似賭場裏麵的旋轉的輪盤。

每個個體ai被選中的概率為:

好了，下麵可以將此算法寫成一個可以gaft中執行的算子。

from random import random
from bisect import bisect_right
from itertools import accumulate
from ...plugin_interfaces.operators.selection import GASelection
class RouletteWheelSelection(GASelection):
def __init__(self):
'''
Selection operator with fitness proportionate selection(FPS) or
so-called roulette-wheel selection implementation.
'''
pass
def select(self, population, fitness):
'''
Select a pair of parent using FPS algorithm.
'''
# Normalize fitness values for all individuals.
fit = [fitness(indv) for indv in population.individuals]
min_fit = min(fit)
fit = [(i - min_fit) for i in fit]
# Create roulette wheel.
sum_fit = sum(fit)
wheel = list(accumulate([i/sum_fit for i in fit]))
# Select a father and a mother.
father_idx = bisect_right(wheel, random())
father = population[father_idx]
mother_idx = (father_idx + 1) % len(wheel)
mother = population[mother_idx]
return father, mother

過程主要分為下麵幾個:

繼承GASelection類
實現select方法
select的參數為GAPopulation實例和適應度函數
根據算法選擇出兩個需要繁衍的物種並返回即可

Tournament Selection

由於算法執行的效率以及易實現的的特點，錦標賽選擇算法是遺傳算法中最流行的選擇策略。在本人的實際應用中的確此策略比基本的輪盤賭效果要好些。他的策略也很直觀，就是我們再整個種群中抽取n個個體，讓他們進行競爭(錦標賽)，抽取其中的最優的個體。參加錦標賽的個體個數成為tournament size。通常當n=2便是最常使用的大小，也稱作Binary Tournament Selection.

Tournament Selection的優勢:

更小的複雜度O(n)
易並行化處理
不易陷入局部最優點
不需要對所有的適應度值進行排序處理

下圖顯示了n=3的Tournament Selection的過程:

可以開始寫成自定義算子在gaft運行了:

from random import sample
from ...plugin_interfaces.operators.selection import GASelection
class TournamentSelection(GASelection):
def __init__(self, tournament_size=2):
'''
Selection operator using Tournament Strategy with tournament size equals
to two by default.
'''
self.tournament_size = tournament_size
def select(self, population, fitness):
'''
Select a pair of parent using Tournament strategy.
'''
# Competition function.
complete = lambda competitors: max(competitors, key=fitness)
# Check validity of tournament size.
if self.tournament_size >= len(population):
msg = 'Tournament size({}) is larger than population size({})'
raise ValueError(msg.format(self.tournament_size, len(population)))
# Pick winners of two groups as parent.
competitors_1 = sample(population.individuals, self.tournament_size)
competitors_2 = sample(population.individuals, self.tournament_size)
father, mother = complete(competitors_1), complete(competitors_2)
return father, mother

下麵兩個介紹的選擇策略都是基於排序的選擇策略，上麵提到的第一種基本輪盤賭選擇算法，有一個缺點，就是如果一個個體的適應度值為0的話，則被選中的概率將會是0, 這個個體將不能產生後代。於是我們需要一種基於排序的算法，來給每個個體安排相應的選中概率。

在Linear Ranking Selection中，種群中的個體首先根據適應度的值進行排序，然後給所有個體賦予一個序號，最好的個體為N, 被選中的概率為Pmax, 最差的個體序號為1, 被選中的概率為Pmin，於是其他的在他們中間的個體的概率便可以根據如下公式得到:

實現代碼:

from random import random
from itertools import accumulate
from bisect import bisect_right
from ...plugin_interfaces.operators.selection import GASelection
class LinearRankingSelection(GASelection):
def __init__(self, pmin=0.1, pmax=0.9):
'''
Selection operator using Linear Ranking selection method.
Reference: Baker J E. Adaptive selection methods for genetic
algorithms[C]//Proceedings of an International Conference on Genetic
Algorithms and their applications. 1985: 101-111.
'''
# Selection probabilities for the worst and best individuals.
self.pmin, self.pmax = pmin, pmax
def select(self, population, fitness):
'''
Select a pair of parent individuals using linear ranking method.
'''
# Individual number.
NP = len(population)
# Add rank to all individuals in population.
sorted_indvs = sorted(population.individuals, key=fitness, reverse=True)
# Assign selection probabilities linearly.
# NOTE: Here the rank i belongs to {1, ..., N}
p = lambda i: (self.pmin + (self.pmax - self.pmin)*(i-1)/(NP-1))
probabilities = [self.pmin] + [p(i) for i in range(2, NP)] + [self.pmax]
# Normalize probabilities.
psum = sum(probabilities)
wheel = list(accumulate([p/psum for p in probabilities]))
# Select parents.
father_idx = bisect_right(wheel, random())
father = population[father_idx]
mother_idx = (father_idx + 1) % len(wheel)
mother = population[mother_idx]
return father, mother

類似上麵的Linear Ranking選擇策略，這種指數排序便是在確定每個個體的選擇概率的時候使用了指數形式的表達式, 其中c為底數，滿足0<c<1:

實現代碼:

from random import random
from itertools import accumulate
from bisect import bisect_right
from ...plugin_interfaces.operators.selection import GASelection
class ExponentialRankingSelection(GASelection):
def __init__(self, base=0.5):
'''
Selection operator using Exponential Ranking selection method.
:param base: The base of exponent
:type base: float in range (0.0, 1.0)
'''
if not (0.0 < base < 1.0):
raise ValueError('The base of exponent c must in range (0.0, 1.0)')
self.base = base
def select(self, population, fitness):
'''
Select a pair of parent individuals using exponential ranking method.
'''
# Individual number.
NP = len(population)
# NOTE: Here the rank i belongs to {1, ..., N}
p = lambda i: self.base**(NP - i)
probabilities = [p(i) for i in range(1, NP + 1)]
# Normalize probabilities.
psum = sum(probabilities)
wheel = list(accumulate([p/psum for p in probabilities]))
# Select parents.
father_idx = bisect_right(wheel, random())
father = population[father_idx]
mother_idx = (father_idx + 1) % len(wheel)
mother = population[mother_idx]
return father, mother

總結

本文對於遺傳算法中四種不同的選擇策略進行了介紹和總結，同時對於本文所寫的遺傳算法框架的自定義算子接口進行了簡要的介紹，針對本文中的選擇策略分別根據接口的要求實現了相應的算子，這些算子也作為GAFT框架的內置算子放入到GAFT中，對於使用GAFT的童鞋可以直接拿來使用。

參考

Shukla, Anupriya, Hari Mohan Pandey, and Deepti Mehrotra. “Comparative review of selection techniques in genetic algorithm.” Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), 2015 International Conference on. IEEE, 2015.

本文作者：iPytLab

來源：51CTO

最後更新：2017-11-02 11:33:51

遺傳算法中幾種不同選擇算子及Python實現

上一篇：給 Web 開發人員推薦的文檔生成工具

下一篇：你不知道的事兒三（CSS）

相關內容

熱門內容

最新內容

遺傳算法中幾種不同選擇算子及Python實現

上一篇： 給 Web 開發人員推薦的文檔生成工具

下一篇： 你不知道的事兒三（CSS）

相關內容

熱門內容

最新內容

上一篇：給 Web 開發人員推薦的文檔生成工具

下一篇：你不知道的事兒三（CSS）