初识聚类算法: DBSACN,初识聚类算法dbsacn,[Python]代码#
初识聚类算法: DBSACN,初识聚类算法dbsacn,[Python]代码#
[Python]代码
# scoding=utf-8import pylab as plfrom collections import defaultdict,Counterpoints = [[int(eachpoint.split("#")[0]), int(eachpoint.split("#")[1])] for eachpoint in open("points","r")]# 计算每个数据点相邻的数据点,邻域定义为以该点为中心以边长为2*EPs的网格Eps = 10surroundPoints = defaultdict(list)for idx1,point1 in enumerate(points): for idx2,point2 in enumerate(points): if (idx1 < idx2): if(abs(point1[0]-point2[0])<=Eps and abs(point1[1]-point2[1])<=Eps): surroundPoints[idx1].append(idx2) surroundPoints[idx2].append(idx1)# 定义邻域内相邻的数据点的个数大于4的为核心点MinPts = 5corePointIdx = [pointIdx for pointIdx,surPointIdxs in surroundPoints.iteritems() if len(surPointIdxs)>=MinPts]# 邻域内包含某个核心点的非核心点,定义为边界点borderPointIdx = []for pointIdx,surPointIdxs in surroundPoints.iteritems(): if (pointIdx not in corePointIdx): for onesurPointIdx in surPointIdxs: if onesurPointIdx in corePointIdx: borderPointIdx.append(pointIdx) break# 噪音点既不是边界点也不是核心点noisePointIdx = [pointIdx for pointIdx in range(len(points)) if pointIdx not in corePointIdx and pointIdx not in borderPointIdx]corePoint = [points[pointIdx] for pointIdx in corePointIdx] borderPoint = [points[pointIdx] for pointIdx in borderPointIdx]noisePoint = [points[pointIdx] for pointIdx in noisePointIdx]# pl.plot([eachpoint[0] for eachpoint in corePoint], [eachpoint[1] for eachpoint in corePoint], 'or')# pl.plot([eachpoint[0] for eachpoint in borderPoint], [eachpoint[1] for eachpoint in borderPoint], 'oy')# pl.plot([eachpoint[0] for eachpoint in noisePoint], [eachpoint[1] for eachpoint in noisePoint], 'ok')groups = [idx for idx in range(len(points))]# 各个核心点与其邻域内的所有核心点放在同一个簇中for pointidx,surroundIdxs in surroundPoints.iteritems(): for oneSurroundIdx in surroundIdxs: if (pointidx in corePointIdx and oneSurroundIdx in corePointIdx and pointidx < oneSurroundIdx): for idx in range(len(groups)): if groups[idx] == groups[oneSurroundIdx]: groups[idx] = groups[pointidx]# 边界点跟其邻域内的某个核心点放在同一个簇中for pointidx,surroundIdxs in surroundPoints.iteritems(): for oneSurroundIdx in surroundIdxs: if (pointidx in borderPointIdx and oneSurroundIdx in corePointIdx): groups[pointidx] = groups[oneSurroundIdx] break# 取簇规模最大的5个簇wantGroupNum = 3finalGroup = Counter(groups).most_common(3)finalGroup = [onecount[0] for onecount in finalGroup]group1 = [points[idx] for idx in xrange(len(points)) if groups[idx]==finalGroup[0]]group2 = [points[idx] for idx in xrange(len(points)) if groups[idx]==finalGroup[1]]group3 = [points[idx] for idx in xrange(len(points)) if groups[idx]==finalGroup[2]]pl.plot([eachpoint[0] for eachpoint in group1], [eachpoint[1] for eachpoint in group1], 'or')pl.plot([eachpoint[0] for eachpoint in group2], [eachpoint[1] for eachpoint in group2], 'oy')pl.plot([eachpoint[0] for eachpoint in group3], [eachpoint[1] for eachpoint in group3], 'og')# 打印噪音点,黑色pl.plot([eachpoint[0] for eachpoint in noisePoint], [eachpoint[1] for eachpoint in noisePoint], 'ok') pl.show()
DBscan.png
imgs/asCode/22213136_fZaP.png
相关内容
- 初识聚类算法: 凝聚层次聚类,初识聚类,[Python]代码#
- 初识聚类算法: 基本K均值,初识聚类k均值,[Python]代码
- Rock-paper-scissors-lizard-Spock Game,,[Python]代码#
- 元芳,你怎么看,元芳,你,O(∩_∩)OSTR =
- wxPython实现sqlite3数据库的gui界面,wxpythonsqlite3,sql.py#!
- 简单的生成html,简单生成html,[Python]代码cl
- Levenshtein字符串相似度,Levenshtein字符串,Levenshtein距
- 抓取网上的小说章节并写入txt文件,抓取网上章节txt
- 批量删除所下载的.git文件夹,批量删除.git文件夹,批量
- 得到N以内的所有的质数,得到N质数,getPrime.pyi
评论关闭