Task 3

Date: 2019-11-20 00:00:56

from sklearn import datasets    # sample datasets bundled with scikit-learn
from sklearn.model_selection import train_test_split  # splits the data into a training set and a test set
import numpy as np
import heapq

iris = datasets.load_iris()   # load the iris dataset
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2003)

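# Euclidean distance between two feature vectors
def euc_dis(instance1, instance2):
    diff = instance1 - instance2   # element-wise difference
    diff = diff ** 2               # square each component
    dist = sum(diff) ** 0.5        # sum, then take the square root
    return dist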

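# kNN classifier: predict the label of testInstance from the training data X, y
def knn_classify(X, y, testInstance, k):
    dis = []
    for i in X:
        dis.append(euc_dis(i, testInstance))  # distance from testInstance to every vector in X

    # indices of the k smallest distances; ranking indices instead of values keeps tied distances distinct
    minIndex = heapq.nsmallest(k, range(len(dis)), key=dis.__getitem__)

    nearestY = []
    for i in minIndex:
        nearestY.append(y[i])   # collect the labels of the k nearest samples
    return max(nearestY, key=nearestY.count)  # the most frequent label wins

predictions = [knn_classify(X_train, y_train, data, 3) for data in X_test]
correct = np.count_nonzero(np.array(predictions) == y_test)
print("Accuracy is: %.3f" % (correct / len(X_test)))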
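One detail worth checking in the neighbor lookup: list.index always returns the first matching position, so looking up each of the k smallest distances by value gives the same index twice when two training samples are equally far away. A minimal sketch with made-up distances (the numbers are my own assumption), comparing that lookup with ranking the indices directly:

```python
import heapq

# Made-up distances with a tie: samples 1 and 2 are equally far away
dis = [0.5, 0.2, 0.2, 0.9]
k = 2

# Looking up each small value with list.index finds the first match every time
by_value = list(map(dis.index, heapq.nsmallest(k, dis)))

# Ranking the indices themselves keeps the tied samples distinct
by_index = heapq.nsmallest(k, range(len(dis)), key=dis.__getitem__)

print(by_value)  # [1, 1]
print(by_index)  # [1, 2]
```

With the value lookup, sample 1 would vote twice and sample 2 would never vote, which can change the majority label.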

The train_test_split function randomly splits arrays into a training set and a test set, and returns the resulting subsets.

Syntax:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=random_state)

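Where:

X: the sample features to split

y: the sample labels to split

test_size: if a float between 0 and 1, the fraction of the original samples placed in the test set; if an integer, the number of samples in the test set.

random_state: the random seed

X_train: the training set features (return value)

X_test: the test set features (return value)

y_train: the training set labels (return value)

y_test: the test set labels (return value)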
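To see how test_size behaves, here is a small sketch on toy data. The arrays are my own assumption, not part of the task, and the snippet assumes scikit-learn and NumPy are installed:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, assumed for illustration: 10 samples with 2 features each
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# A float test_size is read as a fraction of the samples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)
print(X_train.shape, X_test.shape)    # (8, 2) (2, 2)

# An int test_size is read as an absolute number of test samples
X_train2, X_test2, y_train2, y_test2 = train_test_split(
    X, y, test_size=4, random_state=1)
print(X_train2.shape, X_test2.shape)  # (6, 2) (4, 2)
```

Features and labels are shuffled together, so row i of X_test still belongs to label i of y_test.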

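Random seed: it identifies one particular sequence of random numbers, so that a repeated experiment gets the same sequence again. For example, if you pass 1 every time and keep the other arguments the same, you get the same split every time. Passing 0 is no different: any fixed integer, 0 included, reproduces its split. Only when random_state is left unset (None) does the split change from run to run.

The random numbers that are generated depend on the seed, according to two rules:

Different seeds produce different random sequences; the same seed produces the same sequence, even across different generator instances.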
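The two seed rules can be checked with Python's standard random module. This is only an illustration of seeding in general; the seeds 42, 7 and 0 are arbitrary choices of mine:

```python
import random

rng_a = random.Random(42)   # two separate generator instances, same seed
rng_b = random.Random(42)
rng_c = random.Random(7)    # a different seed

seq_a = [rng_a.randint(0, 10**6) for _ in range(5)]
seq_b = [rng_b.randint(0, 10**6) for _ in range(5)]
seq_c = [rng_c.randint(0, 10**6) for _ in range(5)]

print(seq_a == seq_b)  # True: same seed, same sequence
print(seq_a == seq_c)  # False: different seed, different sequence

# Seed 0 is an ordinary fixed seed and is just as reproducible
zero_1 = random.Random(0).random()
zero_2 = random.Random(0).random()
print(zero_1 == zero_2)  # True
```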

Reference blog: https://blog.csdn.net/fxlou/article/details/79189106

The heapq module provides two functions, nlargest() and nsmallest(), that find the N largest or N smallest elements of a collection.

For example:

>>> import heapq
>>> nums=[6,8,2,0,11,-4,-9,23,4,96,27]
>>> print(heapq.nlargest(3,nums))
[96, 27, 23]
>>> print(heapq.nsmallest(3,nums))
[-9, -4, 0]

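Three ways to compute the square diff^2:

① diff ** 2: the power operator

② import math: the standard-library module

math.pow(diff, 2)

③ pow(diff, 2): the built-in function

Note that math.pow always returns a float and accepts only scalar values, so for a NumPy array such as diff in euc_dis, only ① and ③ work.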
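A quick comparison of the three forms on a plain integer (the value 3 is arbitrary) shows the difference in result types:

```python
import math

x = 3
a = x ** 2            # power operator: int in, int out
b = math.pow(x, 2)    # math module: always returns a float
c = pow(x, 2)         # built-in function: int in, int out
d = pow(x, 2, 5)      # optional third argument: result modulo 5

print(a, b, c, d)     # 9 9.0 9 4
```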

Reference blog: https://blog.csdn.net/jerry_1126/article/details/82917405

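In Python, the for loop is commonly used to iterate over strings, lists, tuples, dictionaries, and so on.

Syntax

for x in y:
    statement(s)

Execution flow: x takes each element of y in turn, and the loop ends once every element has been visited.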
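A short sketch of the same loop over each of these container types; the values are made up for illustration:

```python
seen = []

for ch in "ab":                    # string: one character per iteration
    seen.append(ch)

for n in [1, 2]:                   # list: one element per iteration
    seen.append(n)

for t in (3, 4):                   # tuple: same pattern
    seen.append(t)

scores = {"x": 10, "y": 20}        # made-up dictionary
for key in scores:                 # iterating a dict yields its keys
    seen.append(key)

for key, value in scores.items():  # items() yields (key, value) pairs
    seen.append((key, value))

print(seen)
# ['a', 'b', 1, 2, 3, 4, 'x', 'y', ('x', 10), ('y', 20)]
```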

Reference blog: https://www.cnblogs.com/kiki5881/p/8541887.html

The append() method adds a new object to the end of a list

Syntax

list.append(obj)

obj: the object to add to the end of the list
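A small example of append in action (the label values are made up). It shows two things that are easy to trip over: a list passed to append becomes a single nested element, and append itself returns None:

```python
labels = []
labels.append(0)            # [0]
labels.append(2)            # [0, 2]
labels.append([1, 1])       # the list is added as one element: [0, 2, [1, 1]]

result = labels.append(5)   # append changes the list in place and returns None

print(labels)   # [0, 2, [1, 1], 5]
print(result)   # None
```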

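map() applies the given function to every item of the specified sequence(s)

The first argument, function, is called with each element of the iterable argument(s). In Python 3, map() returns a lazy iterator over the results, so wrap it in list() to get a list (in Python 2 it returned a list directly)

Syntax:

map(function, iterable, ...)

Example

def square(x):            # compute the square of x
    return x ** 2

list(map(square, [1, 2, 3, 4, 5]))   # square each element of the list
[1, 4, 9, 16, 25]
list(map(lambda x: x ** 2, [1, 2, 3, 4, 5]))  # the same, using an anonymous lambda function
[1, 4, 9, 16, 25]

# two lists are supplied; the elements at matching positions are added
list(map(lambda x, y: x + y, [1, 3, 5, 7, 9], [2, 4, 6, 8, 10]))
[3, 7, 11, 15, 19]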
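Two behaviors of map in Python 3 are worth seeing once: the result is an iterator that can be consumed only a single time, and with several input sequences it stops at the shortest one. A minimal sketch with arbitrary numbers:

```python
squares = map(lambda x: x ** 2, [1, 2, 3])

first_pass = list(squares)    # consuming the iterator produces the values
second_pass = list(squares)   # the iterator is now exhausted

# With several iterables, map stops at the shortest one
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20]))

print(first_pass)   # [1, 4, 9]
print(second_pass)  # []
print(sums)         # [11, 22]
```

This matters in knn_classify-style code: if the result of map is looped over more than once, the second loop sees nothing unless the result was first stored in a list.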

Reference: https://www.runoob.com/python/python-func-map.html

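Reading through the reference code, there was a lot I did not understand. Having sorted out the logic, I can see there is plenty left to learn. Luckily there is Baidu; I clearly still know too little.

I am now at least on nodding terms with these functions. I am not fluent with them yet, but with regular use, practice should make perfect.

In short, there is still a long way to go.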


Original: https://www.cnblogs.com/C-ch3-5/p/11894439.html
