Over the holidays I went through a few books on machine learning and collected some of the more important core formulas here.
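Model Description
Given a feature space, we look for linear coefficients $\theta$ such that a linear function approximates the target vector.
How well the function approximates the target is measured by a cost function; the MSE listed below is one such function.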
Linear Regression
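For reference, the linear model and its MSE cost are commonly written as below. The notation is an assumption taken from common textbook usage, not from the original post: $m$ is the number of training instances, $x^{(i)}$ is the feature vector of instance $i$ (with $x_0 = 1$ for the bias term), and $y^{(i)}$ is its target.

$$
\hat{y} = \theta^T x, \qquad
\mathrm{MSE}(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \theta^T x^{(i)} - y^{(i)} \right)^2
$$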
Gradient Descent
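A minimal sketch of batch gradient descent for the MSE cost, assuming the usual update $\theta \leftarrow \theta - \eta \nabla_\theta \mathrm{MSE}(\theta)$ with $\nabla_\theta \mathrm{MSE}(\theta) = \frac{2}{m} X^T (X\theta - y)$. The function name, the learning rate `eta`, the iteration count, and the toy data are illustrative choices, not part of the original notes.

```python
import numpy as np

def batch_gradient_descent(X, y, eta=0.1, n_iterations=1000):
    """Minimize the MSE of a linear model with batch gradient descent."""
    m = X.shape[0]
    X_b = np.c_[np.ones((m, 1)), X]        # prepend the bias feature x0 = 1
    theta = np.zeros(X_b.shape[1])         # start from all-zero coefficients
    for _ in range(n_iterations):
        # Gradient of the MSE with respect to theta: (2/m) * X^T (X theta - y)
        gradients = 2.0 / m * X_b.T @ (X_b @ theta - y)
        theta = theta - eta * gradients    # step against the gradient
    return theta

# Noise-free toy data generated from y = 4 + 3x (hypothetical example)
X_demo = np.linspace(0, 2, 50).reshape(-1, 1)
y_demo = 4 + 3 * X_demo.ravel()
theta_demo = batch_gradient_descent(X_demo, y_demo)
```

On this toy data the returned coefficients should approach the intercept 4 and slope 3. A learning rate that is too large makes the iteration diverge, so `eta` usually needs tuning per dataset.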
With a regularization term
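The notes do not say which regularizer is meant. Assuming an L2 (ridge) penalty with strength $\alpha$, where the bias term $\theta_0$ is conventionally left out of the sum, the regularized cost can be sketched as:

$$
J(\theta) = \mathrm{MSE}(\theta) + \alpha \, \frac{1}{2} \sum_{i=1}^{n} \theta_i^2
$$

An L1 (lasso) penalty would replace the squared terms with $\alpha \sum_{i=1}^{n} |\theta_i|$.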
sklearn: linear regression

```python
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.metrics import mean_squared_error

# X and y are the training features and targets, assumed to be defined already
lr = LinearRegression()
lr.fit(X, y)
lr.intercept_, lr.coef_  # fitted bias term and feature weights
```
Logistic Regression
$\sigma(t) = \frac{1}{1 + e^{-t}}$ is the sigmoid function.
Logistic Regression cost function (log loss)
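In its standard form, the log loss averages the negative log-likelihood over the training set. The symbols below are assumptions following common notation: $\hat{p}^{(i)} = \sigma(\theta^T x^{(i)})$ is the estimated probability for instance $i$, and $y^{(i)} \in \{0, 1\}$ is its label.

$$
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log \hat{p}^{(i)} + \left(1 - y^{(i)}\right) \log \left(1 - \hat{p}^{(i)}\right) \right]
$$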
Logistic cost function partial derivatives 
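The partial derivatives of the log loss take the well-known form $\frac{\partial J}{\partial \theta_j} = \frac{1}{m} \sum_{i} \left( \sigma(\theta^T x^{(i)}) - y^{(i)} \right) x_j^{(i)}$. The sketch below implements that expression and compares it with a finite-difference estimate; all function names and the random toy data are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def log_loss(theta, X, y):
    """Average negative log-likelihood of labels y under the logistic model."""
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def log_loss_gradient(theta, X, y):
    """Analytic gradient: (1/m) * X^T (sigmoid(X theta) - y)."""
    return X.T @ (sigmoid(X @ theta) - y) / X.shape[0]

def numerical_gradient(f, theta, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

# Small random problem (illustrative data only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = (rng.random(20) > 0.5).astype(float)
theta_demo = rng.normal(size=3)

analytic = log_loss_gradient(theta_demo, X_demo, y_demo)
numeric = numerical_gradient(lambda th: log_loss(th, X_demo, y_demo), theta_demo)
```

The two gradients should agree to several decimal places, which is a quick sanity check when deriving or implementing a cost function by hand.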
sklearn: Logistic Regression

```python
from sklearn.linear_model import LogisticRegression

# X and y are the training features and class labels, assumed to be defined already
log_reg = LogisticRegression()
log_reg.fit(X, y)
```
Softmax Regression
Support Vector Machine
Decision Functions and Predictions 
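For a linear SVM classifier, the prediction is usually the sign of the decision function $w^T x + b$. The symbols $w$ (weight vector) and $b$ (bias) are assumed here from common notation.

$$
\hat{y} =
\begin{cases}
0 & \text{if } w^T x + b < 0 \\
1 & \text{if } w^T x + b \ge 0
\end{cases}
$$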
 
Hard Margin Classification 
 
subject to
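The hard-margin objective is usually stated as the constrained problem below. The notation is an assumption from common textbook usage: $t^{(i)} = -1$ for negative instances and $t^{(i)} = +1$ for positive ones.

$$
\min_{w,\, b} \; \frac{1}{2} w^T w
\quad \text{subject to} \quad
t^{(i)} \left( w^T x^{(i)} + b \right) \ge 1, \quad i = 1, \dots, m
$$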
Soft Margin Classification 
 
subject to
subject to
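The soft-margin objective is commonly written with one slack variable $\zeta^{(i)} \ge 0$ per instance, measuring how far that instance may violate the margin. The notation is assumed, with $t^{(i)} \in \{-1, +1\}$ as the class label; $C$ is the same trade-off hyperparameter that `LinearSVC` exposes.

$$
\min_{w,\, b,\, \zeta} \; \frac{1}{2} w^T w + C \sum_{i=1}^{m} \zeta^{(i)}
\quad \text{subject to} \quad
t^{(i)} \left( w^T x^{(i)} + b \right) \ge 1 - \zeta^{(i)}
\;\; \text{and} \;\;
\zeta^{(i)} \ge 0, \quad i = 1, \dots, m
$$

A smaller $C$ tolerates more margin violations in exchange for a wider margin.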
LinearSVC

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)  # 1.0 for Iris virginica, else 0.0

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge")),
])
svm_clf.fit(X, y)
```
Common kernels 
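The kernels most often listed under this heading are the linear, polynomial, Gaussian RBF, and sigmoid kernels. The sketch below writes each one for a single pair of vectors in NumPy; the function names and the default values for `gamma`, `r`, and `d` are illustrative assumptions.

```python
import numpy as np

def linear_kernel(a, b):
    """K(a, b) = a^T b"""
    return a @ b

def polynomial_kernel(a, b, gamma=1.0, r=0.0, d=3):
    """K(a, b) = (gamma * a^T b + r)^d"""
    return (gamma * (a @ b) + r) ** d

def rbf_kernel(a, b, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2)"""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def sigmoid_kernel(a, b, gamma=1.0, r=0.0):
    """K(a, b) = tanh(gamma * a^T b + r)"""
    return np.tanh(gamma * (a @ b) + r)

# Two hypothetical 2-D points
a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])
k_lin = linear_kernel(a, b)
k_poly = polynomial_kernel(a, b, d=2)
k_rbf_same = rbf_kernel(a, a)
```

In practice scikit-learn's `SVC` selects among these through its `kernel` argument, so hand-written versions like the ones above are mainly useful for understanding the formulas.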
Trees
From trees to forests.
Decision Tree 
Decision Trees 
CART cost function for regression 
 
where
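A minimal sketch of the CART regression cost for one candidate split, assuming the usual form: the MSE of the left and right subsets, each weighted by the fraction of instances it contains, where each node predicts the mean of its targets. The function names and toy data are hypothetical, and both sides of the split are assumed to be non-empty.

```python
import numpy as np

def node_mse(y_node):
    """MSE of a node that predicts the mean target of its instances."""
    return np.mean((y_node - np.mean(y_node)) ** 2)

def cart_regression_cost(feature, y, threshold):
    """Weighted MSE of the two subsets produced by feature <= threshold."""
    left = y[feature <= threshold]
    right = y[feature > threshold]
    m = y.size
    return left.size / m * node_mse(left) + right.size / m * node_mse(right)

# Toy data with an obvious break between x = 3 and x = 10
feature_demo = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y_demo = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
cost_good = cart_regression_cost(feature_demo, y_demo, threshold=5.0)
cost_bad = cart_regression_cost(feature_demo, y_demo, threshold=1.5)
```

The training algorithm searches over features and thresholds for the split with the lowest cost, so here the threshold 5.0 would be preferred over 1.5.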
DecisionTreeClassifier

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and width
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(X, y)
```
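Random Forests
In my view, random forests (RF) are the classic example of ensemble learning.
Take classifiers as an example: given the same data, different classifiers may reach different decisions.
Logistic Regression classifier, Random Forest classifier, K-Nearest Neighbors classifier
Naturally, a voting strategy can be introduced to make the final decision.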
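As a rough illustration of the voting idea, the sketch below implements hard (majority) voting over the label predictions of several classifiers. The function name and the example predictions are hypothetical, labels are assumed to be non-negative integers, and ties go to the smallest label.

```python
import numpy as np

def hard_vote(predictions):
    """Majority vote per sample.

    predictions: array-like of shape (n_classifiers, n_samples)
    holding non-negative integer class labels.
    """
    predictions = np.asarray(predictions)
    n_classes = predictions.max() + 1
    # For each sample (one column), count the votes per class and keep the winner
    return np.array([
        np.bincount(column, minlength=n_classes).argmax()
        for column in predictions.T
    ])

# Predictions of three classifiers on four samples (made-up values)
preds_demo = [
    [0, 1, 1, 2],
    [0, 1, 0, 2],
    [1, 1, 0, 0],
]
votes_demo = hard_vote(preds_demo)
```

Soft voting would instead average the predicted class probabilities and pick the class with the highest mean, which requires every classifier to expose probability estimates.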
Voting classifier

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

log_clf = LogisticRegression()
rnd_clf = RandomForestClassifier()
svm_clf = SVC()

voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('rf', rnd_clf), ('svc', svm_clf)],
    voting='hard')
# X_train and y_train are the training split, assumed to be defined already
voting_clf.fit(X_train, y_train)
```
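Boosting
AdaBoost
Gradient Boosting
Performance Metrics
Metrics determine the direction in which a model converges; several exist for both continuous (regression) and discrete (classification) models.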
Classification
$F_1$ is the harmonic mean of precision and recall.
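To make the harmonic mean concrete, here is a small sketch that computes precision, recall, and $F_1$ from confusion-matrix counts, using the standard definitions precision $= TP / (TP + FP)$ and recall $= TP / (TP + FN)$. The function name and the counts are made up for illustration, and the denominators are assumed to be non-zero.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean: dominated by the smaller of the two values
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives
p_demo, r_demo, f1_demo = precision_recall_f1(tp=8, fp=2, fn=8)
```

With these counts precision is 0.8 and recall is 0.5, and $F_1$ comes out below their arithmetic mean of 0.65, which is why a classifier needs both to be high to score well.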
precision_score and recall_score

```python
from sklearn.metrics import precision_score, recall_score
```
Regression
Original: https://www.cnblogs.com/lijianming180/p/12037887.html