Sparse Multinomial Logistic Regression Algorithm Based on Centered Alignment Multiple Kernels Learning
doi: 10.11999/JEIT190426 cstr: 32379.14.JEIT190426
1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Institute of Web Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract: As a generalized linear model, Sparse Multinomial Logistic Regression (SMLR) is widely used in multi-class classification tasks. SMLR introduces a Laplace prior into Multinomial Logistic Regression (MLR) to make its solution sparse, which allows the classifier to embed feature selection in the classification process. To enable the classifier to handle nonlinear data, this paper kernelizes SMLR via the kernel trick, yielding Kernel Sparse Multinomial Logistic Regression (KSMLR). KSMLR maps nonlinear feature data through a kernel function into a high-dimensional, even infinite-dimensional, feature space, so that the features can be expressed adequately and ultimately classified effectively. In addition, a multiple kernel learning algorithm based on centered alignment is employed: different kernel functions map the data into feature spaces of different dimensions, and the centered-alignment similarity is used to select the multiple kernel weight coefficients flexibly, giving the classifier better generalization ability. Experimental results show that the proposed sparse multinomial logistic regression algorithm based on centered alignment multiple kernel learning outperforms conventional classification algorithms in classification accuracy.
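To make the centered-alignment weighting concrete, the following NumPy sketch scores each base kernel by its centered alignment with an ideal label kernel and normalizes the scores into weights. The function names, the one-hot target kernel Y Y^T, and the simple normalize-the-alignments heuristic are illustrative assumptions, not the paper's exact Align formulation.

```python
import numpy as np

def center_kernel(K):
    """Center K in feature space: Kc = H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def centered_alignment(K1, K2):
    """Centered-alignment similarity <K1c, K2c>_F / (||K1c||_F ||K2c||_F)."""
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def alignment_weights(kernels, Y):
    """Illustrative heuristic: weight each base kernel by its centered
    alignment with the ideal kernel Y Y^T (Y: one-hot labels, n x k),
    then normalize the nonnegative scores to sum to one."""
    K_target = Y @ Y.T
    scores = np.array([centered_alignment(K, K_target) for K in kernels])
    mu = np.maximum(scores, 0.0)
    return mu / mu.sum()

# The combined kernel is then the weighted sum of the p base kernels:
# K_combined = sum(m * K for m, K in zip(mu, kernels))
```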
Algorithm 1: Backtracking ISTA for the KSMLR problem
Input: initial step size $ \tau = 1/L $, $ L > 0 $; initial parameters $ {\alpha} \in {R}^{n \times k} $; initial kernel parameter $ \sigma = 2 $; maximum number of iterations $ \mathrm{Iter} = 500 $; backtracking parameter $ \beta \in (0, 1) $
Output: final parameters $ {\alpha}^{t+1} $
Step 1: Compute the kernel matrix $ {K} $ from the samples $ {{X}}^{\left(i\right)} $;
Step 2: Initialize the counter $ t \leftarrow 0 $;
Step 3: Initialize the parameters $ {{\alpha}}^{t} \leftarrow {\alpha} $;
Step 4: $ {{\alpha}}^{t+1} = {p}_{\tau}\left({{\alpha}}^{t}\right) $;
Step 5: $ \tau = \beta\tau $;
Step 6: If $l\left( {{{{\alpha}} ^{t + 1}}} \right) \le \hat l\left( {{{{\alpha}} ^{t + 1}},{{{\alpha}} ^t}} \right)$ holds or the maximum number of iterations is reached, terminate and go to Step 7; otherwise set $ t \leftarrow t + 1 $ and return to Step 4;
Step 7: Return the updated parameters $ {{\alpha}}^{t+1} $.
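A minimal Python sketch of the backtracking ISTA iteration above follows, assuming an $\ell_1$-regularized objective whose proximal map $p_\tau$ is elementwise soft-thresholding; `loss`, `grad`, and the stopping tolerance are placeholders for the KSMLR multinomial logistic loss and its gradient, not the paper's implementation.

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal map of t * ||.||_1: elementwise soft-thresholding."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def ista_backtracking(loss, grad, alpha0, lam, tau=1.0, beta=0.5, max_iter=500):
    """Backtracking ISTA: proximal gradient steps, shrinking tau by beta
    until l(alpha_new) <= l_hat(alpha_new, alpha), the quadratic majorizer."""
    alpha = alpha0
    for _ in range(max_iter):
        while True:
            g = grad(alpha)
            alpha_new = soft_threshold(alpha - tau * g, tau * lam)  # p_tau(alpha)
            d = alpha_new - alpha
            l_hat = loss(alpha) + np.sum(g * d) + np.sum(d * d) / (2 * tau)
            if loss(alpha_new) <= l_hat:
                break      # majorization condition of Step 6 satisfied
            tau *= beta    # backtracking: shrink the step size (Step 5)
        if np.linalg.norm(alpha_new - alpha) < 1e-8:
            return alpha_new
        alpha = alpha_new
    return alpha
```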
Algorithm 2: Backtracking FISTA for the MKSMLR problem
Input: initial step size $ \tau = 1/L $, $ L > 0 $; initial parameters $ {\alpha} \in {R}^{n \times k} $; initial kernel parameter $ \sigma = 2 $; maximum number of iterations $ \mathrm{Iter} = 500 $; backtracking parameter $ \beta \in (0, 1) $
Output: final parameters $ {\alpha}^{t+1} $
Step 1: Compute $ p $ different kernel matrices from the samples $ {{X}}^{\left(i\right)} $;
Step 2: Compute the multiple kernel learning weights $ {\mu} $ with the centered-alignment (Align) method and form the combined kernel matrix $ {{K}}_{c\mu} $;
Step 3: Initialize the counter $ t \leftarrow 0 $;
Step 4: Initialize $ {{\alpha}}^{t} \leftarrow {\alpha} $, $ {\mu}^{t} \leftarrow 1 $, $ {v}^{t} \leftarrow {{\alpha}}^{t} $;
Step 5: $ {{\alpha}}^{t+1} = {p}_{\tau}\left({v}^{t}\right) $;
Step 6: $ {\mu}^{t+1} = \dfrac{1 + \sqrt{1 + 4({\mu}^{t}{)}^{2}}}{2} $;
Step 7: $ {v}^{t+1} = {{\alpha}}^{t+1} + \dfrac{{\mu}^{t} - 1}{{\mu}^{t+1}}({{\alpha}}^{t+1} - {{\alpha}}^{t}) $;
Step 8: $ \tau = \beta\tau $;
Step 9: If $l\left( {{\alpha ^{t + 1}}} \right) \le \hat l\left( {{\alpha ^{t + 1}},\;{\alpha ^t}} \right)$ holds or the maximum number of iterations is reached, terminate and go to Step 10; otherwise set $ t \leftarrow t + 1 $ and return to Step 5;
Step 10: Return the updated parameters $ {{\alpha}}^{t+1} $.
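For comparison, here is a minimal sketch of the backtracking FISTA loop of Algorithm 2, with the same soft-thresholding proximal map as in the ISTA sketch; the momentum sequence mirrors Steps 6 and 7, while placing the line search at the extrapolated point is the standard backtracking arrangement and is an assumption about where Step 8's shrinkage is applied.

```python
import numpy as np

def fista_backtracking(loss, grad, alpha0, lam, tau=1.0, beta=0.5, max_iter=500):
    """Backtracking FISTA: proximal step taken at the extrapolated point v,
    with the momentum sequence mu of Steps 6-7 (Beck & Teboulle, 2009)."""
    soft = lambda Z, t: np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)
    alpha, v, mu = alpha0, alpha0.copy(), 1.0
    for _ in range(max_iter):
        while True:  # line search at v: shrink tau until the majorizer holds
            g = grad(v)
            alpha_new = soft(v - tau * g, tau * lam)  # p_tau(v), Step 5
            d = alpha_new - v
            if loss(alpha_new) <= loss(v) + np.sum(g * d) + np.sum(d * d) / (2 * tau):
                break
            tau *= beta                                              # Step 8
        mu_new = (1.0 + np.sqrt(1.0 + 4.0 * mu**2)) / 2.0            # Step 6
        v = alpha_new + ((mu - 1.0) / mu_new) * (alpha_new - alpha)  # Step 7
        alpha, mu = alpha_new, mu_new
    return alpha
```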
Table 1  Classification accuracy

Dataset        SVM      SLR      WDMLR    SML-ISTA  SML-FISTA  KSMLR    MKSMLR
Banana         0.9069   –        –        –         –          0.9069   0.9107
COIL20         0.8032   0.9676   0.9832   0.9895    0.9958     0.9977   1
ORL            0.9507   0.9420   0.9545   0.9242    0.9545     0.9000   0.9167
GT-32          –        –        0.7823   0.7580    0.7621     0.8044   0.8044
MNIST-S        0.9113   0.9001   0.9109   0.9036    0.9048     0.9360   0.9400
Lung           0.7705   0.9344   0.9104   0.9104    0.9254     0.9180   0.9344
Indian-pines   0.7980   0.8182   0.7599   0.8120    0.8120     0.8218   0.8237
Segment        0.5989   0.9235   0.8268   0.8925    0.9253     0.9538   0.9567

Note: "–" indicates that the method failed to classify correctly or performed close to random guessing.
Table 2  Algorithm running time (s)

Dataset        SML-ISTA  SML-FISTA  KSMLR   MKSMLR
Banana         –         –          0.78    1.19
COIL20         1.71      0.39       7.61    13.46
ORL            142.05    7.5        10.43   2.73
GT-32          88.19     2.03       37.94   10.77
MNIST-S        0.12      0.14       0.14    22.98
Lung           42.71     1.4        2.12    3.08
Indian-pines   427.62    18.58      68.31   909.1
Segment        21.33     20.71      13.68   33.35

Note: "–" indicates that the method failed to classify correctly or performed close to random guessing.