Sparse Multinomial Logistic Regression Algorithm Based on Centered Alignment Multiple Kernels Learning
doi: 10.11999/JEIT190426 cstr: 32379.14.JEIT190426
1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Institute of Web Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract: As a generalized linear model, Sparse Multinomial Logistic Regression (SMLR) is widely used in multi-class classification tasks. SMLR introduces a Laplace prior into Multinomial Logistic Regression (MLR) to make its solution sparse, which allows the classifier to embed feature selection in the classification process. To enable the classifier to handle nonlinear data, this paper kernelizes SMLR via the kernel trick, yielding Kernel Sparse Multinomial Logistic Regression (KSMLR). KSMLR maps nonlinear feature data through a kernel function into a high-dimensional, even infinite-dimensional, feature space, so that the features can be expressed adequately and ultimately classified effectively. In addition, a multiple kernel learning algorithm based on centered alignment is employed: different kernel functions map the data into feature spaces of different dimensions, and the centered-alignment similarity is used to select the multiple kernel weight coefficients flexibly, giving the classifier better generalization ability. Experimental results show that the proposed sparse multinomial logistic regression algorithm based on centered alignment multiple kernel learning outperforms conventional classification algorithms in classification accuracy.
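To make the centered-alignment weighting concrete, the following NumPy sketch scores each base kernel by its centered alignment with an ideal label kernel and normalizes the scores into weights. The function names, the one-hot target kernel Y Y^T, and the simple normalize-the-alignments heuristic are illustrative assumptions, not the paper's exact Align formulation.

```python
import numpy as np

def center_kernel(K):
    """Center K in feature space: Kc = H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def centered_alignment(K1, K2):
    """Centered-alignment similarity <K1c, K2c>_F / (||K1c||_F ||K2c||_F)."""
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def alignment_weights(kernels, Y):
    """Illustrative heuristic: weight each base kernel by its centered
    alignment with the ideal kernel Y Y^T (Y: one-hot labels, n x k),
    then normalize the nonnegative scores to sum to one."""
    K_target = Y @ Y.T
    scores = np.array([centered_alignment(K, K_target) for K in kernels])
    mu = np.maximum(scores, 0.0)
    return mu / mu.sum()

# The combined kernel is then the weighted sum of the p base kernels:
# K_combined = sum(m * K for m, K in zip(mu, kernels))
```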
Algorithm 1: Backtracking ISTA for the KSMLR problem
Input: initial step size $ \tau = 1/L $, $ L > 0 $; initial parameters $ {\alpha} \in {R}^{n \times k} $; initial kernel parameter $ \sigma = 2 $; maximum number of iterations $ \mathrm{Iter} = 500 $; backtracking parameter $ \beta \in (0, 1) $
Output: final parameters $ {\alpha}^{t+1} $
Step 1: Compute the kernel matrix $ {K} $ from the samples $ {{X}}^{\left(i\right)} $;
Step 2: Initialize the counter $ t \leftarrow 0 $;
Step 3: Initialize the parameters $ {{\alpha}}^{t} \leftarrow {\alpha} $;
Step 4: $ {{\alpha}}^{t+1} = {p}_{\tau}\left({{\alpha}}^{t}\right) $;
Step 5: $ \tau = \beta\tau $;
Step 6: If $l\left( {{{{\alpha}} ^{t + 1}}} \right) \le \hat l\left( {{{{\alpha}} ^{t + 1}},{{{\alpha}} ^t}} \right)$ holds or the maximum number of iterations is reached, terminate and go to Step 7; otherwise set $ t \leftarrow t + 1 $ and return to Step 4;
Step 7: Return the updated parameters $ {{\alpha}}^{t+1} $.
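A minimal Python sketch of the backtracking ISTA iteration above follows, assuming an $\ell_1$-regularized objective whose proximal map $p_\tau$ is elementwise soft-thresholding; `loss`, `grad`, and the stopping tolerance are placeholders for the KSMLR multinomial logistic loss and its gradient, not the paper's implementation.

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal map of t * ||.||_1: elementwise soft-thresholding."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def ista_backtracking(loss, grad, alpha0, lam, tau=1.0, beta=0.5, max_iter=500):
    """Backtracking ISTA: proximal gradient steps, shrinking tau by beta
    until l(alpha_new) <= l_hat(alpha_new, alpha), the quadratic majorizer."""
    alpha = alpha0
    for _ in range(max_iter):
        while True:
            g = grad(alpha)
            alpha_new = soft_threshold(alpha - tau * g, tau * lam)  # p_tau(alpha)
            d = alpha_new - alpha
            l_hat = loss(alpha) + np.sum(g * d) + np.sum(d * d) / (2 * tau)
            if loss(alpha_new) <= l_hat:
                break      # majorization condition of Step 6 satisfied
            tau *= beta    # backtracking: shrink the step size (Step 5)
        if np.linalg.norm(alpha_new - alpha) < 1e-8:
            return alpha_new
        alpha = alpha_new
    return alpha
```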
Algorithm 2: Backtracking FISTA for the MKSMLR problem
Input: initial step size $ \tau = 1/L $, $ L > 0 $; initial parameters $ {\alpha} \in {R}^{n \times k} $; initial kernel parameter $ \sigma = 2 $; maximum number of iterations $ \mathrm{Iter} = 500 $; backtracking parameter $ \beta \in (0, 1) $
Output: final parameters $ {\alpha}^{t+1} $
Step 1: Compute $ p $ different kernel matrices from the samples $ {{X}}^{\left(i\right)} $;
Step 2: Compute the multiple kernel learning weights $ {\mu} $ with the centered-alignment (Align) method and form the combined kernel matrix $ {{K}}_{c\mu} $;
Step 3: Initialize the counter $ t \leftarrow 0 $;
Step 4: Initialize $ {{\alpha}}^{t} \leftarrow {\alpha} $, $ {\mu}^{t} \leftarrow 1 $, $ {v}^{t} \leftarrow {{\alpha}}^{t} $;
Step 5: $ {{\alpha}}^{t+1} = {p}_{\tau}\left({v}^{t}\right) $;
Step 6: $ {\mu}^{t+1} = \dfrac{1 + \sqrt{1 + 4({\mu}^{t}{)}^{2}}}{2} $;
Step 7: $ {v}^{t+1} = {{\alpha}}^{t+1} + \dfrac{{\mu}^{t} - 1}{{\mu}^{t+1}}({{\alpha}}^{t+1} - {{\alpha}}^{t}) $;
Step 8: $ \tau = \beta\tau $;
Step 9: If $l\left( {{\alpha ^{t + 1}}} \right) \le \hat l\left( {{\alpha ^{t + 1}},\;{\alpha ^t}} \right)$ holds or the maximum number of iterations is reached, terminate and go to Step 10; otherwise set $ t \leftarrow t + 1 $ and return to Step 5;
Step 10: Return the updated parameters $ {{\alpha}}^{t+1} $.
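For comparison, here is a minimal sketch of the backtracking FISTA loop of Algorithm 2, with the same soft-thresholding proximal map as in the ISTA sketch; the momentum sequence mirrors Steps 6 and 7, while placing the line search at the extrapolated point is the standard backtracking arrangement and is an assumption about where Step 8's shrinkage is applied.

```python
import numpy as np

def fista_backtracking(loss, grad, alpha0, lam, tau=1.0, beta=0.5, max_iter=500):
    """Backtracking FISTA: proximal step taken at the extrapolated point v,
    with the momentum sequence mu of Steps 6-7 (Beck & Teboulle, 2009)."""
    soft = lambda Z, t: np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)
    alpha, v, mu = alpha0, alpha0.copy(), 1.0
    for _ in range(max_iter):
        while True:  # line search at v: shrink tau until the majorizer holds
            g = grad(v)
            alpha_new = soft(v - tau * g, tau * lam)  # p_tau(v), Step 5
            d = alpha_new - v
            if loss(alpha_new) <= loss(v) + np.sum(g * d) + np.sum(d * d) / (2 * tau):
                break
            tau *= beta                                              # Step 8
        mu_new = (1.0 + np.sqrt(1.0 + 4.0 * mu**2)) / 2.0            # Step 6
        v = alpha_new + ((mu - 1.0) / mu_new) * (alpha_new - alpha)  # Step 7
        alpha, mu = alpha_new, mu_new
    return alpha
```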
Table 1  Classification accuracy

Dataset        SVM      SLR      WDMLR    SML-ISTA  SML-FISTA  KSMLR    MKSMLR
Banana         0.9069   –        –        –         –          0.9069   0.9107
COIL20         0.8032   0.9676   0.9832   0.9895    0.9958     0.9977   1
ORL            0.9507   0.9420   0.9545   0.9242    0.9545     0.9000   0.9167
GT-32          –        –        0.7823   0.7580    0.7621     0.8044   0.8044
MNIST-S        0.9113   0.9001   0.9109   0.9036    0.9048     0.9360   0.9400
Lung           0.7705   0.9344   0.9104   0.9104    0.9254     0.9180   0.9344
Indian-pines   0.7980   0.8182   0.7599   0.8120    0.8120     0.8218   0.8237
Segment        0.5989   0.9235   0.8268   0.8925    0.9253     0.9538   0.9567

Note: "–" indicates that the method failed to classify correctly or performed close to random guessing.
Table 2  Algorithm running time (s)

Dataset        SML-ISTA  SML-FISTA  KSMLR   MKSMLR
Banana         –         –          0.78    1.19
COIL20         1.71      0.39       7.61    13.46
ORL            142.05    7.5        10.43   2.73
GT-32          88.19     2.03       37.94   10.77
MNIST-S        0.12      0.14       0.14    22.98
Lung           42.71     1.4        2.12    3.08
Indian-pines   427.62    18.58      68.31   909.1
Segment        21.33     20.71      13.68   33.35

Note: "–" indicates that the method failed to classify correctly or performed close to random guessing.