

Recent Advances in Zero-Shot Learning

Hong LAN, Zhiyu FANG

Citation: Hong LAN, Zhiyu FANG. Recent Advances in Zero-Shot Learning[J]. Journal of Electronics & Information Technology, 2020, 42(5): 1188-1200. doi: 10.11999/JEIT190485


doi: 10.11999/JEIT190485 cstr: 32379.14.JEIT190485
Funds: The National Natural Science Foundation of China (61762046); The Natural Science Foundation of Jiangxi Province (20161BAB212048)
詳細(xì)信息
    作者簡介:

    蘭紅:女,1969年生,教授,碩士生導(dǎo)師,主要研究方向?yàn)橛?jì)算機(jī)視覺、圖像處理與模式識別

    方治嶼:男,1993年生,碩士生,研究方向?yàn)橛?jì)算機(jī)視覺與深度學(xué)習(xí)

    通訊作者:

    蘭紅 lanhong69@163.com

  • CLC number: TN911.73; TP391.41

  • Abstract:

    Deep learning has achieved remarkable success in artificial intelligence. In supervised recognition tasks, deep learning algorithms trained on massive labeled datasets reach unprecedented accuracy. However, annotating massive data is expensive, and collecting large numbers of samples for rare categories is difficult, so recognizing classes that are rarely or never seen during training remains a serious challenge. To address this problem, this paper reviews recent research on zero-shot image recognition, covering the research background, model analysis, datasets, and experimental analysis. The paper also analyzes the technical difficulties in current research, proposes solutions to the main problems, and discusses directions for future work, providing a reference for beginners and researchers in zero-shot learning.
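The problem setting can be made concrete with a minimal sketch (the class names, attribute vectors, toy features, and the fixed matrix `W` below are all invented for illustration; in a real system `W` is learned on seen classes): an image feature is projected into an attribute space shared by all classes, and the prediction is the unseen class whose attribute signature lies nearest, so no training image of that class is ever needed.

```python
import numpy as np

# Hypothetical class-attribute signatures (columns: "has stripes",
# "lives in water", "has hooves"). Both classes are unseen at training time.
attributes = {
    "zebra": np.array([1.0, 0.0, 1.0]),
    "whale": np.array([0.0, 1.0, 0.0]),
}

# A visual-semantic mapping would normally be learned from seen classes
# (e.g. by regression from CNN features to attributes). A fixed toy matrix
# keeps the example self-contained.
W = np.array([[0.9, 0.1, 0.8],
              [0.1, 0.9, 0.1]])

def predict(visual_feature: np.ndarray) -> str:
    """Project the feature into attribute space, return the nearest class."""
    semantic = visual_feature @ W  # visual space -> semantic space
    names = list(attributes)
    dists = [np.linalg.norm(semantic - attributes[n]) for n in names]
    return names[int(np.argmin(dists))]

# A striped, hoofed animal's (toy) feature lands near the zebra signature.
print(predict(np.array([1.0, 0.0])))  # -> zebra
```

In practice the visual features come from a pretrained deep network (VGG, GoogleNet, or ResNet; see Table 2) and the mapping is fit on the seen classes only.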

  • Figure 1  Structure of zero-shot learning techniques

    Figure 2  Illustration of zero-shot learning

    Figure 3  Classic inductive zero-shot model[7]

    Figure 4  AwA class-attribute relation matrix[7]

    Figure 5  Three types of visual-semantic mapping

    Figure 6  Example of domain shift[55]

    Figure 7  Example of the semantic gap

    Table 1  Comparison of machine learning paradigms

    | Paradigm | Training set $\{\mathcal{X},\mathcal{Y}\}$ | Test set $\{\mathcal{X},\mathcal{Z}\}$ | Relation $R$ between training classes $\mathcal{Y}$ and test classes $\mathcal{Z}$ | Final classifier $C$ |
    |---|---|---|---|---|
    | Unsupervised learning | Large amount of unlabeled images | Seen-class images | $\mathcal{Y} = \mathcal{Z}$ | $C:\mathcal{X} \to \mathcal{Y}$ |
    | Supervised learning | Large amount of labeled images | Seen-class images | $\mathcal{Y} = \mathcal{Z}$ | $C:\mathcal{X} \to \mathcal{Y}$ |
    | Semi-supervised learning | Few labeled and many unlabeled images | Seen-class images | $\mathcal{Y} = \mathcal{Z}$ | $C:\mathcal{X} \to \mathcal{Y}$ |
    | Few-shot learning | Very few labeled and many unlabeled images | Seen-class images | $\mathcal{Y} = \mathcal{Z}$ | $C:\mathcal{X} \to \mathcal{Y}$ |
    | Zero-shot learning | Large amount of labeled images | Unseen-class images | $\mathcal{Y} \cap \mathcal{Z} = \varnothing$ | $C:\mathcal{X} \to \mathcal{Z}$ |
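The zero-shot row of the table, a classifier $C:\mathcal{X} \to \mathcal{Z}$ trained only on labels from $\mathcal{Y}$ with $\mathcal{Y} \cap \mathcal{Z} = \varnothing$, can be simulated end to end. The sketch below uses synthetic data and ridge regression as the visual-to-semantic map (one common baseline choice, not the only one): the map is fit on two seen classes and then recognizes samples of classes it never saw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Attribute signatures: seen classes Y = {a, b}, unseen classes Z = {c, d}.
sig_seen = np.array([[1.0, 0.0], [0.0, 1.0]])    # classes a, b
sig_unseen = np.array([[1.0, 1.0], [0.0, 0.0]])  # classes c, d (never trained on)

# Synthetic visual features: each sample is its class signature plus noise.
labels = [0, 1] * 50
X_train = np.vstack([sig_seen[y] + 0.1 * rng.standard_normal(2) for y in labels])
S_train = np.vstack([sig_seen[y] for y in labels])

# Ridge regression from features to attributes:
# W = (X^T X + lam * I)^(-1) X^T S
lam = 1e-2
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(2), X_train.T @ S_train)

# Classify a sample of unseen class c: nearest attribute signature in Z.
x_test = sig_unseen[0] + 0.1 * rng.standard_normal(2)
pred = int(np.argmin(np.linalg.norm(x_test @ W - sig_unseen, axis=1)))
print(pred)  # -> 0, i.e. class c
```

The only supervision about the unseen classes is their attribute signatures, which is exactly what lets the classifier's output space $\mathcal{Z}$ differ from its training label space $\mathcal{Y}$.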

    Table 2  Use of deep convolutional networks in zero-shot learning papers

    | Network | Number of papers |
    |---|---|
    | VGG | 501 |
    | GoogleNet | 271 |
    | ResNet | 397 |

    Table 3  Performance comparison of zero-shot learning methods (%)

    For conventional zero-shot learning, SS and PS denote the standard and proposed splits; for generalized zero-shot learning, u, s, and H denote per-class accuracy on unseen classes, on seen classes, and their harmonic mean.

    | Method | AwA SS | AwA PS | CUB SS | CUB PS | SUN SS | SUN PS | AwA u | AwA s | AwA H | CUB u | CUB s | CUB H | SUN u | SUN s | SUN H |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | IAP | 46.9 | 35.9 | 27.1 | 24.0 | 17.4 | 19.4 | 0.9 | 87.6 | 1.8 | 0.2 | 72.8 | 0.4 | 1.0 | 37.8 | 1.8 |
    | DAP | 58.7 | 46.1 | 37.5 | 40.0 | 38.9 | 39.9 | 0.0 | 84.7 | 0.0 | 1.7 | 67.9 | 3.3 | 4.2 | 25.1 | 7.2 |
    | DeViSE | 68.6 | 59.7 | 53.2 | 52.0 | 57.5 | 56.5 | 17.1 | 74.7 | 27.8 | 23.8 | 53.0 | 32.8 | 16.9 | 27.4 | 20.9 |
    | ConSE | 67.9 | 44.5 | 36.7 | 34.3 | 44.2 | 38.8 | 0.5 | 90.6 | 1.0 | 1.6 | 72.2 | 3.1 | 6.8 | 39.9 | 11.6 |
    | SJE | 69.5 | 61.9 | 55.3 | 53.9 | 57.1 | 53.7 | 8.0 | 73.9 | 14.4 | 23.5 | 59.2 | 33.6 | 14.7 | 30.5 | 19.8 |
    | SAE | 80.7 | 54.1 | 33.4 | 33.3 | 42.4 | 40.3 | 1.1 | 82.2 | 2.2 | 7.8 | 54.0 | 13.6 | 8.8 | 18.0 | 11.8 |
    | SYNC | 71.2 | 46.6 | 54.1 | 55.6 | 59.1 | 56.3 | 10.0 | 90.5 | 18.0 | 11.5 | 70.9 | 19.8 | 7.9 | 43.3 | 13.4 |
    | LDF | 83.4 | 70.4 | – | – | – | – | – | – | – | – | – | – | – | – | – |
    | SP-AEN | – | 58.5 | – | 55.4 | – | 59.2 | 23.3 | 90.9 | 37.1 | 34.7 | 70.6 | 46.6 | 24.9 | 38.6 | 30.3 |
    | QFSL | 84.8 | 79.7 | 69.7 | 72.1 | 61.7 | 58.3 | 66.2 | 93.1 | 77.4 | 71.5 | 74.9 | 73.2 | 51.3 | 31.2 | 38.8 |
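In the generalized zero-shot half of Table 3, the last column per dataset is the harmonic mean H of the unseen-class accuracy u and the seen-class accuracy s, the evaluation protocol of Xian et al. cited in the references. H can be recomputed directly from the other two columns:

```python
def harmonic_mean(u: float, s: float) -> float:
    """Harmonic mean H = 2*u*s / (u + s) used to score generalized ZSL."""
    if u + s == 0:
        return 0.0
    return 2 * u * s / (u + s)

# IAP on AwA (Table 3): u = 0.9, s = 87.6  ->  H = 1.8
print(round(harmonic_mean(0.9, 87.6), 1))
```

The harmonic mean is used rather than the arithmetic mean because it collapses toward zero when either accuracy is near zero, which penalizes methods that classify almost everything into the seen classes (e.g. the IAP and DAP rows).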
  • SUN Yi, CHEN Yuheng, WANG Xiaogang, et al. Deep learning face representation by joint identification-verification[C]. The 27th International Conference on Neural Information Processing Systems, Montreal, Canada, 2014: 1988–1996.
    LIU Chenxi, ZOPH B, NEUMANN M, et al. Progressive neural architecture search[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 19–35.
    LEDIG C, THEIS L, HUSZÁR F, et al. Photo-realistic single image super-resolution using a generative adversarial network[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 105–114.
    BIEDERMAN I. Recognition-by-components: A theory of human image understanding[J]. Psychological Review, 1987, 94(2): 115–147. doi: 10.1037/0033-295X.94.2.115
    LAROCHELLE H, ERHAN D, and BENGIO Y. Zero-data learning of new tasks[C]. The 23rd National Conference on Artificial Intelligence, Chicago, USA, 2008: 646–651.
    PALATUCCI M, POMERLEAU D, HINTON G, et al. Zero-shot learning with semantic output codes[C]. The 22nd International Conference on Neural Information Processing Systems, Vancouver, Canada, 2009: 1410–1418.
    LAMPERT C H, NICKISCH H, and HARMELING S. Learning to detect unseen object classes by between-class attribute transfer[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 951–958. doi: 10.1109/CVPR.2009.5206594.
    HARRINGTON P. Machine Learning in Action[M]. Greenwich, CT, USA: Manning Publications Co, 2012: 5–14.
    ZHOU Dengyong, BOUSQUET O, LAL T N, et al. Learning with local and global consistency[C]. The 16th International Conference on Neural Information Processing Systems, Whistler, Canada, 2003: 321–328.
    LIU Jianwei, LIU Yuan, and LUO Xionglin. Semi-supervised learning methods[J]. Chinese Journal of Computers, 2015, 38(8): 1592–1617. doi: 10.11897/SP.J.1016.2015.01592 (in Chinese)
    SUNG F, YANG Yongxin, ZHANG Li, et al. Learning to compare: Relation network for few-shot learning[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1199–1208.
    FU Yanwei, XIANG Tao, JIANG Yugang, et al. Recent advances in zero-shot recognition: Toward data-efficient understanding of visual content[J]. IEEE Signal Processing Magazine, 2018, 35(1): 112–125. doi: 10.1109/MSP.2017.2763441
    XIAN Yongqin, LAMPERT C H, SCHIELE B, et al. Zero-shot learning—A comprehensive evaluation of the good, the bad and the ugly[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(9): 2251–2265. doi: 10.1109/TPAMI.2018.2857768
    WANG Wenlin, PU Yunchen, VERMA V K, et al. Zero-shot learning via class-conditioned deep generative models[C]. The 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, 2018: 4211–4218.
    FU Yanwei, HOSPEDALES T M, XIANG Tao, et al. Attribute learning for understanding unstructured social activity[C]. The 12th European Conference on Computer Vision, Florence, Italy, 2012: 530–543.
    ANTOL S, ZITNICK C L, and PARIKH D. Zero-shot learning via visual abstraction[C]. The 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 401–416.
    ROBYNS P, MARIN E, LAMOTTE W, et al. Physical-layer fingerprinting of LoRa devices using supervised and zero-shot learning[C]. The 10th ACM Conference on Security and Privacy in Wireless and Mobile Networks, Boston, USA, 2017: 58–63. doi: 10.1145/3098243.3098267.
    YANG Yang, LUO Yadan, CHEN Weilun, et al. Zero-shot hashing via transferring supervised knowledge[C]. The 24th ACM international conference on Multimedia, Amsterdam, The Netherlands, 2016: 1286–1295. doi: 10.1145/2964284.2964319.
    PACHORI S, DESHPANDE A, and RAMAN S. Hashing in the zero shot framework with domain adaptation[J]. Neurocomputing, 2018, 275: 2137–2149. doi: 10.1016/j.neucom.2017.10.061
    LIU Jingen, KUIPERS B, and SAVARESE S. Recognizing human actions by attributes[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Colorado, USA, 2011: 3337–3344.
    FU Yanwei, HOSPEDALES T M, XIANG Tao, et al. Learning multimodal latent attributes[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(2): 303–316. doi: 10.1109/TPAMI.2013.128
    JAIN M, VAN GEMERT J C, MENSINK T, et al. Objects2action: Classifying and localizing actions without any video example[C]. The IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4588–4596.
    XU Baohan, FU Yanwei, JIANG Yugang, et al. Video emotion recognition with transferred deep feature encodings[C]. The 2016 ACM on International Conference on Multimedia Retrieval, New York, USA, 2016: 15–22.
    JOHNSON M, SCHUSTER M, LE Q V, et al. Google’s multilingual neural machine translation system: Enabling zero-shot translation[J]. Transactions of the Association for Computational Linguistics, 2017, 5: 339–351. doi: 10.1162/tacl_a_00065
    PRATEEK VEERANNA S, JINSEOK N, ENELDO L M, et al. Using semantic similarity for multi-label zero-shot classification of text documents[C]. The 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, 2016: 423–428.
    DALAL N and TRIGGS B. Histograms of oriented gradients for human detection[C]. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 886–893.
    LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91–110. doi: 10.1023/B:VISI.0000029664.99615.94
    BAY H, ESS A, TUYTELAARS T, et al. Speeded-up robust features (SURF)[J]. Computer Vision and Image Understanding, 2008, 110(3): 346–359. doi: 10.1016/j.cviu.2007.09.014
    ROMERA-PAREDES B and TORR P H S. An embarrassingly simple approach to zero-shot learning[C]. The 32nd International Conference on International Conference on Machine Learning, Lille, France, 2015: 2152–2161.
    ZHANG Li, XIANG Tao, and GONG Shaogang. Learning a deep embedding model for zero-shot learning[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 3010–3019.
    LI Yan, ZHANG Junge, ZHANG Jianguo, et al. Discriminative learning of latent features for zero-shot recognition[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7463–7471.
    WANG Xiaolong, YE Yufei, and GUPTA A. Zero-shot recognition via semantic embeddings and knowledge graphs[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6857–6866.
    WAH C, BRANSON S, WELINDER P, et al. The caltech-UCSD birds-200-2011 dataset[R]. Technical Report CNS-TR-2010-001, 2011.
    MIKOLOV T, SUTSKEVER I, CHEN Kai, et al. Distributed representations of words and phrases and their compositionality[C]. The 26th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2013: 3111–3119.
    LEE C, FANG Wei, YEH C K, et al. Multi-label zero-shot learning with structured knowledge graphs[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1576–1585.
    JETLEY S, ROMERA-PAREDES B, JAYASUMANA S, et al. Prototypical priors: From improving classification to zero-shot learning[J]. arXiv preprint arXiv:1512.01192, 2015.
    KARESSLI N, AKATA Z, SCHIELE B, et al. Gaze embeddings for zero-shot image classification[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6412–6421.
    REED S, AKATA Z, LEE H, et al. Learning deep representations of fine-grained visual descriptions[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 49–58.
    ELHOSEINY M, ZHU Yizhe, ZHANG Han, et al. Link the head to the "beak": Zero shot learning from noisy text description at part precision[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6288–6297. doi: 10.1109/CVPR.2017.666.
    LAZARIDOU A, DINU G, and BARONI M. Hubness and pollution: Delving into cross-space mapping for zero-shot learning[C]. The 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 2015: 270–280.
    WANG Xiaoyang and JI Qiang. A unified probabilistic approach modeling relationships between attributes and objects[C]. The IEEE International Conference on Computer Vision, Sydney, Australia, 2013: 2120–2127.
    AKATA Z, PERRONNIN F, HARCHAOUI Z, et al. Label-embedding for attribute-based classification[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Portland, USA, 2013: 819–826.
    JURIE F, BUCHER M, and HERBIN S. Generating visual representations for zero-shot classification[C]. The IEEE International Conference on Computer Vision Workshops, Venice, Italy, 2017: 2666–2673.
    FARHADI A, ENDRES I, HOIEM D, et al. Describing objects by their attributes[C]. 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 1778–1785. doi: 10.1109/CVPR.2009.5206772.
    PATTERSON G, XU Chen, SU Hang, et al. The SUN attribute database: Beyond categories for deeper scene understanding[J]. International Journal of Computer Vision, 2014, 108(1/2): 59–81.
    XIAO Jianxiong, HAYS J, EHINGER K A, et al. SUN database: Large-scale scene recognition from abbey to zoo[C]. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, USA, 2010: 3485–3492. doi: 10.1109/CVPR.2010.5539970.
    NILSBACK M E and ZISSERMAN A. Delving deeper into the whorl of flower segmentation[J]. Image and Vision Computing, 2010, 28(6): 1049–1062. doi: 10.1016/j.imavis.2009.10.001
    NILSBACK M E and ZISSERMAN A. A visual vocabulary for flower classification[C]. 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, USA, 2006: 1447–1454. doi: 10.1109/CVPR.2006.42.
    NILSBACK M E and ZISSERMAN A. Automated flower classification over a large number of classes[C]. The 6th Indian Conference on Computer Vision, Graphics & Image Processing, Bhubaneswar, India, 2008: 722–729. doi: 10.1109/ICVGIP.2008.47.
    KHOSLA A, JAYADEVAPRAKASH N, YAO Bangpeng, et al. Novel dataset for fine-grained image categorization: Stanford dogs[C]. CVPR Workshop on Fine-Grained Visual Categorization, 2011.
    DENG Jia, DONG Wei, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]. 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 248–255.
    CHAO Weilun, CHANGPINYO S, GONG Boqing, et al. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 52–68.
    SONG Jie, SHEN Chengchao, YANG Yezhou, et al. Transductive unbiased embedding for zero-shot learning[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1024–1033.
    LI Yanan. Research on key technologies for zero-shot learning[D]. Ph.D. dissertation, Zhejiang University, 2018: 40–43 (in Chinese)
    FU Yanwei, HOSPEDALES T M, XIANG Tao, et al. Transductive multi-view zero-shot learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(11): 2332–2345. doi: 10.1109/TPAMI.2015.2408354
    KODIROV E, XIANG Tao, and GONG Shaogang. Semantic autoencoder for zero-shot learning[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 4447–4456.
    STOCK M, PAHIKKALA T, AIROLA A, et al. A comparative study of pairwise learning methods based on kernel ridge regression[J]. Neural Computation, 2018, 30(8): 2245–2283. doi: 10.1162/neco_a_01096
    ANNADANI Y and BISWAS S. Preserving semantic relations for zero-shot learning[J]. arXiv preprint arXiv:1803.03049, 2018.
    LI Yanan, WANG Donghui, HU Huanhang, et al. Zero-shot recognition using dual visual-semantic mapping paths[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 5207–5215.
    CHEN Long, ZHANG Hanwang, XIAO Jun, et al. Zero-shot visual recognition using semantics-preserving adversarial embedding networks[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1043–1052.
Figures (7) / Tables (3)
計(jì)量
  • 文章訪問數(shù):  8015
  • HTML全文瀏覽量:  3708
  • PDF下載量:  525
  • 被引次數(shù): 0
Publication history
  • Received: 2019-07-01
  • Revised: 2019-11-03
  • Available online: 2019-11-13
  • Issue published: 2020-06-04
