
Review of Sign Language Recognition Based on Deep Learning

Shujun ZHANG, Qun ZHANG, Hui LI

Citation: Shujun ZHANG, Qun ZHANG, Hui LI. Review of Sign Language Recognition Based on Deep Learning[J]. Journal of Electronics & Information Technology, 2020, 42(4): 1021-1032. doi: 10.11999/JEIT190416

Review of Sign Language Recognition Based on Deep Learning

doi: 10.11999/JEIT190416 cstr: 32379.14.JEIT190416
基金項(xiàng)目: 國家自然科學(xué)基金(61702295, 61672305),山東省重點(diǎn)研發(fā)計(jì)劃項(xiàng)目(2017GGX10127)
詳細(xì)信息
    作者簡介:

    張淑軍:女,1980年生,副教授,研究方向?yàn)橛?jì)算機(jī)視覺

    張群:女,1994年生,碩士生,研究方向?yàn)橛?jì)算機(jī)視覺

    李輝:男,1984年生,副教授,研究方向?yàn)橛?jì)算機(jī)視覺

    通訊作者:

    張淑軍 lindazsj@163.com

  • 中圖分類號: TP391

Review of Sign Language Recognition Based on Deep Learning

Funds: The National Natural Science Foundation of China (61702295, 61672305), The Key Research & Development Plan Project of Shandong Province (2017GGX10127)
  • Abstract:

    Sign language recognition involves computer vision, pattern recognition, human-computer interaction, and related fields, and has significant research value and application prospects. The rapid development of deep learning has brought new opportunities for more accurate, real-time sign language recognition. This paper surveys recent deep-learning-based sign language recognition techniques, presenting a detailed description and analysis of the algorithms along two branches: isolated words and continuous sentences. Isolated-word recognition techniques are divided into methods based on three architectures: convolutional neural networks (CNN), 3D convolutional neural networks (3D-CNN), and recurrent neural networks (RNN). The models used for continuous-sentence recognition are more complex and usually require an auxiliary long-term temporal modeling algorithm; by backbone structure they fall into bidirectional long short-term memory (BLSTM) models, 3D convolutional models, and hybrid models. Commonly used sign language datasets in China and abroad are summarized, and the research challenges and development trends of sign language recognition are discussed; robustness and practical deployment under the premise of high accuracy remain to be advanced.

  • Figure 1  Overall taxonomy

    Figure 2  Sample data from the RWTH German sign language corpus

    Figure 3  Sample data from the CSL Chinese sign language dataset

    Figure 4  Visual modalities of each frame

    Table 1  Deep-learning-based isolated-word sign language recognition techniques and representative work

    Author/Institution | Year | Technique | Accuracy (%) | Dataset | Sample size
    Tang Ao, Li Houqiang, Huang Jie, Li Xiaoxu, Huang Shiliang / University of Science and Technology of China | 2013 | CNN (RGB-D based, with hand segmentation and tracking) [4] | 98.12 | American Sign Language (ASL) | 50700 frames
    | 2015 | 3D-CNN (multimodal input) [17] | 94.20 | Chinese Sign Language (CSL) | 25 classes
    | 2016 | RNN (with trajectory data) [27] | 85.60 | | 500 classes
    | 2017 | LSTM (with hand-shape descriptors) [28] | 86.20 | | 100 classes
    | 2018 | RNN (keyframe video sequence selection) [29] | 91.18 | | 310 classes
    | 2018 | 3D-CNN (attention-based) [18] | 88.70 | | 500 classes
    Pigou L / Ghent University | 2014 | CNN [5] | 91.70 | Chalearn | 20 classes
    | 2016 | 3D-CNN (feature fusion of multimodal data) [16] | 81.00 | Chalearn 2014 |
    Molchanov P, Garcia B, Hardie Cate / Stanford University | 2015 | 3D-CNN (multi-scale data) [15] | 77.50 | VIVA dataset |
    | 2015 | RNN [25] | 90.80 | University of South Wales dataset | 95 classes
    | 2016 | CNN [9] | 91.63 | ASL fingerspelling |
    Kang B / University of California | 2015 | CNN [6] | 99.99 | ASL fingerspelling | 31 classes
    Miao Qiguang / Xidian University | 2016 | 3D-CNN (RGB-D based) [19] | 56.90 | Chalearn |
    | 2017 | 3D-CNN (saliency features and RGB-D) [20] | 59.43 | |
    | 2017 | 3D-CNN (multimodal data and hand-feature enhancement) [21] | 67.71 | |
    Koller O / RWTH Aachen University | 2016 | CNN (focus on hand-shape changes) [8] | | Danish Sign Language | resolution 4730×22
    Chai Xiujuan / Institute of Computing Technology, CAS | 2017 | improved RNN (hand segmentation and localization) [26] | 99.00 | Chinese Sign Language (CSL) | 40 classes
    Yang Su / Beijing University of Technology | 2017 | combined RNN and CNN [30] | 98.43 | CSL | 40 classes
    | 2017 | RNN (data preprocessing) [31] | 99.00 | CSL | 40 classes
    Hossen M A / Teswala Engineering College | 2017 | CNN [7] | 100.00 | recorded with Kinect | 10 classes
    ElBadawy M / Ain Shams University, Egypt | 2017 | 3D-CNN [22] | 98.00 | Arabic dataset | 25 classes
    Kim S / Seoul National University, Korea | 2017 | CNN (inter-frame sampling) [10] | 86.00 | camera-captured | 20 classes
    | 2018 | CNN (hand segmentation) [11] | 98.00 | | 12 classes
    Kopuklu O / University of Munich, Germany | 2018 | CNN (spatio-temporal feature fusion) [12] | 96.28 / 57.40 | Jester / Chalearn |
    Konstantinidis D / Greece | 2018 | CNN (RGB and skeleton data) [13] | 98.09 | Argentinian dataset LSA64 |
    | 2018 | RNN (multimodal data fusion) [36] | 89.50 | Indian sign language dataset (IIT) |
    Devineau G / PSL Research University, Paris | 2018 | CNN (skeleton data, with hand-joint position sequences) [14] | 84.35 | DHG dataset | 28 classes
    Ye Yuancheng / City University of New York | 2018 | 3D-CNN (feature fusion) [23] | 69.20 | American Sign Language | 27 classes
    Liang Zhijie / Central China Normal University | 2018 | 3D-CNN (skeleton, contour, and depth data) [24] | 83.60 | Chalearn |
    Lin Chi / Institute of Automation, CAS | 2018 | masked ResC3D network combined with RNN [32] | 68.42 | Chalearn |
    Halim K / University of Indonesia | 2018 | RNN (feature set based on SIBI inflectional gestures) [33] | 96.15 | Indonesian sign language dataset |
    Masood S / New Delhi | 2018 | combined RNN and CNN [34] | 95.20 | Argentinian dataset LSA64 | 46 classes
    Bantupalli K / Kennesaw State University, USA | 2018 | combined RNN and CNN [35] | 93.00 | American Sign Language (ASL) | 100 classes
    Hernandez V / Tokyo University of Agriculture | 2019 | combined CNN and LSTM [37] | 89.30 | American Sign Language (ASL) | 19 classes
    Liao Yanqiu / Nanchang University | 2019 | combined RNN and 3D-CNN [38] | 86.90 | Chinese Sign Language (CSL) | 500 classes

    Table 2  Deep-learning-based continuous-sentence sign language recognition techniques and representative work

    Author/Institution | Year | Technique | Evaluation metric (%) | Dataset | Sample size
    Camgoz N C, Koller O / RWTH Aachen University | 2016 | 3D-CNN (temporal features extracted from RGB data) [45] | Jaccard index: 26.9 | Chalearn |
    | 2016 | hybrid model of CNN and HMM [49] | WER: 39.7 | RWTH-PHOENIX-Weather |
    | 2017 | CNN, HMM, and CTC [50] | WER: 38.8 | |
    | 2017 | bidirectional LSTM (BLSTM, CTC-based) [39] | WER: 43.1 | | resolution: 5000×90
    | 2018 | hybrid model of CNN, HMM, and RNN [51] | | |
    Pigou L / Ghent University | 2017 | hybrid model of 3D network and LSTM (RGB-D) [52] | Jaccard index: 31.6 | Chalearn |
    Cui Runpeng / Tsinghua University | 2017 | CNN and BLSTM (CTC-based) [53] | WER: 38.7 | RWTH-PHOENIX-Weather | resolution: 16000×20
    | 2018 | bidirectional LSTM (BLSTM, multimodal data) [40] | WER: 46.9 | |
    Shi B / University of Chicago, USA | 2018 | attention-based LSTM [41] | WER: 41.9 | American Sign Language (ASL) |
    Ko S K / Korea Electronics Research Institute | 2018 | RNN (with skeleton joint data) [42] | Acc: 89.5 | KETI Korean sign language dataset | 100 classes
    Zhang Qian / Shanghai Jiao Tong University | 2018 | bidirectional LSTM (BLSTM) [43] | Acc: 93.1 | American Sign Language (ASL) | 100 classes
    Li Houqiang, Huang Jie / University of Science and Technology of China | 2018 | 3D-CNN (temporal-classification alignment algorithm) [46] | WER: 37.3 | RWTH-PHOENIX-Weather |
    | 2018 | two-stream 3D-CNN (with LSTM) [47] | Acc: 82.7 | Chinese Sign Language | 100 classes
    Guo Dan / Hefei University of Technology; University of Science and Technology of China | 2018 | 3D-CNN (temporal convolution, CTC, late-fusion strategy) [48] | WER: 37.8 | RWTH-PHOENIX-Weather |
    | 2018 | 3D-CNN combined with RNN (adaptive variable-length online mining of key clips) [55] | Acc: 92.9 | Chinese Sign Language (CSL) | 100 classes
    Ariesta M C / University of Jakarta | 2018 | 3D-CNN combined with RNN (CTC-based) [54] | | SIBI | 30 classes
    Mittal A / Indian Institute of Technology | 2019 | modified LSTM [44] | Acc: 72.3 | Indian sign language dataset (ISL) | 942 classes

    Table 3  Classification of sign language datasets

    Name | Country | Classes | Signers | Samples | Data characteristics | Data type | Availability
    RWTH-PHOENIX-Weather [56] | Germany | 1200 | 9 | 45760 | RGB | sentences | public
    Chalearn [57] | USA | 249 | 7 | 50000 | RGB/depth | words | partially public
    DGS Kinect 40 [58] | Germany | 40 | 15 | 3000 | multi-view | isolated words |
    CSL [47] | China | 500/100 | | 125000 | depth/skeleton/RGB | isolated words/sentences | public
    SIGNUM [59] | Germany | 450 | 25 | 33210 | RGB | sentences | public
    GSL 20 [60] | Greece | 20 | 6 | 840 | RGB | words |
    Boston ASLLVD [61] | USA | 3300+ | 6 | 9800 | RGB | words | public
    PSL Kinect 30 [62] | Poland | 30 | 1 | 300 | RGB/depth | words | public
    LSA64 [63] | Argentina | 64 | 10 | 3200 | RGB | words | public
    DEVISIGN-G [64] | China | 36 | 8 | 432 | RGB | words |
    DEVISIGN-D [64] | | 500 | | 6000 | | |
    DEVISIGN-L [64] | | 2000 | | 24000 | | |
    CUNY ASL [65] | USA | | 8 | | RGB | sentences |
    SignsWorld Atlas [66] | Arabic countries | | 3 | 210 | RGB | words | public
    ASL Fingerspelling [67] | USA | 24 | 5 | 131000 | RGB/depth | words | public

    Table 4  RWTH-PHOENIX-Weather parameters

    Parameter | 2012 version | 2014 version
    # signers | 7 | 9
    # samples | 190 | 645
    # frames | 293077 | 965940
    # sentences | 1980 | 6861
    # vocabulary | 911 | 1558
    Resolution | 210×260 | 720×576

    Table 5  CSL dataset parameters

    Parameter | Value
    RGB resolution | 1920×1080
    Depth resolution | 512×424
    Video length (s) | 10–14
    Average number of samples | 7
    Total samples | 25000
    # signers | 50
    Vocabulary | 178
    Skeleton joints | 21
    Frame rate (fps) | 25
    Total duration | 100+
  • HINTON G E, OSINDERO S, and TEH Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527–1554. doi: 10.1162/neco.2006.18.7.1527
    ZHOU Yu. Research on signer adaptation in Chinese sign language recognition[D]. [Ph.D. dissertation], Harbin Institute of Technology, 2009. (in Chinese)
    CHEOK M J, OMAR Z, and JAWARD M H. A review of hand gesture and sign language recognition techniques[J]. International Journal of Machine Learning and Cybernetics, 2019, 10(1): 131–153. doi: 10.1007/s13042-017-0705-5
    TANG Ao, LU Ke, WANG Yufei, et al. A real-time hand posture recognition system using deep neural networks[J]. ACM Transactions on Intelligent Systems and Technology, 2015, 6(2): 1–23. doi: 10.1145/2735952
    PIGOU L, DIELEMAN S, KINDERMANS P J, et al. Sign language recognition using convolutional neural networks[C]. European Conference on Computer Vision, Zurich, Switzerland, 2014: 572–578.
    KANG B, TRIPATHI S, and NGUYEN T Q. Real-time sign language fingerspelling recognition using convolutional neural networks from depth map[C]. The 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Kuala Lumpur, Malaysia, 2015: 136–140.
    HOSSEN M A, GOVINDAIAH A, SULTANA S, et al. Bengali sign language recognition using Deep Convolutional Neural Network[C]. The 7th Joint International Conference on Informatics, Electronics & Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), Kitakyushu, Japan, 2018: 369–373.
    KOLLER O, BOWDEN R, and NEY H. Automatic alignment of HamNoSys subunits for continuous sign language recognition[C]. The 10th Edition of the Language Resources and Evaluation Conference, Portorož, Slovenia, 2016: 121–128.
    GARCIA B and VIESCA S A. Real-time American sign language recognition with convolutional neural networks[J]. Convolutional Neural Networks for Visual Recognition, 2016, 2: 225–232.
    JI Y, KIM S, and LEE K B. Sign language learning system with image sampling and convolutional neural network[C]. The 1st IEEE International Conference on Robotic Computing (IRC), Taichung, China, 2017: 371–375.
    KIM S, JI Y, and LEE K B. An effective sign language learning with object detection based ROI segmentation[C]. The 2nd IEEE International Conference on Robotic Computing (IRC), Laguna Hills, USA, 2018: 330–333.
    KÖPÜKLÜ O, KÖSE N, and RIGOLL G. Motion fused frames: Data level fusion strategy for hand gesture recognition[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, USA, 2018: 2103–2111.
    KONSTANTINIDIS D, DIMITROPOULOS K, and DARAS P. Sign language recognition based on hand and body skeletal data[C]. 2018-3DTV-Conference: The True Vision-Capture, Transmission and Display of 3D Video (3DTV-CON), Helsinki, Finland, 2018: 1–4.
    DEVINEAU G, MOUTARDE F, WANG Xi, et al. Deep learning for hand gesture recognition on skeletal data[C]. The 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xian, China, 2018: 106–113.
    MOLCHANOV P, GUPTA S, KIM K, et al. Hand gesture recognition with 3D convolutional neural networks[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition workshops, Boston, USA, 2015: 1–7.
    WU Di, PIGOU L, KINDERMANS P J, et al. Deep dynamic neural networks for multimodal gesture segmentation and recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(8): 1583–1597. doi: 10.1109/TPAMI.2016.2537340
    HUANG Jie, ZHOU Wengang, LI Houqiang, et al. Sign language recognition using 3D convolutional neural networks[C]. 2015 IEEE International Conference on Multimedia and Expo (ICME), Turin, Italy, 2015: 1–6.
    HUANG Jie, ZHOU Wengang, LI Houqiang, et al. Attention-based 3D-CNNs for large-vocabulary sign language recognition[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(9): 2822–2832. doi: 10.1109/TCSVT.2018.2870740
    LI Yunan, MIAO Qiguang, TIAN Kuan, et al. Large-scale gesture recognition with a fusion of RGB-D data based on the C3D model[C]. The 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 25–30.
    LI Yunan, MIAO Qiguang, TIAN Kuan, et al. Large-scale gesture recognition with a fusion of RGB-D data based on saliency theory and C3D model[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 28(10): 2956–2964. doi: 10.1109/TCSVT.2017.2749509
    MIAO Qiguang, LI Yunan, OUYANG Wanli, et al. Multimodal gesture recognition based on the resc3d network[C]. 2017 IEEE International Conference on Computer Vision Workshops, Venice, Italy, 2017: 3047–3055.
    ELBADAWY M, ELONS A S, SHEDEED H A, et al. Arabic sign language recognition with 3d convolutional neural networks[C]. The 8th International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 2017: 66–71.
    YE Yuancheng, TIAN Yingli, HUENERFAUTH M, et al. Recognizing American sign language gestures from within continuous videos[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, USA, 2018: 2064–2073.
    LIANG Zhijie, LIAO Shengbin, and HU Bingzhang. 3D convolutional neural networks for dynamic sign language recognition[J]. The Computer Journal, 2018, 61(11): 1724–1736. doi: 10.1093/comjnl/bxy049
    CATE H, DALVI F, and HUSSAIN Z. Sign language recognition using temporal classification[EB/OL]. http://arxiv.org/abs/1701.01875v1, 2017.
    CHAI Xiujuan, LIU Zhipeng, YIN Fang, et al. Two streams recurrent neural networks for large-scale continuous gesture recognition[C]. The 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 31–36.
    LIU Tao, ZHOU Wengang, and LI Houqiang. Sign language recognition with long short-term memory[C]. 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, USA, 2016: 2871–2875.
    LI Xiaoxu, MAO Chensi, HUANG Shiliang, et al. Chinese sign language recognition based on SHS descriptor and encoder-decoder LSTM model[C]. The 12th Chinese Conference on Biometric Recognition. Shenzhen, China, 2017: 719–728.
    HUANG Shiliang, MAO Chensi, TAO Jinxu, et al. A novel chinese sign language recognition method based on keyframe-centered clips[J]. IEEE Signal Processing Letters, 2018, 25(3): 442–446. doi: 10.1109/LSP.2018.2797228
    YANG Su and ZHU Qing. Continuous Chinese sign language recognition with CNN-LSTM[C]. Proceedings of SPIE, Ninth International Conference on Digital Image Processing (ICDIP 2017), 2017, 10420.
    YANG Su and ZHU Qing. Video-based Chinese sign language recognition using convolutional neural network[C]. The 9th IEEE International Conference on Communication Software and Networks (ICCSN), Guangzhou, China, 2017: 929–934.
    LIN Chi, WAN Jun, LIANG Yanyan, et al. Large-scale isolated gesture recognition using a refined fused model based on masked Res-C3D network and skeleton LSTM[C]. The 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 2018: 52–58.
    HALIM K and RAKUN E. Sign language system for Bahasa Indonesia (Known as SIBI) recognizer using TensorFlow and Long Short-Term Memory[C]. 2018 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Yogyakarta, Indonesia, 2018: 403–407.
    MASOOD S, SRIVASTAVA A, THUWAL H C, et al. Real-time sign language gesture (word) recognition from video sequences using CNN and RNN[C]. Intelligent Engineering Informatics: The 6th International Conference on FICTA, Singapore, 2018: 623–632.
    BANTUPALLI K and XIE Ying. American Sign Language recognition using deep learning and computer vision[C]. 2018 IEEE International Conference on Big Data (Big Data), Seattle, USA, 2018: 4896–4899.
    KONSTANTINIDIS D, DIMITROPOULOS K, and DARAS P. A deep learning approach for analyzing video and skeletal features in sign language recognition[C]. 2018 IEEE International Conference on Imaging Systems and Techniques (IST), Krakow, Poland, 2018: 1–6.
    HERNANDEZ V, SUZUKI T, and VENTURE G. Convolutional and recurrent neural network for human action recognition: Application on American sign language[EB/OL]. http://biorxiv.org/content/10.1101/535492v1, 2019.
    LIAO Yanqiu, XIONG Pengwen, MIN Weidong, et al. Dynamic sign language recognition based on video sequence with BLSTM-3D residual networks[J]. IEEE Access, 2019, 7: 38044–38054. doi: 10.1109/ACCESS.2019.2904749
    CAMGOZ N C, HADFIELD S, KOLLER O, et al. SubUNets: End-to-end hand shape and continuous sign language recognition[C]. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 3075–3084.
    CUI Runpeng, LIU Hu, and ZHANG Changshui. A deep neural framework for continuous sign language recognition by iterative training[J]. IEEE Transactions on Multimedia, 2019, 21(7): 1880–1891. doi: 10.1109/TMM.2018.2889563
    SHI Bowen, DEL RIO A M, KEANE J, et al. American Sign Language fingerspelling recognition in the wild[C]. 2018 IEEE Spoken Language Technology Workshop (SLT), Athens, Greece, 2018: 145–152.
    KO S K, SON J G, and JUNG H. Sign language recognition with recurrent neural network using human keypoint detection[C]. 2018 Conference on Research in Adaptive and Convergent Systems, Honolulu, USA, 2018: 326–328.
    ZHANG Qian, WANG Dong, ZHAO Run, et al. MyoSign: Enabling end-to-end sign language recognition with wearables[C]. The 24th International Conference on Intelligent User Interfaces, Marina del Ray, USA, 2019: 650–660.
    MITTAL A, KUMAR P, ROY P P, et al. A modified LSTM model for continuous sign language recognition using leap motion[J]. IEEE Sensors Journal, 2019, 19(16): 7056–7063. doi: 10.1109/JSEN.2019.2909837
    CAMGOZ N C, HADFIELD S, KOLLER O, et al. Using convolutional 3d neural networks for user-independent continuous gesture recognition[C]. The 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 49–54.
    PU Junfu, ZHOU Wengang, and LI Houqiang. Dilated convolutional network with iterative optimization for continuous sign language recognition[C]. The 27th International Joint Conference on Artificial Intelligence, Wellington, New Zealand, 2018: 885–891.
    HUANG Jie, ZHOU Wengang, ZHANG Qilin, et al. Video-based sign language recognition without temporal segmentation[C]. The 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, 2018: 2257–2264.
    WANG Shuo, GUO Dan, ZHOU Wengang, et al. Connectionist temporal fusion for sign language translation[C]. The 26th ACM International Conference on Multimedia, Seoul, Korea, 2018: 1483–1491.
    KOLLER O, ZARGARAN O, NEY H, et al. Deep sign: Hybrid CNN-HMM for continuous sign language recognition[C]. 2016 British Machine Vision Conference, York, UK, 2016: 1–2.
    KOLLER O, ZARGARAN S, and NEY H. Re-sign: Re-aligned end-to-end sequence modelling with deep recurrent CNN-HMMs[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Hawaii, USA, 2017: 4297–4305.
    KOLLER O, ZARGARAN S, NEY H, et al. Deep sign: Enabling robust statistical continuous sign language recognition via hybrid CNN-HMMs[J]. International Journal of Computer Vision, 2018, 126(12): 1311–1325. doi: 10.1007/s11263-018-1121-3
    PIGOU L, VAN HERREWEGHE M, and DAMBRE J. Gesture and sign language recognition with temporal residual networks[C]. 2017 IEEE International Conference on Computer Vision Workshops, Venice, Italy, 2017: 3086–3093.
    CUI Runpeng, LIU Hu, and ZHANG Changshui. Recurrent convolutional neural networks for continuous sign language recognition by staged optimization[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 7361–7369.
    ARIESTA M C, WIRYANA F, SUHARJITO, et al. Sentence level Indonesian sign language recognition using 3D convolutional neural network and bidirectional recurrent neural network[C]. 2018 Indonesian Association for Pattern Recognition International Conference (INAPR), Jakarta, Indonesia, 2018: 16–22.
    GUO Dan, ZHOU Wengang, LI Houqiang, et al. Hierarchical LSTM for sign language translation[C]. The 32nd AAAI Conference on Artificial Intelligence, the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, USA, 2018: 6845–6852.
    FORSTER J, SCHMIDT C, HOYOUX T, et al. RWTH-PHOENIX-Weather: A large vocabulary sign language recognition and translation corpus[C]. The 8th International Conference on Language Resources and Evaluation, Istanbul, Turkey, 2012: 3785–3789.
    ESCALERA S, BARÓ X, GONZÀLEZ J, et al. Chalearn looking at people challenge 2014: Dataset and results[C]. European Conference on Computer Vision, Zurich, Switzerland, 2014: 459–473.
    ONG E J, COOPER H, PUGEAULT N, et al. Sign language recognition using sequential pattern trees[C]. 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 2200–2207.
    VON AGRIS U, ZIEREN J, CANZLER U, et al. Recent developments in visual sign language recognition[J]. Universal Access in the Information Society, 2008, 6(4): 323–362. doi: 10.1007/s10209-007-0104-x
    EFTHIMIOU E and FOTINEA S E. GSLC: Creation and annotation of a Greek sign language corpus for HCI[C]. The 4th International Conference on Universal Access in Human-Computer Interaction, Beijing, China, 2007: 657–666.
    NEIDLE C, THANGALI A, and SCLAROFF S. Challenges in development of the American Sign Language lexicon video dataset (ASLLVD) corpus[C]. The 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, Istanbul, Turkey, 2012: 1–8.
    OSZUST M and WYSOCKI M. Polish sign language words recognition with Kinect[C]. The 6th International Conference on Human System Interactions (HSI), Sopot, Poland, 2013: 219–226.
    RONCHETTI F, QUIROGA F, ESTREBOU C A, et al. LSA64: An Argentinian sign language dataset[C]. The 22nd Congreso Argentino de Ciencias de la Computación (CACIC 2016), San Luis, Argentina, 2016: 794–803.
    CHAI Xiujuan, WANG Hanjie, and CHEN Xilin. The DEVISIGN large vocabulary of Chinese sign language database and baseline evaluations[R]. Technical Report VIPL-TR-14-SLR-001, 2014.
    LU Pengfei and HUENERFAUTH M. Collecting and evaluating the CUNY ASL corpus for research on American sign language animation[J]. Computer Speech & Language, 2014, 28(3): 812–831. doi: 10.1016/j.csl.2013.10.004
    SHOHIEB S M, ELMINIR H K, and RIAD A M. Signsworld atlas; a benchmark Arabic sign language database[J]. Journal of King Saud University-Computer and Information Sciences, 2015, 27(1): 68–76. doi: 10.1016/j.jksuci.2014.03.011
    PUGEAULT N and BOWDEN R. Spelling it out: Real-time ASL fingerspelling recognition[C]. 2011 IEEE International Conference on Computer Vision workshops (ICCV Workshops), Barcelona, Spain, 2011: 1114–1119.
    PRABHAVALKAR R, SAINATH T N, WU Yonghui, et al. Minimum word error rate training for attention-based sequence-to-sequence models[C]. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018: 4839–4843.
    KOLLER O, FORSTER J, and NEY H. Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers[J]. Computer Vision and Image Understanding, 2015, 141: 108–125. doi: 10.1016/j.cviu.2015.09.013
Figures (4) / Tables (5)
計(jì)量
  • 文章訪問數(shù):  15816
  • HTML全文瀏覽量:  6823
  • PDF下載量:  1354
  • 被引次數(shù): 0
Publication history
  • Received: 2019-06-06
  • Revised: 2019-11-20
  • Available online: 2020-01-18
  • Issue published: 2020-06-04
