A Pure Convolutional Video Prediction Model Fused with the U-net Network
CLC number: TP391.41 Document code: A
DOI: 10.7652/xjtuxb202506012 Article ID: 0253-987X(2025)06-0112-10
A Pure Convolutional Model Fused with U-net Network for Video Prediction
XIE Yumei1, CAI Yuanli2, GAO Haiyan3, GUANG Xiangfeng1, TANG Weiqiang4 (1. School of Electronic Information Science, Fujian Jiangxia University, Fuzhou 350108, China; 2. Faculty of Electrical and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China; 3. School of Electrical Engineering and Automation, Xiamen University of Technology, Xiamen, Fujian 361024, China; 4. College of Electrical and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China)
Abstract: To address insufficient spatiotemporal feature extraction and inadequate preservation of image detail in deep learning-based video prediction, a pure convolutional video prediction model (CUnet) is proposed that fuses the U-net network with the Inception unit from the SimVP model. The CUnet model consists of three core modules. First, the Cell module uses 2D convolutional layers to extract spatial features and feeds them into multiple Inception units to capture spatiotemporal features. Second, the DeCell module captures spatiotemporal features through Inception units and upsamples with 2D deconvolutional layers to restore the original image size. Finally, U-net is introduced as the backbone network to integrate the Cell and DeCell modules, preserving image detail and achieving high-quality image reconstruction. Experimental results show that on the TaxiBJ dataset, the prediction accuracy of the CUnet model is 5.23% higher than that of TAU, the best-performing existing model; on the Human3.6M dataset, it is 12.88% higher than that of FFINet, the best-performing existing model. The CUnet model demonstrates strong predictive capability and offers useful insights for applying pure convolutional neural networks to video prediction.
Keywords: deep learning; video prediction; spatiotemporal features; U-net; pure convolutional neural network
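The abstract's claim that the DeCell module restores the original image size can be checked with simple shape bookkeeping. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, the depth of 2, and the kernel/stride/padding values are hypothetical, and the Inception units are assumed to leave the spatial size unchanged, so only the 2D convolutions (Cell) and 2D deconvolutions (DeCell) alter height and width. It uses the standard output-size formulas for convolution and transposed convolution.

```python
def conv2d_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one axis for a standard 2D convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv2d_out(size: int, kernel: int, stride: int, padding: int,
                 output_padding: int = 0) -> int:
    """Output length along one axis for a 2D transposed convolution."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def trace_unet_shapes(height: int, width: int, depth: int = 2,
                      kernel: int = 3, stride: int = 2, padding: int = 1):
    """Trace (H, W) through `depth` Cell stages, then `depth` DeCell stages.

    Hypothetical wiring: each Cell downsamples with one strided convolution,
    each DeCell upsamples with one transposed convolution, and every DeCell
    output must match the Cell feature map of the same level so that a
    U-net style skip connection can combine them.
    """
    encoder = [(height, width)]
    for _ in range(depth):
        h, w = encoder[-1]
        encoder.append((conv2d_out(h, kernel, stride, padding),
                        conv2d_out(w, kernel, stride, padding)))

    decoder = [encoder[-1]]
    for level in range(depth):
        h, w = decoder[-1]
        out = (deconv2d_out(h, kernel, stride, padding, stride - 1),
               deconv2d_out(w, kernel, stride, padding, stride - 1))
        skip = encoder[depth - 1 - level]
        if out != skip:
            raise ValueError(f"DeCell output {out} does not match skip {skip}")
        decoder.append(out)
    return encoder, decoder


if __name__ == "__main__":
    enc, dec = trace_unet_shapes(32, 32)
    print("Cell sizes:  ", enc)
    print("DeCell sizes:", dec)
```

With these assumed defaults, a 32×32 input is reduced to 16×16 and then 8×8 by the Cell stages and brought back to 32×32 by the DeCell stages. An input whose size is not divisible by the total downsampling factor (for example 31×31) fails the skip-connection check, which is why such models usually pad or crop their inputs.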
Video prediction learns from historical frames in order to predict future frames accurately.
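One common way to state this task formally is given below. The notation is an assumption introduced here for illustration and is not taken from the paper: $x_i$ denotes a single frame, $T$ the number of observed frames, $T'$ the number of frames to predict, $F_\theta$ the prediction model with parameters $\theta$, and $\mathcal{L}$ a frame-wise loss such as mean squared error.

```latex
\begin{aligned}
X &= (x_{t-T+1}, \dots, x_t), \qquad Y = (x_{t+1}, \dots, x_{t+T'}), \\
\hat{Y} &= F_\theta(X), \\
\theta^{*} &= \arg\min_{\theta} \; \mathcal{L}\bigl(F_\theta(X),\, Y\bigr).
\end{aligned}
```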
Journal of Xi'an Jiaotong University, 2025, Issue 06