Recent Advances in Computer Science and Communications

ISSN (Print): 2666-2558
ISSN (Online): 2666-2566

Research Article

DeepFake Detection with Remote Heart Rate Estimation Using 3D Central Difference Convolution Attention Network

Author(s): Xiao Feng, Hua Ma* and Yijie Sun

Volume 16, Issue 7, 2023

Published on: 17 April, 2023

Article ID: e010323214186
Pages: 9

DOI: 10.2174/2666255816666230301091725

Abstract

Objective: As GAN-based deepfakes become increasingly mature and realistic, effective deepfake detectors have become essential. The normal pulse rhythms present in a real-face video can be weakened or even completely interrupted in a deepfake video. Motivated by this observation, we introduce a new deepfake detection approach based on remote heart rate estimation using a 3D Central Difference Convolution Attention Network (CDCAN).
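To illustrate what reading a heart rate from a pulse trace involves, the sketch below takes a one-dimensional pulse signal and reports the dominant frequency in a plausible heart rate band. It is a classical spectral-peak estimate, not the network described in this article; the function name `estimate_bpm`, the 0.7 to 4.0 Hz band, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def estimate_bpm(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Return the dominant frequency of `signal` in beats per minute.

    Hypothetical helper: the band limits are an assumed heart rate range
    (42 to 240 bpm), not values taken from the article.
    """
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                  # remove the DC offset
    power = np.abs(np.fft.rfft(signal)) ** 2         # one-sided power spectrum
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)       # keep plausible pulse rates
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz

# Synthetic 10-second trace at 30 frames per second with a 1.2 Hz pulse.
fps = 30
t = np.arange(10 * fps) / fps
pulse = np.sin(2 * np.pi * 1.2 * t)
bpm = estimate_bpm(pulse, fps)
```

On a trace where the pulse rhythm is weak or absent, the in-band peak becomes less distinct, which is the kind of cue a physiology-based detector can exploit.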

Methods: Our proposed detector consists mainly of a 3D CDCAN with an inverse attention mechanism and an LSTM architecture. The 3D central difference convolution enhances the spatiotemporal representation and captures rich physiology-related temporal context by aggregating temporal difference information. A soft attention mechanism focuses the network on the skin region of interest, while the inverse attention mechanism further denoises the remote photoplethysmography (rPPG) signals.
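The abstract does not spell out the central difference operator, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It uses the commonly cited central difference convolution form, in which the output blends a vanilla convolution with a convolution over differences from the center voxel: y = conv(x, w) - theta * x_center * sum(w). The function names, the single-channel input, the odd kernel size, and theta = 0.7 are illustrative assumptions.

```python
import numpy as np

def conv3d_valid(x, w):
    """Plain 3D cross-correlation of volume x (T, H, W) with kernel w, no padding."""
    kt, kh, kw = w.shape
    T, H, W = x.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(x[t:t + kt, i:i + kh, j:j + kw] * w)
    return out

def cdc3d(x, w, theta=0.7):
    """Central difference convolution (assumed form, odd kernel sizes).

    Equivalent to theta * sum(w * (x - x_center)) + (1 - theta) * sum(w * x)
    evaluated over each receptive field.
    """
    kt, kh, kw = w.shape
    vanilla = conv3d_valid(x, w)
    # Center voxel of every receptive field, aligned with the output grid.
    ct, ch, cw = kt // 2, kh // 2, kw // 2
    center = x[ct:x.shape[0] - ct, ch:x.shape[1] - ch, cw:x.shape[2] - cw]
    return vanilla - theta * center * w.sum()

# Example: a small random video volume (frames, height, width) and a 3x3x3 kernel.
rng = np.random.default_rng(0)
video = rng.normal(size=(5, 6, 6))
kernel = rng.normal(size=(3, 3, 3))
features = cdc3d(video, kernel)
```

With theta = 0 the operator reduces to an ordinary convolution. With theta = 1 a constant input yields a zero response, so only intensity changes across space and time contribute, which is why difference-based kernels suit subtle frame-to-frame pulse variations.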

Results: We evaluate our approach on two recent datasets, Celeb-DF and DFDC, on which it achieves accuracies of 99.5% and 97.4%, respectively.

Conclusion: Our approach outperforms state-of-the-art methods, which demonstrates the effectiveness of our DeepFake detector.


© 2025 Bentham Science Publishers