I am working towards a Ph.D. degree at Sun Yat-sen University, under the supervision of Prof. Wenhan Luo. Before that, I received my M.E. degree from Shenzhen University, China, in 2023, under the supervision of Prof. Feng Liu and Prof. Linlin Shen. I have published papers in journals including TIFS and TNNLS. Previously, I interned at Tencent. My research interests include generative models and image/video synthesis.

🔥 News

  • 2023.04: 🎉🎉 I joined Sun Yat-sen University to pursue a Ph.D. degree under the supervision of Prof. Wenhan Luo!
  • 2023.02: 🎉🎉 One paper was accepted by TNNLS!
  • 2022.08: 🎉🎉 One paper was accepted by TIFS!

📝 Publications

arXiv

OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models

Zhe Kong, Yong Zhang, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, Guanying Chen, Wei Liu, Wenhan Luo

  • In this work, we propose OMG, an occlusion-friendly personalized generation framework designed to seamlessly integrate multiple concepts within a single image. It relies on a novel two-stage sampling solution. The first stage handles layout generation and collects visual comprehension information for dealing with occlusions. The second stage uses the collected visual comprehension information together with the designed noise blending to integrate multiple concepts while accounting for occlusions. We also observe that the denoising timestep at which noise blending starts is key to preserving both identity and layout. Moreover, our method can be combined with various single-concept models, such as LoRA and InstantID, without additional tuning. In particular, LoRA models from Civitai.com can be used directly.
  • [Project] [Code]
TNNLS

Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing

Zhe Kong, Wentian Zhang, Feng Liu, Wenhan Luo, Haozhe Liu, Linlin Shen, Raghavendra Ramachandra

  • We propose a self-supervised learning-based method, denoted DF-DM, for presentation attack detection (PAD). Specifically, DF-DM is based on a global-local view coupled with De-Folding and De-Mixing to derive the task-specific representation for PAD. De-Folding learns region-specific features that represent samples as local patterns by explicitly minimizing a generative loss. De-Mixing, in turn, drives detectors to obtain instance-specific features with global information, yielding a more comprehensive representation, by minimizing an interpolation-based consistency loss.
  • [Code]
TIFS

Fingerprint Presentation Attack Detection by Channel-Wise Feature Denoising

Feng Liu, Zhe Kong, Haozhe Liu, Wentian Zhang, Linlin Shen

  • This paper proposes a novel channel-wise feature denoising fingerprint PAD (CFD-PAD) method that handles the redundant noise information ignored in previous studies. The proposed method learns important features of fingerprint images by weighing the importance of each channel and separating discriminative channels from “noise” channels. The propagation of “noise” channels is then suppressed in the feature map to reduce interference. Specifically, a PA-Adaptation loss is designed to constrain the feature distribution so that features of live fingerprints become more aggregated and those of spoof fingerprints more dispersed.
  • [Code]

Publication List

💻 Internships

  • 2022.03 - 2022.11, Tencent, China.

🎖 Honors and Awards

  • 2023 Outstanding Graduate Award (rate < 5%).
  • 2022 Excellent Academic Scholarship, First Class.
  • 2021 Excellent Academic Scholarship, First Class.
  • 2020 Freshman Scholarship, First Class.