Yu-Lun (Alex) Liu | 劉育綸

I am an Assistant Professor in the Department of Computer Science at National Yang Ming Chiao Tung University. I work on image/video processing, computer vision, and computational photography, focusing on core problems that require combining machine learning with insights from geometry and domain-specific knowledge.

Prior to joining NYCU, I was a Research Scientist Intern at Meta Reality Labs Research and a Senior Software Engineer at MediaTek Inc. I received my PhD in 2022 from the Department of Computer Science and Information Engineering (CSIE) at National Taiwan University (NTU), where I was a member of CMLab.

I am looking for undergraduate / master's / Ph.D. / postdoc students to join my group. If you are interested in working with me and want to conduct research in image processing, computer vision, and machine learning, don't hesitate to contact me directly with your CV and transcripts.

Email  /  CV  /  Google Scholar  /  Facebook  /  Instagram  /  Github  /  YouTube

For those who personally know me, you might be thinking: Who is this guy?
Hover over the photo to see how I usually look before a paper submission deadline.


Timeline
2023 -
Assistant Professor at NYCU
2022
Research Scientist Intern at Meta
Computational Photography Group
Seattle, WA, USA
2017 - 2022
Senior Software Engineer at MediaTek Inc.
Multimedia Technology Development (MTD) Division
Intelligent Vision Processing (IVP) Department
2014 - 2017
Software Engineer at MediaTek Inc.
Multimedia Technology Development (MTD) Division
Intelligent Vision Processing (IVP) Department
2012 - 2014
NCTU: MS
CommLab, Institute of Electronics
2008 - 2012
NCTU: BS
Department of Electronics Engineering

News

Research Group

PhD Students

黃怡川
Institute of Computer Science

Research Assistants

葉長瀚
BS, NYCU CE & CS

陳霆軒
BS, NCHU AMATH

MS Students

林晉暘
Institute of Data Science
Co-advised by Wei-Chen Chiu

吳中赫
Institute of Multimedia

嚴士函
Institute of Multimedia

陳捷文
Institute of Computer Science

許皓翔
Institute of Computer Science
Co-advised by Wen-Chieh Lin

陳映寰
Institute of Computer Science

朱劭璿
Institute of Computer Science

張欀齡
Institute of Computer Science

鄭淮薰
Institute of Computer Science

柯柏旭
Institute of Computer Science
Co-advised by Wei-Chen Chiu

吳秉宸
Program of Artificial Intelligence

范丞德
Institute of Computer Science
Co-advised by Yu-Chee Tseng

BS Students

蘇智海
NYCU CS

胡智堯
NTU MED

李明謙
NYCU ARETEHP

林奕杰
NTHU SCIDM

孫揚喆
NYCU MED

羅宇呈
NYCU MATH

郭玠甫
NYCU MATH

葉柔昀
NYCU EP

陳士弘
NYCU MATH & CS

鄭又豪
NYCU EP

陳昱佑
NYCU CS

丁祐承
NYCU MATH

李宗諺
NYCU CS

楊宗儒
NYCU CS

陳凱昕
NYCU CS

劉珆睿
NYCU CS

吳俊宏
NYCU CS

蔡師睿
NYCU CS

張維程
NYCU CS

李杰穎
NYCU CS

謝明翰
NTHU EE

陳楊融
NYCU CS

施惟智
NTHU IEEM

端木竣偉
NYCU CS

吳定霖
NYCU CS

司徒立中
NYCU CS

朱驛庭
NYCU CS

徐和
NYCU CS

翁晨昱
NTHU PHYS

何義翔
NYCU CS

謝侑哲
NYCU CS

Alumni

鄭伯俞
MS Spring 2023
Co-advised by Wei-Chen Chiu
Next: Qualcomm

陳俊瑋
BS 2023
Next: MS Student, NYCU


Research
Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation
Ning-Hsu Wang, Yu-Lun Liu
arXiv, 2024  
project page / arXiv / code [coming soon] / demo

Depth Anywhere distills knowledge from a state-of-the-art perspective depth model to improve 360° depth models.

NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
Ting-Hsuan Chen, Jiewen Chan, Hau-Shiang Shiu, Shih-Han Yen, Chang-Han Yeh, Yu-Lun Liu
arXiv, 2024  
project page / arXiv / results / code [coming soon] / demo

NaRCan is a video editing framework that integrates a hybrid deformation field and a diffusion prior to generate high-quality, natural canonical images representing the input video, and employs multi-layer perceptrons (MLPs) to capture local residual deformations, enhancing the model's ability to handle complex video dynamics.

DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation
Chun-Hung Wu, Shih-Hong Chen, Chih-Yao Hu, Hsin-Yu Wu, Kai-Hsin Chen, Yu-You Chen, Chih-Hai Su, Chih-Kuo Lee, Yu-Lun Liu
arXiv, 2024  
project page / arXiv / code & dataset [coming soon] / colab

DeNVeR is an unsupervised approach to vessel segmentation in X-ray videos that requires no annotated ground truth, providing a robust, data-efficient tool for disease diagnosis and treatment planning and setting a new standard for future research in video vessel segmentation.



DriveEnv-NeRF: Exploration of A NeRF-Based Autonomous Driving Environment for Real-World Performance Validation
Mu-Yi Shen, Chia-Chi Hsu, Hao-Yu Hou, Yu-Chen Huang, Wei-Fang Sun, Chia-Che Chang, Yu-Lun Liu, Chun-Yi Lee
ICRA RoboNerF Workshop, 2024  
project page / video / arXiv / code

The DriveEnv-NeRF framework leverages Neural Radiance Fields (NeRF) to validate and faithfully forecast the performance of autonomous driving agents in a targeted real-world scene, and can also serve as a training environment for such agents under various lighting conditions.


Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation
Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang, Chun-Pei Chen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Yung-Yu Chuang, Yen-Yu Lin
CVPR, 2024  
arXiv / code

This paper addresses text-supervised semantic segmentation, aiming to learn a model capable of segmenting arbitrary visual concepts within images by using only image-text pairs without dense annotations.


HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses
Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang, Wu Liu, Xinchen Liu, Zheng Wang
CVPR, 2024  
project page / arXiv / code

This work rethinks the original HumanNeRF approach by combining explicit and implicit human representations with both general and point-specific mapping processes, showing that an explicit shape can filter the information used to fit the implicit representation, and that a frozen general mapping combined with point-specific mapping effectively avoids overfitting and improves pose generalization.


Dual Associated Encoder for Face Restoration
Yu-Ju Tsai, Yu-Lun Liu, Lu Qi, Kelvin C.K. Chan, Ming-Hsuan Yang
ICLR, 2024  
project page / arXiv

This work proposes a novel dual-branch framework named DAEFR, which introduces an auxiliary LQ branch that extracts crucial information from the LQ inputs and incorporates association training to promote effective synergy between the two branches, enhancing code prediction and output quality.

DisCO: Portrait Distortion Correction with Perspective-Aware 3D GANs
Zhixiang Wang, Yu-Lun Liu, Jia-Bin Huang, Shin'ichi Satoh, Sizhuo Ma, Guru Krishnan, Jian Wang
IJCV, 2024  
project page / arXiv

This work proposes a simple yet effective method for correcting perspective distortion in a single close-up face photo via GAN inversion of the perspective-distorted input image, developing an optimization that starts from a short camera-to-face distance together with optimization scheduling, reparametrization, and geometric regularization.


Improving Robustness for Joint Optimization of Camera Poses and Decomposed Low-Rank Tensorial Radiance Fields
Bo-Yu Cheng, Wei-Chen Chiu, Yu-Lun Liu
AAAI, 2024  
project page / arXiv / code

This work proposes an algorithm that jointly refines camera poses and scene geometry represented by a decomposed low-rank tensor, using only 2D images as supervision; it achieves an effect equivalent to brute-force 3D convolution while incurring little computational overhead.

Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction
Su-Kai Chen, Hung-Lin Yen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Wen-Hsiao Peng, Yen-Yu Lin
ICCV, 2023  
project page / arXiv / code / video

This work proposes the continuous exposure value representation (CEVR), which uses an implicit function to generate LDR images with arbitrary EVs, including those unseen during training, to improve HDR reconstruction.

ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection
Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun
ICCV, 2023  
project page / arXiv

The studies indicate that the proposed image-induced geometry-aware representation enables image-based methods to attain higher detection accuracy than the seminal point-cloud-based method, VoteNet, in two practical scenarios: (1) when point clouds are sparse and noisy, as in ARKitScenes, and (2) when object classes are diverse, particularly small-object classes, as in ScanNet200.

Progressively Optimized Local Radiance Fields for Robust View Synthesis
Andreas Meuleman, Yu-Lun Liu, Chen Gao, Jia-Bin Huang, Changil Kim, Min H. Kim, Johannes Kopf
CVPR, 2023  
project page / paper / code / video

This work presents an algorithm for reconstructing the radiance field of a large-scale scene from a single casually captured video, and shows that progressive optimization significantly improves the robustness of the reconstruction.

Robust Dynamic Radiance Fields
Yu-Lun Liu, Chen Gao, Andreas Meuleman, Hung-Yu Tseng, Ayush Saraf, Changil Kim, Yung-Yu Chuang, Johannes Kopf, Jia-Bin Huang
CVPR, 2023  
project page / arXiv / code / video

This work addresses the robustness issue by jointly estimating the static and dynamic radiance fields along with the camera parameters (poses and focal length) and shows favorable performance over the state-of-the-art dynamic view synthesis methods.




Denoising Likelihood Score Matching for Conditional Score-based Data Generation
Chen-Hao Chao, Wei-Fang Sun, Bo-Wun Cheng, Yi-Chen Lo, Chia-Che Chang, Yu-Lun Liu, Yu-Lin Chang, Chia-Ping Chen, Chun-Yi Lee
ICLR, 2022  
arXiv / OpenReview

This work introduces a novel training objective, the Denoising Likelihood Score Matching (DLSM) loss, which trains the classifier to match the gradients of the true log-likelihood density, showing that conditional scores can then be accurately modeled and that the score-mismatch issue is alleviated.

Learning to See Through Obstructions with Layered Decomposition
Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
TPAMI, 2021  
project page / arXiv / code / demo / video

This work alternates between estimating dense optical flow fields for the two layers and reconstructing each layer from the flow-warped images via a deep convolutional neural network, which accommodates potential errors in flow estimation and brittle assumptions such as brightness consistency.




Bridging Unsupervised and Supervised Depth from Focus via All-in-Focus Supervision
Ning-Hsu Wang, Ren Wang, Yu-Lun Liu, Yu-Hao Huang, Yu-Lin Chang, Chia-Ping Chen, Kevin Jou
ICCV, 2021  
project page / arXiv / code

This paper proposes a method that estimates not only a depth map but also an all-in-focus (AiF) image from a set of images taken at different focus positions (a focal stack), and shows that the method outperforms state-of-the-art approaches both quantitatively and qualitatively while being more efficient at inference time.

Hybrid Neural Fusion for Full-frame Video Stabilization
Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
ICCV, 2021  
project page / arXiv / poster / slides / code / demo / video / Two minute video

This work presents a frame synthesis algorithm for full-frame video stabilization that first estimates dense warp fields from neighboring frames and then synthesizes the stabilized frame by fusing the warped contents.



Explorable Tone Mapping Operators
Chien-Chuan Su, Ren Wang, Hung-Jin Lin, Yu-Lun Liu, Chia-Ping Chen, Yu-Lin Chang, Soo-Chang Pei
ICPR, 2020  
arXiv

This paper proposes a learning-based multimodal tone-mapping method that not only achieves excellent visual quality but also explores style diversity, and shows that it performs favorably against state-of-the-art tone-mapping algorithms both quantitatively and qualitatively.


Learning Camera-Aware Noise Models
Ke-Chi Chang, Ren Wang, Hung-Jin Lin, Yu-Lun Liu, Chia-Ping Chen, Yu-Lin Chang, Hwann-Tzong Chen
ECCV, 2020  
project page / arXiv / code

A data-driven approach in which a camera-aware generative noise model is learned from real-world noise; it quantitatively and qualitatively outperforms existing statistical noise models and learning-based methods.

Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline
Yu-Lun Liu*, Wei-Sheng Lai*, Yu-Sheng Chen, Yi-Lung Kao, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
CVPR, 2020  
project page / arXiv / poster / slides / code / demo / 1-minute video

This work models the HDR-to-LDR image formation pipeline as dynamic range clipping, non-linear mapping by a camera response function, and quantization, and proposes to learn three specialized CNNs to reverse these steps.
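As a rough illustration, the three formation steps can be simulated in a few lines. This is a toy sketch only: the gamma curve standing in for the camera response function and all names here are my own assumptions, not the paper's actual implementation (which learns to invert each step with a CNN).

```python
import numpy as np

def simulate_ldr(hdr, exposure=1.0, gamma=2.2, bits=8):
    """Toy HDR-to-LDR formation: clipping, a non-linear response
    (a simple gamma curve as a stand-in CRF), then quantization."""
    x = np.clip(hdr * exposure, 0.0, 1.0)   # dynamic range clipping
    x = x ** (1.0 / gamma)                  # non-linear camera response
    levels = 2 ** bits - 1
    x = np.round(x * levels) / levels       # quantization to 8-bit levels
    return x

hdr = np.array([0.001, 0.25, 0.5, 1.5, 4.0])  # unbounded scene radiance
ldr = simulate_ldr(hdr)                       # bounded, quantized LDR values
```

Note how the two radiance values above 1.0 become indistinguishable after clipping; recovering them is exactly the ill-posed part the paper's learned inverse pipeline addresses.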

Learning to See Through Obstructions
Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
CVPR, 2020  
project page / arXiv / poster / slides / code / demo / 1-minute video / video / New Scientists

The method leverages the motion differences between the background and the obstructing elements to recover both layers, alternating between estimating dense optical flow fields for the two layers and reconstructing each layer from the flow-warped images via a deep convolutional neural network.


Attention-based View Selection Networks for Light-field Disparity Estimation
Yu-Ju Tsai, Yu-Lun Liu, Yung-Yu Chuang, Ming Ouhyoung
AAAI, 2020  
paper / code / benchmark

A novel deep network for estimating depth maps from a light field image; it generates an attention map indicating the importance of each view and its potential contribution to accurate depth estimation, and enforces symmetry in the attention map to improve accuracy.

Deep Video Frame Interpolation using Cyclic Frame Generation
Yu-Lun Liu, Yi-Tung Liao, Yen-Yu Lin, Yung-Yu Chuang
AAAI, 2019   (Oral Presentation)
project page / paper / poster / slides / code / video

This work introduces a new loss term, the cycle consistency loss, which better utilizes the training data to not only enhance the interpolation results but also maintain performance with less training data.
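The cycle idea can be sketched minimally: frames interpolated from overlapping pairs are interpolated again, and the result should reproduce the original middle frame. The averaging interpolator and all names below are illustrative stand-ins; the paper uses a learned CNN interpolator.

```python
import numpy as np

def interp(f0, f1):
    # Stand-in frame interpolator (simple averaging); the real
    # method uses a learned frame-interpolation CNN here.
    return 0.5 * (f0 + f1)

def cycle_consistency_loss(f0, f1, f2):
    """Interpolate between (f0, f1) and (f1, f2), then interpolate
    between those two synthesized frames; the result should match
    the original middle frame f1."""
    i01 = interp(f0, f1)           # synthesized frame between f0 and f1
    i12 = interp(f1, f2)           # synthesized frame between f1 and f2
    f1_cycle = interp(i01, i12)    # re-interpolated middle frame
    return np.mean(np.abs(f1_cycle - f1))  # L1 cycle consistency loss

# Three consecutive "frames" with linearly increasing brightness.
f0, f1, f2 = (np.full((4, 4), v) for v in (0.0, 0.5, 1.0))
loss = cycle_consistency_loss(f0, f1, f2)  # zero for a perfect interpolator
```

Because the loss reuses the original frames as supervision for re-interpolated outputs, it extracts extra training signal from the same data, which is what allows performance to hold up with fewer training samples.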

Background modeling using depth information
Yu-Lun Liu, Hsueh-Ming Hang
APSIPA, 2014  
paper

This paper focuses on creating a global background model of a video sequence using the depth maps together with the RGB pictures, and develops a recursive algorithm that iterates between the depth map and color pictures.

Virtual view synthesis using backward depth warping algorithm
Du-Hsiu Li, Hsueh-Ming Hang, Yu-Lun Liu
PCS, 2013  
paper

A backward warping process is proposed to replace forward warping; artifacts (particularly those produced by quantization) are significantly reduced, and the subjective quality of the synthesized virtual-view images is thus much improved.


Teaching
CSCS10017: Signals and Systems
NYCU - Spring 2024 (Instructor)
CSIC30107: Video Compression
NYCU - Fall 2023 (Instructor)
CSIC30107: Video Compression
NYCU - Spring 2023 (Instructor)
DEE1315: Probability and Statistics
NCTU - Spring 2013 (Teaching Assistant)

Sponsors

My research is made possible by the generous support of the following organizations.


Stolen from Jon Barron's website.
Last updated May 2024.