Yu-Lun (Alex) Liu | 劉育綸

I am an Assistant Professor in the Department of Computer Science at National Yang Ming Chiao Tung University. I work on image/video processing, computer vision, and computational photography, particularly on essential problems requiring machine learning with insights from geometry and domain-specific knowledge.

Prior to joining NYCU, I was a Research Scientist Intern at Meta Reality Labs Research and a senior software engineer at MediaTek Inc. I received my PhD from NTU, CSIE in 2022, where I was a member of CMLab.

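I received the Ministry of Education Yushan Young Fellow award (教育部玉山青年學者), the NSTC 2030 Emerging Young Scholar award (國科會 2030 新秀學者), the Google Research Scholar Award, and the CVPR 2024 Outstanding Reviewer award.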

Dear prospective students: I am looking for undergraduate, master's, and Ph.D. students, as well as postdocs, to join my group. If you are interested in working with me on research in image processing, computer vision, and machine learning, don't hesitate to contact me directly with your CV and transcripts.

Email  /  CV  /  Google Scholar  /  Facebook  /  Instagram  /  Github  /  YouTube

For those who know me personally and might be thinking: who is this guy?
Hover over the photo to see what I usually look like before a paper submission deadline.

Meet my cats: 虎皮捲 and 雞蛋糕!


Timeline
2023 -
Assistant Professor at NYCU
2022
Research Scientist Intern at Meta
Computational Photography Group
Seattle, WA, USA
2017 - 2022
Senior Software Engineer at MediaTek Inc.
Multimedia Technology Development (MTD) Division
Intelligent Vision Processing (IVP) Department
2014 - 2017
Software Engineer at MediaTek Inc.
Multimedia Technology Development (MTD) Division
Intelligent Vision Processing (IVP) Department
2012 - 2014
NCTU: MS
CommLab, Institute of Electronics
2008 - 2012
NCTU: BS
Department of Electronics Engineering

Research Group

Postdoc

赵祯俊
PhD, CUHK

PhD Students

黃怡川
Institute of
Computer Science

黃正輝
NTU CSIE
with Yung-Yu Chuang

端木竣偉
Institute of
Computer Science

Research Assistants

葉長瀚
BS, NYCU CE & CS
now MS student @ UIUC

陳霆軒
BS, NCHU AMATH
now MS student @ USC

簡浩任
MS, UCLA ECE

李沅罡
MS, NTU GICE

MS Students

林晉暘
Institute of Data Science
with Wei-Chen Chiu

吳中赫
Institute of
Multimedia

嚴士函
Institute of
Multimedia

陳捷文
Institute of
Computer Science

許皓翔
Institute of Computer Science
with Wen-Chieh Lin

范丞德
Institute of Computer Science
with Yu-Chee Tseng

張宸瑋
Institute of AI Innovation
with Jiun-Long Huang

林佑庭
Institute of Computer Science
with Yen-Yu Lin

黃亭幃
Institute of Computer Science
with Yen-Yu Lin

陳映寰
Institute of
Computer Science

張欀齡
Institute of
Computer Science

鄭淮薰
Institute of
Computer Science

柯柏旭
Institute of Computer Science
with Wei-Chen Chiu

吳秉宸
Program of
Artificial Intelligence

王皓平
Institute of
Multimedia

戚維凌
Institute of
Multimedia

俞柏帆
Institute of
Computer Science

黃靖恩
Institute of Data Science
with Jun-Cheng Chen

BS Students

蘇智海
NYCU CS

胡智堯
NTU MED

李明謙
NYCU ARETEHP

孫揚喆
NYCU MED

陳士弘
NYCU MATH & CS
now MS student @ NYCU

陳昱佑
NYCU CS

李宗諺
NYCU CS

楊宗儒
NYCU CS

陳凱昕
NYCU CS

劉珆睿
NYCU CS
now MS student @ UIUC

吳俊宏
NYCU CS
now MS student @ NYCU

蔡師睿
NYCU CS

張維程
NYCU CS

李杰穎
NYCU CS
now exchange student @ ETHZ

陳楊融
NYCU CS

吳定霖
NYCU CS

司徒立中
NYCU CS

朱驛庭
NYCU CS

徐和
NYCU CS

翁晨昱
NTHU PHYS

何義翔
NYCU CS

謝侑哲
NYCU CS

蔡聿瑋
NYCU CS

鄭名翔
NYCU ARETEHP

葉家蓁
NYCU CS

林揚森
NYCU ME & CS

蔡昀錚
NYCU CS
with Yen-Yu Lin

邢捷
Shanghai University
MATH

Alumni

鄭伯俞
MS Spring 2023
with Wei-Chen Chiu
now @ Qualcomm

陳俊瑋
BS 2023
Next MS Student
NYCU

林奕杰
BS 2023-2024

羅宇呈
BS 2023-2024

郭玠甫
BS 2023-2024

葉柔昀
BS 2023-2024

鄭又豪
BS 2023-2024

丁祐承
BS 2023-2024

謝明翰
BS 2023-2024

施惟智
BS 2023-2024

朱劭璿
MS Spring 2024
now PhD Student
@ Johns Hopkins


Research

Representative Papers


2024
FIPER: Generalizable Factorized Fields for Joint Image Compression and Super-Resolution
Yang-Che Sun, Cheng Yu Yeo, Ernie Chu, Jun-Cheng Chen, Yu-Lun Liu
arXiv, 2024  
project page / arXiv / code [coming soon]

This work derives an SR model, which includes a Coefficient Backbone and Basis Swin Transformer for generalizable Factorized Fields, and introduces a merged-basis compression branch that consolidates shared structures, further optimizing the compression process.

SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes
Cheng-De Fan, Chen-Wei Chang, Yi-Ruei Liu, Jie-Ying Lee, Jiun-Long Huang, Yu-Chee Tseng, Yu-Lun Liu
arXiv, 2024
project page / arXiv / results / code [coming soon]

SpectroMotion is a novel approach that combines 3D Gaussian Splatting (3DGS) with physically-based rendering (PBR) and deformation fields to reconstruct dynamic specular scenes. It is the only existing 3DGS method capable of synthesizing photorealistic real-world dynamic specular scenes.

FrugalNeRF: Fast Convergence for Few-shot Novel View Synthesis without Learned Priors
Chin-Yang Lin, Chung-Ho Wu, Chang-Han Yeh, Shih-Han Yen, Cheng Sun, Yu-Lun Liu
arXiv, 2024
project page / arXiv / code [coming soon]

FrugalNeRF is a novel few-shot NeRF framework that leverages weight-sharing voxels across multiple scales to efficiently represent scene details and guides training without relying on externally learned priors, enabling full utilization of the training data.

DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models
Chang-Han Yeh, Chin-Yang Lin, Zhixiang Wang, Chi-Wei Hsiao, Ting-Hsuan Chen, Yu-Lun Liu
arXiv, 2024
project page / arXiv / code / demo

It is shown that this method not only achieves top performance in zero-shot video restoration but also significantly surpasses trained models in generalization across diverse datasets and extreme degradations.

DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation
Chun-Hung Wu, Shih-Hong Chen, Chih-Yao Hu, Hsin-Yu Wu, Kai-Hsin Chen, Yu-You Chen, Chih-Hai Su, Chih-Kuo Lee, Yu-Lun Liu
arXiv, 2024
project page / arXiv / code & dataset [coming soon] / colab

DeNVeR is an unsupervised approach for vessel segmentation in X-ray videos that requires no annotated ground truth, providing a robust, data-efficient tool for disease diagnosis and treatment planning and setting a new standard for future research in video vessel segmentation.

ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration
Chi-Wei Hsiao, Yu-Lun Liu, Cheng-Kun Yang, Sheng-Po Kuo, Yucheun Kevin Jou, Chia-Ping Chen
NeurIPS, 2024
project page / paper / code

This work proposes ReF-LDM, an adaptation of LDM designed to generate HQ face images conditioned on one LQ image and multiple HQ reference images, and designs a timestep-scaled identity loss, enabling the LDM-based model to focus on learning the discriminating features of human faces.

Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation
Ning-Hsu Wang, Yu-Lun Liu
NeurIPS, 2024
project page / arXiv / code [coming soon] / demo

This work proposes a new depth estimation framework that utilizes unlabeled 360-degree data effectively and uses state-of-the-art perspective depth estimation models as teacher models to generate pseudo labels through a six-face cube projection technique, enabling efficient labeling of depth in 360-degree images.

NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
Ting-Hsuan Chen, Jiewen Chan, Hau-Shiang Shiu, Shih-Han Yen, Chang-Han Yeh, Yu-Lun Liu
NeurIPS, 2024
project page / arXiv / results / code / video / demo

NaRCan is a video editing framework that integrates a hybrid deformation field and a diffusion prior to generate high-quality natural canonical images representing the input video. It employs multi-layer perceptrons (MLPs) to capture local residual deformations, enhancing the model's ability to handle complex video dynamics.

Precise Pick-and-Place using Score-Based Diffusion Networks
Shih-Wei Guo, Tsu-Ching Hsiao, Yu-Lun Liu, Chun-Yi Lee
IROS, 2024
project page / arXiv / video

A novel coarse-to-fine continuous pose diffusion method that enhances the precision of pick-and-place operations in robotic manipulation tasks by facilitating accurate perception of object poses, which enables more precise object manipulation.

GenRC: 3D Indoor Scene Generation from Sparse Image Collections
Ming-Feng Li, Yueh-Feng Ku, Hong-Xuan Yen, Chi Liu, Yu-Lun Liu, Albert Y. C. Chen, Cheng-Hao Kuo, Min Sun
ECCV, 2024  
project page / arXiv / code

The proposed GenRC, an automated training-free pipeline to complete a room-scale 3D mesh with high-fidelity textures, outperforms state-of-the-art methods under most appearance and geometric metrics on the ScanNet and ARKitScenes datasets, even though GenRC is not trained on these datasets and does not use predefined camera trajectories.



DriveEnv-NeRF: Exploration of A NeRF-Based Autonomous Driving Environment for Real-World Performance Validation
Mu-Yi Shen, Chia-Chi Hsu, Hao-Yu Hou, Yu-Chen Huang, Wei-Fang Sun, Chia-Che Chang, Yu-Lun Liu, Chun-Yi Lee
ICRA RoboNerF Workshop, 2024  
project page / video / arXiv / code

The DriveEnv-NeRF framework, which leverages Neural Radiance Fields (NeRF) to enable the validation and faithful forecasting of the efficacy of autonomous driving agents in a targeted real-world scene, can serve as a training environment for autonomous driving agents under various lighting conditions.

Matting by Generation
Zhixiang Wang, Baiang Li, Jian Wang, Yu-Lun Liu, Jinwei Gu, Yung-Yu Chuang, Shin'ichi Satoh
SIGGRAPH, 2024  
project page / arXiv / code [coming soon] / slides / supplement

An innovative approach for image matting that redefines the traditional regression-based task as a generative modeling challenge and harnesses the capabilities of latent diffusion models, enriched with extensive pre-trained knowledge, to regularize the matting process.

BoostMVSNeRFs: Boosting MVS-based NeRFs to Generalizable View Synthesis in Large-scale Scenes
Chih-Hai Su*, Chih-Yao Hu*, Shr-Ruei Tsai*, Jie-Ying Lee*, Chin-Yang Lin, Yu-Lun Liu
SIGGRAPH, 2024  
project page / arXiv / code / video

This paper presents a novel approach called BoostMVSNeRFs to enhance the rendering quality of MVS-based NeRFs in large-scale scenes, and identifies limitations in MVS-based NeRF methods, such as restricted viewport coverage and artifacts due to limited input views.


Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation
Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang, Chun-Pei Chen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Yung-Yu Chuang, Yen-Yu Lin
CVPR, 2024  
arXiv / code

This paper addresses text-supervised semantic segmentation, aiming to learn a model capable of segmenting arbitrary visual concepts within images by using only image-text pairs without dense annotations.


HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses
Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang, Wu Liu, Xinchen Liu, Zheng Wang
CVPR, 2024  
project page / arXiv / code

This work reconstructs the previous HumanNeRF approach, combining explicit and implicit human representations with both general and specific mapping processes, and shows that explicit shape can filter the information used to fit implicit representation, and frozen general mapping combined with point-specific mapping can effectively avoid overfitting and improve pose generalization performance.


Dual Associated Encoder for Face Restoration
Yu-Ju Tsai, Yu-Lun Liu, Lu Qi, Kelvin C.K. Chan, Ming-Hsuan Yang
ICLR, 2024  
project page / arXiv

This work proposes a novel dual-branch framework named DAEFR, which introduces an auxiliary LQ branch that extracts crucial information from the LQ inputs and incorporates association training to promote effective synergy between the two branches, enhancing code prediction and output quality.

DisCO: Portrait Distortion Correction with Perspective-Aware 3D GANs
Zhixiang Wang, Yu-Lun Liu, Jia-Bin Huang, Shin'ichi Satoh, Sizhuo Ma, Guru Krishnan, Jian Wang
IJCV, 2024  
project page / arXiv

This work proposes a simple yet effective method for correcting perspective distortion in a single close-up face image by applying GAN inversion to the perspective-distorted input, and introduces several supporting techniques: starting from a short distance, optimization scheduling, reparametrizations, and geometric regularization.


Improving Robustness for Joint Optimization of Camera Poses and Decomposed Low-Rank Tensorial Radiance Fields
Bo-Yu Cheng, Wei-Chen Chiu, Yu-Lun Liu
AAAI, 2024  
project page / arXiv / code

This work proposes an algorithm that jointly refines camera poses and scene geometry represented by a decomposed low-rank tensor, using only 2D images as supervision. It achieves an effect equivalent to brute-force 3D convolution while incurring little computational overhead.


2023
Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction
Su-Kai Chen, Hung-Lin Yen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Wen-Hsiao Peng, Yen-Yu Lin
ICCV, 2023  
project page / arXiv / code / video

This work proposes the continuous exposure value representation (CEVR), which uses an implicit function to generate LDR images with arbitrary EVs, including those unseen during training, to improve HDR reconstruction.

ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection
Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun
ICCV, 2023  
project page / arXiv

The studies indicate that the proposed image-induced geometry-aware representation enables image-based methods to attain higher detection accuracy than the seminal point-cloud-based method, VoteNet, in two practical scenarios: (1) scenarios where point clouds are sparse and noisy, as in ARKitScenes, and (2) scenarios that involve diverse object classes, particularly classes of small objects, as in ScanNet200.

Progressively Optimized Local Radiance Fields for Robust View Synthesis
Andreas Meuleman, Yu-Lun Liu, Chen Gao, Jia-Bin Huang, Changil Kim, Min H. Kim, Johannes Kopf
CVPR, 2023  
project page / paper / code / video

This work presents an algorithm for reconstructing the radiance field of a large-scale scene from a single casually captured video, and shows that progressive optimization significantly improves the robustness of the reconstruction.

Robust Dynamic Radiance Fields
Yu-Lun Liu, Chen Gao, Andreas Meuleman, Hung-Yu Tseng, Ayush Saraf, Changil Kim, Yung-Yu Chuang, Johannes Kopf, Jia-Bin Huang
CVPR, 2023  
project page / arXiv / code / video

This work addresses the robustness issue by jointly estimating the static and dynamic radiance fields along with the camera parameters (poses and focal length) and shows favorable performance over the state-of-the-art dynamic view synthesis methods.


2022



Denoising Likelihood Score Matching for Conditional Score-based Data Generation
Chen-Hao Chao, Wei-Fang Sun, Bo-Wun Cheng, Yi-Chen Lo, Chia-Che Chang, Yu-Lun Liu, Yu-Lin Chang, Chia-Ping Chen, Chun-Yi Lee
ICLR, 2022  
arXiv / OpenReview

This work formulates a novel training objective, called the Denoising Likelihood Score Matching (DLSM) loss, for the classifier to match the gradients of the true log likelihood density, and concludes that the conditional scores can be accurately modeled and the effect of the score mismatch issue is alleviated.


2021
Learning to See Through Obstructions with Layered Decomposition
Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
TPAMI, 2021  
project page / arXiv / code / demo / video

This work alternates between estimating dense optical flow fields of the two layers and reconstructing each layer from the flow-warped images via a deep convolutional neural network, which helps accommodate potential errors in the flow estimation and brittle assumptions such as brightness consistency.




Bridging Unsupervised and Supervised Depth from Focus via All-in-Focus Supervision
Ning-Hsu Wang, Ren Wang, Yu-Lun Liu, Yu-Hao Huang, Yu-Lin Chang, Chia-Ping Chen, Kevin Jou
ICCV, 2021  
project page / arXiv / code

This paper proposes a method to estimate not only a depth map but also an all-in-focus (AiF) image from a set of images with different focus positions (known as a focal stack), and shows that this method outperforms state-of-the-art methods both quantitatively and qualitatively while also being more efficient at inference time.

Hybrid Neural Fusion for Full-frame Video Stabilization
Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
ICCV, 2021  
project page / arXiv / poster / slides / code / demo / video / Two minute video

This work presents a frame synthesis algorithm for full-frame video stabilization that first estimates dense warp fields from neighboring frames and then synthesizes the stabilized frame by fusing the warped contents.


2020


Explorable Tone Mapping Operators
Chien-Chuan Su, Ren Wang, Hung-Jin Lin, Yu-Lun Liu, Chia-Ping Chen, Yu-Lin Chang, Soo-Chang Pei
ICPR, 2020  
arXiv

This paper proposes a learning-based multimodal tone-mapping method that not only achieves excellent visual quality but also explores style diversity, and shows that the proposed method performs favorably against state-of-the-art tone-mapping algorithms both quantitatively and qualitatively.


Learning Camera-Aware Noise Models
Ke-Chi Chang, Ren Wang, Hung-Jin Lin, Yu-Lun Liu, Chia-Ping Chen, Yu-Lin Chang, Hwann-Tzong Chen
ECCV, 2020  
project page / arXiv / code

A data-driven approach in which a generative noise model is learned from real-world noise. The model is camera-aware and outperforms existing statistical noise models and learning-based methods both quantitatively and qualitatively.

Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline
Yu-Lun Liu*, Wei-Sheng Lai*, Yu-Sheng Chen, Yi-Lung Kao, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
CVPR, 2020  
project page / arXiv / poster / slides / code / demo / 1-minute video

This work models the HDR-to-LDR image formation pipeline as dynamic range clipping, non-linear mapping by a camera response function, and quantization, and proposes to learn three specialized CNNs to reverse these steps.

Learning to See Through Obstructions
Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang
CVPR, 2020  
project page / arXiv / poster / slides / code / demo / 1-minute video / video / New Scientist

The method leverages the motion differences between the background and the obstructing elements to recover both layers, and alternates between estimating dense optical flow fields of the two layers and reconstructing each layer from the flow-warped images via a deep convolutional neural network.


Attention-based View Selection Networks for Light-field Disparity Estimation
Yu-Ju Tsai*, Yu-Lun Liu*, Yung-Yu Chuang, Ming Ouhyoung
AAAI, 2020  
paper / code / benchmark

A novel deep network for estimating depth maps from a light field image. It generates an attention map indicating the importance of each view and its potential for contributing to accurate depth estimation, and enforces symmetry in the attention map to improve accuracy.


2019
Deep Video Frame Interpolation using Cyclic Frame Generation
Yu-Lun Liu, Yi-Tung Liao, Yen-Yu Lin, Yung-Yu Chuang
AAAI, 2019   (Oral Presentation)
project page / paper / poster / slides / code / video

This work introduces a new loss term, the cycle consistency loss, which makes better use of the training data: it not only enhances the interpolation results but also maintains performance better when less training data is available.


2014
Background modeling using depth information
Yu-Lun Liu, Hsueh-Ming Hang
APSIPA, 2014  
paper

This paper focuses on creating a global background model of a video sequence using the depth maps together with the RGB pictures, and develops a recursive algorithm that iterates between the depth map and color pictures.


2013
Virtual view synthesis using backward depth warping algorithm
Du-Hsiu Li, Hsueh-Ming Hang, Yu-Lun Liu
PCS, 2013  
paper

A backward warping process is proposed to replace the forward warping process. The artifacts (particularly the ones produced by quantization) are significantly reduced, so the subjective quality of the synthesized virtual view images is much improved.


Miscellanea
Publication Chair, ACCV 2024
Conference Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, SIGGRAPH, SIGGRAPH Asia, AAAI, IJCAI
Journal Reviewer: TPAMI, IJCV, TIP, TOG

Teaching
CSIC30107: Video Compression
NYCU - Spring 2023, Fall 2023, Fall 2024 (Instructor)
CSCS10017: Signals and Systems
NYCU - Spring 2024 (Instructor)
DEE1315: Probability and Statistics
NCTU - Spring 2013 (Teaching Assistant)

Funding

My research is made possible by the generous support of the following organizations.


Stolen from Jon Barron's website.
Last updated Nov 2024.