11/20/2023

With NeurIPS 2023 coming up, I am happy to share that we will be presenting results both in the main track and at several workshops.

11/02/2023

After a summer of amazing research collaboration with Microsoft’s AI for Good Lab on advancing wildlife conservation efforts, we are eager to announce that a preliminary version of our work is now available on arXiv. We investigate how multimodal foundation models can help biologists automatically identify animal species in camera trap imagery. Our proposed technique, WildMatch, requires no expert-labelled training data and can therefore greatly reduce the cost of image analysis. We leverage the rich visual understanding capabilities of pre-trained vision-language foundation models, adapting them to generate detailed visual descriptions of animals. We then find the closest match in an external knowledge base of animal descriptions built from Wikipedia and other publicly available sources. This is still work in progress, with additional results coming soon.
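
Below is a minimal sketch of the description-matching step, to give a flavor of the approach. The captioning model is hidden behind a placeholder generate_description, and the tiny knowledge base, the choice of text encoder, and all names are illustrative assumptions, not the actual WildMatch implementation.

```python
# Hedged sketch: zero-shot species identification via description matching.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any text encoder works

# Illustrative two-entry knowledge base; the real one is built from
# Wikipedia and other publicly available sources.
knowledge_base = {
    "red fox": "Small canid with reddish fur, a white underside and a bushy tail.",
    "white-tailed deer": "Medium-sized deer with a brown coat and a white tail underside.",
}

def generate_description(image) -> str:
    # Placeholder for the adapted vision-language model that produces a
    # detailed visual description of the animal in the camera trap image.
    raise NotImplementedError

def identify(image) -> str:
    query = encoder.encode([generate_description(image)])        # (1, d)
    names = list(knowledge_base)
    corpus = encoder.encode([knowledge_base[n] for n in names])  # (K, d)
    # Cosine similarity between the generated description and each entry.
    sims = (query @ corpus.T) / (
        np.linalg.norm(query) * np.linalg.norm(corpus, axis=1)
    )
    return names[int(np.argmax(sims))]  # closest match in the knowledge base
```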

arXiv

03/25/2023

The preprint of our new paper DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency is now available on arXiv. In this work, we propose a novel diffusion-based framework for solving inverse problems: we assume that the observation comes from a stochastic degradation process that gradually degrades and adds noise to the original clean image. We reverse this corruption process to incrementally add more and more detail back to the image. By leveraging early stopping, we can freely trade off perceptual quality (how ‘nice’ and realistic an image looks) for better distortion metrics (how faithful it is to our observation), or vice versa.
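
To illustrate the early-stopping trade-off, here is a rough sketch of the reverse process; restore_step stands in for the learned model that undoes one increment of degradation, and all names are assumptions rather than the paper’s actual API.

```python
# Hedged sketch of incremental reconstruction with early stopping.
import torch

def reconstruct(y: torch.Tensor, restore_step, num_steps: int = 1000,
                stop_at: int = 0) -> torch.Tensor:
    """Reverse the degradation from t = num_steps down to t = stop_at.

    stop_at = 0 runs the full reverse process (best perceptual quality);
    a larger stop_at stops early, keeping the estimate closer to the
    observation and trading perceptual quality for lower distortion.
    """
    x = y  # start from the degraded, noisy observation
    for t in range(num_steps, stop_at, -1):
        x = restore_step(x, t)  # add back one increment of detail
    return x
```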

arXiv

01/15/2023

I am thrilled to share that I’m joining Microsoft’s AI for Good Lab as a research intern for the summer of 2023. Microsoft’s AI for Good initiative is at the forefront of using artificial intelligence to address some of the world’s most pressing environmental, humanitarian, and healthcare challenges. In collaboration with Zhongqi Miao, I will be working on leveraging multimodal foundation models to advance wildlife conservation efforts. I’m excited to learn from and make a positive impact with AI for Good researchers!

09/14/2022

Our work HUMUS-Net: a Transformer-convolutional hybrid model for accelerated MRI reconstruction has been accepted at NeurIPS 2022. I am looking forward to sharing our work and, for the first time in a while, interacting with other researchers in the field in person at NeurIPS.

Paper Slides Code

09/12/2022

I am participating in IPAM’s Computational Microscopy long program at UCLA as a Graduate Visiting Researcher. The program brings together leading experts in applied mathematics, physics, biology, materials science, and engineering to encourage debate and collaboration on modern microscopy techniques, such as coherent diffraction imaging and super-resolved fluorescence microscopy (2014 Nobel Prize), with a special focus on cryo-electron microscopy (cryo-EM, 2017 Nobel Prize). These methods produce high-dimensional, multimodal, and extremely noisy data, from which extracting useful information, and eventually scientific knowledge, is challenging. Deep learning has enormous potential to tackle these challenges and may lead to breakthroughs in materials science, quantum devices, and drug discovery.

05/19/2022

I am very excited to share that I will be joining Amazon’s Alexa Perceptual Technologies as an Applied Scientist Intern for the summer, under the mentorship of Rajath Kumar. We will work on improving wake word verification models through semi-supervised learning techniques and data augmentation. I am looking forward to collaborating with Amazon researchers and learning more about their work.

03/15/2022

We published a preprint of our new work HUMUS-Net: a Transformer-convolutional hybrid model for accelerated MRI reconstruction. Deep learning reconstruction techniques in MRI typically use convolution layers as the primary processing unit; however, convolution kernels struggle to model long-range pixel dependencies in images and are content-independent. Transformer architectures are free from these limitations and are rapidly gaining ground in a range of vision applications. Our proposed architecture combines the benefits of both worlds and achieves state-of-the-art results on fastMRI, the largest publicly available MRI dataset.
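
As a toy illustration of this best-of-both-worlds idea, the PyTorch block below pairs a convolutional branch (local features) with self-attention over pixel tokens (long-range, content-dependent interactions). It is an illustrative analogue only, not the actual HUMUS-Net block.

```python
# Hedged sketch of a convolution + attention hybrid block.
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()  # channels must be divisible by num_heads
        self.conv = nn.Sequential(  # local, content-independent filtering
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.GELU(),
        )
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.conv(x)                              # convolutional branch
        b, c, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))  # (B, H*W, C) tokens
        # Self-attention: content-dependent, long-range interactions.
        attn_out, _ = self.attn(tokens, tokens, tokens)
        return x + attn_out.transpose(1, 2).reshape(b, c, h, w)
```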

Paper Slides Code

10/29/2021

Today we hosted the 11th Annual Research Festival at the Ming Hsieh ECE department, with close to 100 posters and guided lab tours organized by the Dynamic Imaging Science Center (DISC) showcasing their new, unique high-performance low-field MRI scanner. My poster on MRAugment won the Best Poster - Honorable Mention award.

Poster

09/28/2021

I have been selected as a Ming Hsieh PhD Scholar for 2021-2022, along with Haleh Akrami, Hefei Liu, Rodrigo Lobos, Qinyi Luo and Qiaochu Zhang. We will work together to organize professional events and support our PhD community. You can find below the slides for my talk Overcoming the data bottleneck in AI for the sciences, presented at the MHI Scholar Finalist Talk Competition.

Finalist Talk Slides

07/01/2021

Our paper Data augmentation for deep learning based accelerated MRI reconstruction with limited data has been accepted for a short talk at ICML 2021. In this work we propose MRAugment, a data augmentation pipeline for MRI that improves reconstruction quality when training data is scarce and helps train deep learning models that are more robust to various forms of distribution shift (different scanner models, anatomies) and to hallucinations.
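
The core trick can be sketched in a few lines: augment the fully sampled ground-truth image in the image domain, then re-simulate the undersampled k-space so the training pair stays physically consistent. Function names below are illustrative, and the actual pipeline treats complex-valued, multi-coil data and measurement noise with considerably more care.

```python
# Hedged sketch of physics-consistent MRI data augmentation.
import numpy as np

def augment_pair(image: np.ndarray, mask: np.ndarray, augment):
    """image: fully sampled (complex) image, mask: k-space sampling mask,
    augment: any geometric/intensity transform (flip, rotation, scaling)."""
    aug_image = np.ascontiguousarray(augment(image))  # augment in image domain
    kspace = np.fft.fft2(aug_image)      # re-simulate the measurement
    measured = kspace * mask             # apply the undersampling mask
    zero_filled = np.fft.ifft2(measured)
    return zero_filled, aug_image        # (network input, training target)

# Example: horizontal flip as the augmentation.
# x, y = augment_pair(img, mask, lambda im: im[:, ::-1])
```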

Paper Code

03/12/2021

I gave a talk on two popular self-supervised learning techniques, BYOL (Grill et al.) and SimSiam (Chen et al.), at our Learning from Signals and Data journal club. Self-supervised learning algorithms have been rapidly catching up with supervised methods recently. They typically build upon a Siamese architecture that can learn good image representations from unlabelled data, so these methods have great potential in scientific applications where labelled training data is scarce or extremely costly to acquire.
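
For reference, the SimSiam-style objective from the talk fits in a few lines of PyTorch; the encoder and predictor networks are assumed to be defined elsewhere, and this is only a sketch of the loss.

```python
# Hedged sketch of the SimSiam objective (Chen et al.).
import torch
import torch.nn.functional as F

def simsiam_loss(encoder, predictor, view1, view2):
    z1, z2 = encoder(view1), encoder(view2)  # projections of two aug. views
    p1, p2 = predictor(z1), predictor(z2)    # predictions
    # The stop-gradient on the target branch is the key ingredient that
    # avoids representational collapse without negative pairs.
    def neg_cos(p, z):
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()
    return 0.5 * (neg_cos(p1, z2) + neg_cos(p2, z1))
```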

Slides

Grill et al., Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning, NeurIPS, 2020

01/18/2021

Our paper 3D phase retrieval at nano-scale via accelerated Wirtinger flow has been accepted at EUSIPCO 2020. High-resolution imaging of small 3D structures is an important problem in biology (protein complexes) and microelectronics (chip manufacturing). Our work introduces a fast algorithm that recovers the 3D structure of such objects more accurately and from fewer projections than previous techniques.
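
For intuition, here is a toy Wirtinger flow iteration for the generic phase retrieval problem y = |Ax|^2, with heavy-ball momentum standing in for acceleration; the paper’s 3D tomographic forward model and actual acceleration scheme differ in the details.

```python
# Hedged sketch of a momentum-accelerated Wirtinger flow iteration.
import numpy as np

def wirtinger_flow(A, y, x0, mu=0.1, beta=0.9, iters=500):
    """A: (m, n) measurement matrix, y: (m,) intensities |Ax|^2, x0: init."""
    x = x0.astype(complex)
    v = np.zeros_like(x)
    for _ in range(iters):
        Ax = A @ x
        # Wirtinger gradient of (1/2m) * sum((|Ax|^2 - y)^2)
        grad = A.conj().T @ ((np.abs(Ax) ** 2 - y) * Ax) / len(y)
        v = beta * v - mu * grad  # heavy-ball momentum step
        x = x + v
    return x
```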

Conference Paper Extended arXiv

12/12/2020

Our work Minimax lower bounds for transfer learning with linear and one-hidden layer neural networks has been accepted for a poster presentation at NeurIPS 2020. We investigate how the number of source samples and the distance between datasets impact model generalization on a target dataset. We introduce a novel metric, the so-called transfer distance, which quantifies how challenging it is to transfer knowledge from one dataset to another.

Paper

10/23/2020

I gave a talk on CycleGAN (Zhu et al.) and its applications in medical imaging at our Learning from Signals and Data journal club. CycleGAN is a powerful image-to-image translation network that learns to map images from one domain (e.g. landscape photos) to another (e.g. paintings), and vice versa, without ever seeing corresponding pairs from the two domains. The key idea is cycle-consistency: if we translate an image and then translate it back, we should recover the original image. CycleGAN and its variants have been very successful at cross-modal image synthesis in medical tasks; that is, they can create synthetic images of a target modality (e.g. MRI) from images of a different source modality (e.g. CT).
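
The cycle-consistency idea can be written down directly; in the sketch below, G_xy and G_yx are the two generators (assumed to be defined elsewhere), and this term is added to the usual adversarial losses.

```python
# Hedged sketch of the CycleGAN cycle-consistency loss.
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_xy, G_yx, x, y, lam=10.0):
    loss_x = F.l1_loss(G_yx(G_xy(x)), x)  # x -> Y -> back to X should equal x
    loss_y = F.l1_loss(G_xy(G_yx(y)), y)  # y -> X -> back to Y should equal y
    return lam * (loss_x + loss_y)        # weighted, added to the GAN losses
```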

Slides

Zhu et al., Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV, 2017