Muhammad Ferjad Naeem
I am a PhD candidate at the Computer Vision Lab at ETH Zürich, supervised by Prof. Luc Van Gool and PD Dr. Federico Tombari.
As part of my PhD, supported by a Google fellowship, I work as a Research Consultant with Google in Zurich. I also collaborate closely with Google DeepMind on Foundational Vision Language Models.
I am interested in building strong multimodal foundational models. I am also interested in distilling the world knowledge of foundational models into smaller task-specific models that can adapt and generalize to novel classes and environments.
Previously, I obtained my Master's degree from the Technical University of Munich, with a focus on Generative Models at Naver AI Lab and Zero-shot Learning at UniTübingen AI Research.
I have also been an intern at NVIDIA.
Having completed my PhD requirements, I am now on the job market. I am looking for Research Scientist roles focused on building strong foundational models and using them to solve various downstream tasks.
Email / Google Scholar / GitHub / Twitter / CV
Publications
SILC: Improving vision language pretraining with self-distillation
Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai, Lukas Hoyer, Luc Van Gool, Federico Tombari
arXiv, 2024
I2DFormer+: Learning Image to Document Summary Attention for Zero-Shot Image Classification
Muhammad Ferjad Naeem, Yongqin Xian, Luc Van Gool, Federico Tombari
IJCV, 2024
Learning to Prompt with Text Only Supervision for Vision-Language Models
Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer, Luc Van Gool, Federico Tombari
arXiv, 2024
SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
Lukas Hoyer, David Joseph Tan, Muhammad Ferjad Naeem, Luc Van Gool, Federico Tombari
arXiv, 2024
Introducing Language Guidance in Prompt-based Continual Learning
Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc Van Gool, Didier Stricker, Federico Tombari, Muhammad Zeshan Afzal
ICCV, 2023
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Muhammad Ferjad Naeem, Muhammad Gul Zain Ali Khan, Yongqin Xian, Muhammad Zeshan Afzal, Didier Stricker, Luc Van Gool, Federico Tombari
CVPR, 2023 (Highlight paper, top 2.5%)
Learning Attention Propagation for Compositional Zero-Shot Learning
Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc Van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal
WACV, 2023
I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification
Muhammad Ferjad Naeem, Yongqin Xian, Luc Van Gool, Federico Tombari
NeurIPS, 2022
3D Compositional Zero-shot Learning with DeCompositional Consensus
Muhammad Ferjad Naeem*, Evin Pinar Ornek*, Yongqin Xian, Luc Van Gool, Federico Tombari
*Equal contribution
ECCV, 2022
Learning Graph Embeddings for Open World Compositional Zero-shot Learning
Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata
TPAMI, 2022
Learning Graph Embeddings for Compositional Zero-shot Learning
Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata
CVPR, 2021
Open World Compositional Zero-Shot Learning
Massimiliano Mancini*, Muhammad Ferjad Naeem*, Yongqin Xian, Zeynep Akata
*Equal contribution
CVPR, 2021
Reliable Fidelity and Diversity Metrics for Generative Models
Muhammad Ferjad Naeem*, Seong Joon Oh*, Youngjung Uh, Yunjey Choi, Jaejun Yoo
*Equal contribution
ICML, 2020
Deep Learning Under the Microscope: Improving the Interpretability of Medical Imaging Neural Networks
Magdalini Paschali*, Muhammad Ferjad Naeem*, Walter Simson, Katja Steiger, Martin Mollenhauer, Nassir Navab
*Equal contribution
arXiv, 2019
A Multi-Faceted OCR Framework for Artificial Urdu News Ticker Text Recognition
Sami-Ur-Rehman, Burhan Ul Tayyab, Muhammad Ferjad Naeem, Adnan Ul-Hasan, Faisal Shafait
DAS, 2018
Impact of Ligature Coverage on Training Practical Urdu OCR Systems
Muhammad Ferjad Naeem, Noor ul Sehr Zia, Aqsa Ahmed Awan, Faisal Shafait, Adnan ul Hasan
ICDAR, 2017