I'm a Researcher. Developer. Programmer.

Not so long ago, Mudit was born in Varanasi, India. Since then he has attended several elementary schools, but none could save him from JEE. He earned his B.Tech in Information Technology from Delhi College of Engineering, somehow. He believes computer science chose him.

Currently he is pursuing his Ph.D. in Computer Science at Arizona State University, advised by Dr. Subbarao Kambhampati. His research interests lie in Explainable AI and Human-in-the-Loop AI. He is currently focusing on how to explain plans and policies to fellow humans in the paradigms of AI Planning and Reinforcement Learning.

In my free time I like to post paper reviews on YouTube @Papers & Chill and play tons of chess! Find me on lichess @WhatsappOnly and chess.com @pawnTakesPawnTakes

Education

Arizona State University

Ph.D. in Computer Science
(2019-Present)
CGPA : 4.0/4.0

Delhi Technological University

B.Tech in Information Technology
(2015-2019)
Gold Medalist
CGPA : 9.51/10.0

Ramjas School Pusa Road

Alumnus (2008-2015)
12th : 96.4%
10th : 10 CGPA

Experience


Summer 2023
Cupertino, USA

Machine Learning Research Intern, Apple Inc.

Research with the Machine Learning Research (MLR) group, advised by Rin Metcalf and Barry Theobald. Work: Hindsight PRIORs for Reward Learning from Human Preferences (ICLR 2024).


Summer 2022
Cupertino, USA

Machine Learning Research Intern, Apple Inc.

Preference-based Reinforcement Learning research with the Machine Learning Research (MLR) group, advised by Rin Metcalf and Barry Theobald. Work: Symbol Guided Hindsight Priors for Reward Learning from Human Preferences, presented at IROS RLCONFORM and NeurIPS HILL 2022.



Summer 2021
Santa Clara, USA

Deep Learning Software Engineering Intern, Intel Corporation

• First analysis of the float32 ResNet50 architecture on Intel IceLake (ICX) machines.
• Proposed several parallel-computing optimizations, such as shared processes, to achieve BFloat16 performance (as benchmarked on CooperLake machines) on an ICX cluster.
• Additionally, first to provide the Best Known Method (an automated way) for working with ResNet50 on the Intel Endeavour Cluster.
• In parallel, first to work with quantized ResNet models to show discrepancies in saliency-based explanations between the original RN50 and the quantized RN50.

Summer 2018
Bangalore, India 

Software Engineering Intern, Samsung Semiconductor India Research

• Created a DRAM Bank Simulator (400 times faster) with enhanced Fault Classes.
• Novel approach to Redundancy Analysis (RA) algorithms through state space reduction schemes, and beating RA through Monte Carlo Tree Search and Residual Networks.
• Awarded Best Intern Project at SSIR.

Summer 2017
Bangalore, India 

Software Engineering Intern, Samsung Semiconductor India Research

• Diagnosed issues with SSDs and implemented an SSD simulator for Read/Write/Garbage Collection.
• Created an LSTM-based algorithm, Stream Selection for Smart Data Categorization (STRASDAC), to reduce write-wearing in SSDs and in turn further improve Garbage Collection.
• Reached Best Intern Project Finals at SSIR.

Research

2024

  • Hindsight PRIORs for Reward Learning from Human Preferences

    Mudit Verma, Katherine Metcalf

    ICLR 2024

    Paper Poster
  • Theory of Mind abilities of Large Language Models in Human-Robot Interaction : An Illusion?

    Mudit Verma*, Siddhant Bhambri*, Subbarao Kambhampati

    HRI 2024

    Invited Talk : AGI Leap Summit 2024

    Paper

2023

  • Trust-Aware Planning: Modeling Trust Evolution in Iterated Human-Robot Interaction.

    Zahra Zahedi, Mudit Verma, Sarath Sreedharan, Subbarao Kambhampati

    Human Robot Interaction (HRI)

    Paper Poster
  • Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments.

    Tung Thai, Mudit Verma, Utkarsh Soni, Gopalakrishnan S., Shen M., Garg M., Kalani A., Vaidya N., Kambhampati S., Varshney N., Baral C., Sinapov J., Scheutz M.

    AAMAS Extended Abstract

    Paper
  • Preference Proxies: Evaluating Large Language Models in capturing Human Preferences in Human-AI Tasks

    Mudit Verma*, Siddhant Bhambri*, Subbarao Kambhampati

    Theory of Mind Workshop and Many Facets of Preference Learning Workshop (Oral) at ICML 2023.

    Paper Poster
  • Exploiting Action Distances for Reward Learning from Human Preferences.

    Mudit Verma*, Siddhant Bhambri*, Subbarao Kambhampati

    In Many Facets of Preference Learning Workshop at ICML 2023.

    Paper Slides
  • Data Driven Reward Initialization for Preference based Reinforcement Learning

    Mudit Verma, Subbarao Kambhampati

    In AAAI R2HCAI 2023.

    Paper Slides
  • Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning.

    Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

    In AAAI R2HCAI 2023.

    Paper Slides

2022

  • Symbol Guided Hindsight Priors for Reward Learning from Human Preferences

    Mudit Verma and Katherine Metcalf

    NeurIPS HILL 2022

    IROS RLCONFORM 2022

    Paper
  • Advice Conformance Verification by Reinforcement Learning agents for Human-in-the-Loop

    Mudit Verma, Ayush Kharkwal, Subbarao Kambhampati

    NeurIPS HILL 2022

    IROS RLCONFORM 2022

    Paper Video
  • Towards customizable reinforcement learning agents: Enabling preference specification through online vocabulary expansion

    Utkarsh Soni, Sarath Sreedharan, Mudit Verma, Subbarao Kambhampati

    NeurIPS HILL 2022

    Paper
  • Computing Policies That Account for the Effects of Human Uncertainty During Execution in Markov Decision Processes

    Sriram Gopalakrishnan, Mudit Verma, Subbarao Kambhampati

    ICAPS Workshop on Explainable AI Planning (XAIP) 2022

    Paper
  • Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Inscrutable Representations

    Sarath Sreedharan, Utkarsh Soni, Mudit Verma, Siddharth Srivastava and Subbarao Kambhampati

    ICLR 2022

    Paper Video
  • Symbols as a Lingua Franca for Bridging Human-AI Chasm for Explainable and Advisable AI Systems.

    Subbarao Kambhampati, Sarath Sreedharan, Mudit Verma, Yantian Zha, Lin Guan

    In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI) Blue Sky Track

    Paper
  • Modeling the interplay between human trust and monitoring

    Zahra Zahedi, Sarath Sreedharan, Mudit Verma and Subbarao Kambhampati

    HRI 2022 (Late breaking paper)

    Paper

2021

  • Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation.

    Lin Guan, Mudit Verma, Sihang Guo, Ruohan Zhang, Subbarao Kambhampati

    In Advances in Neural Information Processing Systems (NeurIPS), Spotlight

    Paper
  • Trust-Aware Planning: Modeling Trust Evolution in Longitudinal Human-Robot Interaction

    Zahra Zahedi, Mudit Verma, Sarath Sreedharan, Subbarao Kambhampati

    In ICAPS 2021 Workshop on Explainable AI Planning; also in ICAPS 2021 Workshop on Planning and Robotics

    Paper
  • Synthesizing Policies That Account For Human Execution Errors Caused By State Aliasing In Markov Decision Processes

    Sriram Gopalakrishnan, Mudit Verma, Subbarao Kambhampati

    In ICAPS 2021 Workshop on Explainable AI Planning.

    Paper

2020

  • Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Black Box Simulators

    Sarath Sreedharan, Utkarsh Soni, Mudit Verma, Siddharth Srivastava, Subbarao Kambhampati

    ICML Workshop on Human in the Loop Learning (HILL)

    Paper Poster
  • Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning

    Lin Guan*, Mudit Verma*, Sihang Guo, Ruohan Zhang, Subbarao Kambhampati

    NeurIPS Deep Reinforcement Learning Workshop (DRL)
    NeurIPS Workshop on Human And Model in the Loop Evaluation and Training Strategies (HAMLETS)

    Paper
  • Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning

    Lin Guan*, Mudit Verma*, Subbarao Kambhampati

    ICML Workshop on Human in the Loop Learning (HILL)

    Paper Poster
  • Fine-grained Language Identification with Multilingual CapsNet Model.

    Mudit Verma, Arun Balaji Buduru

    IEEE International Conference on Multimedia Big Data (BigMM)

    Paper Slides

2019

  • A Novel Framework for Neural Architecture Search in the Hill Climbing Domain.

    Mudit Verma, Pradyumna Sinha, Karan Goyal, Apoorva Verma, Seba Susan

    IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)

    Paper
  • Making Smart Homes Smarter: Optimizing Energy Consumption with Human in the Loop

    Mudit Verma, Siddhant Bhambri, Saurabh Gupta, Arun Balaji Buduru

    Paper

Teaching & Service

  • Teaching Assistant, CSE 471 - Introduction To Artificial Intelligence, ASU (Fall 2019) by Dr. Subbarao Kambhampati

  • Reviewer, ICAPS XAIP-2022, ICAPS XAIP-2021

  • PC Member, ICLR-2023

  • PC Member/Reviewer, ICML-2023, ICML-2022

  • PC Member, IJCAI-2024

  • PC Member/Reviewer, NeurIPS-2023, NeurIPS-2022, NeurIPS GenPlan 2023

  • PC Member/Reviewer, AAAI-2023, AAAI-2022

Awards

  • ASU SCAI Doctoral Fellowship, ASU 2024

  • ASU SCAI Doctoral Fellowship, ASU 2023

  • Engineering Graduate Fellowship, ASU 2022

  • ASU University Graduate/Doctoral Fellowship, ASU 2019

  • DTU Merit Department Rank Scholarship BTech, DTU, 2019, 2018, 2017, 2016

  • 4th, Hack In The North (IIIT Allahabad), 2018

  • Selected for Education Innovation Mentorship Programme, ReadAlliance, 2018

  • Department Topper for 6 consecutive semesters, DTU, 2018

  • 1st, READing Hackathon (USAID), 2017

  • Pramod Jain Scholarship, best student at DTU, 2017

  • Top 15, World Food India Hackathon, 2017

  • Award for Exemplary Contribution, Computer Society of India-DTU Chapter, 2017

  • Interest Development Group Head, CSI-DTU Chapter, 2017

  • 46th Rank at HackerEarth MLChallenge-1, 2017

  • Top 10, Synergy DTU-Hack, DTU, 2017

Projects & Other Stuff

  • Technical Report, Perfect Observability is a Myth: Restraining Bolts in the Real World. Spring 2021 Paper

  • Technical Report, Implementation and Analysis of Recommender Systems. Spring 2021 Paper

  • Technical Report, Diverging Emerging Field of Multi-Task Reinforcement Learning.

  • Colors of the Desert: used D3 to highlight that deserts are indeed colorful. Spring 2020 Paper

  • Technical Report, Colors of the Desert. Spring 2020 Paper

  • Technical Report, Randomly Wired Networks are on the rise, have we been creating wrong Networks all along? Fall 2019 Paper Slides Code

  • Shut The Fake Up: App/Website. Wisdom of Majority & AI for Fake News detection.

  • Text Summarization: Human-like summarization using Pointer Generator Networks.

  • StressOut: App to check one’s stress levels and suggest better work timings to bring relief through Machine Learning.

  • CookHub: Open Source Community for Recipes where one can chat, push, pull, fork, collaborate & view trending recipes and contributors.

  • Tutoring All Children (TAC): App that adapts and teaches children/adults (especially dyslexic) to read/write/recognize using ML Techniques.