Introduction
Things I looked at while prepping for interviews and generally building intuition about AI/ML problems during my PhD. It is by no means a complete list, but it reflects some things I put down in my notes. Maybe dump it into Gemini and ask it to filter for what you want to study.
These are organized by Google's Gemini.
Some topics I had experience in because of my research / interests:
ML fundamentals
Math / Stats / Probability
Loss functions / Auxiliary losses (thesis)
LLM fundamentals / prompting / limits / uses (thesis)
PbRL / RLHF / RL (thesis)
User study / AB testing / significance testing (prior works)
Active learning (prior works)
Neural Architecture Search (prior works)
Human-aware AI (prior works)
Table of Contents
Good Scientist can reason
1. Recommended Reading
1.1 Reinforcement Learning
Reinforcement Learning: An Introduction - Richard S. Sutton and Andrew G. Barto
Dynamic Programming and Optimal Control - Dimitri P. Bertsekas
Algorithms for Reinforcement Learning - Csaba Szepesvári
Markov Decision Processes: Discrete Stochastic Dynamic Programming - Martin L. Puterman
Deep Reinforcement Learning Hands-On - Maxim Lapan
Pivotal papers on Policy Gradient (PG), A2C, and other algorithms
An Algorithmic Perspective on Imitation Learning
ESE 650: Learning in Robotics (Spring 2023) - Pratik Chaudhari
Foundations of Deep Reinforcement Learning - Laura Graesser and Wah Loon Keng
Lil'Log - Policy Gradient Algorithms
Density Constrained Reinforcement Learning
Grokking Deep Reinforcement Learning
Deep Reinforcement Learning: Frontiers of Artificial Intelligence - Mohit Sewak
1.2 Decision Making
Planning with Markov Decision Processes - Mausam
1.3 RLHF / Alignment / RM
Key papers on Reinforcement Learning from Human Feedback (RLHF), Alignment, and Reward Modeling
The Alignment Problem
Human Compatible - Stuart Russell
Artificial Intelligence: A Guide for Thinking Humans - Melanie Mitchell
Reward hacking, IRL, LfD, offline RL, bandits, dueling bandits
1.4 AI / Machine Learning
Artificial Intelligence: A Modern Approach (AIMA)
Pattern Recognition and Machine Learning - Christopher M. Bishop
Designing Machine Learning Systems - Chip Huyen
Introduction to Machine Learning - Ethem Alpaydın
Lifelong Machine Learning - Zhiyuan Chen and Bing Liu (Morgan & Claypool; continual learning and catastrophic forgetting)
1.5 Fundamental NLP
Speech and Language Processing - Daniel Jurafsky and James H. Martin
Foundations of Statistical Natural Language Processing - Christopher D. Manning and Hinrich Schütze
1.6 Other Recommended Books
Notes on the Theory of Choice - David M. Kreps
Trust in Machine Learning, Formalization, and Applied Areas
Books on causality and counterfactuals - Judea Pearl
Books on Probabilistic Graphical Models
A Guided Tour of Artificial Intelligence Research - Pierre Marquis, Odile Papini, and Henri Prade (eds.)
Should We Trust Artificial Intelligence?
Some books on Markov Chains
Good MLE can implement
2. Areas to Cover [LLM Heavy]
2.1 LLM (Large Language Models)
Understanding the fundamentals of LLMs
Scaling of LLMs
Multimodal models: what are the latest ones, what is the key difference from LLMs, and why are they harder to train?
Latest fine-tuning methods: key challenges and limitations; LLM-Modulo frameworks
RLHF / PbRL for control domains vs. LLMs: differences and similarities
Latest techniques in RLHF: reward-model-based approaches vs. DPO / IPO, and why (see the DPO sketch after this list)
Agents: when do things work, and when do they break?
Agents: CoT, ReAct, Reflexion, Voyager, Eureka, ToM (Theory of Mind), planning & reasoning
Understanding RAG techniques: why and where to use them
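To make the RM-vs-DPO comparison above concrete, here is a minimal sketch of the DPO objective. It assumes you have already computed summed per-token log-probabilities of the chosen and rejected responses under the current policy and a frozen reference model; the function name and the beta value are illustrative only.

```python
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss on a batch of preference pairs.
    Each argument is a (batch,) tensor of summed log-probabilities of the
    chosen / rejected response under the policy or the frozen reference."""
    # Implicit rewards: log-ratio of policy to reference for each response.
    chosen_reward = policy_logp_chosen - ref_logp_chosen
    rejected_reward = policy_logp_rejected - ref_logp_rejected
    # Bradley-Terry likelihood that the chosen response beats the rejected one.
    logits = beta * (chosen_reward - rejected_reward)
    return -F.logsigmoid(logits).mean()
```

Unlike RM-based RLHF, there is no separate reward model or PPO loop: the preference data is fit directly, which is the main practical draw of DPO/IPO.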
2.2 LLM Optimization
Hardware-specific optimization techniques - Accelerators, specialized hardware
Software-level optimization tricks - TinyML, FlashAttention variants
Main ideas behind DeepSpeed (see the sketch after this list)
ZeRO Stage 1: partitioning optimizer states across data-parallel workers
ZeRO Stage 2: additionally partitioning gradients
Horovod for distributed training
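For the DeepSpeed / ZeRO bullets above, a hedged sketch of what enabling ZeRO Stage 2 roughly looks like. The nn.Linear stands in for a real model, the config keys shown are the common ones but exact names and defaults can vary across DeepSpeed versions, and the script would normally be launched with the deepspeed launcher across multiple ranks.

```python
import deepspeed
import torch.nn as nn

model = nn.Linear(1024, 1024)  # stand-in for a real model

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    # Stage 1 shards optimizer states across data-parallel ranks;
    # Stage 2 additionally shards gradients, cutting memory further.
    "zero_optimization": {"stage": 2},
}

# deepspeed.initialize returns an engine wrapping the model plus a sharded optimizer.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```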
2.3 Others
3. Common ML Engineering Interview Questions
Explain the Bias-Variance trade-off. How does it affect model performance?
What is cross-validation and why is it important?
Describe different types of cross-validation techniques.
Differentiate between supervised, unsupervised, and semi-supervised learning with examples.
Explain regularization and its importance in machine learning.
Difference between L1 and L2 regularization.
Feature selection techniques and identifying important features.
Handling missing data and common imputation techniques.
Steps involved in a typical NLP pipeline.
Methods for reducing dimensionality and how they work.
Concept of overfitting, identification, and mitigation strategies.
Differences between precision, recall, and F1-score; when to prioritize each.
Understanding the Curse of Dimensionality and its impact.
Explain the k-nearest neighbors (KNN) algorithm and determining the value of 'k'.
Cross-entropy vs. contrastive loss.
Dealing with class imbalance.
Multi-armed bandits, MLE vs. Bayesian approaches.
Importance of randomization in A/B tests and understanding p-values.
Bias/variance trade-off in non-parametric models.
Overfitting and model capacity.
Regularization techniques in deep neural networks.
Logistic regression and why its cross-entropy loss is convex (so any local minimum is the global minimum).
Explain gradient descent, batch normalization, and acceleration techniques.
Working principles of CNNs, RNNs, Transformers, and attention mechanisms (see the attention sketch after this list).
Difference between bagging and boosting; computational differences between XGBoost and Random Forest.
Designing a recommendation system for books.
Understanding page ranking systems.
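Since "implement attention" is a very common follow-up to the Transformer question above, here is a minimal sketch of scaled dot-product attention in PyTorch; the function name and tensor shapes are just one common convention.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq_len, d_k); mask broadcastable to (..., seq_len, seq_len)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # similarity of each query to every key
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)             # each row is a distribution over keys
    return weights @ v, weights                     # weighted sum of values, plus the attention map
```

The 1/sqrt(d_k) scaling keeps the softmax from saturating as d_k grows, which is the usual talking point in interviews.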
4. Skills
Technical Skills:
Basics: take NumPy and PyTorch and know them inside out. Know about JAX, XLA, and optimizers. Turning equations into code is not easy; practice it (see the logistic regression sketch after this list).
Code up algorithms from scratch: basic classification (SVMs, binary and multi-class), regression, bandits, RL.
Transformers - know why they were needed, why people dislike them (e.g., the quadratic attention cost), and why everyone still has to bear them.
Decision Trees. This deserves a special bullet.
You claim to know CS - please know bash.
Monitoring, logging, Docker, Git, Kubernetes. They will ask: "nothing works, what do you do?"
CUDA: experience with FlashAttention, Triton, and CUDA fundamentals.
High-Performance Computing (HPC): MPI, multi-node setups on SLURM, Turi Bolt, and Horovod.
Weights & Biases (W&B), MuJoCo, Hacktoberfest.
C++, Python, and SQL, plus their main libraries.
Distributed Computing: Hadoop and Spark.
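As one example of the "equations to code" point above, a minimal NumPy sketch of binary logistic regression trained by batch gradient descent; the function names, learning rate, and step count are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, lr=0.1, steps=1000):
    """X: (n, d) features, y: (n,) labels in {0, 1}. Returns weights and bias."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)       # predicted P(y = 1 | x)
        grad_w = X.T @ (p - y) / n   # gradient of the mean cross-entropy loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

The cross-entropy loss for logistic regression is convex in (w, b), which is why plain gradient descent reaches the global minimum here.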
Good Candidate can answer
5. Interview Preparation Resources
6. Behavioral Interview Questions
Tell me about yourself.
Why are you interested in our company?
Why are you interested in this position?
What do you know about our organization?
What is your relevant experience for the ML engineering role?
Describe a weakness.
Share an innovative solution you've implemented (non-technical).
Understanding and applying the STAR method for behavioral questions.
7. Data Science Knowledge
Application Areas
Search Engines
Advertising Technology
Recommender Systems
Speech Recognition
Basics
Understanding measures of central tendency (mean, median, mode)
Measures of spread (standard deviation, interquartile range, range)
Data distribution shapes (skewness, kurtosis, unimodal, bimodal)
Identifying and handling outliers (see the sketch after this list)
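A quick NumPy sketch of the descriptive statistics above, using Tukey's 1.5 x IQR rule to flag outliers; the toy data is made up.

```python
import numpy as np

x = np.array([2.1, 2.4, 2.2, 2.8, 2.5, 9.7, 2.3])  # toy data with one obvious outlier

mean, median = x.mean(), np.median(x)
std = x.std(ddof=1)                       # sample standard deviation
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1                             # interquartile range

# Tukey's rule: points beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = x[(x < lower) | (x > upper)]
print(mean, median, std, iqr, outliers)   # outliers -> [9.7]
```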
Common Data Science Questions
Difference between a Validation Set and a Test Set
Explanation of cross-validation techniques
Univariate vs. bivariate vs. multivariate analysis
Understanding Star Schema
Explanation of Cluster Sampling and Systematic Sampling
Eigenvectors and Eigenvalues
Supervised vs. Unsupervised Learning
Meaning of "Naive" in Naive Bayes
Detailed explanation of the SVM algorithm
Support vectors and kernel functions in SVM
Decision Tree algorithm, entropy, and information gain
Preference between Python and R for text analytics
Importance of data cleaning in analysis
Scenarios where false positives are more critical than false negatives, and vice versa (see the precision/recall sketch after this list)
When both false positives and false negatives are equally important
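To ground the false-positive vs. false-negative questions above, a tiny NumPy example computing precision, recall, and F1 from a confusion matrix; the labels are made up.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives (costly for, e.g., spam filters)
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives (costly for, e.g., disease screening)

precision = tp / (tp + fp)                   # 0.75: how many flagged positives were real
recall = tp / (tp + fn)                      # 0.75: how many real positives were caught
f1 = 2 * precision * recall / (precision + recall)
```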
Additional Resources
Learning from Imbalanced Classes (article)
Understanding Hadoop and Spark
MapReduce paradigm
Statistical tests and their applications
Current trending projects in data science
Recommender systems in streaming platforms
A/B Testing and Experimentation methodologies (see the z-test sketch after this list)
Key Algorithms in data science
Analytics techniques
Behavioral interview preparation
Business case questions
Database design principles
Machine Learning system design
Probability concepts and problems
Product metrics and analysis
Proficiency in Python, SQL, and pandas
Statistical fundamentals
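For the A/B testing and p-value items above, a minimal sketch of a two-sided two-proportion z-test on made-up conversion counts; the numbers are hypothetical, and in practice you would also check sample-size and randomization assumptions.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical experiment: conversions / visitors in control (A) and treatment (B).
conv_a, n_a = 120, 2400   # 5.00% conversion
conv_b, n_b = 150, 2400   # 6.25% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)                   # pooled rate under H0: no difference
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))

z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))   # two-sided p-value; here z is about 1.88, p about 0.06
```

With p around 0.06 the lift would not clear the usual 0.05 threshold, which is exactly the kind of judgment call these questions probe.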