Introduction
Things I looked at while prepping for interviews and generally building intuition about AI/ML problems during my PhD. It is by no means a complete list, but it reflects some things I put down in my notes. Maybe dump it into Gemini and ask it to filter for what you want to study.
These are organized by Google's Gemini.
Some topics I had experience in because of my research / interests:
ML fundamentals
Math / Stats / Probability
Loss functions / Auxiliary losses (thesis)
LLM fundamentals / prompting / limits / uses (thesis)
PbRL / RLHF / RL (thesis)
User study / AB testing / significance testing (prior works)
Active learning (prior works)
Neural Architecture Search (prior works)
Human-aware AI (prior works)
Table of Contents
Good Scientist can reason
1. Recommended Reading
1.1 Reinforcement Learning
Reinforcement Learning: An Introduction - Richard S. Sutton and Andrew G. Barto
Dynamic Programming and Optimal Control - Dimitri P. Bertsekas
Algorithms for Reinforcement Learning - Csaba Szepesvári
Markov Decision Processes: Discrete Stochastic Dynamic Programming - Martin L. Puterman
Deep Reinforcement Learning Hands-On - Maxim Lapan
Pivotal papers on Policy Gradient (PG), A2C, and other algorithms
An Algorithmic Perspective on Imitation Learning
ESE 650: Learning in Robotics (Spring 2023) - Pratik Chaudhari
Foundations of Deep Reinforcement Learning - Laura Graesser and Wah Loon Keng
Lil'Log - Policy Gradient Algorithms
Density Constrained Reinforcement Learning
Grokking Deep Reinforcement Learning
Deep Reinforcement Learning: Frontiers of Artificial Intelligence - Mohit Sewak
1.2 Decision Making
Planning with Markov Decision Processes - Mausam
1.3 RLHF / Alignment / RM
Key papers on Reinforcement Learning from Human Feedback (RLHF), Alignment, and Reward Modeling
The Alignment Problem
Human Compatible - Stuart Russell
Artificial Intelligence: A Guide for Thinking Humans - Melanie Mitchell
Reward hacking, IRL, LfD, offline RL, bandits, dueling bandits
1.4 AI / Machine Learning
Artificial Intelligence: A Modern Approach (AIMA)
Pattern Recognition and Machine Learning - Christopher M. Bishop
Designing Machine Learning Systems - Chip Huyen
Introduction to Machine Learning - Ethem Alpaydın
Lifelong Machine Learning - Zhiyuan Chen and Bing Liu (Morgan & Claypool; continual learning and catastrophic forgetting)
1.5 Fundamental NLP
Speech and Language Processing - Daniel Jurafsky and James H. Martin
Foundations of Statistical Natural Language Processing - Christopher D. Manning and Hinrich Schütze
1.6 Other Recommended Books
Notes on the Theory of Choice - David M. Kreps
Trust in Machine Learning, Formalization, and Applied Areas
Books on causality and counterfactuals - Judea Pearl
Books on Probabilistic Graphical Models
A Guided Tour of Artificial Intelligence Research - Pierre Marquis, Odile Papini, and Henri Prade (eds.)
Should We Trust Artificial Intelligence?
Some books on Markov Chains
Good MLE can implement
2. Areas to Cover [LLM Heavy]
2.1 LLM (Large Language Models)
Understanding the fundamentals of LLMs
Scaling of LLMs
Multimodal models: what are the latest ones, what is the key difference from LLMs, and why are they harder to train?
Latest fine-tuning methods: key challenges and limitations; LLM-Modulo frameworks
RLHF / PbRL for control domains vs. LLMs: differences and similarities
Latest techniques in RLHF: reward-model-based approaches vs. DPO / IPO, and why (see the DPO sketch after this list)
Agents: when do things work, and when do they break?
Agents: CoT, ReAct, Reflexion, Voyager, Eureka, ToM (Theory of Mind), planning & reasoning
Understanding RAG techniques: why and where to use them
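To make the RM-vs-DPO comparison above concrete, here is a minimal sketch of the DPO objective. It assumes you have already computed summed per-token log-probabilities of the chosen and rejected responses under the current policy and a frozen reference model; the function name and the beta value are illustrative only.

```python
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss on a batch of preference pairs.
    Each argument is a (batch,) tensor of summed log-probabilities of the
    chosen / rejected response under the policy or the frozen reference."""
    # Implicit rewards: log-ratio of policy to reference for each response.
    chosen_reward = policy_logp_chosen - ref_logp_chosen
    rejected_reward = policy_logp_rejected - ref_logp_rejected
    # Bradley-Terry likelihood that the chosen response beats the rejected one.
    logits = beta * (chosen_reward - rejected_reward)
    return -F.logsigmoid(logits).mean()
```

Unlike RM-based RLHF, there is no separate reward model or PPO loop: the preference data is fit directly, which is the main practical draw of DPO/IPO.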
2.2 LLM Optimization
Hardware-specific optimization techniques - Accelerators, specialized hardware
Software-level optimization tricks - TinyML, FlashAttention variants
Main ideas behind DeepSpeed (see the sketch after this list)
ZeRO Stage 1: partitioning optimizer states across data-parallel workers
ZeRO Stage 2: additionally partitioning gradients
Horovod for distributed training
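For the DeepSpeed / ZeRO bullets above, a hedged sketch of what enabling ZeRO Stage 2 roughly looks like. The nn.Linear stands in for a real model, the config keys shown are the common ones but exact names and defaults can vary across DeepSpeed versions, and the script would normally be launched with the deepspeed launcher across multiple ranks.

```python
import deepspeed
import torch.nn as nn

model = nn.Linear(1024, 1024)  # stand-in for a real model

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    # Stage 1 shards optimizer states across data-parallel ranks;
    # Stage 2 additionally shards gradients, cutting memory further.
    "zero_optimization": {"stage": 2},
}

# deepspeed.initialize returns an engine wrapping the model plus a sharded optimizer.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```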
2.3 Others
3. Common ML Engineering Interview Questions
Explain the Bias-Variance trade-off. How does it affect model performance?
What is cross-validation and why is it important?
Describe different types of cross-validation techniques.
Differentiate between supervised, unsupervised, and semi-supervised learning with examples.
Explain regularization and its importance in machine learning.
Difference between L1 and L2 regularization.
Feature selection techniques and identifying important features.
Handling missing data and common imputation techniques.
Steps involved in a typical NLP pipeline.
Methods for reducing dimensionality and how they work.
Concept of overfitting, identification, and mitigation strategies.
Differences between precision, recall, and F1-score; when to prioritize each.
Understanding the Curse of Dimensionality and its impact.
Explain the k-nearest neighbors (KNN) algorithm and determining the value of 'k'.
Cross-entropy vs. contrastive loss.
Dealing with class imbalance.
Multi-armed bandits, MLE vs. Bayesian approaches.
Importance of randomization in A/B tests and understanding p-values.
Bias/variance trade-off in non-parametric models.
Overfitting and model capacity.
Regularization techniques in deep neural networks.
Logistic regression and why its cross-entropy loss is convex (so any local minimum is the global minimum).
Explain gradient descent, batch normalization, and acceleration techniques.
Working principles of CNNs, RNNs, Transformers, and attention mechanisms (see the attention sketch after this list).
Difference between bagging and boosting; computational differences between XGBoost and Random Forest.
Designing a recommendation system for books.
Understanding page ranking systems.
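Since "implement attention" is a very common follow-up to the Transformer question above, here is a minimal sketch of scaled dot-product attention in PyTorch; the function name and tensor shapes are just one common convention.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq_len, d_k); mask broadcastable to (..., seq_len, seq_len)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # similarity of each query to every key
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)             # each row is a distribution over keys
    return weights @ v, weights                     # weighted sum of values, plus the attention map
```

The 1/sqrt(d_k) scaling keeps the softmax from saturating as d_k grows, which is the usual talking point in interviews.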
4. Skills
Technical Skills:
Basics: take NumPy and PyTorch and know them inside out. Know about JAX, XLA, and optimizers. Turning equations into code is not easy; practice it (see the logistic regression sketch after this list).
Code up algorithms from scratch: basic classification (SVMs, binary and multi-class), regression, bandits, RL.
Transformers - know why they were needed, why people dislike them (e.g., the quadratic attention cost), and why everyone still has to bear them.
Decision Trees. This deserves a special bullet.
You claim to know CS - please know bash.
Monitoring, logging, Docker, Git, Kubernetes. They will ask: "nothing works, what do you do?"
CUDA: experience with FlashAttention, Triton, and CUDA fundamentals.
High-Performance Computing (HPC): MPI, multi-node setups on SLURM, Turi Bolt, and Horovod.
Weights & Biases (W&B), MuJoCo, Hacktoberfest.
C++, Python, and SQL, plus their main libraries.
Distributed Computing: Hadoop and Spark.
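As one example of the "equations to code" point above, a minimal NumPy sketch of binary logistic regression trained by batch gradient descent; the function names, learning rate, and step count are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, lr=0.1, steps=1000):
    """X: (n, d) features, y: (n,) labels in {0, 1}. Returns weights and bias."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)       # predicted P(y = 1 | x)
        grad_w = X.T @ (p - y) / n   # gradient of the mean cross-entropy loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

The cross-entropy loss for logistic regression is convex in (w, b), which is why plain gradient descent reaches the global minimum here.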
Good Candidate can answer
5. Interview Preparation Resources
6. Behavioral Interview Questions
Tell me about yourself.
Why are you interested in our company?
Why are you interested in this position?
What do you know about our organization?
What is your relevant experience for the ML engineering role?
Describe a weakness.
Share an innovative solution you've implemented (non-technical).
Understanding and applying the STAR method for behavioral questions.
7. Data Science Knowledge
Application Areas
Search Engines
Advertising Technology
Recommender Systems
Speech Recognition
Basics
Understanding measures of central tendency (mean, median, mode)
Measures of spread (standard deviation, interquartile range, range)
Data distribution shapes (skewness, kurtosis, unimodal, bimodal)
Identifying and handling outliers (see the sketch after this list)
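A quick NumPy sketch of the descriptive statistics above, using Tukey's 1.5 x IQR rule to flag outliers; the toy data is made up.

```python
import numpy as np

x = np.array([2.1, 2.4, 2.2, 2.8, 2.5, 9.7, 2.3])  # toy data with one obvious outlier

mean, median = x.mean(), np.median(x)
std = x.std(ddof=1)                       # sample standard deviation
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1                             # interquartile range

# Tukey's rule: points beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = x[(x < lower) | (x > upper)]
print(mean, median, std, iqr, outliers)   # outliers -> [9.7]
```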
Common Data Science Questions
Difference between a Validation Set and a Test Set
Explanation of cross-validation techniques
Univariate vs. bivariate vs. multivariate analysis
Understanding Star Schema
Explanation of Cluster Sampling and Systematic Sampling
Eigenvectors and Eigenvalues
Supervised vs. Unsupervised Learning
Meaning of "Naive" in Naive Bayes
Detailed explanation of the SVM algorithm
Support vectors and kernel functions in SVM
Decision Tree algorithm, entropy, and information gain
Preference between Python and R for text analytics
Importance of data cleaning in analysis
Scenarios where false positives are more critical than false negatives, and vice versa (see the precision/recall sketch after this list)
When both false positives and false negatives are equally important
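To ground the false-positive vs. false-negative questions above, a tiny NumPy example computing precision, recall, and F1 from a confusion matrix; the labels are made up.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives (costly for, e.g., spam filters)
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives (costly for, e.g., disease screening)

precision = tp / (tp + fp)                   # 0.75: how many flagged positives were real
recall = tp / (tp + fn)                      # 0.75: how many real positives were caught
f1 = 2 * precision * recall / (precision + recall)
```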
Additional Resources
Learning from Imbalanced Classes (article)
Understanding Hadoop and Spark
MapReduce paradigm
Statistical tests and their applications
Current trending projects in data science
Recommender systems in streaming platforms
A/B Testing and Experimentation methodologies (see the z-test sketch after this list)
Key Algorithms in data science
Analytics techniques
Behavioral interview preparation
Business case questions
Database design principles
Machine Learning system design
Probability concepts and problems
Product metrics and analysis
Proficiency in Python, SQL, and pandas
Statistical fundamentals
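For the A/B testing and p-value items above, a minimal sketch of a two-sided two-proportion z-test on made-up conversion counts; the numbers are hypothetical, and in practice you would also check sample-size and randomization assumptions.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical experiment: conversions / visitors in control (A) and treatment (B).
conv_a, n_a = 120, 2400   # 5.00% conversion
conv_b, n_b = 150, 2400   # 6.25% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)                   # pooled rate under H0: no difference
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))

z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))   # two-sided p-value; here z is about 1.88, p about 0.06
```

With p around 0.06 the lift would not clear the usual 0.05 threshold, which is exactly the kind of judgment call these questions probe.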