A real ML Specialization question first, not a wall of copy
Correct answer plus per-choice explanation
Source link for follow-up study
Free daily set, then full-bank Pro when you want more
Question 1 of 10
Objective ml.026 Foundations And Evaluation
According to the probability rules in Stanford CS229's probability review, which statement accurately describes the intersection of two events A and B?
Correct Answer: A. P(A ∩ B) ≤ min(P(A), P(B))
Concept tested: Foundations And Evaluation
A. ✓ Correct: According to the probability rules, the intersection of two events A and B has a probability that cannot exceed the minimum probability of either event.
B. × Incorrect: The maximum probability does not apply to intersections; it would be relevant for unions or other scenarios but not here.
C. × Incorrect: The sum of probabilities for individual events can never be less than their intersection, which could even be zero if they are disjoint.
D. × Incorrect: This formula incorrectly combines complements and does not reflect a valid probability rule.
Why this matters: Bounds like this give a quick sanity check when calculating joint probabilities, especially for dependent events, where P(A ∩ B) is not simply P(A)P(B).
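The bound can be checked on a small finite sample space. This is a minimal sketch: the fair six-sided die and the two events below are assumptions chosen for illustration, not examples from the CS229 review.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is at least 4

p_a, p_b, p_ab = prob(A), prob(B), prob(A & B)

# A ∩ B is a subset of both A and B, so its probability cannot exceed either.
bound_holds = p_ab <= min(p_a, p_b)
print(p_a, p_b, p_ab, bound_holds)
```

Exact fractions are used so the comparison is not affected by floating-point rounding.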
Question 2 of 10
Objective ml.022 Recommenders And Reinforcement Learning
In content-based filtering, what is the primary goal when recommending items to a user based on their past preferences?
Correct Answer: B. To recommend items that match the user's feature profile.
Concept tested: Recommenders And Reinforcement Learning
A. × Incorrect: Predicting ratings based on similar users' preferences describes collaborative filtering, not content-based filtering.
B. ✓ Correct: Recommending items that match the user's feature profile accurately reflects the goal of content-based filtering in aligning item features with user preferences.
C. × Incorrect: Maximizing variance explained by each component relates to Principal Component Analysis (PCA), which is unrelated to content-based recommendation systems.
D. × Incorrect: Separating mixed signals into statistically independent components pertains to Independent Component Analysis (ICA) and does not apply to recommender systems.
Why this matters: Content-based and collaborative filtering draw on different signals. Mixing them up leads to a system that looks for similar users when the task is to match item features to the user's own feature profile.
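One simple way to picture content-based matching is to score each item by the dot product of its feature vector with the user's profile. The sketch below assumes hand-made feature values and a dot-product score; the item names, features, and numbers are all hypothetical.

```python
# Hypothetical item features: (action, romance, comedy) scores in [0, 1].
items = {
    "Movie A": [0.9, 0.1, 0.2],
    "Movie B": [0.1, 0.8, 0.6],
    "Movie C": [0.7, 0.2, 0.1],
}

# Hypothetical user profile expressed in the same feature space.
user_profile = [1.0, 0.0, 0.3]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Score each item by how well its features match the user's profile.
scores = {name: dot(user_profile, feats) for name, feats in items.items()}

# Recommend items in order of decreasing match.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

No other user's ratings appear anywhere in the computation, which is the distinction from collaborative filtering.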
Question 3 of 10
Objective ml.017 Unsupervised Learning
In the context of the EM algorithm, what is a key step in parameter optimization involving hidden variables?
Correct Answer: A. Updating parameters to maximize the expected log-likelihood with respect to the current estimate of hidden variables
Concept tested: Unsupervised Learning
A. ✓ Correct: This describes the M-step of the EM algorithm, in which parameters are updated to maximize the expected log-likelihood given the current estimate of the hidden variables.
B. × Incorrect: Minimizing distance between data points and centroids is relevant to K-means clustering, not the EM algorithm.
C. × Incorrect: Ensuring predicted probabilities sum to one pertains to softmax regression, unrelated to EM algorithm's steps.
D. × Incorrect: Defining similarity using a covariance function relates to Gaussian processes, not the EM algorithm.
Why this matters: Keeping the E-step (estimating the hidden variables) distinct from the M-step (updating parameters to maximize the expected log-likelihood) makes it possible to implement and debug EM correctly.
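The two steps can be sketched for a one-dimensional mixture of two Gaussians. This is an illustrative sketch, not code from the CS229 notes: the data, the starting values, and the function names are assumptions.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_step(data, means, variances, weights):
    # E-step: responsibility of each component for each point, i.e. the
    # posterior over the hidden component label under current parameters.
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for m, v, w in zip(means, variances, weights)]
        total = sum(joint)
        resp.append([j / total for j in joint])

    # M-step: update parameters to maximize the expected log-likelihood
    # under those responsibilities (closed form for Gaussians).
    new_means, new_vars, new_weights = [], [], []
    for j in range(len(means)):
        nj = sum(r[j] for r in resp)
        mean_j = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var_j = sum(r[j] * (x - mean_j) ** 2 for r, x in zip(resp, data)) / nj
        new_means.append(mean_j)
        new_vars.append(max(var_j, 1e-6))  # guard against variance collapse
        new_weights.append(nj / len(data))
    return new_means, new_vars, new_weights

# Hypothetical data drawn around two centers.
data = [-2.1, -1.9, -2.0, -1.8, 2.0, 2.2, 1.9, 2.1]
means, variances, weights = [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5]
for _ in range(20):
    means, variances, weights = em_step(data, means, variances, weights)
print(means, weights)
```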
Question 4 of 10
Objective ml.014 Convex Optimization
For a differentiable function f: R^n → R, what condition must be satisfied at point x* to ensure it is an optimal solution for minimizing f(x)?
Correct Answer: A. The gradient of f at x* must be zero.
Concept tested: Convex Optimization
A. ✓ Correct: Setting the gradient to zero identifies critical points, which can include local minima if additional conditions like positive definiteness of the Hessian are met.
B. × Incorrect: A positive definite Hessian alone does not guarantee optimality without checking other conditions such as convexity.
C. × Incorrect: f(x*) equaling 0 would only be relevant in specific scenarios and is not a general condition for identifying optimal points.
D. × Incorrect: Lying on the boundary of the feasible region can indicate constraint-activated minima but does not address the gradient condition necessary for an interior point.
Why this matters: A zero gradient is the first-order necessary condition for an unconstrained minimum of a differentiable function, and for a convex function it is also sufficient, so it is the first thing to check when solving an optimization problem.
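A small worked case makes the condition concrete. The convex quadratic below is a hypothetical example chosen so the minimizer is easy to read off; it is not taken from the question's source.

```python
def f(x, y):
    """Hypothetical convex function with its minimum at (1, -3)."""
    return (x - 1) ** 2 + 2 * (y + 3) ** 2

def grad_f(x, y):
    """Analytic gradient of f."""
    return (2 * (x - 1), 4 * (y + 3))

x_star = (1.0, -3.0)
g = grad_f(*x_star)

# The gradient vanishes at the minimizer...
print(g)
# ...and f is no smaller at a nearby point where the gradient is nonzero.
print(f(*x_star), f(1.5, -2.0), grad_f(1.5, -2.0))
```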
Question 5 of 10
Objective ml.013 Kernels And Margins
In the context of support vector machines, what does regularization primarily aim to prevent?
Correct Answer: A. Overfitting by penalizing large model weights
Concept tested: Kernels And Margins
A. ✓ Correct: Regularization helps prevent overfitting by adding a penalty term to the loss function that discourages overly complex models with large parameter values.
B. × Incorrect: Increasing model complexity can lead to overfitting, which regularization aims to avoid.
C. × Incorrect: SVMs do aim to maximize the margin, and the penalty on the weights is closely tied to margin width, but the primary purpose of regularization is to limit model complexity and so prevent overfitting.
D. × Incorrect: Dimensionality reduction is a different technique from regularization and is not the primary goal of adding a penalty term.
Why this matters: The penalty on large weights is what keeps an SVM from fitting noise in the training data, so the strength of regularization is a key setting when tuning the model.
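The effect of the penalty term can be seen in a sketch of an L2-regularized hinge-loss objective for a linear SVM. The two-point dataset, the value of `lam`, and the function name are assumptions made for illustration.

```python
def svm_objective(w, b, data, lam):
    """Average hinge loss plus an L2 penalty on the weights."""
    hinge = sum(
        max(0.0, 1 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in data
    ) / len(data)
    penalty = 0.5 * lam * sum(wi * wi for wi in w)
    return hinge + penalty

# Hypothetical, linearly separable data with labels in {-1, +1}.
data = [([2.0, 0.0], 1), ([-2.0, 0.0], -1)]

# Both weight vectors classify the data with zero hinge loss,
# but the larger weights pay a much larger penalty.
small = svm_objective([1.0, 0.0], 0.0, data, lam=0.1)
large = svm_objective([10.0, 0.0], 0.0, data, lam=0.1)
print(small, large)
```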
Question 6 of 10
Objective ml.010 Classification Models
According to the Stanford CS229 lecture notes, what is a fundamental assumption in Naive Bayes classifiers about feature independence given a specific class label?
Correct Answer: A. Features are conditionally independent given the class label.
Concept tested: Classification Models
A. ✓ Correct: Features are assumed to be conditionally independent given the class label in Naive Bayes classifiers, allowing for simplified probability calculations.
B. × Incorrect: Naive Bayes assumes conditional independence rather than mutual dependence among features within a specific class.
C. × Incorrect: This option contradicts the assumption that feature distributions can vary between different classes in Naive Bayes classifiers.
D. × Incorrect: Gaussian distribution assumptions are not required for Naive Bayes classifiers, although they may be used in other models like Gaussian discriminant analysis.
Why this matters: Understanding this concept helps learners configure and troubleshoot classification models correctly by recognizing the importance of feature independence within a specific class.
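Conditional independence is what lets the class-conditional likelihood factor into a product of per-feature terms. The sketch below assumes a two-word vocabulary with hand-set probabilities; every number and name in it is hypothetical.

```python
# Hypothetical class priors.
prior = {"spam": 0.4, "ham": 0.6}

# Hypothetical P(word present | class) for a two-word vocabulary.
likelihood = {
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham": {"free": 0.2, "meeting": 0.5},
}

def posterior(present_words):
    """P(class | features), assuming features are independent given the class."""
    joint = {}
    for label in prior:
        p = prior[label]
        for word, p_word in likelihood[label].items():
            # Each feature contributes one independent factor.
            p *= p_word if word in present_words else (1 - p_word)
        joint[label] = p
    total = sum(joint.values())
    return {label: p / total for label, p in joint.items()}

post = posterior({"free"})
print(post)
```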
Question 7 of 10
Objective ml.002 Supervised Learning
According to the CS229 lecture notes, what is the primary purpose of gradient descent in linear regression?
Correct Answer: A. Minimizing the cost function by iteratively updating parameters based on the derivative of the cost function.
Concept tested: Supervised Learning
A. ✓ Correct: Gradient descent aims to minimize a cost function through iterative updates using derivatives, which helps find optimal parameter values for linear regression models.
B. × Incorrect: While accuracy is important, gradient descent specifically focuses on minimizing error rather than maximizing prediction accuracy directly.
C. × Incorrect: Selecting features is part of the model design phase and not directly related to the process of updating parameters via gradient descent.
D. × Incorrect: Non-negative coefficients are a constraint that might be applied in certain contexts but are unrelated to the core purpose of minimizing cost through iterative updates.
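Why this matters: Gradient descent is the procedure that actually fits the linear regression parameters, so knowing that it works by iteratively reducing the cost function helps when choosing a learning rate or diagnosing a model that fails to converge.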
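A minimal sketch of batch gradient descent for one-variable linear regression follows. It assumes the mean squared error cost J = (1/2n) Σ (θ0 + θ1·x − y)²; the data, learning rate, and step count are hypothetical choices.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y ≈ theta0 + theta1 * x by batch gradient descent on the MSE cost."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to each parameter.
        grad0 = sum(errors) / n
        grad1 = sum(e * x for e, x in zip(errors, xs)) / n
        # Step against the gradient to reduce the cost.
        theta0 -= lr * grad0
        theta1 -= lr * grad1
    return theta0, theta1

# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

On this data the fitted parameters should approach an intercept of 1 and a slope of 2.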
Question 8 of 10
Objective ml.027 Foundations And Evaluation
Given a matrix A and its inverse A⁻¹, what is the result of multiplying them together?
Correct Answer: C. The identity matrix
Concept tested: Foundations And Evaluation
A. × Incorrect: Multiplying a matrix by its inverse yields the identity matrix, not a vector.
B. × Incorrect: The product is not an arbitrary matrix; it is specifically the identity matrix, which has ones on the diagonal and zeros elsewhere.
C. ✓ Correct: Multiplying any invertible square matrix by its inverse yields the identity matrix.
D. × Incorrect: The product is an entire matrix (the identity matrix), not a single number.
Why this matters: The identity A A⁻¹ = I is the defining property of a matrix inverse and underlies closed-form solutions to linear systems, including the normal equations for linear regression.
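The property is easy to verify numerically for a 2×2 case. The matrix below is a hypothetical example, and the helper functions are written out by hand so the sketch has no dependencies.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular and has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]

A = [[4.0, 7.0], [2.0, 6.0]]  # hypothetical invertible matrix
product = matmul(A, inverse_2x2(A))
print(product)  # approximately the 2x2 identity, up to floating-point rounding
```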
Question 9 of 10
Objective ml.023 Recommenders And Reinforcement Learning
In a recommender system, which technique is used to predict user preferences based on historical ratings by reducing the dimensionality of feature vectors?
Correct Answer: D. Collaborative filtering
Concept tested: Recommenders And Reinforcement Learning
A. × Incorrect: Dimensionality reduction simplifies the feature space, but on its own it does not predict user preferences from historical ratings.
B. × Incorrect: Clustering groups similar data points but does not directly predict user preferences based on ratings.
C. × Incorrect: Anomaly detection identifies unusual patterns and does not focus on predicting user preferences from historical ratings.
D. ✓ Correct: Collaborative filtering uses past user behavior, such as historical ratings, to predict preferences and recommend items; it is often implemented with low-rank matrix factorization, which represents users and items with low-dimensional feature vectors.
Why this matters: Dimensionality reduction helps make recommender systems more efficient by simplifying the feature space while retaining important information for predicting preferences.
Question 10 of 10
Objective ml.018 Unsupervised Learning
According to the Stanford CS229 main notes, what is a key aspect of principal components in PCA when considering data variance?
Correct Answer: A. To maximize the variance explained by each component
Concept tested: Unsupervised Learning
A. ✓ Correct: To maximize the variance explained by each component accurately reflects the primary goal of PCA in identifying directions (principal components) that capture maximum variability in the data set, aligning with the concept's purpose as described in the source.
B. × Incorrect: Minimizing reconstruction error during dimensionality reduction is closely related (for PCA the two objectives lead to the same components), but it is not how the question frames principal components in terms of data variance.
C. × Incorrect: Ensuring equal distribution of variance across all dimensions contradicts the fundamental principle of PCA, which aims to concentrate variability along fewer principal components.
D. × Incorrect: Maintaining original data variance without transformation does not align with the purpose of PCA, which seeks to transform and reduce the dimensionality of data while retaining as much information (variance) as possible.
Why this matters: Knowing that PCA orders components by the variance they explain tells you which components to keep when reducing dimensionality.
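For two-dimensional data the first principal component can be computed in closed form from the covariance matrix. The sketch below uses hypothetical points and assumes the off-diagonal covariance is nonzero, which the eigenvector formula relies on.

```python
import math

# Hypothetical 2-D data lying close to the line y = x.
points = [(2.0, 1.9), (0.0, 0.1), (-1.0, -1.2), (-2.0, -1.8), (1.0, 1.0)]
n = len(points)
mean_x = sum(p[0] for p in points) / n
mean_y = sum(p[1] for p in points) / n
centered = [(x - mean_x, y - mean_y) for x, y in points]

# Entries of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum(x * x for x, _ in centered) / n
syy = sum(y * y for _, y in centered) / n
sxy = sum(x * y for x, y in centered) / n

# Eigenvalues of a symmetric 2x2 matrix in closed form.
half_trace = (sxx + syy) / 2
root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
lam1, lam2 = half_trace + root, half_trace - root

# Unit eigenvector for the larger eigenvalue (valid because sxy != 0 here).
norm = math.hypot(sxy, lam1 - sxx)
pc1 = (sxy / norm, (lam1 - sxx) / norm)

# Variance of the data projected onto the first principal component.
projected_var = sum((x * pc1[0] + y * pc1[1]) ** 2 for x, y in centered) / n
explained_ratio = lam1 / (lam1 + lam2)
print(pc1, projected_var, explained_ratio)
```

The projected variance equals the largest eigenvalue, and it is at least as large as the variance along either original axis.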
153 verified questions are currently in the live bank. Questions updated at May 12, 2026, 5:26 PM CDT. The daily set rotates at 10:00 AM local time, and each explanation links back to the source used to write it. Use the web set for quick practice, then switch to the app when available for larger banks and deeper review.
Careers and fields this exam supports
This ML specialization path is strongest for people building a foundation before they move into more applied ML engineering or data-science roles.
Role examples: aspiring machine learning engineer, analyst moving into ML, junior data scientist, and AI career changer.
Where it shows up: machine learning foundations, model intuition, supervised learning, and analytical problem solving.
On-the-job payoff: the next step is stronger conceptual grounding before platform-specific certifications.
Typical next step: It works well before cloud ML, TensorFlow, IBM AI, or data-platform exams.
Stanford Machine Learning is easiest once you understand what this exam is really rewarding beyond surface memorization.
Current emphasis in this bank: Supervised Learning (25%).
Questions in this DeepLearning.AI lane usually separate the right answer from the merely familiar answer by scenario fit, scope, and the exact decision the exam is testing.
Best official starting point: Machine Learning Specialization.
How are Stanford Machine Learning questions generated?
dotCreds builds Stanford Machine Learning practice questions from DeepLearning.AI documentation and source-backed references, with official or primary sources preferred first. The questions are written for realistic study practice, not copied from exam dumps.
How are explanations sourced?
Each question includes a source-backed explanation and a link to the documentation or reference used to validate the answer. If an official page is too broad, dotCreds uses a reputable answer-level reference instead of pretending a generic page proves the answer.
What score do I get?
The page tracks today's answered count and accuracy for the 10-question daily set, then saves a 7-day score history on this device so you can see your recent practice trend.
Why use this site?
The site is the fastest way to start Stanford Machine Learning practice without installing anything. It is built for daily recall, quick weak-topic discovery, and source-backed explanations you can review immediately.
Why use the app when available?
The web page is the quick free sampler. If a dotCreds app is available for Stanford Machine Learning, the app is better for larger banks, focused weak-domain drills, longer review sessions, and mobile study routines.