Python Scikit-Learn Models Quiz
A 50-question quiz covering the Scikit-Learn machine learning library, from estimator basics and pipelines to model selection and evaluation metrics.
Question 1
What is the core method used to train a model in Scikit-Learn?
Question 2
Which method is used to generate predictions from a trained model?
Question 3
What does `estimator.score(X, y)` typically return for a classifier?
Question 4
What is a 'Transformer' in Scikit-Learn?
Question 5
Which method combines `fit()` and `transform()` into one step?
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.____(X_train)
Question 6
What is the shape of the input feature matrix `X` typically expected by `fit()`?
Question 7
What happens if you call `fit()` on an estimator that has already been trained?
Question 8
Which attribute usually stores the learned parameters after training (e.g., coefficients)?
Question 9
How do you instantiate a Linear Regression model?
from sklearn.linear_model import LinearRegression
model = ____
Question 10
What is the target vector `y` usually expected to be?
Question 11
For a classifier, what does `predict_proba(X)` return?
Question 12
If `predict_proba` returns `[0.2, 0.8]` for a single sample from a binary classifier, what will `predict` return for that sample?
Question 13
What input does `predict()` require?
Question 14
Which method do unsupervised clustering models use to assign cluster labels?
Question 15
Can you use `predict()` on a model before calling `fit()`?
Question 16
Which class handles missing values by replacing them with the mean or median?
Question 17
How do you convert categorical string variables (e.g., 'red', 'blue') into integers?
Question 18
What does `OneHotEncoder` do?
Question 19
Why should you fit a scaler ONLY on the training set?
Question 20
Which preprocessing step is often required for Support Vector Machines (SVM) and KNN?
Question 21
What is the main purpose of `sklearn.pipeline.Pipeline`?
Question 22
How do you create a pipeline with a scaler and a classifier?
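from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
pipe = ____(StandardScaler(), LogisticRegression())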
Question 23
When you call `pipe.fit(X, y)`, what happens to the intermediate steps?
Question 24
When you call `pipe.predict(X)`, what happens?
Question 25
What is `ColumnTransformer` used for?
Question 26
What does `StandardScaler` do?
Question 27
What does `MinMaxScaler` do?
Question 28
Which scaler is robust to outliers?
Question 29
What is the difference between `Normalizer` and `StandardScaler`?
Question 30
If you use `fit_transform` on the test set with `StandardScaler`, what happens?
Question 31
Which function is used to split data into training and testing sets?
Question 32
What is the purpose of the `random_state` parameter?
Question 33
What is 'Stratified Sampling' (via `stratify=y`)?
Question 34
Which model is a good baseline for classification tasks?
Question 35
How do you perform K-Fold Cross-Validation?
Question 36
What does `GridSearchCV` do?
Question 37
How do you define the parameter grid for `GridSearchCV`?
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
}
Question 38
What is the advantage of `RandomizedSearchCV` over `GridSearchCV`?
Question 39
After fitting `GridSearchCV`, how do you access the best model?
Question 40
Can you tune pipeline parameters with `GridSearchCV`?
Question 41
Why is cross-validation preferred over a single train/test split?
Question 42
What is `LeaveOneOut` cross-validation?
Question 43
What does `cross_val_predict` return?
Question 44
When should you use `TimeSeriesSplit`?
Question 45
Does `cross_val_score` return a fitted model?
Question 46
Which metric is appropriate for a classification problem with imbalanced classes?
Question 47
What does the Confusion Matrix show?
Question 48
What is the ROC Curve?
Question 49
Which function calculates the Mean Squared Error?
Question 50
What does an R^2 score of 1.0 indicate?
