Data Analytics
Exam
Flashcard set details
Flashcards | 129 |
---|---|
Language | English |
Category | Finance |
Level | University |
Created / Updated | 24.11.2024 / 08.02.2025 |
Weblink | https://card2brain.ch/box/20241124_data_analytics |
What is the Python code to show the regression statistics of training data?
print('Performance Measures (Training data)')
regressionSummary(train_y, toyota_ml.predict(train_X))
What is the Python code to show the regression statistics of validation data?
print('Performance Measures (Validation data)')
regressionSummary(valid_y, toyota_ml.predict(valid_X))
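The summaries printed for training and validation data are built from error measures on actual versus predicted values. As a rough sketch of what such measures look like, the plain-Python function below computes mean error, RMSE, MAE, and MAPE. The name `regression_measures` and the sample numbers are illustrative assumptions, not part of the course code.

```python
import math

def regression_measures(y_true, y_pred):
    """Return common error measures for paired actual/predicted values."""
    errors = [a - p for a, p in zip(y_true, y_pred)]  # residuals: actual minus predicted
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE is undefined if any actual value is 0
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, y_true)) / n,
    }

# Hypothetical actual and predicted values, for illustration only
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
measures = regression_measures(actual, predicted)
print(measures)
```

Comparing the training and validation values side by side is the point of the two cards above: a validation RMSE much larger than the training RMSE suggests overfitting.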
What is the Python code to replace the spaces in all variable names with underscores _?
banking_df.columns = [s.strip().replace(" ", "_") for s in banking_df.columns]
banking_df.head()
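The list comprehension works on any list of strings, so its effect can be checked without a DataFrame. In this small sketch the column names are made-up examples:

```python
# Hypothetical column names with stray and embedded spaces
columns = [" Personal Loan", "CD Account ", "Income"]

# strip() removes leading/trailing whitespace; replace() swaps inner spaces for underscores
cleaned = [s.strip().replace(" ", "_") for s in columns]
print(cleaned)  # ['Personal_Loan', 'CD_Account', 'Income']
```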
What is the Python code to convert a variable into a categorical variable?
banking_df["Education"].value_counts().sort_index()
banking_df["Education"] = banking_df["Education"].map({1: "Undergrad", 2: "Graduate", 3: "Advanced/Professional"})
banking_df.head()
What is the Python code to generate a new variable that takes the value 0 when Mortgage has the value 0 and takes the value 1 in all other cases?
banking_df["has_mortgage"] = [0 if x == 0 else 1 for x in banking_df["Mortgage"]]
banking_df.head()
What is the Python code to estimate a logit model: log(odds(has_mortgage = 1 | Income)) = β0 + β1 * Income?
import statsmodels.api as sm
X_simple = banking_df["Income"]
Y_simple = banking_df["has_mortgage"]
X_simple = sm.add_constant(X_simple)
logit_simple_mod = sm.Logit(Y_simple, X_simple)
logit_simple_mod_res = logit_simple_mod.fit()
print(logit_simple_mod_res.summary())
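The coefficients of a logit model are on the log-odds scale. A minimal sketch of how they translate into a probability and an odds ratio follows. The coefficient values `b0` and `b1` are made up for illustration and are not estimates from the banking data.

```python
import math

def predicted_probability(b0, b1, income):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + b1 * income)))."""
    log_odds = b0 + b1 * income
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients, for illustration only
b0, b1 = -1.0, 0.01

# With these values the log-odds at income = 100 are zero, so p is one half
p = predicted_probability(b0, b1, income=100)

# exp(b1) is the multiplicative change in the odds per one-unit increase in income
odds_ratio = math.exp(b1)
print(p, odds_ratio)
```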
What is the Python code to add further explanatory variables to the logit model and estimate it again?
X_full = banking_df[["Income", "Family", "CCAvg", "Education", "Age"]]
X_full = pd.get_dummies(X_full, prefix_sep="_", drop_first=True)
X_full = X_full.astype(float)  # Make sure that all columns have numerical data types
Y_full = banking_df["has_mortgage"]
X_full = sm.add_constant(X_full)
logit_full_mod = sm.Logit(Y_full, X_full)
logit_full_mod_res = logit_full_mod.fit()
print(logit_full_mod_res.summary())
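The dummy-coding step is what turns the text-valued Education column into numeric regressors. The sketch below mimics the idea in plain Python. `dummy_code` is a hypothetical helper written for illustration, not pandas's implementation, and it assumes the alphabetically first level serves as the baseline.

```python
def dummy_code(values, prefix, prefix_sep="_"):
    """One-hot encode a list of labels, dropping the first (sorted) category as the baseline."""
    categories = sorted(set(values))
    kept = categories[1:]  # dropping one level avoids perfect collinearity with the constant
    return {
        f"{prefix}{prefix_sep}{cat}": [1.0 if v == cat else 0.0 for v in values]
        for cat in kept
    }

# Hypothetical sample of the Education labels used in the cards above
education = ["Undergrad", "Graduate", "Advanced/Professional", "Graduate"]
dummies = dummy_code(education, prefix="Education")
for name, column in dummies.items():
    print(name, column)
```

Each remaining dummy coefficient is then read relative to the dropped level, here "Advanced/Professional".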
What is the Python code to make a confusion matrix?
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
predict_valid = logit_reg.predict(valid_X)
cm2 = confusion_matrix(valid_y, predict_valid)
ConfusionMatrixDisplay(cm2).plot()
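A confusion matrix is a cross-tabulation of actual against predicted classes. The following sketch counts the four cells by hand for 0/1 labels, using the layout scikit-learn follows (actual classes in rows, predicted classes in columns). The function name and the sample labels are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels coded 0/1 (rows: actual, columns: predicted)."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

# Hypothetical labels, for illustration only
actual = [0, 0, 1, 1, 1, 0]
predicted = [0, 1, 1, 0, 1, 0]

cm = confusion_counts(actual, predicted)
accuracy = (cm[0][0] + cm[1][1]) / len(actual)  # share of cases on the diagonal
print(cm, accuracy)
```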
What is the Python code to generate a lift chart?
import kds
kds.metrics.plot_lift(valid_y, predict_valid)
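A lift chart compares the response rate among the highest-ranked cases with the overall response rate. The sketch below shows that calculation for equal-sized groups. It illustrates the idea only and is not how `kds` computes its chart; `lift_by_group` and the sample data are assumptions. The ranking is most informative when the scores are predicted probabilities, since hard 0/1 predictions produce many ties.

```python
def lift_by_group(y_true, y_score, n_groups=2):
    """Rank cases by score (highest first), split into equal groups, and return each group's lift."""
    pairs = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    ranked = [y for _, y in pairs]
    overall_rate = sum(ranked) / len(ranked)
    size = len(ranked) // n_groups  # assumes len(y_true) is divisible by n_groups
    lifts = []
    for g in range(n_groups):
        group = ranked[g * size:(g + 1) * size]
        lifts.append((sum(group) / size) / overall_rate)
    return lifts

# Hypothetical outcomes and predicted probabilities, for illustration only
actual = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3]

lifts = lift_by_group(actual, scores, n_groups=2)
print(lifts)  # [2.0, 0.0]
```

A lift of 2.0 in the top group means that group contains positives at twice the overall rate.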