Milestone 3: regression
This assignment is to conduct a regression analysis using `SalePrice` as the response. The response variable is also known as y, the target variable, or the output variable. It is up to you to decide which explanatory variables to use. Explanatory variables are also known as x, features, or input variables. This assignment will be graded in part on your choice of explanatory variables and the quality of your explanations of the regression output.
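As a minimal sketch of what "response" and "explanatory variables" mean in R's formula notation, the block below fits one linear model with `SalePrice` on the left of `~`. The data frame `homes` and the columns `living_area` and `quality` are simulated, hypothetical stand-ins, not the course data; substitute your own data and variables.

```r
# Simulated stand-in data (hypothetical); replace with the course data set.
set.seed(42)
n <- 200
homes <- data.frame(
  living_area = runif(n, 600, 3000),             # hypothetical numeric feature
  quality     = sample(1:10, n, replace = TRUE)  # hypothetical 1-10 rating
)
homes$SalePrice <- 20000 + 80 * homes$living_area +
  12000 * homes$quality + rnorm(n, sd = 20000)

# Response (y) goes on the left of ~, explanatory variables (x) on the right.
fit <- lm(SalePrice ~ living_area + quality, data = homes)

# Coefficients, standard errors, p-values, and R-squared to explain in the report.
summary(fit)
```

The `summary()` output is the regression output you are asked to explain: each coefficient estimates the change in `SalePrice` for a one-unit change in that variable, holding the others fixed.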
You are responsible for completing the following steps.
- Create two linear models through trial and error. Explain them.
- Create a model through the `ols_step_best_subset()` function of the `olsrr` package. Explain it.
- Create a fourth model based on, but not identical to, the best-subset model from the previous step. It will not be identical because you will probably find that the best-subset model contains only some values of the categorical variables. This fourth model will be manually specified, not determined by the `olsrr` approach. It is the model you will use for milestone 4. Explain it.
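The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a required solution: the data are simulated, every column name other than `SalePrice` is hypothetical, and the categorical variable is assumed to enter the candidate model as 0/1 dummy columns, which is one way a best-subset search can end up keeping only some of its values.

```r
# Simulated stand-in data (hypothetical); replace with the course data set.
set.seed(42)
n <- 300
homes <- data.frame(
  living_area  = runif(n, 600, 3000),
  quality      = sample(1:10, n, replace = TRUE),
  lot_size     = runif(n, 2000, 20000),
  neighborhood = factor(sample(c("North", "South", "West"), n, replace = TRUE))
)
homes$SalePrice <- 20000 + 80 * homes$living_area + 12000 * homes$quality +
  15000 * (homes$neighborhood == "West") + rnorm(n, sd = 20000)

# Assumption: dummy columns stand in for the categorical variable.
homes$nbhd_south <- as.integer(homes$neighborhood == "South")
homes$nbhd_west  <- as.integer(homes$neighborhood == "West")

# Step 1: two models chosen by trial and error.
m1 <- lm(SalePrice ~ living_area, data = homes)
m2 <- lm(SalePrice ~ living_area + quality, data = homes)

# Step 2: best-subset search over a full candidate model.
full <- lm(SalePrice ~ living_area + quality + lot_size +
             nbhd_south + nbhd_west, data = homes)
if (requireNamespace("olsrr", quietly = TRUE)) {  # runs only if olsrr is installed
  best <- olsrr::ols_step_best_subset(full)
  print(best)
}

# Step 3: manually specified fourth model; use the whole factor
# rather than only the dummy columns the search happened to keep.
m4 <- lm(SalePrice ~ living_area + quality + neighborhood, data = homes)

# Compare the hand-built models on adjusted R-squared.
adj_r2 <- sapply(list(m1 = m1, m2 = m2, m4 = m4),
                 function(m) summary(m)$adj.r.squared)
adj_r2
```

Including the full factor in the fourth model keeps every level of the categorical variable interpretable against a single baseline level, which is easier to explain than a model containing an arbitrary subset of dummy columns.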
You must write an overall summary of the reasoning behind the path you took. You should include some visualizations or refer to the visualizations from milestone 2 to help explain why you chose some variables and not others. Keep in mind that your .qmd file should include your milestone 1 (possibly altered) and milestone 2 (possibly altered), so that the report builds on what you now know about the data.