Now that we’ve reached the end, it’s time to comment on the final model for BMI after removing outliers, working through the process of model building, adding…

## Interactions

Now it’s time to try adding interaction terms to the model. My existing variables are BloodPressure, SkinThickness, and Diabetes, so I could try adding combinations of these…

## Outliers

While investigating the distributions of my variables with R weeks ago, I could have looked for outliers, but here I’ll look again and consider removing any unusual…

## Overfitting & Variable Selection

Overfitting is the act of adding too many variables to a model and “perfectly” fitting it to the training data. Example: When a student overstudies the practice…

## Logistic Regression Prediction

Now let’s return to the logistic regression model discussed in previous posts and assess its prediction accuracy and stability. First, I’ll divide the data into training and…

## Linear Regression Prediction

To assess a model’s prediction capability, it is important to divide the data into a training set and a test set. So from here on, I’ll always…
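A training/test split can be sketched in R roughly as follows. This is only an illustration: the `patients` data frame is simulated placeholder data (not the blog's dataset), and the 70/30 ratio and the seed are assumptions chosen for the example.

```r
set.seed(42)  # arbitrary seed so the random split is reproducible

# Placeholder data standing in for the real dataset (values are simulated)
patients <- data.frame(
  BloodPressure = rnorm(100, mean = 70, sd = 10),
  BMI           = rnorm(100, mean = 32, sd = 6)
)

# Randomly pick 70% of the row indices for training; the rest form the test set
train_rows <- sample(nrow(patients), size = floor(0.7 * nrow(patients)))
train <- patients[train_rows, ]
test  <- patients[-train_rows, ]

# Fit on the training set only, then measure error on rows the model never saw
fit       <- lm(BMI ~ BloodPressure, data = train)
test_pred <- predict(fit, newdata = test)
test_rmse <- sqrt(mean((test$BMI - test_pred)^2))
```

Comparing `test_rmse` with the same error computed on `train` gives a rough sense of how well the model generalizes beyond the data it was fitted to.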

## Logistic Regression Assumptions

Now it’s time to test the assumptions and requirements of logistic regression models, just as we learned to do for linear regression models. Recall, the logistic regression…

## Logistic Regression

Now that we’ve learned logistic regression, I can start working to understand / predict instances of diabetes in the patients in my dataset. I’m building this model:…

## Testing Assumptions of Linear Regression

In this post, I’ll test the five assumptions for linear regression on the model I’ve been developing in the last few posts. The model equation is shown…

## Multicollinearity

In my last post, I presented a model with 3 [significant!] predictors: BMI = 24.50 + 0.12*BloodPressure – 0.07*Age + 4.94*Diabetes. Now it’s time to test this…
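To see how this equation turns inputs into a prediction, here is a minimal R sketch. Only the coefficients come from the model above; the function name `predict_bmi` and the patient values are hypothetical, chosen just for the example.

```r
# Hypothetical helper: applies the fitted equation from the post by hand.
# diabetes is coded 1 for a diabetic patient and 0 otherwise (assumed coding).
predict_bmi <- function(blood_pressure, age, diabetes) {
  24.50 + 0.12 * blood_pressure - 0.07 * age + 4.94 * diabetes
}

# Made-up patient: blood pressure 70, age 40, diabetic
example_bmi <- predict_bmi(blood_pressure = 70, age = 40, diabetes = 1)
print(example_bmi)
```

For this made-up patient the terms are 24.50 + 8.40 - 2.80 + 4.94, so the predicted BMI is about 35.04.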

## Multiple Regression

Now that we’ve seen how to interpret a basic model predicting BMI, let’s try to improve it to explain more of the variation in BMI. We started…

## Interpreting the R-Squared (Model Fit)

As presented in the last couple of posts, my first linear model predicts BMI using BloodPressure. The results are shown below: The R-squared for this model is…

## Interpreting the Coefficients

Now it’s time to actually interpret the coefficients from my linear regression model, created in my last post. Here it is again as a refresher: Because both…

## Linear Regression Basics

For my first linear regression model, I’ll predict BMI (Y, as mentioned in my previous post) using each patient’s BloodPressure (X). Below is the code I’ll use…

## Investigating my data with R

In this post, I’ll be using R to investigate my data and understand what I have to work with. First, I’ll take a look at the first…

## Data & Impact

This blog will detail my investigation of a dataset from Kaggle.com that could help diagnose instances of diabetes. The dataset contains 9 variables, including the following information: It seems…

## Welcome!

Welcome to my example blog, where I will share my progress through Fundamentals of Business Analytics, working through a full-scale analysis over the course of a semester.…