This R script loads the readxl, lmtest, car, and ggplot2 packages, stores the path to the “data.xlsx” file, imports the file as a data frame, and renames its columns to “x” and “y”. It then fits a linear regression model of y on x and uses ggplot2 to draw a residual plot, which shows the residuals against the fitted values so that any change in their spread is visible.
The script then performs the Breusch-Pagan test for heteroscedasticity, prints the result, and reports whether there is evidence of heteroscedasticity in the model. It then performs the White test, prints the result, and interprets it in the same way. In both tests the null hypothesis is homoscedasticity (constant error variance), so a p-value below the significance level counts as evidence of heteroscedasticity. The significance level for both tests is set to 0.05.
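To make the idea behind the Breusch-Pagan test more concrete, the sketch below computes a version of the statistic by hand using only base R. It is an illustration, not part of the script: the simulated data and the names sim_x, sim_y, sim_model, and sim_aux are assumptions made for this example. It uses the studentized (Koenker) form of the statistic, n times the R-squared of a regression of the squared residuals on the predictor, which is believed to be what bptest() reports with its default settings.

```r
# Simulated data (hypothetical): the error spread grows with sim_x,
# so the errors are heteroscedastic by construction
set.seed(1)
n_obs <- 200
sim_x <- runif(n_obs, min = 1, max = 10)
sim_y <- 2 + 0.5 * sim_x + rnorm(n_obs, sd = 0.3 * sim_x)

# Main regression, as in the script
sim_model <- lm(sim_y ~ sim_x)

# Auxiliary regression: squared residuals on the original predictor
squared_residuals <- resid(sim_model)^2
sim_aux <- lm(squared_residuals ~ sim_x)

# Studentized statistic: n * R^2, compared with a chi-squared
# distribution with 1 degree of freedom (one predictor)
bp_statistic <- n_obs * summary(sim_aux)$r.squared
bp_p_value <- pchisq(bp_statistic, df = 1, lower.tail = FALSE)

cat("BP statistic:", round(bp_statistic, 3), "\n")
cat("p-value:", signif(bp_p_value, 3), "\n")
```

A small p-value here means the squared residuals vary systematically with the predictor, which is the pattern the residual plot is meant to reveal visually.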
How to Run the Code
Follow these steps to run the script:
- Before running the script, make sure that the required packages (readxl, lmtest, car, and ggplot2) are installed. If any are missing, type the following in the console for each package and press Enter:
  install.packages("package_name")
- Once the packages are installed, load each one by typing the following in the console and pressing Enter:
  library(package_name)
- The script stores the location of the “data.xlsx” file in the input_path variable. If your data is in a different folder, replace this path with the correct path to your file.
- Change the value of the significance_level variable if necessary. It is set to 0.05 by default.
- Import the data from the “data.xlsx” file. Make sure that the file name matches the one in the script. The script renames the columns to “x” and “y”, so the file should contain two columns, with the predictor first and the response second. To import the data, paste the following lines into the console and press Enter:
  data <- read_excel("C:\\Users\\barbi\\Desktop\\data.xlsx")
  colnames(data) <- c("x", "y")
- To check that the data was imported correctly, type the following in the console and press Enter. This displays the first few rows of the data frame:
  head(data)
- To fit a linear regression model, paste the following line into the console and press Enter:
  model <- lm(y ~ x, data = data)
- To create a residual plot, paste the code block that starts with the following line into the console and press Enter. The model must already be fitted, because the plot uses its fitted values and residuals:
  ggplot(data.frame(residuals = resid(model))...
- To perform the Breusch-Pagan test for heteroscedasticity, paste the following line into the console and press Enter. This calculates the test statistic and the p-value:
  bp_test <- bptest(model)
- To interpret the result of the Breusch-Pagan test, paste the following code block into the console and press Enter. It displays a message indicating whether there is evidence of heteroscedasticity in the model:
  if (bp_test$p.value < significance_level) { cat("There is evidence of heteroscedasticity in the model.\n") } else { cat("There is no evidence of heteroscedasticity in the model.\n") }
- To perform the White test for heteroscedasticity, paste the following line into the console and press Enter. Unlike the Breusch-Pagan call above, the auxiliary regression here also includes the squared predictor:
  white_test <- bptest(model, ~ x + I(x^2), data = data)
- To print the result of the White test, type the following in the console and press Enter:
  white_test
- To interpret the result of the White test, paste the following code block into the console and press Enter. It displays a message indicating whether there is evidence of heteroscedasticity in the model:
  if (white_test$p.value < significance_level) { cat("There is evidence of heteroscedasticity in the model.\n") } else { cat("There is no evidence of heteroscedasticity in the model.\n") }
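Both interpretation steps apply the same decision rule to a p-value, so the rule can be wrapped in a small helper. The function below is a hypothetical addition (the name interpret_heteroscedasticity is not part of the script) and is only a sketch of how the repeated if/else block could be shared between the two tests.

```r
# Hypothetical helper: turns a p-value into the message used in the script.
# A p-value strictly below the significance level counts as evidence.
interpret_heteroscedasticity <- function(p_value, significance_level = 0.05) {
  if (p_value < significance_level) {
    "There is evidence of heteroscedasticity in the model."
  } else {
    "There is no evidence of heteroscedasticity in the model."
  }
}

# Example calls with made-up p-values
cat(interpret_heteroscedasticity(0.01), "\n")
cat(interpret_heteroscedasticity(0.20), "\n")
```

With the objects from the script, a call would look like interpret_heteroscedasticity(bp_test$p.value, significance_level).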
Overall Code
## Homoscedasticity Test
#+------+------+
#| x | y |
#+------+------+
#| 0.25 | 2.46 |
#| 0.26 | 1.99 |
#| 0.15 | 2.13 |
#+------+------+
# Load required packages
library(readxl)
library(lmtest)
library(car)
library(ggplot2)
# Path to the .xlsx file (edit this to match the location of your data)
input_path <- "C:\\Users\\barbi\\Desktop\\data.xlsx"
# Significance level used for both tests
significance_level <- 0.05
# Import the data from the .xlsx file and save it as a data frame
data <- read_excel(input_path)
colnames(data) <- c("x", "y")
# View the first few rows of the data frame to confirm that the data was imported correctly
head(data)
# Fit a linear regression model (must exist before the residuals can be plotted)
model <- lm(y ~ x, data = data)
# Create a residual plot of residuals against fitted values
ggplot(data.frame(residuals = resid(model)), aes(x = fitted(model), y = residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  labs(x = "Fitted values", y = "Residuals", title = "Residual plot")
# Perform the Breusch-Pagan test for heteroscedasticity
bp_test <- bptest(model)
print(bp_test)
# Interpret the result
if (bp_test$p.value < significance_level) {
  cat("There is evidence of heteroscedasticity in the model.\n")
} else {
  cat("There is no evidence of heteroscedasticity in the model.\n")
}
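# Perform the White test: the auxiliary regression uses x and its square
# (with a single predictor there are no cross-product terms)
white_test <- bptest(model, ~ x + I(x^2), data = data)
# Print the results
print(white_test)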
# Interpret the result
if (white_test$p.value < significance_level) {
  cat("There is evidence of heteroscedasticity in the model.\n")
} else {
  cat("There is no evidence of heteroscedasticity in the model.\n")
}
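The White test differs from the Breusch-Pagan test in its auxiliary regression, which also includes squared terms of the predictors. The base R sketch below shows that difference on simulated data. It is an illustration under stated assumptions rather than part of the script: the data are made up, and the names sim_x, sim_y, sim_model, and white_aux are introduced only for this example.

```r
# Simulated data (hypothetical) with error spread that grows with sim_x
set.seed(42)
n_obs <- 150
sim_x <- runif(n_obs, min = 0, max = 5)
sim_y <- 1 + 2 * sim_x + rnorm(n_obs, sd = 0.5 + 0.4 * sim_x)

# Main regression
sim_model <- lm(sim_y ~ sim_x)

# White-style auxiliary regression: squared residuals on the
# predictor and its square
squared_residuals <- resid(sim_model)^2
white_aux <- lm(squared_residuals ~ sim_x + I(sim_x^2))

# Statistic n * R^2, compared with a chi-squared distribution with
# 2 degrees of freedom (two auxiliary regressors)
white_statistic <- n_obs * summary(white_aux)$r.squared
white_p_value <- pchisq(white_statistic, df = 2, lower.tail = FALSE)

cat("White statistic:", round(white_statistic, 3), "\n")
cat("p-value:", signif(white_p_value, 3), "\n")
```

Because the squared term is included, this form can pick up variance patterns that are not linear in the predictor, at the cost of one extra degree of freedom.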
References:
- Wickham, H.; Bryan, J. readxl: Read Excel Files. 2019. https://cran.r-project.org/package=readxl.
- Wickham, H. ggplot2: Elegant Graphics for Data Analysis. 2016. https://ggplot2.tidyverse.org.