Posts

Showing posts from March, 2026

Building an R Package: Jviz

For this project, I developed the skeleton of an R package, which I named Jviz. The intent is for this package to help beginner R users easily summarize, explore, and visualize data. I came up with this idea because writing long and complex code just to understand a dataset can be complicated. The goal for this package is to make basic data analysis and visualization easy, efficient, and beginner-friendly. This package is intended for students, new analysts, and anyone who wants to start learning data visualization with R. There will be functions like quick_summary(), which is meant to easily summarize a dataset, plot_distribution(), which graphs the distribution of a variable, plot_relationship() for comparing two variables, and group_avg() for calculating averages by group. In the DESCRIPTION file, I added the fields stating the package's name, version, and author. I added ggplot2 and dplyr in the Imports field because they support data visualization and manipulat...

Module 9 Assignment

For this assignment, I chose the Guns dataset from the given list, and I used it to compare different variables in R using base R graphics, lattice, and ggplot2. The dataset allowed me to compare variables such as income, violent crime rates, and other categories. For the first comparison, I plotted income against violent crimes, because they are both dependent continuous variables and don't overlap. I also used gun law categories to compare distributions across different groups. These visualizations help show patterns within the data, as well as how different visualization tools show these patterns in different ways. Base R Visualization This scatter plot shows the relationship between income and violent crime. It helps visualize whether higher income levels are associated with lower or higher crime rates. This histogram shows the distribution of violent crimes in the dataset. It helps us identify how crime rates are spread and whether there are any extreme values....

Module #8 Assignment

For this week's assignment, I worked with the given dataset containing four variables for a set of students. The dataset included both males and females, and it also included ages, grades, and names. The assignment tasked me with importing the file into R, calculating the mean grade by sex, filtering the dataset for names containing the letter "i", and exporting the results to a CSV file.

My first step was importing the dataset into R using read.table(). After importing the data, I used the ddply() function from the plyr package, which we were tasked with installing for this assignment, to group the dataset by sex and calculate the average of the grade column. This summarized the comparison between male and female students, so I did not have to look through them individually. After generating the mean, we get the following output:

       Sex Grade_Average
  1 Female       86.9375
  2   Male       80.2500

After that I converted the dataset into a data frame and filtered u...
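The group-mean, filter, and export steps described above can be sketched roughly as follows. This is only an illustration under assumptions: the `students` data frame and its values are made up (they are not the assignment data), the column names `Name`, `Age`, `Sex`, and `Grade` are guesses, and base R's aggregate() is used in place of ddply() so the sketch runs without installing plyr.

```r
# Hypothetical stand-in for the assignment file; names and grades are made up.
students <- data.frame(
  Name  = c("Raul", "Lina", "Omar", "Iris"),
  Age   = c(21, 22, 20, 23),
  Sex   = c("Male", "Female", "Male", "Female"),
  Grade = c(80, 90, 70, 100),
  stringsAsFactors = FALSE
)

# Mean grade by sex (a base R counterpart of the ddply() step).
grade_by_sex <- aggregate(Grade ~ Sex, data = students, FUN = mean)
names(grade_by_sex)[2] <- "Grade_Average"

# Keep only rows whose Name contains a lowercase "i".
with_i <- students[grepl("i", students$Name, fixed = TRUE), ]

# Write the filtered rows to a CSV file (a temporary path is assumed here).
out_file <- tempfile(fileext = ".csv")
write.csv(with_i, out_file, row.names = FALSE)
```

With plyr installed, the grouping step would typically be written as ddply(students, "Sex", summarise, Grade_Average = mean(Grade)) and should produce the same kind of two-row summary.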