Marathon Performance Analysis
This analysis investigates the relationship between environmental conditions and marathon performance across ages and genders. Using data from five major marathons over 17-24 years, I employed exploratory plotting, mixed-effects regression modeling, and permutation importance analysis to examine how factors like temperature, humidity, and air quality affect runners’ performance. The key finding is that Wet Bulb Globe Temperature (WBGT) is the environmental factor with the greatest impact on performance, with stronger effects on women and older runners. The analysis also uncovered a U-shaped relationship between age and performance that varies by gender: women improve faster in their younger years but decline more steeply after their peak. Implemented in R using packages like lme4 and ggplot2, this project showcases advanced statistical modeling, data visualization, and thorough documentation practices.
Keywords: Academic project, Regression analysis, Exploratory data analysis
View project on GitHub