<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Statistical Analysis | Nick Analytics</title>
	<atom:link href="https://www.nickanalytics.com/category/statistical-analysis/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.nickanalytics.com</link>
	<description></description>
	<lastBuildDate>Mon, 03 Jun 2024 19:11:50 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>

<image>
	<url>https://www.nickanalytics.com/wp-content/uploads/2024/03/cropped-mini-logo-wordpress-32x32.jpg</url>
	<title>Statistical Analysis | Nick Analytics</title>
	<link>https://www.nickanalytics.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Boston Marathon: A Statistical Analysis</title>
		<link>https://www.nickanalytics.com/boston-marathon-predict-finish-times/</link>
		
		<dc:creator><![CDATA[Nick]]></dc:creator>
		<pubDate>Sat, 17 Feb 2024 18:23:59 +0000</pubDate>
				<category><![CDATA[Statistical Analysis]]></category>
		<guid isPermaLink="false">https://nickanalytics.com/?p=485</guid>

					<description><![CDATA[As a passionate runner myself, I'm always interested in knowing more about the marathon and in particular the data analysis part of it. So, I decided to look around on the internet to see if there are interesting datasets about this epic distance and its participants.]]></description>
										<content:encoded><![CDATA[
<div class="et_pb_section et_pb_section_0 et_section_regular" >
				
				
				
				
				
				
				<div class="et_pb_row et_pb_row_0">
				<div class="et_pb_column et_pb_column_4_4 et_pb_column_0  et_pb_css_mix_blend_mode_passthrough et-last-child">
				
				
				
				
				<div class="et_pb_module et_pb_text et_pb_text_0  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h1 data-sourcepos="3:1-3:62" style="text-align: left;">Boston Marathon &#8217;22 &amp; &#8217;23 Facts</h1>
<p>Welcome to this new post about my Data Analytics journey.</p>
<p>As a passionate runner myself, I&#8217;m always interested in knowing more about the marathon and in particular the data analysis part of it. So, I decided to look around on the internet to see if there are interesting datasets about this epic distance and its participants. I checked out the &#8216;big five&#8217; events and came across the <strong>Boston Marathon</strong>. This well-known marathon publishes a lot of data about its participants, like age, gender and lots of checkpoint data along the route. So, for me, it is a true treasure trove to get my hands on.</p>
<p>I downloaded the 2022 and 2023 versions and had a great time analyzing all the ins and outs of these datasets. I cleaned them, added features, removed outliers and took the time to see if there are interesting statistical facts to be discovered. I also took it to the next level trying to predict each runner&#8217;s finish time during the course of the race. I put this part in a separate blog called &#8230;</p>
<p>So, if you are ready to learn more, read on and enjoy my findings.</p>
<p>&nbsp;</p>
<h1>The dataset (2022 and 2023)</h1>
<p><span>I could not download the data in one go, so I had to do it gradually and ended up with about 200 CSV files (100 for each year). Together they contained details about some 25,000 runners for each marathon. The most important elements for me were:</span></p>
<p>&#8211; <strong>Bib number</strong> &#8211; the unique identification runners wear on their shirt<br /><span></span><span></span><span></span><span>&#8211; <strong>Runner age</strong><br />&#8211; <strong>Runner gender</strong><br />&#8211; <strong>Passing times</strong> at 5k, 10k, 15k, 20k, Half way, 25k, 30k, 35k, 40k, Finish</span></p>
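<p>As a rough illustration of how roughly 100 CSV files per year could be combined, here is a minimal sketch that uses only the Python standard library. The folder layout, the function names <code>load_year</code> and <code>hms_to_seconds</code>, and the <code>h:mm:ss</code> time format are assumptions for this example, not the actual project code.</p>

```python
import csv
import glob
import os


def load_year(folder):
    """Read every CSV file in `folder` and return one list of row dicts."""
    rows = []
    # sorted() keeps the load order reproducible across runs
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        with open(path, newline="", encoding="utf-8") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


def hms_to_seconds(text):
    """Convert a passing time such as '1:45:30' to seconds (assumed format)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds
```

<p>Calling <code>load_year</code> once per year keeps the two marathons in separate lists, which matches handling each year on its own.</p>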
<h3><span></span></h3>
<h1><span></span></h1>
<h1><span>The pre-processing steps</span></h1>
<h3></h3>
<h3>Diagram<br /><span></span></h3>
<p>I created a diagram to illustrate the steps I took in the first phase of pre-processing the data. </p></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_0">
				
				
				
				
				<span class="et_pb_image_wrap "><img fetchpriority="high" decoding="async" width="1600" height="1854" src="https://nickanalytics.com/wp-content/uploads/2024/04/pre-processing_steps.png" alt="pre-processing" title="pre-processing_steps" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/pre-processing_steps.png 1600w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pre-processing_steps-1280x1483.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pre-processing_steps-980x1136.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pre-processing_steps-480x556.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 1600px, 100vw" class="wp-image-501" /></span>
			</div><div class="et_pb_module et_pb_text et_pb_text_1  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p><em>Overview of the pre-processing steps. I handled each year separately because the two Boston Marathons are unique. I had to remove some rows (runners) because of missing checkpoints. Outliers were also removed and some very important features were added.</em></p>
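<p>The two cleaning steps in the diagram, dropping runners with missing checkpoints and removing outliers, could look roughly like the sketch below. The post does not say which outlier rule was used, so the interquartile-range (IQR) fences here are an assumption, as are the checkpoint keys and the function names.</p>

```python
import statistics

# Hypothetical column names for the ten passing times
CHECKPOINTS = ["5k", "10k", "15k", "20k", "half", "25k", "30k", "35k", "40k", "finish"]


def drop_incomplete(runners):
    """Keep only runners that have a passing time at every checkpoint."""
    return [r for r in runners if all(r.get(cp) is not None for cp in CHECKPOINTS)]


def remove_outliers(runners, key="finish", k=1.5):
    """Drop runners whose `key` value lies outside the IQR fences (assumed rule)."""
    values = [r[key] for r in runners]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in runners if high >= r[key] >= low]
```

<p>Running <code>drop_incomplete</code> first means the outlier step never has to deal with missing values.</p>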
<h3></h3>
<h3>Adding features to the data</h3>
<p>After the pre-processing steps I decided to add some new features to the data. I can use those for statistical purposes but also to enhance my machine learning model that will predict finish times (see next blog).</p>
<p>Features (columns) that I added to the data are:</p>
<p><strong>&#8211; Average Pace between each checkpoint</strong></p>
<p><strong>&#8211; Percentage decay between each checkpoint</strong></p>
<p>I wanted to get an idea of how a runner is doing during the race. Is he/she losing pace or running a &#8216;flat&#8217; race? And what pace did we see, on average, at each passing point? These kinds of questions can give insight into how, on average, the runners build up their race. The result can help runners improve their training and compare their performance with others. Another interesting aspect would be to compare the Boston Marathon with other marathons in terms of how easy or hard this run is.</p>
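<p>To make the two added features concrete, here is a minimal sketch of how they could be derived from the cumulative passing times. It assumes pace is expressed as speed in km/hr and that decay is the percentage of speed lost from one segment to the next. Both definitions and all names are assumptions for this example.</p>

```python
# Distance in km of each checkpoint: 5k ... 40k, with half way and finish
DISTANCES_KM = [5, 10, 15, 20, 21.0975, 25, 30, 35, 40, 42.195]


def segment_speeds(times_s):
    """Speed in km/h for each segment, given cumulative times in seconds."""
    speeds = []
    prev_km, prev_s = 0.0, 0.0
    for km, secs in zip(DISTANCES_KM, times_s):
        speeds.append((km - prev_km) / ((secs - prev_s) / 3600))
        prev_km, prev_s = km, secs
    return speeds


def decay_percent(speeds):
    """Percentage of speed lost from one segment to the next (positive = slowing)."""
    return [(a - b) / a * 100 for a, b in zip(speeds, speeds[1:])]
```

<p>A runner who drops from 12 km/hr to 9 km/hr between two segments gets a decay of 25% for that step, while a perfectly &#8216;flat&#8217; race gives values close to zero everywhere.</p>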
<p>Now let&#8217;s go to the statistics !</p></div>
			</div><div class="et_pb_module et_pb_text et_pb_text_2  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h1 style="text-align: left;">Statistical Analysis</h1>
<p style="text-align: left;">I did some very interesting analyses with the Boston Marathon data. Some of my findings speak for themselves, so I won&#8217;t comment on those too much. For the more complicated ones I&#8217;ll add an explanation.</p>
<p style="text-align: left;"></p>
<p style="text-align: left;"></p>
<h3 style="text-align: left;">1. Male vs. Female participants</h3></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_1">
				
				
				
				
				<a href="https://nickanalytics.com/wp-content/uploads/2024/04/gender_distribution_22_23.png" class="et_pb_lightbox_image" title="classified text"><span class="et_pb_image_wrap has-box-shadow-overlay"><div class="box-shadow-overlay"></div><img decoding="async" width="1440" height="903" src="https://nickanalytics.com/wp-content/uploads/2024/04/gender_distribution_22_23.png" alt="classified text" title="gender_distribution_22_23" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/gender_distribution_22_23.png 1440w, https://www.nickanalytics.com/wp-content/uploads/2024/04/gender_distribution_22_23-1280x803.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/gender_distribution_22_23-980x615.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/gender_distribution_22_23-480x301.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 1440px, 100vw" class="wp-image-514" /></span></a>
			</div><div class="et_pb_module et_pb_text et_pb_text_3  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><ul>
<li><em><strong>2023:</strong> 10,517 females and 14,003 males participated.</em></li>
<li><em><strong>2022:</strong> 9,706 females and 13,283 males participated.</em></li>
</ul>
<p style="text-align: left;">
<p style="text-align: left;">
<p style="text-align: left;"><em></em></p>
<h3 style="text-align: left;">2. Average finish times for all runners</h3></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_2">
				
				
				
				
				<a href="https://nickanalytics.com/wp-content/uploads/2024/04/avg_finish_times_22_23.png" class="et_pb_lightbox_image" title="clustered histogram"><span class="et_pb_image_wrap "><img decoding="async" width="1397" height="903" src="https://nickanalytics.com/wp-content/uploads/2024/04/avg_finish_times_22_23.png" alt="clustered histogram" title="avg_finish_times_22_23" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_finish_times_22_23.png 1397w, https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_finish_times_22_23-1280x827.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_finish_times_22_23-980x633.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_finish_times_22_23-480x310.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 1397px, 100vw" class="wp-image-510" /></span></a>
			</div><div class="et_pb_module et_pb_text et_pb_text_4  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p style="text-align: left;"><em>It looks like 2023 was a bit faster on average.</em></p>
<p style="text-align: left;"><em></em></p>
<p style="text-align: left;"><em></em></p>
<h3 style="text-align: left;">3. Age Distribution (males and females)</h3></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_3">
				
				
				
				
				<a href="https://nickanalytics.com/wp-content/uploads/2024/04/age_distribution_22_23.png" class="et_pb_lightbox_image" title="Age Distribution"><span class="et_pb_image_wrap "><img decoding="async" width="2780" height="1180" src="https://nickanalytics.com/wp-content/uploads/2024/04/age_distribution_22_23.png" alt="Age Distribution" title="age_distribution_22_23" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/age_distribution_22_23.png 2780w, https://www.nickanalytics.com/wp-content/uploads/2024/04/age_distribution_22_23-1280x543.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/age_distribution_22_23-980x416.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/age_distribution_22_23-480x204.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 2780px, 100vw" class="wp-image-509" /></span></a>
			</div><div class="et_pb_module et_pb_text et_pb_text_5  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p style="text-align: left;"><em>2022 and 2023 show very similar age distributions. Both years have peak participant counts in the age groups around the mid-40s, with a broad spread from young adults to seniors. </em></p></div>
			</div><div class="et_pb_module et_pb_text et_pb_text_6  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h3></h3>
<h3>4. Distribution of Finish times per gender (2022)</h3></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_4">
				
				
				
				
				<a href="https://nickanalytics.com/wp-content/uploads/2024/04/distribution_finish_times_per_gender.png" class="et_pb_lightbox_image" title="Distribution of Finish times per gender (2022)"><span class="et_pb_image_wrap "><img loading="lazy" decoding="async" width="1401" height="450" src="https://nickanalytics.com/wp-content/uploads/2024/04/distribution_finish_times_per_gender.png" alt="Distribution of Finish times per gender (2022)" title="distribution_finish_times_per_gender" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/distribution_finish_times_per_gender.png 1401w, https://www.nickanalytics.com/wp-content/uploads/2024/04/distribution_finish_times_per_gender-1280x411.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/distribution_finish_times_per_gender-980x315.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/distribution_finish_times_per_gender-480x154.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 1401px, 100vw" class="wp-image-513" /></span></a>
			</div><div class="et_pb_module et_pb_text et_pb_text_7  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p><em>Interesting insight on the distribution of finish times per gender. Mean finish time for all runners is 03:41:15 hrs. </em></p></div>
			</div><div class="et_pb_module et_pb_text et_pb_text_8  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h3 style="text-align: left;">5. Pace comparison</h3>
<p style="text-align: left;">In my case &#8216;pace&#8217; is expressed as speed in km per hour.</p></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_5">
				
				
				
				
				<span class="et_pb_image_wrap "><img loading="lazy" decoding="async" width="2079" height="1626" src="https://nickanalytics.com/wp-content/uploads/2024/04/pace_comparison_across_years.png" alt="Pace Comparison Across Years" title="pace_comparison_across_years" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_comparison_across_years.png 2079w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_comparison_across_years-1280x1001.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_comparison_across_years-980x766.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_comparison_across_years-480x375.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 2079px, 100vw" class="wp-image-521" /></span>
			</div><div class="et_pb_module et_pb_text et_pb_text_9  et_pb_text_align_left et_pb_text_align_justified-phone et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p>The two years show similar results, with paces for men around 12.50 to 13.00 km/hr and for women around 11.00 to 11.50 km/hr. Near the 25k mark, the 2023 speed crosses above 2022. I cannot fully explain the sudden peak at 35k in 2023. I know that there is a <strong>strong descent</strong> from 33k to 38k, so maybe the weather conditions or the course itself led to faster paces.</p>
<h3><strong></strong></h3>
<h3><strong>6. Mean Pace at Checkpoints (all runners)</strong></h3></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_6">
				
				
				
				
				<span class="et_pb_image_wrap "><img loading="lazy" decoding="async" width="1781" height="1250" src="https://nickanalytics.com/wp-content/uploads/2024/04/mean_pace_at_checkpoints.png" alt="Mean Pace at Checkpoints" title="mean_pace_at_checkpoints" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/mean_pace_at_checkpoints.png 1781w, https://www.nickanalytics.com/wp-content/uploads/2024/04/mean_pace_at_checkpoints-1280x898.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/mean_pace_at_checkpoints-980x688.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/mean_pace_at_checkpoints-480x337.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 1781px, 100vw" class="wp-image-516" /></span>
			</div><div class="et_pb_module et_pb_text et_pb_text_10  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p style="text-align: left;"><em>This is the average pace decline over both years for all runners.</em><em></em></p></div>
			</div><div class="et_pb_module et_pb_text et_pb_text_11  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h3>7. Std Pace at Checkpoints</h3>
<p>The standard deviation of pace at the various checkpoints indicates the variability of runner performance at those points. For 2023, the variability (standard deviation) tends to be lower at earlier checkpoints and increases towards the end, suggesting more divergence in performance as the race progresses. This pattern is similar in 2022, but the increase in variability is more pronounced, especially towards the 40k mark.</p></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_7">
				
				
				
				
				<span class="et_pb_image_wrap "><img decoding="async" src="https://nickanalytics.com/wp-content/uploads/2024/04/mean_std_at_checkpoints-e1713614245292.png" alt="Std Deviation at Checkpoints" title="mean_std_at_checkpoints" /></span>
			</div><div class="et_pb_module et_pb_text et_pb_text_12  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p>&nbsp;</p>
<p>&nbsp;</p>
<h3>8. Which runners ran a perfectly &#8216;flat&#8217; race</h3>
<p>There are runners who can keep the same pace during the entire marathon. It is truly amazing that some deviate only 1 second on average between all checkpoints. I created a top 10 list of these runners in 2023 with their performance. Note: those runners are not necessarily the top-ranking athletes, but they just run like &#8216;machines&#8217;.</p></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_8">
				
				
				
				
				<span class="et_pb_image_wrap "><img loading="lazy" decoding="async" width="704" height="416" src="https://nickanalytics.com/wp-content/uploads/2024/04/top10-with-lowest-std.png" alt="" title="top10 with lowest std" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/top10-with-lowest-std.png 704w, https://www.nickanalytics.com/wp-content/uploads/2024/04/top10-with-lowest-std-480x284.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) 704px, 100vw" class="wp-image-530" /></span>
			</div><div class="et_pb_module et_pb_text et_pb_text_13  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p>The first runner on average deviated less than 1 second at each checkpoint and on the finish line. Truly amazing ! <img decoding="async" src="https://nickanalytics.com/wp-content/themes/Divi/includes/builder/frontend-builder/assets/vendors/plugins/emoticons/img/smiley-surprised.gif" alt="surprised" /></p>
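<p>A ranking like this top 10 could be built roughly as follows. The sketch assumes that &#8216;flatness&#8217; is measured as the standard deviation of the per-segment pace in seconds per km; the exact measure behind the table is not stated, and the function names are made up for this example.</p>

```python
import statistics

# Distance in km of each checkpoint: 5k ... 40k, with half way and finish
DISTANCES_KM = [5, 10, 15, 20, 21.0975, 25, 30, 35, 40, 42.195]


def pace_std(times_s):
    """Std deviation of per-segment pace (seconds per km) for one runner."""
    paces = []
    prev_km, prev_s = 0.0, 0.0
    for km, secs in zip(DISTANCES_KM, times_s):
        paces.append((secs - prev_s) / (km - prev_km))
        prev_km, prev_s = km, secs
    return statistics.pstdev(paces)


def flattest(runners, n=10):
    """Return the n (bib, std) pairs with the lowest pace variability."""
    ranked = sorted(
        ((bib, pace_std(times)) for bib, times in runners.items()),
        key=lambda item: item[1],
    )
    return ranked[:n]
```

<p>Here <code>runners</code> maps a bib number to that runner&#8217;s list of cumulative passing times in seconds.</p>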
<p>&nbsp;</p>
<h3>9. What is the perfect age to run a marathon?</h3>
<p>In order to answer this question we can look at the plots below. The plots display the average finish times of the top 20 runners at each age. We can conclude that roughly between the ages of <strong>25 and 35</strong>, males and females run their fastest times. After that age the line clearly starts to rise, indicating a decline in pace.</p>
<p>This plot also enables runners to calculate what average decline is &#8216;reasonable&#8217;. <br /><strong>Example</strong>: the fastest women in the 25-35 age group could run the marathon in 175 minutes. The fastest 50-year-old women could do it in 200. That is a 25-minute decline, purely based on the fact of getting older. So, if you are a 50-year-old female, you&#8217;re &#8216;entitled&#8217; to adjust your marathon time by 25 minutes, compared to your 35-year-old self.</p>
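<p>The arithmetic in the example can be sketched in a few lines. The function names and the input format, a list of (age, finish time in minutes) pairs, are assumptions for this illustration.</p>

```python
from collections import defaultdict
from statistics import mean


def top_n_mean_by_age(results, n=20):
    """Average finish time (minutes) of the n fastest runners at each age."""
    by_age = defaultdict(list)
    for age, minutes in results:
        by_age[age].append(minutes)
    return {age: mean(sorted(times)[:n]) for age, times in by_age.items()}


def age_allowance(curve, age, reference_age):
    """Minutes by which `age` is slower than `reference_age` on the curve."""
    return curve[age] - curve[reference_age]
```

<p>With a curve value of 175 minutes at age 35 and 200 minutes at age 50, <code>age_allowance(curve, 50, 35)</code> gives the 25 minutes from the example.</p>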
<p>We could do the same trick to compensate women for the disadvantage of having less strength than men. If you want to compare your performance with that of a male runner, you could correct your time by the difference between the two lines, which is also around 25 minutes.</p></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_9">
				
				
				
				
				<a href="https://nickanalytics.com/wp-content/uploads/2024/04/pace_per_age_and_gender.png" class="et_pb_lightbox_image" title="top 20 finishers average finish time"><span class="et_pb_image_wrap "><img loading="lazy" decoding="async" width="2780" height="2379" src="https://nickanalytics.com/wp-content/uploads/2024/04/pace_per_age_and_gender.png" alt="top 20 finishers average finish time" title="pace_per_age_and_gender" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_per_age_and_gender.png 2780w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_per_age_and_gender-1280x1095.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_per_age_and_gender-980x839.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/pace_per_age_and_gender-480x411.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 2780px, 100vw" class="wp-image-508" /></span></a>
			</div><div class="et_pb_module et_pb_text et_pb_text_14  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h3></h3>
<h3>10. Does Age influence Pace and Decay during a race?</h3>
<p>In simple terms I want to know what effect age has during a race.</p>
<ul>
<li><strong>Negative Correlation with Pace:</strong> There is a consistent negative correlation between age and average pace at all checkpoints for both years, which implies that older runners generally have slower paces. This trend strengthens slightly as the race progresses.</li>
<li><strong>Positive Correlation with Standard Deviation (STD):</strong> In both years, there&#8217;s a positive correlation between age and the standard deviation at checkpoints, starting very weakly at earlier checkpoints and increasing towards the end. This suggests that older runners might show more variability in their pace as the race progresses.</li>
</ul></div>
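<p>The correlations described above could be computed per checkpoint along these lines. This is a plain Pearson correlation written out by hand; whether the original analysis used Pearson or another coefficient is not stated, so treat that choice and the function names as assumptions.</p>

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_per_checkpoint(ages, values_by_checkpoint):
    """Map each checkpoint name to corr(age, value at that checkpoint)."""
    return {cp: pearson(ages, vals) for cp, vals in values_by_checkpoint.items()}
```

<p>Passing speeds in km/hr as the values should give negative numbers if older runners are slower, and passing standard deviations should give positive numbers if older runners vary more.</p>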
			</div><div class="et_pb_module et_pb_image et_pb_image_10">
				
				
				
				
				<a href="https://nickanalytics.com/wp-content/uploads/2024/04/corr_age_std.png" class="et_pb_lightbox_image" title="Does Age influence Pace and Decay"><span class="et_pb_image_wrap "><img loading="lazy" decoding="async" width="3167" height="2380" src="https://nickanalytics.com/wp-content/uploads/2024/04/corr_age_std.png" alt="Does Age influence Pace and Decay" title="corr_age_std" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/corr_age_std.png 3167w, https://www.nickanalytics.com/wp-content/uploads/2024/04/corr_age_std-1280x962.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/corr_age_std-980x736.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/corr_age_std-480x361.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 3167px, 100vw" class="wp-image-511" /></span></a>
			</div><div class="et_pb_module et_pb_text et_pb_text_15  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h3></h3>
<h3></h3>
<h3>11. Average Pace Decay by Age Group</h3>
<p><span>This final plot shows the average pace decay from 5k to the finish line by age group for the 2022 and 2023 Boston Marathons. This visualization helps compare how pace decay differs across age groups between the two years. </span></p>
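<p>A grouping like the one in this plot could be produced as sketched below. The ten-year age brackets, the input format and the function names are assumptions for this example; the brackets in the actual plot may differ.</p>

```python
from collections import defaultdict
from statistics import mean


def age_group(age, width=10):
    """Label such as '40-49' for the bracket that contains `age`."""
    start = (age // width) * width
    return f"{start}-{start + width - 1}"


def mean_decay_by_group(runners):
    """runners: iterable of (age, speed_at_5k, speed_at_finish) in km/h."""
    groups = defaultdict(list)
    for age, first, last in runners:
        # Decay: share of the 5k speed that is lost by the finish, in percent
        groups[age_group(age)].append((first - last) / first * 100)
    return {label: mean(values) for label, values in groups.items()}
```

<p>Each runner contributes one percentage, and the mean per bracket is the height of one bar.</p>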
<p><span>The conclusion is that pace decline is small for younger runners (2-4%) and might go up to 5-7% for older runners. In other words, the pace a runner has at the 5k checkpoint may decline by 2% to 7%, depending on age.</span></p></div>
			</div><div class="et_pb_module et_pb_image et_pb_image_11">
				
				
				
				
				<a href="https://nickanalytics.com/wp-content/uploads/2024/04/avg_pace_decay.png" class="et_pb_lightbox_image" title="Average Pace Decay by Age Group"><span class="et_pb_image_wrap "><img loading="lazy" decoding="async" width="1693" height="1101" src="https://nickanalytics.com/wp-content/uploads/2024/04/avg_pace_decay.png" alt="Average Pace Decay by Age Group" title="avg_pace_decay" srcset="https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_pace_decay.png 1693w, https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_pace_decay-1280x832.png 1280w, https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_pace_decay-980x637.png 980w, https://www.nickanalytics.com/wp-content/uploads/2024/04/avg_pace_decay-480x312.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) and (max-width: 980px) 980px, (min-width: 981px) and (max-width: 1280px) 1280px, (min-width: 1281px) 1693px, 100vw" class="wp-image-507" /></span></a>
			</div><div class="et_pb_module et_pb_text et_pb_text_16  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h1></h1>
<h1>Conclusion</h1>
<p>In this blog post, I have done a data cleaning and analysis exercise on the 2022 and 2023 editions of the Boston Marathon. My goal was to give insights into the elements that can influence a runner&#8217;s performance, like age, gender, pace and decay during the race.</p>
<p>Many plots will not be so surprising, but some provide information that is not readily available. Examples are the comparison of two marathons in different years, or plots that explain to what extent age influences the runner&#8217;s <strong>pace or decay</strong> during the race. I also concluded that the &#8216;ideal age&#8217; to run a fast marathon is between <strong>25 and 35</strong>. On top of that, I noted that women, due to the difference in strength, have a disadvantage of about <strong>25 minutes</strong> compared to men. This difference doesn&#8217;t change over time.</p>
<p>The next blog regarding the Boston Marathon covers the <strong>prediction of finish times</strong> during the race by using Machine Learning. If you&#8217;re interested in this topic as well, please click this link:</p>
<p>Thanks for reading my blog. Nick.</p></div>
			</div><div class="et_pb_module et_pb_text et_pb_text_17  et_pb_text_align_justified et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h3>The coding I&#8217;ve done (in VS Code)</h3>
<p>Check out the code of this project on GitHub (sweetviz part): <a href="https://github.com/nickanalytics/Credit-Card-Fraud" title="Nick Analytics - Credit Card Fraud pt. 1">Nick Analytics &#8211; Credit Card Fraud pt. 1</a></p></div>
			</div><div class="et_pb_module et_pb_divider et_pb_divider_0 et_pb_divider_position_ et_pb_space"><div class="et_pb_divider_internal"></div></div>
			</div>
				
				
				
				
			</div>
				
				
			</div>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
