Cross Country Predictive Rankings

Motivation

As a sports fan, I’ve always been drawn to both the games themselves and the advanced models behind sites like KenPom for basketball and ESPN’s FPI for football. Those tools not only indicate who is good, but also attach probabilities to real questions people care about: who makes a bowl game, who wins a conference, and who reaches the playoff.

As a Division I cross country athlete studying math and computer science, I realized nothing similar existed for my own sport. XCPRS is my attempt to fill that gap and bring explainable analytics to a sport I know from the inside.

Project Overview

Cross Country Predictive Rankings is an end-to-end system that generates ratings for both individual athletes and teams using only their race performances. There are no subjective inputs. From those ratings, the engine produces rankings, projections, and probabilities for outcomes like winning a conference meet or qualifying for nationals.

Behind the scenes, an always-on service ingests new meet results, recomputes ratings and simulations, and publishes updated projections to the website. People following the sport get a live view of where things stand as the season evolves.

System Design

Engine: An analytics engine that ingests race results, then formats and normalizes them before they reach the database. It also computes individual and team ratings, runs Monte Carlo simulations, and encodes the NCAA cross country qualification logic (including the Kolas system) so it can answer "who’s in and who’s out?" at any point in the season.
Database: A PostgreSQL database that stores results, normalized performances, ratings, simulations, and qualification states in a schema designed specifically for queries like "how did this team’s odds change after last weekend?"
Frontend & API: A Next.js frontend that exposes server-side API routes for reading analytics out of the database. The web app turns those queries into dashboards, rankings tables, and detail views that are fast to navigate and easy to explore.

Engine & Modeling Challenges

The engine is an iterative, purely relative rating system that fits a curve for each race and updates athlete and team grades over several passes.

The two hardest pieces of the engine were the rating system and the Kolas logic. For the ratings, I needed a system that was monotonic: if athlete A beats athlete B in a race, A has to come out with a better grade for that race. I also wanted each second to be worth roughly one unit on the scale. The complication is that cross country races do not move as one solid pack. The front, middle, and back of the field can run very different race patterns, so a one-second gap can mean different things depending on where it happens.

I ended up fitting a curve for each race that turns times into grades. Most of the time one second of time difference maps to about one unit of grade difference. Around that baseline the curve bends to match the shape of the actual race, while still respecting the finish order. That lets the system notice when a gap at the front is more meaningful than the same gap in the pack without turning every tiny fluctuation into a big rating swing.
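To make the monotonicity constraint concrete, here is a minimal sketch in Python. It is not the fitted curve the engine uses. It is a stand-in that rescales each gap between consecutive finishers by a positive weight near 1, which is enough to guarantee that finish order is preserved. The function name race_grades, the front_weight and back_weight parameters, the linear weight shape, and the "lower grade is better" convention are all assumptions made for illustration.

```python
import numpy as np

def race_grades(times_s, front_weight=1.2, back_weight=0.8):
    """Map finish times in seconds to grades, where a lower grade is better.

    Returns grades in finish order, with the winner at 0.0.
    The weights are hypothetical: gaps at the front count slightly more
    than one unit per second, gaps at the back slightly less.
    """
    times = np.sort(np.asarray(times_s, dtype=float))
    gaps = np.diff(times)  # seconds between consecutive finishers
    # One weight per gap, sliding linearly from the front to the back.
    weights = np.linspace(front_weight, back_weight, num=gaps.size)
    # Every weight is positive, so the cumulative sum can never reorder
    # athletes: a faster time always receives a lower grade.
    return np.concatenate(([0.0], np.cumsum(gaps * weights)))

grades = race_grades([1500.0, 1503.0, 1510.0, 1540.0])
print(grades)
```

Because the weights stay close to 1, a second is still worth roughly one unit, while the 3-second gap at the front is stretched a little and the 30-second gap at the back is compressed a little.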

On top of those grades, a simulation engine uses the observed variance around each athlete’s rating to run many Monte Carlo simulations. Those simulations feed into the probabilities for conference titles, regional qualification, and nationals podium chances.
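A simulation of this kind can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's actual code: it assumes ratings are expressed in seconds with lower being faster, that race-day performance is normally distributed around the rating, and that every team fields at least five runners. It also ignores the sixth-runner tie-break and the rule that removes incomplete teams from scoring. The function name simulate_team_wins and its parameters are hypothetical.

```python
import numpy as np

def simulate_team_wins(ratings, sigmas, teams, n_sims=10_000, scorers=5, seed=0):
    """Estimate each team's chance of winning a meet by Monte Carlo.

    ratings: per-athlete rating in seconds (assumed: lower is faster)
    sigmas:  per-athlete standard deviation around that rating
    teams:   per-athlete team label
    """
    rng = np.random.default_rng(seed)
    ratings = np.asarray(ratings, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    teams = np.asarray(teams)
    names = np.unique(teams).tolist()
    wins = dict.fromkeys(names, 0)

    for _ in range(n_sims):
        # Draw one race-day performance per athlete (normality is an assumption).
        sim_times = rng.normal(ratings, sigmas)
        # Convert simulated times to overall finishing places, 1 = winner.
        order = np.argsort(sim_times)
        places = np.empty(len(sim_times), dtype=int)
        places[order] = np.arange(1, len(sim_times) + 1)
        # Team score is the sum of its best `scorers` places; low score wins.
        scores = {
            name: int(np.sort(places[teams == name])[:scorers].sum())
            for name in names
        }
        # Ties go to whichever team comes first alphabetically (simplification).
        wins[min(scores, key=scores.get)] += 1

    return {name: count / n_sims for name, count in wins.items()}

# Two made-up five-runner teams; team A is rated 20 seconds faster per athlete.
probs = simulate_team_wins(
    ratings=[1500] * 5 + [1520] * 5,
    sigmas=[10] * 10,
    teams=["A"] * 5 + ["B"] * 5,
    n_sims=2_000,
)
print(probs)
```

The same loop extends to other questions, such as podium or qualification odds, by counting a different event in each simulated race.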

Implementing NCAA Division I cross country qualifying, the Kolas system, was the other big challenge. The rules are deterministic but fussy. Automatic qualifiers come out of each region. Other teams pick up points by beating those auto qualifiers. "Push" situations can move extra teams into the meet. The engine and database schema both had to support recalculating those states throughout the season and looking at how upcoming races might change them.

On top of all that, many teams do not race their full lineup, or do not race at full effort, until later in the year. To make the output useful I had to combine the math with knowledge of the sport so I could decide when to treat a result as real information and when to treat it as noise. Being both an athlete and the person writing the model made those calls much easier.

Data Ingestion & Reliability

Cross country results live across many different timing companies, each with its own formats and scraping rules. Some sources can be scraped directly, while others effectively have to be imported by hand. Because each vendor has its own constraints, the engine and database are built so they can handle both paths cleanly while still presenting one coherent view of the season.

Scheduled jobs run the ingest, rating, and ranking pipeline so things stay up to date during the season.
The ingest layer is written to be tolerant of messy real-world data so a bad file or missing meet doesn’t break the rest of the system.
New results can be layered in as they appear, and the engine reruns the relevant pieces so projections reflect the latest races.
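As an illustration of what tolerant ingest can look like, the sketch below parses finish times and sets aside rows it cannot use instead of raising an error. The row layout (dictionaries with "name", "team", and "time" keys) and the helper names parse_time and clean_rows are assumptions for this example, not the project's actual schema.

```python
import re

# Accepts "MM:SS", "MM:SS.s", or "H:MM:SS.s".
_TIME_RE = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{2}(?:\.\d+)?)$")

def parse_time(text):
    """Convert a finish-time string to seconds, or return None if unparseable."""
    match = _TIME_RE.match(text.strip())
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + float(seconds)

def clean_rows(rows):
    """Split raw result rows into usable rows and rejected rows.

    Rows with a missing name or an unparseable time (DNF, DQ, blanks)
    are collected for review rather than stopping the whole import.
    """
    good, rejected = [], []
    for row in rows:
        seconds = parse_time(str(row.get("time") or ""))
        name = str(row.get("name") or "").strip()
        if seconds is None or not name:
            rejected.append(row)
            continue
        team = str(row.get("team") or "").strip()
        good.append({"name": name, "team": team, "seconds": seconds})
    return good, rejected

raw = [
    {"name": "A Runner", "team": "X", "time": "24:31.5"},
    {"name": "B Runner", "team": "X", "time": "DNF"},
    {"name": "", "team": "Y", "time": "25:00"},
]
good, rejected = clean_rows(raw)
print(len(good), len(rejected))
```

Keeping the rejected rows, rather than silently dropping them, leaves a trail for the sources that have to be corrected or imported by hand.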

Tech Stack & Deployment

Engine: Python, using libraries like requests, BeautifulSoup, and lxml for ingest, plus numpy and pandas for normalization, rating calculations, and simulations.
Database: PostgreSQL hosted on Railway. A relational database is a good fit for joining results across seasons, tracking qualification logic, and indexing the kinds of queries people who follow the sport care about.
Frontend & API: Next.js, React, TypeScript, and Tailwind CSS deployed on Vercel. This stack gives me strong type safety, fast developer iteration, server-side rendering for rankings pages, and simple deployments that stay close to the edge.

How the Model Performs

Looking at all predictions the model made over the 2024 and 2025 seasons shows that the model was generally well calibrated. When it assigns an outcome a 10-20% or 40-50% probability, the actual hit rates (17.4% and 49.4%) fall inside those ranges. From roughly 60% upward the model runs somewhat overconfident: 60-70% predictions hit 57.5% of the time, 80-90% predictions hit 74.7%, and 90-95% predictions hit 86.3%.

Predicted Range	Actual Hit Rate
0-5%	0.6%
5-10%	8.8%
10-20%	17.4%
20-30%	28.1%
30-40%	42.6%
40-50%	49.4%
50-60%	52.8%
60-70%	57.5%
70-80%	68.8%
80-90%	74.7%
90-95%	86.3%
95-100%	94.9%
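A table like this can be produced by binning every prediction by its stated probability and measuring how often the predicted outcome actually happened. The sketch below shows one way to do that with numpy. The function name calibration_table and the input layout (one probability and one 0/1 outcome per prediction) are assumptions, and the sample data is made up purely to show the call.

```python
import numpy as np

# Bin edges matching the ranges in the table above.
BIN_EDGES = np.array(
    [0.0, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 1.0]
)

def calibration_table(predicted, happened, edges=BIN_EDGES):
    """Return (low, high, count, hit_rate) for each non-empty probability bin.

    predicted: predicted probabilities in [0, 1]
    happened:  1 if the outcome occurred, 0 otherwise
    """
    predicted = np.asarray(predicted, dtype=float)
    happened = np.asarray(happened, dtype=float)
    # Digitize against the interior edges so that 0.0 lands in the first
    # bin and 1.0 lands in the last one.
    bin_index = np.digitize(predicted, edges[1:-1])
    rows = []
    for b in range(len(edges) - 1):
        mask = bin_index == b
        if mask.any():
            rows.append(
                (float(edges[b]), float(edges[b + 1]),
                 int(mask.sum()), float(happened[mask].mean()))
            )
    return rows

# Made-up predictions, for demonstration only.
rows = calibration_table(
    predicted=[0.02, 0.03, 0.15, 0.15, 0.97, 1.0],
    happened=[0, 0, 1, 0, 1, 1],
)
for low, high, count, hit_rate in rows:
    print(f"{low:.0%}-{high:.0%}\t{count}\t{hit_rate:.1%}")
```

Reporting the count alongside the hit rate helps when reading the extremes, since bins with few predictions are noisier.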