
about jdeko.me/bball

created & maintained by Justin DeKock
/-|-\
jdeko.me/bball is a site i created to meld my love for the nba/wnba with my desire to design, build & maintain information systems. learn more about the systems i designed to support this site below
/-|-\

systems designed to source, store, & display official nba/wnba data

sourcing nba/wnba stats with go

all data is sourced from nba.com with the go-etl program. the program's cli lets me run the etl code in different modes from different scripts: "build" mode, used by my database build script (/scripts/bld.sh in the github repo), fetches & inserts data for every nba & wnba regular season & post season game since 1970; "daily" mode runs in a cronjob (/scripts/dly.sh) every night around midnight and fetches & inserts data only for the previous day
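a minimal sketch of what that mode dispatch might look like in go. the -mode flag and the runSeasons/runDate helpers here are hypothetical stand-ins, not the actual go-etl interface:

```go
package main

import (
	"flag"
	"log"
	"time"
)

func main() {
	// hypothetical flag; the real go-etl cli may expose its modes differently
	mode := flag.String("mode", "daily", `etl mode: "build" or "daily"`)
	flag.Parse()

	switch *mode {
	case "build":
		// full historical load: every regular season & post season game since 1970
		if err := runSeasons(1970, time.Now().Year()); err != nil {
			log.Fatal(err)
		}
	case "daily":
		// incremental load: only the previous day's games (run nightly via cron)
		if err := runDate(time.Now().AddDate(0, 0, -1)); err != nil {
			log.Fatal(err)
		}
	default:
		log.Fatalf("unknown mode %q", *mode)
	}
}

// hypothetical stand-ins for the real fetch & insert entry points
func runSeasons(from, to int) error { return nil }
func runDate(day time.Time) error   { return nil }
```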
/-|-\
the go-etl process takes advantage of go's concurrency features to make several http requests to nba.com in quick succession. the program then processes & structures the data to match the postgres database design, splits the large volume of rows into small chunks, and inserts those chunks concurrently
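roughly, the fetch-then-chunk-then-insert pattern looks like this. a sketch only, with hypothetical fetch & insert helpers standing in for the real http & database code:

```go
package main

import (
	"log"
	"sync"
)

// hypothetical stand-ins for the real http request & bulk insert code
func fetch(url string) ([][]any, error) { return nil, nil }
func insert(chunk [][]any) error        { return nil }

func run(urls []string, chunkSize int) {
	// one goroutine per endpoint: several requests to nba.com in quick succession
	var wg sync.WaitGroup
	rowsCh := make(chan [][]any, len(urls))
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			rows, err := fetch(u)
			if err != nil {
				log.Printf("fetch %s: %v", u, err)
				return
			}
			rowsCh <- rows
		}(u)
	}
	wg.Wait()
	close(rowsCh)

	// split each response into small chunks and insert the chunks concurrently
	var iwg sync.WaitGroup
	for rows := range rowsCh {
		for start := 0; start < len(rows); start += chunkSize {
			end := start + chunkSize
			if end > len(rows) {
				end = len(rows)
			}
			iwg.Add(1)
			go func(chunk [][]any) {
				defer iwg.Done()
				if err := insert(chunk); err != nil {
					log.Printf("insert: %v", err)
				}
			}(rows[start:end])
		}
	}
	iwg.Wait()
}

func main() {
	urls := []string{ /* nba.com stats endpoints */ }
	run(urls, 500)
}
```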
/-|-\
this site was originally powered by a MariaDB database with data sourced from nba.com using the nba_api python package. that system worked well, but i wanted to learn more about the lower-level http concepts the package abstracts away, so i decided to rewrite the entire etl process in go, making the http requests myself. the nba_api documentation was incredibly helpful in figuring out this process
legacy python ETL | py-nba-mdb

storing the stats in postgres

all stats on the site are served from a postgres database server running in a docker container. the database design follows standard data normalization principles, with tables normalized to Codd's third normal form
/-|-\
the database is built by a single shell script (/scripts/bld.sh in the github repo). the script builds & runs the docker container, which is configured in the Dockerfile & compose.yaml files; executes sql scripts (from /sql in the github repo) to create all schemas, tables, procedures, etc.; uses the go-etl cli to source & insert nba/wnba data since 1970; and runs several stored procedures to process & load the inserted data into their destination tables
/-|-\
the go-etl program inserts data only into tables in the intake schema. each table in this schema is designed to match the structure of the json response from a specific endpoint on nba.com, which keeps changes to the source data minimal before insertion, makes errors less likely, and keeps the pipeline maintainable. the jdeko.me/bball api primarily interacts with tables in the database's api schema, which contains tables designed specifically for quickly accessing aggregated player stats. the data in these tables is deleted & reaggregated each night after new data is inserted into the database
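for a sense of why that mapping stays simple: the stats endpoints on nba.com return each result set as a name, a list of column headers, and rows of values in header order, which lines up almost one-to-one with an intake table. a sketch of decoding that shape in go (my reading of the response format and an illustrative result set name, not code from the repo):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// approximate shape of an nba.com stats response: each result set is a
// name, a list of column headers, and rows of values in header order
type statsResponse struct {
	ResultSets []struct {
		Name    string   `json:"name"`
		Headers []string `json:"headers"`
		RowSet  [][]any  `json:"rowSet"`
	} `json:"resultSets"`
}

// parseRows pulls one named result set out of a response body; the headers
// map onto intake-table columns and the rows onto insert values
func parseRows(body []byte, set string) ([]string, [][]any, error) {
	var resp statsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, nil, err
	}
	for _, rs := range resp.ResultSets {
		if rs.Name == set {
			return rs.Headers, rs.RowSet, nil
		}
	}
	return nil, nil, fmt.Errorf("result set %q not found", set)
}

func main() {
	body := []byte(`{"resultSets":[{"name":"LeagueGameLog","headers":["GAME_ID","PTS"],"rowSet":[["0022300001",113]]}]}`)
	headers, rows, err := parseRows(body, "LeagueGameLog")
	if err != nil {
		panic(err)
	}
	fmt.Println(headers, rows)
}
```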