Code: Scraping DOGE

Published June 8, 2025

I’ve been experimenting with GitHub Actions to automatically scrape the DOGE API once a week, which should make it possible to compare snapshots of each endpoint over time.
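The workflow file itself isn’t reproduced here, but a weekly GitHub Actions scrape generally looks something like this — a minimal sketch, assuming the scraper is doge.sh at the repository root; the schedule, job names, and commit step are illustrative, not the repository’s actual configuration:

```yaml
# .github/workflows/scrape.yml -- illustrative sketch only
name: Weekly DOGE scrape

on:
  schedule:
    - cron: "0 6 * * 1"   # every Monday at 06:00 UTC
  workflow_dispatch:       # also allow manual runs

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run the scraper
        run: bash doge.sh
      - name: Commit any new snapshots
        run: |
          git config user.name "github-actions"
          git config user.email "actions@users.noreply.github.com"
          git add *.json
          git commit -m "Weekly snapshot" || echo "Nothing new to commit"
          git push
```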

You can clone the repository and run the script doge.sh from the command line to download a JSON file of all the contracts, grants, leases, or payments displayed on the DOGE website. Or you can grab the link to the most recent JSON file the scraper has downloaded and analyze it from there.
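If you’d rather work outside R, the same file can be pulled with nothing but the Python standard library. A sketch: the URL points at the repository’s raw contracts.json, and the function name is mine, not part of the scraper.

```python
import json
import urllib.request

# Raw URL of the scraper's most recent contracts snapshot
CONTRACTS_URL = (
    "https://raw.githubusercontent.com/DiPierro/doge/refs/heads/main/contracts.json"
)


def fetch_snapshot(url: str = CONTRACTS_URL):
    """Download and parse one JSON snapshot from the repository."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```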

For example, here’s how you might pull down the latest contracts.json file using R:

# Libraries
library(tidyverse)
library(jsonlite)

# Raw URL of the most recent contracts snapshot
url_gist <- "https://raw.githubusercontent.com/DiPierro/doge/refs/heads/main/contracts.json"

# Stream and parse the JSON
raw_contracts <- fromJSON(curl::curl(url_gist))

# Flatten the nested list into one data frame, one row per contract
contracts <- map_dfr(raw_contracts, bind_rows)

# Peek at the first five rows and first four columns
contracts %>%
  head(5) %>% 
  select(1:4) %>% 
  knitr::kable()
piid             agency                       vendor                            value
140D0424F0005    Department of the Interior   Family Endeavors, Inc        3329900357
2032H524A00020   Department of Treasury       CENTENNIAL TECHNOLOGIES INC. 1900000000
HT001523D0002    Department of Defense        A1FEDIMPACT                  1826530973
FA872624FB071    Department of Defense        Accenture                    1491605888
FA701420D0007    Department of Air Force      Deloitte Consulting LLP      2750000000
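Once two dated snapshots are on disk, comparing them is mostly a matter of keying records on a stable identifier such as piid. A minimal sketch in Python — the helper function and the toy records are mine, standing in for two weekly downloads:

```python
# Compare two snapshots of contract records keyed on a stable ID.
def diff_snapshots(old, new, key="piid"):
    """Return IDs added, removed, and changed between two snapshots."""
    old_by_id = {r[key]: r for r in old}
    new_by_id = {r[key]: r for r in new}
    added = sorted(set(new_by_id) - set(old_by_id))
    removed = sorted(set(old_by_id) - set(new_by_id))
    changed = sorted(
        k for k in set(old_by_id) & set(new_by_id) if old_by_id[k] != new_by_id[k]
    )
    return added, removed, changed


# Toy records standing in for two weekly downloads
week1 = [
    {"piid": "A1", "value": 100},
    {"piid": "B2", "value": 200},
]
week2 = [
    {"piid": "B2", "value": 250},  # value revised
    {"piid": "C3", "value": 300},  # new contract
]

added, removed, changed = diff_snapshots(week1, week2)
# added -> ["C3"], removed -> ["A1"], changed -> ["B2"]
```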

A word of caution: several news reports have called out errors in DOGE’s accounting of its government restructuring efforts, so take care when working with this data.

Found a bug? Got a comment, criticism, or suggestion for improvement? Drop me a note: adipierro@edsource.org.