Python

NASA Data Warehouse

Pulls data from the NASA DONKI API and stages it in S3 for loading into Redshift using boto3.

Beatles Bops

Builds an ETL pipeline that pulls album information for The Beatles from the Spotify API and loads it into PostgreSQL.

Excel Python Migration

Migrates hundreds of identically formatted Excel files to SQL using Python.

TV Series NoSQL Database

Pulls TV season and episode information for the longest-running TV series and stores it in a NoSQL database.

ISS Stream

Streams the coordinates of the International Space Station (ISS) using Kafka.

COVID-19 Data Project

Places Johns Hopkins University COVID-19 data into an S3 bucket and processes it with PySpark in Databricks to create a [dashboard](https://covid-19-jacobs.herokuapp.com/).

Consumer Complaints

Reads a federal government consumer complaints CSV file and aggregates summary statistics.
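
A minimal sketch of this kind of aggregation, using only the standard library. The column name `product`, the function `summarize_complaints`, and the sample rows are hypothetical and are not taken from the project itself.

```python
import csv
import io
from collections import Counter


def summarize_complaints(csv_file, group_col="product"):
    """Count complaints per value of group_col and return summary statistics."""
    reader = csv.DictReader(csv_file)
    counts = Counter(row[group_col] for row in reader)
    return {
        "total": sum(counts.values()),   # number of complaint rows
        "groups": len(counts),           # number of distinct values in group_col
        "counts": dict(counts),          # complaints per value
        "top": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical sample data standing in for the real complaints file.
SAMPLE = """product,company
Mortgage,Bank A
Credit card,Bank B
Mortgage,Bank C
"""

summary = summarize_complaints(io.StringIO(SAMPLE))
print(summary)
```

Passing an open file handle instead of `io.StringIO` would apply the same aggregation to a file on disk.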

Animal Crossing Popularity Data

Scrapes data on *Animal Crossing* villager popularity, joins it to a Kaggle table of villager traits, and appends the result to a MySQL table using a cron job.

Harry Potter NLP Project

Performs sentiment analysis on the Harry Potter book series using a naive Bayes classifier and natural language processing.
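
To illustrate the general technique, here is a small multinomial naive Bayes classifier with add-one (Laplace) smoothing, written with the standard library only. It is a sketch under assumptions: the project may well use a library classifier instead, and the training sentences, labels, and function names below are invented for illustration rather than taken from the books or the project.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(samples):
    """Build per-label document and word counts from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior for text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids a zero probability for unseen pairs.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training sentences, not text from the books.
TRAINING = [
    ("a wonderful happy feast", "pos"),
    ("brave friends laughed", "pos"),
    ("a dark terrible curse", "neg"),
    ("cold fear and pain", "neg"),
]

model = train(TRAINING)
print(predict(model, "happy friends"))
print(predict(model, "terrible pain"))
```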

Anonymize

A simple find-and-replace script that anonymizes confidential code across multiple text-based files without opening each one manually.
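
A rough sketch of how such a script could work, assuming plain string substitution over UTF-8 text files. The function name `anonymize_files`, the `*.txt` default pattern, and the replacement mapping are hypothetical; the actual script may differ.

```python
from pathlib import Path


def anonymize_files(root, replacements, pattern="*.txt"):
    """Apply every find -> replace pair to each matching file under root.

    Returns the list of paths whose contents changed.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        original = path.read_text(encoding="utf-8")
        updated = original
        for find, replace in replacements.items():
            updated = updated.replace(find, replace)
        if updated != original:
            # Only rewrite files that actually contained a match.
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

A call such as `anonymize_files("project_dir", {"AcmeCorp": "CLIENT"})` would rewrite matching files in place, so running it on a copy of the directory first is the safer choice.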