Ester Tsai

M.S. Data Science @ Columbia University
B.S. Data Science @ UC San Diego

Email: tsaiester@gmail.com

View My LinkedIn Profile | View My GitHub Profile

Database Management Task Automation

Project Overview

For my Database Management internship at Easy Event Planning, I worked directly under the CEO, Kathleen Kilmer, to automate tasks that used to be manual and time-consuming. I picked up Python web scraping and advanced Excel to speed up repetitive processes such as data collection, data preparation, and data validation, then wrote instructions and held training sessions for a team of seven students. Because I cared about the company’s objectives and the quality of my results, the CEO trusted me with tasks she would normally assign to her Database Management employee and gave me valuable opportunities to update the company’s Main Database. She appreciated that I went beyond the bare minimum to imagine bigger possibilities, and said that appointing me as team lead was one of her best decisions.

One of the tasks involved identifying which interns’ SEO work had helped advance the company’s focus keywords and web pages to the first page of Google each week. This was a complicated, multi-step task that I tackled with Google Sheets, Excel, and Jupyter Notebook. Because it is a crucial task that has to be done weekly, the CEO asked me to train the next cohort of interns so they could carry it on. To make the handoff easier, I created a Python data-appending template, an instructions document, and a tutorial video, then met with the interns over Zoom to make sure they knew how to use these resources.
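
To give a flavor of the data-appending template described above, here is a minimal pandas sketch. The file names, column names, and the page-one rank cutoff are hypothetical stand-ins, not the company’s actual schema:

```python
import pandas as pd

# Hypothetical files: the real report and roster live in the
# company's Google Sheets / Excel workbooks.
report = pd.read_excel("seo_analytics_report.xlsx")  # this week's keyword rankings
roster = pd.read_csv("intern_roster.csv")            # which intern worked each keyword

# Attach each intern's name to the keywords they worked on,
# keeping every report row even if no intern matches.
merged = report.merge(roster, on="keyword", how="left")

# Flag keywords that reached the first page of Google this week
# (assuming a "rank" column, where positions 1-10 are page one).
merged["first_page"] = merged["rank"] <= 10

# Append this week's rows to the running history and save it back out.
history = pd.read_excel("seo_history.xlsx")
updated = pd.concat([history, merged], ignore_index=True)
updated.to_excel("seo_history.xlsx", index=False)
```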

In another task, the team received a list of 5,000+ invalid city and state names (misspelled or completely wrong). For each location, we had to look up the correct name and its coordinates, or mark the entry as “remove”. The team started out searching for every name manually, and each row took 30 seconds to 2 minutes to complete. Even with each team member contributing 2 hours a day, the task would have taken 7 days or more, not counting the time to proofread. Working row by row felt very inefficient, so I tried web scraping, using the invalid city and state names as input and the Google-corrected names and coordinates as output. With some fine-tuning, I created a Python program that replaced 90% of the manual work, so we finished the task much earlier than planned.
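
My actual program scraped Google search results with BeautifulSoup; since Google’s markup changes often, the sketch below illustrates the same lookup idea with a swapped-in technique, the Nominatim geocoding service via geopy. The file and column names are hypothetical:

```python
import time
import pandas as pd
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="location-cleanup-demo")  # hypothetical app name

def correct_location(raw_name):
    """Look up a possibly misspelled 'City, State' string.

    Returns (corrected_name, latitude, longitude), or
    ("remove", None, None) if no match is found.
    """
    location = geolocator.geocode(raw_name)
    if location is None:
        return "remove", None, None
    return location.address, location.latitude, location.longitude

invalid = pd.read_csv("invalid_locations.csv")  # hypothetical input file

results = []
for name in invalid["city_state"]:
    results.append(correct_location(name))
    time.sleep(1)  # Nominatim's usage policy asks for at most 1 request per second

invalid["corrected"], invalid["lat"], invalid["lon"] = zip(*results)
invalid.to_csv("corrected_locations.csv", index=False)
```

Rows that come back as “remove” would still get a quick manual check, which is roughly the 10% of the work the automation could not replace.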

I loved this internship experience because I could really see the effects of my contributions. It finally feels like I’m using the machines rather than being used by them. Some representative tasks are:

Conduct Training Sessions on Data Appending and Web Scraping

Instructions - Setting up Jupyter Notebook for Data Appending and Web Scraping

Template - Pandas Data Appending

Template - BeautifulSoup Web Scraping

Append Student Names from Multiple Sources to SEO Analytics Report and Create Tutorial for Future Interns

Instructions - Append to SEO Analytics Report

Template - Append to SEO Analytics Report

Assign Vendors with Invalid Locations to the Correct Market Region

Allow People to Search for a Market Based on Neighborhood

Scrape Google for Vendors’ Missing Info

Resolve Duplicates in Bridal Show Sales