Aspiring Data Scientist Ezhil Thaenraj, Websites, IT & Software
EDUCATION:
University of California, San Diego | B.S. Data Science | Track: Machine Learning and Artificial Intelligence
Halıcıoğlu Data Science Institute, UC San Diego
California High School
Coursework: Principles of Data Science, Programming and Basic Data Structures for Data Science,
Theoretical Foundations of Data Science I, Calculus and Analytic Geometry for Science and
Engineering, Linear Algebra, Statistical Methods, Exploratory Data Analysis and Inference
SKILLS:
Concepts: Data Structures, Classification, Regression, Machine Learning, Data Pipelining, Databases,
Distributed Computing
Languages: Java, Python, SQL, R, C#, Spark, Scala, HTML, CSS, Django
Tools: Azure Databricks, Azure SQL, Azure Web Apps, Pandas, NumPy, Scikit-learn, MLlib, Librosa,
RStudio, Microsoft Office
PROFESSIONAL EXPERIENCE:
Student Developer, STUDEV Aug. 20 – Dec. 25 · Part-Time · Performed extensive analysis of client
data using Azure Databricks and R to provide insights · Provided statistical and graphical analysis of sales
data for multiple clients
Web Developer, Keneland LLC Aug. 20 – Sept. 20 · Full-Time · Designed Keneland LLC's webpage using
Django and hosted it through Azure Web Apps · Built a responsive site, served through Azure Web Apps,
that dynamically adapts its presentation to desktop and mobile browsers · Created a sleek user
interface with affordances to improve navigability and overall user experience
Code Coach, The Coder School Jun. 18 – Jun. 19 · Instructed students aged 8 to 18 in
coding across a range of programming languages · Led group instruction sessions on hardware
such as the Raspberry Pi and Arduino Uno · Managed the front desk, greeted incoming customers, and
handled back-end scheduling
PROJECTS:
Credit Risk Classifier: Used the PySpark, pandas, NumPy, Matplotlib, scikit-learn, and MLlib packages to classify
whether a given record qualifies as a credit risk. Created a machine learning pipeline that
accurately predicts this using logistic regression on large datasets (> 10 million records).
Link: https://github.com/pranavt2001/CreditRisk_LogisticRegression
NYC Yellow Cab Project: Took NYC taxi data from the city archives and performed data analytics to
collect insights into needless cab usage for intra-borough travel and how reducing cab usage
in these cases can reduce one's carbon footprint. Link: https://github.com/pranavt2001/NYC-Taxi
Movies MetaData Project: Took the Movies dataset from Kaggle and performed multiple data
transformations to filter for the movies most suitable for children, given their particular
genre interests. Link: https://github.com/pranavt2001/Kaggle-Movies-Metadata
COVID-19 Project: Took COVID-19 data from covidtracking.org and performed multiple transformations to
collect insights into the daily increases in cases per state, the testing rates of each state, and more.
Presented these individual insights in understandable visual representations, then
created widgets that automate filtering for ease of use. Link: https://github.com/pranavt2001/COVID-19
MLB LSLR: Used R to perform linear regression on a dataset of MLB salaries and other features,
each categorized as an indicator of either performance or experience. Ran single-variable
and multiple-variable regressions to see whether experience features, performance features, or a
combination of both produced the best model for predicting player salaries. Link:
https://github.com/pranavt2001/MLB_LSLR
CERTIFICATIONS:
Databricks Certified Associate Developer for Apache Spark 3.0: Issued Sept 2020
Databricks Certified Associate ML Practitioner for Apache Spark 2.4: Planned for Mar 2020
Websites, IT & Software
Java
Python
R Programming Language
Scala
SQL