Yash Agrawal

MS in Information Technology and Management (Data Analytics and Management)

Email: yagrawal1@hawk.iit.edu

Phone: 312-684-6967

About Me

Hi, I am Yash Agrawal

I am a graduate student in Information Technology and Management and a data evangelist with a keen sense for data: modeling, mining, pattern recognition, and trend analysis to discover valuable insights and represent them visually. I am knowledgeable in forecasting and in pattern and trend identification, highly skilled in public speaking, data analysis, and database design, and experienced in data visualization, customer segmentation, and analytics. I am also adept at classroom instruction and at collaborating with students and faculty to create sustainable solutions in the information technology field.

I will graduate from the Illinois Institute of Technology in May 2021 with a master’s degree in Information Technology and Management, with Data Management as my major. I am seeking a new opportunity to use my knowledge and skills to help companies forecast trends, patterns, and behaviors, so they can make metric- and prediction-based decisions that improve accuracy, success rate, and sales volume or reduce cost and risk.

Scroll down to learn more about me!

Projects

Chicago Transit Crime Analysis | Feb 2020

A binary classification problem: predicting whether or not an incident will occur at CTA locations, using a set of significant attributes.

Contributions:

  • Fetched and built the train data using the HERE API; implemented feature selection with the chi-squared (Chi2) filter method in Python.
  • Used a correlation matrix to evaluate the correlation among variables, then applied N-fold cross-validation.
  • Developed predictive models with the classification algorithms AdaBoost, Bagging, Gradient Descent, and Random Forest.
  • Evaluated the models on test data using accuracy, precision, and recall, reaching 79% accuracy.

Artificial Intelligence-based Fault Prediction Model | December 2019

Implemented a fault prediction model to forecast whether a network fault will occur and what type of fault it will be.

Contributions:

  • Used an R script to import data from the telecom network; cleansed and preprocessed the data using Power Query.
  • Used a correlation matrix to evaluate the correlation among variables, then applied N-fold cross-validation.
  • Mined and interpreted frequent patterns using Sequential Pattern Discovery using Equivalence classes (SPADE) in RStudio.
  • Evaluated the models on test data using accuracy, precision, and recall, reaching 79% accuracy.

End to End Data Warehouse for a Retail Store | April 2019

Data warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful business insights. A data warehouse plays an important role in an enterprise: it helps analyze the key aspects that drive sales.

Contributions:

  • Designed an end-to-end sales data warehouse using Pentaho DI and performed various transformations on the data (Row Normalizer, Database Lookup, If Field Value Is Null).
  • Designed a database using a star schema and built a multi-dimensional cube using Mondrian.
  • Configured the Pentaho BI server to deploy reports by creating a database connection in the Pentaho Enterprise Console for central usage.

Bike Sharing Prediction Model | Sep 2019

A prediction model for forecasting the bike-sharing business across the USA.

Contributions:

  • Applied data processing and feature engineering to extract insights; performed residual analysis on the developed models.
  • Developed linear models and compared them using the AIC/BIC metrics; applied ANOVA to data from the relational databases.
  • Trained models with methods such as Random Forest and Support Vector Machine (SVM), achieving 87% accuracy on test data.

Credit Risk Analysis of CredX Bank | November 2018

Have you ever used a credit card at a store only to be declined? Or had a payment blocked because you were charged a higher amount? Credit card fraud is a wide-ranging term for theft and fraud committed using a credit card or any similar payment mechanism as a fraudulent source of funds in a transaction.

Contributions:

  • Built a model that identifies the factors influencing world happiness with 92% accuracy.
  • Used classification algorithms such as KNN, Naïve Bayes, and logistic regression to validate the results.
  • Performed data cleansing and preprocessing using R statistical packages.
  • Identified significant variables using Weight of Evidence (WOE) and Information Value (IV) analysis to address the class imbalance, in which only 0.17% of transactions are fraudulent.
  • Built Logistic Regression, Decision Tree, and Random Forest models to predict fraudulent transactions, and validated them using a confusion matrix and F1 score.

Experience

DIGBY'S DETECTIVE AND SECURITY AGENCY

https://digbysecurity.com/

Data Analyst Intern

May 2020-present

Responsibilities

  • Building a recommendation engine in Python to enable strategic planning for predicting crime at 200+ Chicago transit stations.
  • Implementing various classification models, such as XGBoost, Decision Trees, SVM, KNN, and ensemble methods, to improve predictions.
  • Diagnosed neural networks to raise performance to 85.22%.
  • Implementing automated machine learning (AutoML) algorithms to improve performance metrics and enhance the results.

Illinois Institute of Technology

https://www.iit.edu/

Graduate Teaching Assistant

2020-present

Responsibilities

  • Graduate Teaching Assistant for the Artificial Intelligence and Machine Learning course.
  • Assisting the professor and teaching Artificial Intelligence concepts and Machine Learning classification to approximately 80 students.
  • Grading assignments and providing constructive feedback.
  • Providing guidance and mentoring for students as they learn new concepts.

Tata Communications Pvt Ltd

http://tatacommunications.com/

Data Analyst

2018-2019

Responsibilities

  • Developed an Extract, Transform, Load (ETL) pipeline in Python covering data ingestion, data wrangling, preparation, visualization, and an AI/ML-based model prototype.
  • Studied the telecom module and its related processes via eTOM, along with the Big Data Analytics process tracks, to understand the overall flow.
  • Implemented a fault prediction model based on the Telecom Alarm Sequence Analyzer and the Sequential Pattern Discovery using Equivalence classes (SPADE) algorithm.
  • Explored the Spark Python API (PySpark) for faster data processing.

Illinois Institute of Technology

https://www.iit.edu/

Community Desk Assistant

2020-Nov 2020

Responsibilities

  • Met and greeted staff, students, and guests at the community desk.

Education

Illinois Institute of Technology, Chicago

MS in Information Technology and Management (Data Management & Analytics)

3.66/4.0

2019-present

"Illinois Tech stands at the crossroads of exploration and innovation, advancing Chicago and the world".

  • Courses: Data Analytics, Data Warehousing, Data Mining, Database Management, Rich Internet Applications, Object Oriented Modeling and System Design, Java, Project Management
  • Projects: Chicago Transit Crime Analysis, Artificial Intelligence-based Fault Prediction Model, Bike Sharing Prediction Model, Handwritten Digit Recognition Using SVM, Credit Risk Analysis, UBER Cab Supply-Demand Gap Analytics

International Institute of Information and Technology, Bangalore

Post Graduate Diploma in Data Science

4.0/4.0

2018-2019

"The International Institute of Information and Technology is among the top 50 colleges in India and believes in kindling the spirit of this unique and creative discipline in every student who enters its portals."

  • Relevant Courses: Introduction to Data Management, Exploratory and Statistical Analysis, Big Data, Predictive Analytics 1 & 2, Data Warehousing, Visualization using Tableau, Neural Network.
  • Projects: Data Ingestion and Analyzing Big Data on the Apache Hive Platform, FIFA World Cup Prediction

Acropolis Technical Campus, Indore

Bachelor of Engineering in Computer Science

3.8/4.0

2013-2017

"Acropolis Technical Campus is one of the refined colleges in computer science."

  • Courses: Data Structure, Algorithm in Data Science, Theory of Computation

Certificate

IBM

Data Analysis using Python

May 2020

Topics

  • Analyzing data in Python using multi-dimensional arrays, manipulating DataFrames in pandas, using the SciPy library of mathematical routines, and performing machine learning with scikit-learn.

IBM

Machine Learning using Python

May 2020

Topics

  • Supervised vs. Unsupervised Learning, applications of different types of machine learning models

AWS

www.linkedin.com/in/yash-agrawal5/

Data Analytics Fundamentals

June 2020

Topics

  • AWS Data Analytics Fundamentals walks through the process for planning data analysis solutions and the various data analytic processes that are involved. The course introduces the 5 V’s that indicate the need for specific AWS services in collecting, processing, analyzing, and presenting your data. It covers everything from raw data storage in Amazon S3 or within an Amazon S3 data lake to the visualization of analytical data using Amazon QuickSight, along with the many services that fall between the beginning and end of the data flow.

Tableau

www.linkedin.com/in/yash-agrawal5/

Tableau Analyst

May 2020

Topics

  • Desktop I, Desktop II, Analytics Best Practices, Analyst Skills Assessment Badge

LinkedIn

www.linkedin.com/in/yash-agrawal5/

LinkedIn Certifications

Mar 2020-May 2020

Topics

  • Python Data Analysis
  • Recommendation System using Python Machine Learning & AI
  • SQL for Exploratory Data Analysis

Technical Skills

  • Data Science Skills: Neural Network, Data Analysis, Pattern and Trend Identification, Visualization and Data Insights, Data Warehousing, Data Mining, Regression Analysis, Data Modeling, Predictive Analytics
  • Programming Languages: R Programming, Python, Java, Jupyter Notebook
  • Machine Learning: Decision Tree, KNN, SVM, K-Means, Random Forest, AdaBoost, Extra Tree, and Bagging.
  • Big Data Technology: Hadoop, Hive, Pig
  • Data Visualization: Excel, Tableau, Power BI, Pentaho CDE
  • BI and Reporting tools: MS Power BI, Tableau, Excel VBA, VLOOKUP
  • Data Analytics: Excel, R, SQL, Python, Saiku Analytics
  • Database: MS Access, Oracle SQL Developer, MySQL, MS SQL
  • ETL Tools: Pentaho
  • Web Analytics: Google Analytics

Others

  • Recipient of “Tech Edge Scholarship”, Illinois Institute of Technology, Chicago
  • Held a workshop and gave a seminar on “Training on Python – Beginners to Expertise” to 800 members from different technical fields at the Tata Communications Pvt Ltd campus, Pune, in 2018
  • Secured 1st position in “Model Presentation using predictive analysis implementing Python”, organized by TATA Consultancy Services and held at Oriental College Bhopal in 2015
  • Secured 1st position in Google Webpage Design held at IIT Indore in 2015
  • Got 3rd rank in the “Line Follower Robotic Competition” at IIT Indore in 2016
  • Received an on-the-spot award from the Development Lead for additional effort in delivering zero-bug code