A compilation of personal and online course projects. As a whole, this portfolio aims to demonstrate experience with Data Science principles, including obtaining and cleaning data, building Extract, Transform, Load (ETL) pipelines, performing Exploratory Data Analysis (EDA), and building and validating Machine Learning models.
This competition aims to build counterfactual models that predict buildings’ energy usage. A successful model should scale well and minimize the Root Mean Squared Log Error (RMSLE). A counterfactual model estimates what a building's energy usage would be if no improvements had been made. This estimate is then compared with the actual energy usage after the improvements to calculate the energy savings and confirm that the improvements are in fact working.
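For reference, the metric named above can be sketched in a few lines. This is a minimal, standard-library illustration of the usual RMSLE definition, not the competition's official scorer; the function name `rmsle` and its arguments are hypothetical.

```python
import math


def rmsle(predicted, actual):
    """Root Mean Squared Log Error between two equal-length sequences.

    Assumes all values are non-negative (e.g. meter readings), since
    log1p is undefined for inputs at or below -1.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")

    # Squared difference of log(1 + x) for each prediction/actual pair.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))
```

Because the error is computed on a log scale, a prediction that is off by a fixed ratio is penalized about equally for small and large buildings, which suits energy readings that span several orders of magnitude.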
This project focuses on analyzing interactions between users and articles on the IBM Watson Studio platform. New article recommendations are made to users based on their interactions with articles. Based on the data available, we can use various methods to make these recommendations. The methods used here are Rank Based, Collaborative Filtering, and Matrix Factorization.
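As an illustration of the simplest of these methods, the sketch below shows one way a rank-based recommender could work: rank articles by interaction count and suggest the most popular ones a user has not yet seen. The data layout (a list of `(user_id, article_id)` pairs) and the function names are assumptions made for this example, not the project's actual code.

```python
from collections import Counter


def top_articles(interactions, n):
    """Return the n most-interacted-with article ids.

    `interactions` is assumed to be a list of (user_id, article_id) pairs,
    one per interaction. Ties are broken by article id for determinism.
    """
    counts = Counter(article for _, article in interactions)
    ranked = sorted(counts, key=lambda article: (-counts[article], article))
    return ranked[:n]


def recommend_for_user(interactions, user_id, n):
    """Recommend up to n popular articles the user has not interacted with."""
    seen = {article for user, article in interactions if user == user_id}
    ranked = top_articles(interactions, len(interactions))
    return [article for article in ranked if article not in seen][:n]


# Hypothetical sample data for illustration only.
sample = [
    ("u1", "a"), ("u2", "a"), ("u3", "a"),
    ("u1", "b"), ("u2", "b"),
    ("u3", "c"),
]
print(top_articles(sample, 2))
print(recommend_for_user(sample, "u3", 2))
```

A rank-based approach needs no history for the target user, which makes it a common fallback for brand-new users, whereas collaborative filtering and matrix factorization rely on existing interactions.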
Dashboards developed using Flask, Plotly, Pandas, and NumPy. Data acquired from the World Bank is cleaned and transformed into a usable form with Pandas. Flask is used to create an HTML template for the visualizations, which are built with Plotly.
Scans a newspaper article and its images, looking for occurrences of specified keywords and detecting faces. OpenCV is used to detect faces, Tesseract to perform optical character recognition (OCR), and PIL to assemble the resulting images into a new contact sheet.