levist7/portfolio

Portfolio

This portfolio is a compilation of data projects that I have done for research and portfolio purposes.

Skills

Methodologies: Data analysis, machine learning, deep learning, natural language processing, statistics, experiment design

Currently using 💻

Languages: Python, SQL
Libraries: Pandas, NumPy, PySpark, Scrapy, MySQL, PostgreSQL, Matplotlib, Plotly, scikit-learn, TensorFlow, Keras
Deployment: Streamlit, Docker, Heroku
Framework: Databricks, Tableau, Power BI, AWS, VS Code, Jupyter Notebook, Google Colab, Git and GitHub

Learning 🏗️

Language: C++
Cloud: Google Cloud Computing
Framework: MLflow

Past tech ⌛

Visual Basic, some VBA, some MATLAB, a little Fortran 90.
Scripting for finite element analysis software such as Ansys APDL, OpenSees and Cast3M.

Projects

Civil Work Bidding Price Prediction in San Francisco. AI-Powered

A machine learning app that helps users estimate the construction cost of future housing or apartment projects in San Francisco, California.

Domains: supervised machine learning, feature engineering, data visualization, model performance, construction cost, investment, app deployment
App

Getaround Car Rental Price Predictor and Dashboard on a New Feature

Deployment of an online API that serves an XGBoost model through a prediction endpoint for Getaround car rental prices, followed by a dashboard that gives insights on implementing a new feature

Domains: data analysis, dashboard, supervised machine learning, FastAPI, Streamlit, app deployment, customer engagement
Dashboard, API/docs and API/predict

Credit Risk Model Development

Building a credit risk model from loan data to provide a scorecard and a pipeline to calculate exposure loss

Domains: data cleaning, data analysis, supervised machine learning, statistical modeling, hypothesis testing, risk, finance

Hotspot Zone Segmentation in Uber Pickup Data

Creation of pipelines that identify the hot zones that guide Uber drivers to optimal pickups

Domains: data analysis, data visualization, unsupervised machine learning, clustering, optimization
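
The description above does not say which clustering method the project uses. As a rough illustration only, hot zones can be sketched as cluster centers over pickup coordinates. The snippet below assumes k-means with a deterministic farthest-first initialization; the function names and the pickup coordinates are hypothetical, not taken from the project.

```python
import math


def farthest_first_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (a simple deterministic choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=20):
    """Plain k-means on (lat, lon) tuples.

    Euclidean distance on raw coordinates is only an approximation,
    acceptable here because the toy points cover a small area.
    """
    centroids = farthest_first_init(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each pickup to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, clusters


# Made-up pickup coordinates forming two obvious groups (illustration only).
pickups = [
    (40.70, -74.01), (40.71, -74.00), (40.72, -74.01),
    (40.80, -73.95), (40.81, -73.96), (40.80, -73.94),
]
hot_zones, groups = kmeans(pickups, k=2)
```

Each entry of `hot_zones` is a candidate hot-zone center, and `groups` holds the pickups assigned to it. A density-based method may suit real pickup data better; k-means is used here only because it is short to write down.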

Disaster Tweet Analysis with Natural Language Processing

Building a deep learning model that predicts which tweets talk about real disasters and which ones do not

Domains: natural language processing, spaCy, tokenizing, deep learning, RNN, GRU, LSTM, word clouds, disasters

YouTube Comment Classifier

Training Naïve Bayes classifiers to detect spam comments on YouTube videos

Domains: natural language processing, vectorizing, machine learning, confusion matrix, social media
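
To give a flavor of the technique named above, here is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing, written with the standard library only. It is not the project's code: the class name, the whitespace tokenizer, and the training comments are all hypothetical and chosen just to show the idea.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple vectorizer."""
    return text.lower().split()


class NaiveBayesSpamClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add log likelihoods per token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up comments used only to illustrate the interface.
train_texts = [
    "check out my channel for free gifts",
    "subscribe to my channel and win money",
    "great song love the guitar solo",
    "this video brings back memories",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayesSpamClassifier().fit(train_texts, train_labels)
```

Calling `model.predict("win free money on my channel")` labels the comment as spam on this toy data. On a real dataset, a library implementation with a proper vectorizer and a confusion matrix on held-out comments would be the more usual route.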

Kayak Trip Planning: Extract, Transform and Load

Development of an app that recommends the best destinations in France using up-to-date weather and hotel information

Domains: ETL, web scraping, data lake, data warehouse, AWS, client engagement

Speed Dating Challenge

Extraction of insights on key factors for a second date

Domains: data cleaning and organizing, exploratory data analysis, insights

Closed-source projects

  • Python tools for spectrum-compatible record selection and modification with a cycle-and-shift algorithm

Domains: algorithm development, automation, optimization, large data sets, data analytics, signal processing, engineering, research
Manuscript on its development

  • ALCAMBER - Software package providing improved estimates of camber in concrete bridge girders

Domains: user interface, debugging, predictive models, Visual Basic, bridge engineering, research
Book chapter with a summary on its development
App on Demand

Micro Projects

A project combining the capabilities of Hadoop and Apache Spark for data analytics on a student score dataset

Learning material for C++
