GitHub - marcoshsq/IBMDataScience: Repository containing the projects developed during the IBM Data Science Professional Certificate Specialization. Six in total.

IBM: Data Science Certificate Projects

Instructors: Rav Ahuja, Alex Aklson, Aije Egwaikhide, Svetlana Levitan, Romeo Kienzler, Polong Lin, Joseph Santarcangelo, Azim Hirjani, Hima Vasudevan, Saishruthi Swaminathan, Saeed Aghabozorgi, Yan Luo

This repository contains the projects developed during the IBM Data Science Professional Certificate.

About the Specialization:

There are 10 Courses in this Professional Certificate:

Projects:

Stock Market Data Analysis.

Project developed during module 05/10 of the IBM Data Science Professional Certificate Specialization. The course reviewed subjects such as web scraping and working with libraries through labs and activities, and ended with this project. The objective of the project was to collect data to later develop a dashboard.

For the development of the project the following libraries were used: pandas, requests, bs4 (BeautifulSoup), html5lib, lxml, plotly, and yfinance.
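As a rough illustration of the scraping step, the sketch below pulls rows out of an HTML table and cleans a currency column. It uses Python's built-in `html.parser` as a stand-in for BeautifulSoup so that it runs without extra installs. The HTML snippet, the `TableParser` class, the `parse_revenue_table` helper, and the figures are all made up for the example and are not taken from the project.

```python
from html.parser import HTMLParser

# Hypothetical HTML, standing in for a page fetched with requests.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Revenue</th></tr>
  <tr><td>2021</td><td>$1,000</td></tr>
  <tr><td>2020</td><td>$750</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row being read, or None
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def parse_revenue_table(html):
    """Return (header, records); the revenue text is cleaned to an int."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    records = [
        (date, int(revenue.replace("$", "").replace(",", "")))
        for date, revenue in body
    ]
    return header, records


header, records = parse_revenue_table(SAMPLE_HTML)
print(header)
print(records)
```

With BeautifulSoup the same idea is usually shorter, since the library handles the row and cell bookkeeping that `TableParser` does by hand here.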

Analyzing Data Using SQL and Python.

Project developed during module 06/10 of the IBM Data Science Professional Certificate Specialization. The course covered subjects such as cloud databases, Python programming, IPython, relational database management systems, and SQL statements through labs and activities, and ended with this project. The objective of the project was to create a table using IBM Db2 SQL and, after filling the table with data from three CSV files about the city of Chicago, perform an analysis using Python in a Jupyter Notebook.
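The create, load, and query flow described above can be sketched in a few lines. This is only an approximation: an in-memory SQLite database stands in for IBM Db2, and the `census` table, its columns, and the CSV rows are invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; the real project loaded three Chicago datasets.
CSV_TEXT = """community_area,hardship_index
Area A,39
Area B,46
Area C,20
"""

conn = sqlite3.connect(":memory:")  # SQLite stands in for IBM Db2 here
conn.execute(
    "CREATE TABLE census (community_area TEXT, hardship_index INTEGER)"
)

# Load the CSV rows into the table with a parameterized INSERT.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
conn.executemany(
    "INSERT INTO census VALUES (:community_area, :hardship_index)",
    reader,
)

# Query the table from Python, as one would from a notebook cell.
rows = conn.execute(
    "SELECT community_area, hardship_index FROM census "
    "WHERE hardship_index > ? ORDER BY hardship_index DESC",
    (30,),
).fetchall()
print(rows)
conn.close()
```

Passing values through `?` and `:name` placeholders instead of formatting them into the SQL string keeps the queries safe from injection and works the same way with most Python database drivers.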

House Sales Analysis with Scikit-Learn.

Project developed during module 07/10 of the course. The project scenario is: "You are a Data Analyst working at a Real Estate Investment Trust. The Trust will like to start investing in Residential real estate. You are tasked with determining the market price of a house given a set of features. You will analyze and predict housing prices using attributes or features such as square footage, number of bedrooms, number of floors, and so on."

For the development of the project the following libraries were used: pandas, matplotlib, numpy, seaborn and scikit-learn.
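To make the idea of predicting price from a feature concrete, here is a minimal one-feature least-squares fit written in plain Python. It is a hand-rolled stand-in for what scikit-learn's `LinearRegression` does, not the project's code, and the square footage and price values are invented.

```python
# Hypothetical data: square footage and sale price for five houses.
sqft = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
price = [200000.0, 290000.0, 410000.0, 500000.0, 600000.0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


slope, intercept = fit_line(sqft, price)
r2 = r_squared(sqft, price, slope, intercept)
predicted = intercept + slope * 1800.0  # estimate for an 1800 sq ft house
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
print(round(r2, 4))
```

With several features (bedrooms, floors, and so on) the same principle applies, but the fit is solved as a matrix problem, which is where a library becomes the practical choice.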

US Domestic Airline Flights Performance Dashboard.

Project developed during module 08/10 of the course. The goal was to build a dashboard using an internal tool provided by IBM. Unfortunately, I couldn't use the tool for technical reasons, so I used a Google Colaboratory notebook instead. This means my code is slightly different from the one expected in the course labs; for example, I used a different library, jupyter_dash, to create the dashboard. But the result was fine, and I pretty much enjoyed it!

Best Classifier Model.

Project developed during module 09/10 of the course, using a dataset about past loans. The Loan_train.csv data set includes details of 346 customers whose loans are already paid off or defaulted. The goal was to practice all the classification algorithms taught in the course.

Capstone Project.

This is it, boys: the final and big one, the Capstone Project, the last course in the specialization. The scenario is: a company called SpaceY wants to compete with SpaceX, because of yes!

Now we (the Data Scientists) need to develop an analysis to predict the success of the operation. For this project we needed to:

Collect data from the public SpaceX API and the SpaceX Wikipedia page. Explore data using SQL, visualization, Folium maps, and dashboards. Gather relevant columns to be used as features. Change all categorical variables to binary using one-hot encoding. Standardize data and use Grid Search CV to find the best parameters for machine learning models. Visualize the accuracy score of all models.
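Two of the preprocessing steps above, one-hot encoding and standardization, can be sketched in plain Python as follows. This is an illustration of the technique only; the `launches` records, the column names, and the `one_hot` and `standardize` helpers are assumptions made for the example. The grid search step is not covered here.

```python
from statistics import mean, pstdev

# Hypothetical launch records; column names and values are illustrative.
launches = [
    {"orbit": "LEO", "payload_kg": 2000.0},
    {"orbit": "GTO", "payload_kg": 4000.0},
    {"orbit": "LEO", "payload_kg": 6000.0},
]


def one_hot(records, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(row)
    return encoded


def standardize(records, column):
    """Rescale a numeric column to zero mean and unit population variance."""
    values = [r[column] for r in records]
    mu, sigma = mean(values), pstdev(values)
    return [{**r, column: (r[column] - mu) / sigma} for r in records]


features = standardize(one_hot(launches, "orbit"), "payload_kg")
for row in features:
    print(row)
```

Library helpers such as `pandas.get_dummies` and scikit-learn's `StandardScaler` do the same job on whole tables. Standardizing matters most for distance-based models like Support Vector Machines and K Nearest Neighbors, where a column with large raw values would otherwise dominate.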

Four machine learning models were produced during the project: Logistic Regression, Support Vector Machine, Decision Tree Classifier, and K Nearest Neighbors. All produced similar results, with an accuracy rate of about 83.33%. All models overpredicted successful landings. Anyway, it was a fun project to do, but most importantly, it was my first step in this data journey, and as the saying goes: Greatness in small beginnings!
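To show what "overpredicted successful landings" means in terms of a confusion matrix, here is a small sketch. The labels are made up for illustration and are not the project's test set; they were chosen so that every error is a false positive, that is, a failed landing predicted as a success.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = landed, 0 = did not land)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
    }


# Hypothetical labels, not results from the project.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
print(counts)
print(round(accuracy, 4))
```

Accuracy alone hides this pattern, which is why looking at false positives and false negatives separately is useful when comparing classifiers with similar scores.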

Thanks to the instructors, and a huge shout-out to the people on Coursera.

