SpaceX Falcon-9 First Stage Landing Prediction
IBM Data Science Capstone Project
Fatih İlhan
Outline
01 Executive Summary
02 Introduction
03 Methodology
04 Results
05 Conclusion
06 Appendix
01
Executive Summary
Stages
Data Collection
Data Wrangling
Exploratory Data Analysis
EDA by Visualization
Predictive Analysis by Machine Learning
02
Introduction
SpaceX's Goal
Sending spacecraft to the International Space Station
Providing satellite internet to the whole world with
Starlink technology
Taking people and cargo into space and
contributing to space exploration.
02
FALCON 9 ROCKETS
Falcon 9 is a two-stage, partially reusable rocket developed and manufactured by
SpaceX, a private aerospace company founded by Elon Musk. Here are
some key features and missions of Falcon 9 rockets:
Reusability: One of the notable features of Falcon 9 is its reusability. The
first stage of the rocket is designed to return to Earth after launch,
landing vertically either on land (at SpaceX's landing zones) or on an
autonomous drone ship in the ocean. This reusability significantly
reduces the cost of space launches.
Payload Capacity: Falcon 9 is capable of delivering a variety of payloads
to orbit, including satellites, cargo resupply missions to the International
Space Station (ISS), and even crewed missions. It has a payload
capacity of up to 22,800 kilograms (50,300 pounds) to low Earth orbit
(LEO) and up to 8,300 kilograms (18,300 pounds) to geostationary transfer
orbit (GTO).
Starlink: Falcon 9 plays a crucial role in SpaceX's ambitious Starlink
project, which aims to provide global broadband internet coverage from
a network of thousands of small satellites in low Earth orbit. Falcon 9
launches numerous batches of Starlink satellites to gradually build up the
constellation.
NASA Missions: Falcon 9 has been selected by NASA for various missions,
including resupplying the ISS as part of the Commercial Resupply
Services (CRS) contract. Falcon 9 was also used to launch the Crew
Dragon spacecraft, enabling NASA to resume crewed launches from U.S.
soil.
Satellite Deployment: Falcon 9 is frequently used to deploy satellites for
commercial customers, government agencies, and scientific research. It
offers the flexibility to deliver satellites into different orbits, such as LEO,
GTO, and sun-synchronous orbit (SSO).
03
Methodology
03
Data Collection methodology
Request the data from the SpaceX API
Collect data from a Wikipedia page
Data Wrangling methodology
Perform EDA to find problems in the data
Determine what the label for training supervised models should be
Perform exploratory data analysis (EDA) using visualization and SQL
Perform interactive visual analytics using Folium and Plotly Dash
Perform predictive analysis using classification models
03
Data Collection
API: spacex_url="https://api.spacexdata.com/v4/launches/past"
Web page: "https://en.wikipedia.org/w/index.php?title=List_of_Falcon_9_and_Falcon_Heavy_launches&oldid=1027686922"
03
Data Collection - SpaceX API
Request and parse the SpaceX launch data using the GET request
Filter the dataframe to only include Falcon 9 launches
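The two steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's notebook code. It assumes each parsed record already carries a hypothetical BoosterVersion field (the raw API response identifies rockets differently), and the helper names fetch_launches and keep_falcon9 are made up for this example.

```python
import json
import urllib.request

SPACEX_URL = "https://api.spacexdata.com/v4/launches/past"


def fetch_launches(url=SPACEX_URL, timeout=30):
    """Send a GET request and parse the JSON body into a list of dicts."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def keep_falcon9(records, version_key="BoosterVersion"):
    """Keep only the records flown by a Falcon 9 booster.

    Assumption: every record has a version_key entry such as 'Falcon 9'.
    """
    return [r for r in records if r.get(version_key) == "Falcon 9"]


# Hypothetical records standing in for the parsed API response.
sample = [
    {"FlightNumber": 1, "BoosterVersion": "Falcon 1"},
    {"FlightNumber": 6, "BoosterVersion": "Falcon 9"},
]
falcon9 = keep_falcon9(sample)
print(falcon9)
```

With a pandas dataframe the same filter would be a boolean mask such as df[df["BoosterVersion"] == "Falcon 9"].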
Data Scraping - Wikipedia
Scrape Falcon 9 and Falcon Heavy launch records from Wikipedia
Request the Falcon 9 launch Wikipedia page from its URL
Extract all column/variable names from the HTML table header
Create a data frame by parsing the launch HTML tables
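As a rough sketch of the header-extraction step, the example below pulls column names out of an HTML table. It swaps in the standard library's html.parser so that it stays self-contained (it is not necessarily the parser used in the project), and the sample HTML and the names HeaderParser and extract_column_names are hypothetical.

```python
from html.parser import HTMLParser


class HeaderParser(HTMLParser):
    """Collect the text of every <th> cell in document order."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self._in_th = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "th":
            self._in_th = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_th:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "th" and self._in_th:
            # Collapse runs of whitespace and skip empty header cells.
            name = " ".join("".join(self._buffer).split())
            if name:
                self.headers.append(name)
            self._in_th = False


def extract_column_names(html):
    """Return the header cell texts found in the given HTML string."""
    parser = HeaderParser()
    parser.feed(html)
    parser.close()
    return parser.headers


# Hypothetical table fragment, not the real Wikipedia markup.
SAMPLE_HTML = """
<table>
  <tr><th>Flight No.</th><th>Launch site</th><th>Payload mass</th></tr>
  <tr><td>1</td><td>CCAFS</td><td>0</td></tr>
</table>
"""
columns = extract_column_names(SAMPLE_HTML)
print(columns)
```

The extracted names can then serve as the keys of a dictionary that is filled row by row and converted into a data frame.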
03
Data Wrangling
Calculate the number of launches on each site
Calculate the number and occurrence of each orbit
Calculate the number and occurrence of mission outcomes per orbit type
Create a landing outcome label from the Outcome column
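The wrangling steps above could look roughly like this in plain Python. The rows are hypothetical, and the sketch assumes that Outcome values are formatted as "<True|False|None> <landing type>", so that a leading "True" means the first stage landed.

```python
from collections import Counter

# Hypothetical sample rows; the values are illustrative, not the project data.
launches = [
    {"LaunchSite": "CCAFS LC-40", "Orbit": "LEO", "Outcome": "True ASDS"},
    {"LaunchSite": "CCAFS LC-40", "Orbit": "GTO", "Outcome": "False ASDS"},
    {"LaunchSite": "KSC LC-39A", "Orbit": "ISS", "Outcome": "True RTLS"},
    {"LaunchSite": "VAFB SLC 4E", "Orbit": "PO", "Outcome": "None None"},
]


def landing_class(outcome):
    """Return 1 when the first stage landed successfully, otherwise 0.

    Assumption: the outcome string starts with 'True', 'False' or 'None'.
    """
    return 1 if outcome.split()[0] == "True" else 0


# Number of launches on each site.
launches_per_site = Counter(row["LaunchSite"] for row in launches)
# Number of occurrences of each orbit.
orbit_counts = Counter(row["Orbit"] for row in launches)
# Number of occurrences of each outcome per orbit type.
outcomes_per_orbit = Counter((row["Orbit"], row["Outcome"]) for row in launches)
# Landing outcome label used later as the training target.
labels = [landing_class(row["Outcome"]) for row in launches]

print(launches_per_site, orbit_counts, labels)
```

In pandas, the counting steps correspond to calling value_counts() on the relevant column.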
03
EDA with Data Visualization
Plot FlightNumber vs. PayloadMass and overlay the
outcome of the launch
Visualize the relationship between Flight Number and Launch Site
Visualize the relationship between Payload and Launch Site
Visualize the success rate of each orbit type
Visualize the relationship between FlightNumber and Orbit type
Visualize the relationship between Payload and Orbit type
Visualize the launch success yearly trend
We see that different launch sites have different success rates. CCAFS LC-40
has a success rate of 60%, while KSC LC-39A and VAFB SLC 4E each have a success rate of 77%.
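A per-site success rate like the ones quoted above is the mean of the 0/1 landing label within each group. The sketch below shows that calculation on hypothetical rows. The site names "Site A" and "Site B", the Class key and the helper success_rate_by are assumptions for illustration, and the same helper would work for orbit type or launch year.

```python
from collections import defaultdict


def success_rate_by(rows, key, label="Class"):
    """Return the share of successful landings (label == 1) per group."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        successes[row[key]] += row[label]
    return {group: successes[group] / totals[group] for group in totals}


# Hypothetical rows: five launches from one site, four from another.
rows = (
    [{"LaunchSite": "Site A", "Class": c} for c in (1, 0, 1, 1, 0)]
    + [{"LaunchSite": "Site B", "Class": c} for c in (1, 1, 1, 0)]
)
rates = success_rate_by(rows, "LaunchSite")
print(rates)
```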
Use the function catplot to plot FlightNumber vs. LaunchSite: set the x parameter to FlightNumber,
the y parameter to LaunchSite, and the hue parameter to 'class'.
We also want to observe if there is any relationship between launch sites and their payload mass.
03
EDA with SQL
Display the names of the unique launch sites in the space mission
Display 5 records where launch sites begin with the string 'CCA'
Display the total payload mass carried by boosters launched by NASA
(CRS)
Display average payload mass carried by booster version F9 v1.1
List the date when the first successful landing outcome on a ground pad was
achieved
List the names of the boosters that had a successful drone ship landing and
a payload mass greater than 4000 but less than 6000
List the total number of successful and failed mission outcomes
List the names of the booster_versions that have carried the
maximum payload mass, using a subquery
List the records showing the month names, failed
landing_outcomes on a drone ship, booster versions, and launch_site for the
months of 2015
Rank the count of successful landing_outcomes between the dates
04-06-2010 and 20-03-2017 in descending order
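Several of the queries listed above can be sketched with Python's built-in sqlite3 module. Everything about the table here is an assumption made for illustration: the table name SPACEXTBL, the column names, and the four sample rows are hypothetical and do not reproduce the project's dataset or results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE SPACEXTBL (
           Date TEXT, Booster_Version TEXT, Launch_Site TEXT,
           PAYLOAD_MASS__KG_ INTEGER, Customer TEXT, Landing_Outcome TEXT)"""
)
# Hypothetical rows with ISO-formatted dates so MIN(Date) sorts correctly.
conn.executemany(
    "INSERT INTO SPACEXTBL VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2015-01-01", "F9 v1.1", "CCAFS LC-40", 2000, "NASA (CRS)", "Failure (drone ship)"),
        ("2015-06-01", "F9 FT", "CCAFS LC-40", 3000, "Example Corp", "Success (ground pad)"),
        ("2016-01-01", "F9 FT", "KSC LC-39A", 4000, "NASA (CRS)", "Success (drone ship)"),
        ("2017-01-01", "F9 v1.1", "VAFB SLC 4E", 1000, "Example Corp", "Success (ground pad)"),
    ],
)

# Names of the unique launch sites.
sites = [r[0] for r in conn.execute(
    "SELECT DISTINCT Launch_Site FROM SPACEXTBL ORDER BY Launch_Site")]

# Up to 5 records where the launch site begins with 'CCA'.
cca_rows = conn.execute(
    "SELECT * FROM SPACEXTBL WHERE Launch_Site LIKE 'CCA%' LIMIT 5").fetchall()

# Total payload mass carried for the customer NASA (CRS).
nasa_total = conn.execute(
    "SELECT SUM(PAYLOAD_MASS__KG_) FROM SPACEXTBL WHERE Customer = 'NASA (CRS)'"
).fetchone()[0]

# Average payload mass carried by booster version F9 v1.1.
avg_v11 = conn.execute(
    "SELECT AVG(PAYLOAD_MASS__KG_) FROM SPACEXTBL WHERE Booster_Version = 'F9 v1.1'"
).fetchone()[0]

# Date of the first successful ground pad landing.
first_ground_pad = conn.execute(
    "SELECT MIN(Date) FROM SPACEXTBL WHERE Landing_Outcome = 'Success (ground pad)'"
).fetchone()[0]

# Booster versions that carried the maximum payload mass (subquery).
max_payload_boosters = [r[0] for r in conn.execute(
    """SELECT Booster_Version FROM SPACEXTBL
       WHERE PAYLOAD_MASS__KG_ = (SELECT MAX(PAYLOAD_MASS__KG_) FROM SPACEXTBL)""")]

print(sites, nasa_total, avg_v11, first_ground_pad, max_payload_boosters)
conn.close()
```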
03
Interactive Visual Analytics with Folium
Mark all launch sites on a map
Mark the success/failed launches for each site on the map
Calculate the distances between a launch site and its proximities
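One common way to compute such distances from latitude and longitude pairs is the haversine great-circle formula, sketched below. The Earth radius value and the function name haversine_km are assumptions of this example, and the coordinates in the usage line are arbitrary points, not real launch site locations.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Arbitrary check: a quarter of the equator.
quarter_equator = haversine_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter_equator, 1))
```

The resulting distance can be shown on the map next to a line drawn between the launch site and the nearby point of interest.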
03
Machine Learning Prediction
Create a NumPy array from the column Class in data by applying the
method to_numpy(), then assign it to the variable Y. Make sure the
column is selected as a Pandas Series (single brackets: df['name of column']).
Standardize the data in X then reassign it to the variable X using the
transform provided below.
Use the function train_test_split to split the data X and Y into training
and test data. Set the parameter test_size to 0.2 and random_state to
2. The training data and test data should be assigned to the following
labels.
Create a logistic regression object then create a GridSearchCV
object logreg_cv with cv = 10. Fit the object to find the best parameters
from the dictionary parameters.
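The steps above can be sketched end to end with scikit-learn. This is an illustration under stated assumptions rather than the project's notebook: the data is synthetic (generated with make_classification at an arbitrary size), StandardScaler is assumed as the standardizing transform, and the parameter grid is a hypothetical example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature table and the Class column.
X_raw, y_raw = make_classification(n_samples=90, n_features=8, random_state=0)
Y = np.asarray(y_raw)

# Standardize X and reassign it, as described in the slide.
X = StandardScaler().fit_transform(X_raw)

# 80/20 split with the random_state given in the slide.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=2
)

# Hypothetical parameter grid; the project's dictionary is not shown here.
parameters = {"C": [0.01, 0.1, 1], "solver": ["lbfgs"]}

# 10-fold cross-validated grid search over the logistic regression settings.
logreg_cv = GridSearchCV(LogisticRegression(), parameters, cv=10)
logreg_cv.fit(X_train, Y_train)

best_params = logreg_cv.best_params_          # best parameter combination
validation_accuracy = logreg_cv.best_score_   # mean cross-validated accuracy
test_accuracy = logreg_cv.score(X_test, Y_test)

print(best_params, validation_accuracy, test_accuracy)
```

The same pattern applies to the other classifiers by swapping the estimator and its parameter grid.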
We output the GridSearchCV object for logistic regression. We display the best
parameters using the data attribute best_params_ and the accuracy on the
validation data using the data attribute best_score_.
Calculate the accuracy on the test data using the method score:
0.8333333333333334
Create a support vector machine object then create a GridSearchCV object svm_cv
with cv = 10. Fit the object to find the best parameters from the dictionary parameters.
accuracy : 0.8482142857142856
Calculate the accuracy on the test data using the method score:
0.8333333333333334
Create a decision tree classifier object then create a GridSearchCV object tree_cv
with cv = 10. Fit the object to find the best parameters from the dictionary parameters
accuracy : 0.8714285714285713
Calculate the accuracy of tree_cv on the test data using the method score:
0.8333333333333334
Create a k nearest neighbors object then create a GridSearchCV object knn_cv with
cv = 10. Fit the object to find the best parameters from the dictionary parameters
accuracy : 0.8482142857142858
Calculate the accuracy of knn_cv on the test data using the method score:
0.8333333333333334
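To compare the models, the scores reported above can be collected in one place. The short sketch below only tabulates those numbers. The validation score for logistic regression is not stated in the text, so it is left as None, and the dictionary layout is just one possible choice.

```python
# Scores copied from the results above; None marks a value not stated there.
results = {
    "logistic regression": {"validation": None, "test": 0.8333333333333334},
    "support vector machine": {"validation": 0.8482142857142856, "test": 0.8333333333333334},
    "decision tree": {"validation": 0.8714285714285713, "test": 0.8333333333333334},
    "k nearest neighbors": {"validation": 0.8482142857142858, "test": 0.8333333333333334},
}

# Rank only the models whose validation accuracy is known.
known = {
    name: scores["validation"]
    for name, scores in results.items()
    if scores["validation"] is not None
}
best_validation_model = max(known, key=known.get)

# Every model reaches the same accuracy on the test data.
distinct_test_scores = {scores["test"] for scores in results.values()}

print(best_validation_model, distinct_test_scores)
```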
"Practically all these algorithms give the same result"
05
Conclusion
There is a correlation between launch site and success rate.
Payload mass is also associated with the success rate: the more
massive the payload, the less likely the first stage will return.
For orbit type, SO has the lowest success rate, while ES-L1, GEO, HEO and
SSO have the highest success rate.
According to the yearly trend, the success rate has increased since 2013 and
kept increasing until 2020.
With the best parameters provided, the decision tree classifier yielded the
highest validation accuracy (about 87%), while all four models scored about 83%
on the test data.
Thank you!
Fatih İlhan
Appendix
Resources
https://github.com/fatihilhan42/SpaceX_Falcon-9_First_stage_landing_prediction/tree/main