Assignment 2

The document outlines a Python notebook for a Titanic survival prediction assignment using logistic regression. It includes data preprocessing steps such as handling missing values and encoding categorical variables, followed by model training and evaluation. The final results indicate an accuracy of 81.01% and a ROC AUC score of 0.80.

Uploaded by

vaibhavi.darda

4/14/25, 12:10 PM assignment2.ipynb - Colab

!pip install -q scikit-learn pandas matplotlib seaborn

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score, confusion_matrix

from google.colab import files


uploaded = files.upload()

train.csv(text/csv) - 61194 bytes, last modified: 4/14/2025 - 100% done
Saving train.csv to train.csv

import pandas as pd

df = pd.read_csv('train.csv')
df.head()

   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S


# Select relevant features and copy to avoid chained assignment warnings
titanic_data = df[['Survived', 'Pclass', 'Sex', 'Age']].copy()

# Fill missing Age values with median
titanic_data['Age'] = titanic_data['Age'].fillna(titanic_data['Age'].median())

# Convert 'Sex' to numeric: female = 0, male = 1
titanic_data['Sex'] = titanic_data['Sex'].map({'female': 0, 'male': 1})

# Check cleaned data
titanic_data.head()

   Survived  Pclass  Sex   Age
0         0       3    1  22.0
1         1       1    0  38.0
2         1       3    0  26.0
3         1       1    0  35.0
4         0       3    1  35.0
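The two cleaning steps above (median imputation of Age and mapping Sex to 0/1) can be checked on a small frame before running them on the real data. The sketch below is only an illustration: the `toy` DataFrame and its values are made up and are not rows from train.csv.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the two columns cleaned above (values are invented).
toy = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, np.nan, 26.0, 40.0],
})

# median() skips NaN, so it is taken over the observed ages [22, 26, 40].
age_median = toy["Age"].median()
toy["Age"] = toy["Age"].fillna(age_median)

# map() returns NaN for any label that is missing from the dictionary,
# so confirm that every row was mapped.
toy["Sex"] = toy["Sex"].map({"female": 0, "male": 1})
assert toy["Sex"].notna().all()

print(toy)
```

The `notna()` check is worth keeping when the same mapping is applied to a real column, since a stray label such as a differently capitalised value would otherwise turn into a silent NaN.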


X = titanic_data[['Pclass', 'Sex', 'Age']]
y = titanic_data['Survived']

# Train-test split (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
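The split above is a plain random split. One possible variation, not used in the notebook, is to pass `stratify=y` so that the test set keeps the same class ratio as the full data. The sketch below shows the effect on hypothetical labels (`X_demo` and `y_demo` are invented, not the Titanic data).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 60 zeros and 40 ones.
y_demo = np.array([0] * 60 + [1] * 40)
X_demo = np.arange(100).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# The 20-row test set keeps the 60/40 ratio of the full label vector.
print(len(y_te), int(y_te.sum()))
```

Switching the notebook to a stratified split would change which rows land in the test set, so the reported accuracy and ROC AUC would likely differ.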

model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# ROC AUC is computed from the hard 0/1 predictions here, not from predicted probabilities
roc_auc = roc_auc_score(y_test, y_pred)

# Print results
print(f"Accuracy: {accuracy * 100:.2f}%")
print(f"ROC AUC Score: {roc_auc:.2f}")

Accuracy: 81.01%
ROC AUC Score: 0.80
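The notebook imports `precision_score`, `recall_score` and `confusion_matrix`, but the cells shown here never call them. The sketch below shows one way they could be used, and contrasts ROC AUC computed from hard labels with ROC AUC computed from `predict_proba` scores. It runs on synthetic data from `make_classification` (an assumption made so the example is self-contained), so its numbers are not comparable to the 81.01% and 0.80 reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic three-feature binary problem standing in for Pclass/Sex/Age.
X_syn, y_syn = make_classification(
    n_samples=500, n_features=3, n_informative=3, n_redundant=0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42
)

clf = LogisticRegression().fit(X_tr, y_tr)
pred = clf.predict(X_te)               # hard 0/1 labels
proba = clf.predict_proba(X_te)[:, 1]  # estimated probability of class 1

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_te, pred)
tn, fp, fn, tp = cm.ravel()

print("Accuracy: ", accuracy_score(y_te, pred))
print("Precision:", precision_score(y_te, pred))
print("Recall:   ", recall_score(y_te, pred))

# From hard labels, ROC AUC reduces to the mean of sensitivity and specificity.
auc_labels = roc_auc_score(y_te, pred)
# From probabilities, it measures how well positives are ranked above negatives.
auc_proba = roc_auc_score(y_te, proba)
print("ROC AUC (labels):", auc_labels)
print("ROC AUC (proba): ", auc_proba)
```

The two AUC values usually differ, because the label-based version only sees a single threshold (0.5) while the probability-based version uses the full ranking.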

https://colab.research.google.com/drive/1_2f18ZOIF0czondiv0Npfco5OKfbvbLJ#scrollTo=QJSHF4eaDQnt&printMode=true
