# Midterm Exam Template
## Instructions:
- You have 180 minutes to complete the exam.
- Answer each question in the provided cells.
## Section 1: Return Analysis (35 points)
### Question 1 (5 points)
Calculate the following annualized excess return statistics for each of the 14
assets:
- Mean
- Volatility
- Sharpe Ratio
```python
import numpy as np
import pandas as pd

# Load the excess-return data (one column per asset)
data = pd.read_excel('path_to_your_data.xlsx', sheet_name='commodities')

# Annualized statistics; the factor 252 assumes daily observations
stats = pd.DataFrame({
    'mean': data.mean() * 252,
    'volatility': data.std() * np.sqrt(252),
})
stats['sharpe_ratio'] = stats['mean'] / stats['volatility']
print(stats[['mean', 'volatility', 'sharpe_ratio']])
```
### Question 2 (5 points)
Calculate the following statistics (no annualization needed):
- VaR (0.01)
- CVaR (0.01)
- Maximum Drawdown
```python
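# VaR: the 1% quantile of each asset's returns
VaR_01 = data.quantile(0.01)
# CVaR: the mean of the returns at or below the VaR
CVaR_01 = data[data <= VaR_01].mean()
# Maximum drawdown: largest peak-to-trough decline in compounded wealth
wealth = (1 + data).cumprod()
max_drawdown = (wealth / wealth.cummax() - 1).min()
print(f"VaR (0.01): {VaR_01}")
print(f"CVaR (0.01): {CVaR_01}")
print(f"Maximum Drawdown: {max_drawdown}")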
```
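As a sanity check on the drawdown logic, the sketch below computes the three statistics for a single toy return series using NumPy only. The function name `tail_risk_stats` and the sample returns are made up for illustration, and the sketch assumes simple (not log) returns, so wealth compounds as a cumulative product. A 20% quantile is used only because the toy series has five observations.

```python
import numpy as np

def tail_risk_stats(returns, q=0.01):
    """Return (VaR, CVaR, max drawdown) for a 1-D array of simple returns."""
    r = np.asarray(returns, dtype=float)
    var = np.quantile(r, q)                  # q-th quantile of the returns
    cvar = r[r <= var].mean()                # mean return at or below the VaR
    wealth = np.cumprod(1.0 + r)             # compounded growth of one unit
    running_peak = np.maximum.accumulate(wealth)
    drawdown = wealth / running_peak - 1.0   # zero at a peak, negative below it
    return var, cvar, drawdown.min()

# Hypothetical returns, chosen so the numbers are easy to verify by hand
toy_returns = [0.10, -0.20, 0.05, -0.10, 0.30]
var, cvar, mdd = tail_risk_stats(toy_returns, q=0.2)
print(var, cvar, mdd)
```

Here the wealth path peaks at 1.10 after the first period and bottoms at 0.8316 in the fourth, so the maximum drawdown is 0.8316 / 1.10 - 1 = -0.244.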
### Question 3 (5 points)
Suppose you could only invest in one of these assets, not an entire portfolio.
Which statistics would you recommend that an investor consider in choosing the
single asset?
*Answer the question here.*
### Question 4 (5 points)
Suppose you already have a large portfolio and are considering which of these
assets to hold as an additional investment, along with your other assets. Would
your answer be the same? Conceptually, which other statistics might interest you?
*Answer the question here.*
### Question 5 (10 points)
Run a Linear Factor Decomposition of sugar (SB1) on coffee (C1). Report the
following (annualized) statistics:
- Market Alpha
- Market Beta
- Market Information Ratio
```python
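import numpy as np
import statsmodels.api as sm

# Regress sugar (SB1) on a constant and coffee (C1)
X = sm.add_constant(data['C1'])
Y = data['SB1']
model = sm.OLS(Y, X).fit()

# Annualize alpha and the information ratio; 252 assumes daily observations.
# Beta is a ratio of returns and is not annualized.
alpha = model.params['const'] * 252
beta = model.params['C1']
information_ratio = model.params['const'] / model.resid.std() * np.sqrt(252)
print(f"Alpha: {alpha}")
print(f"Beta: {beta}")
print(f"Information Ratio: {information_ratio}")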
```
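The same decomposition can be sketched without `statsmodels`, which may help when checking how each statistic is scaled. The example below runs on synthetic data: the series `x` and `y`, the true coefficients, and the 252-period annualization factor are all assumptions for illustration, not values from the exam data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 1000
x = rng.normal(0.0, 0.02, n_obs)                       # synthetic factor returns
y = 0.0002 + 0.5 * x + rng.normal(0.0, 0.01, n_obs)    # synthetic asset returns

# OLS of y on a constant and x
design = np.column_stack([np.ones(n_obs), x])
(alpha, beta), *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ np.array([alpha, beta])

periods = 252                                          # assumed daily data
alpha_annual = alpha * periods                         # alpha scales with time
resid_vol_annual = resid.std(ddof=2) * np.sqrt(periods)
info_ratio_annual = alpha_annual / resid_vol_annual    # scales with sqrt(time)
print(alpha_annual, beta, info_ratio_annual)
```

Alpha is scaled by the number of periods, residual volatility by its square root, and beta is left unscaled.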
### Question 6 (5 points)
Based on the statistics above, what is the hedge ratio between sugar (SB1) and
coffee (C1)?
*Answer the question here.*
## Section 2: Mean-Variance Optimization (35 points)
### Question 1 (10 points)
Calculate the weights of the tangency portfolio formed from the 14 assets.
```python
import numpy as np
import cvxopt as opt
from cvxopt import solvers

# Annualized inputs; the factor 252 assumes daily observations
n = len(data.columns)
returns = np.array(data.mean()) * 252
cov_matrix = np.array(data.cov()) * 252

# Minimize w' Sigma w subject to mu' w = 1 and w >= 0 (long-only),
# then rescale the solution so that the weights sum to one.
P = opt.matrix(cov_matrix)
q = opt.matrix(np.zeros(n))
G = opt.matrix(np.diag(-np.ones(n)))
h = opt.matrix(np.zeros(n))
A = opt.matrix(returns).T
b = opt.matrix(1.0)
sol = solvers.qp(P, q, G, h, A, b)
weights = np.array(sol['x']).flatten()
weights = weights / weights.sum()
print(f"Tangency Portfolio Weights: {weights}")
```
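When short positions are allowed, the tangency weights also have a closed form, proportional to $\Sigma^{-1}\mu$, which can serve as a cross-check on a numerical solver. The sketch below is a minimal illustration under that assumption; the helper name `tangency_weights` and the two-asset inputs are hypothetical.

```python
import numpy as np

def tangency_weights(mu, sigma):
    """Unconstrained tangency weights: Sigma^{-1} mu, rescaled to sum to one."""
    raw = np.linalg.solve(sigma, mu)   # solves Sigma @ raw = mu
    return raw / raw.sum()

# Hypothetical annualized mean excess returns and covariance matrix
mu = np.array([0.05, 0.10])
sigma = np.array([[0.04, 0.00],
                  [0.00, 0.09]])
w = tangency_weights(mu, sigma)
print(w)
```

The rescaling is unstable when the entries of $\Sigma^{-1}\mu$ sum to nearly zero, so the result is worth inspecting before use.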
### Question 2 (5 points)
What are the weights of the optimal portfolio, w*, with a targeted mean excess
return of 0.0075 per month?
```python
# Calculate optimal portfolio weights for a targeted mean excess return.
# `returns` is annualized, so the monthly target is annualized as well.
target_return = 0.0075 * 12
# Constraints: mu' w = target_return and the weights sum to one
A = opt.matrix(np.vstack((returns, np.ones(n))))
b = opt.matrix([target_return, 1.0])
sol = solvers.qp(P, q, G, h, A, b)
optimal_weights = np.array(sol['x']).flatten()
print(f"Optimal Portfolio Weights: {optimal_weights}")
```
### Question 3 (5 points)
Does the tangency portfolio weight securities in order of their individual Sharpe
ratios? Why or why not?
*Answer the question here.*
### Question 4 (5 points)
Report the mean, volatility, and Sharpe ratio for the optimized portfolio, w*.
Annualize the statistics.
```python
# `returns` and `cov_matrix` are already annualized, so no further scaling is needed
w = np.asarray(optimal_weights).flatten()
optimized_mean = w @ returns
optimized_volatility = np.sqrt(w @ cov_matrix @ w)
optimized_sharpe_ratio = optimized_mean / optimized_volatility
print(f"Mean: {optimized_mean}")
print(f"Volatility: {optimized_volatility}")
print(f"Sharpe Ratio: {optimized_sharpe_ratio}")
```
### Question 5 (5 points)
How does Harvard make their portfolio allocation more realistic than a basic
mean-variance optimization would imply?
*Answer the question here.*
### Question 6 (5 points)
State one reason that Mean-Variance optimization is not robust (i.e., the solution
is fragile with respect to the inputs).
*Answer the question here.*
## Section 3: Pricing (30 points)
### Question 1 (10 points)
Estimate the time-series test of the pricing model for each asset. Report the
following (annualized) statistics:
- Alpha
- Beta
- R-squared
```python
# Pricing model: time-series regression of each asset on the factors
factors = pd.read_excel('path_to_your_data.xlsx', sheet_name='factors')
X = sm.add_constant(factors)
for asset in data.columns:
    Y = data[asset]
    model = sm.OLS(Y, X).fit()
    # Only alpha is annualized (252 assumes daily observations);
    # betas and R-squared do not scale with the horizon.
    alpha = model.params['const'] * 252
    beta = model.params.drop('const')
    r_squared = model.rsquared
    print(f"Asset: {asset}")
    print(f"Alpha: {alpha}")
    print(f"Beta: {beta}")
    print(f"R-squared: {r_squared}")
```
### Question 2 (5 points)
If the pricing model worked perfectly, what would these statistics be?
*Answer the question here.*
### Question 3 (5 points)
Which asset does the pricing model fit best?
*Answer the question here.*
### Question 4 (5 points)
Which factor has a higher risk premium in the estimated model above?
*Answer the question here.*
### Question 5 (5 points)
Instead of the 2-factor model above, suppose the CAPM is true and fits perfectly in
our sample. For n assets, what do we know about their:
- Time-series r-squared metrics?
- Treynor Ratios?
- Information Ratios?
*Answer the question here.*
## Section 4: Forecasting (20 points)
### Question 1 (7 points)
Consider the lagged regression, where the regressor ($X$) is one period behind the
target ($r^{GLD}$). Estimate and report the $R^2$, as well as the OLS estimates for
$\alpha$ and $\beta$.
```python
import pandas as pd
from sklearn.linear_model import LinearRegression

forecast_data = pd.read_excel('path_to_your_data.xlsx', sheet_name='forecasting')
# Lag the regressors by one period; the first row is lost to the shift
X = forecast_data[['Tbill_rate', 'Tbill_change']].shift(1).dropna()
Y = forecast_data['GLD'].loc[X.index]
model = LinearRegression().fit(X, Y)
R2 = model.score(X, Y)
alpha = model.intercept_
beta = model.coef_
print(f"R2: {R2}")
print(f"Alpha: {alpha}")
print(f"Beta: {beta}")
```
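The alignment in a lagged regression is easy to get wrong, so the sketch below shows it on a toy series where the answer is known in advance. The arrays are invented for illustration: `y` is built so that each value equals `1 + 2 * x` from the previous period, and the regression should therefore recover an intercept of 1, a slope of 2, and an $R^2$ of 1.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.empty_like(x)
y[0] = 0.0                       # no lagged regressor exists for the first period
y[1:] = 1.0 + 2.0 * x[:-1]       # y_t depends on x_{t-1}

# Pair each target y_t with the regressor from one period earlier
x_lagged = x[:-1]
y_target = y[1:]

beta, alpha = np.polyfit(x_lagged, y_target, deg=1)   # slope first, then intercept
fitted = alpha + beta * x_lagged
ss_res = ((y_target - fitted) ** 2).sum()
ss_tot = ((y_target - y_target.mean()) ** 2).sum()
r2 = 1.0 - ss_res / ss_tot
print(alpha, beta, r2)
```

Dropping the first target and the last regressor plays the same role as `shift(1).dropna()` in the pandas version.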
### Question 2 (5 points)
Use the forecasted GLD returns ($\hat{r}^{GLD}$) to build trading weights. Calculate
the return on this strategy.
```python
forecasted_returns = model.predict(X)
weights = 0.2 + 80 * forecasted_returns
strategy_returns = weights * forecast_data['GLD'].loc[X.index]
print(strategy_returns.head())
print(strategy_returns.tail())
```
### Question 3 (3 points)
For both $\hat{r}^{x}$ and $r^{GLD}$, report the following statistics:
- Mean
- Volatility
```python
mean_rx = strategy_returns.mean()
volatility_rx = strategy_returns.std()
# Use the same sample as the strategy (the first observation is lost to the lag)
gld = forecast_data['GLD'].loc[X.index]
mean_GLD = gld.mean()
volatility_GLD = gld.std()
print(f"Strategy Mean: {mean_rx}")
print(f"Strategy Volatility: {volatility_rx}")
print(f"GLD Mean: {mean_GLD}")
print(f"GLD Volatility: {volatility_GLD}")
```
### Question 4 (5 points)
Which signal would likely lead to a result where the long-term forecast compounds
the effect over long horizons?
*Answer the question here.*