Portfolio Optimization Problem

Markowitz Mean-Variance Method

Shizhu Kathy Liu
New York University

Modern Portfolio Theory

Which portfolio is the best?

This question is probably as old as the stock-market itself.

People spend a lot of time developing methods and strategies that come close to the "perfect investment", that brings high returns coupled with low risk.

As one of the most important and influential theories dealing this problem, Modern Portfolio Theory was developed by Harry Markowitz and published under the title "Portfolio Selection" in the 1952 Journal of Finance.

Let's review some math!

Basic Notions

\(m\) risky assets: \(i= 1, 2, \dots, m\)
Single - Period Returns: \(\mathbf{R} = [R_{1,t}, R_{2,t}, \dots, R_{m,t}]^T\)
Return calcualated from price: \(R_{1,t} = \frac{R_{1,t}-R_{1,t-1}}{R_{1,t-1}}\)
Mean (Expected Returns) and Variance/Covariance of Returns: \[ E[\mathbf{R}] = \mathbf{\alpha} = \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_m \end{bmatrix} , \quad \text{Cov}[\mathbf{R}] = \mathbf{\Sigma} = \begin{bmatrix} \Sigma_{1,1} & \cdots & \Sigma_{1,m} \\ \vdots & \ddots & \vdots \\ \Sigma_{m,1} & \cdots & \Sigma_{m,m} \end{bmatrix} \]
Portfolio: \(m\) weights indicating the fraction of portfolio wealth held in each asset, \(\mathbf{w} = [w_1, \dots, w_m] : \, \sum^m_{i=1} = 1\) (fully invested)

Basic Notions

Portfolio Return: \[R_\mathbf{w} = \mathbf{w}^T \mathbf{R} = \sum^m_{i=1} w_i R_i\] \[\alpha_{\mathbf{w}} = E[R_{\mathbf{w}}] = \mathbf{w}^T \mathbf{\alpha}\] \[ \sigma^2_{\mathbf{w}} = \text{var} [R_{\mathbf{w}}] = \mathbf{w}^T \Sigma \mathbf{w}\]
People are risk averse, we evaluate different portfolios using the mean-variance pair of the portfolio: \((\alpha_{\mathbf{w}},\sigma^2_{\mathbf{w}} )\) with preference for:
- Higher expected return \(\alpha_{\mathbf{w}}\)
- Lower variance \(\text{Var}_{\mathbf{w}}\), or its square root \(\sigma_{\mathbf{w}}\), standard deviation

Problem 1 - My Project

Risk Minimization:

For a given choice of target mean return \(\alpha_0\), choose the portfolio \(\mathbf{w}\) to

Minimize: \(\frac{1}{2} \mathbf{w}^T \Sigma \; \mathbf{w}\)

Subject to: \(\mathbf{w}^T \mathbf{\alpha} = \alpha_0\)

\(\qquad \qquad \sum^m_{i=1} = 1\)

Solution:

Apply the method of Lagrange multipliers to the convex minimization problem subject to linear constrains, we can get optimal portfolio weights and variance.

Matrix operation steps: Check reference [3]

Based on this problem, my project is to build and backtest strategies for 8 selected stocks since 2006.

Problem 1 - My Project

There are many other popular problems involved with more complicated constrains. My project is an application of Problem 1 - Risk Minimization.

Investment Universe - Data

Import closing stock price data from Yahoo Finance from 01/01/2006 to 03/31/2016.

Select following 8 stocks
Code is reproducible for different total number, different date range and different companies of stocks chosen

Company	Ticker	Industry
JPMorgan Chase & Co.	JPM	Banks
Alphabet Inc	GOOGL	Internet
Walt Disney Co	DIS	Entertainment
JC Penney Company Inc	JCP	Department Stores
Johnson & Johnson	JNJ	Drug Manufacturers
Starbucks Corporation	SBUX	Eateries
Exxon Mobil Corporation	XOM	Oil & Gas
Starwood Hotels	HOT	Lodging

Investment Strategies -Global Setting

Objective: Minimize Portfolio volatility given a 15% annual return target
Porfolio will be fuly invested (sum of weights = 1)
Long Short is allowed ( weights between 200% and -200% )
Portfolio is weekly rebalanced

Note: all parameters are replaceable in the code

Estimation of Expected Returns and Covariance

Expected Returns

Historic returns are often used to estimate expecetd returns

Sample Estimator for daily returns with 60 days of observation for expected returns.

Covariance Candidates are:

Sample Covariance with 200 days of observation for returns, configuration referred as "long term", "long" or "l"
Sample Covariance with 120 days of observation for returns, configuration referred as "medium term","medium" or "m"
Sample Covariance with 60 days of observation for returns, configuration referred as "short term", "short" or "s"

Note: all parameters are replaceable in the code

Libraries and Datasets

library(rJava); library(tseries); library(quadprog) 
library(PerformanceAnalytics); library(ggplot2)
library(xlsx); library(data.table);library(xtable)

JPM <- get.hist.quote(instrument="JPM", start="2006-01-01", end="2016-3-31", quote="Close")
GOOGL <- get.hist.quote(instrument="GOOGL", start="2006-01-01", end="2016-3-31", quote="Close")
DIS <- get.hist.quote(instrument="DIS", start="2006-01-01", end="2016-3-31", quote="Close")
JCP <- get.hist.quote(instrument="JCP", start="2006-01-01", end="2016-3-31", quote="Close")
JNJ <- get.hist.quote(instrument="JNJ", start="2006-01-01", end="2016-3-31", quote="Close")
SBUX <- get.hist.quote(instrument="SBUX", start="2006-01-01", end="2016-3-31", quote="Close")
XOM <- get.hist.quote(instrument="XOM", start="2006-01-01", end="2016-3-31", quote="Close")
HOT <- get.hist.quote(instrument="HOT", start="2006-01-01", end="2016-3-31", quote="Close")


price_raw <- cbind(JPM, GOOGL, DIS, JCP, JNJ, SBUX, XOM, HOT)

Daily Returns

return_daily <- diff(price_raw)/lag(price_raw, k=-1)
colnames(return_daily) <- c("JPM", "GOOGL", "DIS", "JCP", "JNJ", "SBUX", "XOM", "HOT")

# head(return_daily, 4)

	date	JPM	GOOGL	DIS	JCP	JNJ	SBUX	XOM	HOT
1	2006-01-04	-0.01418	0.02300	-0.01680	0.01382	0.01541	0.02592	0.00171	-0.00906
2	2006-01-05	0.00303	0.01348	0.01751	-0.03180	-0.00415	-0.00189	-0.00495	0.00142
3	2006-01-06	0.00705	0.03196	0.01352	0.01047	0.00449	0.00032	0.01973	0.00850
4	2006-01-09	0.01624	0.00266	0.01051	0.00196	0.00623	-0.01012	-0.00050	0.00703

Solve the optimization problem

n1 <- nrow(return_daily)
n2 <- ncol(return_daily)
return_daily <- cbind(return_daily, portfolio=rep(NA,n1))
weight <- matrix(NA, nrow = n1, ncol = n2)
i = 1 + 60  # short: i = 1 + 60 
             # medium: i = 1 + 120
             # long:i = 1 + 200
             # If you don't want to observe from 2006-01-04
             # change 1 to b <- which(format(index(return_daily), "%Y-%m-%d") == '2008-01-05'

Solve the optimization problem

while (i <= n1){
  return_expect <- colMeans(return_daily[(i-60):(i-1), 1:n2], na.rm=TRUE) 
          # 60 days of observation for expected returns, fixed in my project, can change if you desire different estimator
  Dmat <- cov(return_daily[(i-60):(i-1), 1:n2], use = "complete.obs") 
          # Sample Covariance with 60 days of observation for returns, change to m: 120; l: 200
  Amat <- cbind(1, return_expect, diag(1,n2), diag(-1,n2))
  target <- 0.15 / 250
         # target yearly return 15%, fixed in my project, can change if you desire different constrain
  bvec <- c(1, target, -2+rep(0,n2), -2+rep(0,n2))
  dvec <- rep(0,n2)
  opt_sol <- solve.QP(Dmat,dvec,Amat,bvec, meq=2, factorized=FALSE)$solution
         # solve the optimization problem
  weight[i+0:min(4,n1-i), ] <- matrix(rep(opt_sol, min(5,n1-i+1)), ncol=n2, byrow=T)
         # record weight for each asset
  return_daily[i+0:min(4,n1-i), n2+1] <- return_daily[i+0:min(4,n1-i), 1:n2] %*% opt_sol
         # record new daily return for each asset and portfolio overall
  i = i + 5 
         # weekly rebalance portfolio, adjust weight per 5 trading day
         # fixed in my project, can change if you desire different condition
}

Record Solution: short-term estimator

Record new daily return for each asset and portfolio overall

daily_s <- return_daily

# tail(daily_s,2)

	date	JPM	GOOGL	DIS	JCP	JNJ	SBUX	XOM	HOT	portfolio
1	2012-01-03	0.05203	0.03021	0.02160	-0.00370	0.00457	-0.01565	0.01463	0.03044	0.00041
2	2012-01-04	-0.00086	0.00431	0.01410	-0.00314	-0.00607	0.01943	0.00023	0.00020	-0.01102

extract portfolio daily return

portfolio_s <- return_daily[,"portfolio"]

Record Solution: short-term estimator

record weight for each asset

weight_s <- data.frame(weight)
rownames(weight_s) <- as.character(index(daily_s))
colnames(weight_s) <- colnames(daily_s)[1:8]

In the following page, we will check our global setting:

full investment: sum of the weight = 1
portfolio is weekly balanced

Record Solution: short-term estimator

Fully invested: sum of weight = 1

sum(weight_s[100,1:8])

## [1] 1

sum(weight_s[200,1:8])

## [1] 1

Record Solution: short-term estimator

weight adjusted every five days

# tail(weight_s, 9)

	date	JPM	GOOGL	DIS	JCP	JNJ	SBUX	XOM	HOT
1	2016-03-18	-0.10548	0.29112	0.18929	0.04125	0.62734	-0.13745	0.11437	-0.02044
2	2016-03-21	-0.10548	0.29112	0.18929	0.04125	0.62734	-0.13745	0.11437	-0.02044
3	2016-03-22	-0.07360	0.28233	0.24746	0.03308	0.65960	-0.16801	0.05094	-0.03180
4	2016-03-23	-0.07360	0.28233	0.24746	0.03308	0.65960	-0.16801	0.05094	-0.03180
5	2016-03-24	-0.07360	0.28233	0.24746	0.03308	0.65960	-0.16801	0.05094	-0.03180
6	2016-03-28	-0.07360	0.28233	0.24746	0.03308	0.65960	-0.16801	0.05094	-0.03180
7	2016-03-29	-0.07360	0.28233	0.24746	0.03308	0.65960	-0.16801	0.05094	-0.03180
8	2016-03-30	-0.05966	0.29868	0.24171	0.03069	0.61849	-0.17080	0.06497	-0.02408
9	2016-03-31	-0.05966	0.29868	0.24171	0.03069	0.61849	-0.17080	0.06497	-0.02408

Visualize Solution: short-term estimator

Pick some dates to visuaize weight under each stock on a particular date

can change to the following "2008-09-15" to any trading day

weight_s <- setDT(weight_s, keep.rownames = TRUE)[]
bar <- subset(weight_s, rn == "2008-09-15")
bar_melt <- melt(bar, id.vars="rn")

ggplot(bar_melt, aes(x=variable, y=value)) + 
  geom_bar(stat="identity", position = "identity",aes(fill=variable)) +
  geom_text(aes(x=variable, y=value, label=round(value, digits = 4))) +
  labs(x="Ticker Symbol",y="weight_s") +
  theme(legend.title=element_blank())

Visualize Solution: short-term estimator

2008-09-15： when Lehman Brothers filed bankruptcy

plot of chunk unnamed-chunk-12

Visualize Solution: short-term estimator

2015-12-16： when Fed Raises Key Interest Rate for First Time in Almost a Decade

plot of chunk unnamed-chunk-13

Visualize Solution: short-term estimator

autoplot(daily_s$portfolio[1:2577], main="Portfolio_s Daily Return", facets=NULL) + 
    xlab("time") + ylab("Daily Return") + 
    geom_vline(xintercept=as.numeric(index(daily_s[c(674,879),])), linetype="dashed")

plot of chunk unnamed-chunk-14

Visualize Solution: short-term estimator

Some observations:

During crisis period (2008-2010), return volatility is much lager than usual.
For a extreme volatility in one day, it is usually caused by company splitted its stock, the price became half instantly

Record Solution: short-term estimator

Graphs about daily return is not natural, investors feel comfortable to look at Cumulative PnL (returns)

Record Cumulative returns

cummu_s <- daily_s
for (j in 1:n1)
  cummu_s[j, ] <- apply(daily_s[1:j, ], 2, Return.cumulative) + 1

autoplot(cummu_s$portfolio[1:2577], main="portfolio_s cumulative PnL full period", facets=NULL) + 
      xlab("time") + ylab("Cumulative Daily Return") +
     geom_vline(xintercept=as.numeric(index(cummu_s[c(674,879),])), linetype="dashed")

Visualize Solution: short-term estimator

plot of chunk unnamed-chunk-16

Visualize Solution: short-term estimator

plot of chunk unnamed-chunk-17

Same steps for medium and long term estimator

Now we can apply the same methods by using medium and long term estimator.

Recall:

Sample Covariance with 200 days of observation for returns, configuration referred as "long term", "long" or "l"
Sample Covariance with 120 days of observation for returns, configuration referred as "medium term","medium" or "m"
Sample Covariance with 60 days of observation for returns, configuration referred as "short term", "short" or "s"

Comparison between estimators

cummu <- cbind(cummu_s[,"portfolio"], cummu_m[,"portfolio"], cummu_l[,"portfolio"])
colnames(cummu) <- c("short","mediam","long")



cummu <- data.frame(cummu)
cummu_date <- setDT(cummu, keep.rownames = TRUE)[]
cummu_melt <- melt(cummu_date, id.vars="rn")
cummu_melt$rn <- as.Date(cummu_melt$rn)

Comparison between estimators

plot of chunk unnamed-chunk-19

Is it the end of our story?

Comparison between estimators

Crisis:
- Begin of the crisis: 2008 Sep 7, The Federal Housing Finance Agency (FHFA) places Fannie Mae and Freddie Mac in government conservatorship. A week later, Lehman Brothers filed bankruptcy.
- It ended in June 2009, according to the U.S. National Bureau of Economic Research (NBER)
data set (return_daily)
- Before crisis: return_daily[,1:674]
- During crisis: return_daily[,675:878]
- After crisis: return_daily[,879:1512]
Nobody evaluates a portfolio by is just looking at the cumulative PnL

Performance Indicators

Notation	Definition	Reference
mean_daily_return	average of daily return
mean_daily_return_annual	annualized average of daily return	mean_daily_return * 252
min_daily_return	minimum daily return
max_daily_return	maximum daily return
volatility	standard deviation of daily return	click
volatility_annual	annualized standard deviation of daily return	volatility * \(\sqrt{252}\)
Sharpe_ratio	Sharpe Ratio	click
Sharpe_ratio_annual	annualized Sharpe Ratio	click
Sortino_ratio	Sortino Ratio	click
Skewness_Kurtosis_ratio	Skewness divide Kurtosis	click
Modifed_VaR	Modified Value at Risk	click
Modifed_VaR_annual	annualized Modified Value at Risk
Max_drawdown	Maximum Drawdown	click

Performance Report: Before the Crisis

library(xtable)
print(xtable(report_b,digits=5), type = "html")

	before crisis_s	before crisis_m	before crisis_l	vote
mean_daily_return	0.00052	0.00045	0.00005
mean_daily_return_annual	0.13169	0.11281	0.01282	1
min_daily_return	-0.03136	-0.03578	-0.03955	1
max_daily_return	0.02967	0.02840	0.02276	1
volatility	0.00806	0.00804	0.00816
volatility_annual	0.12788	0.12758	0.12947	2
Sharpe_ratio	0.06487	0.05570	0.00624
Sharpe_ratio_annual	1.02772	0.86472	0.03428	1
Sortino_ratio	-0.01311	-0.02549	-0.08548	1
Skewness_Kurtosis_ratio	-0.06596	-0.08434	-0.13038
Modifed_VaR	-0.01314	-0.01337	-0.01447
Modifed_VaR_annual	-0.01314	-0.01337	-0.01447	3
Max_drawdown	0.15402	0.17585	0.18136	1

Before the Crisis:

Short-term estimator works the best
- highest daily return: closest to 15% target
- annualized Shape Ratio is greater than 1
volatility
- medium estimator has smallest volatility
- volatility from short estimator is about the same

Performance Report: During the Crisis

	during crisis_s	during crisis_m	during crisis_l	vote
mean_daily_return	-0.00196	-0.00126	-0.00145
mean_daily_return_annual	-0.49460	-0.31791	-0.36456	2
min_daily_return	-0.09695	-0.09400	-0.09198	3
max_daily_return	0.11205	0.11263	0.11630	3
volatility	0.02366	0.02383	0.02336
volatility_annual	0.37553	0.37823	0.37084	3
Sharpe_ratio	-0.08297	-0.05295	-0.06193
Sharpe_ratio_annual	-1.15087	-0.85277	-0.94799	2
Sortino_ratio	-0.13586	-0.10435	-0.11782	2
Skewness_Kurtosis_ratio	-0.03721	0.02628	0.05161
Modifed_VaR	-0.04056	-0.03733	-0.03515
Modifed_VaR_annual	-0.04056	-0.03733	-0.03515	1
Max_drawdown	0.41948	0.39171	0.40982	2

During the Crisis:

medium and long term estimator work better than short
- medium estimator has highest return: though it is negative, far from our target
- long term estimator has smallest volatility
- medium estimator has better result in Shape Ratio, Sortino ratio and Maximum Drawdown
Note: the portfolio during crisis has negative return.

Performance Report: After the Crisis

	after crisis_s	after crisis_m	after crisis_l	vote
mean_daily_return	0.00020	0.00026	0.00033
mean_daily_return_annual	0.05022	0.06577	0.08299	3
min_daily_return	-0.04767	-0.04425	-0.04489	2
max_daily_return	0.06541	0.05727	0.05206	1
volatility	0.01045	0.01003	0.01001
volatility_annual	0.16582	0.15928	0.15898	3
Sharpe_ratio	0.01908	0.02601	0.03288
Sharpe_ratio_annual	0.22428	0.34241	0.45858	3
Sortino_ratio	-0.05314	-0.04639	-0.03757	3
Skewness_Kurtosis_ratio	0.03770	0.01298	0.02145
Modifed_VaR	-0.01526	-0.01525	-0.01517
Modifed_VaR_annual	-0.01526	-0.01525	-0.01517	1
Max_drawdown	0.20524	0.19178	0.22438	2

After the Crisis:

Long-term estimator works the best
- highest daily return
- lowest volatility
- better result in Shape Ratio and Sortino ratio

Summary

Same estimation method would have different performances under different market enviroment.
When I write the code, I tried to make it more reproduceble; you may simply change/add/remove the ticker symbols, change the number of days for estimators, or adjust the time period, to create your own portfolio and visualization.
Next step:
- Maybe a shiny app ?
- Make expected return estimator fancier: currently I am using return in previous 60 trading days to estimate, this naive attempt can be replaced by many other techniques

Portfolio Optimization Problem

Markowitz Mean-Variance Method

Modern Portfolio Theory

Basic Notions

Basic Notions

Problem 1 - My Project

Risk Minimization:

Solution:

Based on this problem, my project is to build and backtest strategies for 8 selected stocks since 2006.

Other Classic Problems

Expected Return Maximization:

Risk Aversion Optimization:

Problem 1 - My Project

There are many other popular problems involved with more complicated constrains. My project is an application of Problem 1 - Risk Minimization.

Investment Universe - Data

Import closing stock price data from Yahoo Finance from 01/01/2006 to 03/31/2016.

Investment Strategies -Global Setting

Note: all parameters are replaceable in the code

Estimation of Expected Returns and Covariance

Expected Returns

Historic returns are often used to estimate expecetd returns

Covariance Candidates are:

Note: all parameters are replaceable in the code

Libraries and Datasets

Daily Returns

Solve the optimization problem

Solve the optimization problem

Record Solution: short-term estimator

Record Solution: short-term estimator

Record Solution: short-term estimator

Record Solution: short-term estimator

Visualize Solution: short-term estimator

Pick some dates to visuaize weight under each stock on a particular date

Visualize Solution: short-term estimator

Visualize Solution: short-term estimator

Visualize Solution: short-term estimator

Visualize Solution: short-term estimator

Some observations:

Record Solution: short-term estimator

Record Cumulative returns

Visualize Solution: short-term estimator

Visualize Solution: short-term estimator

Same steps for medium and long term estimator

Now we can apply the same methods by using medium and long term estimator.

Recall:

Comparison between estimators

Comparison between estimators

Is it the end of our story?

Comparison between estimators

Performance Indicators

Performance Report: Before the Crisis

Before the Crisis:

Performance Report: During the Crisis

During the Crisis:

Performance Report: After the Crisis

After the Crisis:

Summary

Reference

The End