Optimal Hedging with Continuous Action Reinforcement Learning
Mikkilä, Oskari (2020)
Degree Programme in Industrial Engineering and Management, MSc (Tech)
Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2020-05-11
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-202004294607
Abstract
This thesis proposes an application of state-of-the-art continuous action reinforcement learning for hedging European options in a market with discrete-time rebalancing and transaction costs. The hedging problem is formulated as a utility maximization problem, where the utility is defined as the expected change in wealth minus a risk aversion parameter times the variance of the wealth change. This utility turns hedging into a trade-off between expected wealth and variance. The proposed method is an application of the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm.
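The mean-variance objective described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the thesis's own code; the function name, the `risk_aversion` parameter, and the use of the sample variance are assumptions for the sketch.

```python
import numpy as np

def mean_variance_utility(wealth_changes, risk_aversion=1.0):
    """Mean-variance utility over realized wealth changes:
    U = E[dW] - kappa * Var(dW), where kappa is the risk
    aversion parameter. Higher kappa penalizes variance more."""
    dw = np.asarray(wealth_changes, dtype=float)
    return dw.mean() - risk_aversion * dw.var()
```

A hedging agent maximizing this quantity trades expected profit against the variability of its outcomes, which is the cost-versus-risk trade-off the abstract refers to.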
The method is tested in three different settings, two of which are simulations of the asset and option prices. In the first setting, the stock price follows a geometric Brownian motion with constant volatility. The second setting uses the Heston stochastic volatility model. The empirical section uses SPX index prices and prices of call options written on the SPX index from 2012 and 2013. For the empirical setting, this thesis proposes Sim-to-Real transfer learning: training the agent on asset price paths simulated by the Heston model and testing on historical data.
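The Heston paths used for training could be generated roughly as follows. This is a minimal sketch of a standard Euler scheme with full truncation for the variance process; the parameter names and values (`kappa`, `theta`, `xi`, `rho`, etc.) are generic placeholders, not the calibrated values from the thesis.

```python
import numpy as np

def simulate_heston_paths(s0=100.0, v0=0.04, kappa=2.0, theta=0.04,
                          xi=0.3, rho=-0.7, mu=0.0, T=1.0,
                          n_steps=252, n_paths=1000, seed=0):
    """Simulate asset price paths under the Heston model using an
    Euler scheme with full truncation of the variance process."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    s = np.full(n_paths, s0)
    v = np.full(n_paths, v0)
    paths = np.empty((n_steps + 1, n_paths))
    paths[0] = s0
    for t in range(1, n_steps + 1):
        z1 = rng.standard_normal(n_paths)
        # Correlated Brownian increments for price and variance.
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
        v_pos = np.maximum(v, 0.0)  # full truncation keeps variance usable
        # Log-Euler step for the price keeps paths strictly positive.
        s = s * np.exp((mu - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * z1)
        v = v + kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z2
        paths[t] = s
    return paths
```

In a Sim-to-Real setup such as the one the abstract describes, an agent would be trained on paths like these and then evaluated, without retraining, on the historical SPX series.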
The proposed method outperforms the selected analytical benchmarks in the stochastic volatility setting. In the constant volatility setting, the reinforcement learning method performs on par with the benchmarks. In the empirical tests, the method matches the benchmarks over the two-year period while performing better in more than 50% of the weeks. Overall, this thesis shows that continuous action reinforcement learning can be used to balance the trade-off between cost and risk when hedging a European option.