FIS Toronto 2024

Enhanced tech capabilities make reinforcement learning viable

L-R: David Bell and John Hull

Computing power has advanced to the point that the once-impractical process of reinforcement learning is now a viable tool for asset owners, the Top1000funds.com Fiduciary Investors Symposium has heard. 

Reinforcement learning trains software to make decisions through a process of trial and error and is used in investment decision making to find the strategy with the best expected result. 

John Hull, Maple Financial chair in derivatives and risk management at the Joseph L. Rotman School of Management, told the symposium that reinforcement learning has several advantages and outperforms simpler modelling approaches. 

“It gives you the freedom to choose your objective function – it’s a danger with some of the simpler hedging strategies and so on that you’re just assuming good outcomes are as bad as bad outcomes,” he said. 

“You can choose your time horizon, tests indicate that it’s robust… and gives good results during stress periods and there’s a big saving in transaction costs. Why are we talking about it now? Well, because computers are now fast enough to make it a viable tool.” 

Hull said reinforcement learning techniques can reduce transaction costs by as much as 25 per cent compared with traditional hedging approaches. 

“It’s a way of generating a strategy for taking decisions in a changing environment – you’re not just taking one decision, but a sequence of decisions,” he said. 

“Perhaps you’re taking a decision today and then you take another decision tomorrow, and so on. Let’s suppose you’re interested in a strategy for investing in a certain stock and say what’s a good strategy for this stock – I think it’s going to work out okay, but it may not. What strategy should I use over the next three months? What do you do?” 

Hull said normally a stochastic process – which assesses different outcomes based on changing variables – would be used to assess a stock. 

“It’s uncertain how the stock price is going to evolve and you might use a mathematical stochastic process, you might use historical data on the stock price behaviour, something like that. You have some model for how the stock price behaves,” Hull said. 
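The kind of stochastic process Hull describes can be sketched with geometric Brownian motion, a common textbook model for stock prices. This is an illustrative example only, not the model used in the research; the drift, volatility and three-month horizon are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def simulate_paths(s0=100.0, mu=0.05, sigma=0.2, days=63, n_paths=1000):
    """Simulate daily stock price paths over roughly three trading months
    using geometric Brownian motion (illustrative parameters)."""
    dt = 1.0 / 252.0  # one trading day as a fraction of a year
    # Daily log-return under GBM: (mu - sigma^2/2) dt + sigma sqrt(dt) Z
    z = rng.standard_normal((n_paths, days))
    log_returns = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    return s0 * np.exp(np.cumsum(log_returns, axis=1))

paths = simulate_paths()
print(paths.shape)  # (1000, 63): 1000 hypothetical three-month price histories
```

Each simulated path is one hypothetical way the stock price could evolve, which is what gives the learner a supply of scenarios to train against.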

“Then your problem is defined by what we call states/actions/rewards.” 

Hull said the aim is quite simply to decide what action should be taken in each possible state to maximise the expected reward.  

“You’d say okay, we don’t know how this stock price is going to evolve but it will evolve in some way, and so there will be certain states we find ourselves in. We should take a certain action, and that’s what we’re trying to determine, and there will be a certain reward,” Hull said.  

“In other words, you’ll make a profit or a loss. The way I think about it, it’s just sophisticated trial and error.” 

This means starting off with “no idea at all” about what a good action to take is, and then trying different actions across hypothetical outcomes. 

“It works well or it doesn’t work well, then you try a different action and so on and then eventually you come up with what seems to be the best action to take when a particular state is encountered,” Hull said. 
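The “sophisticated trial and error” loop Hull describes can be sketched with tabular Q-learning on a toy problem. This is not the symposium’s actual model: here the state is a coarse price bucket, the two actions are holding cash or holding the stock, the reward is the one-step profit, and the dynamics, learning rate and exploration rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2           # 5 price buckets; action 0 = cash, 1 = stock
Q = np.zeros((n_states, n_actions))  # estimated value of each action in each state
alpha, gamma, epsilon = 0.02, 0.9, 0.1

def step(state, action):
    """Toy mean-reverting dynamics: price tends up in low buckets, down in high ones."""
    p_down = 0.1 + 0.8 * state / (n_states - 1)
    move = rng.choice([-1, 1], p=[p_down, 1 - p_down])
    next_state = min(max(state + move, 0), n_states - 1)
    reward = float(move) if action == 1 else 0.0  # profit only if holding the stock
    return next_state, reward

# Trial and error: act, observe the reward, nudge the estimate, repeat.
state = 2
for _ in range(50_000):
    if rng.random() < epsilon:                 # occasionally explore a random action
        action = int(rng.integers(n_actions))
    else:                                      # otherwise exploit the best known action
        action = int(np.argmax(Q[state]))
    next_state, reward = step(state, action)
    # Q-learning update: move the estimate toward reward + discounted future value
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

policy = Q.argmax(axis=1)  # best learned action for each state
print(policy)
```

After enough trials the learner ends up holding the stock in the low buckets (where the toy dynamics favour a rise) and cash in the high ones, with no prior knowledge of those dynamics.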

Hull said reinforcement learning has traditionally been computationally expensive and “data hungry”, but that is no longer the case. 

“But fortunately, the other thing that’s happened that makes this a viable tool… is that we can now generate unlimited amounts of synthetic data that’s indistinguishable from historical data,” he said. 

“You collect some historical data… maybe a couple of thousand items of historical data [and] you can generate as much synthetic data as you want to that is indistinguishable from that historical data.” 
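In practice, synthetic data that is statistically indistinguishable from historical data is typically produced with generative models. As a minimal stand-in for the idea, the sketch below resamples contiguous blocks of a historical return series, which preserves short-range dependence; the stand-in “historical” data and the block length are illustrative assumptions, not the method used in the research.

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for "a couple of thousand items of historical data"
historical_returns = rng.normal(0.0003, 0.01, size=2000)

def block_bootstrap(returns, n_synthetic, block=10):
    """Stitch together randomly chosen contiguous blocks of historical returns
    to generate an arbitrarily long synthetic series."""
    n_blocks = int(np.ceil(n_synthetic / block))
    starts = rng.integers(0, len(returns) - block, size=n_blocks)
    blocks = [returns[s:s + block] for s in starts]
    return np.concatenate(blocks)[:n_synthetic]

synthetic = block_bootstrap(historical_returns, n_synthetic=100_000)
print(synthetic.shape)  # as many synthetic observations as needed
```

The point is the asymmetry Hull highlights: a fixed, modest historical sample can seed an effectively unlimited supply of training scenarios.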

Hull said that while his experience has mostly been in applying reinforcement learning to the hedging of derivatives, there are many other areas where it can also be applied. 

“Because really it can be applied in any situation where the goal is to develop a strategy for achieving a particular objective in a changing market,” he said. 

“There’s something out there that’s going to change in a way you don’t know, and you have to model that.” 

The Financial Innovation Hub, or FinHub for short, carried out the research that Hull presented to the symposium. 

Hull said one of the distinctive features of FinHub is that it’s not just academics within the Rotman School of Management that work on its projects, but also practitioners and the university’s engineering faculty. 

Reinforcement learning is just one of the projects FinHub has been working on, with Hull explaining the centre has also been doing work on natural language processing, amongst other initiatives. 

“We’ve worked with the Bank of Canada on monetary policy uncertainty,” Hull said. 

“We’ve done work on modelling volatility surfaces and using natural language processing to forecast different market variables.” 
