
MS&E 448 Trend Following: Final Report

Chenxu Shao, Enming Mao, Zheming Zheng


Department of Management Science and Engineering
June 8, 2014
Abstract

In this report, we present a trend following algorithm for trading in the foreign exchange market (forex market). By applying machine learning techniques and intuitive trading strategies, a Sharpe ratio of around 2 is achieved. In addition, several aspects of the algorithm as well as of the trading strategies were explored, and an improved trading strategy with simple risk control was developed. Finally, backtesting of a very similar algorithm was performed on the paper trading platform Quantopian, reaching a similar Sharpe ratio. The results show that the support vector machine (SVM) learning technique is a very feasible method for trend following, thus providing a significant reference for research on applying machine learning techniques in trading.

1 Introduction

1.1 The Foreign Exchange Market

The foreign exchange market (forex market) is a global, technology-based marketplace in which banks, corporations, governments and institutional investors trade currencies around the clock. It determines the relative values of different currencies.

There are many participants in this market, and they can be grouped into four categories: exporters and importers, investors, speculators, and governments [1]. The top 10 participants include Citigroup, Deutsche Bank, Barclays, UBS, HSBC, JPMorgan, BofA Merrill Lynch, RBS, BNP Paribas, and Goldman Sachs [2] (see Figure 1).

In terms of currencies, the most actively traded currency pair is USD/EUR, followed by USD/JPY, USD/GBP, etc. [3] (see Figure 2). In terms of cash flow, trades between forex dealers are usually large, involving hundreds of millions of dollars. Close to $4 trillion worth of currency is traded daily (almost twice the gross domestic product of the United Kingdom in 2011), making it by far the largest financial market operating in the world. Of this, approximately $1.5 trillion is traded by retail traders in the forex spot market [4].

[Figure 1: pie chart "Top 10 Forex Market Participants" with slices for Citi, Deutsche Bank, Barclays Capital, UBS, HSBC, JPMorgan, BofA Merrill Lynch, RBS, BNP Paribas and Goldman Sachs.]

Figure 1: Major participants in the forex market, data source: Meakin [2].

[Figure 2: pie chart "Actively Traded Currencies" with slices for EUR/USD, USD/JPY, GBP/USD, AUD/USD, USD/CAD, USD/CHF, EUR/JPY, EUR/GBP and Other.]

Figure 2: Most actively traded currency pairs in the forex market, data source: Department of Monetary & Economic, BIS [3].

The forex market assists international trade and investments by enabling currency conversion. In a typical forex transaction, a party purchases some quantity of one currency by paying for some quantity of another currency. Figure 3 shows a typical forex market time series.

Email: chenxu@stanford.edu

1.2 Data

The data used in this work are the historical forex prices of USD/EUR, the most actively traded currency pair, on all trading dates in March 2014.

For each trading day, the data provide quotes of ask price, ask size, bid price, bid size and tier, as well as a time stamp. The time interval between two consecutive quotes is not uniform, but is roughly 300 milliseconds. For some time stamps several different quotes exist; since they differ only slightly, only the first one is kept. Zero values exist for some dates (on very few occasions); they are replaced by the value interpolated from the previous and next entries.
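The two cleaning rules described for the data (keep the first quote per time stamp, replace zero values by interpolation) can be sketched as follows. This is a minimal illustration, not the authors' code: the `(timestamp_ms, bid)` tuple layout and the name `clean_quotes` are hypothetical, and "interpolated value" is assumed here to mean the plain average of the nearest valid previous and next entries.

```python
from typing import List, Tuple

Quote = Tuple[int, float]  # (timestamp in ms, bid price); hypothetical layout


def clean_quotes(quotes: List[Quote]) -> List[Quote]:
    """Keep the first quote per time stamp, then fill zero prices."""
    # Rule 1: several quotes may share a time stamp; keep only the first.
    seen = set()
    unique: List[Quote] = []
    for ts, price in quotes:
        if ts not in seen:
            seen.add(ts)
            unique.append((ts, price))

    # Rule 2: replace a zero price by the mean of its nearest valid
    # neighbours (assumption: simple average, not time-weighted).
    cleaned: List[Quote] = []
    for i, (ts, price) in enumerate(unique):
        if price == 0.0:
            prev = next((p for _, p in reversed(unique[:i]) if p != 0.0), None)
            nxt = next((p for _, p in unique[i + 1:] if p != 0.0), None)
            neighbours = [p for p in (prev, nxt) if p is not None]
            if neighbours:
                price = sum(neighbours) / len(neighbours)
        cleaned.append((ts, price))
    return cleaned
```

If the quotes are far from evenly spaced, a time-weighted interpolation may be preferable to the simple average used here.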

[Figure 3: plot "Intraday bid and ask prices" showing the Bid and Ask series; x-axis: Time (00:00 to 22:00), y-axis: Price (EUR/USD).]

Figure 3: Bid and ask prices for EUR/USD on March 3, 2014.

The forex market has many special characteristics that make it unique [4]:

- Availability: the forex market is available 24 hours a day and 6 days a week (from Sunday evening to Friday night);
- Commission free;
- Liquidity: the forex market is extremely liquid due to its sheer size. There are always buyers and sellers active in the marketplace;
- Leverage: only a small margin is needed to purchase a contract with much higher value, i.e. one can use this leverage to enhance profit and loss margins;
- Identifiable trends: compared to other markets, the forex market has a stronger and more lasting market trend which can be tracked.

Of all the characteristics, the strong market trend is what many traders rely on to make profits. This project aims at exploring algorithmic methods for tracking the trend in the forex market, developing a working trading strategy, and making profits as well.

1.3 Methodology

While various kinds of technical analysis are available for the forex market, quantitative analysis, and especially the machine learning approach to forecasting forex prices, is probably new. Using a significantly large amount of data, machine learning models can be trained and applied to uncover the patterns hidden in the data. It is thus natural to explore the possibility of using machine learning techniques to discover the trend through the study of past market data, as the forex market possesses a strong market trend.

In this work, the forex currency pairs were treated like equity tickers; the bid prices were used as training labels, and predictions were made on the bids. Once the prediction for the future bid price is made, an order is placed based on a programmed trading strategy.

1.3.1 Execution of the Order

In real life, an order may not be filled at the instant it is placed or at its desired price. In this work, however, the forex market is assumed to be efficient, and each order placed is assumed to be filled exactly at the desired price.

To meet this assumption and approximate real-life trading as closely as possible, a trading frequency of 5 minutes was set. That is, starting from the opening of the market, one trades every 5 minutes (each such moment is later referred to as a "trading node"), with the order placed at the current trading node and filled at the next, thus allowing some slack in time for the filling of the order.
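The trading-node schedule of Section 1.3.1 (trade every 5 minutes, order placed at one node and filled at the next) can be sketched as below. This is only an illustration under stated assumptions: times are expressed as seconds since midnight, and the function names `trading_nodes` and `place_and_fill` are hypothetical.

```python
from typing import List, Tuple

NODE_INTERVAL_S = 5 * 60  # trading frequency stated in the text: 5 minutes


def trading_nodes(open_s: int, close_s: int,
                  step_s: int = NODE_INTERVAL_S) -> List[int]:
    """Times (seconds since midnight) of the trading nodes in [open_s, close_s]."""
    return list(range(open_s, close_s + 1, step_s))


def place_and_fill(nodes: List[int]) -> List[Tuple[int, int]]:
    """Pair each node where an order is placed with the node where it is filled."""
    return list(zip(nodes[:-1], nodes[1:]))
```

For example, `place_and_fill(trading_nodes(0, 1800))` pairs the node at 0 s with the node at 300 s, the node at 300 s with the node at 600 s, and so on; the last node has no fill partner and therefore places no order.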
[Figure 4: timeline diagram with the labels "Training Data - Features", "Training Data - Labels (bids)", "Prediction features", "Prediction label", "Place order" and "Fill order"; the Time axis is marked t-10 min, t-5 min and t+5 min.]

Figure 4: The machine learning mechanism and trading work-flow.

To check the robustness of the assumption, the mean absolute percentage error (MAPE) of the bid prices within a 10-second interval of the trading node, with respect to the price at the trading node, is calculated using the following formula:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{k=1}^{n} \left| \frac{A_i - F_k}{A_i} \right| \tag{1}$$

where $A_i$ is the price at the $i$-th trading node, and the $F_k$ are the $n$ prices within 10 seconds of that node.

[Figure 5: plot "MAPE of the 10s bid prices"; x-axis: Trading nodes, y-axis: Percentage.]

Figure 5: The MAPE of all the prices within 10 s of the trading nodes, with respect to the price at the trading node.

It can be seen from Figure 5 that the prices do not vary a lot, so an order placed will be filled at a price similar to the desired one, if not the same.

1.3.2 Machine Learning Mechanism and Methodology

An online learning mechanism was applied; that is, the machine learning model is retrained and updated for each trade. To construct the training data set, the data between 10 and 5 minutes before the current trading node were used as training features, while the bid prices between 5 minutes before and the current trading node were used as training labels. With this setup, the prediction is one step ahead, i.e., the prediction is made for the bid price of the next trading node (with the data of the current trading node). The whole work-flow of learning and trading is illustrated in Figure 4.

The features used in the machine learning process include bid price, bid size, ask price, ask size and tier. The methods used include the support vector machine (SVM, regression type), which is one of the most successful off-the-shelf machine learning techniques so far, as well as the widely used generalized linear model (GLM, only as a comparison to SVM).

Backtesting was performed on the data introduced in Section 1.2.

1.4 Trading Strategy

Two kinds of asset were initialized at the beginning of trading: cash and a risky asset (shares), with an initial cash amount of $10000 and 0 shares of the risky asset. Three strategies were tested based on this initial setting.

1.4.1 Strategy A

Strategy A is very intuitive:

- If the predicted one-step-ahead bid price is higher than the current bid price, then all the cash is used to purchase the risky asset (with the bid-ask spread paid);
- If the predicted one-step-ahead bid price is smaller, then all the shares held are sold and an extra 10000 shares are shorted.
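The two rules of Strategy A can be sketched in Python as a single position update. This is a minimal sketch rather than the authors' implementation: the function name `strategy_a` and its argument names are hypothetical, a tie between the predicted and current bid is treated as a predicted rise, and purchases are assumed to execute at the ask and sales at the bid.

```python
from typing import Tuple

SHORT_SIZE = 10_000  # extra shares shorted on a predicted fall, as in the text


def strategy_a(bid_pred: float, bid_now: float, ask_now: float,
               cash: float, shares: float) -> Tuple[float, float]:
    """Return the (cash, shares) position after one trading node."""
    if bid_pred >= bid_now:
        # Predicted rise: spend all cash at the ask (the spread is paid here).
        return 0.0, shares + cash / ask_now
    # Predicted fall: sell the holdings and short SHORT_SIZE shares at the bid.
    return cash + (shares + SHORT_SIZE) * bid_now, -float(SHORT_SIZE)
```

Starting from $10000 in cash and no shares, a predicted rise converts all cash into `10000 / ask_now` shares, while a predicted fall leaves a short position of 10000 shares and credits the sale proceeds to cash.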
The algorithm is as follows:
Algorithm of Strategy A:
Require: B_c, B_p, A_c, P_1, P_2
if B_p >= B_c then
    P_2 = P_2 + P_1 / A_c
    P_1 = 0
else
    P_1 = P_1 + (P_2 + 10000) * B_c
    P_2 = -10000
end if

where B_p is the predicted future bid price, B_c the current bid price, A_c the current ask price, P_1 the amount of cash and P_2 the amount of shares one holds.

1.4.2 Strategy B

Strategy A is apparently naive: it considers neither the accuracy of the prediction nor the loss caused by the bid-ask spread. Strategy B is a slight improvement on these two aspects:

- If the training error is large, then we hold the positions; otherwise the trade is made based on the prediction.
- If the predicted bid price is higher than the current bid and ask prices, then all cash is spent to long the risky asset;
- If the predicted bid price is higher than the current bid price, but smaller than the ask price, we again hold the positions;
- If the predicted bid price is smaller than the current bid price but still larger than the ask price, then all the shares held are sold, all the cash is used to purchase the shares, and an extra 10000 shares are longed;
- If the predicted bid price is smaller than both the current bid and ask prices, we simply sell all the shares held and short another 10000 shares.

The algorithm is as follows:

Algorithm of Strategy B:
Require: B_c, B_p, A_c, P_1, P_2, ε_train
if ε_train > 10^-4 then
    P_1 = P_1, P_2 = P_2        (hold the positions)
else if B_p > max(B_c, A_c) then
    P_2 = P_2 + P_1 / A_c, P_1 = 0
else if B_p > B_c && B_p < A_c then
    P_1 = P_1, P_2 = P_2
else if B_p <= B_c && B_p > A_c then
    P_1 = P_1 + P_2 * B_c
    P_2 = P_1 / A_c + 10000, P_1 = -10000 * A_c
else if B_p <= min(B_c, A_c) then
    P_1 = P_1 + (P_2 + 10000) * B_c, P_2 = -10000
end if

where ε_train is the training error.

1.4.3 Modified Strategy B

Strategy B only uses the training error to measure the accuracy of the prediction, which is certainly not enough. However, with an online learning mechanism the past test error will not help either. As it is observed that negative returns of the strategies tend to cluster and accumulate, it is natural to add a simple risk control, which prevents further trading if the last trade yielded a negative return. The algorithm of the modified strategy is as follows:

Algorithm of Modified Strategy B:
Require: B_c, B_p, A_c, P_1, P_2, ε_train, R
if ε_train > 10^-4 || R < 0 then
    P_1 = P_1, P_2 = P_2        (either the training error is large, or the last trade ended with a negative return: hold the positions)
else if B_p > max(B_c, A_c) then
    P_2 = P_2 + P_1 / A_c, P_1 = 0
else if B_p > B_c && B_p < A_c then
    P_1 = P_1, P_2 = P_2
else if B_p <= B_c && B_p > A_c then
    P_1 = P_1 + P_2 * B_c
    P_2 = P_1 / A_c + 10000, P_1 = -10000 * A_c
else if B_p <= min(B_c, A_c) then
    P_1 = P_1 + (P_2 + 10000) * B_c, P_2 = -10000
end if

where R is the return of the last trade.

2 Backtesting Performances

2.1 Learning Performances

To assess the accuracy of the predictions, the absolute errors between the predictions and the actual values were calculated. For each day and each trade, the deviation of the predicted bid price from the actual bid price is calculated, and then the errors are averaged over all the trades within a day (which gives the mean absolute error (MAE) of the prediction):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| A_i - f_i \right| \tag{2}$$

where $A_i$ is the bid price at the $i$-th trading node, $f_i$ is its predicted value, and $n$ is the number of trading nodes in the day.

A snapshot of the absolute errors ($|A_i - f_i|$) within a trading day is shown in Figure 6.
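The daily MAE defined above amounts to averaging the absolute deviations over the trading nodes of a day. A minimal sketch follows, with a hypothetical function name and made-up example prices in the usage note:

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """MAE between actual bid prices A_i and their predictions f_i."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, predicted)) / len(actual)
```

For instance, for actual bids `[1.3750, 1.3760, 1.3755]` and predictions `[1.3752, 1.3757, 1.3755]` (illustrative numbers only), the absolute errors are 2e-4, 3e-4 and 0, so the MAE is about 1.67e-4, the same order of magnitude as the errors discussed in this section.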

[Figure 6: plot "Absolute error of the predictions" for SVM and GLM; x-axis: Trading nodes, y-axis: Error (*1e4).]

Figure 6: Typical errors for the intraday predictions, data drawn from predictions on the data of March 03, 2014, USD/EUR, with 137 trading nodes.

It can be seen from the above figure that within a typical trading day, the absolute error is considerably small. Compared to the GLM model, SVM is more accurate. The standard deviation of the errors for SVM is 3.578 × 10^-4, showing that the performance of the model is mostly stable. For the GLM model, the standard deviation of the errors is 6.597 × 10^-4, almost twice that of SVM.

Figure 7 shows the MAE of the predictions across the month of March 2014, calculated using Equation 2.

[Figure 7: plot "Daily mean absolute error" for SVM and GLM, with horizontal lines for the average SVM and average GLM; x-axis: Trading dates, y-axis: Error (*1e4).]

Figure 7: Monthly mean absolute error (MAE) in March 2014; the horizontal lines indicate the averaged MAEs.

Clearly, SVM performs better than GLM, with an averaged MAE 25% smaller. The predictions made by SVM are surprisingly accurate (with the largest error of the order of 10^-4). In addition, the errors remain stable across the whole month.

The difference in the accuracy of the predictions will certainly affect the performance of the strategy as well. The cumulative returns using Strategy B with SVM and GLM are shown in Figure 8.

[Figure 8: plot "Cumulative Daily Return" for Strat. B with SVM and Strat. B with GLM; x-axis: Trading dates, y-axis: Percentage.]

Figure 8: Cumulative returns using Strategy B, with SVM and GLM.

It is more evident from the figure that SVM outperformed GLM, with a much larger cumulative return at the end of the month. Although on each day the accuracy of GLM seems to be just slightly worse than that of SVM, this difference is actually very important here, since the bid prices sometimes vary only in the 4th digit; the effect persists and accumulates, and eventually caused a negative cumulative return.

It is also worth mentioning that, in terms of speed, GLM is slightly better: it takes 3-4 seconds on average (with Gaussian distribution and identity link function) for a training set of around 250 points. On the other hand, training the SVM model takes 5-10 seconds on average (with a Gaussian radial basis function kernel) for 250 data points. As the number of data points grows, the training time for GLM increases more slowly than that of SVM (which may take more than 30 seconds if the size of the training data gets larger than 1000). From this perspective, in a high-frequency trading scenario it is better to consider GLM (especially linear regression and logistic regression) and work on its improvement.

2.2 Strategy Performance (with SVM)

The daily and cumulative returns for Strategy A, B, and modified B are shown in Figure 9 and Figure 10, respectively.

[Figure 9: plot "Daily return" for Mod. Strat. B, Strategy B and Strategy A; x-axis: Trading dates, y-axis: Percentage (%).]

Figure 9: Daily returns of the strategies in March, 2014.

[Figure 10: plot "Cumulative Daily Return" for Mod. Strat. B, Strategy B and Strategy A; x-axis: Trading dates, y-axis: Percentage (%).]

Figure 10: Cumulative daily returns of the strategies in March, 2014 (with SVM).

It is not difficult to observe from the cumulative daily returns that modified Strategy B outperformed Strategy B, while the latter outperformed Strategy A. This shows that the mechanisms introduced in Section 1.4.2 and Section 1.4.3 worked well.

It can be seen from Figure 9 that, compared to Strategy A, Strategy B managed to boost the daily returns on many days, but also suffers larger losses when the strategy fails to catch the trend of the market. On the other hand, while maintaining the improvements introduced by Strategy B, the modified strategy managed to avoid large losses on some trading days.

The total return for Strategy A is 3.20%, corresponding to an average daily return of 0.126%. For Strategy B, the total return is 7.25%, which corresponds to a daily return of 0.280%, while that of modified Strategy B is 10%, which is 0.384% daily (for 25 trading days, assuming continuous compounding). The Sharpe ratios calculated for the month are 0.2778, 0.4416, and 0.5549 for Strategy A, B, and modified B, respectively.

Ideally, the cumulative return would accumulate every day without large drawdowns. To achieve this, it is useful to incorporate more features into the machine learning model, as well as to consider external factors that may affect the prices, such as macroeconomic news. As these aspects are beyond the scope of this work, they will not be discussed in the report.

Nevertheless, a monthly return of around 7.5%-10% is pretty high, rendering the SVM very feasible for the prediction of forex prices. It is then natural to explore the possibility of applying similar methods and strategies to other assets in other markets. As a demonstration of the idea, a similar algorithm is backtested on a paper trading platform.
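The conversion from a total monthly return to an average daily return quoted in Section 2.2 can be reproduced under the stated assumption of continuous compounding over 25 trading days. The sketch below uses a hypothetical function name; it is one reading of the calculation that happens to match the reported figures, not the authors' code.

```python
import math


def avg_daily_log_return(total_return: float, n_days: int) -> float:
    """Average daily return r such that exp(r * n_days) = 1 + total_return."""
    if n_days <= 0 or total_return <= -1.0:
        raise ValueError("need n_days > 0 and total_return > -100%")
    return math.log(1.0 + total_return) / n_days
```

With `n_days = 25`, a total return of 3.20% gives about 0.126% per day and 7.25% gives about 0.280% per day, matching the text. For a total return of exactly 10% the same formula gives about 0.381%, so the quoted 0.384% presumably reflects an unrounded total return slightly above 10%; that reading is an assumption.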

3 Paper Trading

To further test the robustness of the algorithm, the algorithm is deployed on the Quantopian platform, where algorithms can be tested and applied to real-life trading.

3.1 Introduction to Quantopian Platform

Quantopian is a paper trading platform where trading algorithms can be backtested and then deployed to real-life trading. The programming language it uses is Python, with limited support. On the platform, trades can be made either daily or by the minute. Many indicators and parameters, such as alpha, beta, Sharpe ratio, information ratio, etc., are calculated automatically.

Currently, there are several limitations of the platform, such as limited support for native Python data structures and machine learning methods, limited data access (for instance, current positions of the portfolio cannot be accessed), etc. In addition, equity is the only asset that can be traded on the platform.

Due to such limitations, the algorithm and strategy backtested on this platform are slightly different from those used on the forex data, but they share the same general idea.

3.2 Methodology and Strategy

The algorithm is backtested both on daily trading and on intraday trading (trading in minutes). The methodology is generally similar to that described in Section 1.3, but due to the different characteristics of the equity market compared to the forex market, and also limited by the platform, some modifications were made:

1. For daily trading, the data from the past 15 days were taken as training data. In addition, the prediction is made for the trend of the open price (whether it is going to rise or fall), instead of the actual price. This is because the equity market does not have a strong trend, which makes the actual price very hard to predict, and also because of the lack of features (explained in item 5). The trend of the open price, instead of the instant price, is predicted because for daily trading, how the order is placed and filled is vague. By default the order is filled at the opening of the market, i.e., traded at the open price.

2. For intraday trading, equity data from the past 3 days (which contain a large number of data points) were used as training data. The trend of the instant price in the next minute is predicted, as under this setup the order placed will be filled at the next minute and traded at the instant price.

3. The machine learning method used is SVM, but with a classification setup, as predicting the trend is a classification problem (whereas predicting the actual price is a regression problem).

4. The training label is built in such a way that a future rise of the stock price corresponds to label "1" and a future fall corresponds to label "-1".

5. The features used include volume, open price, close price, high price (of the day), low price, and current price, which are provided by the platform. Due to the limited access to historical data, no other indicators/features can be calculated or acquired.

As for the strategy, since the bid and ask prices, as well as the current positions in the portfolio, cannot be accessed, a simple strategy is applied: a certain number of shares (10 in the backtest) will be longed if the predicted trend is "1" (i.e., rise); otherwise the same number of shares will be sold or shorted.

3.3 Backtesting Performances

The backtesting was performed with data from January 01, 2010 to May 30, 2014, on NASDAQ: AAPL. The open prices for NASDAQ: AAPL during this time period are shown in Figure 11.

[Figure 11: plot "Stock Price of AAPL (Daily Data)"; x-axis: Trading dates, y-axis: Stock Price (USD).]

Figure 11: Open prices for NASDAQ: AAPL on all trading dates, from January 01, 2010 to May 30, 2014.
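The labelling scheme and fixed-size order rule of Section 3.2 can be sketched as follows. This is an illustration under stated assumptions, not the code run on Quantopian: the function names are hypothetical, and an unchanged price is labelled "-1" here because the text only specifies the rise and fall cases.

```python
from typing import List, Sequence

ORDER_SIZE = 10  # shares per trade in the backtest described in the text


def trend_labels(prices: Sequence[float]) -> List[int]:
    """Label each step 1 if the next price is higher, otherwise -1."""
    return [1 if nxt > cur else -1
            for cur, nxt in zip(prices[:-1], prices[1:])]


def order_for(predicted_label: int, size: int = ORDER_SIZE) -> int:
    """Signed order: +size shares on a predicted rise, -size otherwise."""
    return size if predicted_label == 1 else -size
```

A series of n prices yields n-1 labels, since the last price has no successor. The sign convention of `order_for` (positive for buying, negative for selling or shorting) is a choice made for this sketch.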


Figure 12: Performance panel for the backtesting of the algorithm. The trade was made daily, from January 01, 2010 to May 30, 2014, on
NASDAQ: AAPL. Figure acquired from Quantopian.com.

[Figure 13: two panels, "Monthly Return on AAPL (Daily Data)" (y-axis: Percentage (%)) and "Monthly Sharpe Ratio on AAPL (Daily Data)" (y-axis: Sharpe Ratio); x-axis of both: Trading dates, 2010 to 2014.]

Figure 13: Left: Monthly return of the algorithm, backtested with the daily data, on NASDAQ: AAPL. The red dashed line
indicates the average level of the return, which is 18.13%. Right: Monthly Sharpe ratio of the algorithm, backtested with the
daily data on NASDAQ: AAPL. The red dashed line indicates the average level of the Sharpe ratio, which is 0.6271.


3.3.1 Daily Trade Performance

Figure 12 shows the result of the backtesting on daily data. The top chart graphs the cumulative performance, which is the cumulative return of the algorithm (blue), compared to the benchmark performance (red, S&P 500 ETF). The second chart plots the training error of the SVM classification. The third chart plots the returns in each week, with blue bars above the axis showing positive returns and red bars below the axis showing negative returns. Finally, the bottom chart records each transaction, with blue bars above the axis representing purchases of shares and red bars below the axis representing short selling.

The algorithm yields a total return of 9520.5% for AAPL over the 4.5 years, compared to a benchmark return of only 85.3%. In terms of Sharpe ratio, the algorithm gives a cumulative value of 13.66 for the 4.5 years.

The training errors are relatively large, ranging from 10% to 20%, which is mainly due to the lack of features.

The detailed monthly return, as well as the monthly Sharpe ratio of the strategy, is plotted in Figure 13. The monthly return shows that the algorithm is doing well, with most months getting returns of around 18%. The standard deviation of the returns is 0.5737, which is small.
Limited by the access to data and to the performance of the portfolio, it is difficult to set up any indicators to prevent loss in the strategy. Although seemingly very profitable, this strategy is believed to be risky, with no measures to prevent loss (this is more evident in intraday trading; see the results for intraday trading). It turned out that this strategy works very well with momentum-driven stocks but poorly with stocks whose prices are relatively stable and move up and down frequently.

In terms of Sharpe ratio, the values vary from month to month. This also indicates that the strategy might be risky. The average level of the Sharpe ratio is 0.6271, which corresponds to an annualized Sharpe ratio of 2.173.

Since the Sharpe ratio is automatically calculated by the platform with unknown measures, and the overall Sharpe ratio (13.66) is much larger than the annualized average Sharpe ratio (2.173), a manual calculation was performed using data from Yahoo! Finance (with the same features and the same strategy, of course). The calculation gives a monthly Sharpe ratio of 0.5950, which is 2.062 when annualized.

3.3.2 Intraday Trade Performance

Figure 14 shows the result of the backtesting on intraday data. As in Figure 12, the top chart shows the cumulative performance compared to the benchmark performance (red, S&P 500 ETF), the second chart shows the training error, the third plots the weekly returns, and the bottom chart records each transaction.

The cumulative return of this backtesting is 454880.2% over the 4.5 years, which is much larger than that of the backtesting with daily data. This may be due to the more active transactions in intraday trade. It can be seen from the transactions chart that transactions were made more actively than in the daily trade. In addition, for the intraday trade, long and short transactions are spread throughout the whole period, with larger transactions in purchases, while for the daily trade, long or short transactions tend to cluster, with almost similar sizes. The difference makes intuitive sense, as within a day the price of the stock may move back and forth, whereas for daily trade there may be a general trend so that transactions cluster.

In terms of Sharpe ratio, the algorithm gives a cumulative value of 16.66 for the 4.5 years, which is larger than that of the daily trade.

In terms of training error, the errors for the intraday data are smaller than the errors for the daily data, ranging from 0 to 10%. The improvement in the training error may be due to the increase in the number of training data points, but note that this does not mean that there is a significant improvement in the prediction, as the trained model may have a very high variance. The actual test error is difficult to retrieve, as the result of the prediction cannot be saved and compared. By checking the actual values of the portfolio, it can be roughly calculated that the accuracy of the prediction is 65% for intraday data and 62% for daily data. The low accuracy may be due to the lack of features.

The detailed monthly return, as well as the monthly Sharpe ratio of the strategy, is plotted in Figure 15. The monthly return shows that the algorithm performed relatively stably, with most months getting returns of around 17%.

The monthly Sharpe ratio varies from month to month, with an average level of 0.6803, which corresponds to an annualized Sharpe ratio of 2.3566.
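The annualized values quoted in Section 3.3 are consistent with scaling the monthly Sharpe ratio by the square root of 12, the usual convention for monthly data. The sketch below assumes that convention, a zero risk-free rate by default and the sample standard deviation; the function names are hypothetical and the platform's own calculation is, as noted in the text, unknown.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def sharpe_ratio(returns: Sequence[float], risk_free: float = 0.0) -> float:
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def annualize_monthly_sharpe(monthly_sharpe: float) -> float:
    """Scale a monthly Sharpe ratio to a yearly one (12 months per year)."""
    return monthly_sharpe * math.sqrt(12)
```

Under this convention `annualize_monthly_sharpe(0.6803)` gives about 2.3566, and the monthly values 0.6271 and 0.5950 give about 2.172 and 2.061, which agree with the reported 2.173 and 2.062 up to rounding.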


Figure 14: Performance panel for the backtesting of the algorithm. The trade was made in minutes, from January 01, 2010 to May 30,
2014, on NASDAQ: AAPL. Figure acquired from Quantopian.com.

[Figure 15: two panels, "Monthly Return on AAPL (Minute Data)" (y-axis: Percentage (%)) and "Monthly Sharpe Ratio on AAPL (Minute Data)" (y-axis: Sharpe Ratio); x-axis of both: Trading dates, 2010 to 2014.]

Figure 15: Left: Monthly return of the algorithm, backtested with the intraday data, on NASDAQ: AAPL. The red dashed line
indicates the average level of the return, which is 17.62%. Right: Monthly Sharpe ratio of the algorithm, backtested with the
intraday data on NASDAQ: AAPL. The red dashed line indicates the average level of the Sharpe ratio, which is 0.6803.

Compared to the daily performances, the monthly return as well as the monthly Sharpe ratio of the intraday data are slightly more stable, but with large spikes in several months where the predictions failed. This shows that intraday trading is more sensitive to the robustness of the predictions, compared to the daily trade, whose trend is also easier to detect.

Notably, when the cumulative return charts of both Figure 12 and Figure 14 are compared with the stock prices of AAPL in Figure 11, the shapes of the curves look very alike. This is due to the nature of the strategy, which has a delay effect.

When the stock price is rising, the strategy intends to purchase 10 shares at a time, which will accumulate to a significant amount after some time. Since the prediction accuracy is not very high even with a rising trend, when the algorithm fails to predict the turning point (and it is always hard to predict the turning point, which may be caused by a sudden event, such as the filing of a 10-K report), the portfolio will suffer a great loss because of the large number of shares held in the portfolio. Then, even if the predictions become correct afterwards, failing to reinvest all the capital and selling 10 shares at a time will not do much to reduce the trend of loss. If, on the other hand, the predictions can be more accurate (using more features) and the capital can be reinvested, the situation will be much different. Figure 16 shows a similar algorithm (with a $1000 initial fund) where the machine learning model used 29 features for training and prediction, and applied a reinvestment strategy. It can be seen that such a strategy can actually avoid large losses through better predictions as well as reinvestment of the capital, although several kinks do show up in the return curve.

[Figure 16: two panels, "Present value, SVM model" (series: SVM model, Labeled) and "Stock Price, AAPL"; x-axis of both: Trading dates (Mar 11 to Nov 13), y-axis of both: USD.]

Figure 16: Performance of a similar algorithm, with reinvestment of the capital and more features, backtested with the daily data of NASDAQ: AAPL. The red dashed line indicates the reference performance, assuming future prices are known. Figure reproduced from Shao and Zheng [5].

4 Execution Discussion

4.1 Forex Trading

The algorithms on the forex data worked well, and it is likely that they will also work in real life.

In real life, it is important to acquire enough historical data and more features (such as moving average, relative strength index, etc., for they are important indicators of the trend of the market) to train the SVM model.

The time used in training, making predictions and executing the strategy matters. As discussed in Section 2.1, the training speed of SVM depends on the size of the training data. To reach a good accuracy, at least 200 data points will be needed, so that the training time will be around 5 seconds. If the time for making the prediction and executing the strategy is taken as 2 seconds, the total time will be at least 7 seconds, which makes such a strategy not very suitable for high-frequency trading. Setting a trading frequency can be very useful, since after the last trade it is also necessary to gather new data points so as to update the prediction model. To speed up the total execution time of the program, it is necessary to switch to a faster programming language such as C++.

The real problem is placing the order, as the order might not get placed at the desired time and price. Since how and when the order gets placed is hard to figure out, it is important that the prices or trend within several seconds of the desired trading time and price be predicted. So instead of predicting a single price at a single point in time, one needs to predict a set of prices during a period. Moreover, getting close to the exchange helps as well.

In addition to the original strategies, it is also very important to hedge the risks. One way is to purchase different forex assets so as to diversify the risks.
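The additional features mentioned in this section, moving average and relative strength index (RSI), can be computed from a price series as sketched below. This is a hedged illustration only: the function names are hypothetical, the RSI variant uses simple averages of gains and losses over the window (rather than Wilder's smoothed averages), and the default window of 14 is a common convention, not a value taken from the report.

```python
from typing import Sequence


def simple_moving_average(prices: Sequence[float], window: int) -> float:
    """Mean of the last `window` prices."""
    if window <= 0 or len(prices) < window:
        raise ValueError("need at least `window` prices and window > 0")
    return sum(prices[-window:]) / window


def rsi(prices: Sequence[float], window: int = 14) -> float:
    """Relative strength index over the last `window` price changes."""
    if window <= 0 or len(prices) < window + 1:
        raise ValueError("need at least window + 1 prices and window > 0")
    recent = list(prices[-(window + 1):])
    changes = [b - a for a, b in zip(recent[:-1], recent[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains == 0 and losses == 0:
        return 50.0  # flat series: neutral value (a convention of this sketch)
    if losses == 0:
        return 100.0
    return 100.0 - 100.0 / (1.0 + gains / losses)
```

Both values could be appended to the feature vector at each trading node; whether they actually improve the predictions would have to be checked in a backtest.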

4.2 Equity Trading

Equity trading is generally similar to forex trading, except that a commission is required for each trade, so the predictions must be more accurate; otherwise a strategy that works in the forex market might end up profitless.

Trading in stocks requires more features (P/E ratio, 10-day volume, risk premium, Williams %R, etc.), as well as more data points.

It makes a great difference whether one is trading in minutes or in days. For daily trading, it is helpful to incorporate the influence of the news into the algorithm, so that the algorithm becomes more sensitive to turning points. For intraday trading, the accuracy becomes very important, since within a day the prices may fluctuate while not changing much, and bad predictions can cause large losses.

The big challenge is still the placing and filling of orders. For daily trading, one can always place an order before the close of the market, which requires paying a commission of 0.3% for NYSE; for intraday trading, a strategy of setting a trading frequency may work as well.

To hedge the risk, options on the asset can be longed or shorted accordingly, with different positions in the asset.

5 Summary

To summarize, an algorithm for forex trading was developed. The prediction accuracy for the bid price is high for the support vector machine model. Three different strategies originating from intuitive thoughts were tested, and their performances were evaluated. Using the data of USD/EUR in March 2014, a high monthly return of 10% was achieved.

In addition, a similar algorithm was backtested on the paper trading platform Quantopian, with the historical data of NASDAQ: AAPL from January 2010 to May 30, 2014. High returns were achieved for both daily and intraday trading. Our work thus provides a good reference for trading algorithms using machine learning techniques.

Acknowledgement

We thank Dr. Lisa Borland for teaching this great course, where we learned a lot, and for bringing the guest lecturers into the classes; we enjoyed the lectures very much. We thank Dr. Lisa Borland and Mr. Enzo Busseti for the many intelligent instructions and suggestions they provided us. It has been a very enjoyable quarter, and each meeting with you has been very fruitful. We really appreciate your help and we are planning to continue the research.

References

[1] Levinson, M. Guide to financial markets; Bloomberg Press, 2009.
[2] Meakin, L. Deutsche Bank currency crown lost to Citigroup on low volatility; 2014.
[3] Monetary&Economic, D. Foreign exchange turnover in April 2013: preliminary global results; 2013.
[4] Driver, M. An Introduction to Forex Trading-A Guide for Beginners; Matthew Driver, 2011.
[5] Shao, C.; Zheng, Z. Algorithmic Trading Using Machine Learning Techniques; 2013.