Stock Market Prediction

The document discusses stock market prediction using machine learning techniques like LSTM neural networks. It describes implementing a stock prediction model in Python by loading and preparing data, building and compiling an LSTM model, and making predictions. It also discusses common reasons why stock prediction projects fail, such as selection bias, improper portfolio construction, issues with data preprocessing, lookahead bias, and failing to backtest strategies.

Uploaded by

Ajay Jayakumar
Copyright
© All Rights Reserved

THE STOCK MARKET PREDICTION

Stock market prediction is the act of trying to determine the future
value of a company’s stock or other financial instruments traded on an
exchange. Successfully predicting a stock’s future price could yield a
significant profit. In this application, we used an LSTM network to
predict the closing stock price from the previous 60 days of stock
prices.

For the application, we used the machine learning technique called Long
Short-Term Memory (LSTM). LSTM is an artificial recurrent neural
network (RNN) architecture used in the field of deep learning.
Unlike standard feed-forward neural networks, LSTM has feedback
connections. It can process not only single data points (such as images)
but also entire sequences of data (such as speech or video).

LSTM is widely used for sequence prediction problems and has been very
effective.

Implementation of Stock Price Prediction in Python

1. Importing Modules

The first step is to import all the modules the project needs. We will
be using basic modules such as numpy, pandas, and matplotlib. In
addition, we will be using some submodules of keras to create and build
our model.

2. Loading and Preparation of Data

3. Understanding the Data

3.1 Getting Unique Stock Names

3.1 Getting Unique Stock Names

3.3 Visualizing the stock data


4. Creating a new Dataframe and Training data
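
The code for this step is not reproduced in the text. As a rough sketch of how a 60-day input window could be built from a series of closing prices, the snippet below slices the series into overlapping windows with NumPy. The function name `make_windows` and the synthetic `closes` array are illustrative assumptions, not the original project's code, and any scaling step is left out.

```python
import numpy as np


def make_windows(prices, window=60):
    """Split a 1-D price series into (X, y) pairs for a sequence model.

    Each row of X holds `window` consecutive prices; the matching entry
    of y is the price on the day that follows the window.
    """
    prices = np.asarray(prices, dtype=float)
    if prices.ndim != 1 or len(prices) <= window:
        raise ValueError("need a 1-D series longer than the window")
    n_samples = len(prices) - window
    X = np.stack([prices[i:i + window] for i in range(n_samples)])
    y = prices[window:]
    # Recurrent layers usually expect (samples, timesteps, features).
    return X[..., np.newaxis], y


# Synthetic stand-in for a column of closing prices (not real data).
closes = np.arange(100, dtype=float)
X, y = make_windows(closes, window=60)
print(X.shape, y.shape)
```

With 100 prices and a 60-day window this gives 40 training samples, each paired with the price on the following day.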

5. Building LSTM Model

6. Compiling the Model

The LSTM model is compiled using the mean squared error (MSE) loss
function and the adam optimizer.

7. Testing the model on testing data

8. Error Calculation

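RMSE is the root mean squared error: the square root of the average
squared difference between the predicted and actual prices. The lower
the RMSE, the closer the predictions are to the actual values.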
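
As a small illustration of the metric, the sketch below computes RMSE with NumPy. The helper name `rmse` and the three example prices are made up for the illustration.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


# Illustrative values only: the prediction errors are +1, -1 and +2.
actual = [100.0, 102.0, 104.0]
predicted = [101.0, 101.0, 106.0]
error = rmse(actual, predicted)  # sqrt((1 + 1 + 4) / 3) = sqrt(2)
print(f"RMSE: {error:.4f}")
```

Because RMSE is expressed in the same units as the prices, it should be read relative to the price level of the stock.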
9. Make Predictions

The final step is to plot and visualize the data. To do this we use
basic plotting functions such as title, label, and plot, according to
how we want the graph to look.

Why Stock Prediction Projects Fail

With the resurgence of machine learning and artificial intelligence, never
has it been easier to implement predictive algorithms both new and old.
With just a few lines of code, state-of-the-art models are readily
accessible at the fingertips of the budding data enthusiast, ready to
conquer whatever insurmountable digital task may lie at hand. But a
little bit of knowledge can be a dangerous thing. While much of machine
learning can be attributed to statistics and programming, what is equally
important, but often skipped over in favor of instant gratification, is
domain knowledge.

Nowhere is this more true than in investing. While the environment is rich
with stock price and fundamental data that is both accessible and free,
indiscriminate application of pre-processing techniques and machine
learning algorithms will produce indiscriminate results. Financial time
series data are incredibly nuanced, with a systematically low
signal-to-noise ratio. Practitioners spend their careers trying to achieve
the elusive aim of generating consistent outperformance, and only a few
succeed. A more intimate understanding of the data is therefore needed
to achieve some semblance of success. As such, this article aims to shed
light on some common reasons why stock prediction projects may fail once
put into production.

1. Selection Bias

Many projects start with the arbitrary selection of a stock to which an
algorithm is applied. This is often a tech stock such as Apple or Amazon,
simply because these companies are well known and ingrained in the
everyday lives of consumers. This is problematic because stock selection
is not an arbitrary process; it is part of the investment decision-making
process and requires a model in itself. Take Apple, for example: if we
compare its performance in 2019 with the broader S&P500 index, we see
that it outperformed the index by nearly 60 percentage points.

Apple vs S&P500
The profile is broadly the same for Amazon, Microsoft and Google, as US
tech was the best-performing sector over 2019. Arbitrarily picking a
stock from that sector as a starting point will materially misrepresent the
characteristics of the investment opportunity set.

2. Portfolio Construction

Controlling risk is as important to a robust investment strategy as
generating returns. If stock selection is the first part of the investment
process, then portfolio construction is the vital next step. Many projects
will suggest a strategy of buying or selling a particular stock, but often
with the assumption that 100% of the potential portfolio will be invested
in that stock. This is rarely the case in practice: a single exposure
leaves an investor vulnerable to a tremendous amount of concentration
risk. Prudently constructed portfolios are well diversified, as
diversification is one of the most important sources of risk control. A
viable machine learning investment strategy should consider both stock
selection and portfolio construction.
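
To make the concentration-risk point concrete, the sketch below compares the volatility of a single-stock portfolio with an equally weighted portfolio of four stocks, using the standard formula sqrt(w' Σ w). The 20% volatility, the 0.3 pairwise correlation and the helper name `portfolio_volatility` are assumptions chosen for illustration, not estimates from real data.

```python
import numpy as np


def portfolio_volatility(weights, cov):
    """Portfolio volatility sqrt(w' * cov * w) for weights w."""
    weights = np.asarray(weights, dtype=float)
    cov = np.asarray(cov, dtype=float)
    return float(np.sqrt(weights @ cov @ weights))


# Hypothetical inputs: four stocks, each with 20% annual volatility
# and a pairwise correlation of 0.3.
n_assets = 4
vol = 0.20
corr = 0.30
corr_matrix = np.full((n_assets, n_assets), corr)
np.fill_diagonal(corr_matrix, 1.0)
cov = corr_matrix * vol ** 2

single_stock = portfolio_volatility([1.0, 0.0, 0.0, 0.0], cov)
equal_weight = portfolio_volatility([0.25] * n_assets, cov)
print(f"single stock: {single_stock:.3f}, equal weight: {equal_weight:.3f}")
```

Under these assumptions the diversified portfolio carries roughly 14% volatility against 20% for the single stock, even though every holding is individually just as risky.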

3. Incorrect Application of Pre-processing


Standard rinse, wash and repeat data pre-processing techniques like
standardization cannot be directly applied to stock prices. The below
plot of the yearly distribution of S&P500 price levels should give some
intuition as to why:

S&P500 Index Price Levels from 2007–2008


Within the standard train/test split paradigm of machine learning,
pre-processing is applied by fitting a transformation on the training
set and then applying it, with the training-set parameters, to the test
set. This rests on the explicit assumption that the training and test
samples are drawn from the same distribution.

We can see clearly that the distribution of stock prices changes from year
to year, meaning that the mean and standard deviation will also change.
This property of financial time series is called non-stationarity and it
remains an open problem in financial prediction. It can also be observed
that the distributions are rarely normal, which makes parametric measures
such as the mean and standard deviation poor summaries of the data.
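
The effect can be sketched on synthetic data. The snippet below standardizes a drifting random walk using the training-set mean and standard deviation, then applies the same parameters to the later test period. The series, the 400/100 split and the drift are assumptions made for illustration; no real index data is used.

```python
import numpy as np

# Synthetic price level: a random walk with upward drift (not real data).
rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(loc=0.5, scale=1.0, size=500))

# Chronological split: the test period comes after the training period.
train, test = prices[:400], prices[400:]

# Standardization parameters are estimated on the training set only.
mu, sigma = train.mean(), train.std()
train_z = (train - mu) / sigma
test_z = (test - mu) / sigma

print(f"train mean z-score: {abs(train_z.mean()):.2f}")
print(f"test mean z-score:  {test_z.mean():.2f}")
```

The training data is centered by construction, but the test data sits well above zero because the price level has moved on, so a model trained on the scaled values sees inputs outside the range it learned from.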

In addition, applying other common procedures like min-max
normalization does not solve this problem, as the lower bound will also
change from year to year and there is no theoretical upper bound on
prices. Practitioners will often apply a price differencing transformation
(stock price returns); however, this does not entirely remove the
unfavorable properties of stock prices.
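
For reference, the differencing transformation mentioned above can be sketched as follows. The four prices are made-up values, and both simple and log returns are shown because either is commonly used.

```python
import numpy as np

# Illustrative closing prices (made-up values).
prices = np.array([100.0, 102.0, 101.0, 105.0])

# Simple returns: percentage change from one day to the next.
simple_returns = prices[1:] / prices[:-1] - 1.0

# Log returns: differences of log prices; they add up across time.
log_returns = np.diff(np.log(prices))

print(simple_returns)
print(log_returns)
```

Returns remove the dependence on the price level, but properties such as changing volatility can remain, which is consistent with the caveat above.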
4. Look Ahead Bias

While nowadays it only takes a couple of lines of code to pull down a
meaningful history of stock and macroeconomic fundamental data, we
need to be cognizant that this data is plagued with look-ahead bias.
Frequently, observations associated with particular dates would not have
actually been available on those dates. For example, stock fundamental
data is quoted as at an effective date, which usually corresponds with
the company’s fiscal calendar. However, this reporting is not released
until months after the effective date, reflecting the time needed for
preparation.

In macroeconomic data this bias arises from revisions to prior period
data, which are sometimes made a quarter after the initial information
was released. This is particularly problematic for short-term trading
strategies. Any project using these datasets should take into
consideration the appropriate lags and revisions.
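
One way to apply such a lag is to join fundamentals to trading dates on the date they became available rather than on their effective date. The sketch below does this with `pandas.merge_asof`. The 60-day reporting lag, the `eps` figures and all dates are hypothetical; real release dates vary by company and should come from the data vendor.

```python
import pandas as pd

# Hypothetical quarterly fundamentals keyed by fiscal effective date.
fundamentals = pd.DataFrame({
    "effective_date": pd.to_datetime(["2019-03-31", "2019-06-30"]),
    "eps": [2.46, 2.18],  # made-up values
})

# Assumed reporting lag: the figures become public 60 days later.
REPORTING_LAG = pd.Timedelta(days=60)
fundamentals["available_date"] = (
    fundamentals["effective_date"] + REPORTING_LAG
).astype("datetime64[ns]")  # merge_asof needs matching key dtypes

trading_days = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-04-15", "2019-06-03", "2019-07-15", "2019-09-02"]
    ).astype("datetime64[ns]"),
})

# For each trading day, take the latest figure already released.
point_in_time = pd.merge_asof(
    trading_days,
    fundamentals.sort_values("available_date"),
    left_on="date",
    right_on="available_date",
)
print(point_in_time[["date", "effective_date", "eps"]])
```

On 2019-07-15 the join still returns the first-quarter figure, whereas a join on the effective date would have leaked the second-quarter figure before it was published under the assumed lag.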
5. The Project is Unfinished!

Many stock prediction projects will conclude like a regular machine
learning project, with the disclosure of a performance metric such as
accuracy or RMSE and a line plot of the test vs training performance,
reasoning that if the two lines are sufficiently close and the error
reasonably low then the project was a success. This premature
conclusion omits a vital step in discovering a successful strategy:
testing the financial outcome. Investing cannot be reduced to a
simple exercise of minimizing an intangible error rate, as the
consequences of being wrong are very real. The final step should be to
backtest the strategy as if it had been held through time and calculate
profit/loss or returns. The tester also needs to consider whether, if the
portfolio had suffered significant drawdowns during testing before
recovering, they would have had the risk tolerance to proceed
with the strategy given the losses.
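
A minimal sketch of such a backtest is shown below. It assumes a simple rule that is not from the original text: hold the stock for the next day when the forecast is above the current price, otherwise stay in cash. Transaction costs are ignored, and the prices, forecasts and function name `backtest_long_flat` are made up for illustration.

```python
import numpy as np


def backtest_long_flat(prices, forecasts):
    """Backtest a long-or-flat rule driven by next-day forecasts.

    forecasts[t] is the prediction of prices[t + 1] made at time t.
    """
    prices = np.asarray(prices, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    if len(forecasts) != len(prices) - 1:
        raise ValueError("need one forecast per day except the last")
    asset_returns = prices[1:] / prices[:-1] - 1.0
    # Long (1.0) when the forecast exceeds today's price, else flat (0.0).
    positions = (forecasts > prices[:-1]).astype(float)
    strategy_returns = positions * asset_returns
    # Equity curve starts at 1.0 so an initial loss counts as drawdown.
    equity = np.concatenate([[1.0], np.cumprod(1.0 + strategy_returns)])
    drawdowns = equity / np.maximum.accumulate(equity) - 1.0
    return {
        "total_return": float(equity[-1] - 1.0),
        "buy_and_hold_return": float(prices[-1] / prices[0] - 1.0),
        "max_drawdown": float(drawdowns.min()),
    }


# Made-up prices and forecasts, for illustration only.
prices = [100.0, 110.0, 99.0, 108.9]
forecasts = [105.0, 120.0, 95.0]
result = backtest_long_flat(prices, forecasts)
print(result)
```

In this toy example the strategy ends about 1% down with a 10% maximum drawdown, while buying and holding gains about 8.9%, which is the kind of gap an error metric alone would not reveal.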

A simple example can be drawn using an Exponentially Weighted Moving
Average (EWMA) strategy, which uses a decaying average of past prices
as a prediction of future prices.
At first glance the EWMA predicts the S&P500 extremely well, but if
we take a closer look at the period of market drawdown early this
year, we see that things aren’t as they appear to be.
Even though the blue and orange lines still appear to be closely
anchored, the EWMA strategy cannot navigate the intra-day volatility
because it only incorporates past information. It looks to be
constantly chasing the true price level, often predicting that the
market will be up when it is actually down, and vice versa. Following
this strategy over this period would have underperformed the S&P500.
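
The lag can be reproduced on a toy series. The sketch below computes an EWMA with a smoothing factor of 0.5 on a steadily falling price and compares the direction it implies with what actually happens. The prices, the smoothing factor and the function name `ewma` are assumptions for illustration only.

```python
import numpy as np


def ewma(prices, alpha=0.5):
    """Exponentially weighted moving average of a price series."""
    prices = np.asarray(prices, dtype=float)
    smoothed = np.empty_like(prices)
    smoothed[0] = prices[0]
    for t in range(1, len(prices)):
        smoothed[t] = alpha * prices[t] + (1.0 - alpha) * smoothed[t - 1]
    return smoothed


# Made-up prices falling by 2 each day.
prices = np.array([100.0, 98.0, 96.0, 94.0, 92.0])
forecast = ewma(prices)  # forecast[t] serves as the prediction for day t + 1

# Direction implied by the forecast vs. the direction that occurred.
predicted_up = forecast[:-1] > prices[:-1]
actual_up = prices[1:] > prices[:-1]
print(predicted_up, actual_up)
```

Because the average trails the falling price, it sits above the current level from the second day onward and therefore signals "up" on each of those days while the series keeps falling.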

Conclusion
Before embarking on a stock prediction project, especially one
that you intend to invest real money in, it pays to do some prior
research on the subject and to understand the data. If the
results are too good to be true (e.g. accuracy materially over
50%), then they probably are. Given the number of participants and
their increasing sophistication, the market is extremely efficient
at price discovery, especially in stocks. Although this does not
preclude the possibility of potential opportunities, it does mean
that finding them requires a bit more effort than an out-of-the-box
algorithm and standard pre-processing techniques.
