From the research paper authored by Peter Brown and Robert Mercer while at the IBM Watson
Research Center in the early 90s.
http://www.naxa.com/downloads/J93-2003.pdf
It seems the initial 5 models are based on Bayes' theorem for probability analysis. They use a
series of analyses to compare one language against another (e.g., French vs. English), looking for
patterns in how the words of the two languages connect to each other. This is for translation purposes.
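Concretely, the decomposition the paper starts from is Bayes' theorem applied to translation: to render a French sentence f into English, pick the English sentence e that maximizes the product of a language-model term and a translation-model term,

\hat{e} \;=\; \operatorname*{argmax}_{e} \Pr(e \mid f) \;=\; \operatorname*{argmax}_{e} \Pr(e)\,\Pr(f \mid e),

and the five models are successively richer ways of estimating the translation term Pr(f | e).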
Pg 7: We generally follow the common convention of using uppercase letters to denote random
variables and the corresponding lowercase letters to denote specific values that the random
variables may take. We have already used l and m to represent the lengths of the strings e and f,
and so we use L and M to denote the corresponding random variables.
As stated here by James Baker, https://www.quora.com/What-are-the-investment-strategies-of-
James-Simons-Renaissance-Technologies-I-understand-he-employs-complex-mathematical-
models-along-with-statistical-analyses-to-predict-non-equilibrium-changes
We need to understand the corresponding random variables. There is mention of Lagrange
multipliers, used to maximize the likelihood subject to the constraint that the probabilities sum to one.
Auxiliary functions are used as well to derive the parameter reestimation formulae from
extrema/maxima analysis. Conditional probabilities are used throughout. On pg 276 there are
translation, distortion, and fertility probabilities.
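Those page 276 probabilities belong to Model 3, where an alignment's score factors into fertility, translation, and distortion terms. A rough sketch of that factorization (my own simplification; it omits the null-word and fertility-factorial terms of the actual model, and the parameter tables n, t, d are assumed to be plain dictionaries):

from math import prod

def model3_alignment_score(e_words, f_words, alignment, n, t, d):
    # alignment[j-1] = i means French position j links to English position i
    # (1-based, no null word in this simplified sketch).
    l, m = len(e_words), len(f_words)
    # fertility: how many French words each English word generates
    phi = [alignment.count(i) for i in range(1, l + 1)]
    fertility = prod(n.get((phi[i - 1], e_words[i - 1]), 1e-12) for i in range(1, l + 1))
    translation = prod(t.get((f_words[j - 1], e_words[alignment[j - 1] - 1]), 1e-12)
                       for j in range(1, m + 1))
    distortion = prod(d.get((j, alignment[j - 1], l, m), 1e-12) for j in range(1, m + 1))
    return fertility * translation * distortion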
Page 282: Model 5 is a powerful but unwieldy ally in the battle to align translations. It must be
led to the battlefield by its weaker but more agile brethren Models 2, 3, and 4. In fact, this is the
raison d'être of these models. To keep them aware of the lay of the land, we adjust their
parameters as we carry out iterations of the EM algorithm for Model 5. That is, we collect counts
for Models 2, 3, and 4 by summing over alignments as determined by the abbreviated S
described above, using Model 5 to compute Pr(a | e, f). Although this appears to increase the
storage necessary for maintaining counts as we proceed through the training data, the extra
burden is small because the overwhelming majority of the storage is devoted to counts for t(f | e),
and these are the same for Models 2, 3, 4, and 5.
Page 283 shows the number of translations processed, which runs into the millions, used to
determine a small subset of useful words. The EM algorithm is used with maximum likelihood.
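The EM procedure is easiest to see for Model 1. The following is my own minimal sketch of one iteration (not the authors' code): the E-step spreads each French word's unit of count over the English words in proportion to the current t(f | e), and the M-step renormalizes so the probabilities sum to one for each English word.

from collections import defaultdict

def model1_em_iteration(bitext, t):
    # bitext: list of (english_words, french_words) sentence pairs.
    # t: dict mapping (f, e) -> current t(f|e), e.g. initialized uniformly
    #    over all co-occurring pairs.
    count = defaultdict(float)   # expected count of each (f, e) link
    total = defaultdict(float)   # normalizer for each English word e
    for e_words, f_words in bitext:
        for f in f_words:
            z = sum(t[(f, e)] for e in e_words)          # E-step normalizer
            for e in e_words:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    # M-step: t(f|e) = count(f, e) / sum over f of count(f, e)
    return {(f, e): c / total[e] for (f, e), c in count.items()}

Repeating this pass over the bitext is the whole of Model 1 training; the later models take these estimates as their starting point.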
Pg 283: Although the entire t array has 2,437,020,096 entries, and we need to store it twice, once
as probabilities and once as counts, it is clear from the preceding remarks that we need never
deal with more than about 25 million counts or about 12 million probabilities. We store these
two arrays using standard sparse matrix techniques. We keep counts as pairs of bytes, but allow
for overflow into 4 bytes if necessary. In this way, it is possible to run the training program in
less than 100 megabytes of memory. While this number would have seemed extravag…
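The two-bytes-with-overflow trick in that passage can be illustrated as follows. This is only an illustration, not the authors' data structure: real EM counts are fractional, so they would have to be scaled to fixed-point integers for this to apply.

import array

class PackedCounts:
    # Sparse counter keeping most counts in 2 bytes, spilling into 4 bytes on overflow.
    def __init__(self):
        self.index = {}                  # (f, e) -> slot number
        self.small = array.array('H')    # unsigned 16-bit counts
        self.overflow = {}               # slot -> wider count when 16 bits is not enough

    def add(self, key, amount=1):
        slot = self.index.setdefault(key, len(self.small))
        if slot == len(self.small):
            self.small.append(0)
        if slot in self.overflow:
            self.overflow[slot] += amount
        elif self.small[slot] + amount <= 0xFFFF:
            self.small[slot] += amount
        else:                            # overflow into the wider store
            self.overflow[slot] = self.small[slot] + amount
            self.small[slot] = 0

    def get(self, key):
        slot = self.index.get(key)
        if slot is None:
            return 0
        return self.overflow.get(slot, self.small[slot])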
Page 293 speaks of Viterbi algorithm training:
We have already used this algorithm successfully as a part of a system to assign senses to
English and French words on the basis of the context in which they appear (…
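For Models 1 and 2 the Viterbi (most probable) alignment can be found exactly, because the alignment probability factors over French positions: each position independently picks its best English connection. A minimal sketch for Model 2 (dictionary layout and names are my own assumptions):

def viterbi_alignment_model2(e_words, f_words, t, a):
    # t: dict (f, e) -> t(f|e); a: dict (i, j, l, m) -> alignment probability a(i | j, l, m).
    # Position 0 is the null word.
    l, m = len(e_words), len(f_words)
    e_with_null = ["<null>"] + list(e_words)
    alignment = []
    for j, f in enumerate(f_words, start=1):
        best_i = max(range(l + 1),
                     key=lambda i: t.get((f, e_with_null[i]), 0.0) * a.get((i, j, l, m), 0.0))
        alignment.append(best_i)       # alignment[j-1] = i
    return alignment

For Models 3-5 this shortcut no longer works, which is why the paper resorts to hill climbing from an approximate alignment.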
Page 297 table of notation
Appendix B has a summary of the models. Note especially the Log-Likelihood Objective Function.
Note page 300 on iterative improvement.
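The objective being improved iteratively is the log-likelihood of the training corpus under the model; written generically (this notation is mine, not the appendix's),

\psi(\theta) \;=\; \sum_{s=1}^{S} \log \Pr_{\theta}\!\left(\mathbf{f}^{(s)} \mid \mathbf{e}^{(s)}\right),

and each EM iteration is guaranteed not to decrease it.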
Page 301: Parameter Reestimation Formulae: In order to apply these algorithms, we need to
solve the maximization problems of Steps 2 and 4. For the models that we consider, we can do
this explicitly.
Equation (73) is useful in computations since it involves only O(lm) arithmetic operations,
whereas the original sum over alignments (72) involves O(l^m) operations.
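The saving comes from rewriting the sum over all alignments as a product of per-position sums. Here is my own sketch of both forms for Model 1's likelihood (it illustrates the rearrangement rather than reproducing equations (72)-(73) verbatim):

from itertools import product
from math import prod

def pr_f_given_e_naive(e_words, f_words, t, epsilon=1.0):
    # Sum over all (l+1)^m alignments -- exponential, shown only for comparison.
    l, m = len(e_words), len(f_words)
    e_with_null = ["<null>"] + list(e_words)
    total = 0.0
    for alignment in product(range(l + 1), repeat=m):
        total += prod(t.get((f_words[j], e_with_null[alignment[j]]), 0.0) for j in range(m))
    return epsilon / (l + 1) ** m * total

def pr_f_given_e_fast(e_words, f_words, t, epsilon=1.0):
    # Same value via the product-of-sums rearrangement: O(l*m) operations.
    l, m = len(e_words), len(f_words)
    e_with_null = ["<null>"] + list(e_words)
    return epsilon / (l + 1) ** m * prod(
        sum(t.get((f, e), 0.0) for e in e_with_null) for f in f_words)

Both functions return the same number; only the second is usable at realistic sentence lengths.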
Other styles:
Jan Dil answer:
According to what I read on Bloomberg (Inside a Moneymaking Machine Like No Other ) and
the responses here, RenTech makes on average 41%/year since 1988 with a maximum drawdown
of -4.1% (Bloomberg) to 70%/year (reported here in one of the answers) on, say, $50 Billion,
using some 200,000 trades/day. This daily volume amounts to about 1.4% of Nasdaq’s daily
trading, which sounds reasonable to me. There were times that RenTech occupied 10% of
Nasdaq’s trading volume. They have a reservoir of stocks to trade from and which they know
how to rank (StatArb) in mean-reversion mode, and where they find the volatility to produce the
kind of risks and returns of 70%/year consistently….
The reservoir that will do that is a collection of some 1300 Wall Street stocks with daily-dollar
volumes in excess of $1 Million. The ranking system that is able to rank each quarter the 6 top
ranks from this reservoir of 1300 is called Ergodic ranking. You don’t need any breaking news,
TA, or FA. You just need the historical end-of-day data for this ranking system. CSI is our data provider
of choice, and we use Finance.Yahoo as a reference. The portfolio returns are considered as a
weighted sum of the individual asset returns. The weighting system is Kelly’s system where
portfolio weightings are computed each quarter so as to maximize annual returns. You do this for
10 or more years and you assume, with Jim Simons, that past performance is your best predictor
of success.
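The answer does not say how the quarterly Kelly weightings are actually computed, so the following is only a textbook sketch, not Jan Dil's or RenTech's procedure: under a Gaussian approximation of returns, the growth-optimal (Kelly) weights are the inverse covariance matrix applied to the mean excess returns.

import numpy as np

def kelly_weights(excess_returns, fraction=1.0):
    # excess_returns: array of shape (num_periods, num_assets).
    # Growth-optimal weights under a Gaussian approximation: w* = Sigma^-1 mu.
    # fraction < 1 gives the usual fractional-Kelly risk reduction.
    mu = excess_returns.mean(axis=0)              # per-asset mean excess return
    sigma = np.cov(excess_returns, rowvar=False)  # covariance matrix of returns
    return fraction * np.linalg.solve(sigma, mu)

# Re-estimated each quarter, e.g.: weights = kelly_weights(trailing_daily_returns, 0.5)
# (trailing_daily_returns is a hypothetical array holding the past quarter's daily returns).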
Other weighting systems like factoring are possible too. As in signal and RADAR processing,
you may assume a propagation model and/or a probability density function to hold. In addition to
the portfolio weightings, factoring, propagation modelling, and probability density functions just
add fitting parameters that need to be computed. Computing these parameters implies additional
assumptions and fictitious information. With an increasing number of parameters, the CPU load
usually increases quadratically, as do the chances of overfitting. Ergodic ranking reduces this to a
linear dependence, but this too implies an extra assumption that needs to be tested. The proof is
always in the pudding.
https://www.quora.com/What-are-the-investment-strategies-of-James-Simons-Renaissance-
Technologies-I-understand-he-employs-complex-mathematical-models-along-with-statistical-
analyses-to-predict-non-equilibrium-changes
Note
https://quantlabs.net/blog/2019/02/c-source-code-and-research-papers-from-renaissance-
technologies/
See the latest of this C++ metaprogramming library:
https://github.com/tjolsen/tmpl
Open-source project from RenTech in 2008:
https://github.com/silpol/mrsync
https://nypost.com/2017/06/21/regulators-probing-legendary-hedge-funds-secret-trading-code/
https://whalewisdomalpha.com/renaissance-technologies-13f-strategy/
https://news.efinancialcareers.com/ca-en/3002461/pay-renaissance-technologies
Sample holdings
https://www.sec.gov/Archives/edgar/data/1037389/000103738910000308/0001037389-10-
000308.txt
Rentech software used
https://quant.stackexchange.com/questions/30509/what-is-advent-softwares-geneva
https://www.forexfactory.com/printthread.php?t=434829&pp=40&page=41
Howard Morgan, President, Renaissance Technologies Corp.,
"The Microcomputer and Decision Support," Computerworld,
Aug. 19, 1985, pp. 39-45.
https://apps.dtic.mil/dtic/tr/fulltext/u2/a217408.pdf
https://news.ycombinator.com/item?id=16649002
In other words, they're a step above traditional "fundamental" hedge funds, but they focus on the
wrong problem (but not for lack of trying!). In contrast, the truly successful quant funds have
automated the data processing and feature extraction pipeline end to end. The data is a pure
abstraction to them. They don't bother with forming hypotheses and trying to find data to test
them, they allow their algorithms to actively discover new correlations from the ground up. So
many quantitative funds advertise how much data they work with, and how they have all these
exotic sources of data at their disposal...but the data does not matter. The models for the data do
not matter. The mathematics of efficiently processing that data are what matters….
In most cases, a trading strategy is sufficiently multidimensional that any particular set of data
can be completely public. Exclusive data is helpful, but not required. In many cases people
become too dependent on exclusive data and lose sight of the methodology.
https://news.efinancialcareers.com/uk-en/298218/renaissance-technologies-secrets-to-quant-
hedge-funds-vc-career-success
https://www.fxleaders.com/forex-signals/forex-signals-articles/algo-trading-rentec/
https://www.afr.com/technology/inside-the-medallion-fund-a-74-billion-moneymaking-machine-
like-no-other-20161122-gsuohh
Peter Brown and Robert Mercer audio speech in 2013:
https://cs.jhu.edu/~post/bitext/
Old Jim Simons interview from 2000
https://www.institutionalinvestor.com/article/b151340bp779jn/the-secret-world-of-jim-
simons#.WC87Y7IrIuU