Investment evaluation as a measure of forecast success

Please see my markdown hosted on GitHub, which shows some surprising results regarding investments based on forecasts versus uncorrelated data. It turns out that uncorrelated data can yield better returns.

Forecast ensemble

Over at GitHub I have put up the following:

It introduces a few known and a few new forecast functions, and then builds an ensemble forecast out of 13 models. It has the following steps:

1. Learn all models over the training period
2. Predict h periods ahead and build a weighted Bayesian model of the forecasts
3. Retrain the models on training + h to produce new forecasts beyond this period (using the previous weights)
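
The three steps can be sketched roughly as follows. This is a hypothetical Python toy, not the original R `forecastEns()`: it stands in simple inverse-MSE weighting for the post's Bayesian weight model, and the two "models" (`naive`, `drift`) are made up for illustration.

```python
def forecast_ens(train, h, models):
    # Step 1: learn all models on the training period, holding out the last h points
    fit_part, holdout = train[:-h], train[-h:]
    preds = [m(fit_part, h) for m in models]
    # Step 2: weight each model's h-step forecast (inverse MSE here;
    # the post uses a Bayesian weight model instead)
    mse = [sum((p - a) ** 2 for p, a in zip(pr, holdout)) / h + 1e-9 for pr in preds]
    inv = [1.0 / e for e in mse]
    w = [i / sum(inv) for i in inv]
    # Step 3: retrain on training + h and combine with the previous weights
    new_preds = [m(train, h) for m in models]
    return [sum(wi * p[k] for wi, p in zip(w, new_preds)) for k in range(h)]

def naive(y, h):   # toy model: repeat the last observed value
    return [y[-1]] * h

def drift(y, h):   # toy model: extrapolate the average slope
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + slope * (i + 1) for i in range(h)]

y = [float(t) for t in range(10)]   # a simple upward trend 0..9
print(forecast_ens(y, 3, [naive, drift]))
```

On this trending toy series the drift model fits the holdout almost perfectly, so it dominates the weights and the ensemble continues the trend.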

It introduces four Bayesian models in Stan:

1. ARMA(2, 1)
2. ARMA(2, 1) with weighting of observations
3. Local linear trend
4. Weight model (e.g. it can estimate 13 weights on 13 X variables with only 10 time steps, which is not possible in a frequentist setup)
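
To give some intuition for point 4: with more weights than time steps, ordinary least squares has no unique solution, but a Gaussian prior on the weights turns the MAP estimate into ridge regression, whose normal equations (X'X + lambda*I) w = X'y are always solvable. The pure-Python toy below (not the post's Stan code; all numbers made up) uses 3 weights and 2 observations.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def ridge(X, y, lam=1.0):
    """MAP weights under a Gaussian prior: solve (X'X + lam*I) w = X'y."""
    n, p = len(X), len(X[0])
    XtX = [[sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    return solve(XtX, Xty)

# 3 weights but only 2 "time steps": under-determined for OLS, fine with a prior.
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]
print(ridge(X, y))   # -> [0.125, 0.625, 0.75]
```

Without the `lam` term on the diagonal, `X'X` here is rank 2 in a 3-dimensional space and the elimination would hit a zero pivot; the prior is exactly what makes 13 weights on 10 time steps estimable.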

Note that most of the code has tests around the functions. You need to load all scripts to get `forecastEns()` to run.

GCSE mean imputation

Many GCSE results are reported in a very compact form. I have written some R code which uses simulation to translate grade brackets into numeric grades. You can read it here. I specifically look at grades and their dispersion by gender.
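
The idea can be sketched as follows. The original is R; this Python toy assumes made-up bracket bounds and pupil counts, and draws uniformly inside each bracket. It is an illustration of the simulation approach, not the actual imputation code.

```python
import random

random.seed(1)

# grade bracket -> (low point score, high point score, pupils); illustration data
brackets = {
    "A*-A": (7.0, 8.0, 20),
    "B-C":  (5.0, 7.0, 50),
    "D-G":  (1.0, 5.0, 30),
}

def simulate_grades(brackets):
    """Impute a numeric grade for each pupil by sampling within their bracket."""
    draws = []
    for low, high, n in brackets.values():
        draws += [random.uniform(low, high) for _ in range(n)]
    return draws

grades = simulate_grades(brackets)
mean = sum(grades) / len(grades)
var = sum((g - mean) ** 2 for g in grades) / len(grades)
print(round(mean, 2), round(var, 2))
```

Running this separately for boys and girls would give comparable means and dispersions even though the published tables only report bracket percentages.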

Matching sets/distributions

I was interested to see how to match two different sets with slightly different distributions. This can be relevant when you want to test for differences between two groups.

Assume you have a long set (1) and a short set (2). First I sort set 2 by its values. I also estimate the mean difference between consecutive sorted values (the mean diff).

My algorithm passes once through set 1 and tries to find a match for every set 1 item in set 2. Because set 2 is sorted, if the current distance is smaller than both the previous and the next candidate's, we have a match, and we remove the matched item from set 2. I added a condition that the distance must lie within X multiples of the mean diff; this ensures that I don't match some remaining large distance just because few candidate items remain. I also added another break condition: if the distances keep growing, stop.
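
A minimal Python sketch of this pass (the original code is at the repl.it link below; function name, default `x_mult`, and the toy data here are my own, and the "distances keep growing" stop condition is omitted for brevity):

```python
def match_sets(set1, set2, x_mult=3.0):
    s1, s2 = sorted(set1), sorted(set2)
    # mean gap between consecutive sorted set-2 values (telescoping sum)
    mean_diff = (s2[-1] - s2[0]) / (len(s2) - 1)
    cap = x_mult * mean_diff
    matches, j = [], 0
    for v in s1:
        if not s2:
            break
        j = min(j, len(s2) - 1)
        # walk right while the next set-2 candidate is strictly closer
        while j + 1 < len(s2) and abs(s2[j + 1] - v) < abs(s2[j] - v):
            j += 1
        # accept only if within X multiples of the mean gap
        if abs(s2[j] - v) <= cap:
            matches.append((v, s2.pop(j)))   # matched items leave set 2
    return matches

print(match_sets([1.0, 2.1, 2.9, 10.0], [1.05, 3.0, 5.0]))
```

Because both sets are sorted and the pointer into set 2 only moves forward (apart from removals), each set 1 item inspects only a few candidates, which is why the pass needs a small fraction of the n1*n2 possible comparisons.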

In my test I matched 93% of items, using only 6% of the n1*n2 possible comparisons, with n1 = 60k and n2 = 10k.

The resulting distribution of the matches is an average of distributions 1 and 2. This means you can no longer assume that the matched items represent the full set 1 or set 2. However, you can compare the matches to each other.

The code can be seen and run here: https://repl.it/BU9Z