Forecast ensemble

Over at GitHub I have put the following:

This introduces a few known and a few new forecast functions. It then builds an ensemble forecast out of 13 models. It has the following steps:

1. Learn all models over training period
2. Predict h periods ahead and build a weighted Bayesian model of the forecasts
3. Retrain the model on training + h to give new forecasts beyond this period (using previous weights)
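
The three steps above can be sketched in a few lines of Python. This is only an illustration of the train / weight / retrain flow: the two toy models and the inverse-MSE weighting are stand-ins I made up for the example, not the 13 models or the Bayesian weight model from the repository, and all function names here are hypothetical.

```python
from statistics import mean

def naive_last(history, h):
    """Hypothetical toy model: repeat the last observed value h times."""
    return [history[-1]] * h

def naive_mean(history, h):
    """Hypothetical toy model: repeat the historical mean h times."""
    return [mean(history)] * h

def inverse_mse_weights(forecasts, actual):
    """Weight each model by 1/MSE on the hold-out window, normalised to sum to 1.

    This is a simple stand-in for the Bayesian weight model, not a copy of it.
    """
    eps = 1e-12  # avoids division by zero for a perfect forecast
    inv = []
    for f in forecasts:
        mse = mean((fi - ai) ** 2 for fi, ai in zip(f, actual))
        inv.append(1.0 / (mse + eps))
    total = sum(inv)
    return [v / total for v in inv]

def ensemble_forecast(series, h, models):
    train, holdout = series[:-h], series[-h:]
    # Steps 1 and 2: fit on the training period, forecast h ahead, derive weights.
    weights = inverse_mse_weights([m(train, h) for m in models], holdout)
    # Step 3: refit on training + h and combine the new forecasts with the old weights.
    new_forecasts = [m(series, h) for m in models]
    combined = [sum(w * f[t] for w, f in zip(weights, new_forecasts))
                for t in range(h)]
    return weights, combined

series = [10, 11, 12, 13, 14, 15, 16, 17]
weights, combined = ensemble_forecast(series, h=2, models=[naive_last, naive_mean])
print(weights, combined)
```

On this upward-trending toy series the last-value model gets most of the weight, because it is closer to the hold-out values than the historical mean.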

It introduces four Bayesian models in Stan:

1. ARMA(2, 1)
2. ARMA(2, 1) with weighting of observations
3. Local linear trend
4. Weight model (e.g. it can model 13 weights on 13 X variables and 10 time steps, which is not possible in a frequentist setup)

Note that most code has tests around the functions. You need to load all scripts to get `forecastEns()` to run.

Till change code

Nowadays supermarket customers can check out themselves and scan products themselves. Many obviously pay by card but I was interested to see how a cash dispensing till would work.

I have written some Python code to simulate the whole setup. We start with a till that has a distribution of coins (1000 1p etc). The key function is change(), which, given a price and the total paid (the difference is rem), has to issue the right coin change. It finds the largest coin that does not exceed rem and then works from largest to smallest to issue correct change. This way we try to avoid running out of small coins. This large-to-small algorithm is very simple and probably close to how humans would do it. One improvement could be to under-sample coins that we have few of in the till.
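
A minimal sketch of such a large-to-small change() could look like this. It is not the original code: it passes the till in as a dictionary instead of using globals, and the starting stock of 1000 of each coin and the error handling are my assumptions for the example.

```python
def change(till, price, paid):
    """Greedy large-to-small change. `till` maps coin value in pence to count.

    Returns a dict of coins issued and updates `till` in place. Raises
    ValueError (leaving `till` untouched) if exact change cannot be made.
    """
    rem = paid - price
    if rem < 0:
        raise ValueError("customer has not paid enough")
    issued = {}
    for coin in sorted(till, reverse=True):
        # Use as many of this coin as are needed and available.
        n = min(rem // coin, till[coin])
        if n > 0:
            issued[coin] = n
            rem -= coin * n
    if rem != 0:
        raise ValueError("till cannot make exact change")
    for coin, n in issued.items():
        till[coin] -= n
    return issued

# Assumed starting stock: 1000 of every coin from 1p to 200p.
till = {200: 1000, 100: 1000, 50: 1000, 20: 1000, 10: 1000, 5: 1000, 2: 1000, 1: 1000}
issued = change(till, price=163, paid=500)
print(issued)
```

Because the loop caps each coin at what the till holds, it falls back to smaller coins automatically when a larger one has run out.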

The trickiest bit (draw()) is simulating which coins a customer would hand over. I had to add some randomness to it to stock up small coins (10% of the time the customer gives exact change in 1p, 2p or 5p coins). Without this part the till will run out of change very quickly.

I work a lot with global variables in this example so that I don’t have to handle function inputs and outputs so much.

Dynamic programming example

In this example I want to show the principles of dynamic programming and recursion in a simple example. Imagine you have coins of different values and you want to count the number of ways that those coins can make up 200 pennies/cents. To solve this problem we use a recursive function which grows like a tree. You start with the biggest coins first.

Define a function add() which loops through all coins. A coin is only considered if the current set is empty or the coin is less than or equal to the last coin in the current set. If the sum of the set plus the coin is less than 200, add the coin to the set and pass the new set to add(). If the sum equals 200, add the coin to the set and add the new set to the list of solutions. If the sum exceeds 200, do nothing. Finally, print the number of solutions, which is 73682.

This is a recursive setup where add() references itself: you pass a current unfinished set to this function.

Here is the code in Python (v3) – it takes 7 seconds on repl.it.
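
As a rough sketch of the recursion described above (not the original script), add() could be written as follows. The coin list is an assumption: I use the eight UK coins from 1p to 200p, which is the set that gives 73682 ways. The sketch also passes a running total along instead of re-summing the set each time.

```python
COINS = [200, 100, 50, 20, 10, 5, 2, 1]  # assumed UK coin values in pence
TARGET = 200

def add(current, total, solutions, coins=COINS, target=TARGET):
    """Extend `current` (a non-increasing list of coins) in every valid way."""
    for coin in coins:
        # Only allow coins no larger than the last one, so each combination
        # is generated exactly once, in largest-to-smallest order.
        if not current or coin <= current[-1]:
            new_total = total + coin
            if new_total < target:
                add(current + [coin], new_total, solutions, coins, target)
            elif new_total == target:
                solutions.append(current + [coin])
            # new_total > target: dead branch, do nothing

def count_ways(target=TARGET, coins=COINS):
    solutions = []
    add([], 0, solutions, coins, target)
    return len(solutions)

# Small case that is easy to check by hand: 10p from 10p, 5p, 2p and 1p coins.
print(count_ways(10, [10, 5, 2, 1]))  # → 11
```

Calling count_ways() with the defaults runs the full 200p problem; it takes a few seconds because every solution is stored as a list.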

Matching sets/distributions

I was interested to see how to match 2 different sets with slightly different distributions. This can be relevant when you want to test the differences between 2 groups.

Assume you have a long set (1) and a short set (2). First I sort set 2 by its values. I also estimate the mean difference between consecutive sorted values (mean diff).

My algorithm passes once through set 1 and tries to find a match in set 2 for every set 1 item. If the current difference/distance is better than the previous or the next then we have a match (because it’s sorted), and we remove the matched item from set 2. I added a condition where the difference has to be within X multiples of the mean difference. This ensures that I don’t match some remaining large value/distance just because few candidate items remain. I also added another break condition: if distances get bigger, stop.
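
The idea can be sketched as below. Note that this is a simplified variant, not the code behind the link: instead of scanning set 2 with break conditions it uses a binary search (`bisect`) to jump straight to the nearest remaining value. The function name and the default for X are made up for the example.

```python
from bisect import bisect_left

def match_sets(set1, set2, x=3.0):
    """Greedily pair each item of set1 with its nearest unused item of set2.

    A pair is only accepted if the distance is at most x times the mean
    difference between consecutive sorted values of set2.
    """
    pool = sorted(set2)
    if len(pool) < 2:
        mean_diff = 0.0
    else:
        # Mean gap between consecutive sorted values = range / (n - 1).
        mean_diff = (pool[-1] - pool[0]) / (len(pool) - 1)
    max_dist = x * mean_diff
    matches = []
    for a in set1:
        if not pool:
            break  # nothing left to match against
        i = bisect_left(pool, a)
        # The nearest value sits just below or just above the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(pool)]
        j = min(candidates, key=lambda k: abs(pool[k] - a))
        if abs(pool[j] - a) <= max_dist:
            matches.append((a, pool.pop(j)))  # remove the matched item from set 2
    return matches

print(match_sets([1.0, 2.1, 9.0, 50.0], [2.0, 1.2, 9.5], x=1.0))
```

Removing items with pop() from a Python list is linear in the list length, so for very large sets a different container would be worth considering.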

I matched 93% in my test. It takes 6% of n1*n2 possible iterations in my test with n1=60k and n2=10k.

The resulting distribution of the matches is an average of distributions 1 and 2. This means you can no longer assume that the matched items represent the full sets 1 or 2. However, you can compare matches to each other.

The code can be seen and run here: https://repl.it/BU9Z

Finding number sequences in JS

I have played around with finding number sequences of the form `a+b*c^d`, where any of a, b, c or d can be the iterator x. The input is the first five numbers of the sequence. The JS code can be found on JSFiddle. It takes about 2.5 seconds to find all equations. This obviously doesn’t cover all interesting sequences, but might be a nice example of how to use the `eval()` function for this.
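
To illustrate the brute-force idea, here is a small sketch in Python rather than JS (to keep one language across these examples). It is not the fiddle's code: the range of constants (0 to 5), starting the iterator at x = 1, and the function name are all assumptions of mine.

```python
from itertools import product

def find_equations(seq, constants=range(0, 6)):
    """Return expressions a+b*c**d (Python syntax) that reproduce `seq` for x = 1, 2, ..."""
    slots = [str(k) for k in constants] + ["x"]
    found = []
    for a, b, c, d in product(slots, repeat=4):
        if "x" not in (a, b, c, d):
            continue  # skip expressions that do not use the iterator at all
        expr = f"{a}+{b}*{c}**{d}"
        code = compile(expr, "<expr>", "eval")
        # eval() mirrors the JS approach; the expressions are generated here,
        # never taken from user input.
        if all(eval(code, {"__builtins__": {}}, {"x": x}) == target
               for x, target in enumerate(seq, start=1)):
            found.append(expr)
    return found

print(find_equations([1, 4, 9, 16, 25]))
```

The same sequence is usually found in several equivalent forms (for example x squared can also be written as x times x to the power 1), so the result is a list rather than a single equation.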

Conditional formatting using JS

I have been writing some JavaScript which reads data, creates the `<table>` HTML tags and then does conditional formatting of given numbers (based on five equal-sized bins). Let me know if you find any bugs.
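
The binning logic can be sketched as follows, in Python rather than JavaScript to match the other examples on this page. This is an illustration only: I assume "equal-sized" means equal-width bins between the minimum and maximum value, and the colour scale and function names are made up.

```python
from html import escape

def bin_index(value, lo, hi, n_bins=5):
    """Map value to a bin 0..n_bins-1 using equal-width bins over [lo, hi]."""
    if hi == lo:
        return 0  # all values identical: put everything in the first bin
    i = int((value - lo) / (hi - lo) * n_bins)
    return min(i, n_bins - 1)  # the maximum falls in the top bin

def table_html(rows, colours=("#d73027", "#fc8d59", "#ffffbf", "#91cf60", "#1a9850")):
    """Build a <table> string, colouring numeric cells by their bin."""
    numbers = [c for row in rows for c in row if isinstance(c, (int, float))]
    lo, hi = (min(numbers), max(numbers)) if numbers else (0, 0)
    out = ["<table>"]
    for row in rows:
        cells = []
        for c in row:
            if isinstance(c, (int, float)):
                colour = colours[bin_index(c, lo, hi, len(colours))]
                cells.append(f'<td style="background:{colour}">{c}</td>')
            else:
                cells.append(f"<td>{escape(str(c))}</td>")
        out.append("<tr>" + "".join(cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)

print(table_html([["north", 12], ["south", 47], ["east", 80], ["west", 100]]))
```

With equal-width bins a single outlier can push most values into one colour; equal-count bins (quintiles) would be the alternative if that is a concern.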