Unhist
Purpose
Create a vector whose values follow an empirical distribution.
Synopsis
- d = unhist(x,y,n);
Description
Given the x and y values of an empirical distribution (histogram), where y is the number of times each value of x is observed, UNHIST returns a vector whose length is as close to n as possible and whose values follow the provided distribution as closely as possible.
UNHIST is useful for deriving statistical information on the distribution of x values in a given x,y relationship (for example, the output can be passed to SUMMARY to get information on the empirical distribution in x), or when a set of values is needed that follows a certain empirical distribution.
The output, d, from:
d = unhist(x,y,n);
will be a vector close to length n such that the command:
[hy, hx] = hist(d,x);
would give an hy that approximates y up to a scale factor.
The values in y are divided among n bins, and negative values in y are ignored. Note that the length of the output vector may differ from n because of rounding when the bins are created.
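The scaling and rounding described above can be sketched as follows. This is a hypothetical illustration of the idea, not the toolbox source: the variable name counts and the use of round and repelem are assumptions made for the sketch.

```matlab
% Hypothetical sketch of the idea behind UNHIST; not the toolbox source.
x = 1:15;
y = [1 3 4 2 0 0 1 5 8 9 7 4 2 1 0];
n = 100;

y = max(y, 0);                      % ignore negative frequencies
counts = round(y ./ sum(y) .* n);   % scale y to sum to about n, then round
d = repelem(x, counts);             % repeat each x(i) counts(i) times

numel(d)                            % close to n; rounding can shift it slightly
```

Because each scaled frequency is rounded independently, the rounded counts need not sum to exactly n, which is why the output length is only approximately n.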
Inputs
- x = vector of bin centers.
- y = vector of the frequency of occurrence of each bin in x.
- n = (optional) target length for the output vector. This also defines the resolution over which y is divided: a larger n gives a finer resolution of y, so that the hy output from [hy,hx] = hist(d,x) is a closer approximation of y. Default value = 1000. The actual output length may vary because of rounding of the scaled y values.
Outputs
- d = vector of length close to n in which each x_i value occurs in proportion to the corresponding y_i.
Example
Create example x and y, then use UNHIST to create a larger (n=100) sampling which follows the same distribution:
x = 1:15;
y = [1 3 4 2 0 0 1 5 8 9 7 4 2 1 0];
d = unhist(x,y,100);
View a histogram of the new larger set:
hist(d,x);
Get the bin counts of the larger sample set:
[hy, hx] = hist(d,x);
and a summary of the distribution:
summary(d');
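Since hy matches y only up to scale, one way to compare them is to convert both to proportions. The sketch below is an assumption-laden illustration: the names p_in and p_out are hypothetical, and when UNHIST is not on the path a simple stand-in (rounded, scaled repeats of x) is used in its place so the block still runs.

```matlab
x = 1:15;
y = [1 3 4 2 0 0 1 5 8 9 7 4 2 1 0];

if exist('unhist', 'file')
    d = unhist(x, y, 100);
else
    % Stand-in for UNHIST (an assumption, not the toolbox code):
    % repeat each x(i) in proportion to y(i).
    d = repelem(x, round(y ./ sum(y) .* 100));
end

[hy, hx] = hist(d, x);          % counts at the original bin centers
p_in  = y(:)  ./ sum(y);        % input distribution as proportions
p_out = hy(:) ./ sum(hy);       % recovered distribution as proportions
max(abs(p_out - p_in))          % shrinks as n grows relative to numel(x)
```

Increasing n should reduce the largest difference between the two sets of proportions, at the cost of a longer output vector.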