I've written a MATLAB function that properly normalises a histogram given a lower and an upper boundary on the data. The main reason for this is that if you have a relatively small number of bins and normalise with the trapezium rule, using only the area under the bin centres, you'll often slightly underestimate the area, because half of each end bin is left out. The effect is particularly pronounced when the histogram is up against a hard bound at which its value is still large, e.g. imagine a histogram of positive data peaked close to zero. My function adds extra points at the edges of the histogram to try to mitigate this effect. Here it is (hopefully self-explanatory):
function [n, x] = histnormbounds(y, m, low, high)
% [n, x] = histnormbounds(y, m, low, high)
%
% Create a normalised histogram (normalised using the trapezium rule to
% calculate the area). The low and high values are the boundaries of the
% histogrammed data, and ensure that the histogram gets properly normalised
% if close to the edges of the boundaries. m is the number of histogram
% bins or a vector of bin centres.

% set default bounds if none are set (low = -inf and high = inf)
if ~exist('low', 'var')
    low = -inf;
end
if ~exist('high', 'var')
    high = inf;
end

% check that data is within bounds
if min(y) < low || max(y) > high
    error('Data is outside of bounds');
end

% get histogram
if isscalar(m)
    if m < 2
        error('Need more points in histogram');
    end
    [n, x] = hist(y, m);
else
    if length(m) < 2
        error('Need more points in histogram');
    end
    n = hist(y, m);
    x = m;
end

% work with row vectors so that the concatenations below are valid
n = n(:).';
x = x(:).';

botbinwidth = x(2)-x(1);
topbinwidth = x(end)-x(end-1); % just in case x isn't uniform

% Lower edge. If the first point is more than a bin width from the boundary
% then add a zero to the histogram edge. If it is closer to the boundary
% than the bin width then set a new point on the boundary with a value
% linearly extrapolated from the gradient of the adjacent points. The two
% cases are exclusive, so a padded zero is never extrapolated from.
if x(1)-botbinwidth > low
    n = [0, n];
    x = [x(1)-botbinwidth, x];
elseif x(1)-botbinwidth < low
    dx = x(1) - low;
    x = [low, x];
    dn = n(2)-n(1);
    nbound = n(1) - (dn/botbinwidth)*dx;
    n = [nbound, n];
end

% Upper edge, handled in the same way
if x(end)+topbinwidth < high
    n = [n, 0];
    x = [x, x(end)+topbinwidth];
elseif x(end)+topbinwidth > high
    dx = high - x(end);
    x = [x, high];
    dn = n(end)-n(end-1);
    nbound = n(end) + (dn/topbinwidth)*dx;
    n = [n, nbound];
end

% normalise n by area
n = n/trapz(x, n);
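To get a feel for the size of the effect, here is a rough sketch using made-up test data (unit-exponential samples, which are positive and peaked near zero). The data, the seed and the choice of 10 bins are my own assumptions for illustration only. The first part needs nothing but built-in functions and shows how much area the trapezium rule drops for uniform bins; the second part only runs if the function above has been saved as histnormbounds.m somewhere on the path.

```matlab
rng(1);                         % fix the seed so the run is repeatable
y = -log(rand(1, 5000));        % unit-exponential samples: positive, peaked near zero

[n, x] = hist(y, 10);           % raw counts and bin centres
w = x(2) - x(1);                % bin width (uniform for a scalar bin count)

areaBars  = sum(n) * w;         % area of the histogram drawn as bars
areaTrapz = trapz(x, n);        % area under the polyline through the bin centres

% for uniform bins the trapezium rule drops half of each end bin
shortfall = w * (n(1) + n(end)) / 2;
fprintf('bars: %.1f  trapz: %.1f  shortfall: %.1f\n', areaBars, areaTrapz, shortfall);

% bounded version, assuming the function above is saved as histnormbounds.m
if exist('histnormbounds', 'file')
    [nb, xb] = histnormbounds(y, 10, 0, inf);
    fprintf('area after histnormbounds: %.6f\n', trapz(xb, nb));
    plot(x, n / areaTrapz, 'o--', xb, nb, 's-', xb, exp(-xb), 'k:');
    legend('trapz over bins only', 'histnormbounds', 'exp(-x)');
end
```

With data like this the first bin holds most of the counts, so the dropped half-bin at the lower edge accounts for nearly all of the shortfall, which is the case the extrapolated boundary point is meant to handle.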