This project is mirrored from https://github.com/mantidproject/mantid.git.
- Jan 20, 2020
-
Adam J. Jackson authored
The initial implementation assumed that the input series of sigma values would monotonically increase, which is reasonable for TOSCA but not for other instruments. The logic is modified so that, instead of the input data, the list of sigma values used for convolution is required to be sorted. As this list is generated within the same function and spans (min(sigma), max(sigma)), the condition is always satisfied. The unit tests are not affected, but there may be a small numerical difference, as another implementation detail has been fixed in the process: the interpolation x-values ('sigma factors') should be the ratio between each sigma and the sigma at the lower bound of the current window. The initial implementation mistakenly used sigma_chunk[0], the left-most value in the block _out of the actual sigma values_; it should be the exact convolved width from `sigma_samples`.
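A minimal sketch of the corrected 'sigma factor' logic; `sigma_chunk` and `sigma_samples` follow the names above, but the surrounding code is illustrative rather than the actual Abins implementation:

```python
import numpy as np

sigma = np.array([3.0, 1.5, 2.2, 4.1])  # input widths: need not be monotonic
spacing = np.sqrt(2)

# Widths actually used for convolution: generated sorted, from min to max
n_kernels = int(np.ceil(np.log(sigma.max() / sigma.min()) / np.log(spacing))) + 1
sigma_samples = sigma.min() * spacing ** np.arange(n_kernels)

for low, high in zip(sigma_samples[:-1], sigma_samples[1:]):
    sigma_chunk = sigma[(sigma >= low) & (sigma < high)]
    # Correct: ratio of each width to the lower bound of the *window*,
    # i.e. the exact convolved width drawn from sigma_samples.
    sigma_factors = sigma_chunk / low
    # The buggy version divided by sigma_chunk[0], the left-most of the
    # actual (unsorted) sigma values in the block.
    print(low, sigma_factors)
```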
-
Adam J. Jackson authored
Normalise Gaussians by multiplying by the bin width instead of dividing by sum(values). The sum approach can go badly wrong if values lie near or beyond the sampling range, as a tail fragment (or the noise floor!) would be amplified to unit weight. Scaling by the theoretical value, as we do here, has a different downside: truncated broadening kernels will not quite sum to 1, so a little intensity is lost in broadening. This error is more tolerable, especially as it can be reduced by extending the truncation range.
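A toy comparison of the two normalisation strategies (the window, bin width and sigma here are illustrative numbers, not Abins defaults):

```python
import numpy as np

bin_width = 1.0
x = np.arange(-3, 3 + bin_width, bin_width)  # truncated sampling range
sigma = 2.0  # wide peak: much of it lies beyond the +/-3 window
kernel = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

by_sum = kernel / kernel.sum()   # amplifies the surviving fragment to unit weight
by_width = kernel * bin_width    # sums to ~0.92 here: bounded intensity loss
print(by_sum.sum(), by_width.sum())
```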
-
Adam J. Jackson authored
- The 'legacy' broadening method is removed. It is not recommended for use in any circumstance.
- A 'none' option is provided which simply resamples, for reference/test purposes.
- Some other broadening schemes are consolidated:
  - The 'gaussian_trunc' method is removed and the superior 'gaussian_trunc_windowed' is renamed to 'gaussian_truncated'.
  - The different summation approaches are no longer exposed through the choice of broadening method; a default 'auto' choice selects between the two summation schemes based on the size of the data.
- An 'auto' broadening scheme is introduced, and from here on all Abins Instruments are expected to provide one. For TOSCA it switches between 'gaussian_truncated' and 'interpolate' depending on the number of input frequencies (see the sketch after this message).
- Responsibility for deciding to resample is best located at the Instrument level, as this needs to trigger the calculation of an appropriate set of broadening widths (sigma).
- The broadening method selection is now drawn from AbinsParameters during SPowderSemiEmpiricalCalculator. The motivation is to move 'Abins state' information as high up in the program as is appropriate, so that the modes/settings in use are generally determined by function arguments. This is more maintainable, as it allows lower-level objects/functions to be tested independently of a full Abins run.

Performance tests are interesting at this point. Testing _broaden_spectrum_ with some synthetic data:

| Scheme             | Average time (s) | Max time (s) |
|--------------------+------------------+--------------|
| none               |           0.0002 |       0.0005 |
| interpolate        |           0.0035 |       0.0046 |
| interpolate_coarse |           0.0029 |       0.0034 |
| gaussian_truncated |           0.0241 |       0.0269 |
| normal_truncated   |           0.0245 |       0.0257 |
| gaussian           |           0.2760 |       0.2794 |
| normal             |           0.2422 |       0.2525 |

So there is a clear hierarchy of methods with order-of-magnitude performance differences. Comparing full runs of Abins on crystalline benzene up to 3rd order:

| Scheme             | Time (s) |
|--------------------+----------|
| none               |    12.46 |
| interpolate        |    21.85 |
| interpolate_coarse |    21.28 |
| auto               |    21.78 |
| gaussian_truncated |    28.22 |
| normal_truncated   |    27.94 |

If we assume that 'none' is essentially pure overhead, then 'interpolate' costs around 10 s... and the other methods are less than twice as expensive. Where has the performance difference gone? Try another system, a toluene molecule up to 4th order:

| Scheme             | Time (s) |
|--------------------+----------|
| none               |    1.905 |
| interpolate        |    3.188 |
| interpolate_coarse |    2.844 |
| auto               |    2.723 |
| gaussian_truncated |    5.887 |
| normal_truncated   |    5.943 |
| gaussian           |    57.81 |
| normal             |    50.53 |

It is interesting that "normal" is quicker than "gaussian" in this case; the "max time" column of the faster test has generally pointed to "normal" being more _consistently_ fast than "gaussian", even if slower on _average_. I suspect some kind of caching magic. "Auto" has beaten "interpolate" this time; this could be due to a saving on the first order, which doesn't really need binning.

In any case, it is disappointing that the factor-10 improvement from interpolation reduces to a factor-2 in practice. Perhaps further optimisation is possible, but there may also be a system-dependence at work. (The smaller test uses data randomly distributed over the whole spectrum, while real data will cluster.) With 'auto' we seem to have a strategy that brings the cost of broadening roughly into line with other overheads in the program.
There is a known route by which it might be possible to optimise 'interpolate' further: limiting the range over which each convolution is performed. This could be quite complicated in practice, and there are likely to be more fruitful avenues for improving the performance of Abins (e.g. combining data from different atoms before broadening).
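As a rough illustration of the 'auto' selection described above: the crossover value, helper names and the grossly simplified 'interpolate' stand-in below are assumptions for illustration, not the actual Abins implementation.

```python
import numpy as np

def _sum_truncated_gaussians(frequencies, s, bins, sigma, trunc=3.0):
    """One truncated kernel per peak: accurate, but cost grows with n_peaks."""
    centres = (bins[:-1] + bins[1:]) / 2
    out = np.zeros_like(centres)
    for f, weight, sig in zip(frequencies, s, sigma):
        sel = np.abs(centres - f) < trunc * sig
        out[sel] += weight * np.exp(-(centres[sel] - f) ** 2 / (2 * sig ** 2)) \
            / (sig * np.sqrt(2 * np.pi))
    return out * (bins[1] - bins[0])

def _interpolated_broadening(frequencies, s, bins, sigma):
    """Bin first, then convolve; grossly simplified here to one mean-width kernel."""
    hist, _ = np.histogram(frequencies, bins=bins, weights=s)
    centres = (bins[:-1] + bins[1:]) / 2
    sig = np.mean(sigma)
    kernel = np.exp(-(centres - centres[len(centres) // 2]) ** 2 / (2 * sig ** 2))
    kernel *= (bins[1] - bins[0]) / (sig * np.sqrt(2 * np.pi))
    return np.convolve(hist, kernel, mode='same')

def broaden_auto(frequencies, s, bins, sigma, crossover=10_000):
    """Few peaks: direct summation is affordable. Many peaks: interpolation wins."""
    if len(frequencies) < crossover:
        return _sum_truncated_gaussians(frequencies, s, bins, sigma)
    return _interpolated_broadening(frequencies, s, bins, sigma)
```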
-
Adam J. Jackson authored
The number of kernels was accidentally rounded up twice, leading to an empty chunk which then needed to be accounted for. Removing the extra rounding appears to have no impact on the results.
-
Adam J. Jackson authored
Fast interpolated broadening is now moved into a more general/extensible function, including ascii-art documentation. sqrt(2) spacing is implemented, and initial testing shows that this noticeably improves accuracy for little extra cost. This is set as the default; for now the coarser option is left in for test purposes. A flag is also provided to skip rebinning inside this function; this potentially avoids an expensive redundant operation, but makes for convoluted logic when this rebin is also considered inside the Instrument. A cleaner scheme should be possible, in which it is obvious where rebinning should happen.
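For illustration, a sqrt(2) geometric spacing of width samples compared against the coarser factor-2 spacing (the function name and limits here are made up):

```python
import numpy as np

def sigma_samples(sigma_min, sigma_max, spacing):
    """Geometric ladder of kernel widths covering [sigma_min, sigma_max]."""
    n = int(np.ceil(np.log(sigma_max / sigma_min) / np.log(spacing))) + 1
    return sigma_min * spacing ** np.arange(n)

print(sigma_samples(1.0, 10.0, 2.0))         # coarse: 5 kernels
print(sigma_samples(1.0, 10.0, np.sqrt(2)))  # default: 8 kernels, smaller gaps
```

With the finer ladder every actual sigma lies closer to a sampled kernel, which is why accuracy improves for only a modest increase in the number of convolutions.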
-
Adam J. Jackson authored
Clean up some of the experimental broadening schemes into a more systematic framework. Add comments and ASCII art to explain the windowing procedure.
-
Adam J. Jackson authored
A normalisation term was found in the Gaussian function which scales with sigma and FWHM to fix the maximum value of the peak as sigma is varied. This does not appear to be consistent with the function defined in the Abins paper, and leads to an increasing weight for high-energy terms, so it is removed. Instead, a normalisation feature is added which normalises the sum of the output values to one. This is required for a convolution kernel that does not change the overall intensity, and removes any dependence on the bin width. The cost of this process is not negligible; it may be more efficient to use the analytic integral ("normal" function), which is 'pre-normalised'. Testing with different bin widths finds that the differences in the resulting curves are negligible at typical bin widths; with large bins the results, while different, are not obviously superior or inferior. Work has started on refactoring the broadening methods that use truncated functions, by providing a truncation function that accepts the `gaussian` or `normal` function as an input.
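A sketch of the contrast between post-hoc sum normalisation and the 'pre-normalised' analytic bin integral; the variable names and the erf-based form of the "normal" kernel are illustrative, not the Abins implementation:

```python
import numpy as np
from scipy.special import erf

bins = np.arange(-10.0, 10.5, 0.5)
centres = (bins[:-1] + bins[1:]) / 2
sigma = 1.3

# Point-sampled Gaussian, then normalise the sum to one (an extra pass over the data):
g = np.exp(-centres ** 2 / (2 * sigma ** 2))
g /= g.sum()

# Analytic integral over each bin ("normal"): pre-normalised by construction,
# short of 1 only by the truncated tails.
n = np.diff(0.5 * (1 + erf(bins / (sigma * np.sqrt(2)))))

print(g.sum(), n.sum())  # 1.0 vs ~1.0
```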
-
Adam J. Jackson authored
Work in progress on the broadening features of Abins. The Instrument class is becoming bloated with generic broadening functions, so these are moved into their own module. In contrast with most of Abins, this module is not object-oriented and consists of top-level functions; these are primarily for use by Instrument implementations. The "legacy" broadening logic is removed entirely, and this label is now used for the implementation of the legacy behaviour within the new broadening logic (replacing the "legacy_plus" label). Several redundant broadening implementations are currently in place while their performance/scaling is tested.
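The rough shape of the extracted module, for illustration only: plain top-level functions for Instrument classes to call. The signatures here are assumptions, not the actual Abins API.

```python
import numpy as np

def gaussian(sigma, points, center=0.0):
    """Point-sampled Gaussian kernel (top-level helper, no classes)."""
    return np.exp(-(points - center) ** 2 / (2 * sigma ** 2)) \
        / (sigma * np.sqrt(2 * np.pi))

def broaden_spectrum(frequencies, bins, s, sigma):
    """Naive reference broadening: one full-range kernel per peak."""
    centres = (bins[:-1] + bins[1:]) / 2
    out = np.zeros_like(centres)
    for f, weight, sig in zip(frequencies, s, sigma):
        out += weight * gaussian(sig, centres, center=f)
    return out * (bins[1] - bins[0])
```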
-