Tutorial

Purpose

This tutorial is an introduction to the SrMise library and command-line tool, intended to expose new users and developers to the major use cases and options anticipated for SrMise.

Generating interest in SrMise is another goal of these examples, and we hope you will discover exciting ways to apply its capabilities to your scientific goals. If you think SrMise may help you do so, please feel free to contact us through the DiffPy website.

http://www.diffpy.org

Overview

SrMise is an implementation of the ParSCAPE algorithm, which incorporates standard chi-square fitting within an iterative clustering framework. The algorithm supposes that, in the absence of an atomic structure model, the model complexity (informally, the number of extracted peaks) which can be justifiably obtained from a PDF is primarily determined by the experimental uncertainties. The Akaike Information Criterion (AIC), summarized in the manual, is the information-theoretic tool used to balance model complexity with goodness-of-fit.
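As a rough illustration of how the AIC trades goodness-of-fit against model complexity, the sketch below assumes the common least-squares form AIC = chi-square + 2k, where k is the number of free parameters and the uncertainties are Gaussian. This form, the helper names chi_square and aic, and all numbers are assumptions made for illustration. The manual describes the definition SrMise actually uses.

```python
def chi_square(observed, model, uncertainty):
    """Sum of squared residuals, each scaled by its uncertainty."""
    return sum(((o - m) / u) ** 2 for o, m, u in zip(observed, model, uncertainty))


def aic(chi2, n_params):
    """AIC for Gaussian errors, up to an additive constant (assumed form)."""
    return chi2 + 2 * n_params


# Hypothetical fit results: a 2-peak model (6 parameters) and a 3-peak model
# (9 parameters), each peak carrying a position, a width, and an intensity.
chi2_two_peaks = 120.0
chi2_three_peaks = 116.0

aic_two = aic(chi2_two_peaks, 6)
aic_three = aic(chi2_three_peaks, 9)

# The third peak lowers chi-square by 4, which is less than the penalty of 6
# for its three extra parameters, so the lower AIC belongs to the 2-peak model.
preferred = "two peaks" if aic_two < aic_three else "three peaks"
print(preferred)
```

In this toy comparison the extra peak improves the fit, but not by enough to justify the added parameters.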

Three primary use cases are envisioned for SrMise:

  1. Peak fitting, where user-specified peaks are fit to the experimental data.
  2. Peak extraction, where the number of peaks and their parameters are estimated solely from the experimental data.
  3. Multimodel selection, where multiple sets of peaks are ranked in an AIC-driven analysis to determine the most plausible sets to guide additional investigation.

Running SrMise productively requires, at a minimum, the following elements:

  1. An experimental PDF. Note that peak extraction, though not peak fitting, requires that all peaks of interest be positive. This rules out peak extraction using SrMise for neutron PDFs obtained from samples containing elements with both positive and negative scattering factors.
  2. The experimental uncertainties. In principle these should be reported with the data, but in practice experimental uncertainties are frequently not reported, or are unreliable due to details of the data reduction process. In these cases the user should specify an ad hoc value. In peak extraction an ad hoc uncertainty necessarily results in ad hoc model complexity, or, more precisely, a reasonable model complexity if the provided uncertainty is presumed correct. (Even when the uncertainties are known, specifying an ad hoc value can be a pragmatic tool for exploring alternate models, especially in conjunction with multimodeling analysis.) For both peak extraction and peak fitting the estimated uncertainties of peak parameters (i.e. location, width, intensity) are dependent on the experimental uncertainty.
  3. The PDF baseline. For crystalline samples the baseline is linear and can be readily estimated. For nanoparticles more effort is required as SrMise includes explicit support for only a few basic shapes, although the user can define a baseline using arbitrary polynomials or an interpolating function constructed from a list of arbitrary numerical values.
  4. The range over which to extract or fit peaks. By default SrMise will use the entire PDF, but it is usually wise to restrict the range to the region of immediate interest.
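The link between an ad hoc uncertainty and the resulting model complexity (item 2 above) can be sketched numerically. The example assumes a single uniform uncertainty sigma and the simple criterion AIC = chi-square + 2k, so that an extra peak with three parameters is justified only if it lowers chi-square by more than 6. The residual values and function names are hypothetical, and this is not SrMise's own procedure.

```python
def chi_square_uniform(residuals, sigma):
    """Chi-square for residuals that all share one ad hoc uncertainty sigma."""
    return sum((res / sigma) ** 2 for res in residuals)


def extra_peak_justified(residuals_simple, residuals_complex, sigma, extra_params=3):
    """True if the chi-square gain exceeds the assumed AIC penalty 2 * extra_params."""
    gain = chi_square_uniform(residuals_simple, sigma) - chi_square_uniform(
        residuals_complex, sigma
    )
    return gain > 2 * extra_params


# Hypothetical residuals (data minus model) without and with one additional peak.
residuals_simple = [0.30, -0.25, 0.40, -0.35, 0.20, -0.30]
residuals_complex = [0.10, -0.05, 0.15, -0.10, 0.05, -0.10]

# Chi-square scales as 1 / sigma**2, so the same improvement in the residuals
# counts for much less when a larger uncertainty is assumed.
for sigma in (0.1, 0.5):
    print(sigma, extra_peak_justified(residuals_simple, residuals_complex, sigma))
```

With sigma = 0.1 the chi-square gain is roughly 51 and the extra peak is accepted. With sigma = 0.5 the gain drops to roughly 2 and the simpler model is kept, even though the data and both fits are unchanged.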

The examples described below, though not exhaustive, go into detail about each of these points. They also cover other parameters for which good default values can usually be estimated directly from the data.

Getting Started

The examples are contained in the doc/examples/ directory of the SrMise source distribution, available as both a zip and tar.gz archive. Download one of these files (Windows users will generally prefer the .zip, and Linux/Mac users the .tar.gz) to a directory of your choosing.

Uncompress the archive. If the downloaded file is archivename.zip or archivename.tar.gz this will create a new directory archivename in its current directory. On Windows this can be accomplished by right-clicking and choosing “Extract all”. On Linux/Mac OS X run, from the containing directory,

tar xvzf archivename.tar.gz

From a command window change to the doc/examples directory of the new folder. For example, a Windows user who extracted archivename.zip in the folder C:\Research would type

cd C:\Research\archivename\doc\examples

Every example below includes a Python script that may be run from this directory. While such scripts expose the full functionality of SrMise, for many common tasks the command-line program srmise included with the package is both sufficient and convenient, and the tutorial uses it to introduce many fundamental concepts. Its options may be examined in detail by running

srmise --help

We recommend working through at least the command-line portion of each example, in the order presented. Users looking for more detail should find the copiously commented scripts helpful.

PDF Information

Information on the sample, experimental methods, and data reduction procedures for the example PDFs is summarized below. Special attention is given to why each PDF does or does not report reliable uncertainties.

Ag

A synchrotron X-ray PDF (Qmax = 30 Å⁻¹, Nyquist sampled) with reliable experimentally-estimated uncertainties for a crystalline powder of face-centered cubic silver. The 2D diffraction pattern was measured on an integrating detector. A Q-space 1D pattern with nearly uncorrelated experimentally-estimated uncertainties was obtained using SrXplanar. All other data reduction was performed using PDFgetX2.

Reliable experimental uncertainties were preserved during error propagation to the PDF by transforming the 1D pattern to the minimally-correlated (Nyquist) grid without intermediate resampling.

C60

A synchrotron X-ray PDF (Qmax = 21.3 Å⁻¹, finely sampled) for a powder of buckminsterfullerene nanoparticles in a face-centered cubic lattice, but with no fixed orientation at the lattice sites. The 2D diffraction pattern was measured on an integrating detector. A 2θ 1D pattern without propagated uncertainties was obtained using FIT2D. All other data reduction was performed using PDFgetX2. This PDF is unnormalized, so the scale of the y-axis is arbitrary. The nanoparticle baseline used for testing this PDF with SrMise is a fit to the observed interparticle contribution using an empirical model of thin spherical shells of constant density in an FCC lattice.

This PDF has unreliable uncertainties. Since the 1D pattern reports no uncertainty, PDFgetX2 treats the uncertainty as equal to the square root of the values in the 1D pattern, which is invalid for integrating detectors. Moreover, the 1D pattern must be resampled onto a Q-space grid before the PDF can be calculated, and this introduces correlations between points. Finally, the PDF is itself oversampled, resulting in further correlations.

TiO2

A synchrotron X-ray PDF (Qmax = 26 Å⁻¹, finely sampled) for a crystalline powder of titanium dioxide (rutile). The 2D diffraction pattern was measured on an integrating detector. A Q-space 1D pattern with nearly uncorrelated experimentally-estimated uncertainties was obtained using SrXplanar. All other data reduction was performed using PDFgetX2.

Although the 1D diffraction pattern has reliable uncertainties, this PDF was (illustratively) sampled faster than the Nyquist rate, introducing significant correlations between nearby data points. Resampling this PDF at the Nyquist rate cannot recover reliable uncertainties unless the full variance-covariance matrix has been preserved and is propagated during resampling.
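To make the sampling discussion for these PDFs concrete, the sketch below computes the Nyquist grid spacing, dr = π/Qmax, for the three Qmax values quoted above. The function names are illustrative, and the 0.01 Å grid spacing used in the oversampling example is a hypothetical value, since the actual grids of the example files are not stated here.

```python
import math


def nyquist_spacing(qmax):
    """Nyquist sampling interval (in Å) for a PDF with the given Qmax (in Å^-1)."""
    return math.pi / qmax


def oversampling_factor(dr, qmax):
    """How many times finer than the Nyquist grid a grid with spacing dr is."""
    return nyquist_spacing(qmax) / dr


# Qmax values quoted for the example PDFs.
for name, qmax in [("Ag", 30.0), ("C60", 21.3), ("TiO2", 26.0)]:
    print(f"{name}: Nyquist spacing = {nyquist_spacing(qmax):.4f} Å")

# Hypothetical fine grid of 0.01 Å for the TiO2 PDF (assumed, not from the data).
print(f"TiO2 oversampling at dr = 0.01 Å: {oversampling_factor(0.01, 26.0):.1f}x")
```

A factor well above 1 means neighboring points carry largely redundant information, which is why treating them as independent understates the true uncertainty.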