Commit d64324b9 authored by Somnath, Suhas

Removed the import of unicode_literals from all files. This will be added back only when h5py is fixed.
parent 65dc70a2
%% Cell type:markdown id: tags:
# Band Excitation data processing using pycroscopy
### Suhas Somnath, Chris R. Smith, Stephen Jesse
The Center for Nanophase Materials Science and The Institute for Functional Imaging for Materials <br>
Oak Ridge National Laboratory<br>
2/10/2017
%% Cell type:markdown id: tags:
## Configure the notebook
%% Cell type:code id: tags:
``` python
# Ensure python 3 compatibility
from __future__ import division, print_function, absolute_import
# Import necessary libraries:
# General utilities:
import sys
import os
# Computation:
import numpy as np
import h5py
# Visualization:
import matplotlib.pyplot as plt
from IPython.display import display
# Finally, pycroscopy itself
import pycroscopy as px
# set up notebook to show plots within the notebook
%matplotlib inline
```
%% Cell type:markdown id: tags:
## Set some basic parameters for computation
This notebook performs some functional fitting whose duration can be substantially decreased by using more memory and CPU cores. We have provided default values below, but you may change them if necessary.
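For reference, the "all but two cores" behavior of the max_cores default can be reproduced explicitly with the standard library (a sketch for illustration; pycroscopy handles this internally when max_cores is None):
%% Cell type:code id: tags:
``` python
from multiprocessing import cpu_count
# Explicit equivalent of the max_cores=None default described above
explicit_cores = max(1, cpu_count() - 2)
print('Would use {} of {} logical cores'.format(explicit_cores, cpu_count()))
```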
%% Cell type:code id: tags:
``` python
max_mem = 1024*8 # Maximum memory to use, in MB. Default = 1024
max_cores = None # Number of logical cores to use in fitting. None uses all but 2 available cores.
```
%% Cell type:markdown id: tags:
## Make the data pycroscopy compatible
Converting the raw data into a pycroscopy-compatible hierarchical data format (HDF5 or .h5) file gives you access to the fast fitting algorithms and powerful analysis functions within pycroscopy.
#### H5 files:
* are like smart containers that can store matrices with data, folders to organize these datasets, images, metadata like experimental parameters, links or shortcuts to datasets, etc.
* are readily compatible with high-performance computing facilities
* scale very efficiently from a few kilobytes to several terabytes
* can be read and modified using any language including Python, Matlab, C/C++, Java, Fortran, Igor Pro, etc.
#### You can load either of the following:
* Any .mat or .txt parameter file from the original experiment
* A .h5 file generated from the raw data using pycroscopy - skips translation
You can select the desired file type by choosing the second option in the pull-down menu at the bottom right of the file window.
%% Cell type:code id: tags:
``` python
input_file_path = px.io_utils.uiGetFile(caption='Select translated .h5 file or raw experiment data',
filter='Parameters for raw BE data (*.txt *.mat *xls *.xlsx);; \
Translated file (*.h5)')
(data_dir, data_name) = os.path.split(input_file_path)
if input_file_path.endswith('.h5'):
# No translation here
h5_path = input_file_path
hdf = px.ioHDF5(px.io.hdf_utils.patch_be_lv_format(h5_path))
else:
# Set the data to be translated
data_path = input_file_path
(junk, base_name) = os.path.split(data_dir)
# Check if the data is in the new or old format. Initialize the correct translator for the format.
if base_name == 'newdataformat':
(junk, base_name) = os.path.split(junk)
translator = px.BEPSndfTranslator(max_mem_mb=max_mem)
else:
translator = px.BEodfTranslator(max_mem_mb=max_mem)
if base_name.endswith('_d'):
base_name = base_name[:-2]
# Translate the data
h5_path = translator.translate(data_path, show_plots=True, save_plots=False)
hdf = px.ioHDF5(h5_path)
print('Working on:\n' + h5_path)
h5_main = px.hdf_utils.getDataSet(hdf.file, 'Raw_Data')[-1]
```
%% Cell type:markdown id: tags:
##### Inspect the contents of this h5 data file
The file contents are stored in a tree structure, just like files on a conventional computer.
The data is stored as a 2D matrix (position, spectroscopic value) regardless of the dimensionality of the data. Thus, the positions will be arranged as row0-col0, row0-col1, ..., row0-colN, row1-col0, ..., and the data for each position is stored as it was chronologically collected.
The main dataset is always accompanied by four ancillary datasets that explain the position and spectroscopic value of any given element in the dataset.
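For example, with this row-major ordering, the flat position index of a given (row, col) pair can be computed directly (a small illustration, not part of the original analysis):
%% Cell type:code id: tags:
``` python
import numpy as np
num_rows, num_cols = 4, 5  # hypothetical scan grid
row, col = 2, 3
flat_index = row * num_cols + col  # positions are ordered row0-col0, row0-col1, ...
assert flat_index == np.ravel_multi_index((row, col), (num_rows, num_cols))
print(flat_index)  # 13
```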
%% Cell type:code id: tags:
``` python
print('Datasets and datagroups within the file:\n------------------------------------')
px.io.hdf_utils.print_tree(hdf.file)
print('\nThe main dataset:\n------------------------------------')
print(h5_main)
print('\nThe ancillary datasets:\n------------------------------------')
print(hdf.file['/Measurement_000/Channel_000/Position_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Position_Values'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Values'])
print('\nMetadata or attributes in a datagroup\n------------------------------------')
for key in hdf.file['/Measurement_000'].attrs:
print('{} : {}'.format(key, hdf.file['/Measurement_000'].attrs[key]))
```
%% Cell type:markdown id: tags:
## Get some basic parameters from the H5 file
This information will be vital for further analysis and visualization of the data.
%% Cell type:code id: tags:
``` python
h5_pos_inds = px.hdf_utils.getAuxData(h5_main, auxDataName='Position_Indices')[-1]
pos_sort = px.hdf_utils.get_sort_order(np.transpose(h5_pos_inds))
pos_dims = px.hdf_utils.get_dimensionality(np.transpose(h5_pos_inds), pos_sort)
pos_labels = np.array(h5_pos_inds.attrs['labels'])[pos_sort]
print(pos_labels, pos_dims)
```
%% Cell type:markdown id: tags:
## Visualize the raw data
Use the sliders below to visualize spatial maps (2D only for now), and spectrograms.
For simplicity, all the spectroscopic dimensions such as frequency, excitation bias, cycle, field, etc. have been collapsed to a single slider.
%% Cell type:code id: tags:
``` python
px.be_viz_utils.jupyter_visualize_be_spectrograms(h5_main)
```
%% Cell type:code id: tags:
``` python
print(h5_main.shape)
h5_pos_inds = px.hdf_utils.getAuxData(h5_main, auxDataName='Position_Indices')[-1]
pos_sort = px.hdf_utils.get_sort_order(np.transpose(h5_pos_inds))
pos_dims = px.hdf_utils.get_dimensionality(np.transpose(h5_pos_inds), pos_sort)
pos_labels = np.array(h5_pos_inds.attrs['labels'])[pos_sort]
h5_spec_vals = px.hdf_utils.getAuxData(h5_main, auxDataName='Spectroscopic_Values')[-1]
h5_spec_inds = px.hdf_utils.getAuxData(h5_main, auxDataName='Spectroscopic_Indices')[-1]
spec_sort = px.hdf_utils.get_sort_order(h5_spec_inds)
spec_dims = px.hdf_utils.get_dimensionality(h5_spec_inds, spec_sort)
spec_labels = np.array(h5_spec_inds.attrs['labels'])[spec_sort]
ifreq = int(np.argwhere(spec_labels=='Frequency'))
freqs_nd = px.hdf_utils.reshape_to_Ndims(h5_spec_vals, h5_spec=h5_spec_inds)[0][ifreq].squeeze()
freqs_2d = freqs_nd.reshape(freqs_nd.shape[0], -1)
try:
num_udvs_steps = h5_main.parent.parent.attrs['num_udvs_steps']
except KeyError:
num_udvs_steps = h5_main.parent.parent.attrs['num_UDVS_steps']
```
%% Cell type:markdown id: tags:
## Fit the Band Excitation (BE) spectra
Fit each of the acquired spectra to a simple harmonic oscillator (SHO) model to extract the following information regarding the response:
* Oscillation amplitude
* Phase
* Resonance frequency
* Quality factor
By default, the cell below will reuse any previous result instead of re-computing the SHO fit.
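For reference, a minimal sketch of one common complex SHO parameterization built from these four parameters (an illustration only; the exact functional form used by the fitter lives inside pycroscopy):
%% Cell type:code id: tags:
``` python
import numpy as np

def sho_response(freq, amp, res_freq, q_factor, phase):
    # Complex response of a driven damped harmonic oscillator (one common form).
    # At freq == res_freq the magnitude is amplified to roughly amp * q_factor.
    numerator = amp * np.exp(1j * phase) * res_freq ** 2
    denominator = freq ** 2 - 1j * freq * res_freq / q_factor - res_freq ** 2
    return numerator / denominator
```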
%% Cell type:code id: tags:
``` python
h5_sho_group = px.hdf_utils.findH5group(h5_main, 'SHO_Fit')
sho_fitter = px.BESHOmodel(h5_main, parallel=True)
if len(h5_sho_group) == 0:
print('No SHO fit found. Doing SHO Fitting now')
h5_sho_guess = sho_fitter.do_guess(strategy='complex_gaussian', processors=max_cores)
h5_sho_fit = sho_fitter.do_fit(processors=max_cores)
else:
print('Taking previous SHO results already present in file')
h5_sho_guess = h5_sho_group[-1]['Guess']
try:
h5_sho_fit = h5_sho_group[-1]['Fit']
except KeyError:
h5_sho_fit = sho_fitter.do_fit(processors=max_cores, h5_guess=h5_sho_guess)
```
%% Cell type:markdown id: tags:
## Visualize the SHO results
Here, we visualize the parameters for the SHO fits. BE-line (3D) data is visualized via simple spatial maps of the SHO parameters while more complex BEPS datasets (4+ dimensions) can be visualized using a simple interactive visualizer below.
You can choose to visualize either the SHO guesses or the final fit values via the first line of the cell below.
Use the sliders below to inspect the BE response at any given location.
%% Cell type:code id: tags:
``` python
use_sho_guess = False
use_static_viz_func = False
if use_sho_guess:
sho_dset = h5_sho_guess
else:
sho_dset = h5_sho_fit
if hdf.file.attrs['data_type'] == 'BELineData' or len(pos_dims) != 2:
use_static_viz_func = True
step_chan = None
else:
if h5_main.parent.parent.attrs['VS_mode'] not in ['AC modulation mode with time reversal',
'DC modulation mode']:
use_static_viz_func = True
else:
if h5_main.parent.parent.attrs['VS_mode'] == 'DC modulation mode':
step_chan = 'DC_Offset'
else:
step_chan = 'AC_Amplitude'
if not use_static_viz_func:
try:
# use interactive visualization
px.be_viz_utils.jupyter_visualize_beps_sho(sho_dset, step_chan)
except:
print('There was a problem with the interactive visualizer')
use_static_viz_func = True
if use_static_viz_func:
# show plots of SHO results vs. applied bias
px.be_viz_utils.visualize_sho_results(sho_dset, show_plots=True,
save_plots=False)
```
%% Cell type:code id: tags:
``` python
h5_sho_dset = sho_dset
resp_func = None
guess_3d_data, success = px.io.hdf_utils.reshape_to_Ndims(h5_sho_dset)
h5_sho_spec_inds = px.io.hdf_utils.getAuxData(h5_sho_dset, 'Spectroscopic_Indices')[0]
h5_sho_spec_vals = px.io.hdf_utils.getAuxData(h5_sho_dset, 'Spectroscopic_Values')[0]
spec_nd, _ = px.io.hdf_utils.reshape_to_Ndims(h5_sho_spec_inds, h5_spec=h5_sho_spec_inds)
# sho_spec_sort = get_sort_order(h5_sho_spec_inds)
sho_spec_dims = np.array(spec_nd.shape[1:])
sho_spec_labels = h5_sho_spec_inds.attrs['labels']
h5_pos_inds = px.io.hdf_utils.getAuxData(h5_sho_dset, auxDataName='Position_Indices')[-1]
pos_nd, _ = px.io.hdf_utils.reshape_to_Ndims(h5_pos_inds, h5_pos=h5_pos_inds)
print(pos_nd.shape)
pos_dims = list(pos_nd.shape[:h5_pos_inds.shape[1]])
print(pos_dims)
pos_labels = h5_pos_inds.attrs['labels']
print(pos_labels)
# reshape to X, Y, step, all others
spec_step_dim_ind = np.argwhere(sho_spec_labels == step_chan)[0][0]
step_dim_ind = len(pos_dims) + spec_step_dim_ind
# move the step dimension to be the first after all position dimensions
rest_sho_dim_order = list(range(len(pos_dims), len(guess_3d_data.shape)))
rest_sho_dim_order.remove(step_dim_ind)
new_order = list(range(len(pos_dims))) + [step_dim_ind] + rest_sho_dim_order
# Transpose the 3D dataset to this shape:
sho_guess_Nd_1 = np.transpose(guess_3d_data, new_order)
# Now move the step dimension to the front for the spec labels as well
new_spec_order = list(range(len(sho_spec_labels)))
new_spec_order.remove(spec_step_dim_ind)
new_spec_order = [spec_step_dim_ind] + new_spec_order
# new_spec_labels = sho_spec_labels[new_spec_order]
new_spec_dims = np.array(sho_spec_dims)[new_spec_order]
# Now collapse all additional dimensions
final_guess_shape = pos_dims + [new_spec_dims[0]] + [-1]
sho_dset_collapsed = np.reshape(sho_guess_Nd_1, final_guess_shape)
# Get the bias matrix:
bias_mat, _ = px.io.hdf_utils.reshape_to_Ndims(h5_sho_spec_vals, h5_spec=h5_sho_spec_inds)
bias_mat = np.transpose(bias_mat[spec_step_dim_ind], new_spec_order).reshape(sho_dset_collapsed.shape[len(pos_dims):])
# This is just the visualizer:
sho_quantity = 'Amplitude [V]'
step_ind = 0
row_ind = 1
col_ind = 1
def dc_spectroscopy_func(resp_vec):
return resp_vec['Amplitude [V]'] * np.cos(resp_vec['Phase [rad]']) * 1E+3
def ac_spectroscopy_func(resp_vec):
return resp_vec['Amplitude [V]']
if resp_func is None:
if step_chan == 'DC_Offset':
resp_func = dc_spectroscopy_func
resp_label = 'A cos($\phi$) (a. u.)'
else:
resp_func = ac_spectroscopy_func
resp_label = 'Amplitude (a. u.)'
spatial_map = sho_dset_collapsed[:, :, step_ind, 0][sho_quantity]
resp_vec = sho_dset_collapsed[row_ind, col_ind, :, :]
resp_vec = resp_func(resp_vec)
```
%% Cell type:code id: tags:
``` python
step_ind = 15
sho_shape = sho_dset_collapsed.shape
dset2 = sho_dset_collapsed.reshape([sho_shape[1], sho_shape[0], sho_shape[2], sho_shape[3]])
plt.imshow(dset2[:, :, step_ind, 0][sho_quantity])
```
%% Cell type:markdown id: tags:
## Fit loops to a function
This is applicable only to DC voltage spectroscopy datasets from BEPS. The PFM hysteresis loops in this dataset will be projected to maximize the loop area and then fitted to a function.
Note: This computation generally takes a while for reasonably sized datasets.
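As a rough illustration of what a hysteresis loop looks like (NOT pycroscopy's actual 9-parameter loop model), a simple two-branch loop can be sketched with tanh branches:
%% Cell type:code id: tags:
``` python
import numpy as np

def toy_hysteresis_loop(bias, v_up, v_down, low, high, width):
    # Illustrative loop: the rising branch switches near v_up, the falling
    # branch near v_down; 'width' controls the sharpness of switching.
    rising = low + (high - low) * 0.5 * (1 + np.tanh((bias - v_up) / width))
    falling = low + (high - low) * 0.5 * (1 + np.tanh((bias - v_down) / width))
    return rising, falling
```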
%% Cell type:code id: tags:
``` python
# Do the Loop Fitting on the SHO Fit dataset
loop_success = False
h5_loop_group = px.hdf_utils.findH5group(h5_sho_fit, 'Loop_Fit')
if len(h5_loop_group) == 0:
try:
loop_fitter = px.BELoopModel(h5_sho_fit, parallel=True)
print('No loop fits found. Fitting now....')
h5_loop_guess = loop_fitter.do_guess(processors=max_cores, max_mem=max_mem)
h5_loop_fit = loop_fitter.do_fit(processors=max_cores, max_mem=max_mem)
loop_success = True
except ValueError:
print('Loop fitting is applicable only to DC spectroscopy datasets!')
else:
loop_success = True
print('Taking previously computed loop fits')
h5_loop_guess = h5_loop_group[-1]['Guess']
h5_loop_fit = h5_loop_group[-1]['Fit']
```
%% Cell type:markdown id: tags:
## Prepare datasets for visualization
%% Cell type:code id: tags:
``` python
# Prepare some variables for plotting loops fits and guesses
# Plot the Loop Guess and Fit Results
if loop_success:
h5_projected_loops = h5_loop_guess.parent['Projected_Loops']
h5_proj_spec_inds = px.hdf_utils.getAuxData(h5_projected_loops,
auxDataName='Spectroscopic_Indices')[-1]
h5_proj_spec_vals = px.hdf_utils.getAuxData(h5_projected_loops,
auxDataName='Spectroscopic_Values')[-1]
# reshape the vdc_vec into DC_step by Loop
sort_order = px.hdf_utils.get_sort_order(h5_proj_spec_inds)
dims = px.hdf_utils.get_dimensionality(h5_proj_spec_inds[()],
sort_order[::-1])
vdc_vec = np.reshape(h5_proj_spec_vals[h5_proj_spec_vals.attrs['DC_Offset']], dims).T
# Also reshape the projected loops to Positions-DC_Step-Loop
proj_nd, _ = px.hdf_utils.reshape_to_Ndims(h5_projected_loops)
proj_3d = np.reshape(proj_nd, [h5_projected_loops.shape[0],
proj_nd.shape[2], -1])
```
%% Cell type:markdown id: tags:
## Visualize Loop fits
%% Cell type:code id: tags:
``` python
use_static_plots = False
if loop_success:
if not use_static_plots:
try:
px.be_viz_utils.jupyter_visualize_beps_loops(h5_projected_loops, h5_loop_guess, h5_loop_fit)
except:
print('There was a problem with the interactive visualizer')
use_static_plots = True
if use_static_plots:
for iloop in range(h5_loop_guess.shape[1]):
fig, ax = px.be_viz_utils.plot_loop_guess_fit(vdc_vec[:, iloop], proj_3d[:, :, iloop],
h5_loop_guess[:, iloop], h5_loop_fit[:, iloop],
title='Loop {} - All Positions'.format(iloop))
```
%% Cell type:markdown id: tags:
## Save and close
* Save the .h5 file that we are working on by closing it. <br>
* Also consider exporting this notebook as an HTML file. <br> To do this, go to File >> Download as >> HTML
* Finally, consider saving this notebook if necessary
%% Cell type:code id: tags:
``` python
hdf.close()
```
%% Cell type:code id: tags:
``` python
```
......
%% Cell type:markdown id: tags:
# G-Mode filtering and inspection using pycroscopy
### Suhas Somnath and Stephen Jesse
The Center for Nanophase Materials Science and The Institute for Functional Imaging for Materials <br>
Oak Ridge National Laboratory<br>
5/05/2017
%% Cell type:markdown id: tags:
## Configure the notebook
%% Cell type:code id: tags:
``` python
# Ensure python 3 compatibility
from __future__ import division, print_function, absolute_import
# Import necessary libraries:
# General utilities:
from os import path
# Computation:
import numpy as np
import h5py
# Visualization:
import matplotlib.pyplot as plt
from IPython.display import display
# Finally, pycroscopy itself
import pycroscopy as px
# set up notebook to show plots within the notebook
%matplotlib inline
```
%% Cell type:markdown id: tags:
## Make the data pycroscopy compatible
Converting the raw data into a pycroscopy-compatible hierarchical data format (HDF5 or .h5) file gives you access to the fast fitting algorithms and powerful analysis functions within pycroscopy.
#### H5 files:
* are like smart containers that can store matrices with data, folders to organize these datasets, images, metadata like experimental parameters, links or shortcuts to datasets, etc.
* are readily compatible with high-performance computing facilities
* scale very efficiently from a few kilobytes to several terabytes
* can be read and modified using any language including Python, Matlab, C/C++, Java, Fortran, Igor Pro, etc.
#### You can load either of the following:
* Any .mat or .txt parameter file from the original experiment
* A .h5 file generated from the raw data using pycroscopy - skips translation
You can select the desired file type by choosing the second option in the pull-down menu at the bottom right of the file window.
%% Cell type:code id: tags:
``` python
input_file_path = px.io_utils.uiGetFile(caption='Select translated .h5 file or raw experiment data',
filter='Parameters for raw G-Line data (*.txt);; \
Translated file (*.h5)')
folder_path, _ = path.split(input_file_path)
if input_file_path.endswith('.txt'):
print('Translating raw data to h5. Please wait')
tran = px.GLineTranslator()
h5_path = tran.translate(input_file_path)
else:
h5_path = input_file_path
print('Working on:\n' + h5_path)
```
%% Cell type:markdown id: tags:
## Open the .h5 file and extract some basic parameters
%% Cell type:code id: tags:
``` python
hdf = px.ioHDF5(h5_path)
h5_main = px.hdf_utils.getDataSet(hdf.file, 'Raw_Data')[-1]
parms_dict = h5_main.parent.parent.attrs
samp_rate = parms_dict['IO_rate_[Hz]']
ex_freq = parms_dict['BE_center_frequency_[Hz]']
h5_spec_vals = px.hdf_utils.getAuxData(h5_main, auxDataName='Spectroscopic_Values')[0]
pixel_ex_wfm = h5_spec_vals[0, :int(h5_spec_vals.shape[1]/parms_dict['grid_num_cols'])]
```
%% Cell type:markdown id: tags:
##### Inspect the contents of this h5 data file
The file contents are stored in a tree structure, just like files on a conventional computer.
The data is stored as a 2D matrix (position, spectroscopic value) regardless of the dimensionality of the data. Thus, the positions will be arranged as row0-col0, row0-col1, ..., row0-colN, row1-col0, ..., and the data for each position is stored as it was chronologically collected.
The main dataset is always accompanied by four ancillary datasets that explain the position and spectroscopic value of any given element in the dataset.
Note that G-mode data is acquired line-by-line rather than pixel-by-pixel.
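Since each measured line holds the responses of all pixels end to end, carving a line into pixels is a simple reshape (a toy illustration with made-up sizes):
%% Cell type:code id: tags:
``` python
import numpy as np
pts_per_pixel = 1024  # hypothetical: samples recorded per pixel
cols_per_line = 256   # hypothetical: pixels per scan line
line = np.random.rand(pts_per_pixel * cols_per_line)  # one raw G-mode line
pixels = line.reshape(cols_per_line, pts_per_pixel)   # (pixel, time) matrix
```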
%% Cell type:code id: tags:
``` python
print('Datasets and datagroups within the file:\n------------------------------------')
px.io.hdf_utils.print_tree(hdf.file)
print('\nThe main dataset:\n------------------------------------')
print(h5_main)
print('\nThe ancillary datasets:\n------------------------------------')
print(hdf.file['/Measurement_000/Channel_000/Position_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Position_Values'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Values'])
print('\nMetadata or attributes in a datagroup\n------------------------------------')
for key in hdf.file['/Measurement_000'].attrs:
print('{} : {}'.format(key, hdf.file['/Measurement_000'].attrs[key]))
```
%% Cell type:markdown id: tags:
## Try different FFT filters on the data
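The comb filter retains narrow bands centered on the excitation frequency and its harmonics; a toy sketch of such a frequency-domain mask (for intuition only, not gmode_utils' implementation):
%% Cell type:code id: tags:
``` python
import numpy as np
n_pts, samp = 2 ** 16, 4E+6  # hypothetical record length and sampling rate (Hz)
freqs = np.fft.fftfreq(n_pts, 1.0 / samp)
f0, band_width, n_harmonics = 200E+3, 1E+3, 10  # hypothetical center (Hz), width (Hz), harmonics
mask = np.zeros(n_pts, dtype=bool)
for harmonic in range(1, n_harmonics + 1):
    # keep +/- half the band width around each harmonic (both FFT sides)
    mask |= np.abs(np.abs(freqs) - harmonic * f0) <= band_width / 2
```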
%% Cell type:code id: tags:
``` python
filter_parms = dict()
filter_parms['noise_threshold'] = 1E-4
filter_parms['comb_[Hz]'] = [ex_freq, 1E+3, 10]
# filter_parms['LPF_cutOff_[Hz]'] = -1
# Noise frequencies - 15.6 kHz ~ 14-17.5, 7.8-8.8, 45-49.9 ~ 48.9414 kHz
# filter_parms['band_filt_[Hz]'] = None # [[8.3E+3, 15.6E+3, 48.9414E+3], [1E+3, 0.5E+3, 0.1E+3]]
filter_parms['phase_[rad]'] = 0
filter_parms['samp_rate_[Hz]'] = samp_rate
filter_parms['num_pix'] = 1
# Test filter on a single line:
row_ind = 40
filt_line, fig_filt, axes_filt = px.processing.gmode_utils.test_filter(h5_main[row_ind], filter_parms, samp_rate,
show_plots=True, use_rainbow_plots=False)
fig_filt.savefig(path.join(folder_path, 'FFT_filter_on_line_{}.png'.format(row_ind)), format='png', dpi=300)
filt_row = filt_line.reshape(-1, pixel_ex_wfm.size)
fig, axes = px.plot_utils.plot_loops(pixel_ex_wfm, filt_row, x_label='Bias (V)', title='FFT Filtering',
plots_on_side=4, y_label='Deflection (a.u.)',
subtitles='Row: ' + str(row_ind) + ' Col:')
fig.savefig(path.join(folder_path, 'FFT_filtered_loops_on_line_{}.png'.format(row_ind)), format='png', dpi=300)
```
%% Cell type:markdown id: tags:
## Apply selected filter to entire dataset
%% Cell type:code id: tags:
``` python
# h5_filt_grp = px.hdf_utils.findH5group(h5_main, 'FFT_Filtering')[-1]
h5_filt_grp = px.processing.gmode_utils.fft_filter_dataset(h5_main, filter_parms, write_filtered=True)
h5_filt = h5_filt_grp['Filtered_Data']
# Test to make sure the filter gave the same results
filt_row = h5_filt[row_ind].reshape(-1, pixel_ex_wfm.size)
fig, axes = px.plot_utils.plot_loops(pixel_ex_wfm, filt_row, x_label='Bias (V)', title='FFT Filtering',
plots_on_side=4, y_label='Deflection (a.u.)',
subtitles='Row: ' + str(row_ind) + ' Col:')
```
%% Cell type:markdown id: tags:
## Now break up the filtered lines into "pixels"
Also visualize loops from different pixels
%% Cell type:code id: tags:
``` python
# h5_resh = h5_filt_grp['Filtered_Data-Reshape_000/Reshaped_Data']
h5_resh = px.processing.gmode_utils.reshape_from_lines_to_pixels(h5_filt, pixel_ex_wfm.size, 1)
fig, axes = px.plot_utils.plot_loops(pixel_ex_wfm, h5_resh, x_label='Bias (V)', title='FFT Filtering',
plots_on_side=5, y_label='Deflection (a.u.)')
fig.savefig(path.join(folder_path, 'FFT_filtered_loops_on_line_{}.png'.format(row_ind)), format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
hdf.close()
```
......
%% Cell type:markdown id: tags:
# Analyzing Ptychography data using pycroscopy
### Stephen Jesse, Suhas Somnath, and Chris R. Smith,
The Center for Nanophase Materials Science and The Institute for Functional Imaging for Materials <br>
Oak Ridge National Laboratory<br>
1/19/2017
Here, we will be working with ptychography datasets acquired using a scanning transmission electron microscope (STEM). These ptychography datasets have four dimensions - two (x, y) dimensions from the position of the electron beam, and each spatial pixel contains a two-dimensional (u, v) image, called a ronchigram, recorded by the detector. Though the ronchigrams are typically averaged to two values (bright field, dark field), retaining the raw ronchigrams enables deeper investigation of the data to reveal the existence of different phases in the material and other patterns that would not be visible in the averaged data.
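Since the main dataset flattens both pairs of dimensions, recovering a single ronchigram is just a reshape of the N x P matrix back to its four native dimensions (a toy illustration with made-up sizes):
%% Cell type:code id: tags:
``` python
import numpy as np
num_rows, num_cols = 64, 64  # hypothetical beam positions
u_pix, v_pix = 128, 128      # hypothetical detector pixels
data_2d = np.random.rand(num_rows * num_cols, u_pix * v_pix)  # N x P layout
data_4d = data_2d.reshape(num_rows, num_cols, u_pix, v_pix)
ronchigram = data_4d[10, 20]  # the image recorded with the beam at row 10, col 20
```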
%% Cell type:markdown id: tags:
## Configure the notebook first
%% Cell type:code id: tags:
``` python
# Ensure python 3 compatibility
from __future__ import division, print_function, absolute_import
# Import necessary libraries:
import h5py
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from IPython.display import display
import ipywidgets as widgets
import pycroscopy as px
# set up notebook to show plots within the notebook
%matplotlib inline
```
%% Cell type:markdown id: tags:
## Load pycroscopy compatible ptychography dataset
For simplicity, we will use a dataset that has already been translated from its original data format into a pycroscopy-compatible hierarchical data format (HDF5 or .h5) file.
#### H5 files:
* are like smart containers that can store matrices with data, folders to organize these datasets, images, metadata like experimental parameters, links or shortcuts to datasets, etc.
* are readily compatible with high-performance computing facilities
* scale very efficiently from few kilobytes to several terabytes
* can be read and modified using any language including Python, Matlab, C/C++, Java, Fortran, Igor Pro, etc.
%% Cell type:code id: tags:
``` python
# Select a file to work on:
# h5_path = px.io_utils.uiGetFile('*.h5', 'pycroscopy formatted Ptychography dataset')
h5_path = r"\\nanophase\IFgroup\SPM software development\Raw_Data\Ptychography\20120212_21_GB.h5"
print('Working on:\n' + h5_path)
# Open the file
h5_file = h5py.File(h5_path, mode='r+')
```
%% Cell type:markdown id: tags:
## Inspect the contents of this h5 data file
The file contents are stored in a tree structure, just like files on a contemporary computer.
The data is stored as a 2D matrix (position, spectroscopic value) regardless of the dimensionality of the data.
In the case of these 4D ptychography datasets, the data is stored as an N x P dataset, where N is the number of spatial positions of the electron beam and P is the number of pixels in the detector.
The main dataset is always accompanied by four ancillary datasets that explain the position and spectroscopic value of any given element in the dataset.
In the case of the 2d images, the positions will be arranged as row0-col0, row0-col1.... row0-colN, row1-col0....
The spectroscopic information is trivial since the data at any given pixel is just a scalar value
%% Cell type:code id: tags:
``` python
# define a small function called 'print_tree' to look at the folder tree structure
def print_tree(parent):
print(parent.name)
if isinstance(parent, h5py.Group):
for child in parent:
print_tree(parent[child])
print('Datasets and datagroups within the file:')
print_tree(h5_file)
print('\nThe main dataset:')
print(h5_file['/Measurement_000/Channel_000/Raw_Data'])
print('\nThe ancillary datasets:')
print(h5_file['/Measurement_000/Channel_000/Position_Indices'])
print(h5_file['/Measurement_000/Channel_000/Position_Values'])
print(h5_file['/Measurement_000/Channel_000/Spectroscopic_Indices'])
print(h5_file['/Measurement_000/Channel_000/Spectroscopic_Values'])
print('\nMetadata or attributes in a datagroup')
for key in h5_file['/Measurement_000'].attrs:
print('{} : {}'.format(key, h5_file['/Measurement_000'].attrs[key]))
```
%% Cell type:markdown id: tags:
## Read some basic parameters for visualization
%% Cell type:code id: tags:
``` python
# Select the dataset containing the raw data to start working with:
h5_main = px.hdf_utils.getDataSet(h5_file, 'Raw_Data')[-1]
# Read some necessary parameters:
h5_pos_inds = px.hdf_utils.getAuxData(h5_main, auxDataName=['Position_Indices'])[0]
num_rows = len(np.unique(h5_pos_inds[:, 0]))
num_cols = len(np.unique(h5_pos_inds[:, 1]))
h5_spec_inds = px.hdf_utils.getAuxData(h5_main, auxDataName=['Spectroscopic_Indices'])[0]
num_sensor_rows = len(np.unique(h5_spec_inds[0, :]))
num_sensor_cols = len(np.unique(h5_spec_inds[1, :]))
```
%% Cell type:markdown id: tags:
## Visualize the Raw Ronchigrams
%% Cell type:code id: tags:
``` python
current_inds = [int(0.5*num_rows),int(0.5*num_cols)]
half_ronch = [int(0.5*num_sensor_rows), int(0.5*num_sensor_cols)]
all_ronchs, success = px.io.hdf_utils.reshape_to_Ndims(h5_main)
all_ronchs = np.transpose(all_ronchs, [0,2,1,3])
all_ronchs = all_ronchs.reshape([num_rows*num_sensor_rows, num_cols*num_sensor_cols])
current_ronch_slice = (slice(current_inds[0]*num_sensor_rows, (current_inds[0]+1)*num_sensor_rows),
slice(current_inds[1]*num_sensor_cols, (current_inds[1]+1)*num_sensor_cols))
plt.imshow(all_ronchs[current_ronch_slice], cmap=px.plot_utils.cmap_jet_white_center(), origin='lower')
```
%% Cell type:code id: tags:
``` python
ronch_min = np.min(all_ronchs)
ronch_max = np.max(all_ronchs)
fig, axes = plt.subplots(ncols=2, figsize=(14,7))
axes[0].imshow(all_ronchs, cmap=px.plot_utils.cmap_jet_white_center(), origin='lower')
current_roch_center = [current_inds[0]*num_sensor_rows+half_ronch[0],
current_inds[1]*num_sensor_cols+half_ronch[1]]
main_vert_line = axes[0].axvline(x=current_roch_center[1], color='r')
main_hor_line = axes[0].axhline(y=current_roch_center[0], color='r')
# ronch_box = axes[0].add_patch(patches.Rectangle((current_roch_center[1] - half_ronch[1], current_roch_center[0] - half_ronch[0]),
# num_sensor_cols, num_sensor_rows, fill=False,
# color='red', linewidth=2))
img_zoom = axes[1].imshow(all_ronchs[current_ronch_slice],cmap=px.plot_utils.cmap_jet_white_center(),
vmax=ronch_max, vmin=ronch_min, origin='lower')
# axes[1].axvline(x=half_ronch[0], color='k')
# axes[1].axhline(y=half_ronch[1], color='k')
def move_zoom_box(coarse_row, coarse_col):
current_inds[0] = coarse_row
current_inds[1] = coarse_col
current_roch_center = [current_inds[0]*num_sensor_rows+half_ronch[0],
current_inds[1]*num_sensor_cols+half_ronch[1]]
main_vert_line.set_xdata(current_roch_center[1])
main_hor_line.set_ydata(current_roch_center[0])
# ronch_box.set_x(current_roch_center[1] - half_ronch[1])
# ronch_box.set_y(current_roch_center[0] - half_ronch[0])
current_ronch_slice = (slice(current_inds[0]*num_sensor_rows, (current_inds[0]+1)*num_sensor_rows),
slice(current_inds[1]*num_sensor_cols, (current_inds[1]+1)*num_sensor_cols))
img_zoom.set_data(all_ronchs[current_ronch_slice])
img_zoom.set_clim(vmax=ronch_max, vmin=ronch_min)
display(fig)
widgets.interact(move_zoom_box, coarse_row=(0, num_rows, 1),
coarse_col=(0, num_cols, 1));
```
%% Cell type:markdown id: tags:
## Performing Singular Value Decomposition (SVD)
SVD decomposes data (arranged as position x value) into a sequence of orthogonal components arranged in descending order of variance. The first component contains the most significant trend in the data. The second component contains the next most significant trend that is orthogonal to all previous components (here, just the first component). Each component consists of the trend itself (eigenvector), the spatial variation of this trend (eigenvalues), and the variance (statistical importance) of the component.
Here, SVD essentially compares every single ronchigram with every other ronchigram to find statistically significant trends in the dataset. Such correlation would be infeasible if the ronchigrams were averaged to bright-field and dark-field scalar values.
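A toy numpy demonstration of these properties (illustration only; pycroscopy's doSVD works directly on the h5 dataset):
%% Cell type:code id: tags:
``` python
import numpy as np
data = np.random.rand(100, 50)  # hypothetical (positions x spectral values) matrix
u, s, v = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)
print(s[:5])                    # singular values come out in descending order
k = 5                           # rank-k reconstruction from the first k components
approx = np.dot(u[:, :k] * s[:k], v[:k])
```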
%% Cell type:code id: tags:
``` python
# First check if SVD was already computed on this dataset:
h5_svd_group = px.hdf_utils.findH5group(h5_main, 'SVD')
if len(h5_svd_group) == 0:
print('No prior SVD results found - doing SVD now')
h5_svd_group = px.doSVD(h5_main, num_comps=256)
else:
print('Taking previous SVD results already present in file')
h5_svd_group = h5_svd_group[-1]
h5_u = h5_svd_group['U']
h5_v = h5_svd_group['V']
h5_s = h5_svd_group['S']
num_comps = 16
```
%% Cell type:markdown id: tags:
## Visualize the SVD results
##### S (variance):
The plot below shows the variance or statistical significance of the SVD components. The first few components contain the most significant information while the last few components mainly contain noise.
Note also that the plot below is a log-log plot; the significance of each subsequent component drops off steeply.
%% Cell type:code id: tags:
``` python
# Visualize variance of the principal components
fig, axes = px.plot_utils.plotScree(h5_s, title='Variance')
```
%% Cell type:markdown id: tags:
#### U (Eigenvalues or loading maps):
The plot below shows the spatial distribution of each SVD component
%% Cell type:code id: tags:
``` python
# Visualize the eigenvalues or loading maps from SVD:
loadings = np.reshape(h5_u[:, :num_comps], (num_rows, num_cols, -1))
fig, axes = px.plot_utils.plot_map_stack(loadings, num_comps=num_comps, heading='Eigenvalues',
cmap=px.plot_utils.cmap_jet_white_center())
```
%% Cell type:markdown id: tags:
#### V (Eigenvectors)
The V dataset contains the eigenvectors for each component
%% Cell type:code id: tags:
``` python
# Visualize the eigenvectors from SVD:
eigenvectors = np.reshape(h5_v[:num_comps], (-1, num_sensor_rows, num_sensor_cols))
eigenvectors = np.transpose(eigenvectors, (1, 2, 0))
fig, axes = px.plot_utils.plot_map_stack(eigenvectors, num_comps=num_comps, heading='Eigenvectors',
cmap=px.plot_utils.cmap_jet_white_center())
```
%% Cell type:markdown id: tags:
## Clustering
Clustering divides data into k clusters such that the variance within each cluster is minimized.
In principle, clustering can be performed on any dataset that has some spectral values (e.g., ronchigrams in the case of the raw dataset, or an array of values for each SVD component) for each position. However, the computation time increases with the size of the dataset.
Here, we will be performing k-means clustering on the U matrix from SVD.
We want a large enough number of clusters so that K-means identifies fine nuances in the data. At the same time, we want to minimize computational time by reducing the number of clusters. We recommend 32 clusters.
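A toy sklearn demonstration of the same idea (illustration only; the cell below uses pycroscopy's Cluster class on the actual U matrix):
%% Cell type:code id: tags:
``` python
import numpy as np
from sklearn.cluster import KMeans
u_matrix = np.random.rand(4096, 128)  # hypothetical positions x components matrix
km = KMeans(n_clusters=32).fit(u_matrix)
labels = km.labels_                    # one cluster index per position
centroids = km.cluster_centers_        # mean response of each cluster
```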
%% Cell type:code id: tags:
``` python
# Attempt to find any previous computation
h5_kmeans_group = px.hdf_utils.findH5group(h5_u, 'Cluster')
if len(h5_kmeans_group) == 0:
print('No k-Means computation found. Doing K-Means now')
num_clusters = 32
num_comps_for_clustering = 128
estimator = px.Cluster(h5_u, 'KMeans', num_comps=num_comps_for_clustering, n_clusters=num_clusters)
h5_kmeans_group = estimator.do_cluster()
else:
print('Taking existing results of previous K-Means computation')
h5_kmeans_group = h5_kmeans_group[-1]
h5_labels = h5_kmeans_group['Labels']
h5_centroids = h5_kmeans_group['Mean_Response']
# In case we take existing results, we need to get these parameters
num_clusters = h5_centroids.shape[0]
num_comps_for_clustering = h5_centroids.shape[1]
```
%% Cell type:markdown id: tags:
#### Visualize k-means results
%% Cell type:code id: tags:
``` python
label_mat = np.reshape(h5_labels, (num_rows, num_cols))
fig, axis = plt.subplots(figsize=(7,7))
axis.imshow(label_mat, cmap=px.plot_utils.cmap_jet_white_center())
axis.set_title('k-Means Cluster labels');
```
%% Cell type:markdown id: tags:
#### Visualize the hierarchical clustering
The vertical length of the branches indicates the relative separation between neighboring clusters.
%% Cell type:code id: tags:
``` python
e_vals = np.reshape(h5_u[:, :num_comps_for_clustering],
(num_rows, num_cols, -1))
fig = px.plot_utils.plot_cluster_dendrogram(label_mat, e_vals,
num_comps_for_clustering,
num_clusters,
last=num_clusters);
```
%% Cell type:markdown id: tags:
### Save and close
* Save the .h5 file that we are working on by closing it. <br>
* Also consider exporting this notebook as an HTML file. <br> To do this, go to File >> Download as >> HTML
* Finally, consider saving this notebook if necessary
%% Cell type:code id: tags:
``` python
h5_file.close()
```
%% Cell type:code id: tags:
``` python
```
......
%% Cell type:markdown id: tags:
# Improving superconductivity in BaFe2As2-based crystals by cobalt clustering and electronic uniformity
Li Li, Qiang Zheng, Qiang Zou, Shivani Rajput, Anota Ijaduola, Zhiming Wu, Xiaoping Wang, Huibo Cao, Suhas Somnath, Stephen Jesse, Miaofang Chi, Zheng Gai, David Parker, and Athena Sefat
#### Scientific Reports
Notebook written by: Suhas Somnath
2/9/2016
%% Cell type:code id: tags:
``` python
# Ensure python 3 compatibility:
from __future__ import division, print_function, absolute_import
from os import path
import numpy as np
import h5py
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from mpl_toolkits.axes_grid1 import make_axes_locatable
from scipy.signal import medfilt
from scipy.ndimage.filters import gaussian_filter
from sklearn.utils.extmath import randomized_svd
from sklearn.cluster import KMeans
import pycroscopy as px
from pycroscopy.io.translators.omicron_asc import AscTranslator
# set up notebook to show plots within the notebook
%matplotlib inline
```
%% Cell type:code id: tags:
``` python
#%% Load file
file_path = r"E:\Shivani\STS_Au-Ba122 crystal\I(V) TraceUp Tue Sep 20 09_17_08 2016 [14-1] STM_Spectroscopy STM.asc"
folder_path, file_name = path.split(file_path)
file_name = file_name[:-4] + '_'
```
%% Cell type:code id: tags:
``` python
tran = AscTranslator()
h5_path = tran.translate(file_path)
hdf = px.ioHDF5(h5_path)
h5_main = px.hdf_utils.getDataSet(hdf.file, 'Raw_Data')[-1]
```
%% Cell type:code id: tags:
``` python
x_label = 'Bias (V)'
y_label = 'Current (a.u.)'
```
%% Cell type:code id: tags:
``` python
def print_hdf_tree(h5_obj):
print(h5_obj.name)
if isinstance(h5_obj, h5py.Group):
for child in h5_obj:
print_hdf_tree(h5_obj[child])
```
%% Cell type:code id: tags:
``` python
print('Datasets and datagroups within the file:\n------------------------------------')
print_hdf_tree(h5_main.file)
print('\nThe main dataset:\n------------------------------------')
print(h5_main)
print('\nThe ancillary datasets:\n------------------------------------')
print(hdf.file['/Measurement_000/Channel_000/Position_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Position_Values'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Values'])
print('\nMetadata or attributes in a datagroup\n------------------------------------')
for key in hdf.file['/Measurement_000'].attrs:
print('{} : {}'.format(key, hdf.file['/Measurement_000'].attrs[key]))
```
%% Cell type:code id: tags:
``` python
# Read some basic parameters
h5_meas_grp = hdf.file['/Measurement_000']
num_rows = int(h5_meas_grp.attrs['y-pixels'])
num_cols = int(h5_meas_grp.attrs['x-pixels'])
num_pos = num_rows * num_cols
spectra_length = int(h5_meas_grp.attrs['z-points'])
volt_vec = px.hdf_utils.getAuxData(h5_main, auxDataName='Spectroscopic_Values')[-1][0]
```
%% Cell type:code id: tags:
``` python
raw_data_3d = np.reshape(h5_main[()], (num_rows, num_cols, spectra_length))
fig, axes = plt.subplots(ncols=2, figsize=(8, 4))
px.plot_utils.plot_map(axes[0], raw_data_3d[:, :, 0], cmap=px.plot_utils.cmap_jet_white_center())
axes[0].set_aspect(1)
axes[0].invert_yaxis()
axes[0].set_title('slice of data - use this for cropping the data')
axes[1].plot(volt_vec, raw_data_3d[10, 10, :])
axes[1].set_xlabel('Bias (V)')
axes[1].set_ylabel('Current (nA)')
axes[1].set_title('Response at a single pixel')
fig.tight_layout()
```
%% Cell type:code id: tags:
``` python
#%% User supplied information for cropping data:
start_row = 30 # 0 is the first row
end_row = num_rows # num_rows is the last row
start_col = 0
end_col = num_cols # num_cols is the last column
volt_chop = 0.3
```
%% Cell type:code id: tags:
``` python
# now crop the data according to the specifications above:
volt_index = np.where(np.logical_and(volt_vec >= -volt_chop, volt_vec <= volt_chop))[0]
volt_vec = volt_vec[volt_index]
raw_data_3d = raw_data_3d[start_row:end_row, start_col:end_col, volt_index]
num_rows, num_cols, spectra_length = raw_data_3d.shape
raw_data_2d = raw_data_3d.reshape(-1, spectra_length)
fig, axes = plt.subplots(ncols=2, figsize=(8, 4))
px.plot_utils.plot_map(axes[0], raw_data_3d[:, :, 0], cmap=px.plot_utils.cmap_jet_white_center())
axes[0].set_aspect(1)
axes[0].invert_yaxis()
axes[0].set_title('Cropped data')
axes[1].plot(volt_vec, raw_data_3d[10, 10, :])
axes[1].set_xlabel('Bias (V)')
axes[1].set_ylabel('Current (nA)')
axes[1].set_title('Cropped response at a single pixel')
fig.tight_layout()
```
%% Cell type:code id: tags:
``` python
#%% Do k-means to identify the principal response types and where (spatially) they occur in the data
num_clusters = 16 # change the number of clusters here. Something ~ 16 is easy to understand and visualize
estimators = KMeans(num_clusters)
results = estimators.fit(raw_data_2d)
labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)
labels_mat = np.reshape(labels, (raw_data_3d.shape[0], raw_data_3d.shape[1]))
fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,
spec_label=x_label, resp_label=y_label)
axes_KM2[0].invert_yaxis()
axes_KM2[0].set_aspect(1)
fig_KM2.savefig(path.join(folder_path, file_name + 'K_means_V_' + str(num_clusters) + '_clusters.png'),
format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
# throw out bad points using a threshold. Works better than finding outliers using K-means
current_threshold = 0.2
bad_pixels = np.where(np.max(np.abs(raw_data_2d), axis=1) > current_threshold)[0]
# find the pixels that are good:
good_pixels = [item for item in np.arange(raw_data_2d.shape[0]) if item not in bad_pixels]
# find the average of all the good pixels
good_resp = np.mean(raw_data_2d[good_pixels], axis=0)
fig, axis = plt.subplots(ncols=2, figsize=(10, 5))
axis[0].set_title('Bad pixels')
px.plot_utils.plot_line_family(axis[0], volt_vec, raw_data_2d[bad_pixels], label_prefix='Cluster')
axis[1].set_title('Mean of all other pixels')
axis[1].plot(volt_vec, good_resp)
fig.savefig(path.join(folder_path, file_name + '_outliers_removal.png'), format='png', dpi=300)
# now replace bad pixels by the mean of the good pixels
outlier_free_2d = np.copy(raw_data_2d)
outlier_free_2d[bad_pixels] = good_resp
outlier_free_3d = np.reshape(outlier_free_2d, raw_data_3d.shape)
```
%% Cell type:code id: tags:
``` python
estimators = KMeans(num_clusters)
results = estimators.fit(outlier_free_2d)
labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)
labels_mat = np.reshape(labels, (outlier_free_3d.shape[0], outlier_free_3d.shape[1]))
fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,
spec_label=x_label, resp_label=y_label)
axes_KM2[0].invert_yaxis()
axes_KM2[0].set_aspect(1)
fig_KM2.savefig(path.join(folder_path, file_name + 'After_removing_outliers_K_means_V_' + str(num_clusters) + '_clusters.png'),
format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
num_comps = 256
U, S, V = randomized_svd(outlier_free_2d, num_comps, n_iter=3)
fig_V, axes_V = px.plot_utils.plot_loops(volt_vec, V, title='SVD Eigenvectors',
evenly_spaced=False, plots_on_side=4, use_rainbow_plots=False,
x_label=x_label, y_label=y_label, subtitles='Eigenvector')
fig_V.savefig(path.join(folder_path, file_name + 'SVD_Eigenvectors.png'), format='png', dpi=200)
loadings = np.reshape(U, (raw_data_3d.shape[0], raw_data_3d.shape[1], -1))
fig_U, axes_U = px.plot_utils.plot_map_stack(loadings, num_comps=16, cmap=px.plot_utils.cmap_jet_white_center())
fig_U.savefig(path.join(folder_path, file_name + 'SVD_Eigenvalues.png'), format='png', dpi=150)
fig_S, axes_S = px.plot_utils.plotScree(S)
fig_S.savefig(path.join(folder_path, file_name + 'SVD_S.png'), format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
#%% SVD filtering - Reconstructing the original data with a subset of the SVD components to remove noise present in the later components
# Note that this method of filtering does NOT downsample the data
filter_comps = 32
filtered_2d = np.dot(np.dot(U[:, :filter_comps], np.eye(filter_comps)*S[:filter_comps]), V[:filter_comps, :])
filtered_3d = np.reshape(filtered_2d, (raw_data_3d.shape[0], raw_data_3d.shape[1], -1))
# Compare the raw and filtered data
fig_filt, axes = px.plot_utils.plot_loops(volt_vec, [outlier_free_2d, filtered_2d], line_colors=['k', 'r'],
                                          dataset_names=['Raw', 'SVD Filtered'],
                                          y_label='Current (a.u.)', title='SVD filtering')
fig_filt.savefig(path.join(folder_path, file_name + '_SVD_Filtered_data_I_' + str(filter_comps) + '_comps.png'),
format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
# Try a Gaussian filter on the stack:
gaus_sigma = 0.35
gaus_cycles = 1
source_dset = filtered_3d.copy()
for cyc_ind in range(gaus_cycles):
sink_dset = np.zeros(raw_data_3d.shape)
for volt_ind in range(spectra_length):
volt_slice = source_dset[:, :, volt_ind]
sink_dset[:, :, volt_ind] = gaussian_filter(volt_slice, gaus_sigma)
source_dset = sink_dset.copy()
gaus_filt_3d = sink_dset
# See some example loops:
gaus_filt_2d = np.reshape(gaus_filt_3d, raw_data_2d.shape)
fig, axes = px.viz.plot_utils.plot_loops(volt_vec, [filtered_2d, gaus_filt_2d], line_colors=['k', 'r'],
dataset_names=['SVD filtered', 'Gaussian filtered'], x_label=x_label, y_label=y_label,
title='Gaussian Filter')
fig.savefig(path.join(folder_path, file_name + 'Gaussian_Filter_sigma_' + str(gaus_sigma) + '_cycles_' +
str(gaus_cycles) + '.png'), format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
#%% Offset the data in the current axis by setting the current at 0V to 0
mid_pt = int(volt_vec.size * 0.5)
offsets = np.mean(gaus_filt_2d[:, mid_pt-1:mid_pt+1], axis=1)
offseted_data_2d = gaus_filt_2d - np.repeat(np.atleast_2d(offsets).transpose(), spectra_length, axis=1)
offseted_data_3d = np.reshape(offseted_data_2d, (raw_data_3d.shape[0], raw_data_3d.shape[1], -1))
#%% Plotting an example IV to check correction
row_ind = 15
col_ind = 12
fig, ax = plt.subplots()
ax.plot(volt_vec, filtered_3d[row_ind, col_ind], 'bo-', label='Raw')
ax.plot(volt_vec, offseted_data_3d[row_ind, col_ind], 'r*-', label='Offset')
ax.plot([volt_vec[0], volt_vec[-1]], [0, 0], 'k')
ax.plot([0, 0], [np.min(filtered_3d[row_ind, col_ind]), np.max(filtered_3d[row_ind, col_ind])], 'k')
# ax.set_xlim([volt_vec[mid_pt-5], volt_vec[mid_pt + 5]])
ax.set_xlabel(x_label)
ax.set_ylabel(y_label)
extents = 5
ax.set_xlim([volt_vec[mid_pt - extents], volt_vec[mid_pt + extents]])
chosen_sections_1 = filtered_3d[row_ind, col_ind, mid_pt - extents:mid_pt + extents]
chosen_sections_2 = offseted_data_3d[row_ind, col_ind, mid_pt - extents:mid_pt + extents]
ax.set_ylim([np.min(np.hstack([chosen_sections_1, chosen_sections_2])),
np.max(np.hstack([chosen_sections_1, chosen_sections_2]))])
ax.set_title('Offsetting the current')
ax.legend(loc='best')
fig.savefig(path.join(folder_path, file_name + '_offsetting_current.png'), format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
#%% Calculating dI/dV now:
dIdV_3d = np.diff(offseted_data_3d, axis=2)
# Adding 0 to spectral axis since we lose one index due to differential
dIdV_3d = np.dstack((dIdV_3d, np.zeros(shape=(offseted_data_3d.shape[0], offseted_data_3d.shape[1]))))
dIdV_2d = np.reshape(dIdV_3d, offseted_data_2d.shape)
```
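%% Cell type:markdown id: tags:
As an aside, np.gradient computes a centered difference and preserves the array length, which would avoid the zero-padding step above (a sketch, assuming a uniformly spaced bias vector):
%% Cell type:code id: tags:
``` python
volt_step = volt_vec[1] - volt_vec[0]  # assumes uniform bias spacing
dIdV_3d_alt = np.gradient(offseted_data_3d, volt_step, axis=2)
```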
%% Cell type:code id: tags:
``` python
# Apply some median filters:
med_filt_size = 3
dIdV_filt_3d = medfilt(dIdV_3d, [1, 1, med_filt_size])
dIdV_filt_2d = np.reshape(dIdV_filt_3d, raw_data_2d.shape)
fig, axes = px.viz.plot_utils.plot_loops(volt_vec, [dIdV_2d, dIdV_filt_2d], line_colors=['k', 'r'],
dataset_names=['Raw', 'Filtered'], x_label=x_label, y_label=y_label,
title='Median Filter')
fig.savefig(path.join(folder_path, file_name + 'Median_Filter_size_' + str(med_filt_size) + '.png'), format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
num_clusters = 8 # change the number of clusters here. Something ~ 16 is easy to understand and visualize
estimators = KMeans(num_clusters)
results = estimators.fit(dIdV_filt_2d)
labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)
labels_mat = np.reshape(labels, (dIdV_filt_3d.shape[0], dIdV_filt_3d.shape[1]))
fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,
spec_label=x_label, resp_label=y_label)
axes_KM2[0].invert_yaxis()
axes_KM2[0].set_aspect(1)
fig_KM2.savefig(path.join(folder_path, file_name + 'K_means_filtered_dIdV_' +
str(num_clusters) + '_clusters.png'), format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
# Downsample and average
box_size = 3
down_samp_didv_3d = np.zeros((dIdV_filt_3d.shape[0]-box_size + 1,
dIdV_filt_3d.shape[1] - box_size + 1,
dIdV_filt_3d.shape[2]))
down_samp_iv_3d = np.zeros(down_samp_didv_3d.shape)
for row_ind in range(down_samp_didv_3d.shape[0]):
row_start = row_ind
row_end = row_ind + box_size
for col_ind in range(down_samp_didv_3d.shape[1]):
col_start = col_ind
col_end = col_ind + box_size
box_didv_data = np.reshape(dIdV_filt_3d[row_start: row_end, col_start: col_end], (-1, spectra_length))
mean_di_dv = np.mean(box_didv_data, axis=0)
box_iv_data = np.reshape(offseted_data_3d[row_start: row_end, col_start: col_end], (-1, spectra_length))
mean_iv = np.mean(box_iv_data, axis=0)
offset_iv = mean_iv - np.mean(mean_iv[mid_pt - 1:mid_pt + 1])
#shifted_di_dv = mean_di_dv - np.min(mean_di_dv[int(0.5*spectra_length) - 2: int(0.5*spectra_length) + 2])
ldos = volt_vec * mean_di_dv / offset_iv
down_samp_iv_3d[row_ind, col_ind] = offset_iv
down_samp_didv_3d[row_ind, col_ind] = mean_di_dv
down_samp_didv_2d = np.reshape(down_samp_didv_3d, (-1, spectra_length))
down_samp_iv_2d = np.reshape(down_samp_iv_3d, (-1, spectra_length))
```
%% Cell type:code id: tags:
``` python
num_clusters = 8 # change the number of clusters here. Something ~ 16 is easy to understand and visualize
estimators = KMeans(num_clusters)
results = estimators.fit(down_samp_didv_2d)
labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)
labels_mat = np.reshape(labels, (down_samp_didv_3d.shape[0], down_samp_didv_3d.shape[1]))
fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,
spec_label=x_label, resp_label=y_label)
axes_KM2[0].invert_yaxis()
axes_KM2[0].set_aspect(1)
fig_KM2.savefig(path.join(folder_path, file_name + 'K_means_downsampled_dIdV_' + str(box_size) + '_box_' +
str(num_clusters) + '_clusters.png'), format='png', dpi=300)
```
%% Cell type:code id: tags:
``` python
y = centroids[-1]
x = volt_vec
from scipy.interpolate import interp1d
f2 = interp1d(x, y, kind='cubic')
xnew = np.linspace(volt_vec[0], volt_vec[-1], num=250, endpoint=True)
plt.plot(x, y, 'k', xnew, f2(xnew), 'r')
```
%% Cell type:code id: tags:
``` python
```
......
......
@@ -6,7 +6,7 @@ Created on Thu Aug 25 11:48:53 2016
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
from warnings import warn
......
......
@@ -3,7 +3,7 @@ Created on 7/17/16 10:08 AM
@author: Suhas Somnath, Numan Laanait, Chris R. Smith
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
from warnings import warn
import numpy as np
from .model import Model
......
......
@@ -3,7 +3,7 @@ Created on 12/15/16 3:44 PM
@author: Numan Laanait -- nlaanait@gmail.com
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
from warnings import warn
import numpy as np
from .utils.be_loop import loop_fit_function
......
......
@@ -3,7 +3,7 @@ Created on 10/5/16 3:44 PM
@author: Numan Laanait -- nlaanait@gmail.com
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
from warnings import warn
import numpy as np
......
......
@@ -3,7 +3,7 @@ Created on 7/17/16 10:08 AM
@author: Numan Laanait, Suhas Somnath
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
from warnings import warn
import numpy as np
......
......
@@ -3,7 +3,7 @@ Created on 12/15/16 10:08 AM
@author: Numan Laanait
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
from warnings import warn
import numpy as np
import sys
......
......
@@ -5,7 +5,7 @@ Created on Mon Jan 23 11:13:22 2017
@author: Suhas Somnath, Stephen Jesse
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
import numpy as np
from scipy.optimize import least_squares
import itertools as itt
......
......
@@ -3,7 +3,7 @@
@author: Ondrej Dyck
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
import os
import numpy as np
from scipy.optimize import least_squares
......
......
@@ -7,7 +7,7 @@ Created on Wed Jun 29 11:13:22 2016
Various helper functions for aiding loop fitting and projection
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
import matplotlib.pyplot as plt
import numpy as np
......
......
@@ -5,7 +5,7 @@ Created on Mon Sep 28 11:35:57 2015
@author: Anton Ievlev
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
import numpy as np
from numpy import exp, abs, sqrt, sum, real, imag, arctan2, append
......
......
@@ -5,7 +5,7 @@ Created on Wed Aug 31 17:03:29 2016
@author: Suhas Somnath
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
import numpy as np
# TODO: Test and debug node and clusterTree classes for agglomerative clustering etc
......
......
@@ -5,7 +5,8 @@ Created on Tue Dec 15 11:10:37 2015
@author: Suhas Somnath
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+# cannot import unicode_literals since it is not compatible with h5py just yet
+from __future__ import division, print_function, absolute_import
import numpy as np
from .hdf_utils import getAuxData
......
......
@@ -11,7 +11,7 @@
# from the tag file datatype. I think these are used more than the tag
# datratypes in describing the data.
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
from .parse_dm3 import *
import numpy as np
......
......
@@ -4,7 +4,7 @@ Created on Aug 7, 2015
@author: James Anderson
'''
-from __future__ import division, print_function, absolute_import, unicode_literals
+from __future__ import division, print_function, absolute_import
import collections
import struct
import array
......
......
@@ -4,7 +4,9 @@ Created on Tue Nov 3 21:14:25 2015
@author: Suhas Somnath, Chris Smith, Numan Laanait
"""
-from __future__ import division, print_function, absolute_import, unicode_literals
+# cannot import unicode_literals since it is not compatible with h5py just yet
+from __future__ import division, print_function, absolute_import
import os
import h5py
from warnings import warn
......