Commit 783ab218 authored by Unknown

Combine the load dataset example with the interacting with h5 tutorial

parent 89ac9e7e
......@@ -3,24 +3,6 @@
Examples using ``pycroscopy.hdf_utils.getDataSet``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="Conventionally, the h5py package is used to create, read, write, and modify h5 files.">
.. only:: html
.. figure:: /auto_examples/images/thumb/sphx_glr_plot_load_dataset_example_thumb.png
:ref:`sphx_glr_auto_examples_plot_load_dataset_example.py`
.. raw:: html
</div>
.. only:: not html
* :ref:`sphx_glr_auto_examples_plot_load_dataset_example.py`
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="11/11/2017">
......
......@@ -3,24 +3,6 @@
Examples using ``pycroscopy.hdf_utils.print_tree``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="Conventionally, the h5py package is used to create, read, write, and modify h5 files.">
.. only:: html
.. figure:: /auto_examples/images/thumb/sphx_glr_plot_load_dataset_example_thumb.png
:ref:`sphx_glr_auto_examples_plot_load_dataset_example.py`
.. raw:: html
</div>
.. only:: not html
* :ref:`sphx_glr_auto_examples_plot_load_dataset_example.py`
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="S. Somnath\ :sup:`1,2`, R. K. Vasudevan\ :sup:`1,3` * :sup:`1` Institute for Functional Imagin...">
......
......@@ -5,13 +5,13 @@ Examples using ``pycroscopy.ioHDF5``
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="Conventionally, the h5py package is used to create, read, write, and modify h5 files.">
<div class="sphx-glr-thumbcontainer" tooltip="S. Somnath\ :sup:`1,2`, R. K. Vasudevan\ :sup:`1,3` * :sup:`1` Institute for Functional Imagin...">
.. only:: html
.. figure:: /auto_examples/images/thumb/sphx_glr_plot_load_dataset_example_thumb.png
.. figure:: /auto_examples/images/thumb/sphx_glr_plot_spectral_unmixing_thumb.png
:ref:`sphx_glr_auto_examples_plot_load_dataset_example.py`
:ref:`sphx_glr_auto_examples_plot_spectral_unmixing.py`
.. raw:: html
......@@ -19,17 +19,17 @@ Examples using ``pycroscopy.ioHDF5``
.. only:: not html
* :ref:`sphx_glr_auto_examples_plot_load_dataset_example.py`
* :ref:`sphx_glr_auto_examples_plot_spectral_unmixing.py`
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="S. Somnath\ :sup:`1,2`, R. K. Vasudevan\ :sup:`1,3` * :sup:`1` Institute for Functional Imagin...">
<div class="sphx-glr-thumbcontainer" tooltip="">
.. only:: html
.. figure:: /auto_examples/images/thumb/sphx_glr_plot_spectral_unmixing_thumb.png
.. figure:: /auto_examples/images/thumb/sphx_glr_plot_microdata_example_thumb.png
:ref:`sphx_glr_auto_examples_plot_spectral_unmixing.py`
:ref:`sphx_glr_auto_examples_plot_microdata_example.py`
.. raw:: html
......@@ -37,17 +37,17 @@ Examples using ``pycroscopy.ioHDF5``
.. only:: not html
* :ref:`sphx_glr_auto_examples_plot_spectral_unmixing.py`
* :ref:`sphx_glr_auto_examples_plot_microdata_example.py`
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="">
<div class="sphx-glr-thumbcontainer" tooltip="**Suhas Somnath** 8/8/2017">
.. only:: html
.. figure:: /auto_examples/images/thumb/sphx_glr_plot_microdata_example_thumb.png
.. figure:: /auto_examples/dev_tutorials/images/thumb/sphx_glr_plot_tutorial_02_writing_to_h5_thumb.png
:ref:`sphx_glr_auto_examples_plot_microdata_example.py`
:ref:`sphx_glr_auto_examples_dev_tutorials_plot_tutorial_02_writing_to_h5.py`
.. raw:: html
......@@ -55,17 +55,17 @@ Examples using ``pycroscopy.ioHDF5``
.. only:: not html
* :ref:`sphx_glr_auto_examples_plot_microdata_example.py`
* :ref:`sphx_glr_auto_examples_dev_tutorials_plot_tutorial_02_writing_to_h5.py`
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="**Suhas Somnath** 8/8/2017">
<div class="sphx-glr-thumbcontainer" tooltip="11/11/2017">
.. only:: html
.. figure:: /auto_examples/dev_tutorials/images/thumb/sphx_glr_plot_tutorial_02_writing_to_h5_thumb.png
.. figure:: /auto_examples/user_tutorials/images/thumb/sphx_glr_plot_tutorial_01_interacting_w_h5_files_thumb.png
:ref:`sphx_glr_auto_examples_dev_tutorials_plot_tutorial_02_writing_to_h5.py`
:ref:`sphx_glr_auto_examples_user_tutorials_plot_tutorial_01_interacting_w_h5_files.py`
.. raw:: html
......@@ -73,7 +73,7 @@ Examples using ``pycroscopy.ioHDF5``
.. only:: not html
* :ref:`sphx_glr_auto_examples_dev_tutorials_plot_tutorial_02_writing_to_h5.py`
* :ref:`sphx_glr_auto_examples_user_tutorials_plot_tutorial_01_interacting_w_h5_files.py`
.. raw:: html
......
%% Cell type:code id: tags:
``` python
%matplotlib inline
```
%% Cell type:markdown id: tags:
\n\n======================================================================================\nTutorial 1: Data Translation\n======================================================================================\n\n**Suhas Somnath**\n8/8/2017\n\nThis set of tutorials will serve as examples for developing end-to-end workflows for and using pycroscopy.\n\n**In this example, we extract data and parameters from a Scanning Tunnelling Spectroscopy (STS) raw data file, as\nobtained from an Omicron STM, and write these to a pycroscopy compatible data file.**\n\n\nPrerequisites:\n==============\n\nBefore proceeding with this example series, we recommend reading the previous documents to learn more about:\n\n1. Data and file formats\n * Why you should care about data formats\n * Current state of data formats in microscopy\n * Structuring data in pycroscopy\n\n2. HDF5 file format\n\n\nIntroduction to Data Translation\n================================\n\nBefore any data analysis, we need to access data stored in the raw file(s) generated by the microscope. Often, the\ndata and parameters in these files are **not** straightforward to access. In certain cases, additional / dedicated\nsoftware packages are necessary to access the data while in many other cases, it is possible to extract the necessary\ninformation from built-in **numpy** or similar python packages included with **anaconda**.\n\nPycroscopy aims to make data access, storage, curation, etc. 
simple by storing the data along with all\nrelevant parameters in a single **.hdf5** or **.h5** file.\n\nThe process of copying data from the original format to **pycroscopy compatible hdf5 files** is called\n**Translation** and the classes available in pycroscopy that perform these operations are called **Translators**.\n\n\nWriting Your First Data Translator\n==================================\n\n**The goal in this section is to translate the .asc file obtained from an Omicron microscope into a pycroscopy\ncompatible .h5 file.**\n\nWhile there is an **AscTranslator** available in pycroscopy that can translate these files in just a **single** line,\nwe will intentionally assume that no such translator is available. Using a handful of useful functions in pycroscopy,\nwe will translate the files from the source **.asc** format to the pycroscopy compatible **.h5** in just a few lines.\nThe code developed below is essentially the **AscTranslator**. The same methodology can be used to translate other data\nformats.\n\n\nSetting up the notebook\n=======================\n\nThere are a few setup procedures that need to be followed before any code is written. In this step, we simply load a\nfew python packages that will be necessary in the later steps.
%% Cell type:code id: tags:
``` python
# Ensure python 3 compatibility:\nfrom __future__ import division, print_function, absolute_import, unicode_literals\n\n# In case some of these packages are not installed, install them\n#!pip install -U os wget numpy h5py matplotlib pycroscopy\n\n# The package for accessing files in directories, etc.:\nimport os\nimport wget\n\n# The mathematical computation package:\nimport numpy as np\n\n# The package used for creating and manipulating HDF5 files:\nimport h5py\n\n# Packages for plotting:\nimport matplotlib.pyplot as plt\n\n# Finally import pycroscopy for certain scientific analysis:\nimport pycroscopy as px
# Ensure python 3 compatibility:\nfrom __future__ import division, print_function, absolute_import, unicode_literals\n\n# The package for accessing files in directories, etc.:\nimport os\n\n# Warning package in case something goes wrong\nfrom warnings import warn\n\n# Package for downloading online files:\ntry:\n # This package is not part of anaconda and may need to be installed.\n import wget\nexcept ImportError:\n warn('wget not found. Will install with pip.')\n import pip\n pip.main(['install', 'wget'])\n import wget\n\n# The mathematical computation package:\nimport numpy as np\n\n# The package used for creating and manipulating HDF5 files:\nimport h5py\n\n# Packages for plotting:\nimport matplotlib.pyplot as plt\n\n# Finally import pycroscopy for certain scientific analysis:\ntry:\n import pycroscopy as px\nexcept ImportError:\n warn('pycroscopy not found. Will install with pip.')\n import pip\n pip.main(['install', 'pycroscopy'])\n import pycroscopy as px
```
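The install-on-``ImportError`` pattern in the cell above calls ``pip.main``, which is not a supported public entry point and is no longer exposed at the top level of recent pip releases. The block below is a minimal sketch of the same idea that runs pip as a subprocess of the current interpreter instead. The helper name ``ensure_package`` and its arguments are hypothetical and are not part of pycroscopy or of this commit.

```python
import importlib
import subprocess
import sys


def ensure_package(module_name, pip_name=None):
    """Import module_name, installing it with pip first if it is missing.

    Hypothetical helper for illustration only. pip_name is the name to
    pass to pip when it differs from the importable module name.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Run pip through the interpreter that is executing this code so
        # the package lands in the environment that is actually in use.
        subprocess.check_call(
            [sys.executable, '-m', 'pip', 'install', pip_name or module_name])
        # Make sure the import system notices the newly installed files.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)


# A standard-library module is already importable, so nothing is installed:
json_module = ensure_package('json')
```

In the notebook this could be used as ``wget = ensure_package('wget')`` and ``px = ensure_package('pycroscopy')``, assuming network access and permission to install packages.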
%% Cell type:markdown id: tags:
0. Select the Raw Data file\n===========================\nDownload the data file from Github:
%% Cell type:code id: tags:
``` python
url = 'https://raw.githubusercontent.com/pycroscopy/pycroscopy/master/data/STS.asc'\ndata_file_path = 'temp_1.asc'\nif os.path.exists(data_file_path):\n os.remove(data_file_path)\n_ = wget.download(url, data_file_path, bar=None)
```
%% Cell type:markdown id: tags:
1. Exploring the Raw Data File\n==============================\n\nInherently, one may not know how to read these **.asc** files. One option is to try and read the file as a text file\none line at a time.\n\nIt turns out that these .asc files are effectively standard **ASCII** text files.\n\nHere is how we tested to see if the **asc** files could be interpreted as text files. Below, we read just the first 10\nlines in the file.
%% Cell type:code id: tags:
``` python
with open(data_file_path, 'r') as file_handle:\n for lin_ind in range(10):\n print(file_handle.readline())
```
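The preview above prints each line together with its trailing newline, so the output is double spaced. A small sketch of the same check that strips the newline and stops cleanly on short files is given below. The function name ``peek_lines`` and the temporary demo file are assumptions made for illustration; in the notebook the path would be ``data_file_path``.

```python
import itertools
import os
import tempfile


def peek_lines(path, num_lines=10):
    """Return up to num_lines lines from a text file, without newlines."""
    with open(path, 'r') as file_handle:
        # islice stops early if the file has fewer than num_lines lines.
        return [line.rstrip('\n')
                for line in itertools.islice(file_handle, num_lines)]


# Demonstration on a tiny stand-in file (not the real STS.asc):
with tempfile.NamedTemporaryFile('w', suffix='.asc', delete=False) as tmp:
    tmp.write('# x-pixels = 100\n# y-pixels = 100\n0.1\t0.2\t\n')
    demo_path = tmp.name

preview = peek_lines(demo_path, num_lines=2)
os.remove(demo_path)

for line in preview:
    print(line)
```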
%% Cell type:markdown id: tags:
2. Loading the data\n===================\nNow that we know that these files are simple text files, we can manually go through the file to find out which lines\nare important, at what lines the data starts etc.\nManual investigation of such .asc files revealed that these files are always formatted in the same way. Also, they\ncontain parameters in the first 403 lines and then contain data which is arranged as one pixel per row.\nSTS experiments result in 3 dimensional datasets (X, Y, current). In other words, a 1D array of current data (as a\nfunction of excitation bias) is sampled at every location on a two dimensional grid of points on the sample.\nBy knowing where the parameters are located and how the data is structured, it is possible to extract the necessary\ninformation from these files.\nSince we know that the data sizes (<200 MB) are much smaller than the physical memory of most computers, we can start\nby safely loading the contents of the entire file to memory
%% Cell type:code id: tags:
``` python
# Extracting the raw data into memory\nfile_handle = open(data_file_path, 'r')\nstring_lines = file_handle.readlines()\nfile_handle.close()
```
%% Cell type:markdown id: tags:
3. Read the parameters\n======================\nThe parameters in these files are present in the first few lines of the file
%% Cell type:code id: tags:
``` python
# Reading parameters stored in the first few rows of the file\nparm_dict = dict()\nfor line in string_lines[3:17]:\n line = line.replace('# ', '')\n line = line.replace('\n', '')\n temp = line.split('=')\n test = temp[1].strip()\n try:\n test = float(test)\n # convert those values that should be integers:\n if test % 1 == 0:\n test = int(test)\n except ValueError:\n pass\n parm_dict[temp[0].strip()] = test\n\n# Print out the parameters extracted\nfor key in parm_dict.keys():\n print(key, ':\t', parm_dict[key])
```
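The parsing loop above can also be written as a reusable function, which makes it easy to try on a few lines before running it on the real file. This is only a sketch: the function name ``parse_header`` is hypothetical, the sample lines are made up, and the ``scan-unit`` key is invented to show a value that stays a string. The keys ``x-pixels``, ``y-pixels`` and ``z-points`` are the ones used later in this tutorial.

```python
def parse_header(lines):
    """Turn '# key = value' lines into a dict, converting numbers."""
    parms = dict()
    for line in lines:
        line = line.strip()
        # Skip anything that does not look like '# key = value'.
        if not line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.lstrip('#').partition('=')
        value = value.strip()
        try:
            number = float(value)
            # Store whole numbers as int, as the tutorial code does.
            value = int(number) if number.is_integer() else number
        except ValueError:
            pass  # Not a number: keep the text as is.
        parms[key.strip()] = value
    return parms


# Made-up header lines for illustration:
sample_header = ['# x-pixels = 100',
                 '# y-pixels = 100',
                 '# z-points = 500',
                 '# scan-unit = nm',
                 'no parameter here']
sample_parms = parse_header(sample_header)
```

Unlike the slice ``string_lines[3:17]``, this version does not depend on the parameters sitting at fixed line numbers, at the cost of assuming every parameter line starts with ``#`` and contains ``=``.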
%% Cell type:markdown id: tags:
3.a Prepare to read the data\n============================\nBefore we read the data, we need to make an empty array to store all this data. In order to do this, we need to read\nthe dictionary of parameters we made in step 3 and extract the necessary quantities.
%% Cell type:code id: tags:
``` python
num_rows = int(parm_dict['y-pixels'])\nnum_cols = int(parm_dict['x-pixels'])\nnum_pos = num_rows * num_cols\nspectra_length = int(parm_dict['z-points'])
```
%% Cell type:markdown id: tags:
3.b Read the data\n=================\nData is present after the first 403 lines of parameters.
%% Cell type:code id: tags:
``` python
# num_headers = len(string_lines) - num_pos\nnum_headers = 403\n\n# Extract the STS data from subsequent lines\nraw_data_2d = np.zeros(shape=(num_pos, spectra_length), dtype=np.float32)\nfor line_ind in range(num_pos):\n this_line = string_lines[num_headers + line_ind]\n string_spectrum = this_line.split('\t')[:-1] # omitting the new line\n raw_data_2d[line_ind] = np.array(string_spectrum, dtype=np.float32)
```
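The commented-out line in the cell above hints that the header length need not be hard-coded: if the file ends with exactly one row per pixel, the number of header lines is the total line count minus ``num_pos``. The sketch below illustrates that idea with plain Python lists and made-up sample lines. The function name ``split_header_and_data`` is hypothetical, and the approach assumes there are no blank lines after the data.

```python
def split_header_and_data(string_lines, num_pos):
    """Infer the header length and parse the trailing num_pos data rows."""
    num_headers = len(string_lines) - num_pos
    if num_headers < 0:
        raise ValueError('File has fewer lines than positions')
    spectra = []
    for this_line in string_lines[num_headers:]:
        # Rows end with a tab followed by a newline, so the last field
        # is empty after stripping and is skipped here.
        spectra.append([float(val) for val in this_line.split('\t')
                        if val.strip()])
    return num_headers, spectra


# Made-up file contents: two header lines, then two pixels of three points.
sample_lines = ['# x-pixels = 2\n',
                '# y-pixels = 1\n',
                '0.1\t0.2\t0.3\t\n',
                '0.4\t0.5\t0.6\t\n']
num_headers, spectra = split_header_and_data(sample_lines, num_pos=2)
```

For the real file the nested list could be passed to ``np.array(spectra, dtype=np.float32)`` to obtain the same 2D array as ``raw_data_2d``.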
%% Cell type:markdown id: tags:
4.a Preparing some necessary parameters\n=======================================
%% Cell type:code id: tags:
``` python
max_v = 1 # This is the one parameter we are not sure about\n\nfolder_path, file_name = os.path.split(data_file_path)\nfile_name = file_name[:-4] + '_'\n\n# Generate the x / voltage / spectroscopic axis:\nvolt_vec = np.linspace(-1 * max_v, 1 * max_v, spectra_length)\n\nh5_path = os.path.join(folder_path, file_name + '.h5')
```
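The cell above builds the output name with ``file_name[:-4]``, which assumes a three-letter extension. A slightly more general sketch using ``os.path.splitext`` follows. The function name ``translated_path`` is hypothetical; the trailing underscore matches the naming used in the cell above.

```python
import os


def translated_path(data_file_path, suffix='_'):
    """Build the .h5 path next to the raw file, for any extension length."""
    folder_path, file_name = os.path.split(data_file_path)
    base_name, _ = os.path.splitext(file_name)
    return os.path.join(folder_path, base_name + suffix + '.h5')


h5_example = translated_path('temp_1.asc')
print(h5_example)
```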
%% Cell type:markdown id: tags:
4.b Calling the NumpyTranslator to create the pycroscopy data file\n==================================================================\nThe NumpyTranslator simplifies the creation of pycroscopy compatible datasets. It handles the file creation,\ndataset creation and writing, creation of ancillary datasets, datagroup creation, writing parameters, linking\nancillary datasets to the main dataset etc. With a single call to the NumpyTranslator, we complete the translation\nprocess.
%% Cell type:code id: tags:
``` python
tran = px.io.NumpyTranslator()\nh5_path = tran.translate(h5_path, raw_data_2d, num_rows, num_cols,\n qty_name='Current', data_unit='nA', spec_name='Bias',\n spec_unit='V', spec_val=volt_vec, scan_height=100,\n scan_width=200, spatial_unit='nm', data_type='STS',\n translator_name='ASC', parms_dict=parm_dict)
```
%% Cell type:markdown id: tags:
Notes on pycroscopy translation\n===============================\n* Steps 1-3 would be performed anyway in order to begin data analysis\n* The actual pycroscopy translation steps are reduced to just 3-4 lines in step 4.\n* While this approach is feasible and encouraged for simple and small data, it may be necessary to use lower level\n calls to write efficient translators\n\nVerifying the newly written H5 file:\n====================================\n* We will only perform some simple and quick verification to show that the data has indeed been translated correctly.\n* Please see the next notebook in the example series to learn more about reading and accessing data.
%% Cell type:code id: tags:
``` python
with h5py.File(h5_path, mode='r') as h5_file:\n # See if a tree has been created within the hdf5 file:\n px.hdf_utils.print_tree(h5_file)\n\n h5_main = h5_file['Measurement_000/Channel_000/Raw_Data']\n fig, axes = plt.subplots(ncols=2, figsize=(11, 5))\n spat_map = np.reshape(h5_main[:, 100], (100, 100))\n px.plot_utils.plot_map(axes[0], spat_map, origin='lower')\n axes[0].set_title('Spatial map')\n axes[0].set_xlabel('X')\n axes[0].set_ylabel('Y')\n axes[1].plot(np.linspace(-1.0, 1.0, h5_main.shape[1]),\n h5_main[250])\n axes[1].set_title('IV curve at a single pixel')\n axes[1].set_xlabel('Tip bias [V]')\n axes[1].set_ylabel('Current [nA]')\n\n# Remove both the original and translated files:\nos.remove(h5_path)\nos.remove(data_file_path)
```
......
......@@ -66,12 +66,21 @@ few python packages that will be necessary in the later steps.
# Ensure python 3 compatibility:
from __future__ import division, print_function, absolute_import, unicode_literals
# In case some of these packages are not installed, install them
#!pip install -U os wget numpy h5py matplotlib pycroscopy
# The package for accessing files in directories, etc.:
import os
import wget
# Warning package in case something goes wrong
from warnings import warn
# Package for downloading online files:
try:
# This package is not part of anaconda and may need to be installed.
import wget
except ImportError:
warn('wget not found. Will install with pip.')
import pip
pip.main(['install', 'wget'])
import wget
# The mathematical computation package:
import numpy as np
......@@ -83,7 +92,13 @@ import h5py
import matplotlib.pyplot as plt
# Finally import pycroscopy for certain scientific analysis:
import pycroscopy as px
try:
import pycroscopy as px
except ImportError:
warn('pycroscopy not found. Will install with pip.')
import pip
pip.main(['install', 'pycroscopy'])
import pycroscopy as px
####################################################################################
# 0. Select the Raw Data file
......
cb56813ad1c279b344400a6b79314e21
\ No newline at end of file
431362ec4dbb3d9e362646182c626de6
\ No newline at end of file
......@@ -74,12 +74,21 @@ few python packages that will be necessary in the later steps.
# Ensure python 3 compatibility:
from __future__ import division, print_function, absolute_import, unicode_literals
# In case some of these packages are not installed, install them
#!pip install -U os wget numpy h5py matplotlib pycroscopy
# The package for accessing files in directories, etc.:
import os
import wget
# Warning package in case something goes wrong
from warnings import warn
# Package for downloading online files:
try:
# This package is not part of anaconda and may need to be installed.
import wget
except ImportError:
warn('wget not found. Will install with pip.')
import pip
pip.main(['install', 'wget'])
import wget
# The mathematical computation package:
import numpy as np
......@@ -91,7 +100,13 @@ few python packages that will be necessary in the later steps.
import matplotlib.pyplot as plt