KMeans Clustering not writing to file
Created by: sulaymandesai
Hi,
Hope you're well.
I am following this published notebook: https://nbviewer.jupyter.org/github/pycroscopy/papers/blob/master/Notebooks/EM/STEM/Image_Cleaning_Atom_Finding.ipynb
When I run the KMeans clustering cell below, I get the following error (the cell's output and the traceback follow the code):
num_clusters = 4
# num_clusters = 32
estimator = px.processing.Cluster(h5_U, KMeans(n_clusters=num_clusters), num_comps=num_comps)
if estimator.duplicate_h5_groups==[]:
    t0 = time()
    h5_kmeans = estimator.compute()
    print('kMeans took {} seconds.'.format(round(time()-t0, 2)))
else:
    h5_kmeans = estimator.duplicate_h5_groups[-1]
    print( 'Using existing results.')
print( 'Clustering results in {}.'.format(h5_kmeans.name))
half_wind = int(win_size*0.5)
# generate a cropped image that was effectively the area that was used for pattern searching
# Need to get the math right on the counting
cropped_clean_image = clean_image_mat[half_wind:-half_wind + 1, half_wind:-half_wind + 1]
# Plot cluster results
# Get the labels dataset
labels_mat = np.reshape(h5_kmeans['Labels'][()], [num_rows, num_cols])
fig, axes = plt.subplots(ncols=2, figsize=(14,7))
axes[0].imshow(cropped_clean_image,cmap=spiepy.NANOMAP, origin='lower')
axes[0].set_title('Cleaned Image', fontsize=16)
axes[1].imshow(labels_mat, aspect=1, interpolation='none',cmap=spiepy.NANOMAP, origin='lower')
axes[1].set_title('K-means cluster labels', fontsize=16);
for axis in axes:
    axis.get_yaxis().set_visible(False)
    axis.get_xaxis().set_visible(False)
usid.jupyter_utils.save_fig_filebox_button(fig, 'Clustered_Clean_Image.png')
Consider calling test() to check results before calling compute() which computes on the entire dataset and writes results to the HDF5 file
Group: <HDF5 group "/Measurement_000/Channel_000/Plane_Mean_Subtracted_Data-Windowing_000/Image_Windows-SVD_000/U-Cluster_000" (0 members)> had neither the status HDF5 dataset or the legacy attribute: "last_pixel".
Group: <HDF5 group "/Measurement_000/Channel_000/Plane_Mean_Subtracted_Data-Windowing_000/Image_Windows-SVD_000/U-Cluster_001" (0 members)> had neither the status HDF5 dataset or the legacy attribute: "last_pixel".
Group: <HDF5 group "/Measurement_000/Channel_000/Plane_Mean_Subtracted_Data-Windowing_000/Image_Windows-SVD_000/U-Cluster_002" (0 members)> had neither the status HDF5 dataset or the legacy attribute: "last_pixel".
Performing clustering on /Measurement_000/Channel_000/Plane_Mean_Subtracted_Data-Windowing_000/Image_Windows-SVD_000/U.
Took 5.76 sec to compute KMeans
Calculated the Mean Response of each cluster.
Took 340.1 msec to calculate mean response per cluster
Writing clustering results to file.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-35-6b9a66d30096> in <module>
7 if estimator.duplicate_h5_groups==[]:
8 t0 = time()
----> 9 h5_kmeans = estimator.compute()
10 print('kMeans took {} seconds.'.format(round(time()-t0, 2)))
11 else:
~/.pyenv/versions/3.8.3/lib/python3.8/site-packages/pycroscopy-0.60.7-py3.8.egg/pycroscopy/processing/cluster.py in compute(self, rearrange_clusters, override)
226
227 if self.h5_results_grp is None:
--> 228 h5_group = self._write_results_chunk()
229 self.delete_results()
230 else:
~/.pyenv/versions/3.8.3/lib/python3.8/site-packages/pycroscopy-0.60.7-py3.8.egg/pycroscopy/processing/cluster.py in _write_results_chunk(self)
282 h5_cluster_group = create_results_group(self.h5_main, self.process_name,
283 h5_parent_group=self._h5_target_group)
--> 284 self._write_source_dset_provenance()
285
286 write_simple_attrs(h5_cluster_group, self.parms_dict)
~/.pyenv/versions/3.8.3/lib/python3.8/site-packages/pyUSID/processing/process.py in _write_source_dset_provenance(self)
793
794 @staticmethod
--> 795 def _map_function(*args, **kwargs):
796 """
797 The function that manipulates the data on a single instance (position). This will be used by
~/.pyenv/versions/3.8.3/lib/python3.8/site-packages/sidpy/hdf/hdf_utils.py in write_simple_attrs(h5_obj, attrs, verbose)
371 '{}'.format(type(attrs)))
372 if not isinstance(h5_obj, (h5py.File, h5py.Group, h5py.Dataset)):
--> 373 raise TypeError('h5_obj should be a h5py File, Group or Dataset object'
374 ' but is instead of type '
375 '{}t'.format(type(h5_obj)))
TypeError: h5_obj should be a h5py File, Group or Dataset object but is instead of type <class 'NoneType'>t
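The traceback passes through three packages (pycroscopy, pyUSID and sidpy), so the installed versions may be relevant. Below is a minimal sketch for listing them using only the standard library. The helper name `installed_versions` is hypothetical, and the distribution names are assumed to match the package names that appear in the paths above.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not visible to this interpreter
            versions[name] = None
    return versions


# Assumption: these distribution names are taken from the traceback paths
report = installed_versions(['pycroscopy', 'pyUSID', 'sidpy', 'h5py'])
for name, version in report.items():
    print('{}: {}'.format(name, version or 'not installed'))
```

Running this in the same kernel as the notebook should show which versions the failing cell actually used.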
Any help would be appreciated!