Differences between revisions 15 and 30 (spanning 15 versions)
Revision 15 as of 2013-08-11 17:50:09
Size: 3853
Editor: SteveLudtke
Comment:
Revision 30 as of 2019-07-31 22:17:45
Size: 6484
Editor: MuyuanChen
Comment: __
Metadata Stored in JSON files

JSON files in the info folder

JSON files are human-readable, editable text files that are also easy for software to parse. JSON is rapidly becoming one of the most common formats used in the web-development community. The files have a '.json' extension and can be opened in any text editor.

For more information on these files and accessing them in EMAN2, see Eman2JSStorage.

While these files are human-readable, using the "Info" button in the EMAN2 file browser will give a much more convenient way to look at the contents of these files.

If you have a micrograph called, for example, micrographs/DC19873.hdf and have processed it with EMAN2, you will end up with derived image files such as particles/DC19873_ptcls.hdf and particles/DC19873__ctf_flip.hdf. Any information that needs to be stored about the micrograph as a whole is stored in info/DC19873_info.json. To copy micrographs from one project to another, copy the image file(s) to the new project and copy this JSON file to its info/ folder.
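The naming rule above can be sketched in a few lines of Python. This is only an illustration: `info_path` is a hypothetical helper, not an EMAN2 function, and it assumes that the part of the file name before the first double underscore is the base name. Older single-underscore names such as DC19873_ptcls.hdf are not handled by this sketch.

```python
import posixpath


def info_path(image_path):
    """Return the info/ JSON path for an image file (hypothetical helper)."""
    filename = posixpath.basename(image_path)   # drop the directory
    stem = posixpath.splitext(filename)[0]      # drop the extension
    base = stem.split("__", 1)[0]               # drop any "__suffix" (assumed convention)
    return posixpath.join("info", base + "_info.json")


print(info_path("micrographs/DC19873.hdf"))
print(info_path("particles/DC19873__ctf_flip.hdf"))
```

Both calls should resolve to the same file, info/DC19873_info.json, which is why one JSON file can travel with a micrograph and everything derived from it.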

project.json

This file contains overall project parameters, such as A/pix, voltage, mass, etc.

||Parameter||Description||
||global.apix||A/pixel value for this project. Generally speaking, if you want to downsample/rescale your data, you should do this in a separate project.||
||global.microscope_voltage||Default microscope voltage in kV when running CTF fitting and micrograph evaluation programs. While it is technically possible to combine data from multiple microscopes in a single project, it is not usually a good idea. If you are thinking about doing this you may wish to ask us about it first.||
||global.microscope_cs||Default Cs value in mm when running CTF fitting and micrograph evaluation programs.||
||global.particle_mass||The default mass of the particle being reconstructed in kDa. Clearly in some cases (assemblies in various stages of assembly, etc.) there will not be a single value for this. This value is primarily a default and is really only used for setting reasonable isosurfaces, so such situations should not be a problem.||
||global.boxsize||The default box size used by e2boxer and other programs. Specified in pixels in the fully sampled micrographs.||
||global.ptclsize||Estimated maximum dimension of the particle being reconstructed, measured in fully sampled pixels.||
||global.invariant_type||Which type of invariants this project uses. Changing invariants requires redoing particle preprocessing, so it is considered a property of the project. "bispec" or "harmonic" are the current choices.||
||project_icon||Used by the GUI||
||project_name||For recordkeeping purposes||
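Because project.json is plain JSON, simple values can be inspected without EMAN2. The sketch below uses only the Python standard library. The parameter values are made up for illustration, and `load_project_settings` is a hypothetical helper. It also assumes the file holds only plain numbers and strings; see Eman2JSStorage for the EMAN2 way of accessing these files.

```python
import json
import os
import tempfile

# Illustrative values only; a real project.json is written by EMAN2.
example_project = {
    "global.apix": 1.06,
    "global.microscope_voltage": 300.0,
    "global.microscope_cs": 2.7,
    "global.particle_mass": 500.0,
    "global.boxsize": 256,
}


def load_project_settings(info_dir):
    """Read project.json from an info/ folder and return it as a dict."""
    with open(os.path.join(info_dir, "project.json")) as f:
        return json.load(f)


# Write the example into a temporary info/ folder, then read it back.
with tempfile.TemporaryDirectory() as project_dir:
    info_dir = os.path.join(project_dir, "info")
    os.makedirs(info_dir)
    with open(os.path.join(info_dir, "project.json"), "w") as f:
        json.dump(example_project, f, indent=1)

    settings = load_project_settings(info_dir)

apix = settings["global.apix"]
# Box size in Angstroms = box size in pixels * A/pixel
box_angstroms = settings["global.boxsize"] * apix
print(f"apix={apix}, box={box_angstroms:.2f} A")
```

Note that the parameter names contain a dot ("global.apix") but are ordinary flat dictionary keys, not nested objects.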

notebook.json

This is where the projectmanager stores the text entered in the 'lab notebook'.

<micrograph>_info.json

Per-micrograph information is stored in one file for each micrograph. This allows easy copying of micrographs with their metadata between projects. While there is, of course, metadata in the image headers, this metadata is not stored in the header because the information, such as CTF information, is associated with multiple image files, including the micrograph itself as well as particle stacks.

||Parameter||Description||
||boxes||List of lists. Each item is a 3-element list containing (X-center,Y-center,method), where method is a string describing the method used to find the particle, e.g. "manual",...||
||boxes_rct||List of lists. Each item has the same layout as in "boxes" above, with the third element set to "tilted" or "untilted".||
||ctf||A list of CTF-related objects: [EMAN2CTF instance,signal 1D,background 1D] computed from particles, not the overall frame. Note: prior to 5/18/16, im_2d and bg_2d were also stored here.||
||ctf_im2d||2-D power spectrum average of particles from this image||
||ctf_bg2d||2-D background power spectrum from particles in this image||
||ctf_frame||A list of CTF-related objects associated with the whole frame: [box size,EMAN2CTF instance,box coord,set of excluded boxnums],quality (old location),oversampling||
||ctf_microbox||A 1-D power spectrum computed from the whole micrograph with the same box size and mask as individual particles. Only generated if e2ctf.py has the correct options||
||quality||A single integer from 0-9. No predefined meaning, though 5 is the default value, and larger should be interpreted as better. This permits qualitative assessment by the user at various stages of analysis.||
||sets||A dictionary containing lists of integers keyed by set name||
|| ||''bad_particles'' - Particle numbers which have been determined to be 'bad'. The 'bad' particles may optionally be excluded when building sets||
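As a rough illustration of how the "boxes" and "sets" entries fit together, the sketch below works on an in-memory dictionary shaped like the table above. Several things here are assumptions rather than documented behavior: the "auto" method string is invented, "bad_particles" is treated as a key inside "sets", and particle numbers are assumed to index into the "boxes" list. Both helper functions are hypothetical.

```python
from collections import Counter

# Made-up fragment of a <micrograph>_info.json file.
info = {
    "boxes": [
        [100, 200, "manual"],
        [340, 80, "manual"],
        [512, 512, "auto"],   # "auto" is a placeholder method name
    ],
    "quality": 5,
    "sets": {"bad_particles": [1]},
}


def boxes_by_method(info):
    """Count particle boxes per picking method."""
    return Counter(method for _x, _y, method in info.get("boxes", []))


def good_particle_numbers(info):
    """Return particle numbers that are not listed as bad.

    Assumes particle numbers are positions in the "boxes" list.
    """
    bad = set(info.get("sets", {}).get("bad_particles", []))
    return [n for n in range(len(info.get("boxes", []))) if n not in bad]


counts = boxes_by_method(info)
good = good_particle_numbers(info)
print(dict(counts), good)
```

Using `.get()` with defaults keeps the helpers usable on micrographs that have not been boxed or had any sets defined yet.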

<tomogram>_info.json

The parameters used by the tomogram processing workflow are listed here. Files with the same base name share a single JSON file, regardless of their directory and suffix. So "tiltseries/xx.hdf", "tomograms/xx__bin4.hdf", "tomograms/xx__bin2.hdf", and "particles3d/xx__ptcl.hdf" all use the same info file, "info/xx_info.json".

||Parameter||Description||
||tlt_file||File name of the raw tilt series||
||tlt_params||(N, 5) list, where N is the number of tilts in the tilt series. The columns are translation along the x and y axes, and rotation around the z, y, and x axes in EMAN coordinates. The translation is in unbinned pixels, and the rotation is in degrees.||
||apix_unbin||Unbinned A/pixel value for the tilt series.||
||ali_loss||Alignment score for each tilt image computed during reconstruction. Lower is better.||
||boxes_3d||List of lists. Each item is a list containing (X-center,Y-center,Z-center,method,score,class #), where method is a string describing the method used to find the particle, e.g. "manual",..., class # is an integer grouping similar particles, and score is an arbitrary float (lower is better). The coordinates are in unbinned pixels, and zero is at the center of the tomogram.||
||class_list||Dictionary of dictionaries. {"class # in boxes_3d" : {"name" : name of the particle type, "boxsize" : unbinned box size of the particle} }||
||cs||Cs value input during CTF estimation.||
||voltage||Voltage value input during CTF estimation.||
||defocues||A list containing the defocus value at the center of each tilt image||
||phase||Phase shift for each tilt image||
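Since boxes_3d coordinates are unbinned and measured from the tomogram center, using them with a binned tomogram needs a small conversion. The following is a minimal sketch under stated assumptions: the tomogram dimensions and binning factor are supplied by the caller, positions are rounded to the nearest voxel, and all values (including the particle names) are invented. Neither helper is part of EMAN2.

```python
# Made-up fragment of an info/xx_info.json file.
info = {
    "apix_unbin": 1.3,
    "boxes_3d": [
        [40, -80, 0, "manual", 0.5, 0],
        [-400, 120, 60, "manual", 0.2, 1],
    ],
    "class_list": {
        "0": {"name": "ribosome", "boxsize": 256},
        "1": {"name": "microtubule", "boxsize": 128},
    },
}


def box_to_binned_voxel(box, tomo_shape_xyz, binning):
    """Map an unbinned, center-origin box position to a voxel index
    in a binned tomogram of size tomo_shape_xyz = (nx, ny, nz)."""
    return tuple(
        int(round(coord / binning + size / 2))
        for coord, size in zip(box[:3], tomo_shape_xyz)
    )


def class_name(info, box):
    """Look up the particle type name for a boxes_3d entry."""
    # JSON object keys are strings, so the integer class # is converted.
    return info["class_list"][str(box[5])]["name"]


first = info["boxes_3d"][0]
voxel = box_to_binned_voxel(first, (1024, 1024, 256), binning=4)
name = class_name(info, first)
print(voxel, name)
```

Keeping the stored coordinates unbinned and center-relative means the same boxes_3d entry can be applied to "tomograms/xx__bin4.hdf" and "tomograms/xx__bin2.hdf" by changing only the binning factor and tomogram size.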

Eman2InfoMetadata (last edited 2019-07-31 22:17:45 by MuyuanChen)