Using the database from Python (for programmers or advanced users)
The normal method for accessing image data on disk is using the read_image, read_images and write_image methods, for example:
# e2.py session; in a standalone script, start with: from EMAN2 import *
img=EMData()
img.read_image("test.hdf",5) # reads the 6th image from test.hdf (first image is 0)
img.write_image("test2.hdf",-1) # appends (-1) the image to the end of test2.hdf
img_list=EMData.read_images("test.hdf",range(50)) # reads the first 50 images from test.hdf into a list of EMData objects
n=EMUtil.get_image_count("test.hdf") # counts the number of images in test.hdf
When writing to a (typically) 8-bit file format such as JPEG, PNG or PGM, the floating-point values in the image must be converted to an 8-bit scale. By default this is done with an algorithm that excludes outliers (i.e., it doesn't span the full range of the image). To override this behavior, set the dictionary elements "render_min" and "render_max" on the image to be saved, and the specified range will be used instead. Here is a simple example:
a=test_image()
a["render_min"]=a["minimum"]
a["render_max"]=a["maximum"]
a.write_image("a.png")
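To make the effect of an explicit range concrete, here is a minimal pure-Python sketch of a linear mapping from floating-point values to 0..255 that clamps anything outside the chosen range. The helper name to_8bit and the rounding rule are assumptions made for illustration; the text above does not say how EMAN2 performs the conversion internally, so this is not a description of its algorithm.

```python
def to_8bit(values, render_min, render_max):
    """Linearly map floats to integers in 0..255, clamping values outside the range.

    Hypothetical helper for illustration only; not part of EMAN2.
    """
    if render_max <= render_min:
        raise ValueError("render_max must be greater than render_min")
    scale = 255.0 / (render_max - render_min)
    out = []
    for v in values:
        clamped = min(max(v, render_min), render_max)
        out.append(int(round((clamped - render_min) * scale)))
    return out


pixels = [-3.0, 0.0, 1.5, 3.0, 40.0]  # 40.0 plays the role of an outlier

# Full range (minimum..maximum): the outlier stretches the scale,
# so the remaining values are squeezed into the bottom of 0..255.
full = to_8bit(pixels, min(pixels), max(pixels))

# Narrower range: the outlier saturates at 255 and the other values
# are spread over the whole 8-bit scale.
narrow = to_8bit(pixels, -3.0, 3.0)
```

In this sketch the value 3.0 lands near the bottom of the scale when the full range is used, and at 255 when the range is restricted to -3.0..3.0, which is the trade-off that choosing "render_min" and "render_max" controls.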
File I/O can also be performed with databases through the same read_image and write_image methods, which can access integer keys only.
However, this is not the preferred mechanism for using the database interface, since many more powerful operations are available, such as:
# Run inside e2.py, which implicitly performs a 'from EMAN2db import *'
# and opens the local environment: DB=EMAN2DB.open_db()
testdb = db_open_dict("bdb:test") # opens a specific database called "test" in the local directory
testdb[0]=test_image() # stores an EMData object in the 'test' database
img=testdb[0] # reads the EMData object back from the database
# Set an attribute "mykey" on the EMData object keyed 0 in database 'test'.
# This operation is MUCH faster than doing the same thing with any flat file.
testdb.set_attr(0,"mykey",5.5)
# Retrieve an attribute of image 0 from database 'test' without loading the image data.
testdb.get_attr(0,"mykey")
# Keys in the database need not be integers, though the read_image, etc.
# methods can only access integer keys.
testdb["testimg"]=test_image()
# The 'test' database can also store arbitrary other metadata, not just images.
# This assigns a list to key 'alist'.
testdb["alist"]=[1,2,3,4,5]
# Databases are closed cleanly and automatically unless Python is forcibly
# terminated (^C is OK), but it isn't a bad idea to close them if you know
# you won't use them again.
db_close_dict("test")
Basically, each database object can be treated as a Python dictionary. Any Python object that can be pickled (almost any Python object) can be stored as a value in these dictionaries. It is even possible to mix images of different sizes within a single database.
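Since the only stated requirement on stored values is that they can be pickled, it can be handy to check a value before storing it. The following is a minimal sketch using only the standard pickle module; the helper name is_picklable and the sample values are made up for illustration, and the check says nothing about how the database itself serializes data beyond what the paragraph above states.

```python
import pickle


def is_picklable(value):
    """Return True if the standard pickle module can serialize value."""
    try:
        pickle.dumps(value)
        return True
    except Exception:
        return False


# Hypothetical values one might want to store under string keys.
candidates = {
    "alist": [1, 2, 3, 4, 5],
    "params": {"apix": 2.0, "label": "test"},
    "callback": lambda x: x,  # lambdas are among the few objects pickle rejects
}

results = {key: is_picklable(value) for key, value in candidates.items()}
```

Plain lists, dictionaries, numbers and strings pass this check, while objects such as lambdas or open file handles do not.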
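The attribute mechanism (set_attr, get_attr) is tied into the EMData object attribute dictionary. That is, reading an image from the database, setting a key in its attribute dictionary and writing the image back is functionally equivalent to calling set_attr on the database directly, but the set_attr version is MUCH faster.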
Unlike Python dictionaries, if a value in the database is an object, changing the object does not write the change back to the database unless you explicitly store it again. For example:
# With a dictionary
test={1:["a","b","c"],2:3}
test[1][1]="c"
print(test[1]) # prints ['a', 'c', 'c']
# With a database
testdb = db_open_dict("bdb:test")
testdb[1]=["a","b","c"]
testdb[2]=3
testdb[1][1]="c" # This effectively does nothing
print(testdb[1]) # prints ['a', 'b', 'c']
# To make the above actually work
d=testdb[1]
d[1]="c"
testdb[1]=d
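The same read-modify-write pattern appears in Python's standard shelve module, which also stores pickled values and, by default, does not track changes made to objects it has returned. The sketch below uses shelve purely as a stand-in that runs without EMAN2; it is not the EMAN2 database API, and the file name demo_shelf is arbitrary.

```python
import os
import shelve
import shutil
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo_shelf")

with shelve.open(path) as shelf:      # writeback is off by default
    shelf["1"] = ["a", "b", "c"]      # shelve keys must be strings

    # Each lookup unpickles a fresh copy, so this edit is lost.
    shelf["1"][1] = "c"
    unchanged = shelf["1"]

    # Read, modify, then explicitly store the value again.
    d = shelf["1"]
    d[1] = "c"
    shelf["1"] = d
    updated = shelf["1"]

shutil.rmtree(workdir)                # remove the temporary files
```

The in-place edit is silently discarded because it is applied to a temporary copy, while the explicit assignment is what actually persists the change.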
You can write/read the full header for an EMData object inexpensively with:
testdb[2]=test_image()
hdr=testdb.get_header(2) # returns the equivalent of get_attr_dict on an EMData object
# If DB is associated with the disk database, get_header requires an argument (image number).
hdr["apix_x"]=2.0
testdb.set_header(2, hdr) # hdr can be either a dictionary or an EMData object
There is a small cost associated with opening each database, so for performance it is generally a good idea to open a database once, keep it open, and close it only if you aren't expecting to use it again for some time.