
CIF2Cell: Generating geometries for electronic structure programs
DOI: 10.1016/j.cpc.2011.01.013






Torbjörn Björkman

Aalto University School of Science and Technology, P.O. Box 11100, 00076 Aalto, Finland

The CIF2Cell program generates the geometrical setup for a number of electronic structure programs based on the
crystallographic information in a Crystallographic Information Framework (CIF) file. The program will retrieve the
space group number, Wyckoff positions and crystallographic parameters, make a sensible choice for Bravais lattice
vectors (primitive or principal cell) and generate all atomic positions. Supercells can be generated and alloys are
handled gracefully. The code currently has output interfaces to the electronic structure programs ABINIT, CASTEP,
CPMD, Crystal, Elk, Exciting, EMTO, Fleur, RSPt, Siesta and VASP.

1. Program summary
Program title: CIF2Cell
Catalogue identifier: ???
Program obtainable from: http://cif2cell.sourceforge.net/
Licensing provisions: GNU General Public License version 3
Programming language used: Python (versions 2.4-2.7)
Library dependency: PyCifRW[1]
No. lines in distributed program, including test data, etc.: Approximately 12 500.
No. bytes in distributed program, including test data, etc.: Approximately 500 000
Distribution format: tar.gz
Computer: Any computer that can run Python (versions 2.4-2.7).
Operating system: Any operating system that can run Python (versions 2.4-2.7).
Nature of problem: Generate the geometrical setup of a crystallographic cell for a variety of electronic structure
programs from data contained in a CIF file.
Solution method: The CIF file is parsed using routines contained in the library PyCifRW[1], and crystallographic as
well as bibliographic information is extracted. The program then generates the principal cell from symmetry
information, crystal parameters, space group number and Wyckoff sites. Reduction to a primitive cell is then
performed, and the resulting cell is output to suitably named files along with documentation of the information
source generated from any bibliographic information contained in the CIF file. If the space group symmetries
are not present in the CIF file the program will fall back on internal tables, so only the minimal input of space
group, crystal parameters and Wyckoff positions is required. Additional key features are handling of alloys
and supercell generation.
Email address: torbjorn@cc.hut.fi (Torbjorn Bjorkman)
Preprint submitted to Elsevier

December 30, 2010

The program currently supports the following general purpose electronic structure programs: ABINIT[2, 3],
CASTEP[4], CPMD[5], Crystal[6], Elk[7], exciting[8], EMTO[9], Fleur[10], RSPt[11], Siesta[12] and VASP[13-16].
2. Introduction
There is presently no standard for specifying input data to an electronic structure program (ESP), although attempts
in that direction have been made[17]. Code development in an academic environment is also necessarily aimed at
implementing new features, and few resources are available for maintaining user interfaces. In crystallography, a
de facto standard for communicating scientific data has existed for a number of years, namely the Crystallographic
Information Framework (CIF)[18]. CIF specifies a language for communicating crystallographic data, covering both
purely geometrical information such as crystal parameters, space groups and atomic positions, and citation data such
as authors and journal references. Since the geometrical data are always required as user input to an ESP, it would
clearly be very convenient to generate these data using the language favored by the crystallographic community, and
to this end the program CIF2Cell was written. It supplies an interface between the CIF file format and a number of
electronic structure programs, with the aim of extending the list so as to benefit the materials computation
community as much as possible.
3. Technical description
3.1. Installation, running and documentation
The program comes with an automatic installer implemented using the distutils package from the Python
standard library. The program is installed by simply running the command
python setup.py install
in the distribution root, optionally giving the argument --prefix=/where/you/want/it to install in a nonstandard location.
The code is run from the command line, controlling the input and output by command line arguments to the
program. No input files are used and the program will never stop to prompt the user for additional input. The
documentation is accessed by simply typing
cif2cell --help
which will print instructions for the running of the program and descriptions of all options to screen.
3.2. Cell generation from CIF files
Here follows a brief description of the structure and parsing of a CIF file, as seen from the point of view of the
present application. A CIF file consists of one or several blocks of data. Each data block describes a crystal structure
using a set of standardized keywords, and may in addition to the minimum input needed to generate the geometrical
setup for an electronic structure code also contain information about the experiment, citation information, space group
data and much more.
CIF2Cell uses the CIF parser PyCifRW[1] to get the required information from a data block in a CIF file. The
standard CIF dictionary often provides several keyword options for storing data, and the program must therefore scan
for all of these, taking care to select the keyword that is most likely to contain the correct information. A case in
point is the element names, which are commonly stored (with some differences in format) under both the keyword
_atom_site_type_symbol and the keyword _atom_site_label, but where by the standard _atom_site_type_symbol is more
likely to be the element name. Table 1 shows the keywords used when searching for different properties, as well
as their order of precedence. The minimal required input to generate the geometry for an ESP is the space group
number, the lattice parameters, the chemical element names and the Wyckoff positions.
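The precedence rule described above (and listed in Table 1) can be sketched in a few lines of Python. This is only an illustration and is not taken from the CIF2Cell source: the data block is modelled as a plain dictionary rather than a PyCifRW object, the names KEYWORD_PRECEDENCE and first_match are hypothetical, and the sketch is written for Python 3 rather than the Python 2 versions listed in the program summary.

```python
# Illustrative sketch (Python 3); not taken from the CIF2Cell source.
# A CIF data block is modelled here as a plain dict of data names to values.

# Candidate CIF data names per property, in order of precedence (cf. Table 1).
KEYWORD_PRECEDENCE = {
    "space_group_number": ["_space_group_IT_number",
                           "_symmetry_Int_Tables_number"],
    "element_names": ["_atom_site_type_symbol",
                      "_atom_site_label"],
}


def first_match(block, prop):
    """Return (keyword, value) for the first keyword of `prop` found in `block`."""
    for keyword in KEYWORD_PRECEDENCE[prop]:
        if keyword in block:
            return keyword, block[keyword]
    raise KeyError("no CIF keyword found for property %r" % prop)


# Hypothetical data block: only the older space group keyword is present,
# while both element-name keywords are present.
block = {
    "_symmetry_Int_Tables_number": "227",
    "_atom_site_label": ["Si1"],
    "_atom_site_type_symbol": ["Si"],
}

sg_keyword, sg_value = first_match(block, "space_group_number")
el_keyword, el_value = first_match(block, "element_names")
print(sg_keyword, sg_value)
print(el_keyword, el_value)
```

In this sketch the element names come from _atom_site_type_symbol even though _atom_site_label is also present, because the former is listed first.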
Table 1: Geometric information obtained from the CIF file and the CIF keywords used to identify it. Where keywords
are joined by the word "or", they are checked in the order given in the table and the first match is used.

Information            CIF keyword
Space group number     _space_group_IT_number or
                       _symmetry_Int_Tables_number
Crystal parameters     _cell_length_a, _cell_length_b, _cell_length_c,
                       _cell_angle_alpha, _cell_angle_beta, _cell_angle_gamma
Element names          _atom_site_type_symbol or
                       _atom_site_label
Wyckoff positions      _atom_site_fract_x, _atom_site_fract_y, _atom_site_fract_z
Site occupancy         _atom_site_occupancy
Space group symbol     _space_group_name_H-M_alt or _space_group_name_Hall or
                       _symmetry_space_group_name_H-M or _symmetry_space_group_name_Hall
Symmetry operations    _symmetry_equiv_pos_as_xyz

All positions in the conventional (principal) cell are generated using the symmetry operations of the space group,
commonly stored in CIF files in the form of a list of symmetry equivalent positions. These will be used if found,
and if not the code will use an internal table for the given space group, assuming that the crystal is given
in the standard setting. The internal tables consist of Python code automatically generated from the output of the
program SgInfo[19] and make up approximately half (6000) of the code lines of the program. The cell
can then be reduced to a primitive cell by some set of lattice translations, or kept in the principal cell. The choice
of primitive lattice translations, and thus of the primitive cell, is not unique, and particularly for the lower symmetry
structures it is not obvious what constitutes the most natural choice. In CIF2Cell choices have been made in an
attempt to make the generated cell coincide with the structures listed by the United States Naval Research Laboratory
Center for Computational Materials Science[20] for the purposes of electronic structure calculations. The program is
also able to produce supercells based on either the primitive or the principal cell.
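As an illustration of how a list of symmetry equivalent positions generates all atoms in the principal cell, the following minimal sketch applies operations written in the "x,y,z" notation to a single Wyckoff position and keeps the distinct images. It is not the CIF2Cell implementation: the function names are hypothetical, only terms of the form ±x, ±y, ±z and rational shifts are handled, and exact fractions are used so that duplicates can be found by plain comparison (real CIF data contain rounded decimals and would need a tolerance instead).

```python
# Illustrative sketch (Python 3); not taken from the CIF2Cell source.
from fractions import Fraction


def apply_operation(op, position):
    """Apply an operation such as '-x+1/2,y,z' to a fractional position."""
    values = dict(zip("xyz", position))
    image = []
    for component in op.lower().replace(" ", "").split(","):
        total = Fraction(0)
        # Put a '+' in front of every '-' so the terms can be split on '+'.
        for term in component.replace("-", "+-").split("+"):
            if not term:
                continue
            sign = -1 if term.startswith("-") else 1
            body = term.lstrip("-")
            if body in values:
                total += sign * values[body]    # coordinate term
            else:
                total += sign * Fraction(body)  # translation, e.g. 1/2
        image.append(total % 1)                 # wrap back into [0, 1)
    return tuple(image)


def expand_site(operations, position):
    """Return the distinct images of `position` under all operations."""
    position = tuple(Fraction(c) for c in position)
    images = []
    for op in operations:
        image = apply_operation(op, position)
        if image not in images:
            images.append(image)
    return images


# Hypothetical operation list: identity, inversion and a body centering.
OPERATIONS = ["x,y,z", "-x,-y,-z",
              "x+1/2,y+1/2,z+1/2", "-x+1/2,-y+1/2,-z+1/2"]

special = expand_site(OPERATIONS, ("0", "0", "0"))
general = expand_site(OPERATIONS, ("1/4", "0", "0"))
print(len(special), len(general))
```

A position on a symmetry element, such as the origin here, produces fewer distinct images than a less symmetric one, which is why the duplicates have to be removed.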
The program will look at the site occupancy, which indicates the distribution of atoms in alloys. If it is not found,
all occupancies are assumed to be 1. The program internally represents an alloy as a position occupied by several
atomic species. It is possible to enforce generation of a cell also for an ESP that has no method, such as the
coherent potential approximation[9], for handling alloys in the primitive cell, but the output will then be incomplete
and the user will have to edit the generated files by hand to put in the missing species. For an ESP that has some
means of handling an alloy in the primitive cell, the code will set up the input files accordingly.
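The idea of representing an alloy as one position occupied by several species can be sketched as follows. The record layout and the function names are assumptions made for this example and do not reflect the internal data structures of CIF2Cell; the positions are idealized perovskite-type coordinates chosen for illustration, not values read from the sample file.

```python
# Illustrative sketch (Python 3); not taken from the CIF2Cell source.
from collections import OrderedDict


def group_by_position(sites):
    """Group site records by position.

    Each record is (element, (x, y, z), occupancy). An occupancy of None
    stands for a CIF file without site occupancies and defaults to 1.
    """
    grouped = OrderedDict()
    for element, position, occupancy in sites:
        if occupancy is None:
            occupancy = 1.0
        grouped.setdefault(position, {})[element] = occupancy
    return grouped


def is_alloy(species):
    """A position is an alloy site if more than one species occupies it."""
    return len(species) > 1


# Hypothetical input: La and Sr share one position with occupancies
# 0.7 and 0.3, while Mn has no occupancy given.
sites = [
    ("La", (0.0, 0.0, 0.0), 0.7),
    ("Sr", (0.0, 0.0, 0.0), 0.3),
    ("Mn", (0.5, 0.5, 0.5), None),
]

grouped = group_by_position(sites)
for position, species in grouped.items():
    print(position, species, "alloy" if is_alloy(species) else "ordered")
```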
For the convenience of the user, any bibliographic information and information about source databases of the CIF
file are also processed into documentation strings written as comments to the ESP input files.
3.3. Testing the code
To test the robustness of the code, CIF2Cell has been run on 100 000 CIF files from the Crystallography Open
Database (COD)[21]. CIF2Cell gracefully handles all errors that arise in this test and manages to generate a cell in
97.5% of the cases. Of the cases where the program failed to generate a cell, about 60% are due to insufficient data
in the CIF files and the remaining 40% come from the program failing to work out the proper symmetry operations.
Some of the symmetry failures may actually belong to the first category, but most appear to be structures given
in a non-standard setting without explicitly given symmetry operations, a case the program is presently incapable of
handling. Note that this test did not, of course, verify the correctness of each of the 97 500 generated primitive cells.
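The percentages quoted above translate into the following approximate counts. The short calculation below only restates the numbers given in the text; since the 60% and 40% shares are themselves stated as approximate, the last two counts are estimates.

```python
# Worked arithmetic for the robustness test, using the figures in the text.
total_files = 100000

# 97.5% of the files produced a cell (integer arithmetic avoids rounding).
generated = total_files * 975 // 1000
failed = total_files - generated

# Approximate split of the failures, as stated in the text.
insufficient_data = failed * 60 // 100
symmetry_failures = failed * 40 // 100

print(generated, failed, insufficient_data, symmetry_failures)
```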
The correctness of the cells generated has been verified primarily through comparisons with the above mentioned
structures listed by the United States Naval Research Laboratory Center for Computational Materials Science. These
tests have been exhaustive in that all possible crystal systems (triclinic, monoclinic, orthorhombic, tetragonal, trigonal,
hexagonal and cubic) and all different centerings (including the rhombohedral setting of trigonal systems) have been
verified to work properly.
The interfaces to the different electronic structure programs have been explicitly tested for Elk, EMTO, VASP and
RSPt. For the remaining programs, the output from CIF2Cell has been verified by generating structures that appear in
the documentation of the ESPs.

Supplied with the code are 10 sample CIF files that can be used for testing the installation. These describe the
compounds: FeAs, cubic and orthorhombic BaTiO3, Ni20Mn3P6, Si, SiC, -Mn, La0.7Sr0.3MnO3 and the - and
-phases of Pu. The examples have been taken from COD and the Inorganic Crystal Structure Database (ICSD).
4. Example of output
The current set of supported electronic structure programs is listed in the program summary above. The form of
the output depends on the input format of the ESP and ranges from a whole directory structure (e.g. EMTO) to output
to file of a number of lines suitable for copying into a main input file (e.g. the Elk FP-LAPW code). As the present
author is not an expert in all of these programs, the behavior is likely to change in response to user feedback, but some
of the simpler interfaces should be fairly stable.
A suitable example of the output is the generation of a POSCAR file for the Vienna ab-initio simulation package
(VASP)[13-16], since this produces a single file with only structural information. We consider the output for
orthorhombic BaTiO3 (the CIF file used in this example is also supplied with the distribution):
Generated by cif2cell from ICSD reference: 161341. [REFERENCE] Species order: Ba Ti O
1 1 3

Since the POSCAR input format only allows comments on the first line this usually becomes rather long, and it has
been abbreviated in the example above. The word [REFERENCE] is a string giving the name of the compound and a
full journal reference, in this case
Ba (Ti O3) : Xiao, C.J. et al., Materials Chemistry and Physics 111, 209-212 (2008).

The output tells the user that the file was automatically generated by CIF2Cell, and gives the source database
including the reference number as well as a full journal reference. Printed last is the order in which the atomic
species appear in the list of atomic positions, a piece of information that is necessary for the generation of the
remaining input to this particular ESP.
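To make the layout of such a file concrete, the sketch below writes a POSCAR in the older VASP layout in which the species names do not appear in the body of the file, so that the species order has to be carried by the comment line as described above. The function name format_poscar is hypothetical, and the geometry is an idealized cubic perovskite with a placeholder lattice constant, not the orthorhombic BaTiO3 cell of the example and not output produced by CIF2Cell.

```python
# Illustrative sketch (Python 3); not taken from the CIF2Cell source.


def format_poscar(comment, scale, lattice, species_order, positions):
    """Return POSCAR text.

    `positions` maps each species to a list of fractional coordinates;
    atoms are written species by species in the order `species_order`.
    """
    lines = [comment, "%.6f" % scale]
    for vector in lattice:
        lines.append("  ".join("%.8f" % c for c in vector))
    # Number of atoms of each species, in the same order as the positions.
    lines.append(" ".join(str(len(positions[s])) for s in species_order))
    lines.append("Direct")
    for species in species_order:
        for coords in positions[species]:
            lines.append("  ".join("%.8f" % c for c in coords))
    return "\n".join(lines) + "\n"


# Placeholder geometry: ideal cubic perovskite, lattice constant made up.
species_order = ["Ba", "Ti", "O"]
positions = {
    "Ba": [(0.0, 0.0, 0.0)],
    "Ti": [(0.5, 0.5, 0.5)],
    "O": [(0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)],
}
lattice = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
comment = "Illustrative cell, not from a CIF file. Species order: " + \
    " ".join(species_order)

poscar_text = format_poscar(comment, 4.0, lattice, species_order, positions)
print(poscar_text)
```

The line of atom counts written here corresponds to the "1 1 3" line of the example, and the comment line is the only place where the species order is recorded.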
Other interfaces behave similarly, the main difficulty being in codes that use the pseudo-potential approach, where
the pseudo-potential for the different atomic species is necessarily user supplied information that CIF2Cell has no
means of guessing. The code will in those cases produce some placeholder which contains information about what
atomic species occupies the position in question, so that the user can easily change this information.
5. Acknowledgements
The author would like to thank Dr. P. Larsson and Dr. A. Blomqvist for long and stimulating discussions regarding
user interfaces of electronic structure programs and Professor R. Nieminen for reading and commenting on
the manuscript. The ICSD has kindly granted permission to distribute the program with sample CIF files from the
database. This research has been supported by the Academy of Finland through a Centre of Excellence Grant 2006-2011.
[1] J. R. Hester, A validating CIF parser:
URL http://dx.doi.org/10.1107/S0021889806015627
[2] X. Gonze, G.-M. Rignanese, M. Verstraete, J.-M. Beuken, Y. Pouillon, R. Caracas, F. Jollet, M. Torrent, G. Zérah, M. Mikami, P. Ghosez,
M. Veithen, J.-Y. Raty, V. Olevano, F. Bruneval, L. Reining, R. Godby, G. Onida, D. R. Hamann, D. C. Allan, A brief introduction to the
ABINIT software package, Zeitschrift für Kristallographie 220 (12) (2005) 558-562.
[3] X. Gonze, B. Amadon, P.-M. Anglade, J.-M. Beuken, F. Bottin, P. Boulanger, F. Bruneval, D. Caliste, R. Caracas, M. Côté, T. Deutsch,
L. Genovese, P. Ghosez, M. Giantomassi, S. Goedecker, D. Hamann, P. Hermet, F. Jollet, G. Jomard, S. Leroux, M. Mancini, S. Mazevet,
M. Oliveira, G. Onida, Y. Pouillon, T. Rangel, G.-M. Rignanese, D. Sangalli, R. Shaltaf, M. Torrent, M. Verstraete, G. Zérah, J. Zwanziger,
ABINIT: First-principles approach to material and nanosystem properties, Computer Physics Communications 180 (12) (2009) 2582-2615,
40 YEARS OF CPC: A celebratory issue focused on quality software for high performance, grid and novel computing architectures.
URL http://www.sciencedirect.com/science/article/B6TJ5-4WTRSCM-3/2/20edf8da70cd808f10fe352c45d0c0be
[4] S. J. Clark, M. D. Segall, C. J. Pickard, P. J. Hasnip, M. J. Probert, K. Refson, M. C. Payne, First principles methods using CASTEP,
Zeitschrift für Kristallographie 220 (12) (2005) 567-570.
[5] URL http://www.cpmd.org
[6] R. Dovesi, R. Orlando, B. Civalleri, C. Roetti, V. R. Saunders, C. M. Zicovich-Wilson, Crystal: a computational tool for the ab initio study of
the electronic properties of crystals, Zeitschrift für Kristallographie 220 (2005) 571-573.
URL http://dx.doi.org/10.1524/zkri.220.5.571.65065
[7] URL http://elk.sourceforge.net
[8] URL http://exciting-code.org
[9] L. Vitos, Computational Quantum Mechanics for Materials Engineers; The EMTO Method and Applications, Springer London, 2007.
[10] URL http://www.flapw.de
[11] J. M. Wills, O. Eriksson, M. Alouani, D. L. Price, Full-potential LMTO total energy and force calculations, in: H. Dreusse (Ed.), Electronic
Structure and Physical Properties of Solids; The Uses of the LMTO Method, Springer, 1996, pp. 148-167.
[12] J. M. Soler, E. Artacho, J. D. Gale, A. García, J. Junquera, P. Ordejón, D. Sánchez-Portal, The SIESTA method for ab initio order-N materials
simulation, Journal of Physics: Condensed Matter 14 (11) (2002) 2745.
URL http://stacks.iop.org/0953-8984/14/i=11/a=302
[13] G. Kresse, J. Hafner, Ab initio molecular dynamics for liquid metals, Phys. Rev. B 47 (1) (1993) 558-561. doi:10.1103/PhysRevB.47.558.
[14] G. Kresse, J. Hafner, Ab initio molecular-dynamics simulation of the liquid-metal-amorphous-semiconductor transition in germanium, Phys.
Rev. B 49 (20) (1994) 14251-14269. doi:10.1103/PhysRevB.49.14251.
[15] G. Kresse, J. Furthmüller, Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set,
Computational Materials Science 6 (1) (1996) 15-50. doi:10.1016/0927-0256(96)00008-0.
URL http://www.sciencedirect.com/science/article/B6TWM-3VRVTBF-3/2/88689b1eacfe2b5fe57f09d37eff3b74
[16] G. Kresse, J. Furthmüller, Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set, Phys. Rev. B 54 (16)
(1996) 11169-11186. doi:10.1103/PhysRevB.54.11169.
[17] D. Caliste, Y. Pouillon, M. Verstraete, V. Olevano, X. Gonze, Sharing electronic structure and crystallographic data with ETSF_IO, Computer
Physics Communications 179 (10) (2008) 748-758. doi:10.1016/j.cpc.2008.05.007.
[18] S. R. Hall, F. H. Allen, I. D. Brown, The crystallographic information file (CIF): a new standard archive file for crystallography, Acta
Crystallographica A 47 (1991) 655-685.
[19] URL http://cci.lbl.gov/sginfo/
[20] URL http://cst-www.nrl.navy.mil/lattice/
[21] S. Gražulis, D. Chateigner, R. T. Downs, A. F. T. Yokochi, M. Quirós, L. Lutterotti, E. Manakova, J. Butkus, P. Moeck, A. Le Bail,
Crystallography Open Database - an open-access collection of crystal structures, Journal of Applied Crystallography 42 (4) (2009) 726-729.
URL http://dx.doi.org/10.1107/S0021889809016690