
Ind. Eng. Chem. Res.

2004, 43, 6253-6261 6253

A Group-Contribution Method for Predicting Pure Component Properties of Biochemical and Safety Interest
Emmanuel Stefanis, Leonidas Constantinou, and Costas Panayiotou*,
Department of Chemical Engineering, Aristotle University of Thessaloniki, GR 54124, Thessaloniki, Greece
and Frederick Research Center, P.O. Box 24729, Nicosia, Cyprus

A simple, yet quite accurate method for predicting properties of organic compounds of
environmental and nutraceutical interest is presented. It is an extension of a previous successful
group-contribution method (Constantinou, L.; Gani, R. AIChE J. 1994, 40, 1697) and uses two
kinds of groups: first-order groups that describe the basic molecular structure of the compounds
and second-order groups, which are based on conjugation theory and improve the accuracy of
the predictions. Twenty-six new first-order groups have been defined to ensure that the molecular
structures of any compound of biochemical interest, including complex aromatic, multiring, and
heterocyclic compounds, can be easily described. Furthermore, 12 new second-order groups have
been defined to enhance the reliability of the predictions and the applicability of the method.
The three properties that have been estimated by the new method are the octanol-water partition
coefficient (logKow), the total (Hildebrand) solubility parameters at 25 °C, and the flash point.
These properties have many applications in the chemical, pharmaceutical, and food industries,
as well as in the protection of the environment.

Introduction

Computer-aided molecular design is a very important process and tool that is used for the prediction of properties of organic compounds when reliable experimental data are not available, for checking questionable values of already measured properties, and especially for the selection of compounds with desired properties.

In the past several decades, many group-contribution methods have been widely used for the prediction of physicochemical properties of pure organic compounds. One of the first widely used group-contribution methods was the UNIFAC method,1 where the value of each property was obtained as the sum of contributions of simple first-order groups. The methods of Joback and Reid2 and of Horvath3 are also methods of this kind. More recently, a new class of group-contribution methods has been proposed. In this kind of method,4,5 second-order groups are defined to provide more structural information, to distinguish isomers, and to afford more accurate predictions. Second-order groups have a strong physicochemical meaning and can significantly improve the accuracy of property predictions. The definition of second-order groups is based on the theory of conjugation operators.5,6 Marrero and Gani7 introduced a higher level of approximation by defining third-order groups to provide more structural information about systems of fused aromatic and nonaromatic rings.

The Existing Constantinou-Gani Model

One of the most accurate group-contribution methods is the model proposed by Constantinou and Gani.4 According to this model, the molecular structure of each organic compound can be described by using two kinds of functional groups: first-order groups (UNIFAC groups) and second-order groups that are based on conjugation theory. The second-order groups give a physical meaning to the method, and this is an advantage compared to the other group-contribution methods. These groups improve the accuracy of the predictions significantly.

The definition of second-order groups is based on the ABC framework.5 According to this framework, each compound is represented as a hybrid of many conjugate forms. Every conjugate form is considered as a structure with integer-order localized bonds and integer charges on atoms. The purely covalent conjugate form is the dominant conjugate, and the ionic forms are the recessive conjugates, which are generated by using a conjugate operator. When a conjugate operator is applied to a dominant conjugate, it can generate a series of recessive conjugates. Conjugate operators consist of subchains with two or three bonds, such as O=C-C or C-C-C-H. In this theory, the properties of each compound are estimated by combining the corresponding properties of its conjugate forms. The properties of the conjugate forms are estimated through conjugation operators. Each operator has a fixed contribution, which is determined through regression and reflects the contribution of a whole series of conjugate forms.

The basic property in the ABC framework is the standard enthalpy of formation at 298 K. For the estimation of this property, the contributions of the conjugate forms can be expressed in terms of their physical significance rather than adjustable parameters. Therefore, the most important conjugate forms, i.e., the forms that exert the strongest influence on the standard enthalpy of formation (and also on the other properties to be estimated), can be distinguished. The conjugation operators that are related to the most important conjugate forms make the largest contributions.6 It is possible to identify the classes of conjugate forms with the highest conjugation activity by examining the contributions of their operators. The identification of

* To whom correspondence should be addressed. Fax: +302310996222. E-mail: cpanayio@auth.gr.
Aristotle University of Thessaloniki.
Frederick Research Center.
10.1021/ie0497184 CCC: $27.50 2004 American Chemical Society
Published on Web 07/29/2004
6254 Ind. Eng. Chem. Res., Vol. 43, No. 19, 2004

Table 1. First-Order Group Contributions for logKow, Total Solubility Parameter (25 °C), and Flash Point
Columns: group; logKow contribution; total solubility parameter contribution; flash point contribution; sample group assignment (occurrences)
-CH3 0.6998 -2308.6 0.6 propane (2)
-CH2 0.4707 -277.1 10.8 butane (2)
-CH< 0.0405 -355.5 12.2 isobutane (1)
>C< -0.4723 -176.2 12.4 neopentane (1)
CH2=CH- 0.9737 -2766.2 10.3 propylene (1)
-CH=CH- 0.6749 -381.9 19.1 cis-2-butene (1)
CH2=C< 0.8361 -980.2 19.1 isobutene (1)
-CH=C< 0.1234 1887.1 37.9 2-methyl-2-butene (1)
>C=C< 2.6256 1601.8 - 2,3-dimethyl-2-butene (1)
CH2=C=CH- - -3745.0 - 1,2-butadiene (1)
CH≡C- 0.2159 -975.5 21.8 propyne (1)
C≡C 0.1597 2169.3 24.5 2-butyne (1)
ACH 0.3633 -6.4 11.5 benzene (6)
AC 0.2497 684.3 20.5 naphthalene (2)
ACCH3 0.7748 -221.8 23.8 toluene (1)
ACCH2- 0.4036 1023.4 31.5 m-ethyltoluene (1)
ACCH< 0.1910 605.5 36.4 sec-butylbenzene (1)
CH3CO -0.5433 3269.1 47.4 methyl ethyl ketone (1)
CH2CO -0.9379 7274.2 69.7 cyclopentanone (1)
CHO -0.3524 5398.2 35.5 1-butanal (1)
COOH -0.5994 9477.8 97.3 vinyl acid (1)
CH3COO -0.3164 1865.1 53.0 ethyl acetate (1)
CH2COO -0.7454 5194.2 61.9 methyl propionate (1)
HCOO -0.9078 1716.0 38.4 n-propyl formate (1)
COO -0.8772 3671.8 54.1 ethyl acrylate (1)
OH -1.0577 12228.9 60.3 2-propanol (1)
ACOH -0.2851 8456.1 68.5 phenol (1)
CH3O -0.3088 -480.8 22.4 methyl ethyl ether (1)
CH2O -0.8032 -206.7 29.8 ethyl vinyl ether (1)
CH-O -0.5994 1229.1 28.8 diisopropyl ether (1)
C2H5O2 -1.5767 - 91.9 2-methoxyethanol (1)
CH2O (cyclic) -0.4673 3733.9 28.1 1,4-dioxane (2)
CH2NH2 -0.9178 3650.7 39.4 1-amino-2-propanol (1)
CHNH2 -2.0541 560.4 33.6 isopropylamine (1)
CH3NH -0.7114 8616.2 - n-methylaniline (1)
CH2NH -1.1628 4183.8 44.4 di-n-propylamine (1)
CHNH -1.5361 3381.8 41.3 diisopropylamine (1)
CH3N -0.7715 2166.5 51.4 trimethylamine (1)
CH2N -1.1007 -2662.6 34.1 triethylamine (1)
ACNH2 -0.7834 9228.4 87.4 aniline (1)
CONH2 - 14930.1 - 2-methacrylamide (1)
CONHCH3 -1.8463 27386.9 - n-methylacetamide (1)
CONHCH2 - - 150.1 n-butylacetamide (1)
CON(CH3)2 -1.5663 12770.8 126.3 n,n-dimethylacetamide (1)
CON(CH2)2 -1.8559 - - n,n-diethylacetamide (1)
C5H4N 0.3803 4686.3 104.3 2-methylpyridine (1)
C5H3N - 6574.7 - 2,6-dimethylpyridine (1)
CH2SH 0.5426 2191.2 46.1 n-butyl mercaptan (1)
CH3S 0.2730 1271.1 - methyl ethyl sulfide (1)
CH2S -0.0200 3585.2 63.8 diethyl sulfide (1)
CHS -0.0961 - - diisopropyl sulfide (1)
I 1.0874 3183.8 - isopropyl iodide (1)
BR 0.7195 2163.8 44.2 2-bromopropane (1)
CH2Cl 0.7601 1923.3 37.6 n-butyl chloride (1)
CHCl 0.4039 426.3 37.8 isopropyl chloride (1)
CCl -0.3759 -1415.6 - tert-butyl chloride (1)
CHCl2 - 1164.0 - 1,1-dichloropropane (1)
CCl3 - -1208.7 105.8 benzotrichloride (1)
ACCl 0.9738 1332.2 40.7 m-dichlorobenzene (2)
ACF 0.4075 -701.5 -15.7 fluorobenzene (1)
Cl-(C=C) -0.0890 -473.5 7.6 2,3-dichloropropene (1)
CF3 0.8115 -5199.5 -6.2 perfluorohexane (2)
CH2NO2 -0.3970 10030.7 81.5 1-nitropropane (1)
CHNO2 -0.6961 12706.7 - 2-nitropropane (1)
ACNO2 0.0451 6303.5 96.5 nitrobenzene (1)
CH2CN -0.6326 9359.8 69.7 n-butyronitrile (1)
CF2 - -3464.4 - perfluoromethylcyclohexane (5)
C4H3S 1.5387 4722.7 - 2-methylthiophene (1)
F (except as above) - -2965.3 - 2-fluoropropane (1)
CH2=C=C< - -2326.1 - 3-methyl-1,2-butadiene (1)
CH=C=CH- - -795.6 - 2,3-pentadiene (1)
CHCO -1.0761 7805.8 - diisopropyl ketone (1)
O (except as above) -0.0431 2467.6 6.7 divinyl ether (1)
Cl (except as above) 0.2919 636.3 47.6 hexachlorocyclopentadiene (2)
NH2 (except as above) 0.0088 -841.5 - melamine (3)
>C=N- -0.5341 3380.7 - 2,4,6-trimethylpyridine (1)
-CH=N- -0.5597 5026.4 38.8 isoquinoline (1)
NH (except as above) -0.5432 3459.4 - dibenzopyrrole (1)
N=N- -0.0403 -7339.6 - p-aminoazobenzene (1)
CN (except as above) -0.8772 10253.0 60.1 cis-crotonitrile (1)
NO2 (except as above) - 1655.1 - nitroglycerine (3)
O=C=N- - 2694.6 42.3 n-butyl isocyanate (1)
CHSH - 1234.8 45.9 cyclohexyl mercaptan (1)
CSH - 2230.2 - tert-butyl mercaptan (1)
SH (except as above) 0.5157 - - 2-mercaptobenzothiazole (1)
S (except as above) 0.3793 4770.2 17.8 thiophene (1)
SO2 -2.7494 14215.0 190.7 sulfolene (1)
>C=S - 26271.8 - n-methylthiopyrrolidone (1)
O=P< -3.4646 - - bis-2-chloroethyl-2-chloroethyl phosphonate (1)
>P- -0.6046 -1643.4 2.7 triphenylphosphine (1)
>C=O (except as above) -0.5028 - - anthraquinone (2)
NHCO -1.0134 - - phenylurea (1)
-N=O -0.1526 - - nitrosobenzene (1)
N (except as above) -0.5546 - - triphenylamine (1)
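
As a rough illustration of how the Table 1 contributions are used, the sketch below sums first-order logKow contributions for 1-octanol (one CH3, seven CH2, one OH) and adds the constant 0.097 that the paper gives in its logKow equation (eq 3). The dictionary holds only a few Table 1 entries, and the names `LOGKOW_FIRST_ORDER` and `first_order_logkow` are illustrative assumptions, not part of the published method.

```python
# A few first-order logKow contributions copied from Table 1.
LOGKOW_FIRST_ORDER = {
    "CH3": 0.6998,
    "CH2": 0.4707,
    "OH": -1.0577,
    "C5H4N": 0.3803,
}

# Universal constant of the logKow equation (eq 3 of the paper).
LOGKOW_CONSTANT = 0.097


def first_order_logkow(group_counts):
    """Sum n_i * F_i over the first-order groups and add the constant."""
    total = sum(n * LOGKOW_FIRST_ORDER[g] for g, n in group_counts.items())
    return total + LOGKOW_CONSTANT


# 1-Octanol: one CH3, seven CH2, one OH (group assignment from the Appendix).
octanol = {"CH3": 1, "CH2": 7, "OH": 1}
print(round(first_order_logkow(octanol), 3))
```

The result can be compared with the first-order value reported for 1-octanol in the Appendix (3.034).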

second-order groups is based on the operators with much higher contributions than the others. The structure of a second-order group should incorporate a subchain with at least one important conjugate operator. Because of a possible structural similarity of the operators, a second-order group can contain more than one conjugate operator. For example, the second-order group CH3COCH2 contains the following operators: O=C-C, O=C-C-H, and C-C-C-H. The structure of a second-order group should be built with first-order groups and should be as small as possible.

The methodology that is followed for the identification of second-order groups is as follows:8 (a) identification of all first-order groups present in the syntactic type of a given compound; (b) definition of all possible substructures of two or three adjacent first-order groups; (c) identification of all two-bond and three-bond conjugation operators in the substructures; (d) estimation of the conjugation operator energy of all substructures by addition of the energies of all of the conjugation operators; and (e) identification of substructures with much higher conjugation energies than the others. These substructures are the second-order groups.

The basic equation that gives the value of each property according to the molecular structure is

$$f(p) = \sum_i n_i F_i + \sum_j m_j S_j \qquad (1)$$

where Fi is the contribution of the first-order group of type i that appears ni times in the compound and Sj is the contribution of the second-order group of type j that appears mj times in the compound. f(p) is a single equation for the property under consideration, p, and is selected after a thorough study of the physicochemical and thermodynamic behavior of the property. The determination of the contributions is made by a two-step regression analysis.

In the first step of the regression, the aim is to determine the contributions of the first-order groups only (that is, the Fi). In the second step, using the Fi contributions, the second-order groups are activated, and the second-order group contributions (Sj) are calculated through regression. These contributions act as a correction to the first-order approximation.

Ten properties have been estimated so far by the Constantinou-Gani method. These are the critical temperature, Tc; critical pressure, Pc; critical volume, Vc; melting point, Tm; normal boiling point, Tb; standard Gibbs energy at 298 K, ΔGf; standard enthalpy of vaporization at 298 K, ΔHvap; standard enthalpy of formation at 298 K, ΔHf;4 acentric factor, ω; and liquid molar volume at 298 K, Vl.9

Recently,10 the method has been extended to the prediction of polymer properties, such as the glass transition temperature, Tg, through the estimation of three scaling constants, namely, T*, P*, and ρ*, of the lattice fluid (LF) model of Sanchez and Lacombe.11 The first- and second-order group contributions of each scaling constant were estimated through regression.

Proposed Model

The model of Constantinou-Gani, even though it is one of the most accurate group-contribution methods,12 cannot be applied to all homologous series of organic compounds because the 78 first-order groups (UNIFAC groups) that it uses cannot describe the molecular structure of complex compounds. Compounds of complex multiring, heterocyclic, and aromatic structures are of significant importance in the chemical, biochemical, pharmaceutical, and food industries, as well as for environmental protection.

The first target of the present work was to introduce a new, accurate model that can predict the properties of both simple- and complex-structured compounds. This was achieved by adding 26 new first-order groups to the 78 already existing UNIFAC groups. The new, simple groups ensure that any molecular structure can be described at an initial, basic level. Thus, the model is able to provide a basic description of each of the compounds that occur in the DIPPR database of thermophysical properties. Twelve new second-order groups were also added to the existing second-order groups of the Constantinou-Gani method to provide more details about the molecular structure of the recently introduced complex compounds, such as fused aromatic and multiring structures. These groups lead to more accurate results, and in some cases, they allow isomers to be distinguished.

The second target was to apply the new model to the prediction of important properties of chemical, biochemical, and safety interest. These are (a) octanol-water partition coefficients (logKow), (b) total (Hildebrand) solubility parameters at 25 °C, and (c) flash points.
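
The two-step regression described for eq 1 can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the data set is invented, the property is taken to be linear in the contributions with a fitted constant, and ordinary linear least squares (solved through the normal equations) stands in for whatever fitting procedure the authors actually used. The helper names `least_squares` and `sum_squared_error` are hypothetical.

```python
def least_squares(rows, targets):
    """Solve min ||A x - y||^2 via the normal equations and Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for k in range(n):
            if k != col:
                factor = aug[k][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][n] for i in range(n)]


def sum_squared_error(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Hypothetical data: counts of two first-order groups (n_A, n_B), counts of one
# second-order group (m), and a made-up property value for six compounds.
first_counts = [[2, 0], [2, 1], [2, 2], [1, 1], [3, 1], [1, 3]]
second_counts = [[0], [0], [1], [0], [1], [1]]
observed = [1.5, 2.0, 2.7, 1.3, 2.9, 2.5]

# Step 1: fit the constant and the first-order contributions F_i only.
design = [[1.0] + [float(n) for n in row] for row in first_counts]
constant, *F = least_squares(design, observed)
first_order = [constant + sum(n * f for n, f in zip(row, F)) for row in first_counts]

# Step 2: keep F_i fixed and fit the second-order contributions S_j to the residuals.
residuals = [o - p for o, p in zip(observed, first_order)]
S = least_squares([[float(m) for m in row] for row in second_counts], residuals)
second_order = [
    p + sum(m * s for m, s in zip(row, S))
    for p, row in zip(first_order, second_counts)
]

sse_first = sum_squared_error(first_order, observed)
sse_second = sum_squared_error(second_order, observed)
print(f"SSE after step 1: {sse_first:.4f}; after step 2: {sse_second:.4f}")
```

Because the second step only adds a correction on top of fixed first-order contributions, its sum of squared errors cannot exceed that of the first step on the fitted data, which mirrors the role of the Sj as a correction to the first-order approximation.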

The octanol-water partition coefficient (logKow) is defined as the common logarithm (base 10) of the ratio of a compound's concentration in n-octanol to its concentration in water in a two-phase system in equilibrium. The logKow of n-octanol is equal to 3.00. This means that the concentration of n-octanol that is diluted in the n-octanol of the two-phase system is considered to be 1000 times greater than that in water (log 1000 = 3).

The total (Hildebrand) solubility parameter is defined as the square root of the cohesive energy density, where the cohesive energy density is the ratio of the cohesive energy (Ecoh) to the molar volume (V) of a compound. Cohesive energy is equal to ΔHvap - RT, where ΔHvap is the standard enthalpy of vaporization, R is the universal gas constant, and T is the temperature. Thus

$$\delta = \sqrt{\frac{E_{coh}}{V}} = \sqrt{\frac{\Delta H_{vap} - RT}{V}} \qquad (2)$$

The flash point is the minimum temperature at which the vapor pressure of a liquid is sufficient to form an ignitable mixture with air near the surface of the liquid.

The sources of reliable experimental data were the handbook Exploring QSAR: Hydrophobic, Electronic and Steric Constants by Hansch, Leo, and Hoekman for the octanol-water partition coefficients (logKow), the DIPPR database of thermophysical properties14 for the total (Hildebrand) solubility parameters at 25 °C, and Fire Protection Guide to Hazardous Materials (National Fire Protection Association)15 for the flash points of organic compounds.

A least-squares analysis was carried out to estimate the first-order and second-order contributions for all properties. The modified Levenberg approach was used to minimize the total sum of squared errors between the experimental and predicted values of the properties. This was the criterion for the selection of the most appropriate equation to fit the experimental data. The model is applicable to organic compounds with three or more carbon atoms excluding the atom of the characteristic group (e.g., -COOH or -CHO).

In Table 1, the first-order group contributions for the three previously mentioned properties are presented. Table 2 lists the second-order group contributions for the same properties. (Dashes indicate that the contributions of the group in the specific property are not available.)

The equations selected for the estimation of each property are as follows:

Octanol-water partition coefficient (logKow)

$$\log K_{ow} = \sum_i n_i F_i + \sum_j m_j S_j + 0.097 \qquad (3)$$

Total solubility parameter (25 °C) [(kJ/m3)^(1/2)]

$$\text{solubility parameter} = \Bigl(\sum_i n_i F_i + \sum_j m_j S_j + 75954.1\Bigr)^{0.383837} - 56.14 \qquad (4)$$

Flash point (K)

$$\text{flash point} = \sum_i n_i F_i + \sum_j m_j S_j + 216 \qquad (5)$$

The quantity Σ mjSj is considered to be zero for compounds that do not have second-order groups.

Table 3 illustrates the overall improvement of the estimations of the three properties that was achieved after the introduction of second-order groups in the regression. The following three parameters are used to measure the accuracy of the estimations:

$$\text{standard deviation (SD)} = \sqrt{\frac{\sum (X_{est} - X_{exp})^2}{N}}$$

$$\text{average absolute error (AAE)} = \frac{1}{N} \sum |X_{est} - X_{exp}|$$

$$\text{average absolute percent error (AAPE)} = \frac{1}{N} \sum \frac{|X_{est} - X_{exp}|}{X_{exp}} \times 100\%$$

where N is the number of data points, Xest is the estimated value of the property, and Xexp is the experimental value.

Scatter plots of estimated vs experimental values for the three properties are presented in Figures 1-3.

In Table 4, the statistical logKow values of the proposed method are compared with the values of similar existing methods of logKow estimation, according to the standard deviations, average absolute errors, and the correlation coefficient r2

$$r^2 = \frac{\sum (X_{est} - \bar{X}_{exp})^2}{\sum (X_{exp} - \bar{X}_{exp})^2}$$

In Table 5, the standard deviation and the average absolute error of the method of Marrero and Gani18 for logKow estimation and that of the present new method are compared at each level of approximation.

Discussion

The proposed model features many advantages over existing similar models. First, it is able to describe the basic molecular structure of organic compounds, both simple and complex, with a relatively small set of first-order groups. In addition, the set of second-order groups, which can provide more accurate results, is probably the only set of functional groups in the literature that has a sound physical meaning because it is based on conjugation theory. The application of the model is simple, and the accuracy is exceptional compared to other existing methods.

Concerning the octanol-water partition coefficients (logKow), this model seems to be more accurate than other existing models by Meylan and Howard16 (KowWin software17 is based on this theory) and Marrero and Gani.18 In Tables 4 and 5, one can see that the proposed method has smaller standard deviations, smaller average percent errors, and higher correlation coefficients (r2) between the experimental and estimated values.

The other models for estimating logKow16-18 use larger sets of experimental data but also hundreds of groups to describe the molecular structures. These groups have little physical meaning, whereas the second-order groups of the proposed new method are based on conjugation theory. On the other hand, even though the proposed method employs a significantly smaller number of first- and second-order groups, it still can describe most of the existing molecular structures of organic

Table 2. Second-Order Group Contributions for logKow, Total Solubility Parameter, and Flash Point
Columns: group; logKow contribution; total solubility parameter contribution; flash point contribution; sample group assignment (occurrences)
(CH3)2-CH- 0.0341 142.1 2.7 isobutane (1)
(CH3)3-C- 0.3415 592.3 0.6 neopentane (1)
-CH(CH3)-CH(CH3)- 0.4434 1581.2 -6.8 2,3-dimethylbutane (1)
-CH(CH3)-C(CH3)2- - 2678.4 - 2,2,3-trimethylbutane (1)
-C(CH3)2-C(CH3)2- - 5677.6 - 2,2,3,3-tetramethylpentane (1)
ring of 5 carbons 0.3986 -2637.7 -24.6 cyclopentane (1)
ring of 6 carbons 0.3436 -524.2 -9.6 cyclohexane (1)
-C=C-C=C- -0.1577 -426.8 -22.8 1,3-butadiene (1)
CH3-C= 0.0020 11.9 -4.3 isobutene (2)
-CH2-C= 0.1639 -762.7 10.2 1-butene (1)
>C{H or C}-C= - -1257.2 20.2 3-methyl-1-butene (1)
string in cyclic - 626.1 23.9 ethylcyclohexane (1)
>CHCHO - -1634.4 -3.7 2-methylpropanal (1)
CH3(CO)CH2- -0.2217 142.0 1.1 methyl ethyl ketone (1)
C(cyclic)=O -0.1947 -3745.0 -7.6 cyclopentanone (1)
ACCOOH 0.4878 -3076.5 9.1 benzoic acid (1)
>C{H or C}-COOH -0.2086 511.1 2.2 isobutyric acid (1)
CH3(CO)OC{H or C}< 0.0749 134.4 3.5 isopropyl acetate (1)
(CO)C{H2}COO 0.0339 1060.5 8.4 ethyl acetoacetate (1)
(CO)O(CO) - -2875.9 -17.0 acetic anhydride (1)
ACHO 0.0229 3315.0 6.4 benzaldehyde (1)
>CHOH -0.0238 -359.5 -1.4 2-propanol (1)
>C<OH -0.2811 -23.4 -2.8 tert-butanol (1)
-C(OH)C(OH)- 0.0293 5020.6 5.9 1,2-propanediol (1)
-C(OH)C(N) 0.1716 3306.4 21.4 1-amino-2-propanol (1)
C(in cyclic)-OH -0.2029 4022.7 -1.7 cyclohexanol (1)
C-O-C=C 0.0732 -228.5 -14.4 ethyl vinyl ether (1)
AC-O-C 0.4087 2493.0 2.8 methyl phenyl ether (1)
>N{H or C}(in cyclic) 0.2166 -492.7 -0.5 cyclopentimine (1)
-S-(in cyclic) - 2389.4 - tetrahydrothiophene (1)
ACBr 0.1263 337.4 -3.6 bromobenzene (1)
ACI 0.0000 1267.1 - iodobenzene (1)
CH3(CO)CH< -0.2375 -437.1 - methyl isopropyl ketone (1)
ring of 3 carbons 0.2113 -9764.5 - cyclopropane (1)
ring of 4 carbons - -3673.4 - cyclobutane (1)
ring of 7 carbons 0.6085 -1486.4 - cycloheptane (1)
ACCOO -0.0047 -83.5 -9.4 methyl benzoate (1)
AC(ACHm)2AC(ACHn)2 -0.0814 -69.8 3.2 naphthalene (1)
Ocyclic-Ccyclic=O -0.9326 9215.6 64.8 diketene (1)
AC-O-AC 0.0930 -4646.5 6.1 diphenyl ether (1)
CcycHm=Ncyc-CcycHn=CcycHp -0.0582 2348.2 0.0 2,6-dimethylpyridine (1)
NHm-CHn-COOH -1.0851 -7854.3 - ethylenediaminetetraacetic acid (4)
CHn-O-OH - 2002.5 - ethylbenzene hydroperoxide (1)
CHm-O-O-CHn - -2029.1 - di-tert-butyl peroxide (1)
NcycHm-Ccyc=O 0.2315 11489.1 49.3 2-pyrrolidone (1)
Ocyc-CcycHm=Ncyc -0.0487 -8721.6 - oxazole (1)
-O-CHm-O-CHn- 0.2923 -620.3 -2.3 methylal (1)
AC-NH-AC 0.2617 2.8 - dibenzopyrrole (1)
C(=O)-C-C(=O) 0.9193 -3668.9 -14.7 2,4-pentanedione (1)
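
To show how a second-order group from Table 2 corrects a first-order estimate, the sketch below evaluates the flash point of naphthalene using the paper's flash point equation (sum of contributions plus 216 K). The group counts (eight ACH, two AC, one fused-ring second-order group) follow the Appendix example; the dictionary and function names are illustrative assumptions.

```python
# Flash point contributions (K) copied from Table 1 (first order)
# and Table 2 (second order).
FLASH_FIRST_ORDER = {"ACH": 11.5, "AC": 20.5}
FLASH_SECOND_ORDER = {"AC(ACHm)2AC(ACHn)2": 3.2}

# Universal constant of the flash point equation (eq 5 of the paper), in K.
FLASH_CONSTANT = 216.0


def flash_point(first_counts, second_counts=None):
    """Sum first-order terms, add optional second-order terms and the constant."""
    total = sum(n * FLASH_FIRST_ORDER[g] for g, n in first_counts.items())
    for group, m in (second_counts or {}).items():
        total += m * FLASH_SECOND_ORDER[group]
    return total + FLASH_CONSTANT


naphthalene_first = {"ACH": 8, "AC": 2}
naphthalene_second = {"AC(ACHm)2AC(ACHn)2": 1}

first_order_value = flash_point(naphthalene_first)
second_order_value = flash_point(naphthalene_first, naphthalene_second)
print(round(first_order_value, 1), round(second_order_value, 1))
```

Passing no second-order groups reproduces the first-order approximation, which matches the paper's convention that the second-order sum is zero for compounds without such groups.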

Table 3. Comparison of the First- and Second-Order Approximations of the Proposed Method
Columns: property; data points; SD (first order, second order); AAE (first order, second order); AAPE, % (first order, second order)
logKow 422 0.315 0.267 0.226 0.188 - -
total solubility parameter 1017 1.468 1.308 0.996 0.901 5.15 4.67
flash point 418 16.530 14.733 11.898 10.689 3.66 3.27

compounds. Introduction of a third-order approximation level18 makes the method rather inconvenient for scientists to use and also complex for the needs of computer-aided design. In addition, this third-order approximation does not seem to exceed the accuracy of the second-order approximation of the proposed method (see Tables 4 and 5).

The key compound that can provide an indication of the accuracy of each logKow method is n-octanol. Theoretically, the logKow value of n-octanol is 3.00. This is also the experimental value for this compound. As can be seen in the Appendix, the estimated value of the present new method is 3.03 (absolute error = 1%). The estimated value for n-octanol obtained by the Marrero and Gani method18 is 2.85 (absolute error = 4.8%), and the logKow value that the KowWin software17 estimates is 2.81 (absolute error = 6.3%).

The group-contribution methods by van Krevelen and Hoftyzer20 and Fedors21 have been used widely for decades for predicting the cohesive energy, liquid molar volume, and solubility parameters of polymers. Fedors' contributions can also give satisfactory results for the total solubility parameters of pure compounds. In fact, for relatively simple molecules, Fedors' method is comparable to and sometimes better than ours. However, for complex molecules, as are most of the molecules of biochemical interest, Fedors' method appears rather inappropriate. We demonstrate this fact by means of examples in the Appendix. The proposed model has an error of -0.5% for salicylic acid, whereas the Fedors method gives a result for the total solubility parameter

Table 4. Comparison of the Accuracy of Existing Methods for logKow Estimation
Columns: measure; KowWin software version 1.66 (ref 17); Marrero, Gani (ref 18); new method
standard deviation (SD) 0.44 0.34 0.27
absolute average error (AAE) 0.32 0.24 0.19
correlation coefficient (r2) 0.95 0.97 0.99

Table 5. Comparison of the Marrero and Gani18 Method and the New Method for the logKow Estimation
Columns: method; first-order approximation; second-order approximation; third-order approximation
standard deviation (SD)
Marrero, Gani 0.42 0.38 0.34
new method 0.31 0.27 -
absolute average error (AAE)
Marrero, Gani 0.35 0.27 0.24
new method 0.23 0.19 -
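
The accuracy measures reported in Tables 3-5 (SD, AAE, and AAPE, as defined in the text) are simple to compute. The sketch below is a minimal illustration; the function names and the three-point data set are hypothetical and are not taken from the paper's database.

```python
import math


def standard_deviation(estimated, experimental):
    """SD = sqrt( sum (X_est - X_exp)^2 / N )."""
    squared = sum((e - x) ** 2 for e, x in zip(estimated, experimental))
    return math.sqrt(squared / len(experimental))


def average_absolute_error(estimated, experimental):
    """AAE = (1/N) * sum |X_est - X_exp|."""
    return sum(abs(e - x) for e, x in zip(estimated, experimental)) / len(experimental)


def average_absolute_percent_error(estimated, experimental):
    """AAPE = (1/N) * sum (|X_est - X_exp| / X_exp) * 100%."""
    relative = sum(abs(e - x) / x for e, x in zip(estimated, experimental))
    return 100.0 * relative / len(experimental)


# Made-up values, used only to exercise the three functions.
estimated = [1.0, 2.0, 4.0]
experimental = [1.0, 2.5, 5.0]

sd = standard_deviation(estimated, experimental)
aae = average_absolute_error(estimated, experimental)
aape = average_absolute_percent_error(estimated, experimental)
print(f"SD = {sd:.3f}, AAE = {aae:.3f}, AAPE = {aape:.2f}%")
```

Note that AAPE divides by the experimental value itself, so it is only meaningful for properties whose experimental values are nonzero and positive, which may be why Table 3 reports no AAPE for logKow.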

Figure 1. Scatter plot of estimated vs experimental logKow values.

Figure 2. Scatter plot of estimated vs experimental total solubility parameters.

Figure 3. Scatter plot of estimated vs experimental flash points.

of this compound with an error of 29.5%. In the example of 2-cyclohexyl cyclohexanone, the errors were 0.1% and 8.8% for our method and for Fedors' method, respectively. In general, the differences in accuracy are probably due to the extensive use of second-order groups that can provide a more detailed description of molecular structures in multiring, heterocyclic, or aromatic compounds of biochemical interest, which was one of the most important scopes of this new method.

In the literature, there are no relevant group-contribution methods for estimating the flash point exclusively from the molecular structure and without any other data, so there cannot be any direct comparisons with our method. In the flash point estimation, the mean error is only 3.27% (Table 3), and the average absolute error is about 10 K. In some cases, an accuracy of 10 K would not be appropriate for safety purposes, but it could act as a warning when the actual temperature reaches the estimated flash point minus 10 K. This warning is very important for the safety of the personnel in both laboratories and industries, especially in cases of presence of substances without any experimental data available or in the production of new compounds with completely unknown flammability behavior.

Conclusion

With the group contributions and model equations that were presented in this work, crucial environmental and safety properties of organic compounds can be predicted easily and with satisfactory accuracy. By adding the results for the octanol-water partition coefficients (logKow), the total solubility parameters at 25 °C, and the flash points to the other 10 properties previously predicted by Constantinou et al.,4,8 the total set of properties of this method is much more complete. In addition, the experimental database of the method now includes a larger number of compounds that cover homologous series with multiring, aromatic, or heterocyclic structures and have special applications in the chemical, pharmaceutical, or food industries.

The present method is currently being extended to temperature-dependent properties, such as surface tension, through its combination with the QCHB model.19

Acknowledgment

The authors express their appreciation to the Cyprus Research Promotion Foundation for the partial financial support of the project.

List of Symbols

AAE = average absolute error
AAPE = average absolute percentage error
C = universal constant of the model
f(p) = equation of property p
Fi = contribution of a first-order group of type i
logKow = octanol-water partition coefficient
mj = number of occurrences of a second-order group of type j in the compound
ni = number of occurrences of a first-order group of type i in the compound
N = number of data points
Pc = critical pressure
R = universal gas constant
r2 = correlation coefficient
SD = standard deviation
Sj = contribution of a second-order group of type j
T = temperature
Tb = normal boiling point
Tc = critical temperature
Tm = melting point
V = molar volume
Vc = critical volume
Vl = liquid molar volume
Xest = estimated value of a property
Xexp = experimental value of a property
δ = total (Hildebrand) solubility parameter
ΔGf = standard Gibbs energy
ΔHf = standard enthalpy of formation
ΔHvap = standard enthalpy of vaporization
ω = acentric factor

Appendix

Examples of Predictions of the Octanol-Water Partition Coefficients (logKow). Example 1: 1-Octanol.

first-order group    occurrences, ni    contribution, Fi    niFi
-CH3                 1                  0.6998              0.6998
-CH2                 7                  0.4707              3.2949
-OH                  1                  -1.0577             -1.0577
Σ niFi                                                      2.9370
universal constant, C                                       0.097

No second-order groups are involved. First-order approximation value

$$\log K_{ow} = \sum_i n_i F_i + 0.097 = 3.034$$

Estimated logKow = 3.03
Experimental logKow = 3.00
Percentage error = (3.03-3.00)/3.00 = 1%
Other Methods of Estimation
Marrero, Gani: Estimated logKow = 2.85. Error = (2.85-3.00)/3.00 = -4.8%
KowWin version 1.66: Estimated logKow = 2.81. Error = (2.81-3.00)/3.00 = -6.3%

Example 2: 2-Methylpyridine.

First-Order Approximation:

first-order group    occurrences, ni    contribution, Fi    niFi
-CH3                 1                  0.6998              0.6998
C5H4N                1                  0.3803              0.3803
Σ niFi                                                      1.0801
universal constant, C                                       0.097

First-order approximation value

$$\log K_{ow} = \sum_i n_i F_i + 0.097 = 1.1771$$

First-order approximation error = (1.18-1.11)/1.11 = 6.3%

Second-Order Approximation:

second-order group           occurrences, mj    contribution, Sj    mjSj
CcycHm=Ncyc-CcycHn=CcycHp    1                  -0.0582             -0.0582
Σ mjSj                                                              -0.0582

Second-order approximation value

$$\log K_{ow} = \sum_i n_i F_i + \sum_j m_j S_j + 0.097 = 1.1189$$

Estimated logKow = 1.12
Experimental logKow = 1.11
Percentage error = (1.12-1.11)/1.11 = 0.9%
Other Methods of Estimation
Marrero, Gani: Estimated logKow = 1.40. Error = (1.40-1.11)/1.11 = 26.1%
KowWin version 1.66: Estimated logKow = 1.35. Error = (1.35-1.11)/1.11 = 21.6%

Examples of Predictions of Total Solubility Parameters. Example 1: Salicylic Acid.

First-Order Approximation:

first-order group    occurrences, ni    contribution, Fi    niFi
ACH                  4                  -6.4                -25.6
AC                   1                  684.3               684.3
COOH                 1                  9477.8              9477.8
ACOH                 1                  8456.1              8456.1
Σ niFi                                                      18592.6
universal constant, C                                       75954.1

First-order approximation value

$$\text{solubility parameter} = \Bigl(\sum_i n_i F_i + C\Bigr)^{0.383837} - 56.14 = (94546.7)^{0.383837} - 56.14 = 25.11\ (\text{kJ/m}^3)^{1/2}$$

First-order approximation error = (25.11-24.21)/24.21 = 3.7%
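
The nonlinear form of the solubility parameter equation makes a quick numerical check worthwhile. The sketch below re-evaluates the salicylic acid example with the constants of eq 4 and the contributions listed in Tables 1 and 2; the function name `solubility_parameter` and the variable names are illustrative assumptions.

```python
# Constants of the solubility parameter equation (eq 4 of the paper).
SOLUBILITY_CONSTANT = 75954.1
SOLUBILITY_EXPONENT = 0.383837
SOLUBILITY_OFFSET = 56.14


def solubility_parameter(first_order_sum, second_order_sum=0.0):
    """Evaluate eq 4 for given sums of first- and second-order contributions."""
    total = first_order_sum + second_order_sum + SOLUBILITY_CONSTANT
    return total ** SOLUBILITY_EXPONENT - SOLUBILITY_OFFSET


# Salicylic acid, first-order groups from Table 1: 4 ACH, 1 AC, 1 COOH, 1 ACOH.
first_sum = 4 * (-6.4) + 684.3 + 9477.8 + 8456.1

# Second-order group ACCOOH from Table 2 (one occurrence).
second_sum = -3076.5

first_order_value = solubility_parameter(first_sum)
second_order_value = solubility_parameter(first_sum, second_sum)
print(round(first_order_value, 2), round(second_order_value, 2))
```

The two printed values can be compared with the first-order and second-order approximations reported for salicylic acid in this Appendix.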
Second-Order Approximation:

second-order group       occurrences, mj    contribution, Sj    mjSj
ACCOOH                   1                  -3076.5             -3076.5
Σ mjSj                                                          -3076.5

Second-order approximation value:

solubility parameter $= (\sum_i n_i F_i + \sum_j m_j S_j + C)^{0.383837} - 56.14 = (91470.2)^{0.383837} - 56.14 = 24.09\ (\mathrm{kJ/m^3})^{1/2}$

Estimated solubility parameter = 24.09 $(\mathrm{kJ/m^3})^{1/2}$
Experimental solubility parameter = 24.21 $(\mathrm{kJ/m^3})^{1/2}$
Percentage error = (24.09 - 24.21)/24.21 = -0.5%
Other Method of Estimation
Fedors: Estimated solubility parameter = 31.36. Error = (31.36 - 24.21)/24.21 = 29.5%

Example 2: 2-Cyclohexyl Cyclohexanone.
First-Order Approximation:

first-order group        occurrences, ni    contribution, Fi    niFi
-CH2                     8                  -277.1              -2216.8
-CH                      2                  -355.5              -711
CH2CO-                   1                  7274.2              7274.2
Σ niFi                                                          4346.4
universal constant, C                                           75954.1

First-order approximation value:

solubility parameter $= (\sum_i n_i F_i + C)^{0.383837} - 56.14 = (80300.5)^{0.383837} - 56.14 = 20.18\ (\mathrm{kJ/m^3})^{1/2}$

First-order approximation error = (20.18 - 18.77)/18.77 = 7.5%

Second-Order Approximation:

second-order group       occurrences, mj    contribution, Sj    mjSj
C(cyclic)=O              1                  -3745               -3745
Σ mjSj                                                          -3745

Second-order approximation value:

solubility parameter $= (\sum_i n_i F_i + \sum_j m_j S_j + C)^{0.383837} - 56.14 = (76555.5)^{0.383837} - 56.14 = 18.79\ (\mathrm{kJ/m^3})^{1/2}$

Estimated solubility parameter = 18.79 $(\mathrm{kJ/m^3})^{1/2}$
Experimental solubility parameter = 18.77 $(\mathrm{kJ/m^3})^{1/2}$
Percentage error = (18.79 - 18.77)/18.77 = 0.1%
Other Method of Estimation
Fedors: Estimated solubility parameter = 20.42. Error = (20.42 - 18.77)/18.77 = 8.8%

Examples of Predictions of Flash Point. Example 1: Naphthalene.
First-Order Approximation:

first-order group        occurrences, ni    contribution, Fi    niFi
ACH                      8                  11.5                92
AC                       2                  20.5                41
Σ niFi                                                          133
universal constant, C                                           216

First-order approximation value:

flash point $= \sum_i n_i F_i + 216 = 349$ K

First-order approximation error = (349 - 352)/352 = -0.9%

Second-Order Approximation:

second-order group       occurrences, mj    contribution, Sj    mjSj
AC(ACHm)2AC(ACHn)2       1                  3.2                 3.2
Σ mjSj                                                          3.2

Second-order approximation value:

flash point $= \sum_i n_i F_i + \sum_j m_j S_j + 216 = 352.2$ K

Estimated flash point = 352.2 K
Experimental flash point = 352 K
Percentage error = (352.2 - 352)/352 = 0.02%

Example 2: D-Limonene [Cyclohexene, 1-Methyl-4-(1-methylethenyl)-, (R)-].
First-Order Approximation:

first-order group        occurrences, ni    contribution, Fi    niFi
-CH3                     2                  0.6                 1.2
-CH2                     3                  10.8                32.4
-CH                      1                  12.2                12.2
CH2=C<                   1                  19.1                19.1
-CH=C<                   1                  37.9                37.9
Σ niFi                                                          102.8
universal constant, C                                           216
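The flash-point examples use a purely additive form, so they are easy to check numerically. The following Python sketch sums the first- and second-order contributions quoted above for naphthalene and D-limonene. The function name `flash_point` and the tuple layout are assumptions made for illustration, and only the contributions listed in these examples are included.

```python
# Illustrative sketch only: the helper name and data layout are hypothetical,
# and the contributions are copied from the flash-point examples above.

FLASH_C = 216.0  # universal constant C for the flash point, in K

def flash_point(first_order, second_order=()):
    """Flash point in K: sum_i n_i F_i + sum_j m_j S_j + C.

    Both arguments are sequences of (occurrences, contribution) pairs.
    """
    total = sum(n * f for n, f in first_order)
    total += sum(m * s for m, s in second_order)
    return total + FLASH_C

# Naphthalene: 8 ACH, 2 AC, plus one AC(ACHm)2AC(ACHn)2 second-order group
naphthalene_first = [(8, 11.5), (2, 20.5)]
naphthalene_second = [(1, 3.2)]
naph_first_order = flash_point(naphthalene_first)
naph_second_order = flash_point(naphthalene_first, naphthalene_second)

# D-limonene: first-order groups only
limonene_first = [(2, 0.6), (3, 10.8), (1, 12.2), (1, 19.1), (1, 37.9)]
limonene_flash = flash_point(limonene_first)

print(f"naphthalene, first order:  {naph_first_order:.1f} K")
print(f"naphthalene, second order: {naph_second_order:.1f} K")
print(f"D-limonene, first order:   {limonene_flash:.1f} K")
```

Passing an empty second-order list reproduces the first-order approximation, which mirrors how the examples treat compounds with no second-order groups.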
No second-order groups are involved. First-order approximation value:

flash point $= \sum_i n_i F_i + 216 = 318.8$ K

Estimated flash point = 318.8 K
Experimental flash point = 318 K
Percentage error = (318.8 - 318)/318 = 0.25%

Example 3: Camphor (Bicyclo-2.2.1-heptan-2-one, 1,7,7-Trimethyl-).
First-Order Approximation:

first-order group        occurrences, ni    contribution, Fi    niFi
-CH3                     3                  0.6                 1.8
-CH2                     2                  10.8                21.6
-CH                      1                  12.2                12.2
-C                       2                  12.4                24.8
CH2CO                    1                  69.7                69.7
Σ niFi                                                          130.1
universal constant, C                                           216

First-order approximation value:

flash point $= \sum_i n_i F_i + 216 = 346.1$ K

First-order approximation error = (346.1 - 339)/339 = 2.1%

Second-Order Approximation:

second-order group       occurrences, mj    contribution, Sj    mjSj
C(cyclic)=O              1                  -7.6                -7.6
Σ mjSj                                                          -7.6

Second-order approximation value:

flash point $= \sum_i n_i F_i + \sum_j m_j S_j + 216 = 338.5$ K

Estimated flash point = 338.5 K
Experimental flash point = 339 K
Percentage error = (338.5 - 339)/339 = -0.15%

Literature Cited

(1) Fredenslund, Aa.; Gmehling, J.; Rasmussen, P. Vapor-Liquid Equilibria Using UNIFAC; Elsevier Scientific: Amsterdam, 1977.
(2) Joback, K. G.; Reid, R. C. Estimation of Pure-Component Properties from Group Contributions. Chem. Eng. Commun. 1983, 57, 233.
(3) Horvath, A. L. Molecular Design; Elsevier: Amsterdam, 1992.
(4) Constantinou, L.; Gani, R. New Group Contribution Method for Estimating Properties of Pure Compounds. AIChE J. 1994, 40, 1697.
(5) Mavrovouniotis, M. L. Estimation of Properties from Conjugate Forms of Molecular Structures. Ind. Eng. Chem. Res. 1990, 32, 1734.
(6) Constantinou, L.; Prickett, S. E.; Mavrovouniotis, M. L. Estimation of Thermodynamic and Physical Properties of Acyclic Hydrocarbons Using the ABC Approach and Conjugation Operators. Ind. Eng. Chem. Res. 1993, 32 (8), 1734.
(7) Marrero, J.; Gani, R. Group Contribution Based Estimation of Pure Component Properties. Fluid Phase Equilib. 2001, 183-184, 183-208.
(8) Constantinou, L. Property Estimation Method for Accurate Process Design. Petroleum Technology Quarterly 2001, 6/2, 103-109.
(9) Constantinou, L.; Gani, R.; O'Connell, J. Estimation of the Acentric Factor and the Liquid Molar Volume at 298 K Using a New Group Contribution Method. Fluid Phase Equilib. 1995, 103, 11-22.
(10) Boudouris, D.; Constantinou, L.; Panayiotou, C. Prediction of Volumetric Behavior and Glass Transition Temperature of Polymers: A Group-Contribution Approach. Fluid Phase Equilib. 2001, 167, 1-19.
(11) Sanchez, I. C.; Lacombe, R. Statistical Thermodynamics of Polymer Solutions. Macromolecules 1978, 11, 1145.
(12) Poling, B. E.; Prausnitz, J. M.; O'Connell, J. P. The Properties of Gases and Liquids; McGraw-Hill: New York, 2000.
(13) Hansch, C.; Leo, A.; Hoekman, D. Exploring QSAR: Hydrophobic, Electronic and Steric Constants; ACS Professional Reference Book; American Chemical Society: Washington, DC, 1995.
(14) Daubert, T. E.; Danner, R. P. Physical and Thermodynamic Properties of Pure Compounds: Data Compilation; Hemisphere: New York, 1989.
(15) Fire Protection Guide to Hazardous Materials, 11th ed.; National Fire Protection Association: Quincy, MA, 1994.
(16) Meylan, W.; Howard, P. Atom/Fragment Contribution Method for Estimating Octanol-Water Partition Coefficients. J. Pharm. Sci. 1995, 84, 83.
(17) KowWin; U.S. Environmental Protection Agency: Washington, DC, 2000 (free distribution at www.epa.gov/oppt/exposure/docs/episuite.htm).
(18) Marrero, J.; Gani, R. Group-Contribution-Based Estimation of Octanol/Water Partition Coefficient and Aqueous Solubility. Ind. Eng. Chem. Res. 2002, 41, 6623.
(19) Panayiotou, C. The QHCB model of fluids and their mixtures. J. Chem. Thermodyn. 2003, 35, 349.
(20) Van Krevelen, D. W.; Hoftyzer, P. J. Properties of Polymers, Their Estimation and Correlation with Chemical Structure, 2nd ed.; Elsevier: New York, 1976.
(21) Fedors, R. F. Method for estimating both the solubility parameters and molar volumes of liquids. Polym. Eng. Sci. 1974, 14, 147.

Received for review April 8, 2004
Revised manuscript received June 10, 2004
Accepted June 17, 2004

IE0497184
