Prediction of critical properties of sulfur-containing compounds: New QSPR models

https://doi.org/10.1016/j.jmgm.2020.107700

Highlights

  • New QSPR models for prediction of critical properties of sulfur compounds.

  • The model variables are solely related to the molecular structures of the compounds.

  • A large data set on the critical properties of 130 different structures was used.

  • The capability of the models was evaluated for different categories of compounds.

Abstract

In this study, new models are proposed for the prediction of the critical properties (critical temperature (TC), critical pressure (PC), and critical volume (VC)) and the acentric factor (ω) of sulfur-containing compounds based on the quantitative structure-property relationship (QSPR) approach. An extensive data set containing experimental data for over 130 different sulfur-containing compounds was employed. The Enhanced Replacement Method (ERM) was applied for variable subset selection. Based on the ERM-selected descriptors, two models, a linear model and a genetic programming (GP) based non-linear model, were proposed for each property. The predicted values of each target were in good agreement with the experimental data. For the GP-based models, the values of the coefficient of determination (R²) were 0.936, 0.976, 0.990, and 0.917 for TC, PC, VC, and ω, respectively. A comparison with the available QSPR models showed that the new models have a wider domain of applicability.

Introduction

Crude oil is considered the main source of energy throughout the world. The properties of crude oil are influenced by its sulfur content [1]. Crude oil usually contains a wide variety of sulfur-containing compounds, including thiols, disulfides, thioethers, and thiophenic compounds such as dibenzothiophene and its derivatives [2]. The presence of these compounds in oil streams causes problems in refining, such as corrosion of equipment and pipelines and catalyst deactivation. In addition, sulfur-containing compounds in refinery products such as diesel, gasoline, and jet fuel can cause many technical and environmental problems [3]. During combustion, the sulfur-containing compounds in transportation fuels are converted into sulfur oxides (SOx). The emission of SOx leads to serious environmental problems such as acid rain and air pollution [4]. Therefore, strict regulations have been enacted throughout the world to reduce SOx emissions. According to legislation in the USA and Europe, the maximum permissible sulfur content of transportation fuels is 10 ppm [5]. Several processes have been proposed for the desulfurization of fuels, including oxidative desulfurization (ODS) [6], hydrodesulfurization (HDS) [7], adsorptive desulfurization [8], biodesulfurization (BDS) [9], and extractive desulfurization (EDS) [10].

Reliable information on the physical, chemical, and thermodynamic properties of sulfur-containing compounds is essential for the modeling and simulation of such processes. Critical properties such as critical temperature (TC), critical pressure (PC), and critical volume (VC), as well as the acentric factor (ω), are required when equations of state (EOSs) or corresponding states theory are used to estimate thermodynamic and physical properties. These properties are also used in applications such as flash calculations and the estimation of fundamental properties of mixtures (e.g., heat capacity, heat of vaporization, vapor pressure, and viscosity). Experimental measurements of critical properties are subject to uncertainty because of sample impurities and because complex compounds with large structures may decompose before reaching their critical conditions [11]. Moreover, the experimental measurement of these properties is costly and time-consuming. Reliable and robust predictive models for the estimation of critical properties are therefore needed.
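To illustrate why TC, PC, and ω are needed for equation-of-state calculations, the following minimal sketch evaluates the parameters of the Peng-Robinson EOS, a widely used cubic EOS that is chosen here only as an example and is not named in the source. The input values are illustrative placeholders, not data from this study.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)


def peng_robinson_parameters(tc, pc, omega, temperature):
    """Return (a*alpha, b) of the Peng-Robinson EOS for a pure compound.

    tc and temperature in K, pc in Pa, omega dimensionless.
    """
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega ** 2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(temperature / tc))) ** 2
    a = 0.45724 * R ** 2 * tc ** 2 / pc
    b = 0.07780 * R * tc / pc
    return a * alpha, b


def pressure(temperature, molar_volume, a_alpha, b):
    """Peng-Robinson pressure in Pa for a molar volume in m^3/mol."""
    v = molar_volume
    return R * temperature / (v - b) - a_alpha / (v * (v + b) + b * (v - b))


# Illustrative placeholder inputs (assumed, not taken from the paper).
tc, pc, omega = 580.0, 5.7e6, 0.20

a_alpha, b = peng_robinson_parameters(tc, pc, omega, temperature=tc)

# Consistency check: at T = TC and the EOS critical volume
# (critical compressibility of about 0.3074 for this EOS),
# the computed pressure should be close to PC.
v_crit = 0.3074 * R * tc / pc
p_crit = pressure(tc, v_crit, a_alpha, b)
print(f"a*alpha = {a_alpha:.4f} Pa m^6/mol^2, b = {b:.3e} m^3/mol")
print(f"P(TC, v_crit) / PC = {p_crit / pc:.4f}")
```

Any error in the predicted TC, PC, or ω propagates directly into the EOS parameters, which is why the accuracy of the property models matters for downstream process calculations.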

Several theoretical methods for the determination of critical properties are available in the literature [12-26]. One category is Group Contribution (GC) based methods, which are frequently used for the prediction of critical properties and the acentric factor [13-18]. The main advantages of these methods are their simplicity and ease of application. However, GC-based methods also have important disadvantages. For example, they have limited capability for isomers and for compounds containing new groups. In addition, GC-based methods cannot take into account the interactions between bonds [12].

Quantitative Structure-Property Relationship (QSPR) is another approach, which provides reliable molecular insight by relating macroscopic properties to molecular features [27]. With the QSPR method, the properties of new compounds that have not been measured previously can be predicted. The variables of these models are molecular descriptors, which are derived from the molecular structure of the compounds. A molecular descriptor is a numerical expression of a molecular characteristic. The QSPR methodology has been applied to predict many properties such as flash point [28], Henry’s law constant [29], and normal boiling point [30]. Some researchers have used the QSPR method to predict the critical properties of pure chemicals [19-24] and their mixtures [25,26]. Table 1 shows the previously developed QSPR models for the prediction of the critical properties of pure compounds. As can be observed, some of these models (for example, the models developed by Sobati and Abooali [24]) are based on datasets that contain no sulfur-containing compounds. Besides, the number of sulfur-containing compounds in the previous datasets is limited: the largest number is 82, in the dataset of Gharagheizi and Mehrpooya [21]. Moreover, QSPR modeling of all critical properties (TC, PC, VC, and ω) for a single data set has not been carried out in previous studies.

Therefore, the main aim of the present study is the development of QSPR models for the prediction of TC, PC, VC, and ω of sulfur-containing compounds based on an extended data set with the largest number of structures. To this end, linear models are proposed first. Then, as a secondary aim, non-linear models based on the Genetic Programming (GP) approach are developed using the descriptors selected for the linear models. It should be noted that our comprehensive literature survey added new experimental data to the data set previously employed by Gharagheizi and Mehrpooya [21] for TC, PC, and ω (see Table 1). Thus, the present dataset is more comprehensive than the previously applied datasets in terms of both the number of sulfur-containing compounds and the diversity of the structures involved.

Methodology

Fig. 1 shows the workflow of QSPR model development in the current study. In the first step, the experimental data are selected from reliable sources. In the second step, the molecular structures of the compounds are drawn and optimized, and the molecular descriptors are calculated for the optimized structures. In the third step, the training and test sub-datasets are prepared by randomization. In the fourth step, the subset of molecular descriptors is selected using the enhanced
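The randomization and linear-modeling steps of such a workflow can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the two descriptors, the true coefficients, the noise level, and the 80/20 split ratio are all assumptions made for the example, and descriptor selection is omitted.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for a descriptor table (hypothetical values).
rng = random.Random(0)
X = [[rng.uniform(0, 10), rng.uniform(0, 5)] for _ in range(130)]
y = [300.0 + 40.0 * d1 - 15.0 * d2 + rng.gauss(0, 5) for d1, d2 in X]

# Random split into training and test subsets (80/20 is an assumption).
indices = list(range(len(X)))
rng.shuffle(indices)
n_train = int(0.8 * len(indices))
train, test = indices[:n_train], indices[n_train:]

coef = fit_linear([X[i] for i in train], [y[i] for i in train])
r2_test = r_squared([y[i] for i in test], predict(coef, [X[i] for i in test]))
print("coefficients:", [round(c, 2) for c in coef])
print("test R^2:", round(r2_test, 3))
```

Evaluating R² on compounds that were held out of the fit gives a measure of predictive capability rather than of goodness of fit alone.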

Results and discussion

In the present study, the predictive QSPR models for each critical property and acentric factor have been developed using the ERM algorithm. In this regard, the prediction capability of the developed models was examined by increasing the number of selected descriptors in each model. In other words, the appropriate number of descriptors for each target has been determined using the breaking point plot in which the prediction capability of the model versus the number of selected descriptors is
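The idea of reading a breaking point from such a plot can be sketched with a simple threshold rule. The rule, the `min_gain` tolerance, and the R² values below are hypothetical and serve only as an illustration; the source does not state a numerical criterion, and breaking points are often identified by visual inspection.

```python
def breaking_point(scores, min_gain=0.01):
    """Return the number of descriptors at the breaking point.

    scores[i] is the R^2 of the best model with i + 1 descriptors.
    The breaking point is taken as the smallest model size after which
    adding one more descriptor improves R^2 by less than min_gain.
    """
    for k in range(1, len(scores)):
        if scores[k] - scores[k - 1] < min_gain:
            return k
    return len(scores)


# Hypothetical R^2 values for models with 1 to 6 descriptors.
scores = [0.62, 0.81, 0.88, 0.915, 0.92, 0.922]
n_descriptors = breaking_point(scores)
print("selected number of descriptors:", n_descriptors)
```

Stopping at the breaking point keeps the model small, which limits the risk of overfitting while retaining most of the achievable accuracy.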

Conclusions

In the present study, new QSPR models were developed for the prediction of TC, PC, VC, and ω of different sulfur-containing compounds. First, linear models with 4, 5, 2, and 4 variables (i.e., molecular descriptors) were proposed for TC, PC, VC, and ω, respectively. Then, GP was implemented to develop non-linear models for each target. Unlike the available previous QSPR models in the literature, the domain of applicability of each developed model confirms that the proposed models are capable of

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (64)

  • A.S. Hukkerikar et al.

    Group-contribution+ (GC+) based estimation of properties of pure components: improved property estimation and uncertainty analysis

    Fluid Phase Equil.

    (2012)
  • L. Zhou et al.

    Quantitative structure-property relationship (QSPR) study for predicting gas-liquid critical temperatures of organic compounds

    Thermochim. Acta

    (2017)
  • S.S. Godavarthy et al.

    Improved structure–property relationship models for prediction of critical properties

    Fluid Phase Equil.

    (2008)
  • M.A. Sobati et al.

    Molecular based models for estimation of critical properties of pure refrigerants: quantitative structure property relationship (QSPR) approach

    Thermochim. Acta

    (2015)
  • L. Zhou et al.

    Predicting the gas-liquid critical temperature of binary mixtures based on the quantitative structure property relationship

    Chemometr. Intell. Lab. Syst.

    (2017)
  • E. Torabian et al.

    New structure-based models for the prediction of flash point of multi-component organic mixtures

    Thermochim. Acta

    (2019)
  • D. Ghaslani et al.

    Descriptive and predictive models for Henry’s law constant of CO2 in ionic liquids: a QSPR study

    Chem. Eng. Res. Des.

    (2017)
  • D. Abooali et al.

    Novel method for prediction of normal boiling point and enthalpy of vaporization at normal boiling point of pure refrigerants: a QSPR approach

    Int. J. Refrig.

    (2014)
  • M. Goodarzi et al.

    Application of quantitative structure-property relationship analysis to estimate the vapor pressure of pesticides

    Ecotoxicol. Environ. Saf.

    (2016)
  • D. Abooali et al.

    A new empirical model for estimation of crude oil/brine interfacial tension using genetic programming approach

    J. Petrol. Sci. Eng.

    (2019)
  • L. Jin et al.

    QSPR study on normal boiling point of acyclic oxygen containing organic compounds by radial basis function artificial neural network

    Chemometr. Intell. Lab. Syst.

    (2016)
  • J.S. Dambolena et al.

    Inhibitory effect of 10 natural phenolic compounds on Fusarium verticillioides. A structure–property–activity relationship study

    Food Contr.

    (2012)
  • M. Asadollahi-Baboli et al.

    Docking and QSAR analysis of tetracyclic oxindole derivatives as α-glucosidase inhibitors

    Comput. Biol. Chem.

    (2018)
  • M.C. Hemmer et al.

    Deriving the 3D structure of organic molecules from their infrared spectra

    Vib. Spectrosc.

    (1999)
  • S. Kovarich et al.

    QSAR classification models for the prediction of endocrine disrupting activity of brominated flame retardants

    J. Hazard Mater.

    (2011)
  • Y. Pan et al.

    Predicting the net heat of combustion of organic compounds from molecular structures based on ant colony optimization

    J. Loss Prev. Process. Ind.

    (2011)
  • M. Bagheri et al.

    Simple yet accurate prediction method for sublimation enthalpies of organic contaminants using their molecular structure

    Thermochim. Acta

    (2012)
  • K.A. Hossain et al.

    Chemometric modeling of aquatic toxicity of contaminants of emerging concern (CECs) in Dugesia japonica and its interspecies correlation with daphnia and fish: QSTR and QSTTR approaches

    Ecotoxicol. Environ. Saf.

    (2018)
  • J. Zhang et al.

    The use of an artificial neural network to estimate natural gas/water interfacial tension

    Fuel

    (2015)
  • Q. Jia et al.

    Norm indexes for predicting enthalpy of vaporization of organic compounds at the boiling point

    J. Mol. Liq.

    (2019)
  • X. Yan et al.

    A norm indexes-based QSPR model for predicting the standard vaporization enthalpy and formation enthalpy of organic compounds

    Fluid Phase Equil.

    (2020)
  • M. Mavaddat et al.

    A molecular structure based model for predicting optimal salinity of anionic surfactants

    Fluid Phase Equil.

    (2016)