Journal of Environmental Accounting and Management
António Mendes Lopes (editor)
University of Porto, Portugal

Jiazhong Zhang (editor)
School of Energy and Power Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi Province 710049, China
Fax: +86 29 82668723

Random Forest for Toxicity of Chemical Emissions: Features Selection and Uncertainty Quantification

Journal of Environmental Accounting and Management 3(3) (2015) 229--241 | DOI:10.5890/JEAM.2015.09.003

Antonino Marvuglia$^{1}$, Michael Leuenberger$^{2}$, Mikhail Kanevski$^{2}$, Enrico Benetto$^{1}$

$^{1}$ Luxembourg Institute of Science and Technology (LIST), 41, rue du Brill, L-4422 Belvaux, Luxembourg

$^{2}$ University of Lausanne, Faculty of Geosciences and Environment, Institute of Earth Surface Dynamics, Geopolis building CH-1015 Lausanne, Switzerland




Toxicity characterization of chemical emissions is a complex task that proceeds via multimedia fate and exposure models coupled with models of dose–response relationships. Several environmental multimedia models exist, but all of them require a vast amount of data on the properties of the chemical compounds being assessed. This paper deals with the selection of informative variables in the problem of deriving characterization factors for ecotoxicity and human toxicity of chemical compounds starting from molecular-based properties. The Random Forest algorithm is applied to single out the most relevant variables when modelling one toxicity factor at a time. The set of variables retained varies according to the modelled output factor, but certain variables are almost always retained among the three most important ones, regardless of the output factor considered. The modelling performed in this paper is one of the first applications of nonlinear techniques to the database of organic substances made available by the multimedia fate and exposure model USEtox, which is widely used by the Life Cycle Assessment (LCA) community.
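As an illustration of the kind of variable ranking described above, the following is a minimal sketch and not the authors' implementation. It assumes scikit-learn's `RandomForestRegressor` and its impurity-based `feature_importances_`, which may differ from the importance measure used in the paper. The data are synthetic stand-ins (six hypothetical descriptors, one hypothetical toxicity factor), not values from the USEtox database.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for a table of molecular properties (NOT USEtox data):
# 300 hypothetical compounds described by 6 hypothetical descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))

# Hypothetical toxicity factor that depends nonlinearly on descriptors 0 and 1
# only; descriptors 2..5 are pure noise with respect to the output.
y = 3.0 * X[:, 0] ** 2 + 1.5 * np.sin(X[:, 1]) + 0.1 * rng.normal(size=300)

# One forest is fitted per output factor; here there is a single output.
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importance of each input variable (values sum to 1).
importances = forest.feature_importances_
ranking = np.argsort(importances)[::-1]  # most important variable first

# Spread of the importance across the individual trees, used here as a
# rough indication of how stable each variable's importance is.
per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])
spread = per_tree.std(axis=0)

for idx in ranking:
    print(f"descriptor {idx}: importance={importances[idx]:.3f} "
          f"(std across trees={spread[idx]:.3f})")
```

Repeating the fit with a different output column and comparing the top-ranked descriptors mirrors the "one toxicity factor at a time" procedure; the per-tree spread is only a simple proxy for uncertainty, not a formal quantification.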


This work has been carried out in the framework of the project UNIC (Using Machine Learning for toxicological characterization of chemical emissions) under a research visiting grant provided by the Herbette Foundation, Lausanne, Switzerland.

