

Our ability to collect “big data” has greatly surpassed our capability to analyze it, underscoring the emergence of the fourth paradigm of science, which is data-driven discovery. The need for data informatics is also emphasized by the Materials Genome Initiative (MGI), further boosting the emerging field of materials informatics. In this article, we look at how data-driven techniques are playing a big role in deciphering processing-structure-property-performance relationships in materials, with illustrative examples of both forward models (property prediction) and inverse models (materials discovery). Such analytics can significantly reduce time-to-insight and accelerate cost-effective materials discovery, which is the goal of MGI.

