Materials Science via Machine Learning: What Data, and How Much?

Since its inception, machine learning has evolved and been applied in many areas of daily life, such as recommendation and content personalisation systems on multimedia platforms (Netflix, YouTube, Disney), social media (Twitter, LinkedIn, Facebook), and search engines (Google, Bing). Machine learning algorithms are data-hungry and require massive amounts of data for reliable predictions. Data can take different forms, such as numbers, words, pictures, clicks, and anything else relevant that can be accessed. Materials science is no exception, and in the last few years machine learning has been used in this area to shift from knowledge-driven to data-driven approaches [1]. Machine learning in materials science is mostly used as a black box and hence can pose several challenges. These challenges are associated with the quantity, format, and availability of reliable experimental data. Data in the context of materials science could be microstructure, deformation curves, failure behaviour, interaction of the materials with the environment, and so on. The question here is: what type of data is necessary, and how much data is needed for confident and reliable predictions?

Figure 1: Machine learning and prediction of hydrogen interaction with single crystal steel during deformation

The answer is not straightforward and depends on many factors, including what needs to be predicted. In the following, an example of hydrogen interaction with a single crystal austenitic steel is presented to demonstrate the importance of the type and quantity of data needed for reliable predictions using machine learning.

Hydrogen storage devices use different materials, including steels. As shown in Figure 1, hydrogen can diffuse into steel and interact with defects (cavities), precipitates, vacancies, grain boundaries, dislocations, and interphases. The example presented here focuses on hydrogen interaction with cavities (voids) during deformation. The data used for machine learning is generated with a mechanistic model [2,3] after experimental validation. The mechanistic model is based on crystal plasticity theory [4,5] implemented in the finite element method (CPFEM). CPFEM is used to investigate the interaction of hydrogen with single crystal deformation, defects, and loading type, where plastic deformation is governed by dislocation motion on the 12 possible slip systems of the face-centred cubic (FCC) crystal of austenitic stainless steel. The hydrogen enhanced localised plasticity (HELP) mechanism is incorporated in the single crystal plasticity theory to account for hydrogen effects. Different defects are modelled to understand their interaction with deformation and hydrogen. To cover the possible loading conditions, a wide spectrum of loading is simulated by controlling the stress triaxiality and the Lode parameter. The data generated through CPFEM contains single crystal information during deformation; this includes stresses, strains, initial crystal orientations, initial defect sizes, initial hydrogen concentrations, plastic slip activity, triaxialities, and Lode parameters, along with the evolution of these quantities with deformation (see the bottom left contour plot in Figure 1, which shows the interaction of hydrogen in traps around the voids during deformation).
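To make the two loading descriptors concrete, here is a minimal Python sketch that computes them from the three principal stresses. It assumes commonly used definitions: triaxiality as mean stress divided by von Mises equivalent stress, and Lode parameter as L = (2σ2 − σ1 − σ3)/(σ1 − σ3) with σ1 ≥ σ2 ≥ σ3. The article does not state which definitions were used in the CPFEM study, and the function names and example stress states are hypothetical.

```python
import math


def von_mises(s1, s2, s3):
    """Equivalent (von Mises) stress from the three principal stresses."""
    return math.sqrt(0.5 * ((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2))


def triaxiality(s1, s2, s3):
    """Stress triaxiality: mean stress divided by von Mises stress.

    Undefined for a purely hydrostatic state, where the von Mises stress is zero.
    """
    mean = (s1 + s2 + s3) / 3.0
    return mean / von_mises(s1, s2, s3)


def lode_parameter(s1, s2, s3):
    """Lode parameter L = (2*s2 - s1 - s3) / (s1 - s3).

    The inputs are sorted so that s1 >= s2 >= s3; this is an assumed
    definition, one of several found in the literature.
    """
    s1, s2, s3 = sorted((s1, s2, s3), reverse=True)
    return (2.0 * s2 - s1 - s3) / (s1 - s3)


# Hypothetical stress states in arbitrary units, chosen only to span the range.
states = {
    "uniaxial tension": (1.0, 0.0, 0.0),
    "pure shear": (1.0, 0.0, -1.0),
    "equibiaxial tension": (1.0, 1.0, 0.0),
}
for name, s in states.items():
    print(f"{name}: T = {triaxiality(*s):.3f}, L = {lode_parameter(*s):+.1f}")
```

With these definitions, uniaxial tension gives a triaxiality of 1/3 and a Lode parameter of -1, while equibiaxial tension gives 2/3 and +1, so sweeping both quantities covers stress states that differ in both hydrostatic level and deviatoric shape.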

The generated data is then fed into a machine learning model (support vector machines) implemented in Python. Initially, 60% of the data is used to train the model, while the remaining 40% is used for predictions. As can be seen in Figure 1 (case 1), machine learning struggles to predict the stress-strain response for different loading and hydrogen conditions. After careful quantification of this uncertainty, the elastic part of the single crystal deformation was incorporated into the machine learning through elasticity theory with cubic symmetry for the FCC crystal, in other words making it physics informed machine learning. The results of this modification are plotted as case 2 in Figure 1. It can be inferred from the contour plots that the predictions obtained from physics informed machine learning compare well with the CPFEM results for equivalent stress, strain, and triaxiality while using the same type and amount of data.
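One way to picture the physics informed step is sketched below in Python, under stated assumptions. The elastic stress is computed analytically from Hooke's law with cubic symmetry (Voigt notation, strain expressed in the crystal axes with engineering shear strains), and only the remainder is left as the learning target. The article does not describe how the elastic part was actually built into the model, so this residual-target idea, the function names, and the elastic constants are illustrative placeholders and not the authors' method. The support vector machine itself is omitted; only the 60/40 split and the elastic calculation are shown.

```python
import random


def cubic_stiffness(c11, c12, c44):
    """6x6 stiffness matrix (Voigt notation) of a cubic crystal in its own axes."""
    C = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            C[i][j] = c11 if i == j else c12  # normal-normal coupling
        C[i + 3][i + 3] = c44  # shear terms on the diagonal
    return C


def elastic_stress(C, strain):
    """Hooke's law: stress_i = sum_j C_ij * strain_j."""
    return [sum(C[i][j] * strain[j] for j in range(6)) for i in range(6)]


def residual_target(total_stress, C, strain):
    """Stress left for the learner after removing the elastic estimate (assumed scheme)."""
    return [t - e for t, e in zip(total_stress, elastic_stress(C, strain))]


def split_60_40(samples, seed=0):
    """Shuffle reproducibly, then return (train, test) in a 60/40 proportion."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.6 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Illustrative placeholder constants in GPa (not taken from the article).
C = cubic_stiffness(c11=200.0, c12=130.0, c44=120.0)
strain = [0.001, 0.0, 0.0, 0.0, 0.0, 0.0]  # small uniaxial strain along one cube axis
print(elastic_stress(C, strain))

train, test = split_60_40(range(10))
print(len(train), len(test))
```

In such a scheme a regressor would be trained on the residual targets of the training set, and the analytical elastic stress would be added back to its output at prediction time, so the data is spent only on the part that elasticity theory does not already explain.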

Looking forward, there is no doubt that machine learning will play a crucial role in materials science. However, the questions remain subtle: what type of data is necessary and, most importantly, how much data is needed? One option is discussed above, namely the development of physics informed machine learning algorithms to mitigate experimental uncertainty and data starvation issues. This is a top trend in machine learning based materials science.


1. Siddiq, M. A. Data-driven finite element method: Theory and applications. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 0, 0954406220938805

2. Ogosi, E. I., Asim, U. B., Siddiq, M. A. & Kartal, M. E. Modelling Hydrogen Induced Stress Corrosion Cracking in Austenitic Stainless Steel. J. Mech. 36, 213–222 (2020).

3. Ogosi, E., Siddiq, A., Asim, U. B. & Kartal, M. E. Crystal plasticity based study to understand the interaction of hydrogen, defects and loading in austenitic stainless-steel single crystals. Int. J. Hydrogen Energy 45, 32632–32647 (2020).

4. Asim, U. B., Siddiq, M. A. & Kartal, M. A CPFEM based study to understand the void growth in high strength dual-phase Titanium alloy (Ti-10V-2Fe-3Al). Int. J. Plast. 122, 188–211 (2019).

5. Asim, U. B., Siddiq, M. A. & Kartal, M. Representative Volume Element (RVE) based Crystal Plasticity study of Void Growth on Phase Boundary in Titanium alloys. Comput. Mater. Sci. 161, 346–350 (2019).