SNCF makes the internal case for AI

SNCF's Data & IA factory has developed a predictive machine learning model to support decision-making by supply planners in the group's Matériel (rolling stock) division. Each result is explained to the user to encourage adoption.

Several factors contribute to the acceptance of artificial intelligence by business teams within organizations. Acculturation is one of these levers, as are trust and explainability.

At SNCF, the explainability of the results of algorithms put into production is a strategic line of work for the Data & IA factory of ITnovem, the French group's technology subsidiary.


Explainability has become a necessity for transformation

“Explainability has almost become a necessity for us. We work with people who know the business, but not machine learning. If an AI model operates as a black box, the business is not convinced by the results it delivers,” said Amine Souiki, senior data scientist and data engineer, at the Big Data trade show.

For each project it carries out for the group's entities, the factory therefore identifies the most relevant way to explain a machine learning algorithm's predictions to the business concerned.

This methodology was applied in the design of a model to predict overstock risks for SNCF's Matériel (rolling stock) division, which manages the supply of parts to the technicentres, the group's maintenance centres. These purchases represent a substantial item of expenditure for the carrier.

In 2020, the IT department of SNCF Matériel set out to optimize its inventory management by limiting overstock as much as possible, and commissioned ITnovem's Data & IA factory for the project.

The brief was to design an AI tool for predicting overstock risks. The solution aimed to “provide predictions within the ERP to the supply planners every week to help them in their decision-making,” specifies Camille Samson, director of the Data mission at SNCF.

Predictive analytics in the service of inventory management


As is classic for an ML project, the first step took the form of a proof of concept (PoC), starting with the construction of a dataset. The team defined the model's target variable (overstock) and its explanatory variables, or features.

Overstock was categorized into three levels, from non-critical to critical. To establish this classification, the company drew on various data sources, including purchase order amounts, planning history, and reference data on parts and technicentres.
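The article does not publish the team's code, but the dataset-building step might look something like the following minimal pandas sketch, in which the file names, columns, and coverage thresholds are purely hypothetical:

```python
import pandas as pd

# Illustrative reconstruction only: file names, columns, and thresholds
# are hypothetical, not SNCF's actual schema.
orders = pd.read_csv("purchase_orders.csv")      # purchase order amounts
planning = pd.read_csv("planning_history.csv")   # planning history
parts = pd.read_csv("parts_referential.csv")     # reference data on parts and sites

df = (orders
      .merge(planning, on=["part_id", "technicentre_id"])
      .merge(parts, on="part_id"))

# Three-level overstock target, from non-critical (0) to critical (2),
# derived here from a hypothetical stock-coverage ratio.
df["overstock_level"] = pd.cut(
    df["projected_stock"] / df["forecast_demand"].clip(lower=1),
    bins=[-float("inf"), 1.5, 3, float("inf")],
    labels=[0, 1, 2],
).astype(int)
```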

The data scientists' task then came down to analyzing “the correlation between the different explanatory variables and the target variable in order to limit ourselves to the most influential variables and build the most relevant machine learning algorithm possible,” indicates Amine Souiki.
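A common way to run this kind of screening, sketched here with pandas on the hypothetical dataframe above (the top-10 cut-off is arbitrary, and a real study would also check redundancy between features):

```python
# Rank numeric features by absolute correlation with the target and keep
# the strongest ones; a simple screen before modelling.
numeric = df.select_dtypes("number")
correlations = numeric.corrwith(df["overstock_level"]).abs().sort_values(ascending=False)

# Keep the ten most influential variables (threshold is arbitrary here).
selected_features = correlations.drop("overstock_level").head(10).index.tolist()
print(correlations.head(12))
```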

With these variables selected, the team could tackle the model design stage. Several different algorithms were tested to compare their performance and identify the most suitable one.

In the end, a gradient boosting algorithm, “very popular when working with structured data,” was chosen to deliver useful predictions to the supply planning business.
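The article names neither the library nor the other candidates tested. A plausible sketch of such a comparison with scikit-learn, in which the shortlist of models is an assumption:

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = df[selected_features], df["overstock_level"]

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "gradient_boosting": GradientBoostingClassifier(),
}

# Same folds for every candidate; macro F1 weighs the three overstock
# levels equally despite any class imbalance.
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1_macro")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Gradient boosting methods often come out ahead on this kind of tabular data, which is consistent with the team's observation about structured data.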

A cloud-based, industrialized architecture

The PoC phase demonstrated the value of the model. “We achieved an eightfold reduction in the estimated cost of overstock while respecting the constraints,” says the senior data scientist.

These results allowed the data factory to trigger the industrialization stage, based in particular on the design of an architecture for processing data and computing predictions.

Data is extracted from the ERP every week and transferred to a cloud data lake on Amazon S3 (AWS). A data quality step in the pipeline cleans the data.
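The article gives no implementation detail beyond S3 and a quality step; here is a minimal sketch of what the weekly load and cleaning could look like with boto3 and pandas, with hypothetical bucket, keys, and column names:

```python
import boto3
import pandas as pd

# Hypothetical bucket and keys; the weekly ERP extract is assumed to land as CSV.
s3 = boto3.client("s3")
s3.upload_file("erp_extract_week_34.csv", "sncf-datalake", "raw/erp/week_34.csv")

# A simple data-quality pass: drop duplicates, discard rows missing key
# fields, and enforce a numeric type on amounts.
df = pd.read_csv("erp_extract_week_34.csv")
df = df.drop_duplicates()
df = df.dropna(subset=["part_id", "technicentre_id"])
df["order_amount"] = pd.to_numeric(df["order_amount"], errors="coerce")
df = df.dropna(subset=["order_amount"])
```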

The AI model is retrained monthly, while inference runs weekly. To deliver this decision-support tool to the supply planners, ITnovem operates various AWS cloud services.

“The trained algorithm, hyper-parameters and performance metrics are stored and versioned using MLflow, integrated with Databricks,” details Amine Souiki. The model's results, for their part, are served through a web application.
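MLflow's tracking API makes this kind of logging straightforward. A sketch of what the monthly retraining run might record, assuming train/test splits from the earlier steps and illustrative hyper-parameter values:

```python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score

with mlflow.start_run(run_name="overstock-monthly-retrain"):
    # Hypothetical hyper-parameter values, logged for traceability.
    params = {"n_estimators": 300, "learning_rate": 0.05, "max_depth": 3}
    mlflow.log_params(params)

    model = GradientBoostingClassifier(**params).fit(X_train, y_train)

    # Performance metric and versioned model artifact logged together.
    mlflow.log_metric("f1_macro",
                      f1_score(y_test, model.predict(X_test), average="macro"))
    mlflow.sklearn.log_model(model, "model")
```

Versioning each run this way also lets the team compare retrainings month over month and roll back to a previous model if a new one underperforms.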

Business users access the predictions, along with an explanation of each algorithmic decision in the form of a PNG image showing which variables weighed most heavily in the result.
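The article does not say which technique generates that image; SHAP values are a common choice for explaining gradient boosting models, so here is one hedged possibility (shap's return format varies across versions; this sketch assumes the list-per-class convention for multiclass models):

```python
import matplotlib
matplotlib.use("Agg")  # headless rendering; the image is served by the web app
import matplotlib.pyplot as plt
import pandas as pd
import shap

# TreeExplainer supports tree ensembles such as gradient boosting.
explainer = shap.TreeExplainer(model)
row = X.iloc[[0]]                        # one prediction to explain
shap_values = explainer.shap_values(row)

# For a multiclass model, shap_values holds one array per class; rank the
# variables by their contribution to the predicted class.
predicted = int(model.predict(row)[0])
contrib = pd.Series(shap_values[predicted][0], index=X.columns)
contrib.abs().sort_values().plot(kind="barh",
                                 title="Variables weighing on this prediction")
plt.savefig("explanation.png", bbox_inches="tight")
```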

A facilitating IT department and a business involved end to end

For Camille Samson, however, the quality of the technical solution is not enough to explain the success of this project, which brought a real change to operational processes and to employees “not necessarily attuned to AI.”

Among the key success factors, the mission director cites first and foremost the formation of a “solid, tripartite team made up of the IT department, the business, and ITnovem as data expert.”

In addition, the “IT department positioned itself as the keystone of the project. This was key to putting us in contact with the business so that it was truly involved in the system.” The business was also “integrated throughout the chain, at each stage of the project.”

This business involvement ultimately delivered a “tailor-made tool that the supply planners use in their day-to-day work,” insists Camille Samson. Finally, in terms of methodology, the data team broke the project into phases, “with specific objectives at each stage.”

On explainability, several workshops with the business were organized to determine the right type of chart for justifying the prediction algorithm's results, fostering trust and removing barriers to adoption.
