
An AI Method To Classify Sentences Into 17 SDGs

April 25, 2024

By Yao Li and Michael Rockinger: HEC Lausanne, Faculty of Business and Economics, University of Lausanne.





Annual shareholder reports and sustainability reports often run to several hundred pages. How often have you sighed at the idea of having to analyze such a document? Wouldn’t you be grateful for a tool that summarizes it? In a recent publication, the authors introduce an innovative AI methodology to help you understand these documents.

This study begins with a critical inquiry: how do European banks articulate their commitment to the SDGs within their sustainability reports? This inquiry stems from the pivotal role European banks play in economic growth and in directing non-financial firms’ projects towards sustainability beyond the mandates of regulatory frameworks.[1] Acknowledging that legal origins have an impact on firms’ engagement in CSR and, by extension, the SDGs, this study focuses on a subset of European banks. Building on previous studies that manually classified bank reports to identify SDG engagement,[2] this study offers a more efficient, state-of-the-art text-analysis system for SDG classification.

Why does this matter?

For analysts and researchers, time is of the essence when decisions must be made in the face of a large volume of information. This novel approach aims to give them a powerful way to discover what firms disclose. It can also speed up the flow of information needed to make better, faster, and more sustainable decisions.


Aiming for an AI assistant that can read through hundreds of pages in minutes and extract targeted information from dense reports, the authors developed a state-of-the-art BERT model to uncover the information buried in the mountain of words in sustainability reports. To enable the model to decipher sustainability reports effectively, it was fine-tuned on sentences from general texts related to sustainability. Next, sentences from banks’ sustainability reports are converted into a structured format that the BERT model can digest and are then fed into the model. The fine-tuned BERT then classifies each sentence into its corresponding SDG. By counting the words allocated to each SDG as a percentage of the total number of words in a report, it is possible to score each report according to the 17 SDGs.
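The scoring step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: `score_report` and `toy_classifier` are hypothetical names, and the keyword lookup merely stands in for the fine-tuned BERT model so that the example runs without downloading anything. It is also assumed here that words are split on whitespace and that sentences without an assigned SDG still count toward the report's total.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


def score_report(
    sentences: List[str],
    classify: Callable[[str], Optional[int]],
) -> Dict[int, float]:
    """Return, for each SDG (1-17), the percentage of the report's words
    found in sentences assigned to that SDG."""
    words_per_sdg: Dict[int, int] = defaultdict(int)
    total_words = 0
    for sentence in sentences:
        n_words = len(sentence.split())  # whitespace tokenization (assumption)
        total_words += n_words
        sdg = classify(sentence)
        if sdg is not None:
            words_per_sdg[sdg] += n_words
    if total_words == 0:
        return {sdg: 0.0 for sdg in range(1, 18)}
    return {sdg: 100.0 * words_per_sdg[sdg] / total_words for sdg in range(1, 18)}


def toy_classifier(sentence: str) -> Optional[int]:
    """Hypothetical stand-in for the fine-tuned BERT model (keyword lookup)."""
    keywords = {"climate": 13, "water": 6, "energy": 7}
    lowered = sentence.lower()
    for keyword, sdg in keywords.items():
        if keyword in lowered:
            return sdg
    return None  # sentence not assigned to any SDG


# Made-up sentences, for illustration only.
report = [
    "We reduced financed emissions to support climate goals.",
    "Our branches cut water use by a fifth.",
    "The board met quarterly.",
]
scores = score_report(report, toy_classifier)
print({sdg: round(share, 1) for sdg, share in scores.items() if share > 0})
```

In a real pipeline, `classify` would wrap the fine-tuned model's prediction for a single sentence; keeping it as a plain callable leaves the scoring logic independent of the model used.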

Key results

  1. Performance of the Classifier

This novel AI classifier achieved an F1 score of 0.93, demonstrating its capability to classify sentences according to the various SDGs. The model performed particularly well for climate action (SDG 13) and clean water and sanitation (SDG 6).
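For readers less familiar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it per class and then averages across classes (a macro average). Whether the reported 0.93 is a macro, micro, or weighted average is not stated here, so the averaging choice is an assumption, as are the function names and the toy labels.

```python
from typing import Dict, List


def f1_per_class(y_true: List[int], y_pred: List[int]) -> Dict[int, float]:
    """Per-label F1 = 2 * precision * recall / (precision + recall),
    written in the equivalent form 2*TP / (2*TP + FP + FN)."""
    scores: Dict[int, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: List[int], y_pred: List[int]) -> float:
    """Unweighted mean of the per-label F1 scores (assumed averaging)."""
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Toy labels (SDG numbers), for illustration only.
y_true = [13, 13, 6, 6, 7, 7]
y_pred = [13, 13, 6, 7, 7, 7]
per_class = f1_per_class(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
print(per_class, round(overall, 3))
```

A per-class breakdown of this kind is what allows a statement such as "the model performs best on SDG 13 and SDG 6" to be made alongside a single overall score.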

  2. Evolution of SDG Prevalence over Time

Over the years, banks’ sustainability conversation has evolved. Table 1 shows how the prevalence of the 17 SDGs changed from 2010 to 2022.


The most discussed SDGs are displayed in red, and the least discussed ones in green. Back in 2010, banks put their emphasis on work (SDG 8), responsible consumption (SDG 12), partnerships (SDG 17), industry (SDG 9), and peace and justice (SDG 16). By 2022, however, environmental issues had taken center stage, namely climate action (SDG 13) and clean energy (SDG 7). The consistently high ranking of SDG 17 aligns with the insight that shareholders and lenders are active participants in driving the sustainability agenda.[3]

Notably, the implementation of the EU's Directive 2014/95/EU brought little change to the rankings. Despite an increase in the number of reports from 2017 on, the content of these reports remained consistent across banks. Banks newly adopting reporting practices seemed to draw inspiration from the established content frameworks of their counterparts. Moreover, since 2019, the outbreak of the COVID-19 pandemic has led to increasing attention to health (SDG 3), energy (SDG 7), climate action (SDG 13), and life on land (SDG 15). Conversely, the emphasis on education (SDG 4), work (SDG 8), industry (SDG 9), and partnerships (SDG 17) declined. These findings align with many studies collectively arguing that banks shifted their priorities towards addressing immediate health needs (SDG 3) and mitigating economic disturbances (SDGs 8 and 9) amidst the health challenges and economic aftershocks of the pandemic.

What does the future hold for sustainability conversations? It is anticipated that, although the introduction of the sustainability disclosure Directive 2022/2464 will bring more transparency and depth to how organizations report on sustainability, the content of the reports will remain homogeneous.

Looking at how different regions within the European Union addressed sustainability, Northern Europe exhibited a more moderate increase in attention to SDG 3, in line with its comprehensive public health coverage, which could have helped soften the blow of the pandemic. Furthermore, SDG 13 showed a substantial increase in Central and Eastern Europe. Already starting from the highest level in the EU, Central and Eastern Europe saw its emissions rise from 2017 to 2021 while other EU regions cut theirs. Similarly, in 2021, as other regions cut down their environmental spending, Central and Eastern Europe increased its investment in green initiatives.


When using an AI classifier, it is important to keep the issue of greenwashing in mind. This refers to the practice where banks present themselves as more environmentally friendly or more committed to sustainability than they actually are. An AI classifier can identify how banks talk about sustainability but cannot assess the sincerity of these commitments. Banks can potentially report extensively on their SDG initiatives without backing them with meaningful actions and efforts.


Imagine this: when you are overwhelmed by hundreds of pages of reports, an AI pops up on your computer and extracts the key information for you within minutes. The authors show that with the right technological tools, diving into the vast ocean of reports doesn’t have to be a daunting task. This changes the game for analysts and researchers, making analysis and decision-making more efficient.

With this cutting-edge AI methodology, we can relate the evolution of sustainability dialogues to global issues. For example, during the pandemic, there was a lot of talk about partnerships and looking after employees’ well-being. But as the world started to recover, banks began refocusing on their pre-pandemic priorities. Additionally, since 2019, climate-related issues have started to take center stage. Interestingly, banks often talked about certain goals such as decent work, industry innovation, responsible consumption, strong institutions, and partnerships.

This study offers critical insights for further investigation into the textual analysis of sustainability-related concerns. The automated AI tool significantly reduces the complexity of navigating voluminous reports, promising to transform practices for analysts and industry professionals alike.

Li, Yao, and Michael Rockinger. 2024. "Unfolding the Transitions in Sustainability Reporting" Sustainability 16, no. 2: 809.

The codes can be accessed via


[1] Allen, F.; Gale, D. Comparing Financial Systems; MIT Press: Cambridge, MA, USA, 2000.

[2] Cosma, S.; Venturelli, A.; Schwizer, P.; Boscia, V. Sustainable development and European banks: A non-financial disclosure analysis. Sustainability 2020, 12, 6146.

[3] Weber, O. Sustainable finance and the SDGs: The role of the banking sector. In Achieving the Sustainable Development Goals; Routledge: London, UK, 2019; pp. 226–239.

About the Authors: 

Yao Li graduated with a Bachelor's in Industrial Engineering from Tongji University (Shanghai) before obtaining a Master’s in Finance from Queen Mary (London) and one from HEC Lausanne. Presently, she is working on her PhD in Finance combining Machine Learning with Sustainability.



Michael Rockinger is Professor of Finance at HEC Lausanne. His research interest is in advanced text analysis with an emphasis on ESG-related topics.