Decoding ESG Data with AI

ESGLens is a system that processes sustainability reports from companies in major market indices (QQQ, S&P 500, and Russell 1000) through a five-stage pipeline: data collection; PDF processing with FAISS vector databases and OpenAI embeddings; targeted data extraction guided by GRI standards; ChatGPT-driven summarization; and scoring with a regression model, either a neural network or LightGBM. The output is a quantitative ESG assessment benchmarked against existing LSEG data. The approach attempts to distill complex qualitative information into a measurable, comparable metric, one that remains subject to the inherent decay of any derived score.

A new framework uses artificial intelligence to automatically analyze corporate sustainability reports and predict ESG performance.
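
To make the retrieval idea behind the PDF-processing and extraction stages concrete, here is a minimal sketch, not the ESGLens implementation. It assumes a toy bag-of-words vector in place of OpenAI embeddings and a brute-force cosine-similarity scan in place of a FAISS index, so that it runs with the Python standard library alone. The function names (`embed`, `cosine`, `retrieve`), the report chunks, and the query are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute-force scan,
    standing in for a vector-index lookup)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical text chunks, as if split out of a sustainability report PDF.
chunks = [
    "Scope 1 greenhouse gas emissions fell to 1.2 million tonnes of CO2 equivalent.",
    "The board consists of eleven directors, five of whom are women.",
    "Our employee volunteering program expanded to twelve countries.",
]

# Hypothetical query, loosely modeled on a GRI-style emissions disclosure.
query = "direct scope 1 greenhouse gas emissions"

top = retrieve(query, chunks, k=1)
print(top[0])
```

In a pipeline like the one described, the retrieved chunks would then be passed to the summarization stage, and the extracted values would feed the regression model. Swapping the toy pieces for real embeddings and a vector index changes the quality and speed of retrieval, but not the overall shape of this step.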