Innovation for Impact Fund

2026: An Intelligent Natural Language Processing Pipeline for Public Procurement Data: Enabling Predictive Policy Analysis (EDF)

This project addresses a critical bottleneck in evidence-based food policy: the severe fragmentation and inconsistency of public procurement data. We propose to develop and validate an open-source, intelligent data-cleaning pipeline that uses AI to automate the transformation of raw, unstructured bid data into a unified, analysis-ready resource. To demonstrate its utility, we will conduct a proof-of-concept case study, applying an econometric model to the cleansed data to derive initial insights into bidding dynamics. This foundational infrastructure will directly unlock timely, rigorous analysis of values-based procurement policies, empowering municipalities to design strategies that effectively support local economies, disadvantaged vendors, and environmental goals.

Cornell: Houtian (Frank) Ge (Cornell SC Johnson College of Business / Dyson School)
EDF: Daniel Kaiser (Director of Agriculture Innovation, Climate-Smart Agriculture)
