Data Cleaning

Enabling SQL-based Training Data Debugging for Federated Learning

How can we debug a logistic regression model in a federated learning setting when the model behaves unexpectedly (e.g., it rejects all high-income customers' loan applications)? The SQL-based training data debugging framework has …

Explaining Inference Queries with Bayesian Optimization

Obtaining an explanation for an SQL query result can enrich the analysis experience, reveal data errors, and provide deeper insight into the data. Inference query explanation seeks to explain unexpected aggregate query results on inference data; such …

Towards Complaint-driven ML Workflow Debugging

As the need for ML increases rapidly across all industry sectors, there is significant interest in integrating model inference into critical decision-making workflows. Debugging ML-enabled workflows is very challenging since an unexpected workflow …

Complaint-driven Training Data Debugging for Query 2.0

As the need for machine learning (ML) increases rapidly across all industry sectors, there is significant interest among commercial database providers in supporting "Query 2.0", which integrates model inference into SQL queries. Debugging Query 2.0 is …