Skip to main content

AGENT SQL Reference

Overview

AGENT SQL is a differentially private SQL framework designed to provide a secure and private way of querying sensitive data. This innovative solution enables users to integrate differential privacy into their SQL workflows, ensuring that their data analysis and querying processes comply with stringent privacy standards.

AGENT SQL allows seamless integration with various SQL client applications, providing a flexible and adaptable solution for data scientists and analysts. By leveraging AGENT SQL, users can perform complex data analysis tasks while maintaining the confidentiality and integrity of their data.

Supported and Unsupported Query Types

This section provides an overview of the query types supported by AGENT SQL, as well as those that are not supported. Understanding these capabilities will help you effectively utilize AGENT SQL for your data querying needs. First and the most important, the dialect of the SQL query should match the dialect of the database you are querying. Unless stated otherwise, AGENT SQL doesn't support any functions which are not supported in the underlying database.

Supported Query Types

CategoryQuery TypeDescription
AggregatesCOUNT, SUM, AVG, STDDEV, VARBasic aggregate functions for statistical analysis.
SubqueriesSubquery in the FROM, WHERE, HAVING clause, Subquery as a Scalar ValueSee below for examples of the supported subquery types.
JoinsFULL JOIN, OUTER JOIN, LEFT JOIN, RIGHT JOINVarious types of joins for combining tables within the same dataset.
ConditionsWHERE, WHERE IN, WHERE BETWEEN, WHERE IS NULL / IS NOT NULLConditional filtering of data.
GroupingGROUP BYGroups rows that have the same values in specified columns.
AliasesASRenames columns or tables in output for better readability.
ExpressionsArithmetic OperationsSupports operations like '(+)', '(-)', '(*)', '(/)', '(%)'.
FunctionsABS, CEIL, FLOOR, SQRT, etc.Mathematical functions for data manipulation.
Date and Time FunctionsCURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, EXTRACT, DAYNAMEFunctions for handling and manipulating date and time values.

Unsupported Query Types

CategoryQuery TypeDescription
AggregatesMEDIAN, MODEMore complex statistical functions are not supported.
SubqueriesCorrelated SubqueriesSubqueries that reference columns from the outer query are not supported.
JoinsCross Join, Joins across different datasetsCross joins that produce a Cartesian product of the two tables are not supported. Joins are only supported within the same dataset.
DMLMERGE, UPDATE, INSERT, DELETEData manipulation operations are not supported.
DDLCREATE, DROP, ALTER, RENAME, TRUNCATE, etcManipulating the structure of tables or columns is not supported.
FunctionsNULL IFConditional functions like NULL IF are not supported.
Date and Time FunctionsDATEADD, DATEDIFF, DATEPART, FORMAT, GETDATE, ISDATE, SYSDATETIMEThese functions are not supported.
Table Referencesdb_name.schema_name.table_nameThis format is not supported; use aliases instead. For single table queries, dataset_name.dataset_name or simply dataset_name is supported.

Subquery Patterns

AGENT SQL supports several patterns of subquery usage, with specific considerations for Differential Privacy:

Pattern 1

Subquery in the FROM Clause:

  • These subqueries act as temporary tables that the outer query can operate on.
  • Example:
    WITH DP (eps = 1, delta = 0.0001, mechanism = 'Laplace')
    SELECT SUM(sum_n2)
    FROM (SELECT SUM(num1) AS sum_n2 FROM public.my_table GROUP BY num2)
Pattern 2

Subquery as a Scalar Value:

  • These subqueries return a single value used in the outer query, often in a SELECT list or a WHERE clause. The WITH DP clause can be applied to specify privacy parameters for each subquery.
  • Example:
    WITH DP (eps = 1, delta = 0.0001, mechanism = 'Laplace')
    SELECT COUNT(num1),
    (
    WITH DP (eps = 1, delta = 0.0001, mechanism = 'Laplace')
    SELECT COUNT(num1) FROM public.my_table2
    ) AS b
    FROM public.my_table2
    WHERE num1 < b
Pattern 3

Subquery in the WHERE Clause:

  • These subqueries are used to filter results based on a condition that involves another query. The WITH DP clause can be used to manage privacy parameters.
  • Example:
    SELECT employee_id, employee_name
    FROM employees
    WHERE department_id IN (
    WITH DP (eps = 1, delta = 0.0001, mechanism = 'Laplace')
    SELECT department_id
    FROM departments
    WHERE location = 'New York'
    )
Pattern 4

Subquery in the HAVING Clause:

  • These subqueries are used to filter groups of rows based on aggregate conditions. The WITH DP clause can be applied to ensure privacy.
  • Example:
    SELECT department_id, COUNT(*)
    FROM employees
    GROUP BY department_id
    HAVING COUNT(*) > (
    WITH DP (eps = 1, delta = 0.0001, mechanism = 'Laplace')
    SELECT AVG(dept_count)
    FROM (
    SELECT COUNT(*) AS dept_count
    FROM employees
    GROUP BY department_id
    ) AS dept_counts
    )
Additional Notes
1
Differential Privacy

When using subqueries and multiple queries, the WITH DP clause allows specifying Epsilon and Delta for each step. Advanced composition techniques are used to deduct from the privacy budget, ensuring compliance with privacy standards.

2
Aggregates

Basic aggregate functions such as COUNT, SUM, AVG, STDDEV, and VAR are supported. However, more complex functions like MEDIAN and MODE are not available.

3
Conditions

A variety of conditional statements, including comparisons, range checks, and null checks are supported.

4
Expressions and Functions

A wide range of arithmetic operations and mathematical functions are supported, including date and time functions like CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, EXTRACT, and DAYNAME.

5
Joins

Various join operations are supported, including FULL, OUTER, LEFT, and RIGHT joins, but only within the same dataset.

6
Data Manipulation and Definition

Operations like MERGE, UPDATE, INSERT, and RENAME are not supported. AGENT SQL only supports querying capabilities.

7
Table References

For single table queries, dataset_name.dataset_name or simply dataset_name is supported. The db_name.schema_name.table_name format is not supported. Use aliases to reference tables in complex queries.