Skip to content

APIv4 Explicit Joins

In a relational database, you often need to work with data from multiple entities (tables) at once. If the simpler notations for implicit join or option transformation don't meet a need, you can use the explicit syntax to indicate exactly how 2 tables should be joined.

Get Only

Unlike the implicit join and option transformation syntaxes which work for both read and write operations, explicit joins only work for get actions.

Joins or Chaining

There is some overlap between joins and Chaining; in some cases either technique can accomplish the same thing. Because chaining executes a new query for every result, joins are usually more efficient when given a choice between the two.

Specifying a Join

An explicit join is defined as an array with the following properties:

  1. EnityName AS alias - the alias can be any string of your choosing; it will be used as an identifier in other clauses (ON, SELECT, WHERE, HAVING, ORDER BY)
  2. Join type (LEFT, INNER or EXCLUDE)
    • LEFT joins make the related entity optional - rows from the main entity are always returned.
    • INNER joins make the related entity required - rows from the main entity are only returned if the join is found.
    • EXCLUDE only returns rows from the main entity are only returned if the join is not found.
  3. EntityBridge (optional) name of bridge to use (not shown, see following section on EntityBridges)
  4. One or more ON clause arrays; each is either a single clause with [expression, operator, expression], or a conjunction (AND, OR, NOT) followed by sub-clauses.
// Traditional Syntax:
civicrm_api4('Contact', 'get', [
  'join' => [
    [
      'Participant AS participants',          // Entity & alias
      'LEFT',                                 // Type of join
      NULL,                                   // entity bridge to use
      ['id', '=', 'participants.contact_id'], // ON clauses...
      ['participants.status_id:name', '=', "'Attended'"],
    ],
  ],
  ...

Quoting Strings in ON Clauses

Unlike WHERE or HAVING, the ON clause can reference field names on either side of the operator, e.g. 'participants.contact_id'. To specify a string literal like "Attended", place it in a second set of quotation marks (single or double). Note that arrays of strings do not need extra quotes, as in the below example:

The OOP syntax has a sugar addJoin method which takes the name/alias and side as the first 2 arguments, then an optional bridge (not shown, see following section on EntityBridges), and a variable number of clauses.

// OOP Syntax:
Civi\Api4\Contact::get()
  ->addJoin(
    'Participant AS participants',
    'LEFT',
    NULL,
    ['id', '=', 'participants.contact_id'],
    ['participants.status_id:name', 'IN', ['Registered', 'Attended']]
  )
  ...

Specifying 0 Clauses Deprecated

If you pass 0 clauses to the join (e.g. 'join' => [['Participant AS participants', 'LEFT']]) The API will attempt to find an FK between the two entities and automatically add it to the ON clause for you. Because this can lead to ambiguity and unexpected behavior, it is recommended to always explicitly give the FK fields in the ON clause of your join, as do the above examples.

GROUP BY and Aggregation

The relationship between joined entities is not necessarily 1-1. By default this will cause duplicate rows to be returned for the main entity. For example, an API call to Contact.get with a join on Email will return the same contact 3 times if they have 3 email addresses.

This example groups the results by Contact.id to return just one row per contact, and aggregates their email addresses into an array:

civicrm_api4('Contact', 'get', [
  'select' => ['display_name', 'GROUP_CONCAT(email.email) AS emails'],
  'join' => [
    ['Email AS email', 'LEFT', ['id', '=', 'email.contact_id']],
  ],
  'groupBy' => ['id'],
  ...

Using an EntityBridge

An EntityBridge is a small table whose primary purpose is to link two other entities together. They exist because MySQL cannot directly link a row in one table to multiple rows in another. The EntityBridge solves this by storing a row for every join. For example, a single activity can have more than one target contact, so the ActivityContact stores a row for every Contact linked to an Activity.

APIv4 allows you to join two entities through an EntityBridge with a single join clause. Just add the name of the bridge API in front of the join clauses:

civicrm_api('Activity', 'get', [
  'join' => [
    [
      "Contact AS target_contacts",               // Entity & alias
      "LEFT",                                     // Type of join
      "ActivityContact",                          // EntityBridge name
      ['id', "=", 'target_contacts.activity_id'], // ON clauses...
      ['target_contacts.record_type_id:name', '=', '"Activity Targets"'],
    ],
  ],
  ...

Notice that the ON clause specifies joining directly to the Contact entity, even though the fields activity_id or record_type_id actually belong to ActivityContact. This is because behind the scenes, the API combines the two, allowing you to treat the Contact entity as if it also has fields belonging to the ActivityContact entity.