Query execution

The second step in the query extraction and execution process is the actual execution of the queries to enable data delivery. In the Gatsby bootstrap, queries are executed by Gatsby invoking the createQueryRunningActivity function in src/query/index.js. The other two files involved in the query execution process are queue.ts and query-runner.ts, both located in the same Gatsby source directory.

TIP

For a diagram illustrating the flow involved in this step, consult the Gatsby documentation’s guide to query execution.

The first thing Gatsby needs to do to properly execute queries is select which queries need to be executed in the first place—a stage complicated by the fact that it also needs to support the gatsby develop process. For this reason, it isn’t simply a matter of executing the queries as they were enqueued at the end of the extraction step. The runQueries function is responsible for this logic.

First, Gatsby identifies all queries that were enqueued after being extracted by src/query/query-watcher.js. Next, it catalogues the queries that lack node dependencies: namely, queries whose component paths are not listed in componentDataDependencies. During schema generation, each type resolver records dependencies between the pages whose queries are being executed and the successfully resolved nodes of that type. As such, if a component is listed in the components Redux namespace but is absent from componentDataDependencies, its query has not yet been executed and requires execution. This logic is found in findIdsWithoutDataDependencies.
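The check described above can be sketched as a set difference. This is a minimal illustration, not Gatsby's implementation: the state shape (a `components` map plus `nodes` and `connections` maps inside `componentDataDependencies`) and the function name `findComponentsWithoutDependencies` are simplified assumptions made for the example.

```javascript
// Hypothetical, simplified stand-in for the relevant slices of Gatsby's Redux state.
const exampleState = {
  // Every component Gatsby knows about, keyed by component path.
  components: new Map([
    ["/src/templates/post.js", { query: "query Post { ... }" }],
    ["/src/pages/about.js", { query: "query About { ... }" }],
  ]),
  // Dependencies recorded by resolvers while earlier queries ran.
  componentDataDependencies: {
    nodes: new Map([["post-1", new Set(["/src/templates/post.js"])]]),
    connections: new Map(),
  },
};

// Return the component paths that have no recorded dependency at all,
// meaning their queries have not been executed yet.
function findComponentsWithoutDependencies(state) {
  const seen = new Set();
  const { nodes, connections } = state.componentDataDependencies;
  for (const ids of [...nodes.values(), ...connections.values()]) {
    for (const id of ids) seen.add(id);
  }
  return [...state.components.keys()].filter((id) => !seen.has(id));
}

const needsFirstRun = findComponentsWithoutDependencies(exampleState);
// needsFirstRun is ["/src/pages/about.js"]
```

In this example only the about page is returned, because the post template already appears in the recorded node dependencies.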

As we know from spinning up a local development server using the gatsby develop command, each time a node is created or updated, it must be flagged as dirty; internally speaking, it is added to the enqueuedDirtyActions collection. As queries are executed, Gatsby looks up every node in this collection in order to map it to the pages that depend on it. Pages that depend on dirty nodes (nodes that have gone stale and need updating) have queries that must be executed. This third step in the query execution process also concerns dirty connections, which depend on a node’s type rather than on an individual node. If a node is dirty, Gatsby designates all connections of that node’s type as dirty as well. This logic is found in popNodeQueries.
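The mapping from dirty nodes to the queries that must rerun can be sketched as two lookups per node: one by node ID and one by node type. This is an illustrative sketch under assumptions, not the code in popNodeQueries: the `dependencies` shape and the function name `findDirtyQueryIds` are hypothetical.

```javascript
// Hypothetical dependency index: which pages depend on which nodes,
// and which pages hold a connection over a whole node type.
const exampleDependencies = {
  nodes: new Map([
    ["post-1", new Set(["/blog/post-1/"])],
    ["post-2", new Set(["/blog/post-2/"])],
  ]),
  connections: new Map([["MarkdownRemark", new Set(["/blog/"])]]),
};

// Collect the IDs of every query affected by the given dirty nodes.
function findDirtyQueryIds(dirtyNodes, dependencies) {
  const dirtyIds = new Set();
  for (const node of dirtyNodes) {
    // Pages that depend on this specific node.
    for (const id of dependencies.nodes.get(node.id) || []) dirtyIds.add(id);
    // Pages with a connection over this node's type.
    for (const id of dependencies.connections.get(node.internal.type) || []) {
      dirtyIds.add(id);
    }
  }
  return [...dirtyIds];
}

const dirtyQueryIds = findDirtyQueryIds(
  [{ id: "post-1", internal: { type: "MarkdownRemark" } }],
  exampleDependencies
);
// dirtyQueryIds is ["/blog/post-1/", "/blog/"]
```

Updating a single post therefore reruns both that post's page query and the listing page that queries all nodes of the same type, while the page for post-2 is left alone.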

Now that Gatsby has an authoritative list of all queries requiring execution at its disposal, it will queue them for actual execution, kicking off the step by invoking the runQueriesForPathnames function. For each individual page or static query, Gatsby creates a new query job, an example of which is shown here:

{
  id: // Page path, or static query hash
  hash: // Only for static queries
  jsonName: // jsonName of static query or page
  query: // Raw query text
  componentPath: // Path to file where query is declared
  isPage: // true if not static query
  context: {
    path: // If staticQuery, is jsonName of component
    // Page object. Not for static queries
    ...page
    // Not for static queries
    ...page.context
  }
}
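To make the shape above concrete, the following sketch builds one job for a page query and one for a static query. The helper names `createPageQueryJob` and `createStaticQueryJob`, and the input objects passed to them, are hypothetical; only the resulting field names come from the listing above.

```javascript
// Build a query job for a page. `page` and `component` are hypothetical inputs.
function createPageQueryJob(page, component) {
  return {
    id: page.path, // page path identifies the job
    jsonName: page.jsonName,
    query: component.query, // raw query text
    componentPath: component.componentPath,
    isPage: true,
    // The page object and its context are both spread into the job context.
    context: { ...page, ...page.context },
  };
}

// Build a query job for a static query. `staticQuery` is a hypothetical input.
function createStaticQueryJob(staticQuery) {
  return {
    id: staticQuery.hash, // static queries are identified by their hash
    hash: staticQuery.hash,
    jsonName: staticQuery.jsonName,
    query: staticQuery.query,
    componentPath: staticQuery.componentPath,
    isPage: false,
    context: { path: staticQuery.jsonName },
  };
}

const pageJob = createPageQueryJob(
  { path: "/blog/hello/", jsonName: "blog-hello", context: { slug: "hello" } },
  { componentPath: "/src/templates/post.js", query: "query Post($slug: String) { ... }" }
);

const staticJob = createStaticQueryJob({
  hash: "12345",
  jsonName: "sq--src-components-header-js",
  componentPath: "/src/components/header.js",
  query: "query Header { ... }",
});
```

Here `pageJob.context` contains both `path` and `slug`, whereas `staticJob.context` carries only the `path` derived from the component's jsonName.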

Each individual query job contains all of the information it needs to execute the query and encode dependencies between pages and nodes therein. The query job is enqueued in src/query/query-queue.js, which uses the better-queue library to facilitate parallel query execution. Because Gatsby has dependencies only between pages and nodes, not queries themselves, parallel query execution is possible. Each time an item surfaces from the queue, Gatsby invokes query-runner.ts to execute the query, which involves the following three parameters passed to the graphql-js library:

The Gatsby schema that was inferred during schema generation
The raw query text, acquired from the query job’s contents
The context, available in the query job, containing the page’s path and other elements for dependencies between pages and nodes
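The hand-off of these three parameters can be sketched with an injected executor in place of the real library call. This is an assumption-laden illustration: `runQueryJob` and `stubExecute` are hypothetical names, the argument names `schema`, `source`, and `contextValue` mirror graphql-js's object-style arguments, and exactly how Gatsby maps a job's context onto them is not confirmed by the text.

```javascript
// Run one query job by handing schema, query text, and context to an executor.
// `execute` is injected so the sketch runs without the graphql-js dependency.
function runQueryJob(schema, job, execute) {
  return execute({
    schema, // the schema inferred during schema generation
    source: job.query, // raw query text from the job
    contextValue: job.context, // page path and related context from the job
  });
}

// Stub executor that records what it receives and returns a canned result.
const recordedCalls = [];
const stubExecute = (args) => {
  recordedCalls.push(args);
  return { data: { site: { title: "Example" } } };
};

const exampleSchema = { description: "placeholder for the inferred Gatsby schema" };
const exampleJob = {
  id: "/about/",
  query: "{ site { title } }",
  context: { path: "/about/" },
};

const queryResult = runQueryJob(exampleSchema, exampleJob, stubExecute);
```

Because each job carries everything the executor needs, jobs do not have to consult one another, which is what makes running them in parallel from a queue feasible.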

Thereafter, the graphql-js library will parse and execute the top-level query, invoking the resolvers defined during the schema generation process to query over all nodes of that type in the Redux store. Afterwards, the result is passed through the inner portions of the query, upon which each type’s resolver is called. In some cases, these resolver invocations will use custom plugin field resolvers. Because this step may generate artifacts such as manipulated images, the query execution step of the Gatsby bootstrap is often the most time-consuming. Once this step is complete, the query result is returned.

Finally, as queries are removed from the queue and executed, their results are saved to Redux, and by extension the disk, for later consumption. This process includes converting the query result to pure JSON and saving it to its associated dataPath (relative to public/static/d), which includes the jsonName and the hash of the result. For static queries, rather than employing the page’s jsonName, Gatsby utilizes the hash of the query. Once this process is complete, Gatsby uses the json-data-paths reducer to store a mapping of the page to the query result in Redux for later retrieval.
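The save step can be sketched as a pure function that plans the write, followed by a small reducer that records where the result went. Several details here are assumptions rather than Gatsby's actual behavior: the exact dataPath format, the reducer's state shape, and the names `planResultWrite` and `jsonDataPathsReducer` are hypothetical, and a simple djb2-style string hash stands in for whatever content hash Gatsby really uses.

```javascript
// Stand-in string hash (djb2 variant), NOT the hash Gatsby uses.
function simpleHash(text) {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Decide where a query result would be written, without touching the disk.
function planResultWrite(job, result) {
  const json = JSON.stringify(result); // convert the result to pure JSON
  // Pages: jsonName plus result hash. Static queries: hash of the query.
  const dataPath = job.isPage ? `${job.jsonName}-${simpleHash(json)}` : job.hash;
  return {
    jsonName: job.jsonName,
    dataPath,
    file: `public/static/d/${dataPath}.json`,
    json,
  };
}

// Hypothetical reducer that maps a jsonName to the dataPath of its result.
function jsonDataPathsReducer(state = {}, write) {
  return { ...state, [write.jsonName]: write.dataPath };
}

const pageWrite = planResultWrite(
  { isPage: true, jsonName: "blog-hello" },
  { data: { title: "Hello" } }
);
const staticWrite = planResultWrite(
  { isPage: false, jsonName: "sq--header", hash: "12345" },
  { data: { site: { title: "Example" } } }
);
const dataPaths = [pageWrite, staticWrite].reduce(jsonDataPathsReducer, {});
```

Deriving the page's dataPath from the result hash means an unchanged result maps to the same file, so a later lookup by jsonName always resolves to the current data.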

NOTE

For more information about how Gatsby handles normal queries and static queries differently in query extraction and query execution, consult the documentation’s guide to Gatsby’s internal handling of static versus normal queries.
