Avoiding Updates

Consider the following, trivial graph:

[1]:
import logging
logging.basicConfig(level=logging.DEBUG)

import graphcat.notebook

graph = graphcat.StaticGraph()
logger = graphcat.Logger(graph)

graph.add_task("A", graphcat.constant(3.14))
graph.add_task("B", graphcat.passthrough())
graph.add_links("A", "B")

graphcat.notebook.display(graph)
../_images/user-guide_avoiding-updates_1_0.svg

The graphcat.passthrough task function simply copies one of its inputs to its output. When we request the output from “B”, both tasks are executed and the value from “A” is returned:

[2]:
print("Output:", graph.output("B"))
graphcat.notebook.display(graph)
INFO:graphcat.common:Task A updating.
INFO:graphcat.common:Task A executing. Inputs: {}
INFO:graphcat.common:Task A finished. Output: 3.14
INFO:graphcat.common:Task B updating.
INFO:graphcat.common:Task B executing. Inputs: {None: 3.14}
INFO:graphcat.common:Task B finished. Output: 3.14
Output: 3.14
../_images/user-guide_avoiding-updates_3_2.svg

If we replace the task “A” function, both tasks become unfinished:

[3]:
graph.set_task("A", graphcat.constant(42))
graphcat.notebook.display(graph)
../_images/user-guide_avoiding-updates_5_0.svg

And retrieving the task “B” output executes both tasks again:

[4]:
print("Output:", graph.output("B"))
graphcat.notebook.display(graph)
INFO:graphcat.common:Task A updating.
INFO:graphcat.common:Task A executing. Inputs: {}
INFO:graphcat.common:Task A finished. Output: 42
INFO:graphcat.common:Task B updating.
INFO:graphcat.common:Task B executing. Inputs: {None: 42}
INFO:graphcat.common:Task B finished. Output: 42
Output: 42
../_images/user-guide_avoiding-updates_7_2.svg

All this is as expected. But what if we were to re-assign the same task function returning 42 to “A”? What if task “B” takes a long time to execute? What if there are many expensive tasks downstream from “B”? You may rightly be concerned that this would cause a lot of unnecessary computation, with downstream tasks re-executing even though the upstream value hasn’t actually changed.

Not to worry, Graphcat has your back! Let’s try re-assigning the same task function to see what happens:

[5]:
graph.set_task("A", graphcat.constant(42))
graphcat.notebook.display(graph)
../_images/user-guide_avoiding-updates_9_0.svg

Notice that even though we assigned a new task function to “A”, both tasks are still marked as finished. Requesting the output from “B” confirms this:

[6]:
print("Output:", graph.output("B"))
graphcat.notebook.display(graph)
INFO:graphcat.common:Task A updating.
INFO:graphcat.common:Task B updating.
Output: 42
../_images/user-guide_avoiding-updates_11_2.svg

Neither task is executed, and the returned value is the same. What’s going on here? How does Graphcat know?

In a nutshell, when you set the function for a task, Graphcat marks the task as unfinished only if the new task function compares unequal to the old one. The graphcat.constant function returns an instance of a callable class, Constant, and that instance is what the graph executes. By default, two distinct instances of a Python class compare unequal, even when they have the same type and contain the same data, but Constant overrides the equality method __eq__ so that two instances compare equal if their return values compare equal. Thus, repeated calls to set_task() using constant won’t trigger execution if the values are the same.
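
To make the equality rule concrete, here is a minimal sketch in plain Python. PlainTask and ValueTask are hypothetical names invented for illustration. They are not Graphcat's Constant implementation, and the (graph, name, inputs) call signature is an assumption about how task functions are invoked. The sketch only contrasts default, identity-based equality with a value-based __eq__:

```python
class PlainTask:
    """Hypothetical callable that keeps Python's default, identity-based equality."""

    def __init__(self, value):
        self.value = value

    def __call__(self, graph, name, inputs):
        # Assumed task function signature; returns the stored value.
        return self.value


class ValueTask(PlainTask):
    """Same callable, but two instances are equal when their values are equal."""

    def __eq__(self, other):
        return isinstance(other, ValueTask) and self.value == other.value


plain_equal = PlainTask(42) == PlainTask(42)      # distinct objects, default equality
value_equal = ValueTask(42) == ValueTask(42)      # same stored value
value_changed = ValueTask(42) == ValueTask(3.14)  # different stored value

print(plain_equal, value_equal, value_changed)  # False True False
```

A task function that behaves like PlainTask would look new on every assignment, while one that behaves like ValueTask would look new only when its value changes.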

This behavior applies to all of the built-in task functions provided with Graphcat, and you can apply the same technique to your own task functions to suppress unnecessary updates. For example, you might create your own version of Constant that considers two floating point values unequal only if they differ by more than some threshold. Then, you could feed continuous incoming sensor data to the graph, and the graph would execute only when the data changes enough to trigger it, using your own definition of what constitutes a significant change.
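
The threshold idea might be sketched as follows. ThresholdConstant, its tolerance parameter, and the sample readings are all hypothetical, and the (graph, name, inputs) call signature is an assumption about how task functions are invoked. The loop stands in for the comparison the graph is described as making when a task function is set, so the sketch runs without a graph:

```python
class ThresholdConstant:
    """Hypothetical task function: returns a fixed value, and compares equal
    to another instance when the two values differ by at most `tolerance`."""

    def __init__(self, value, tolerance=0.5):
        self.value = value
        self.tolerance = tolerance

    def __call__(self, graph, name, inputs):
        # Assumed task function signature; returns the stored value.
        return self.value

    def __eq__(self, other):
        if not isinstance(other, ThresholdConstant):
            return NotImplemented
        return abs(self.value - other.value) <= self.tolerance


# Simulated sensor readings (made-up data for illustration).
readings = [20.0, 20.1, 20.3, 21.2, 21.3]

current = ThresholdConstant(readings[0])
accepted = [current.value]
for reading in readings[1:]:
    candidate = ThresholdConstant(reading)
    # Mimics the "compares unequal to the old function" check.
    if candidate != current:
        current = candidate
        accepted.append(reading)

print(accepted)  # [20.0, 21.2]
```

With a real graph, each reading would instead be passed along the lines of graph.set_task("A", ThresholdConstant(reading)). Note that threshold-based equality is not transitive: a slow drift made of many small steps can stay under the tolerance at every comparison, so the choice of reference value matters.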