3. Document Object Model
The Document Object Model (DOM) is a cross-
platform and language-independent application
programming interface that treats an HTML,
XHTML, or XML document as a tree structure
wherein each node is an object representing a part
of the document.
Wikipedia
6. Data join
Here’s the deal. Instead of telling D3 how to do
something, tell D3 what you want. You want the
circle elements to correspond to data. You want
one circle per datum. Instead of instructing D3 to
create circles, then, tell D3 that the selection
"circle" should correspond to data. This concept
is called the data join.
Mike Bostock
svg.selectAll("circle")
  .data(radii)
  .enter().append("circle")
  .attr("cx", 60)
  .attr("cy", function(d, i) { return i * 100 + 30; })
  .attr("r", function(d, i) { return Math.sqrt(d); });
8. Velocity Verlet
D3 uses a Velocity Verlet numerical
integrator for simulating physical forces on
particles:
Constant unit time step: Δt = 1
Constant unit mass: m = 1
Constant acceleration: a
Force: F = ma
Velocity: V_n = V_(n-1) + aΔt
Position: P_n = P_(n-1) + V_n Δt
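The update rule above can be sketched in plain JavaScript. This is a minimal sketch, not D3's source: the `step` function and the particle shape `{x, y, vx, vy}` are hypothetical names, and updating the position with the freshly updated velocity is an assumption.

```javascript
// One integration step with unit time step and unit mass.
// `step` and the particle shape {x, y, vx, vy} are illustrative assumptions.
function step(particle, force) {
  var dt = 1;            // constant unit time step
  var m = 1;             // constant unit mass
  var ax = force.x / m;  // a = F / m
  var ay = force.y / m;
  particle.vx += ax * dt;          // V_n = V_(n-1) + a * dt
  particle.vy += ay * dt;
  particle.x += particle.vx * dt;  // P_n = P_(n-1) + V_n * dt
  particle.y += particle.vy * dt;
  return particle;
}

var p = step({ x: 0, y: 0, vx: 1, vy: 0 }, { x: 2, y: -1 });
console.log(p); // { x: 3, y: -1, vx: 3, vy: -1 }
```

Because Δt and m are both 1, applying a force amounts to adding it to the velocity, and moving the particle amounts to adding the velocity to the position.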
10. D3 Force simulation
Node properties:
• index - the node’s zero-based index into nodes
• x, y - the node’s current x- and y-positions
• vx, vy - the node’s current x- and y- velocities
• velocityDecay - “friction” that slows down a node
• alpha - “cooling parameter” that reduces the effect of forces
• tick - one iteration that calculates new positions; it decays alpha
toward alphaTarget at the rate alphaDecay and scales each velocity by (1 - velocityDecay)
11. Minimal Working Examples
• Force-Directed Graph by Mike Bostock -
https://bl.ocks.org/mbostock/4062045
• D3-Force Testing Playground by Steve Haroz -
https://bl.ocks.org/steveharoz/8c3e2524079a8c440df60c1ab72b5d03
12. Closure = function + outer context
Closure is a record storing a function together with an
environment: a mapping associating each free variable of the
function (variables that are used locally, but defined in an
enclosing scope) with the value or reference to which the
name was bound when the closure was created.
13. JavaScript Closures
function draw_network(json, svg){
  var nodes_data = json["nodes"];
  ...
  // Custom force: pushes "tf" nodes down and "kinase" nodes up.
  function splitting_force() {
    for (var i = 0, n = nodes_data.length; i < n; i++) {
      var node = nodes_data[i];
      if(node.group == "tf"){
        node.y += 10;
      } else if(node.group == "kinase"){
        node.y -= 10;
      }
    }
  }
  ...
}
14. JavaScript Closures
function draw_network(json, svg){
  var nodes_data = json["nodes"];
  ...
  function splitting_force() {
    for (var i = 0, n = nodes_data.length; i < n; i++) {
      // Pass i as an argument so the body gets its own copy.
      (function(i){
        var node = nodes_data[i];
        if(node.group == "tf"){
          node.y += 10;
        } else if(node.group == "kinase"){
          node.y -= 10;
        }
      })(i);
    }
  }
  ...
}
Editor's Notes
Today I'm gonna tell you about D3. I'll give you a brief introduction, explain the cornerstone concept, and show a few examples, including a force-directed graph layout.
D3 stands for Data-Driven Documents, which sounds a little misleading, because it's actually a JavaScript library for producing dynamic, interactive, online data visualizations.
Mike Bostock started developing a data visualization library called Protovis in 2009, when he was a PhD student at Stanford. In those days there were only a few libraries for interactive online data visualization, although demand was high.
In 2011 the Protovis team finished developing it and started developing D3 which inherited
In 2015 Mike Bostock left The New York Times to work full-time on D3.
The last important thing in the D3 timeline is the release of d3.v4 a year ago. I recommend using this version, as it is more convenient and has better performance.
D3 operates on DOM elements. Every element of a webpage is a DOM element, and the DOM treats every element as a node of a tree. In the case of D3 we're talking about SVG, Scalable Vector Graphics.
It's an XML-based vector image format for two-dimensional graphics with support for interactivity and animation.
For example, this code will generate three circles with a radius of 10 pixels. (0)
'circle' specifies the shape
'cx' and 'cy' specify the x and y positions on the canvas, and 'r' is the radius of the circle.
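The slide code referred to here is not part of these notes. As a stand-in, here is a minimal sketch that builds equivalent SVG markup for three circles of radius 10. The `circlesSvg` helper, the canvas size, and the circle positions are assumptions chosen for illustration.

```javascript
// Builds SVG markup for `count` circles of radius 10.
// Canvas size and positions are assumed values, not taken from the slides.
function circlesSvg(count) {
  var parts = ['<svg width="120" height="320" xmlns="http://www.w3.org/2000/svg">'];
  for (var i = 0; i < count; i++) {
    // 'circle' is the shape, cx/cy its center, r its radius.
    parts.push('  <circle cx="60" cy="' + (i * 100 + 30) + '" r="10"></circle>');
  }
  parts.push('</svg>');
  return parts.join('\n');
}

var markup = circlesSvg(3);
console.log(markup);
```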
Using D3 we can modify those circles. Let's change their radius. (1)
This is very cute but what if we had four numbers to display, rather than three? We wouldn’t have enough circles, and we would need to create more elements to represent our data. We can append new nodes manually, but a more powerful way is the enter selection computed by a data join.
The data join is the cornerstone concept of D3.
It's somewhat similar to SQL joins. In SQL you have two tables, and when you join them, you merge rows that have the same IDs.
In D3 you have data on one side and SVG elements on the other.
So when we do this in code, we bind our data to SVG elements: one datum per element, and each datum is stored on its element. (1 again)
How does D3 know which datum goes with which element?
D3 has a 'key function', which you can override if you wish. By default, since your data is an array and your elements are nodes of the DOM tree, it joins by index: the first datum goes with the first node, the second with the second, and so on.
If we go back to the hypothetical situation with four data points and three elements, we'll bind three data points to the three elements, and the one leftover datum will be hanging somewhere there.
In D3 this 'somewhere there' is called 'enter' selection.
Whenever code is run, it recomputes the data join and maintains the correspondence between elements and data. If the new dataset is smaller than the old one, the surplus elements end up in the exit selection and get removed. If the new dataset is larger, the surplus data ends up in the enter selection and new nodes are added. If the new dataset is exactly the same size, then all the elements are simply updated with new positions, and no elements are added or removed.
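The bookkeeping described above can be sketched without D3. This is a simplified illustration, not D3's implementation: the `join` function, its return shape, and the plain objects standing in for DOM nodes are all hypothetical.

```javascript
// Sketch of a data join (hypothetical helper, not D3's API).
// elements: objects standing in for DOM nodes, each may carry a .datum
// data:     array of new values
// key:      optional function (datum, index) -> key; defaults to the index
function join(elements, data, key) {
  key = key || function (d, i) { return i; };
  var byKey = {};
  elements.forEach(function (el, i) { byKey[key(el.datum, i)] = el; });
  var result = { update: [], enter: [], exit: [] };
  data.forEach(function (d, i) {
    var k = key(d, i);
    if (k in byKey) {
      byKey[k].datum = d;            // existing element gets the new datum
      result.update.push(byKey[k]);
      delete byKey[k];
    } else {
      result.enter.push(d);          // datum with no element yet
    }
  });
  // Whatever is left had no matching datum.
  result.exit = Object.keys(byKey).map(function (k) { return byKey[k]; });
  return result;
}

var circles = [{ datum: 10 }, { datum: 20 }, { datum: 30 }];
var grown = join(circles, [1, 4, 9, 16]);
console.log(grown.update.length, grown.enter, grown.exit.length);    // 3 [ 16 ] 0
var shrunk = join(circles, [1, 4]);
console.log(shrunk.update.length, shrunk.enter, shrunk.exit.length); // 2 [] 1
```

With four data points and three elements, the fourth datum lands in enter; with two data points, the third element lands in exit.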
Questions?
Force layouts are visually interesting. They are expanding and collapsing, you can drag them around. It’s fun and engaging.
The implementation is also interesting under the hood (in terms of the underlying principles) and easy (in terms of code).
And since we do network analysis in the lab, this is useful: it is the most obvious way to visualize networks.
And force simulation is one of the places where performance really matters when doing stuff in the browser. With a naive approach and N nodes it takes N-squared calculations. So D3 uses a technique that comes from astrophysical simulations of the n-body problem, where a quadtree lets you do N*logN calculations instead of N-squared.
D3 implements three primary forces upon the nodes at every iteration:
The sum of the forces acting on each node by all other nodes.
Many-Body simulates gravity (attraction) or electrostatic charge (repulsion).
Centering force translates nodes uniformly so that the center of mass is at the given position.
Collision treats nodes as circles with a certain radius.
The force pushing and pulling between two linked nodes
The force pulling each node to a focal point, usually the center of the user-defined space
At each tick, the nodes data array is directly manipulated with the calculated x- and y-positions. D3 triggers a “tick” event at each of these iterations, and an “end” event when the simulation ends.
On top of the three primary forces, there are two more concepts that affect the placement of the nodes at each tick:
“friction”, that slows down the rate at which the node travels from its original position to its newly calculated position,
and alpha, or the “cooling parameter”, that decreases at each tick and reduces the effect each of the forces has on the position of the nodes.
These concepts exist because, if we were to position the nodes based on the above three forces at each tick and render them, the nodes would fly everywhere. To prevent this from happening, friction slows the nodes down at each tick, and alpha slows them down between each tick. After a certain threshold is reached for alpha, the force layout stops calculating, freezing the graph into what is hopefully an optimal layout.
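The cooling loop can be sketched as follows. This is a simplified illustration rather than D3's source; the constants follow the defaults in the d3-force documentation as I understand them (alpha 1, alphaMin 0.001, alphaTarget 0, velocityDecay 0.4), and the single-node setup is an assumption.

```javascript
// Sketch of the cooling loop for one node moving along x.
var alpha = 1, alphaMin = 0.001, alphaTarget = 0;
var alphaDecay = 1 - Math.pow(alphaMin, 1 / 300); // about 0.0228
var velocityDecay = 0.4;                          // "friction"
var node = { x: 0, vx: 10 };
var ticks = 0;

while (alpha >= alphaMin) {
  alpha += (alphaTarget - alpha) * alphaDecay; // cool down toward alphaTarget
  // Forces would adjust node.vx here, scaled by alpha.
  node.vx *= 1 - velocityDecay;                // friction damps the velocity
  node.x += node.vx;                           // velocity moves the node
  ticks++;
}

console.log(ticks, alpha < alphaMin, node.x.toFixed(2));
```

With these values the loop stops after roughly 300 ticks, and a node with no forces acting on it coasts to a halt instead of flying off.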
Let's say you wrote your beautiful visualization and now you want to integrate it into a webpage. So you wrap your code into one main function. The user inputs data and presses a button, the data is sent to your function, and the function draws a graph.
If you have functions in your main function, and if those functions contain loops, at this point you'll be disappointed and confused because your function isn't gonna work properly.
The problem appeared when you wrapped your code into the main function, because that created closures. A closure is a function that contains references to variables that are declared outside its body and are not its parameters.
A closure does not merely pass the value of a variable or even a reference to the variable. A closure captures the variable itself!
So when you write something like this which is totally normal, instead of doing what you expect, JavaScript will share variable i across functions PLUS the current function/scope/context. Think of it as a sort of private global variable that only the functions involved can see.
What we want is an instance of that variable or at least a simple reference to the variable instead of the variable itself. Fortunately JavaScript already has a mechanism for passing a reference (for objects) or value (for strings and numbers): function arguments!
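The difference between sharing a variable and passing it as an argument can be seen in a small standalone sketch. The function names `makeShared` and `makeCopied` are hypothetical and do not come from the slides.

```javascript
// Shared variable: every callback captures the same `i`.
function makeShared() {
  var callbacks = [];
  for (var i = 0; i < 3; i++) {
    callbacks.push(function () { return i; });
  }
  return callbacks;
}

// Function argument: each callback gets its own copy `j`.
function makeCopied() {
  var callbacks = [];
  for (var i = 0; i < 3; i++) {
    (function (j) {
      callbacks.push(function () { return j; });
    })(i);
  }
  return callbacks;
}

var shared = makeShared().map(function (f) { return f(); });
var copied = makeCopied().map(function (f) { return f(); });
console.log(shared); // [ 3, 3, 3 ]
console.log(copied); // [ 0, 1, 2 ]
```

By the time the shared callbacks run, the loop has finished and `i` is 3 for all of them; the immediately invoked function freezes the value at each iteration.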