org.jrubyparser.util.diff
Class NodeDiff

java.lang.Object
  extended by org.jrubyparser.util.diff.NodeDiff

public class NodeDiff
extends Object

The NodeDiff class takes two Node objects and, via the SequenceMatcher class, determines the differences between them. The result is a diff: a List of Change objects. Each Change contains either a pair of Nodes (the original and its replacement) or a single Node representing an insertion or deletion (depending on whether it is in the new or the old AST), along with other relevant information about the difference.

Additionally, if given the String each Node was parsed from, NodeDiff can create a DeepDiff. The DeepDiff contains all the information of the original diff, plus subdiffs of certain types of Nodes, such as methods and classes. The subdiff uses a Levenshtein distance algorithm, which would be too imprecise for comparing the entire diff but works within the smaller context, to re-match changes that were previously disconnected and recognized as a separate deletion and insertion. Used carefully, the subdiff information can dramatically improve diffing results.

Finally, by passing in a callback, isJunk, which implements the IsJunk interface and, specifically, its checkJunk() method, one can instruct NodeDiff to ignore particular Nodes or types of Nodes.

Example:

 String codeStringA = "def foo(bar)\n bar\n end\n foo('astring')";
 String codeStringB = "'a'\ndef bloo(bar)\n puts bar\n end\n foo('astring')";

 //The implementation of parseContents is left to the reader.
 Node nodeA = parseContents(codeStringA);
 Node nodeB = parseContents(codeStringB);

 // Argument order: newNode, newDocument, oldNode, oldDocument, isJunk.
 NodeDiff nd = new NodeDiff(nodeA, codeStringA, nodeB, codeStringB, new IsJunk() {
     public boolean checkJunk(Node node) {
         // Treat every node as junk unless it is a local scope other than the root.
         return !(node instanceof ILocalScope) || node instanceof RootNode;
     }
 });

 List<DeepDiff> diff = nd.getDeepDiff();

 System.out.println(diff);
 // Output:
 //[
 // Change:
 // New Node: (DefnNode:foo, (MethodNameNode:foo), (ArgsNode, (ListNode, (
 // ArgumentNode:bar))), (NewlineNode, (LocalVarNode:bar))) Complexity: 7
 // Position: <code>:[0,2]:[0,22]
 //,
 // Change:
 // Old Node: (DefnNode:bloo, (MethodNameNode:bloo), (ArgsNode, (ListNode, (
 // ArgumentNode:bar))), (NewlineNode, (FCallNode:puts, (ArrayNode, (
 // LocalVarNode:bar))))) Complexity: 9 Position: <code>:[1,3]:[4,32]
 //]
 
 

See Also:
getDeepDiff(), getSubdiff(Change), NodeDiff(org.jrubyparser.ast.Node, org.jrubyparser.ast.Node, IsJunk), NodeDiff(org.jrubyparser.ast.Node, String, org.jrubyparser.ast.Node, String, IsJunk), SequenceMatcher, Change, DeepDiff, IsJunk, IsJunk.checkJunk(org.jrubyparser.ast.Node)

Field Summary
protected  List<DeepDiff> deepdiff
           
protected  List<Change> diff
           
protected  IsJunk isJunk
           
protected  String newDocument
           
protected  Node newNode
           
protected  String oldDocument
           
protected  Node oldNode
           
protected  SequenceMatcher SequenceMatch
           
 
Constructor Summary
NodeDiff(Node newNode, Node oldNode)
          Create a NodeDiff object without passing in the Strings that the nodes were parsed from.
NodeDiff(Node newNode, Node oldNode, IsJunk isJunk)
          Create a NodeDiff object without passing in the Strings that the nodes were parsed from.
NodeDiff(Node newNode, String newDocument, Node oldNode, String oldDocument)
          Create a NodeDiff object by passing in both the Nodes to be diffed as well as the Strings they were parsed from.
NodeDiff(Node newNode, String newDocument, Node oldNode, String oldDocument, IsJunk isJunk)
          Create a NodeDiff object by passing in both the Nodes to be diffed as well as the Strings they were parsed from.
 
Method Summary
protected  List<DeepDiff> createDeepDiff(List<Change> roughDiff)
           
protected  List<Change> createDiff(SequenceMatcher sequenceMatch)
           
 List<DeepDiff> getDeepDiff()
          Fetch or create a deep diff of the Nodes newNode and oldNode.
 List<Change> getDiff()
          Fetch or create a diff of the Nodes newNode and oldNode.
protected  List<Change> getSubdiff(Change change)
          Sorts through a diff, checking for specific, important types of Nodes (classes, methods, etc.) and performing subdiffs on those.
 void setNewDocument(String newDocument)
           
 void setOldDocument(String oldDocument)
           
protected  List<Change> sortSubdiff(List<Change> subdiff)
          We sort through subdiffs, trying to match up insertions with deletions.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

diff

protected List<Change> diff

deepdiff

protected List<DeepDiff> deepdiff

SequenceMatch

protected SequenceMatcher SequenceMatch

oldDocument

protected String oldDocument

newDocument

protected String newDocument

newNode

protected Node newNode

oldNode

protected Node oldNode

isJunk

protected IsJunk isJunk
Constructor Detail

NodeDiff

public NodeDiff(Node newNode,
                Node oldNode)
Create a NodeDiff object without passing in the Strings that the nodes were parsed from. You can create a normal diff from this. However, to create a DeepDiff (with subdiffs) it will be necessary to set both oldDocument and newDocument.

Parameters:
newNode - The current version of the node being diffed.
oldNode - The original version of the node being diffed.

NodeDiff

public NodeDiff(Node newNode,
                Node oldNode,
                IsJunk isJunk)
Create a NodeDiff object without passing in the Strings that the nodes were parsed from. You can create a normal diff from this. However, before creating a DeepDiff (with subdiffs) it will be necessary to set both oldDocument and newDocument.

Passing in isJunk allows you to customize the diff by ignoring specific Nodes or types of Nodes. isJunk is an object that implements the IsJunk interface and its checkJunk() method, which takes a Node and determines whether it should be compared in the diff or skipped. If isJunk is non-null, isJunk.checkJunk() is called on each pass through SequenceMatcher.findChanges(org.jrubyparser.ast.Node, org.jrubyparser.ast.Node). NewlineNode and BlockNode node types are ignored automatically.

Parameters:
newNode - The current version of the node being diffed.
oldNode - The original version of the node being diffed.
isJunk - A callback used to let users choose nodes not to be checked in diff.

NodeDiff

public NodeDiff(Node newNode,
                String newDocument,
                Node oldNode,
                String oldDocument)
Create a NodeDiff object by passing in both the Nodes to be diffed as well as the Strings they were parsed from. The object constructed can perform both a diff of its nodes as well as a DeepDiff (subdiff of particular nodes).

Parameters:
newNode - The current version of the node being diffed.
newDocument - The String that newNode was parsed from.
oldNode - The original version of the node being diffed.
oldDocument - The String that oldNode was parsed from.

NodeDiff

public NodeDiff(Node newNode,
                String newDocument,
                Node oldNode,
                String oldDocument,
                IsJunk isJunk)
Create a NodeDiff object by passing in both the Nodes to be diffed as well as the Strings they were parsed from. The object constructed can perform both a diff of its nodes as well as a DeepDiff (subdiff of particular nodes).

Passing in isJunk allows you to customize the diff by ignoring specific Nodes or types of Nodes. isJunk is an object that implements the IsJunk interface and its checkJunk() method, which takes a Node and determines whether it should be compared against the other node or skipped. If isJunk is non-null, isJunk.checkJunk() is called on each pass through SequenceMatcher.findChanges(org.jrubyparser.ast.Node, org.jrubyparser.ast.Node). NewlineNode and BlockNode node types are ignored automatically.

Parameters:
newNode - The current version of the node (AST) being diffed.
newDocument - The String that newNode was parsed from.
oldNode - The original version of the node (AST) being diffed.
oldDocument - The String that oldNode was parsed from.
isJunk - A callback used to let users choose nodes not to be checked in diff.
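
The filtering behaviour described above can be sketched with stand-in types. This is a minimal illustration, not SequenceMatcher's actual code: SketchNode, JunkCheck, and comparable are hypothetical names that play the roles of Node, IsJunk, and the traversal. The sketch assumes that a junk node is skipped while its children are still visited, which is what the class-level example suggests (RootNode is junk there, yet the DefnNodes beneath it are reported).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; the real types are org.jrubyparser.ast.Node and IsJunk.
class JunkFilterSketch {

    // Stand-in for an AST node: a type name plus child nodes.
    static class SketchNode {
        final String type;
        final List<SketchNode> children;

        SketchNode(String type, SketchNode... children) {
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    // Stand-in for the IsJunk callback: returning true means "skip this node".
    interface JunkCheck {
        boolean checkJunk(SketchNode node);
    }

    // Returns the types of all nodes that would be compared, in depth-first order.
    static List<String> comparable(SketchNode root, JunkCheck isJunk) {
        List<String> out = new ArrayList<>();
        collect(root, isJunk, out);
        return out;
    }

    private static void collect(SketchNode node, JunkCheck isJunk, List<String> out) {
        // A null callback means nothing is filtered.
        if (isJunk == null || !isJunk.checkJunk(node)) {
            out.add(node.type);
        }
        // Assumption of this sketch: children of a junk node are still visited.
        for (SketchNode child : node.children) {
            collect(child, isJunk, out);
        }
    }

    public static void main(String[] args) {
        SketchNode tree = new SketchNode("RootNode",
                new SketchNode("NewlineNode",
                        new SketchNode("DefnNode", new SketchNode("ArgsNode"))),
                new SketchNode("NewlineNode", new SketchNode("FCallNode")));

        // Skip the root and all newline wrappers.
        JunkCheck skipWrappers =
                n -> n.type.equals("RootNode") || n.type.equals("NewlineNode");

        System.out.println(comparable(tree, skipWrappers));
        // prints [DefnNode, ArgsNode, FCallNode]
    }
}
```

Keeping the callback a single-method interface means a caller can supply it as an anonymous class, as in the class-level example, or as a lambda.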
Method Detail

setOldDocument

public void setOldDocument(String oldDocument)

setNewDocument

public void setNewDocument(String newDocument)

getDiff

public List<Change> getDiff()
Fetch or create a diff of the Nodes newNode and oldNode. This is done via SequenceMatcher.

Returns:
A List of Change objects representing the diff.

getDeepDiff

public List<DeepDiff> getDeepDiff()
Fetch or create a deep diff of the Nodes newNode and oldNode. The ArrayList of DeepDiff objects returned will contain both primary diff information (Change objects) as well as subdiffs of some of those changes.

Because it uses the Strings that the nodes were parsed from for matching purposes, it will throw a NullPointerException if these have not been set (either at object construction or via setOldDocument(String) and setNewDocument(String)).

Returns:
A List of DeepDiff objects representing the diff and subdiff.
Throws:
NullPointerException
See Also:
getSubdiff(Change), Change, DeepDiff
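
"Fetch or create" reads as a compute-once-then-reuse pattern with a precondition on the two source Strings. The following is a minimal sketch of that pattern under that assumption. LazyDeepDiffSketch is a hypothetical stand-in, not the real NodeDiff: only the field and method names mirror this page, the diff itself is replaced by a placeholder String, and the explicit null checks with messages are an addition of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in; not the real NodeDiff.
class LazyDeepDiffSketch {
    private String oldDocument;
    private String newDocument;
    private List<String> deepdiff; // cached result; null until first computed
    int computations = 0;          // counts how often the result is actually built

    void setOldDocument(String oldDocument) { this.oldDocument = oldDocument; }

    void setNewDocument(String newDocument) { this.newDocument = newDocument; }

    // Returns the cached result, creating it on first use.
    List<String> getDeepDiff() {
        if (deepdiff == null) {
            // Both documents are required; fail with a clear message if either is missing.
            Objects.requireNonNull(oldDocument, "oldDocument must be set before getDeepDiff()");
            Objects.requireNonNull(newDocument, "newDocument must be set before getDeepDiff()");
            computations++;
            deepdiff = new ArrayList<>();
            // Placeholder for the real diffing work.
            deepdiff.add("compared " + oldDocument.length() + " and " + newDocument.length() + " chars");
        }
        return deepdiff;
    }

    public static void main(String[] args) {
        LazyDeepDiffSketch sketch = new LazyDeepDiffSketch();
        try {
            sketch.getDeepDiff();
        } catch (NullPointerException e) {
            System.out.println("Documents not set: " + e.getMessage());
        }
        sketch.setOldDocument("def bloo(bar)\n puts bar\n end");
        sketch.setNewDocument("def foo(bar)\n bar\n end");
        sketch.getDeepDiff();
        sketch.getDeepDiff();
        System.out.println(sketch.computations); // prints 1
    }
}
```

With the real class, the equivalent precaution is to call setOldDocument(String) and setNewDocument(String) before getDeepDiff() whenever a constructor without the document Strings was used.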

createDiff

protected List<Change> createDiff(SequenceMatcher sequenceMatch)

createDeepDiff

protected List<DeepDiff> createDeepDiff(List<Change> roughDiff)

getSubdiff

protected List<Change> getSubdiff(Change change)
Sorts through a diff, checking for specific, important types of Nodes (classes, methods, etc.) and performing subdiffs on those. To match nodes from the original and current versions, it calls sortSubdiff(List), which uses a Levenshtein distance measurement. Because the subdiff compares a much smaller set of potential matches, it can be more optimistic than the matching performed for an ordinary diff. Careful use of the subdiff information can dramatically improve the diff results.

Parameters:
change - The Change object being subdiffed.
Returns:
A List of Change objects; essentially a diff.

sortSubdiff

protected List<Change> sortSubdiff(List<Change> subdiff)
We sort through subdiffs, trying to match up insertions with deletions. While the main diff is weighted to avoid false positives, the subdiff has far fewer nodes to compare, so we can be a bit more liberal. False positives remain possible, but one is far less critical if it does occur.

Parameters:
subdiff - An ArrayList which is a diff of the nodes in a Change object.
Returns:
The subdiff, after sorting.
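
The idea of re-matching a deletion with an insertion by edit distance can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not NodeDiff's implementation: SubdiffMatchSketch, pair, and the maxDistance threshold are hypothetical, plain source snippets stand in for Change objects, and the greedy pairing strategy is chosen for brevity. Only the Levenshtein distance itself is the standard textbook algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of org.jrubyparser.
class SubdiffMatchSketch {

    // Textbook Levenshtein edit distance, computed with two rolling rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> first j chars of b: j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // first i chars of a -> empty prefix of b: i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Greedily pairs each deleted snippet with the closest unused inserted snippet,
    // accepting a pair only when the distance is at most maxDistance.
    // Each accepted pair is reported as "old -> new".
    static List<String> pair(List<String> deleted, List<String> inserted, int maxDistance) {
        List<String> pairs = new ArrayList<>();
        boolean[] used = new boolean[inserted.size()];
        for (String old : deleted) {
            int best = -1;
            int bestDistance = Integer.MAX_VALUE;
            for (int i = 0; i < inserted.size(); i++) {
                if (used[i]) {
                    continue;
                }
                int d = distance(old, inserted.get(i));
                if (d <= maxDistance && d < bestDistance) {
                    best = i;
                    bestDistance = d;
                }
            }
            if (best >= 0) {
                used[best] = true;
                pairs.add(old + " -> " + inserted.get(best));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> deleted = Arrays.asList("def bloo(bar)", "x = 1");
        List<String> inserted = Arrays.asList("class Widget", "def foo(bar)");
        System.out.println(pair(deleted, inserted, 3));
        // prints [def bloo(bar) -> def foo(bar)]
    }
}
```

A small threshold keeps unrelated snippets apart, while a renamed method still pairs up. Over a whole document the same threshold would produce many more accidental matches, which is why the description restricts this technique to subdiffs.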


Copyright © 2013. All Rights Reserved.