Class WebScavenger

  extended by javax.swing.tree.DefaultMutableTreeNode
      extended by org.embl.ebi.escience.scuflui.workbench.Scavenger
          extended by org.embl.ebi.escience.scuflui.workbench.WebScavenger
All Implemented Interfaces:
java.lang.Cloneable, javax.swing.tree.MutableTreeNode, java.io.Serializable, javax.swing.tree.TreeNode

public class WebScavenger
extends Scavenger

A scavenger that performs a web crawl starting at the specified URL to find Scufl XML (XScufl) files. If it finds any, it adds the appropriate WorkflowProcessorFactory nodes to the scavenger tree. If it finds Talisman tscript definitions, it adds those too. Code modified from that found at

Nested Class Summary
Nested classes inherited from class javax.swing.tree.DefaultMutableTreeNode
Field Summary
static java.lang.String DISALLOW
Fields inherited from class javax.swing.tree.DefaultMutableTreeNode
allowsChildren, children, EMPTY_ENUMERATION, parent, userObject
Constructor Summary
WebScavenger(java.lang.String initialURL)
Method Summary
(package private)  void getXScuflURLs(java.lang.String initialURL)
static void main(java.lang.String[] args)
(package private)  boolean robotSafe( url)
(package private)  boolean robotSafeOld( url)
          Checks whether there is a robots.txt that would ban access to the URL and those below it.
 java.lang.String[] search(java.lang.String initialURL)
          Return an array of strings of URLs of XScufl files found by a web crawl from the initial URL.
Methods inherited from class javax.swing.tree.DefaultMutableTreeNode
add, breadthFirstEnumeration, children, clone, depthFirstEnumeration, getAllowsChildren, getChildAfter, getChildAt, getChildBefore, getChildCount, getDepth, getFirstChild, getFirstLeaf, getIndex, getLastChild, getLastLeaf, getLeafCount, getLevel, getNextLeaf, getNextNode, getNextSibling, getParent, getPath, getPathToRoot, getPreviousLeaf, getPreviousNode, getPreviousSibling, getRoot, getSharedAncestor, getSiblingCount, getUserObject, getUserObjectPath, insert, isLeaf, isNodeAncestor, isNodeChild, isNodeDescendant, isNodeRelated, isNodeSibling, isRoot, pathFromAncestorEnumeration, postorderEnumeration, preorderEnumeration, remove, remove, removeAllChildren, removeFromParent, setAllowsChildren, setParent, setUserObject, toString
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Detail


public static final java.lang.String DISALLOW
Constructor Detail


public WebScavenger(java.lang.String initialURL)
             throws ScavengerCreationException
Method Detail


public static void main(java.lang.String[] args)
                 throws java.lang.Exception


void getXScuflURLs(java.lang.String initialURL)
             throws ScavengerCreationException


boolean robotSafe( url)


boolean robotSafeOld( url)
Checks whether there is a robots.txt that would ban access to the URL and those below it.
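
The robots.txt check described above might look roughly like the following. This is a minimal sketch, not the actual WebScavenger code: the class RobotsSketch, its method signatures, and the value "Disallow:" for the DISALLOW constant are all assumptions made for illustration, and User-agent grouping is ignored for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the WebScavenger API. */
class RobotsSketch {
    // Assumed directive prefix; this page does not state the real value of DISALLOW.
    static final String DISALLOW = "Disallow:";

    /** Collects the path prefixes named on Disallow lines of a robots.txt body. */
    static List<String> disallowedPrefixes(String robotsTxt) {
        List<String> prefixes = new ArrayList<>();
        for (String line : robotsTxt.split("\\r?\\n")) {
            // Strip trailing comments, then surrounding whitespace.
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            // Case-insensitive match on the directive name.
            if (line.regionMatches(true, 0, DISALLOW, 0, DISALLOW.length())) {
                String path = line.substring(DISALLOW.length()).trim();
                // An empty Disallow value bans nothing, so skip it.
                if (!path.isEmpty()) {
                    prefixes.add(path);
                }
            }
        }
        return prefixes;
    }

    /** Returns false if the path falls at or below any disallowed prefix. */
    static boolean robotSafe(String path, String robotsTxt) {
        for (String prefix : disallowedPrefixes(robotsTxt)) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the match is by prefix, a rule such as "Disallow: /private/" bans that directory and everything beneath it, which corresponds to the "URL and those below it" wording above.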


public java.lang.String[] search(java.lang.String initialURL)
Return an array of strings of URLs of XScufl files found by a web crawl from the initial URL.
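
As a rough illustration of what such a crawl involves, the sketch below walks links breadth-first and collects candidate workflow files. It is not the WebScavenger implementation. The class CrawlSketch is hypothetical, pages are supplied as an in-memory map instead of being fetched over HTTP, links are assumed to be absolute, and a ".xml" suffix is used as a stand-in for however the real class recognises XScufl files.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical breadth-first crawl over an in-memory set of pages. */
class CrawlSketch {
    // Matches double-quoted href attribute values; a simplification of real HTML parsing.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the URLs ending in ".xml" reachable from initialURL.
     * The pages map (URL to HTML body) stands in for network access.
     */
    static String[] search(String initialURL, Map<String, String> pages) {
        List<String> found = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(initialURL);
        seen.add(initialURL);
        while (!queue.isEmpty()) {
            String url = queue.remove();
            if (url.endsWith(".xml")) {
                // Assumed test for a workflow file; record it and do not crawl into it.
                found.add(url);
                continue;
            }
            String html = pages.get(url);
            if (html == null) {
                continue; // Unknown page: nothing to follow.
            }
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                String link = m.group(1);
                // Set.add returns false for links already seen, which prevents cycles.
                if (seen.add(link)) {
                    queue.add(link);
                }
            }
        }
        return found.toArray(new String[0]);
    }
}
```

A real crawler would also resolve relative links and consult robots.txt before fetching each page; both are left out here to keep the sketch short.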