java.lang.Object
javax.swing.tree.DefaultMutableTreeNode
org.embl.ebi.escience.scuflui.workbench.Scavenger
org.embl.ebi.escience.scuflui.workbench.WebScavenger
- All Implemented Interfaces:
- java.lang.Cloneable, javax.swing.tree.MutableTreeNode, java.io.Serializable, javax.swing.tree.TreeNode
- public class WebScavenger
- extends Scavenger
A scavenger that crawls the web, starting at the specified
URL, to find Scufl XML files. For each file it finds, it adds
the appropriate WorkflowProcessorFactory node to the scavenger
tree. Any Talisman tscript definitions it finds are added
as well.
Code modified from that found at
http://developer.java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/
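As a rough illustration of how crawl results could end up in the scavenger tree, the sketch below attaches one child node per discovered URL to a root node. It is not the WebScavenger implementation: the class name ScavengerTreeSketch, the method buildTree, and the root label are hypothetical, and plain URL strings stand in for the WorkflowProcessorFactory objects, which are not shown on this page.

```java
import javax.swing.tree.DefaultMutableTreeNode;

// Hypothetical stand-in for the tree-building step; not part of the Taverna API.
class ScavengerTreeSketch {

    /** Builds a root node for the crawl and adds one child per discovered URL. */
    static DefaultMutableTreeNode buildTree(String initialURL, String[] workflowURLs) {
        // Assumed label format; the real node label is not documented here.
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("Web crawl @ " + initialURL);
        for (String url : workflowURLs) {
            // The real class would wrap a WorkflowProcessorFactory here;
            // this sketch stores the URL string as the user object instead.
            root.add(new DefaultMutableTreeNode(url));
        }
        return root;
    }

    public static void main(String[] args) {
        DefaultMutableTreeNode root = buildTree(
                "http://example.org/",
                new String[] {"http://example.org/one.xml", "http://example.org/two.xml"});
        System.out.println(root + " has " + root.getChildCount() + " children");
    }
}
```

Because Scavenger extends DefaultMutableTreeNode, the inherited add method listed below is the natural way to attach such nodes.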
Methods inherited from class javax.swing.tree.DefaultMutableTreeNode
add, breadthFirstEnumeration, children, clone, depthFirstEnumeration, getAllowsChildren, getChildAfter, getChildAt, getChildBefore, getChildCount, getDepth, getFirstChild, getFirstLeaf, getIndex, getLastChild, getLastLeaf, getLeafCount, getLevel, getNextLeaf, getNextNode, getNextSibling, getParent, getPath, getPathToRoot, getPreviousLeaf, getPreviousNode, getPreviousSibling, getRoot, getSharedAncestor, getSiblingCount, getUserObject, getUserObjectPath, insert, isLeaf, isNodeAncestor, isNodeChild, isNodeDescendant, isNodeRelated, isNodeSibling, isRoot, pathFromAncestorEnumeration, postorderEnumeration, preorderEnumeration, remove, remove, removeAllChildren, removeFromParent, setAllowsChildren, setParent, setUserObject, toString
DISALLOW
public static final java.lang.String DISALLOW
- See Also:
- Constant Field Values
WebScavenger
public WebScavenger(java.lang.String initialURL)
throws ScavengerCreationException
main
public static void main(java.lang.String[] args)
throws java.lang.Exception
getXScuflURLs
void getXScuflURLs(java.lang.String initialURL)
throws ScavengerCreationException
robotSafe
boolean robotSafe(java.net.URL url)
robotSafeOld
boolean robotSafeOld(java.net.URL url)
- Checks whether the host's robots.txt would ban access to
the given URL and the URLs below it.
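The robots.txt check described above can be sketched as a prefix match against Disallow entries. This is a simplified illustration under stated assumptions, not the WebScavenger code: the class RobotsCheckSketch and its methods are hypothetical, the value "Disallow:" is only a guess at what the DISALLOW constant holds, the robots.txt text is passed in as a string rather than fetched, and User-agent sections are ignored so every Disallow line is treated as applying to the crawler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the Taverna API.
class RobotsCheckSketch {

    // Assumed value; the actual WebScavenger.DISALLOW value is not shown on this page.
    static final String DISALLOW = "Disallow:";

    /** Collects the path prefixes named on Disallow lines, ignoring User-agent sections. */
    static List<String> disallowedPrefixes(String robotsTxt) {
        List<String> prefixes = new ArrayList<>();
        for (String line : robotsTxt.split("\\r?\\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // strip trailing comments
            }
            line = line.trim();
            if (line.regionMatches(true, 0, DISALLOW, 0, DISALLOW.length())) {
                String path = line.substring(DISALLOW.length()).trim();
                if (!path.isEmpty()) { // an empty Disallow bans nothing
                    prefixes.add(path);
                }
            }
        }
        return prefixes;
    }

    /** Returns false if the path, or any ancestor prefix of it, is disallowed. */
    static boolean isAllowed(String robotsTxt, String path) {
        for (String prefix : disallowedPrefixes(robotsTxt)) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String robotsTxt = "User-agent: *\nDisallow: /private/\n";
        System.out.println(isAllowed(robotsTxt, "/private/workflow.xml"));
        System.out.println(isAllowed(robotsTxt, "/workflows/workflow.xml"));
    }
}
```

The prefix match is what makes a single Disallow entry cover a URL and everything beneath it. A production crawler would also honor User-agent groups and Allow rules.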
search
public java.lang.String[] search(java.lang.String initialURL)
throws java.net.MalformedURLException
- Returns the URLs of the XScufl files found by a web crawl
starting at the initial URL, as an array of strings.
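A crawl of this kind can be sketched as a breadth-first traversal that follows links and collects the URLs that look like workflow files. The sketch below makes several assumptions that are not confirmed by this page: the class CrawlSketch is hypothetical, pages are supplied by a caller-provided fetch function instead of real network access, links are absolute and double-quoted, candidate files are recognized purely by a ".xml" suffix, and a maxPages limit bounds the crawl.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical breadth-first crawler; not the WebScavenger implementation.
class CrawlSketch {

    // Matches href="..." attributes; only double-quoted values are handled.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    /** Returns URLs ending in ".xml" reachable from initialURL, in discovery order. */
    static String[] search(String initialURL, Function<String, String> fetch, int maxPages) {
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        List<String> found = new ArrayList<>();
        queue.add(initialURL);
        seen.add(initialURL);
        int visited = 0;
        while (!queue.isEmpty() && visited < maxPages) {
            String url = queue.remove();
            visited++;
            if (url.endsWith(".xml")) { // assumed test for a candidate workflow file
                found.add(url);
                continue;
            }
            String body = fetch.apply(url);
            if (body == null) {
                continue; // page could not be retrieved
            }
            Matcher m = HREF.matcher(body);
            while (m.find()) {
                String link = m.group(1);
                if (seen.add(link)) { // enqueue each link only once
                    queue.add(link);
                }
            }
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Map<String, String> pages = new HashMap<>();
        pages.put("http://example.org/",
                "<a href=\"http://example.org/workflow.xml\">workflow</a>");
        for (String url : search("http://example.org/", pages::get, 10)) {
            System.out.println(url);
        }
    }
}
```

Passing the fetch function in keeps the traversal logic testable without a network. A real crawler would additionally resolve relative links, consult robots.txt before each request, and inspect file contents rather than rely on the suffix.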