org.embl.ebi.escience.scuflui.workbench
Class WebScavenger

java.lang.Object
  extended by javax.swing.tree.DefaultMutableTreeNode
      extended by org.embl.ebi.escience.scuflui.workbench.Scavenger
          extended by org.embl.ebi.escience.scuflui.workbench.WebScavenger
All Implemented Interfaces:
java.lang.Cloneable, javax.swing.tree.MutableTreeNode, java.io.Serializable, javax.swing.tree.TreeNode

public class WebScavenger
extends Scavenger

A scavenger that performs a web crawl starting at the specified URL, looking for Scufl XML files. If it finds any, it adds the appropriate WorkflowProcessorFactory nodes to the scavenger tree. If it finds Talisman tscript definitions, it adds those too. Code adapted from the example crawler at http://developer.java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/
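
For orientation, a minimal usage sketch. This is not from the original documentation: the start URL and the tree wiring are illustrative, and the constructor's ScavengerCreationException is caught generically since its package is not shown here.

    import javax.swing.JTree;
    import javax.swing.tree.DefaultMutableTreeNode;
    import javax.swing.tree.DefaultTreeModel;

    public class WebScavengerExample {
        public static void main(String[] args) {
            try {
                // Crawl from the given URL; discovered Scufl XML workflows and
                // Talisman tscripts become child nodes under this scavenger.
                WebScavenger scavenger =
                        new WebScavenger("http://example.org/workflows/"); // hypothetical URL

                // WebScavenger is a DefaultMutableTreeNode, so it can be
                // attached directly to a Swing tree (illustrative wiring).
                DefaultMutableTreeNode root = new DefaultMutableTreeNode("Scavengers");
                root.add(scavenger);
                JTree tree = new JTree(new DefaultTreeModel(root));
            } catch (Exception e) { // ScavengerCreationException in practice
                System.err.println("Could not create scavenger: " + e.getMessage());
            }
        }
    }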


Nested Class Summary

Nested classes inherited from class javax.swing.tree.DefaultMutableTreeNode

Field Summary
static java.lang.String DISALLOW

Fields inherited from class javax.swing.tree.DefaultMutableTreeNode
allowsChildren, children, EMPTY_ENUMERATION, parent, userObject
 
Constructor Summary
WebScavenger(java.lang.String initialURL)

Method Summary
(package private)  void getXScuflURLs(java.lang.String initialURL)
static void main(java.lang.String[] args)
(package private)  boolean robotSafe(java.net.URL url)
(package private)  boolean robotSafeOld(java.net.URL url)
          Check whether there is a robots.txt that would ban access to the URL and those below it.
 java.lang.String[] search(java.lang.String initialURL)
          Return an array of strings of URLs of XScufl files found by a web crawl from the initial URL.
 
Methods inherited from class javax.swing.tree.DefaultMutableTreeNode
add, breadthFirstEnumeration, children, clone, depthFirstEnumeration, getAllowsChildren, getChildAfter, getChildAt, getChildBefore, getChildCount, getDepth, getFirstChild, getFirstLeaf, getIndex, getLastChild, getLastLeaf, getLeafCount, getLevel, getNextLeaf, getNextNode, getNextSibling, getParent, getPath, getPathToRoot, getPreviousLeaf, getPreviousNode, getPreviousSibling, getRoot, getSharedAncestor, getSiblingCount, getUserObject, getUserObjectPath, insert, isLeaf, isNodeAncestor, isNodeChild, isNodeDescendant, isNodeRelated, isNodeSibling, isRoot, pathFromAncestorEnumeration, postorderEnumeration, preorderEnumeration, remove, remove, removeAllChildren, removeFromParent, setAllowsChildren, setParent, setUserObject, toString
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

DISALLOW

public static final java.lang.String DISALLOW
See Also:
Constant Field Values
Constructor Detail

WebScavenger

public WebScavenger(java.lang.String initialURL)
             throws ScavengerCreationException
Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception

getXScuflURLs

void getXScuflURLs(java.lang.String initialURL)
             throws ScavengerCreationException

robotSafe

boolean robotSafe(java.net.URL url)

robotSafeOld

boolean robotSafeOld(java.net.URL url)
Check whether there is a robots.txt that would ban access to the URL and those below it.
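
In outline, such a check fetches /robots.txt from the URL's host and rejects the URL if its path falls under any Disallow: entry. A minimal sketch of that technique follows; this is not the class's actual code, and it ignores User-agent sections for brevity (presumably the DISALLOW field above holds the "Disallow:" prefix).

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    class RobotsCheckSketch {
        static boolean robotSafe(URL url) {
            try {
                // robots.txt always sits at the root of the host.
                URL robots = new URL(url.getProtocol(), url.getHost(),
                                     url.getPort(), "/robots.txt");
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(robots.openStream()));
                String line;
                while ((line = in.readLine()) != null) {
                    if (line.toLowerCase().startsWith("disallow:")) {
                        String path = line.substring("disallow:".length()).trim();
                        if (path.length() > 0 && url.getPath().startsWith(path)) {
                            return false; // URL falls under a disallowed path
                        }
                    }
                }
                return true;
            } catch (Exception e) {
                return true; // no readable robots.txt: treat the site as crawlable
            }
        }
    }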


search

public java.lang.String[] search(java.lang.String initialURL)
                          throws java.net.MalformedURLException
Return an array of strings of URLs of XScufl files found by a web crawl from the initial URL.
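
The crawl itself is a standard breadth-first link walk. A rough sketch of its shape follows; the href regex, the .xml suffix heuristic, and the page limit are assumptions for illustration, not the class's actual logic.

    import java.net.URL;
    import java.util.*;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class CrawlSketch {
        static String[] search(String initialURL) {
            List<String> found = new ArrayList<String>();
            Set<String> visited = new HashSet<String>();
            Queue<String> queue = new LinkedList<String>();
            queue.add(initialURL);
            Pattern href = Pattern.compile("href=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

            while (!queue.isEmpty() && visited.size() < 100) { // bound the crawl
                String current = queue.poll();
                if (!visited.add(current)) continue; // already seen
                try {
                    URL page = new URL(current);
                    // Read the whole page body in one token.
                    Scanner s = new Scanner(page.openStream()).useDelimiter("\\A");
                    String body = s.hasNext() ? s.next() : "";
                    Matcher m = href.matcher(body);
                    while (m.find()) {
                        String link = new URL(page, m.group(1)).toString();
                        if (link.endsWith(".xml")) {
                            found.add(link);  // candidate XScufl definition
                        } else {
                            queue.add(link);  // keep walking
                        }
                    }
                } catch (Exception e) {
                    // unreachable or non-HTML page: skip it
                }
            }
            return found.toArray(new String[0]);
        }
    }

The real method additionally declares java.net.MalformedURLException, and presumably consults robotSafe before fetching each page, per the signatures above.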