ca.cbc.sportwire
Class WireFeeder

java.lang.Object
  |
  +--ca.cbc.sportwire.WireFeeder
All Implemented Interfaces:
WireFeederProperties

public class WireFeeder
extends java.lang.Object
implements WireFeederProperties

Copyright (C) 2001 Canadian Broadcasting Corporation (cbc.ca)

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA


Field Summary
protected static java.util.Date boottime
          boottime: variable
(package private) static org.apache.log4j.Category cat
          Set up a reporting category in Log4J
private  org.apache.commons.collections.ExtendedProperties config
           
private  DocHandler docHandler
           
private  SportwireFeed feed
           
private  java.util.List ignorePatterns
           
private static int MIN_LINES
           
private  DocQueue queue
           
protected  java.util.Date starttime
           
protected  java.util.Date timestamp
           
 
Fields inherited from interface ca.cbc.sportwire.WireFeederProperties
CONFIGFILE_DEFAULT, CONFIGFILE_PROPERTY, DEFAULT_FILENAME_XPATH, DEFAULT_SAX_DRIVER_CLASS, DEFAULT_XML_PATH, DEFAULT_XMLRPC_PORT, DOCHANDLER_DEFAULT, DOCHANDLER_PROPERTY, DOCWORKERS_DEFAULT, DOCWORKERS_PROPERTY, DTD_PATH_PROPERTY, FEED_REGEX_PROPERTY, FEEDCLASS_DEFAULT, FEEDCLASS_PROPERTY, FEEDFILTER_DEFAULT, FEEDFILTER_PROPERTY, FILENAME_XPATH_PROPERTY, IGNOREFILE_PROPERTY, LOGFILE_DEFAULT, LOGFILE_PROPERTY, MAP_PATH_PROPERTY, SAX_CLASS_PROPERTY, WATCHDOG_IDLE_DEFAULT, WATCHDOG_IDLE_PROPERTY, XML_PATH_PROPERTY, XMLRPC_PORT_PROPERTY, XSL_PATH_PROPERTY
 
Constructor Summary
protected WireFeeder()
          Creates a new WireFeeder instance.
protected WireFeeder(java.lang.String conf)
          WireFeeder constructor accepts a config filename.
 
Method Summary
protected  org.apache.commons.collections.ExtendedProperties getConfig()
          getConfig: Access the properties list.
protected  DocHandler getDocHandler()
          Get the current docHandler.
protected  SportwireFeed getFeed()
          Access the current feed object.
protected  DocQueue getQueue()
          getQueue: access the document queue.
protected  void loadIgnoreList(java.lang.String ignoreFile)
          loadIgnoreList: Loads and pre-compiles a list of Perl regular expressions that specify document tags to ignore.
private  void loadProperties(java.lang.String conf)
           
private  void loadRegexMap(org.apache.commons.collections.ExtendedProperties conf)
           
static void main(java.lang.String[] argv)
          Command line main function; use a property to determine the specific input feed or doc processing module (should someday add a command line override)
private  boolean onIgnoreList(java.lang.String doctag)
           
protected  void readFeed()
          readFeed rips through the continuous stream of input until the read function starts returning consecutive nulls, which probably means we are out of input; when input runs out, we flag the workers to exit and return to await job completion.
protected  void setConfig(org.apache.commons.collections.ExtendedProperties v)
          Set the properties list
protected  void setDocHandler()
          Set the value of docHandler from the class name found in the property DOCHANDLER_PROPERTY.
protected  void setDocHandler(java.lang.String dh)
          setDocHandler: set the handler to a class from the class name.
protected  void setFeed()
          Set the feed from a classname specified by the FEEDCLASS_PROPERTY
protected  void setFeed(java.lang.String f)
          setFeed: set the feed by class name, invoking its getInstance method (the class is assumed to follow the Singleton pattern)
private static void setLogging(boolean debug)
          setLogging loads filters and appenders from the property file specified by wirefeeder.log4j.conf.
private  void startWatchdog()
           
 java.util.Map status()
          status: XMLRPC method to report on the system status; called as sportwire.status from localhost:8484/, this returns a few uptime diagnostics useful for ensuring the system is still alive.
static void usage()
          usage: print the command line options to stderr
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

cat

static org.apache.log4j.Category cat
Set up a reporting category in Log4J

MIN_LINES

private static int MIN_LINES

docHandler

private DocHandler docHandler

feed

private SportwireFeed feed

queue

private DocQueue queue

config

private org.apache.commons.collections.ExtendedProperties config

ignorePatterns

private java.util.List ignorePatterns

timestamp

protected java.util.Date timestamp

starttime

protected java.util.Date starttime

boottime

protected static java.util.Date boottime
boottime: variable

Constructor Detail

WireFeeder

protected WireFeeder()
              throws java.io.IOException,
                     java.lang.ClassNotFoundException
Creates a new WireFeeder instance. This should have the effect of a total warm restart. The constructor instantiates the feed and the default document handler.
Throws:
java.io.IOException - if an input error occurs
java.lang.ClassNotFoundException - if the feeder fails to load

WireFeeder

protected WireFeeder(java.lang.String conf)
              throws java.io.IOException,
                     java.lang.ClassNotFoundException,
                     java.io.FileNotFoundException
WireFeeder constructor accepts a config filename.
Parameters:
conf - a String value
Throws:
java.io.IOException - if an error occurs
java.lang.ClassNotFoundException - if an error occurs
java.io.FileNotFoundException - if an error occurs

Method Detail

getDocHandler

protected DocHandler getDocHandler()
Get the current docHandler.
Returns:
value of type DocHandler.

setDocHandler

protected void setDocHandler()
                      throws java.lang.ClassNotFoundException,
                             java.lang.IllegalAccessException,
                             java.lang.InstantiationException,
                             java.lang.NoSuchMethodException,
                             java.lang.reflect.InvocationTargetException
Set the value of docHandler from the class name found in the property DOCHANDLER_PROPERTY.
Throws:
java.lang.ClassNotFoundException - if missing
java.lang.IllegalAccessException - if security error
java.lang.InstantiationException - if constructor fails
java.lang.NoSuchMethodException - if no getInstance
java.lang.reflect.InvocationTargetException - if other error

setDocHandler

protected void setDocHandler(java.lang.String dh)
                      throws java.lang.ClassNotFoundException,
                             java.lang.IllegalAccessException,
                             java.lang.InstantiationException,
                             java.lang.NoSuchMethodException,
                             java.lang.reflect.InvocationTargetException
setDocHandler: set the handler to a class from the class name.
Parameters:
dh - a classname String value
Throws:
java.lang.ClassNotFoundException - if missing
java.lang.IllegalAccessException - if security error
java.lang.InstantiationException - if constructor fails
java.lang.NoSuchMethodException - if no getInstance
java.lang.reflect.InvocationTargetException - if other error

getFeed

protected SportwireFeed getFeed()
Access the current feed object.
Returns:
SportwireFeed object

setFeed

protected void setFeed()
                throws java.lang.ClassNotFoundException,
                       java.lang.NoSuchMethodException,
                       java.lang.IllegalAccessException,
                       java.lang.reflect.InvocationTargetException
Set the feed from a classname specified by the FEEDCLASS_PROPERTY
Throws:
java.lang.ClassNotFoundException - if missing
java.lang.NoSuchMethodException - if no getInstance
java.lang.IllegalAccessException - if security error
java.lang.reflect.InvocationTargetException - if other error occurs

setFeed

protected void setFeed(java.lang.String f)
                throws java.lang.ClassNotFoundException,
                       java.lang.NoSuchMethodException,
                       java.lang.IllegalAccessException,
                       java.lang.reflect.InvocationTargetException
setFeed: set the feed by class name, invoking its getInstance method (the class is assumed to follow the Singleton pattern)
Parameters:
f - a String value
Throws:
java.lang.ClassNotFoundException - if missing
java.lang.NoSuchMethodException - if no getInstance
java.lang.IllegalAccessException - if security error
java.lang.reflect.InvocationTargetException - if other error occurs
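
The lookup described for setFeed can be illustrated with a short reflection sketch. This is not the WireFeeder source: DemoFeed and SingletonLoaderSketch are hypothetical names, and the sketch assumes only what the description states, namely that the named class exposes a static, no-argument getInstance method. The exceptions it declares match the ones listed above.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class SingletonLoaderSketch {
    /** Loads className and returns the result of its static no-arg getInstance(). */
    static Object loadSingleton(String className)
            throws ClassNotFoundException, NoSuchMethodException,
                   IllegalAccessException, InvocationTargetException {
        Class<?> cls = Class.forName(className);        // ClassNotFoundException if missing
        Method factory = cls.getMethod("getInstance");  // NoSuchMethodException if no getInstance
        return factory.invoke(null);                    // static call, so no receiver object
    }

    public static void main(String[] args) throws Exception {
        Object feed = loadSingleton("DemoFeed");
        // The reflective call returns the same object as a direct call.
        System.out.println(feed == DemoFeed.getInstance());
    }
}

// Hypothetical stand-in for a feed class that follows the Singleton pattern.
class DemoFeed {
    private static final DemoFeed INSTANCE = new DemoFeed();

    private DemoFeed() {}

    public static DemoFeed getInstance() {
        return INSTANCE;
    }
}
```

Because the instance comes from getInstance rather than a constructor, repeated calls with the same class name yield the same object, which is what the Singleton assumption implies.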

getQueue

protected DocQueue getQueue()
getQueue: access the document queue.
Returns:
a DocQueue value

getConfig

protected org.apache.commons.collections.ExtendedProperties getConfig()
getConfig: Access the properties list.
Returns:
ExtendedProperties value of config.

setConfig

protected void setConfig(org.apache.commons.collections.ExtendedProperties v)
Set the properties list
Parameters:
v - Value to assign to config.

loadProperties

private void loadProperties(java.lang.String conf)
                     throws java.io.IOException,
                            java.io.FileNotFoundException

loadIgnoreList

protected void loadIgnoreList(java.lang.String ignoreFile)
loadIgnoreList: Loads and pre-compiles a list of Perl regular expressions that specify document tags to ignore. For most feeds, the tag is the systemID of the DocType; for ESPN, the tag is the keyword slug.
Parameters:
ignoreFile - a String filename value
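
A minimal sketch of the pre-compile-then-match idea follows. It makes several assumptions that the documentation does not confirm: it uses java.util.regex in place of whatever Perl-style regex library the class actually relies on, it takes the expressions as a list of lines rather than reading a file, it skips blank lines, and it treats a tag as ignored when any pattern is found anywhere in it. IgnoreListSketch and the sample tags are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class IgnoreListSketch {
    private final List<Pattern> ignorePatterns = new ArrayList<>();

    /** Pre-compiles one pattern per non-blank line (assumed file layout). */
    void load(List<String> lines) {
        ignorePatterns.clear();
        for (String line : lines) {
            String expr = line.trim();
            if (!expr.isEmpty()) {
                ignorePatterns.add(Pattern.compile(expr));
            }
        }
    }

    /** Returns true if any pre-compiled pattern is found in the document tag. */
    boolean onIgnoreList(String doctag) {
        for (Pattern p : ignorePatterns) {
            if (p.matcher(doctag).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        IgnoreListSketch sketch = new IgnoreListSketch();
        sketch.load(Arrays.asList("^test-.*\\.dtd$", "", "advisory"));
        System.out.println(sketch.onIgnoreList("test-feed.dtd"));    // prints "true"
        System.out.println(sketch.onIgnoreList("nhl-boxscore.dtd")); // prints "false"
    }
}
```

Compiling once at load time means each incoming document tag costs only a match per pattern, not a parse of every expression.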

onIgnoreList

private boolean onIgnoreList(java.lang.String doctag)

loadRegexMap

private void loadRegexMap(org.apache.commons.collections.ExtendedProperties conf)

readFeed

protected void readFeed()
                 throws java.io.IOException
readFeed rips through the continuous stream of input until the read function starts returning consecutive nulls, which probably means we are out of input; when input runs out, we flag the workers to exit and return to await job completion.
Throws:
java.io.IOException - if an error occurs
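
The stopping rule described above (keep reading until the read function returns several nulls in a row) might look roughly like the following. This is a sketch under stated assumptions, not the actual readFeed: the threshold MAX_CONSECUTIVE_NULLS, the Supplier standing in for the feed's read function, and the returned list standing in for the document queue are all hypothetical, and the step of flagging workers to exit is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

class ReadLoopSketch {
    // Hypothetical threshold: how many nulls in a row count as "out of input".
    static final int MAX_CONSECUTIVE_NULLS = 3;

    /** Collects documents until read returns MAX_CONSECUTIVE_NULLS nulls in a row. */
    static List<String> drain(Supplier<String> read) {
        List<String> docs = new ArrayList<>();
        int nulls = 0;
        while (nulls < MAX_CONSECUTIVE_NULLS) {
            String doc = read.get();
            if (doc == null) {
                nulls++;        // possibly a gap in the stream, possibly the end
            } else {
                nulls = 0;      // real input resets the run of nulls
                docs.add(doc);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        Iterator<String> input = Arrays.asList("a", null, "b").iterator();
        // Once the sample input is exhausted, the reader returns null forever.
        Supplier<String> read = () -> input.hasNext() ? input.next() : null;
        System.out.println(drain(read)); // prints "[a, b]"
    }
}
```

Resetting the counter on every non-null read is what lets an isolated null pass without ending the loop.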

setLogging

private static void setLogging(boolean debug)
setLogging loads filters and appenders from the property file specified by wirefeeder.log4j.conf. If debug is false, messages below WARN priority will be suppressed. This can be undone in a production release through the log4j.disableOverride property.

usage

public static void usage()
usage: print the command line options to stderr

status

public java.util.Map status()
status: XMLRPC method to report on the system status; called as sportwire.status from localhost:8484/, this returns a few uptime diagnostics useful for ensuring the system is still alive.
Returns:
a Map value
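
As an illustration of the kind of Map such a status method could return, here is a small sketch. The key names, the uptimeMillis entry, and the meanings attached to boottime, starttime, and timestamp are assumptions made for the example; the documentation says only that the method returns a few uptime diagnostics. StatusSketch is a hypothetical class and does not involve any XML-RPC library.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

class StatusSketch {
    // Assumed meanings: when the class was loaded, when this instance started,
    // and when the last activity was recorded.
    static final Date boottime = new Date();
    Date starttime = new Date();
    Date timestamp = new Date();

    /** Builds a small report of uptime diagnostics (hypothetical keys). */
    Map<String, Object> status() {
        Map<String, Object> report = new LinkedHashMap<>();
        report.put("boottime", boottime);
        report.put("starttime", starttime);
        report.put("timestamp", timestamp);
        report.put("uptimeMillis", System.currentTimeMillis() - boottime.getTime());
        return report;
    }

    public static void main(String[] args) {
        System.out.println(new StatusSketch().status().keySet());
    }
}
```

A caller that only needs a liveness check can treat any successful response as proof that the process is still running.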

startWatchdog

private void startWatchdog()

main

public static void main(java.lang.String[] argv)
Command line main function; use a property to determine the specific input feed or doc processing module (should someday add a command line override)
Parameters:
argv - a String[] of ignored command line parameters