International Journal of Document Analysis and Recognition
SPECIAL ISSUE CALL FOR PAPERS
Detection and Understanding
of Tables and Forms for Document Processing Applications
In many document image-processing tasks, detection
and understanding of tables and forms pose a major challenge. Detection
of tables and forms in electronic documents may be only one small part
of a series of tasks required for document processing, but
the accuracy of physical and logical layout extraction from these
documents often depends to a great extent on the accuracy of the detection
solution available to the system, because errors at this stage tend to be
magnified as the subsequent processes are applied in sequence. Besides
the obvious 'tables' and 'forms', other manifestations of these structures
include 'lists' and 'tables of contents' (TOC). Some common features
of these structures include, but are not limited to:
– Tabular representation of information
– Coexistence of graphical, syntactical and semantic structure
– Absence of obvious sequential order
– Necessity to understand the whole structure for its interpretation
– Variety of basic construction principles
– Combinations of construction principles
What particular combination of features makes a table,
or a form? What is the role of the user in defining these structures
(e.g. in industry, "the customer often defines what a table is")?
How is a table different from a list? How is a list different from
a TOC? Does geometric alignment of content guarantee a robust structure,
or how much semantic relationship must the content carry
for a region to qualify as such? What is the role of cross-disciplinary
techniques used by the Information Retrieval (IR) and Natural Language
Processing (NLP) communities in understanding tables and forms? These
structures can also be recursive: a table can contain images, lists and
even other tables. These are only some of the questions that need answering.
Applications of this technology include,
but are not limited to, electronic document conversion from one format
to another (e.g. PDF to HTML), invoice reading/processing, archiving,
alternate views on small-screen devices (e.g. handhelds and cell
phones), audio readout for very small-screen devices (e.g. a watch
with internet connectivity), faster understanding of content, and locating
relevant content and/or finding answers within document repositories.
High-quality, unpublished, full-length papers are sought
on any aspect of this subject area, but the following will be particularly
welcome:
o Papers which establish some theoretical underpinning
o Papers which demonstrate comparative evaluation of available or new techniques/systems
o Papers which introduce novel detection algorithms
o Papers which describe complete document processing systems with focus on the role
  of table or form detection and understanding
Please submit your manuscripts (full papers) electronically
according to the instructions on the IJDAR web page (http://ijdar.cfar.umd.edu/).
Any questions can be sent to the Guest Editors below with a CC to
the editorial office (ijdar@cfar.umd.edu). Electronic (email) correspondence
is preferred.
Full Paper Submission: June 30th, 2004
Editors' Decisions: January 31st, 2005
Publication Date: March-April, 2005
For additional information, please visit http://www.bcltechnologies.com/rd/special_issue/ or http://www.dfki.uni-kl.de/~klein/IJDAR-special-issue.