|
BCL offers tools and technology for heterogeneous
data and information management systems. Such systems
cover several strands of functionality, including
efficient data storage, any-to-any document conversion,
archiving, tools to manipulate and harvest data, and
mobile access to data. Our technology offers a
distributed data structure, security, robustness,
efficient access to the data, and a modular design so
that future extensions to the system are efficient
and painless.
BCL holds several patents in this field. Its research
is published widely in refereed journals and peer-reviewed
conferences.
FEDERAL AWARDS |
Title | Description | Agency | Status |
KMC – Knowledge Management Center | BCL Knowledge Management Center (KMC) is a knowledge based document and information management platform. This will offer universal information access for the Mobile Warrior. | Army – DOD | Phase II research began in January 2003. A012-0933 |
UIAMW – Universal Information Access for the Mobile Warrior | To make any electronic document of any format universally accessible to the Mobile Warrior from any electronic device, including handheld Personal Digital Assistants (PDAs) using wireless connections. | Army – DOD | Phase I research started in early 2002. A012-0933 |
FaxAssist – Automatically routing unconstrained faxes to email recipients | Unconstrained processing of faxes and converting them to emails. | DARPA | Phase I started in 1996. DAAH01-96-C-R242 |
FaxAssist – Automatically routing unconstrained faxes to email recipients | Implementing a prototype based on Phase I. | DARPA | Phase II completed in 1999. An initial prototype fax routing system was deployed and tested at DARPA. DAAH01-98-C-R013 |
TableAssist – Integration of Information from Heterogeneous Sources | Developing methods for locating tables in documents and extracting them into structured repositories. | DARPA | Phase I started in 1994. DAAH01-94-C-R156 |
TableAssist – Integration of Information from Heterogeneous Sources | Implementing a prototype based on Phase I. | DARPA | Phase II completed in 1998. Based on this technology, BCL has commercialized several Adobe Acrobat Plug-in products since 1998. DAAH01-94-C-R156 |
TECHNOLOGY SUITE |
BCL has a suite of technologies for document analysis,
document understanding and knowledge management.
- Document Analysis
BCL has considerable experience in document
analysis, derived from its DARPA-sponsored
research projects. The work covers analysis of page
layout in terms of content, the relationships among
different types of content, content extraction, and
logical layout generation.
- Document Conversion
BCL's expertise in document analysis allows
it to design efficient document conversion tools.
These convert any electronic document from one
format to another, including HTML and XML, while
retaining the original layout of the document
at all times.
- Repository
BCL's technology allows the use of a data repository
where the documents are kept in a web-server-like
environment. Depending on the specific
input and output needs of the client, components
of the system convert PDF, WordPerfect, Word,
PowerPoint, Excel, Quark and PageMaker documents
to HTML. The system then adds the
appropriate logical label tags and style sheets
that allow the documents to be viewed by the
client, while faithfully preserving the original
content and flow.
- Universal Data Storage
Universal Data Storage involves document conversion
to PDF, layout reproduction, conversion to HTML,
and finally production of an XML representation.
Although these steps are sometimes grouped under
the umbrella of data storage requirements, each
poses a significant research challenge in its own right.
BCL has significant research and commercial
experience in this field.
- Universal Data Access
BCL is an industry leader in data sharing, display
of documents and universal data access from
multiple devices. Our data repository can hold three
versions of the same document.
The first is the original document, which can
be accessed and downloaded using a standard
interface such as File Transfer Protocol (FTP).
The second is the converted HTML document, which
can be accessed using a standard web interface.
The third is the XML representation containing
the natural language summary of the document
for quick review.
- Information Organization
BCL’s technology allows organization of
data, with a set of tools to reorganize the data
when needed.
The criteria for information organization include
correlation, pattern detection, association,
and classification. BCL's approach avoids
heuristics and is applied to optimize
Knowledge Management tasks within a
specified domain.
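The three document versions described under Universal Data Access can be pictured with a short sketch. This is only an illustration under stated assumptions, not BCL's implementation: the names `StoredDocument`, `Repository`, `add` and `get`, and the in-memory dictionary standing in for the repository, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class StoredDocument:
    """Three versions of one document (hypothetical layout)."""
    original: bytes    # source file as submitted, e.g. downloadable over FTP
    html: str          # converted HTML, viewable through a web interface
    xml_summary: str   # XML holding a natural language summary


class Repository:
    """In-memory stand-in for the data repository (illustrative only)."""

    def __init__(self) -> None:
        self._docs: Dict[str, StoredDocument] = {}

    def add(self, doc_id: str, doc: StoredDocument) -> None:
        """Store all three versions under one document id."""
        self._docs[doc_id] = doc

    def get(self, doc_id: str, view: str = "original") -> Union[bytes, str]:
        """Return one version of a document: 'original', 'html' or 'xml'."""
        doc = self._docs[doc_id]
        views = {
            "original": doc.original,
            "html": doc.html,
            "xml": doc.xml_summary,
        }
        if view not in views:
            raise ValueError(f"unknown view: {view!r}")
        return views[view]


# Example data is made up for the sketch.
repo = Repository()
repo.add("report-1", StoredDocument(
    original=b"%PDF-1.4 ...",
    html="<html><body><p>Quarterly report</p></body></html>",
    xml_summary="<summary>Quarterly report overview</summary>",
))
```

A client on a small device could ask for the `xml` view to get the summary first, then fall back to `html` or `original` when the full document is needed.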
RESEARCH PAPERS |
- A
Commercial Web based Digital Library for
Sharing and Distributing Documents.
Fuad Rahman and Hassan Alam.
1st Int. Workshop on Document Image Analysis for Libraries (DIAL'04), January,
2004. (To appear)
- Conversion
of PDF Documents into HTML: A Case Study
of Document Image Analysis.
Fuad Rahman, Hassan Alam.
37th IEEE Asilomar Conference on Signals, Systems, and Computers, 2003.
- Assuming
Accurate Layout Information for Web Documents
is Available, What Now?
Hassan Alam, Rachmat Hartono, Aman Kumar,
Fuad Rahman, Yuliya Tarnikova and Che Wilcox.
Third
International Workshop on
Document Layout Interpretation and its Applications
(DLIA2003).

- Assuming Accurate Layout Information is
Available: How do we Interpret the Content Flow in HTML Documents?
Hassan Alam and Fuad Rahman.
Third International Workshop on Document Layout
Interpretation and its Applications (DLIA2003).

- A
Pair-wise Decision Fusion Framework: Recognition of Human Faces.
Hassan Alam,
Fuad
Rahman, Yuliya Tarnikova and Rachmat Hartono. 6th
Int. Conf. on Information Fusion (FUSION 2003),
2003. In press.
- Use
of Genetic Algorithms for Optimizing a
Decision Fusion Framework
Fuad Rahman, Michael
Fairhurst, Hassan
Alam and Rachmat Hartono.
6th Int. Conf. on Information Fusion (FUSION
2003), 2003. In press.
- Web
Document Manipulation for Small Screen Devices:
A Review
Hassan Alam and Fuad Rahman.
Web Document Analysis Workshop (WDA), 2003.

- Web
Document
Analysis: How can Natural Language Processing Help in Determining
Correct Content Flow?
Hassan Alam,
Fuad Rahman and Yuliya Tarnikova. Web Document
Analysis Workshop (WDA), 2003.

- When
is a List is a List?: Web Page Re-authoring
for Small Display Devices.
Hassan Alam, Fuad Rahman and Yuliya Tarnikova.
Proc. 12th Int. World Wide Web Conference
(WWW2003), 20-24 May 2003, Budapest, Hungary.
In press.

- Structured
and Unstructured Document Summarization: Design
of a Commercial Summarizer using Lexical Chains.
H. Alam, A. Kumar, M. Nakamura, A. F. R. Rahman,
Y. Tarnikova and C. Wilcox.
7th
Int. Conf. on Document Analysis and Recognition
(ICDAR2003), 2003.

-
Web Page Summarization for Handheld
Devices: A Natural Language Approach.
H. Alam, R. Hartono, A. Kumar, A. F. R. Rahman,
Y. Tarnikova and C. Wilcox.
7th Int. Conf. on Document Analysis and Recognition
(ICDAR2003), 2003.

- Solving
Problems Two at a Time: Classification of Web
Pages using a Generic Pair-wise Multiple Classifier
System.
Hassan Alam, Fuad Rahman and Yuliya Tarnikova.
4th Int. Conf. on Multiple Classifier Systems,
2003.

-
Universal Document Management System
for the Mobile Warrior.
H. Alam, R. Hartono, F. Rahman, Y. Tarnikova,
T. Tjahjadi, C. Wilcox.
Symposium on Document Image Understanding Technology,
SDIUT'03, 2003.

- Exploring
a Hybrid of Support Vector Machines (SVMs) and
a Heuristic Based System in Classifying Web
Pages.
Hassan Alam, Yuliya Tarnikova and Ahmad Rahman.
Document Recognition and Retrieval X, 15th Annual
IS&T/SPIE Symposium, 2003. In press.

-
Extraction and Management of Content
from Html Documents. Chapter in the book titled
"Web Document Analysis: Challenges and
Opportunities".
H. Alam, R. Hartono and A. F. R. Rahman.
World Scientific Series in Machine Perception
and Artificial Intelligence, 2002. In press.

-
Multiple Classifier Decision Combination
Strategies for Character Recognition: A Review.
Special Issue on Multiple classifiers for document
analysis applications.
A. Rahman and M. C. Fairhurst.
International Journal on Document Analysis and
Recognition (IJDAR), in press.

- Multiple
Classifier Combination for Character Recognition:
Revisiting the Majority Voting System and its
Variations.
A. F. R. Rahman, H. Alam and M. C. Fairhurst.
Lecture Notes in Computer Science, LNCS 2423,
Document Analysis Systems V, pages 167-178.
2002.

-
Fusion of n-tuple Based Classifiers for High
Performance Handwritten Character Recognition.
K. Sirlantzis, S. Hoque, M.C. Fairhurst and
A.F.R.Rahman.
Lecture Notes in Computer Science LNCS 2396,
T. Caelli, A. Amin, R. Duin, M. Kamel and D.
Ridder (Eds.), pages 770-778, 2002.

-
Novel Approaches to Optimized Self-configuration
in High Performance Multiple-Expert Classifiers.
A. F. R. Rahman, M. C. Fairhurst and S. Hoque.
8th IWFHR, August 6-8, 2002 in Niagara-on-the-Lake,
Ontario, Canada.

-
Challenges in Web Document Summarization: Some
Myths and Reality.
A. Rahman and H. Alam.
Document Recognition and Retrieval IX, Electronic
Imaging Conference, SPIE 4670-27, 2002.

-
Understanding the Flow of Content in Summarizing
HTML Documents.
A. F. R. Rahman, H. Alam and R. Hartono.
Int. Workshop on Document Layout Interpretation
and its Applications, DLIA01, Seattle, USA,
Sep., 2001.

-
Content Extraction from HTML Documents.
A. F. R. Rahman, H. Alam and R. Hartono.
Int. Workshop on Web Document Analysis, WDA01,
pp. 7-10, Seattle, USA, Sep., 2001.

-
Automatic Summarization of Web Content
to Smaller Display Devices
A. F. R. Rahman, H. Alam, R. Hartono and K.
Ariyoshi
6th Int. Conf. On Document Analysis and Recognition,
ICDAR01, Seattle, USA, pp. 1064-1068, Sep.,
2001.

-
Decision combination of multiple classifiers
for pattern classification: Hybridization of
majority Voting and Divide and Conquer Techniques
A. F. R. Rahman and M. C. Fairhurst
IEEE Int. Workshop on Applications of Computer
Vision (WACV2000), Palm Springs, California,
USA, 2001, pages 58-63.

-
Comparison of some multiple expert strategies:
An investigation of resource pre-requisites
and achievable performance
A. F. R. Rahman and M. C. Fairhurst
In 15th Int. Conf. on Pattern Recognition, Barcelona,
Spain, 2000, pages 841-844.
- A
system for table understanding
C. Peterman, C. Chang and H. Alam
In Proc. Symposium on Document Image Understanding
Technology, SDIUT'97, pp. 55-62, 1997.
- BDOC
- A document Representation method
A. Dong, S. Tupaj, C. Chang and H. Alam
In Proc. Symposium on Document Image Understanding
Technology, SDIUT'97, pp. 63-73, 1997.
- FaxAssist:
Inbound fax routing using document understanding
S. Tupaj, H. Dediu and H. Alam
In Proc. Symposium on Document Image Understanding
Technology, SDIUT'97, pp. 74-84, 1997.
PATENTS |
Patents Assigned
-
Processor Based Method for extracting Tables
from Printed Documents. #5,737,4224.
-
Processor Based Method for extracting Tablets
from Printed Documents. #5,965,422.
-
Network Fax Routing via Email. #6,104,500.
-
Conversion Data Representing a Document to Other
Formats for Manipulation and Display. #6,336,124
B1.
Patents Pending
-
Exploring a hybrid of support vector machines
(SVMs) and a heuristic based system in classifying
web pages. Provisional Patent, 60/371,046 filed
on April 8th, 2002.
-
Displaying Java scripts on PDAs. Provisional
Patent, 60/408,795, filed on September 6th,
2002.
-
Universal email for PDAs. Provisional Patent,
60/408,796, filed on September 6th, 2002.
-
Structured and Unstructured Document Summarization:
Design of a Commercial Summarizer using Lexical
Chains. Provisional Patent application filed
on December 18th, 2002.