BCL RESEARCH GROUP
     
 
 
 

BCL offers tools and technology for heterogeneous data and information management systems. Such systems combine several strands of functionality, including efficient data storage, any-to-any document conversion, archiving, tools to manipulate and harvest data, and mobile access to data. Our technology offers a distributed data structure, security, robustness, efficient access to the data, and a modular design so that future extensions to the system are efficient and painless.

BCL holds several patents in this field. Its research is published widely in refereed journals and peer-reviewed conferences.

 
Federal Awards
Technology Suite
Research Papers
Patents

FEDERAL AWARDS
 
Title: KMC – Knowledge Management Center
Description: BCL Knowledge Management Center (KMC) is a knowledge based document and information management platform. This will offer universal information access for the Mobile Warrior.
Agency: Army – DOD
Status: Phase II research began in January 2003 (A012-0933)

Title: UIAMW – Universal Information Access for the Mobile Warrior
Description: To make any electronic document of any format universally accessible to the Mobile Warrior from any electronic device, including handheld Personal Digital Assistants (PDAs) using wireless connections.
Agency: Army – DOD
Status: Phase I research started in early 2002 (A012-0933)

Title: FaxAssist – Automatically routing unconstrained faxes to email recipients
Description: Unconstrained processing of faxes and converting them to emails.
Agency: DARPA
Status: Phase I started in 1996 (DAAH01-96-C-R242)

Title: FaxAssist – Automatically routing unconstrained faxes to email recipients
Description: Implementing a prototype based on Phase I.
Agency: DARPA
Status: Phase II completed in 1999. An initial prototype fax routing system was deployed and tested at DARPA (DAAH01-98-C-R013)

Title: TableAssist – Integration of Information from Heterogeneous Sources
Description: Developing methods for locating tables in documents and extracting them into structured repositories.
Agency: DARPA
Status: Phase I started in 1994 (DAAH01-94-C-R156)

Title: TableAssist – Integration of Information from Heterogeneous Sources
Description: Implementing a prototype based on Phase I.
Agency: DARPA
Status: Phase II completed in 1998. Based on this technology, BCL has commercialized several Adobe Acrobat Plug-in products since 1998 (DAAH01-94-C-R156)

 

TECHNOLOGY SUITE
 

BCL has a suite of technologies for document analysis, document understanding, and knowledge management.

  • Document Analysis
    BCL has considerable experience in document analysis, derived from its DARPA-sponsored research projects. The work involves analyzing layout in terms of content, the relationships among different types of content, content extraction, and logical layout generation.
  • Document Conversion
    BCL's expertise in document analysis allows it to design efficient document conversion tools. This involves converting any electronic document from one format to another, including HTML and XML. The original layout of the document is retained at all times.
  • Repository
    BCL's technology allows the use of a data repository where the documents are kept in a web-server-like environment. Depending on the specific input and output needs of the client, components of the system convert PDF, WordPerfect, Word, PowerPoint, Excel, Quark and PageMaker documents to HTML. The system then adds the appropriate logical label tags and style sheets that allow the documents to be viewed by the client, while faithfully preserving the original content and flow.
  • Universal Data Storage
    Universal Data Storage involves document conversion to PDF, layout reproduction, conversion to HTML, and finally production of an XML representation. Although these steps are sometimes grouped under the umbrella of data storage requirements, each poses a significant research challenge in its own right. BCL has significant research and commercial experience in this field.
  • Universal Data Access
    BCL is an industry leader in data sharing, display of documents and universal data access from multiple devices. Our data repository can hold different versions of the same document. The first is the original document, which can be accessed and downloaded using a standard interface such as File Transfer Protocol (FTP). The second is the converted HTML document, which can be accessed using a standard web interface. The third is the XML representation, which contains a natural language summary of the document for quick review.
  • Information Organization
    BCL’s technology allows data to be organized and, if needed, reorganized with a set of dedicated tools. The criteria for information organization include correlation, pattern detection, association, and classification. BCL’s approach is free of heuristics and is applied to Knowledge Management challenges within a specified domain.
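
The three document versions described under Universal Data Access can be pictured with a small data model. The following is only a minimal sketch under stated assumptions: the class `RepositoryEntry`, the function `pick_view`, the view names and the device-to-view policy are all hypothetical illustrations and do not reflect BCL's actual software or interfaces.

```python
from dataclasses import dataclass


@dataclass
class RepositoryEntry:
    """One stored document with the three versions the text describes.

    All names here are hypothetical; this is not BCL's API.
    """
    doc_id: str
    original: bytes        # original file, e.g. downloaded over FTP
    html: str = ""         # converted HTML for a standard web interface
    xml_summary: str = ""  # XML representation holding a summary

    def view(self, kind: str):
        """Return the requested version of the document."""
        views = {
            "original": self.original,
            "html": self.html,
            "xml": self.xml_summary,
        }
        if kind not in views:
            raise KeyError(f"unknown view: {kind!r}")
        return views[kind]


def pick_view(device: str) -> str:
    """Map a device class to a view name (an assumed, illustrative policy)."""
    policy = {"desktop": "html", "pda": "xml", "ftp": "original"}
    # Fall back to HTML for device classes the policy does not list.
    return policy.get(device, "html")
```

A caller would store one entry per document and ask for the version that suits the requesting device, for example `entry.view(pick_view("pda"))` to obtain the XML summary on a handheld.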

RESEARCH PAPERS
 
  1. A Commercial Web based Digital Library for Sharing and Distributing Documents.
    Fuad Rahman and Hassan Alam.
    1st Int. Workshop on Document Image Analysis for Libraries (DIAL'04), January, 2004. (To appear)
     
  2. Conversion of PDF Documents into HTML: A Case Study of Document Image Analysis.
    Fuad Rahman, Hassan Alam.
    37th IEEE Asilomar Conference on Signals, Systems, and Computers, 2003.
     
  3. Assuming Accurate Layout Information for Web Documents is Available, What Now?
    Hassan Alam, Rachmat Hartono, Aman Kumar, Fuad Rahman, Yuliya Tarnikova and Che Wilcox.
    Third International Workshop on Document Layout Interpretation and its Applications (DLIA2003).
     
  4. Assuming Accurate Layout Information is Available: How do we Interpret the Content Flow in HTML Documents?
    Hassan Alam and Fuad Rahman.
    Third International Workshop on Document Layout Interpretation and its Applications (DLIA2003).

     
  5. A Pair-wise Decision Fusion Framework: Recognition of Human Faces.
    Hassan Alam, Fuad Rahman, Yuliya Tarnikova and Rachmat Hartono.
    6th Int. Conf. on Information Fusion (FUSION 2003), 2003. In press.
     
  6. Use of Genetic Algorithms for Optimizing a Decision Fusion Framework
    Fuad Rahman, Michael Fairhurst, Hassan Alam and Rachmat Hartono.
    6th Int. Conf. on Information Fusion (FUSION 2003), 2003. In press.
     
  7. Web Document Manipulation for Small Screen Devices: A Review
    Hassan Alam and Fuad Rahman.
    Web Document Analysis Workshop (WDA), 2003.
     
  8. Web Document Analysis: How can Natural Language Processing Help in Determining Correct Content Flow?
    Hassan Alam, Fuad Rahman and Yuliya Tarnikova.
    Web Document Analysis Workshop (WDA), 2003.
     
  9. When is a List is a List?: Web Page Re-authoring for Small Display Devices.
    Hassan Alam, Fuad Rahman and Yuliya Tarnikova.
    Proc. 12th Int. World Wide Web Conference (WWW2003), 20-24 May 2003, Budapest, Hungary. In press.
     
  10. Structured and Unstructured Document Summarization: Design of a Commercial Summarizer using Lexical Chains.
    H. Alam, A. Kumar, M. Nakamura, A. F. R. Rahman, Y. Tarnikova and C. Wilcox.
    7th Int. Conf. on Document Analysis and Recognition (ICDAR2003), 2003.
     
  11. Web Page Summarization for Handheld Devices: A Natural Language Approach.
    H. Alam, R. Hartono, A. Kumar, A. F. R. Rahman, Y. Tarnikova and C. Wilcox.
    7th Int. Conf. on Document Analysis and Recognition (ICDAR2003), 2003.
     
  12. Solving Problems Two at a Time: Classification of Web Pages using a Generic Pair-wise Multiple Classifier System.
    Hassan Alam, Fuad Rahman and Yuliya Tarnikova.
    4th Int. Conf. on Multiple Classifier Systems, 2003.
     
  13. Universal Document Management System for the Mobile Warrior.
    H. Alam, R. Hartono, F. Rahman, Y. Tarnikova, T. Tjahjadi, C. Wilcox.
    Symposium on Document Image Understanding Technology, SDIUT'03, 2003.
     
  14. Exploring a Hybrid of Support Vector Machines (SVMs) and a Heuristic Based System in Classifying Web Pages.
    Hassan Alam, Yuliya Tarnikova and Ahmad Rahman.
    Document Recognition and Retrieval X, 15th Annual IS&T/SPIE Symposium, 2003. In press.
     
  15. Extraction and Management of Content from HTML Documents. Chapter in the book titled "Web Document Analysis: Challenges and Opportunities".
    H. Alam, R. Hartono and A. F. R. Rahman.
    World Scientific Series in Machine Perception and Artificial Intelligence, 2002. In press.
     
  16. Multiple Classifier Decision Combination Strategies for Character Recognition: A Review. Special Issue on Multiple classifiers for document analysis applications.
    A. Rahman and M. C. Fairhurst.
    International Journal on Document Analysis and Recognition (IJDAR), in press.
     
  17. Multiple Classifier Combination for Character Recognition: Revisiting the Majority Voting System and its Variations.
    A. F. R. Rahman, H. Alam and M. C. Fairhurst.
    Lecture Notes in Computer Science, LNCS 2423, Document Analysis Systems V, pages 167-178. 2002.
     
  18. Fusion of n-tuple Based Classifiers for High Performance Handwritten Character Recognition.
    K. Sirlantzis, S. Hoque, M. C. Fairhurst and A. F. R. Rahman.
    Lecture Notes in Computer Science LNCS 2396, T. Caelli, A. Amin, R. Duin, M. Kamel and D. Ridder (Eds.), pages 770-778, 2002.
     
  19. Novel Approaches to Optimized Self-configuration in High Performance Multiple-Expert Classifiers.
    A. F. R. Rahman, M. C. Fairhurst and S. Hoque.
    8th IWFHR, August 6-8, 2002 in Niagara-on-the-Lake, Ontario, Canada.
     
  20. Challenges in Web Document Summarization: Some Myths and Reality.
    A. Rahman and H. Alam.
    Document Recognition and Retrieval IX, Electronic Imaging Conference, SPIE 4670-27, 2002.
     
  21. Understanding the Flow of Content in Summarizing HTML Documents.
    A. F. R. Rahman, H. Alam and R. Hartono.
    Int. Workshop on Document Layout Interpretation and its Applications, DLIA01, Seattle, USA, Sep., 2001.
     
  22. Content Extraction from HTML Documents.
    A. F. R. Rahman, H. Alam and R. Hartono.
    Int. Workshop on Web Document Analysis, WDA01, pp. 7-10, Seattle, USA, Sep., 2001.
     
  23. Automatic Summarization of Web Content to Smaller Display Devices
    A. F. R. Rahman, H. Alam, R. Hartono and K. Ariyoshi
    6th Int. Conf. On Document Analysis and Recognition, ICDAR01, Seattle, USA, pp. 1064-1068, Sep., 2001.
     
  24. Decision combination of multiple classifiers for pattern classification: Hybridization of majority Voting and Divide and Conquer Techniques
    A. F. R. Rahman and M. C. Fairhurst
    IEEE Int. Workshop on Applications of Computer Vision (WACV2000), Palm Springs, California, USA, 2001, pages 58-63.
     
  25. Comparison of some multiple expert strategies: An investigation of resource pre-requisites and achievable performance
    A. F. R. Rahman and M. C. Fairhurst
    In 15th Int. Conf. on Pattern Recognition, Barcelona, Spain, 2000, pages 841-844.

     
  26. A system for table understanding
    C. Peterman, C. Chang and H. Alam
    In Proc. Symposium on Document Image Understanding Technology, SDIUT'97, pp. 55-62, 1997.
     
  27. BDOC - A document Representation method
    A. Dong, S. Tupaj, C. Chang and H. Alam
    In Proc. Symposium on Document Image Understanding Technology, SDIUT'97, pp. 63-73, 1997.
     
  28. FaxAssist: Inbound fax routing using document understanding
    S. Tupaj, H. Dediu and H. Alam
    In Proc. Symposium on Document Image Understanding Technology, SDIUT'97, pp. 74-84, 1997.

PATENTS
 


 

Patents Assigned

  1. Processor Based Method for extracting Tables from Printed Documents. #5,737,4224.
  2. Processor Based Method for extracting Tablets from Printed Documents. #5,965,422.
  3. Network Fax Routing via Email. #6,104,500.
  4. Conversion Data Representing a Document to Other Formats for Manipulation and Display. #6,336,124 B1.

Patents Pending

  1. Exploring a hybrid of support vector machines (SVMs) and a heuristic based system in classifying web pages. Provisional Patent, 60/371,046 filed on April 8th, 2002.
  2. Displaying Java scripts on PDAs. Provisional Patent, 60/408,795, filed on September 6th, 2002.
  3. Universal email for PDAs. Provisional Patent, 60/408,796, filed on September 6th, 2002.
  4. Structured and Unstructured Document Summarization: Design of a Commercial Summarizer using Lexical Chains. Provisional Patent application filed on December 18th, 2002.

 

 

 

 

 

 

 

Copyright 1993 - BCL Technologies, Inc. All rights reserved. All other trademarks are the property of their respective owners.