Database integration with the Web for biologists to share data and information

Yulu Xia*
National Science Foundation Center for Integrated Pest Management
Department of Entomology
Department of Computer Science
North Carolina State University
Raleigh, North Carolina, USA
Tel: 919 513 1432
Fax: 919 513 1114
http://cipm.ncsu.edu
yulu_xia@ncsu.edu

Roland E. Stinner
Ping-Chu Chu

* Corresponding author

The biological sciences are data-intensive. Enormous amounts of data exist in fields such as genomics and entomology, and new data are being generated at an exponentially increasing rate due to the adoption of newer laboratory technologies such as DNA microarrays. Providing universal access so that those data can be shared and used is becoming increasingly important and challenging. In this paper, the authors discuss the issues involved in integrating biological data with the World Wide Web (Web hereafter).

To integrate biological data with the Web, the first step is to store the data in a database. In the modern sense, a database is a collection of data managed by a database management system (DBMS). A DBMS is a software system for creating, manipulating, and managing data. Well-known DBMSs include Oracle, Sybase, DB2, and SQL Server. A database allows us to store, retrieve, and modify data easily and efficiently regardless of the amount of data involved. Another major advantage of using a database for biological data is that a database can be easily integrated with the Web. This is a real revolution in information sharing and exchange. It brings enormous opportunity and flexibility for sharing and using biological data: clients can access the data from anywhere at any time.

Developing a database can be a major effort or a simple task, depending on the project. Generally speaking, database development does not require previous programming experience. However, some knowledge of DBMSs, of Structured Query Language (SQL), which is a simple language for creating and manipulating relational databases, and of the major principles of database design is required before embarking on a database project. Once the database is in place, the next task is to integrate it with the Web. This requires one of many programming technologies, such as CGI, ASP, JSP, ColdFusion, and PHP.
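
As a rough illustration of the two steps described above (store the data in a relational database, then make the query results available as a Web page), the following minimal sketch uses Python with its built-in sqlite3 module. This is not the authors' system: the choice of SQLite and Python, the specimen table, its columns, the sample rows, and the function names are all assumptions made up for this example. The render_html function only shows the kind of output a CGI-style program would send to a browser.

```python
import sqlite3
from html import escape


def build_database():
    """Create an in-memory database with a small, made-up specimen table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE specimen ("
        " id INTEGER PRIMARY KEY,"
        " species TEXT NOT NULL,"
        " county TEXT NOT NULL,"
        " collected TEXT NOT NULL)"
    )
    # Illustrative rows only; these are not real collection records.
    rows = [
        ("Helicoverpa zea", "Wake", "2001-07-14"),
        ("Myzus persicae", "Johnston", "2001-08-02"),
        ("Helicoverpa zea", "Johnston", "2001-08-09"),
    ]
    conn.executemany(
        "INSERT INTO specimen (species, county, collected) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def find_by_species(conn, species):
    """Run a parameterized SQL query and return the matching rows."""
    cursor = conn.execute(
        "SELECT species, county, collected FROM specimen"
        " WHERE species = ? ORDER BY collected",
        (species,),
    )
    return cursor.fetchall()


def render_html(records):
    """Format query results as an HTML table, escaping every value."""
    lines = [
        "<table>",
        "<tr><th>Species</th><th>County</th><th>Collected</th></tr>",
    ]
    for record in records:
        cells = "".join(f"<td>{escape(str(value))}</td>" for value in record)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    connection = build_database()
    print(render_html(find_by_species(connection, "Helicoverpa zea")))
```

The query uses a placeholder (?) rather than string concatenation so that text typed into a Web form cannot alter the SQL statement, and the values are escaped before being placed in the page for the same reason.
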
Web integration technologies are usually based on one or more general-purpose programming languages such as Perl, C, C++, and Java. Proficiency in one of these languages is essential for the task. Some of the languages are relatively easy to learn and inexpensive to use, but they are generally less powerful. Others are more complicated and have a longer learning curve, but they are usually more powerful when dealing with large projects. The choice of language depends on your experience with it and on the nature of your project. The authors suggest starting with a small, simple project, which helps build the experience needed for larger and more complicated projects later.

By integrating biological databases with the Web, we provide universal access to our data. However, to achieve true universal data sharing, we need standardization. How can others share our genomic sequence data if there are multiple names for one gene, or if there is no standard format for submitting and storing data? Standardization is also increasingly significant for automated data exchange, processing, and publication. By using eXtensible Markup Language (XML) and related technologies, we can let computers understand the meaning of the data and process the data automatically. Much progress has been made in processing biological data with XML-based technologies. For example, the Bioinformatic Sequence Markup Language (BSML) can be found at http://www.oasis-open.org/cover.bsml.html.

In summary, this paper covers some basics of database development, Web programming, and XML-based data standardization technologies. Technological progress in these fields has been extremely fast recently. To keep up with the newer technologies, one needs to update one's knowledge constantly.
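
To make the point about XML-based standardization more concrete, the sketch below shows how a program can resolve several names for one gene to a single standard name once the data are marked up. It uses Python's standard xml.etree.ElementTree module. The element and attribute names (sequence-record, gene, standard-name, synonym, sequence), the gene names, and the sequence are invented for this illustration; they are not taken from BSML or any other published schema.

```python
import xml.etree.ElementTree as ET

# A made-up record: one gene with a standard name and two synonyms.
RECORD = """\
<sequence-record>
  <gene standard-name="geneA">
    <synonym>gA</synonym>
    <synonym>gene-alpha</synonym>
    <sequence>ATGGCCATTG</sequence>
  </gene>
</sequence-record>
"""


def index_gene_names(xml_text):
    """Map every known name (standard or synonym) to its standard name."""
    root = ET.fromstring(xml_text)
    index = {}
    for gene in root.iter("gene"):
        standard = gene.get("standard-name")
        index[standard] = standard
        for synonym in gene.findall("synonym"):
            index[synonym.text.strip()] = standard
    return index


def sequence_for(xml_text, name):
    """Return the sequence for a gene looked up under any of its names."""
    standard = index_gene_names(xml_text).get(name)
    if standard is None:
        return None
    root = ET.fromstring(xml_text)
    for gene in root.iter("gene"):
        if gene.get("standard-name") == standard:
            return gene.findtext("sequence").strip()
    return None


if __name__ == "__main__":
    print(sequence_for(RECORD, "gene-alpha"))
```

Because the tags state what each value means, the same lookup works for any record that follows the agreed format, which is what makes automated exchange between databases possible.
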