Unit 5
Introduction to Database
DATA
Data is a collection of raw facts and figures, unorganized but able to be organized into useful information. Taken in isolation, these facts and figures carry some meaning but are generally not useful by themselves. For example: Aryan, lives, boy, Birtamode, etc.
INFORMATION
Information is the product or result of processing data into a meaningful form. In other words, information is data that has been arranged in a form that is meaningful to its users. For example:
“Aryan is a boy and he lives in Birtamode”.
DATABASE
A database is an organized collection of interrelated data, arranged in a specific structure so that its contents can easily be accessed, managed and updated. A simple example of a non-computerized database is a telephone directory. A database is designed to be used by different people: the interrelated data is stored together, with controlled redundancy, to serve one or more applications in an optimal fashion. The data is stored in such a way that it is independent of the programs that use it.
Characteristics of a database are:
1. Should be able to store all kinds of data that exist in the real world. Since we need to work with all kinds of data and requirements, the database should be robust enough to store any kind of data that is present around us.
2. Should be able to relate the entities/tables in the database by means of a relation, i.e. it should be possible to relate any two tables. Let us say an employee works for a department. This implies that the Employee is related to a particular Department. We should be able to define such a relationship between any two entities in the database. Ideally, no table should be left without a mapping to the rest of the database.
3. Data and application should be isolated. The database is the system that provides the platform for storing the data, while the data is the content on which the database works. Hence there should be a clear separation between them.
4. There should not be any unnecessary duplication of data in the database. Data should be stored in such a way that it is not repeated in multiple tables. If it is repeated, it wastes database space and the copies become difficult to keep consistent.
5. The DBMS should have a strong query language. Once the database is designed, this helps the user to retrieve and manipulate the data. If a user wants to see specific data, he can apply as many filtering conditions as he wants and pull only the data that he needs.
6. Multiple users should be able to access the same database without affecting one another, i.e. if several teachers want to update a student’s marks in the Results table at the same time, each should be allowed to update the marks for their own subject without modifying the marks of other subjects. A good database should support this feature.
7. It supports multiple views for users, depending on their role. In a school database, students will be able to see only their own reports and their access will be read-only. At the same time, teachers will have access to all the students, with modification rights. But the database is the same. Hence a single database provides different views to different users.
8. The database should also provide security, i.e. when multiple users are accessing the database, each user has their own level of rights to see the database. Some of them will be allowed to see the whole database, and some will have only partial rights. For example, an instructor who is teaching Physics will have access to see and update the marks of his subject. He will not have access to other subjects. But the HOD will have full access to all the subjects.
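Characteristics 2, 5 and 7 above can be sketched with a few lines of Python and the SQLite library that ships with it. This is only an illustration under assumptions: the table names, column names and sample rows are hypothetical, and SQLite is chosen for convenience, not because the notes prescribe it. Note also that a SQLite view only narrows what a query sees; it does not enforce user rights by itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

# Characteristic 2: Employee is related to Department through dept_id.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Accounts'), (2, 'Sales');
INSERT INTO employee VALUES (1, 'Asha', 1), (2, 'Bikash', 2), (3, 'Chandra', 2);
""")

# Characteristic 5: the query language filters and combines data on request.
accounts_staff = conn.execute(
    """
    SELECT e.name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    WHERE d.name = ?
    """,
    ("Accounts",),
).fetchall()

# Characteristic 7: a view exposes only part of the same database.
conn.execute("""
CREATE VIEW sales_staff AS
SELECT e.emp_id, e.name
FROM employee AS e
JOIN department AS d ON d.dept_id = e.dept_id
WHERE d.name = 'Sales'
""")
sales_staff = conn.execute("SELECT name FROM sales_staff ORDER BY name").fetchall()

# The relationship is enforced: an employee cannot point at a missing department.
try:
    conn.execute("INSERT INTO employee VALUES (4, 'Dipak', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(accounts_staff, sales_staff, orphan_rejected)
```

Keeping the department name in one table and referring to it by `dept_id` is also a small instance of characteristic 4: the name is stored once instead of being repeated on every employee row.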
DATABASE MANAGEMENT SYSTEM (DBMS)
A database management system (DBMS) is a software tool that allows multiple users to store, access, and process data or facts into useful information. In other words, a database management system (DBMS) is a system or software designed to manage a database, and run operations on the data requested by numerous clients.
DBMS is a complex software system that constructs, expands and maintains the database. The primary goal of a DBMS is to provide an environment that is both convenient and efficient to use in retrieving and storing data. The DBMS is an interface between the application program and the physical data files. The common language for accessing most database systems is SQL (Structured Query Language).
Some popular DBMSs include dBase, Visual FoxPro, Oracle, DB2, Informix, MS SQL Server, MySQL, and Microsoft Access.
Note: Database System = Database + DBMS + application programs (or, queries)
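To make the role of SQL concrete, the following minimal sketch sends the four basic kinds of SQL statement (INSERT, UPDATE, DELETE, SELECT) to a DBMS from an application program. It assumes Python's built-in `sqlite3` module as the DBMS; the `student` table and its rows are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the DBMS sits between this program and the stored data
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# INSERT: store new data.
conn.executemany(
    "INSERT INTO student (roll_no, name, marks) VALUES (?, ?, ?)",
    [(1, "Aryan", 72), (2, "Sita", 85), (3, "Ramesh", 64)],
)

# UPDATE: modify existing data.
conn.execute("UPDATE student SET marks = ? WHERE roll_no = ?", (75, 1))

# DELETE: remove existing data.
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))

# SELECT: retrieve data.
rows = conn.execute(
    "SELECT roll_no, name, marks FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```

The program never touches the physical data files; it only states what it wants in SQL, and the DBMS decides how to carry it out.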
OBJECTIVE OF DBMS
Some of the objectives of DBMS are as follows:
• Provides relevant data to users
• Provides easy access to data and information
• Provides quick response to the user request for data
• Eliminates duplicate data
• Allows multiple users to access and share data
• Allows the database to scale
• Protects data from unauthorized access
• Provides an abstract view of data that hides details of data from users
• Creates relationships between items of data
Difference between Database and DBMS
Database | DBMS
It is a collection of data or related information. | It is a software package to manage the database.
It consists of data. | It manages data stored in the database.
It is the component of a database system that is managed by the DBMS. | It is the software component of a database system; it controls access to the database.
E.g. Phonebook, Attendance Register, etc. | E.g. FoxPro, Access, Oracle, etc.
ADVANTAGES OF DBMS
1. Makes it easy to add new data.
2. Makes it easy to modify the database.
3. Makes it easy to delete existing data.
4. Organizes the data in a proper sequence.
5. Reduces data redundancy to a large extent.
6. Controls data inconsistency to a large extent.
7. Maintains data integrity, i.e. accurate, consistent and up-to-date data.
8. Makes it easy for authorized users to access the data.
9. Allows multiple users to be active at one time (i.e. data in the database may be shared among several users).
10. Protects data against unauthorized access.
11. Allows for growth in the database system.
DISADVANTAGES OF DBMS
1. Complex to understand and implement
2. Costly
3. Too many rules
4. Fast changing technology
5. Chance of losing the data
6. Chance of data leakage and hacking
7. Unavailability of trained manpower
Characteristics of DBMS
• Relation-based tables − DBMS allows entities and relations among them to form tables. A user can understand the architecture of a database just by looking at the table names.
• Isolation of data and application − A database system is entirely different from its data. A database is an active entity, whereas data is said to be passive: it is what the database works on and organizes. DBMS also stores metadata, which is data about data, to ease its own process.
• Less redundancy − DBMS follows the rules of normalization, which split a relation when any of its attributes has redundant values. Normalization is a mathematically rich and scientific process that reduces data redundancy.
• Consistency − Consistency is a state where every relation in a database remains consistent. There exist methods and techniques that can detect an attempt to leave the database in an inconsistent state. A DBMS can provide greater consistency than earlier forms of data storage applications such as file-processing systems.
• Query Language − DBMS is equipped with a query language, which makes it more efficient to retrieve and manipulate data. A user can apply as many different filtering options as required to retrieve a set of data. This was not possible with traditional file-processing systems.
• ACID Properties − DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability (normally shortened as ACID). These concepts are applied to transactions, which manipulate data in a database. ACID properties help the database stay healthy in multi-transactional environments and in case of failure.
• Multiuser and Concurrent Access − DBMS supports a multi-user environment and allows users to access and manipulate data in parallel. Though there are restrictions on transactions when users attempt to handle the same data item, users are generally unaware of them.
• Multiple views − DBMS offers multiple views for different users. A user who is in the Sales department will have a different view of a database than a person working in the Production department. This feature enables the users to have a concentrated view of the database according to their requirements.
• Security − Features like multiple views offer security to some extent, since users are unable to access the data of other users and departments. DBMS offers methods to impose constraints while entering data into the database and while retrieving it at a later stage. DBMS offers many different levels of security features, which enable multiple users to have different views with different features. For example, a user in the Sales department cannot see the data that belongs to the Purchase department. Additionally, it can also be managed how much data of the Sales department should be displayed to the user. Since a DBMS does not keep its data on disk in the same directly readable form as a traditional file system, it is harder for miscreants to read or tamper with the data.
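The atomicity part of the ACID properties listed above can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: it uses Python's built-in `sqlite3` module, and the `account` table, the account names and the `transfer` helper are all hypothetical. A transfer consists of two updates; if the second one fails, the first is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule: no negative balance
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, source, target, amount):
    """Move `amount` from source to target as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit was rolled back too.
        return False

first = transfer(conn, "A", "B", 30)    # succeeds
second = transfer(conn, "A", "B", 500)  # fails: A would go negative
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(first, second, balances)
```

After the failed transfer, B has not kept the 500 that was credited in the first step, which is what "all or nothing" means in practice.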
DBA (Database Administrator)
The DBA is a special person (a super user) who controls both the data and the programs that access that data, i.e. controls the overall system. The DBA is responsible for ensuring that the data in the database meets the information needs of the organization. The DBA must have a sound knowledge of the structure of the database and of the DBMS. The DBA must also be thoroughly conversant with the organization, its systems and the information needs of the managers. A DBA needs the following:
a. Knowledge of the operating system on which the database server is running.
b. Knowledge of SQL
c. Sound knowledge of database design
d. General understanding of network architectures
e. Knowledge of the database server.
Roles/Functions/Responsibilities of database administrator (DBA)
The DBA is responsible for:
a) Ensuring that the data in the database meets the information needs of the organization.
b) Ensuring that the facilities for retrieving data and for structuring reports are appropriate to the needs of the organization.
c) Maintaining the data dictionary (data about data or metadata, i.e. the definition of the structure of the data) and the user manuals that describe the facilities the database offers and how to make use of them.
d) Supervising the modification (insert, delete and update) of data.
e) The security of the database and the requirements of privacy.
f) Maintaining database integrity (ensuring that changes made to the database do not result in a loss of data).
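The data dictionary mentioned in point (c) can be inspected like any other data. The sketch below assumes SQLite through Python's built-in `sqlite3` module, where the catalogue is exposed as the `sqlite_master` table and column details come from `PRAGMA table_info`; other DBMSs expose their metadata differently. The `student` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, marks INTEGER)"
)

# sqlite_master is SQLite's catalogue: one row per table, index, view or trigger.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (name, col_type)
    for _cid, name, col_type, _notnull, _default, _pk in conn.execute(
        "PRAGMA table_info(student)"
    )
]

print(tables)
print(columns)
```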
DATABASE SYSTEM ARCHITECTURE
The architecture of a database system is greatly influenced by the underlying computer system on which the database system runs.
(1) Centralized Database System
Centralized database systems are those that run on a single computer system and do not interact with other computer systems. The simplest case is a single-user system (a personal computer): a desktop unit used by a single person, with only one CPU, one or two hard disks and an operating system that may support only one user. When a centralized database is shared by several users, they reach that one system on a client-server basis. The structure of a centralized database system is shown in the figure.
Fig: A centralized Database System
(2) Distributed database system
A distributed database system is a collection of databases that share a common schema and coordinate to access non-local data, i.e. in a distributed database system, the database is stored on several computers. The computers in a distributed system communicate with one another through various communication media, such as networks or telephone lines. They do not share main memory or disks. The computers in a distributed system are called sites or nodes. The general structure of the distributed system is shown in the figure below.
Fig: A Distributed Database System
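The idea of several sites sharing a common schema can be imitated on one machine. In the rough sketch below, each "site" is a separate in-memory SQLite database opened through Python's `sqlite3` module, and a helper runs the same query at every site and merges the answers. Everything here is an assumption made for illustration: the `customer` schema, the site names, the sample rows and the `customers_in` helper are hypothetical, and a real distributed DBMS would also handle networking, failures and transactions that span sites.

```python
import sqlite3

def make_site(rows):
    """Create one site: an independent database that holds its own local data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

# Two sites with the same schema but different local data.
sites = {
    "site_1": make_site([(1, "Aryan", "Birtamode")]),
    "site_2": make_site([(2, "Sita", "Kathmandu"), (3, "Ramesh", "Birtamode")]),
}

def customers_in(city):
    """Run the same query at every site and merge the results into one answer."""
    found = []
    for conn in sites.values():
        found += conn.execute(
            "SELECT cust_id, name FROM customer WHERE city = ?", (city,)
        ).fetchall()
    return sorted(found)

result = customers_in("Birtamode")
print(result)
```

The caller of `customers_in` sees one logical database even though the rows live at different sites.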
Types of Database
Document Oriented Database – This database is free from any type of strict schema. It does not store data in the form of data tables but in the form of text records (documents). This type of database is suitable for storing dynamic data and is useful for applications that are document-based. Documents are encoded using standard formats such as JSON or XML. CouchDB and RavenDB are examples of document databases.
Fig: document model
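The document model can be sketched with plain Python dictionaries encoded as JSON. This is not the API of CouchDB or RavenDB; the `find` helper, the field names and the sample documents are hypothetical and only show that documents in one collection need not share the same fields.

```python
import json

# Two documents in the same collection with different fields: no fixed schema.
documents = [
    {"_id": 1, "name": "Aryan", "city": "Birtamode"},
    {"_id": 2, "name": "Sita", "subjects": ["Physics", "Computer"]},
]

# Documents are stored in an encoded standard format, here JSON text.
encoded = [json.dumps(doc) for doc in documents]
decoded = [json.loads(text) for text in encoded]

def find(docs, **criteria):
    """Return the documents whose fields match every key=value in criteria."""
    return [
        doc
        for doc in docs
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

matches = find(decoded, city="Birtamode")
print(matches)
```

A document that lacks the `city` field is simply skipped by the search; nothing forces every record to carry every field.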
Embedded Database – An embedded database runs within an application, and therefore it does not run as a separate application. Unlike general purpose databases, this database is embedded as inline code or linked library. It saves time wasted on issues related to installations or maintenance. These types of databases are generally found in the set-top boxes, mobile phones, etc. RDM server and RDM Embedded are examples of these types of databases.
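SQLite is another widely used embedded database, and because it ships with Python it makes for a convenient demonstration (it is swapped in here for illustration; the notes themselves name only RDM). In the sketch below there is no database server to install or start: the database engine is a library running inside the program, and the database is an ordinary file. The file name, table and values are hypothetical.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "settings.db")

    # First run of the application: create the database file and store a value.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE setting (key TEXT PRIMARY KEY, value TEXT)")
    conn.execute("INSERT INTO setting VALUES ('volume', '7')")
    conn.commit()
    conn.close()

    # Later run: reopen the same file and read the value back.
    conn = sqlite3.connect(path)
    value = conn.execute(
        "SELECT value FROM setting WHERE key = 'volume'"
    ).fetchone()[0]
    conn.close()

print(value)
```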
Graph Database – It is based on the relationships of resources with each other, and no particular resource has any essential importance over the others. These types of databases help in storing data in a dynamic schema. A graph database provides index-free adjacency: each vertex works as a mini-index for its adjacent elements. InfoGrid is an example of a graph database and may be preferred where model flexibility matters.
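Index-free adjacency can be pictured with a small in-memory sketch in which every vertex keeps its own list of neighbours, so following a relationship never needs a global lookup table. This is plain Python rather than the API of any graph database; the names in the graph and the `within_hops` helper are hypothetical.

```python
from collections import deque

# Each vertex stores its adjacent vertices directly (its own "mini-index").
graph = {
    "Aryan": ["Sita", "Ramesh"],
    "Sita": ["Aryan", "Gita"],
    "Ramesh": ["Aryan"],
    "Gita": ["Sita"],
}

def within_hops(graph, start, max_hops):
    """Return every vertex reachable from start in at most max_hops steps."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if hops[vertex] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[vertex]:
            if neighbour not in hops:
                hops[neighbour] = hops[vertex] + 1
                queue.append(neighbour)
    return {vertex for vertex, distance in hops.items() if distance > 0}

reachable = within_hops(graph, "Ramesh", 2)
print(sorted(reachable))
```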
Hypertext Database – These types of databases are used for organizing large amounts of dissimilar information. The information is not intended for numerical analysis. In a hypertext database, an object can be linked with any other object. This kind of database system was invented by Ted Nelson. They are preferred for maintaining online encyclopedias. Unlike traditional databases, a hypertext database has no regular structure, and therefore the user can reach the desired information in different ways.
Operational Database – It contains data related to the operations going on in an organization or enterprise, such as information about employees and data describing transactions. This type of database is updated regularly. It works on the same approach as OLTP (online transaction processing). The focus of this database is to record current data. It is often contrasted with the data warehouse.
Distributed Database – It consists of a set of databases which are located on different computers, but all these databases work logically as one database. Therefore, the data can be accessed and modified simultaneously with the help of a network. Each database is controlled by a local DBMS. It is important to maintain consistency while dealing with this type of arrangement.
Flat-File Database – These are data files in which records hold no structured relationship. Additional information is often required for understanding or interpreting these files. In simple language, if we have only one table in a database, it is referred to as a flat-file database. It is useful for storing a small number of records. A spreadsheet application like Excel can serve as a flat-file database.
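A comma-separated text file is a common form of flat file. The minimal sketch below reads one with Python's `csv` module; the column names and rows are hypothetical, and an in-memory string stands in for a file on disk. It also shows why additional information is needed to interpret such files: every value arrives as text, and the program itself must know that `marks` is a number.

```python
import csv
import io

# Stand-in for a small file on disk: one table, one record per line.
flat_file = io.StringIO(
    "roll_no,name,marks\n"
    "1,Aryan,72\n"
    "2,Sita,85\n"
)

records = list(csv.DictReader(flat_file))

# The file does not say that marks is numeric, so the program converts it.
high_scorers = [row["name"] for row in records if int(row["marks"]) >= 80]
print(high_scorers)
```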