Sunday, May 22, 2011

What is a Database?


This is a little difficult to explain in basic terms, at least for me, but I am going to try my best. A database is for storing data. You might ask me: then why do we have hard drives, flash drives, CDs and floppy disks? Those are storage media, and a database is a way of storing data within your storage medium. The next question is, what data? Say you write something in Microsoft Word: that is data. You compose an email: that is data. Data is anything that occupies at least one 'bit' of space. All information is finally converted to binary '1's and '0's, because this is the only way information can be stored in a storage medium. For example, when you store the number 8, its binary equivalent 1000 (read as 'One' 'Zero' 'Zero' 'Zero') gets stored. When you type text in MS Word, you are storing data in a Word file. When you use MS Excel, you are again storing data within an Excel file. All these files contain data. So you can see that a database is similar to a file, but there are vast differences. Just as MS Word is not the same as MS Excel, a database is not the same as either of them. The overall concept of storing data, though, is the same.
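To see the '1's and '0's for yourself, here is a minimal Python sketch. The function names to_bits and text_to_bits are made up for this illustration, and the text example assumes the text is stored as UTF-8 bytes, which is only one of several ways text can be stored.

```python
def to_bits(number):
    # Binary digits of a whole number, e.g. 8 becomes "1000".
    return format(number, "b")


def text_to_bits(text):
    # Each character is first turned into bytes (UTF-8 assumed here),
    # and each byte is shown as eight binary digits.
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


print(to_bits(8))          # 1000
print(text_to_bits("Hi"))  # 01001000 01101001
```

The same idea applies to a Word or Excel file: whatever you see on the screen ends up as a long run of bits on the disk.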

A Database stores information in an organized order. Like what? Say you would like to store the information about the employees of an organization. A database table will look like this. 

Emp Id | Name | Designation | Salary
-------+------+-------------+----------
E-1001 | ABC  | CEO         | 1,000,000
E-1002 | XYZ  | MANAGER     | 100,000
E-1003 | MNO  | Programmer  | 20,000
E-1004 | KLM  | Team Leader | 40,000

If you have to store information in this format and retrieve it in the same format, with E-1001 referring only to Name ABC and Designation CEO, then you have to use the appropriate software, which is a database. Software that is not designed to understand that ‘relation’ between data will not be able to fetch back the exact information when you want to use it.

Say you type the above content in Notepad; the data would look like this (below). Because Notepad is only a text editor (to write and store text information), it will not understand in the first place that E-1001 is an Emp Id. Notepad does not provide any feature to understand the ‘relation’ between data. No two lines or words are linked in any way. It is only for typing content, with absolutely no thinking capacity incorporated, like writing something on paper. As a user you have to remember the layout and relations, and work out that MNO, which is next to E-1003, is probably the employee name.

Emp Id Name    Designation      Salary
E-1001 ABC     CEO     1,000,000
  E-1002 XYZ      MANAGER        100,000
  E-1003  MNO     Programmer      20,000
 E-1004  KLM     Team Leader     40,000
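To make the Notepad problem concrete, here is a small Python sketch that treats the text above as plain text and tries to cut each line into pieces wherever there is a space. The names notepad_text and naive_fields are hypothetical, chosen only for this example.

```python
# The same content as typed in Notepad, spacing and all.
notepad_text = """Emp Id Name    Designation      Salary
E-1001 ABC     CEO     1,000,000
  E-1002 XYZ      MANAGER        100,000
  E-1003  MNO     Programmer      20,000
 E-1004  KLM     Team Leader     40,000"""


def naive_fields(line):
    # Plain text has no columns, so all we can do is cut at the spaces.
    return line.split()


lines = notepad_text.splitlines()
for line in lines[1:]:
    print(naive_fields(line))

# The last row comes back as five pieces instead of four, because
# nothing tells the program that "Team Leader" is a single value.
```

The text itself never says which piece is the name and which is the designation. That knowledge stays in the head of the person reading it.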

A database, on the other hand, can fetch all the details when given just E-1001. That is its basic functionality, because it understands that ABC, CEO and 1,000,000 correspond to E-1001. You ask a database for data using SQL (Structured Query Language). There are also ways to write programs inside a database and play with the data, for example PL/SQL (Procedural Language/Structured Query Language) in Oracle databases. I will not get into any more details at this point. I am going to cover these in detail in my later posts.
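Here is a minimal sketch of that lookup using SQLite, a small database that ships with Python, rather than any particular database product. The table name employee, the column names and the helper find_employee are assumptions made up for this example. The salary is stored as a plain number without commas.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id TEXT PRIMARY KEY, name TEXT, designation TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("E-1001", "ABC", "CEO", 1000000),
        ("E-1002", "XYZ", "MANAGER", 100000),
        ("E-1003", "MNO", "Programmer", 20000),
        ("E-1004", "KLM", "Team Leader", 40000),
    ],
)


def find_employee(emp_id):
    # Given only the Emp Id, the database returns the related details.
    cursor = conn.execute(
        "SELECT name, designation, salary FROM employee WHERE emp_id = ?",
        (emp_id,),
    )
    return cursor.fetchone()  # None when no such employee exists


print(find_employee("E-1001"))  # ('ABC', 'CEO', 1000000)
```

Notice that "Team Leader" causes no confusion here: the database knows it is one value in the designation column.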

All information is stored in the database in an organized manner, with relations, like in the table above. So now you know that you would not use a database when you have to write a simple letter. Instead, MS Word should be your perfect choice!!

Compilers and Interpreters


Now you wrote some steps, but what happens next? How do you get the output, and how does your computer understand what you wrote? There has to be a mechanism by which the computer understands the program and returns the desired result. This is done by another program called the compiler/interpreter. It takes your code, breaks it down to bits and bytes, goes through your steps, calculates the result and gives the output. Compilers and interpreters are themselves programs. They are written by the vendors or communities behind each language, for example C, C++, Matlab, BASIC and Java. You normally never need to look inside them and do not have to worry about how they are written. They stay in the background, acting as an intermediary between your code and the computer.

To understand better, let’s take the example of a calculator. Have you ever wondered how it works? You type 10+5, press Enter, and you see 15 on the screen. Every calculator has a small microcontroller chip with a 'program' in it. This program is similar to a compiler/interpreter in our context. The calculator chip is your hardware, and your maths (10 + 5) is your code. The microcontroller program understands 10 and 5, converts them to bits, performs the operation '+' and then returns 15.
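The calculator idea can be sketched in a few lines of Python. This is only a toy, not how a real calculator chip is programmed: the function name calculate and the OPERATIONS table are invented for this example, and it only understands two whole numbers with one operator between them.

```python
import operator

# The operations this toy calculator understands.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(expression):
    # Find the operator, split the text around it, convert both
    # sides to numbers and perform the operation.
    for symbol, operation in OPERATIONS.items():
        left, found, right = expression.partition(symbol)
        if found and left.strip() and right.strip():
            return operation(int(left), int(right))
    raise ValueError("Cannot understand: " + expression)


print(calculate("10+5"))  # 15
```

Your maths goes in as text, the program works out what you meant, and the answer comes back. A compiler or interpreter does the same kind of job for a whole program.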

Earlier I mentioned that each language has its own compiler/interpreter and syntax, but why? Because they are developed by different vendors for different purposes. Sometimes, or probably most of the time, you will be able to solve the same problem in different languages. But you pick a language because it has a unique feature, or because it suits your type of work, cost or platform (Windows, Unix etc.). The vendor who designed the programming language decides on its syntax. That is why you see some languages that handle basic tasks and read almost like plain English, and others that are a little more complicated.


I have used Compiler/Interpreter throughout, and it would be unfair if I left this topic without explaining what they are. At a high level, a compiler reads the entire program and converts it into machine language before anything runs. An interpreter, on the other hand, reads one line or one set of related lines and executes that first. Then it reads the second set of lines, executes it and relates it to the code already executed. By execute I mean arrive at the result of a calculation. An interpreted program generally runs slower than a compiled one.
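The difference can be imitated in Python itself. This is only an analogy: Python's built-in compile() produces Python's own internal bytecode, not real machine language, and the function names run_interpreter_style and run_compiler_style are made up for this sketch.

```python
# A tiny three-line program, kept as text.
program = [
    "area = 25",
    "a = area ** 0.5",
    "p = 4 * a",
]


def run_interpreter_style(lines):
    # Read one line, execute it, then move on to the next line,
    # remembering the earlier results in 'namespace'.
    namespace = {}
    for line in lines:
        exec(line, namespace)
    return namespace["p"]


def run_compiler_style(lines):
    # Translate the whole program first, then run the translated result.
    source = "\n".join(lines)
    translated = compile(source, "<program>", "exec")
    namespace = {}
    exec(translated, namespace)
    return namespace["p"]


print(run_interpreter_style(program))  # 20.0
print(run_compiler_style(program))     # 20.0
```

Both ways give the same answer. What differs is when the translation happens: bit by bit while running, or all at once before running.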

What is a Program?

This is my first post and I am super excited. This post will explain the basic terminology used in the software world. Let’s start with a 'Program'. My mother always asks me, what do you do in the office all day with the computer? I am sure that with this she is going to understand that I am doing useful stuff. OK, now, what is a program? Something similar to the steps involved in solving a mathematical problem!! Let’s start with an example: if you were to find out, say, the perimeter of a square given its area, then you would plan to write something like this

Let 'a' be the side of the square
Let 'p' be the perimeter
Given 'area' is 25
Calculate the side 'a' as square root of 'area'
So p = 4 * a

The same happens in a programming language too. You know the logic, but instead of writing in English you are going to write it in a different way. This way, or set of rules, is called the syntax of a language. Each language has its own syntax, its own way of writing the code. Why a different syntax for each language? I will get into this in the next topic, Compilers/Interpreters. I tell all beginners: first write your logic in English, which is called pseudo code. Once you know how to solve the problem, then start on your program. Additionally, each language offers its own features and functionality, and each has its own purpose for existence. A programming language is not only for solving mathematical problems, though programming languages first originated for calculations and automation. [I will get into what automation is in my future posts]. I will conclude by saying a program is a sequence of steps to solve a problem.
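Here is how the perimeter steps above could look in one real language, Python. It is just one possible way of writing it, and the function name perimeter_from_area is made up for this example.

```python
import math


def perimeter_from_area(area):
    # Calculate the side 'a' as the square root of 'area'.
    a = math.sqrt(area)
    # So p = 4 * a.
    p = 4 * a
    return p


print(perimeter_from_area(25))  # 20.0
```

The logic is the same as the English version. Only the syntax has changed.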