Sunday, 11 August 2013

Basics of Hacking


A Linux live CD is a good option for taking files off a computer without logging in. When you boot from a Linux live CD, the Windows file permissions on the disk are not enforced, so (unless the disk is encrypted) you can boot the machine without the Windows password and copy files.

Hacking is any non-conventional way of interacting with a computer.

One example is using a backdoor URL that was left in for testing in Windows.
Easter eggs are extra features that programmers added so that only people who know about them can use them, e.g. cheat words in games.

The hosts file is kept under the System32 folder (C:\Windows\System32\drivers\etc\hosts). Whenever you put an address in the browser, the system usually looks in the hosts file first. If it cannot find the name there, it goes to the local DNS (which can be your own server if you are on an office network); otherwise it goes to a public DNS.
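The lookup order above can be sketched in a few lines of Python. This is only a simplified model: the two "DNS servers" are plain dictionaries, and every name and address in it is made up for illustration.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table[name.lower()] = ip
    return table


def resolve(name, hosts, dns_servers):
    """Check the hosts table first, then each DNS table in order."""
    name = name.lower()
    if name in hosts:
        return hosts[name], "hosts file"
    for label, records in dns_servers:
        if name in records:
            return records[name], label
    return None, "not found"


# Hypothetical hosts file content (not taken from a real machine).
SAMPLE_HOSTS = """
# sample entries
127.0.0.1   localhost
127.0.0.1   bad-site.example   # blocked by pointing it at this machine
"""

hosts = parse_hosts(SAMPLE_HOSTS)
# Stand-ins for the local and public DNS servers (assumed data).
dns_servers = [
    ("local DNS", {"intranet.example": "10.0.0.5"}),
    ("public DNS", {"www.example.com": "192.0.2.10"}),
]

for site in ("bad-site.example", "intranet.example", "www.example.com"):
    print(site, "->", resolve(site, hosts, dns_servers))
```

The blocked site never reaches DNS at all, because the hosts file answers first.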

If you install an antivirus, it might make a lot of entries in the hosts file and point them to the local machine (127.0.0.1) so that you can't access sites which it considers to have viruses.

You can use an alternate public DNS like opendns.org, which keeps a list of harmful sites, and you can configure it so that every time your employees try to go to these sites it blocks them.

The hosts file can be easily changed with scripts in XP, but Windows 7 will give a yes/no pop-up. If you have set an administrator password for your machine, then only a user with admin access can change the hosts file.

Hacking Windows registry

The registry can be edited to change settings like hiding the Task Manager, proxy servers and start-up tasks. Scripts can edit your registry: when you click on certain sites, embedded scripts can make these changes.






Below are some denial-of-service attacks

•Ping of Death - bots create huge electronic packets and send them on to victims
•Mailbomb - bots send a massive amount of e-mail, crashing e-mail servers
•Smurf Attack - bots send Internet Control Message Protocol (ICMP) messages to reflectors, see above illustration
•Teardrop - bots send pieces of an illegitimate packet; the victim system tries to recombine the pieces into a packet and crashes as a result


A ping of death (abbreviated "PoD") is a type of attack on a computer that involves sending a malformed or otherwise malicious ping to a computer. A ping normally carries 56 bytes of data (64 bytes with the ICMP header, or 84 bytes when the Internet Protocol [IP] header is also counted); historically, many computer systems could not handle a ping packet larger than the maximum IPv4 packet size, which is 65,535 bytes. Sending a ping larger than this could crash the target computer.


Generally, sending a 65,536-byte ping packet would violate the Internet Protocol as written in RFC 791, but a packet of such a size can be sent if it is fragmented; when the target computer reassembles the packet, a buffer overflow can occur, which often causes a system crash.
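The sizes involved can be checked with a little arithmetic. The sketch below assumes a plain 20-byte IPv4 header with no options; the function name is only for illustration.

```python
IP_HEADER = 20            # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8           # bytes
DEFAULT_PAYLOAD = 56      # bytes of data in a typical ping
MAX_IPV4_PACKET = 65535   # largest value of the 16-bit total length field

# A normal ping on the wire: 20 + 8 + 56 = 84 bytes.
normal_ping = IP_HEADER + ICMP_HEADER + DEFAULT_PAYLOAD

# The fragment offset field has 13 bits and counts in units of 8 bytes,
# so the last fragment can claim to start this far into the data.
MAX_FRAGMENT_OFFSET = (2 ** 13 - 1) * 8


def reassembled_size(last_offset, last_fragment_data):
    """Total packet size the receiver ends up with after reassembly."""
    return IP_HEADER + last_offset + last_fragment_data


oversized = reassembled_size(MAX_FRAGMENT_OFFSET, 8)
print(normal_ping, MAX_FRAGMENT_OFFSET, oversized, oversized > MAX_IPV4_PACKET)
```

Each fragment on its own is legal, but the reassembled packet is bigger than the buffer an old system set aside for the largest possible packet.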


The Smurf Attack is a denial-of-service attack in which large numbers of Internet Control Message Protocol (ICMP) packets with the intended victim's spoofed source IP are broadcast to a computer network using an IP Broadcast address. Most devices on a network will, in their default settings, respond to this by sending a reply to the source IP address. If the number of machines on the network that receive and respond to these packets is very large, the victim's computer will be flooded with traffic. This can slow down the victim's computer to the point where it becomes impossible to work on.

IP Spoofing: In computer networking, IP address spoofing or IP spoofing is the creation of Internet Protocol (IP) packets with a forged source IP address, with the purpose of concealing the identity of the sender or impersonating another computing system.[1]

ICMP messages are mostly used for diagnostics and error reporting; for example, a ping is an ICMP echo request asking whether a machine is reachable.


A broadcast address is a logical address at which all devices connected to a multiple-access communications network are enabled to receive datagrams. A message sent to a broadcast address is typically received by all network-attached hosts, rather than by a specific host.

 A ping flood is a simple denial-of-service attack where the attacker overwhelms the victim with ICMP Echo Request (ping) packets. This is most effective by using the flood option of ping which sends ICMP packets as fast as possible without waiting for replies. Most implementations of ping require the user to be privileged in order to specify the flood option. It is most successful if the attacker has more bandwidth than the victim (for instance an attacker with a DSL line and the victim on a dial-up modem). The attacker hopes that the victim will respond with ICMP Echo Reply packets, thus consuming both outgoing bandwidth as well as incoming bandwidth. If the target system is slow enough, it is possible to consume enough of its CPU cycles for a user to notice a significant slowdown.

A flood ping can also be used as a diagnostic for network packet loss and throughput issues.[1]

Friday, 9 August 2013

Networking Basics

Hi All

Below are some networking basics


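Sometimes there are patch panels between the wall sockets and the switch. A patch panel is basically a passive box with a lot of ports and is much cheaper than a switch. It does not split a connection the way a telephone line splitter does: each port on the panel is the end of one cable run, and a short patch cable connects it to one switch port. Suppose your office wants 1000 connections: the cables from the desks can end on 5 patch panels with 200 ports each, and from there they are patched to switch ports as needed.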



VPN (Virtual Private Network)

A VPN works on a client-server architecture. Suppose you have a Cisco VPN client; then you need a Cisco VPN server to connect to. It uses a tunneling protocol.

So a VPN uses tunneling, and encryption inside that tunnel, to keep data secure. Packets that have been tampered with fail the integrity check and are thrown away, and if the connection looks broken the tunnel is dropped and a new one has to be set up, possibly over a different route.

So how does it come to know something is wrong with the tunnel? Typically the two ends send keepalive messages to each other. If the signal from the client is not steady, or if there are packet losses, the other end can decide the tunnel is dead and drop it. So if you are working on a DSL line (phone line modem) and the wiring is old, packets might drop due to the bad wiring, and this might cause the VPN to drop, so you won't be able to connect to the office network.

Second is the network speed. Inside your office it is 100 Mbps, that is about 12.5 MB/s, but when you are connecting from outside your speed will depend on the speed of your internet connection, so it might drop to something like 535 KB/s. So it will take a long time to access the network.
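To get a feel for the difference, here is a small Python calculation. The 100 MB file size is just an assumed example; the two speeds are the ones mentioned above.

```python
def mbps_to_megabytes_per_second(megabits_per_second):
    """Convert a line speed in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_second / 8


def transfer_seconds(size_megabytes, megabytes_per_second):
    """Ideal time to move a file, ignoring protocol and VPN overhead."""
    return size_megabytes / megabytes_per_second


office_speed = mbps_to_megabytes_per_second(100)  # office LAN
home_speed = 0.535                                # 535 KB/s over the VPN

file_size = 100  # MB, an assumed example file
print("office LAN:", transfer_seconds(file_size, office_speed), "seconds")
print("over VPN:  ", round(transfer_seconds(file_size, home_speed)), "seconds")
```

So a copy that takes a few seconds in the office can take minutes from home.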


TCP/IP and Subnet Masking 

TCP stands for Transmission Control Protocol and IP stands for Internet Protocol. There are two versions of IP, version 4 and version 6. You must have noticed these as IPv4 and IPv6 when you run ipconfig.

IP (Internet Protocol) deals with IP addresses, subnet masks and default gateways. It is a routable protocol: it allows a network to be divided into multiple subnetworks, with routers passing traffic between them. A non-routable protocol only works inside a single network, where every computer can talk directly to every other. IP is layer 3 (the network layer) of the OSI model.


Windowing

This is an important concept in TCP. Consider packets sent from computer A to computer B: for the packets sent to B, computer A receives acknowledgements. Packets are sent in a group called a window. Suppose in the first window only one packet is sent; if the acknowledgement is received, the number of packets per window is increased. Suppose packets start dropping in the middle, so that A has sent 1000 packets but receives acknowledgement for only 2; then it will send the unacknowledged packets again.

Now this TCP concept of windowing and resending might be problematic for real-time communication like Skype, where a packet that arrives late is of no use.
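A toy simulation makes the window idea easier to see. This is a heavily simplified sketch and not how a real TCP stack works: it assumes the window doubles after every acknowledged round and falls back to 1 after a lost round.

```python
def send_with_window(total_packets, lost_rounds=()):
    """Return a list of (round, packets sent, packets acknowledged so far)."""
    window = 1
    acked = 0
    round_no = 0
    history = []
    while acked < total_packets:
        round_no += 1
        batch = min(window, total_packets - acked)
        if round_no in lost_rounds:
            window = 1        # loss: shrink the window, resend next round
        else:
            acked += batch    # whole batch acknowledged
            window *= 2       # so try a bigger window next time
        history.append((round_no, batch, acked))
    return history


print(send_with_window(10))
print(send_with_window(10, lost_rounds={3}))
```

One lost round costs two extra rounds here, which is the kind of delay that hurts live audio and video.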

DHCP

Dynamic Host Configuration Protocol - there are two ways an IP can be assigned to your system: static IP and dynamic IP. If you look into your modem or wifi router you will see these two settings. Consider you want to play Counter-Strike with your roommate: what you do is just make your IP static and ask your roommate to connect to your IP. Then you two are on the LAN and not on the internet. Usually there is a lease time for which your WLAN router will assign an IP to your computer; it can be one day, one week or one year.

ipconfig /release gives up your current IP, and ipconfig /renew asks the DHCP server for a new lease.



NAT -- Network address translation 

When the internet initially started, everyone thought each computer would require a unique IP, like a phone number, so every computer that you buy would have to be registered. NAT solves this issue. Inside your home, if you have 5 computers then each one will have a unique IP, but they might look like 192.168.1.1, 192.168.1.2, 192.168.1.3 and so on. Even I will have the same IPs for my home PCs. This is because inside your LAN you can reuse these private IPs. But the internet line coming to your home will have a unique public IP. This is traceable from the outside world. Suppose you commit any cyber crime: you can be caught by this IP, as it is known to which provider a set of IPs is assigned, and the house address can be obtained from the internet provider, like Reliance.
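Python's standard ipaddress module knows which ranges are reserved for private use, so the reuse of addresses can be checked directly. The two "homes" below are made-up examples.

```python
import ipaddress


def is_private_address(text):
    """True if the address is in a range not routed on the public internet."""
    return ipaddress.ip_address(text).is_private


# Two different homes can hand out exactly the same LAN addresses (assumed data).
my_home = {"192.168.1.1", "192.168.1.2", "192.168.1.3"}
your_home = {"192.168.1.1", "192.168.1.2"}

print(sorted(my_home & your_home))        # addresses used in both homes
print(is_private_address("192.168.1.2"))  # LAN address
print(is_private_address("8.8.8.8"))      # a public address
```

Only the public address on the line coming into the house has to be unique.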


Subnet Masking

When you run ipconfig you will get the IP address, subnet mask and default gateway. The subnet mask tells us which part of your IP is the network and which part is your computer number.

like example 

IP address---192.168.1.10
Subnet mask -- 255.255.255.0 

Here the last octet (10) is your computer number, as the last octet of the subnet mask is 0. You can have 254 computers in this subnetwork (256 addresses minus the network and broadcast addresses). Consider what happens if you did not have subnetworks: all the computers connected to a router or to the same physical switch would be able to communicate. If you are in one building with 20 flats, 5 on each floor, maybe each floor is a subnetwork made using subnet masking; otherwise everyone can hack into others' systems if ports are open.
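The same example can be worked through with Python's standard ipaddress module. The second address (192.168.2.10) is an assumed neighbour on another subnet, added only for comparison.

```python
import ipaddress

iface = ipaddress.ip_interface("192.168.1.10/255.255.255.0")
network = iface.network

# Host part: the bits of the address that the mask leaves as 0.
host_part = int(iface.ip) & int(network.hostmask)

# Usable hosts: all addresses minus the network and broadcast addresses.
usable_hosts = network.num_addresses - 2

print("network:     ", network)
print("host number: ", host_part)
print("usable hosts:", usable_hosts)
print("same subnet? ", ipaddress.ip_address("192.168.1.77") in network)
print("same subnet? ", ipaddress.ip_address("192.168.2.10") in network)
```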

VOIP 

Skype and the Cisco phones that we use in the office are VoIP. VoIP hardphones are like computers and have their own IP address through which they can be accessed for configuration.

To configure VoIP in the office you need a VoIP server (a normal PC on which a VoIP service is installed). Windows Server can run this service, but you can also have a Cisco VoIP server. You will also need a client: a softphone or a hardphone.

Most VoIP phones (hard or soft) use SIP (Session Initiation Protocol), which is an open standard, and Cisco phones can work with it. Avaya and Skype have their own protocols, so if you have an Avaya server you can't connect other vendors' hardphones.

Codec -- determines how the audio is encoded and so the amount of bandwidth your VoIP communication needs; in Gtalk it's 4.5 kbps. The higher the bandwidth, the better the quality. You can set VoIP to priority in your router.

In VoIP phones the latency is 75 ms to 100 ms. In normal phones it is 45 ms. The way VoIP reduces cost for most enterprises is as follows.

Consider an office in the middle of India with sub-offices across the country. The office in a village will have a VoIP server, and that VoIP server will have phone lines connected to it apart from the internet line. So when you call, the call comes from Delhi over the internet line and then the local VoIP server connects it to the village's local exchange, so the call is charged as local and there is a saving on the phone bill.


Network Mapping

ICMP - Internet Control Message Protocol. It is used by tools like ping and tracert. SNMP is a separate protocol and does not run inside ICMP.

SMB shares , SNMP ( Simple network management protocol)

SNMP gives info like all the software and updates installed on your machine and all hardware info like RAM. This is how your network admin knows you have installed unlicensed software on your laptop.
You can disable this service on your machine. There is network mapping software which works on SNMP and gives you all this information.

Some useful commands are

Tracert -- this will give you the full path, that is, how your computer reaches facebook

tracert www.facebook.com

will give you details like the routers and exchanges on the way.

A gateway is anything that connects you to the outside world. Usually at home it is your modem.












Thursday, 8 August 2013

Linux Basics


These commands need to be arranged in term of their usability


Tar ( Tape archive)

1) A function letter does not need to be prefixed with a dash ("-"), and may be combined with other single-letter options.
2) A long function name must be prefixed with a double dash ("--").
3) Some options take a parameter; with the single-letter form these must be given as separate arguments.
 With the long form, they may be given by appending "=value" to the option.

All 4 below mean the same thing

a) tar --create --file=archive.tar file1 file2
Note -- c for create , f for file
b) tar -c -f archive.tar file1 file2
c) tar -cf archive.tar file1 file2
d) tar cf archive.tar file1 file2

Create archive archive.tar containing files file1 and file2. Here, the c tells tar you will be creating an archive; the f tells tar that the next option (here it's archive.tar) will be the name of the archive it creates. file1 and file2, the final arguments, are the files to be archived

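Other useful options for tar
r - Append (you can add a file to an already present archive file)
A - Concatenate (append the contents of another archive)
t - List contents (notice t as the T of LisT)
d - Difference
v - Verbose
x - Extract
u - Update (update an already present file in the archive if its copy is old)
z - Gzipped (compress or uncompress with gzip)
c - Create (a new archive)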

tar -tvf archive.tar -- List the files in the archive archive.tar verbosely
tar -xf archive.tar -- Extract files from archive
tar -xzvf archive.tar.gz -- Extract files from zipped archive
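If you ever need the same steps from a script, Python's standard tarfile module can do them. The sketch below works in a temporary folder with two made-up files, so it does not touch anything real; it mirrors "tar -czf", "tar -tf" and reading members back out.

```python
import os
import tarfile
import tempfile


def tar_round_trip(workdir):
    """Create archive.tar.gz from two files, list it, and read it back."""
    names = ("file1", "file2")
    for name in names:
        with open(os.path.join(workdir, name), "w") as fh:
            fh.write("contents of " + name + "\n")

    archive = os.path.join(workdir, "archive.tar.gz")

    # Like: tar -czf archive.tar.gz file1 file2
    with tarfile.open(archive, "w:gz") as tar:
        for name in names:
            tar.add(os.path.join(workdir, name), arcname=name)

    # Like: tar -tzf archive.tar.gz, then reading each member's data
    with tarfile.open(archive, "r:gz") as tar:
        listed = tar.getnames()
        contents = {n: tar.extractfile(n).read().decode() for n in listed}
    return listed, contents


with tempfile.TemporaryDirectory() as tmp:
    listed, contents = tar_round_trip(tmp)

print(listed)
print(contents)
```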

Gzip (GNU zip) and Gunzip (GNU unzip)

Gzip is a compression command and is used to compress (reduce the size of) a file. It works on one file at a time, so running gzip on 3 files will not combine them into one file.
In order to compress a folder, we need to first use tar and then compress with gzip.
Example -
gzip file1 file2 file3 ... this will replace them with 3 files, file1.gz, file2.gz and file3.gz, with the .gz extension.

Compressing a folder
tar cf - test/ | gzip > test.tar.gz

Some file manipulating commands

awk 'BEGIN {start_action} {action} END {stop_action}' filename

Note
1) $1 is the first column (field)
2) $0 is the entire line
3) FS - input field separator variable
4) OFS - output field separator variable
5) NF - number of fields in the current line
6) NR - number of the current record (line); in the END block it is the total number of lines


awk '{print $1}' input_file -- Prints first column
awk 'BEGIN {sum=0} {sum=sum+$5} END {print sum}' input_file -- Sum of values in the 5th column
awk '{ if($9 == "t4") print $0;}' input_file -- notice the semicolon
awk 'BEGIN {FS=":"} {print $2}' input_file
awk 'BEGIN {OFS=":"} {print $4,$5}' input_file
awk '{print NF}' input_file -- prints the number of fields in each line
awk 'END {print NR}' input_file -- prints the total number of records
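To see what awk is doing with $1, $5, NF and NR, here is the same logic spelled out in Python on some made-up sample text. It is a simplification: fields are split on whitespace only, and lines without a 5th field are simply skipped.

```python
SAMPLE = "a b c d 10\ne f g h 20\ni j k l 30\n"  # made-up input


def first_column(text):
    """Like: awk '{print $1}'"""
    return [line.split()[0] for line in text.splitlines() if line.split()]


def column_sum(text, column):
    """Like: awk '{sum=sum+$5} END {print sum}' with column=5."""
    total = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= column:
            total += float(fields[column - 1])
    return total


def field_counts(text):
    """Like: awk '{print NF}' (one number per line)."""
    return [len(line.split()) for line in text.splitlines()]


def record_count(text):
    """Like: awk 'END {print NR}'"""
    return len(text.splitlines())


print(first_column(SAMPLE), column_sum(SAMPLE, 5))
print(field_counts(SAMPLE), record_count(SAMPLE))
```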

Cut Command

The cut command in Unix (or Linux) is used to select sections of text from each line of a file. It can be used as a substring command by specifying the start and end position to cut. It can also be used like the awk command by specifying the delimiter.
Note
d is used for delimiter
f is used for field position
c is used for character position

cut -c4 file.txt
cut -c4,6 file.txt -- this means only 4th and 6th character. It will apply to all rows
cut -c4-7 file.txt--- Start position and End position.
cut -c10- file.txt ---Start position and no end position
cut -d' ' -f2 file.txt -- Notice no c here, We are picking a field

Important point to remember: cut always uses the delimiter to find the end of a column.
Example

logfile.dat
sum.pl
add_int.sh

 cut -d'.' -f1
--- this will give us logfile and others ( sum, add_int)

If we need the value after the dot, we need to reverse the string, cut, and then reverse the result again.

Reverse

echo "nixcraft" | rev

This will output tfarcxin
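The reverse, cut, reverse trick is easier to follow when written out step by step. Here it is in Python, using the three file names from the example above; cut_field is a small helper written for this sketch that imitates "cut -d -f".

```python
def cut_field(text, delimiter, field):
    """Like: cut -d<delimiter> -f<field> (fields are numbered from 1)."""
    return text.split(delimiter)[field - 1]


def extension(name):
    """Reverse the name, take the first field, reverse the result again."""
    reversed_name = name[::-1]                      # like: rev
    reversed_ext = cut_field(reversed_name, ".", 1)  # like: cut -d'.' -f1
    return reversed_ext[::-1]                       # like: rev


for name in ("logfile.dat", "sum.pl", "add_int.sh"):
    print(name, "->", cut_field(name, ".", 1), "and", extension(name))
```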

Head command in Linux


































Wednesday, 7 August 2013

Know your drives/USB

ST320L T007-9ZV142

2.5",SATA,3Gb/s,7200,512e

SATA - Serial Advanced Technology Attachment

provides hot swapping

SATA -- has speeds up to 600 MB/s (SATA III, 6 Gb/s). See, my hard drive states 3 Gb/s; here Gb is gigabits and not gigabytes, so after encoding overhead it means 300 MB/s.
Earlier ones like IDE had up to 133 MB/s, and the first SATA generation had 150 MB/s.

Each drive connects directly to the motherboard. IDE had ribbon-like cables and you could connect master/slave hard drives on one cable.

In SATA you need to configure in the BIOS which is the primary drive.


Why can't you use SSD drives externally? You might say the external drive could be an SSD. First, SSDs use SATA, and you don't have a SATA port coming out of your machine. SATA ports are generally between the hard drive and the motherboard. You might have an eSATA port on your laptop.

Now if you manage to use your SSD externally by formatting, partitioning and connecting it to USB 3.0, then the speed will be restricted to the speed of USB 3.0.


What are sectors on hard drives? A sector is the smallest block the drive reads or writes. 512e means 512-byte emulation: the drive has 4096-byte physical sectors but presents 512-byte sectors to the computer.

Hard drives have tracks and sectors. Formatting writes a file storage structure, like the file allocation table, onto the sectors. Sectors are grouped together into clusters.

Tapes used ferric oxide, and magnetic flux was used to magnetise it, as in tape recorders. Basically the ferric oxide is permanently magnetised by the music signal in a pattern; while reading, the magnetic field of this magnetised material causes a current in the head of the tape reader. The same principle works for hard disks.

So we know information is saved as 1s and 0s; how is that done? We know that when ferromagnetic substances are exposed to the magnetic field of a current they are magnetised, so if a spot is magnetised N-S it's a 1 and S-N it's a 0.


USB 2.0 transfers at a rate of 480 Mbit/s, that is 60 MB/s at most (less in practice).

USB 3.0 transfers at a rate of 5 Gbit/s, that is about 500 MB/s. Funny part is my hard drive has only 3 Gbit/s capacity.
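The bits versus bytes conversions above are easy to get wrong, so here they are as a small Python helper. It assumes that SATA and USB 3.0 spend 10 bits on the wire for every byte of data (8b/10b encoding), while for USB 2.0 it just divides by 8.

```python
def megabytes_per_second(megabits_per_second, encoding_8b10b=False):
    """Convert a line rate in Mbit/s to MB/s of actual data."""
    bits_per_byte = 10 if encoding_8b10b else 8
    return megabits_per_second / bits_per_byte


speeds = {
    "USB 2.0 (480 Mbit/s)": megabytes_per_second(480),
    "SATA 3 Gb/s": megabytes_per_second(3000, encoding_8b10b=True),
    "USB 3.0 (5 Gbit/s)": megabytes_per_second(5000, encoding_8b10b=True),
    "SATA 6 Gb/s": megabytes_per_second(6000, encoding_8b10b=True),
}

for name, speed in speeds.items():
    print(name, "=", speed, "MB/s")
```

These are upper limits of the interface; real drives and cables deliver less.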

we can add USB 3.0 as PCI express card


52x 32x 52x ----------- means write / rewrite / read speeds. 1x was 150 KB/s. Rewriting (on CD-RW discs) is the slowest of the three.

So a 52x drive reads at 52*150 KB/s == 7800 KB/s, that is 7.8 MB/s --- USB 2.0 is 60 MB/s.

But for DVD, 1x means 1.32 MB/s, so DVD drives that go up to 16x reach about 21.1 MB/s.

Even theoretically a 16x DVD drive is slower than USB 2.0, and because of seek time and other overheads the USB drive will perform faster still in practice.
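The "x" ratings can be turned into MB/s with two one-line functions. The base speeds are the ones quoted above (150 KB/s for CD, 1.32 MB/s for DVD); the 60 MB/s figure for USB 2.0 is its theoretical maximum.

```python
CD_1X_KB_PER_S = 150     # 1x CD speed in KB/s
DVD_1X_MB_PER_S = 1.32   # 1x DVD speed in MB/s
USB2_MB_PER_S = 480 / 8  # USB 2.0 theoretical maximum in MB/s


def cd_speed(multiplier):
    """CD drive speed in MB/s for a given 'x' rating."""
    return multiplier * CD_1X_KB_PER_S / 1000


def dvd_speed(multiplier):
    """DVD drive speed in MB/s for a given 'x' rating."""
    return multiplier * DVD_1X_MB_PER_S


print("52x CD :", cd_speed(52), "MB/s")
print("16x DVD:", round(dvd_speed(16), 2), "MB/s")
print("USB 2.0:", USB2_MB_PER_S, "MB/s")
```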

what is difference between firewire and mini usb .. both look same 4 wire pin


what are display ports and DVI ports --- images are there .

DisplayPorts are used to connect a laptop to a monitor instead of DVI ports; new laptops are coming with these. Usually we connect monitors through VGA ports, and some were using DVI ports. DisplayPort carries digital video like HDMI but is a different standard from HDMI.

What is line-in on a computer

You can use the line-in connection on your sound card to connect a portable music player, microphone, or other audio input device to your computer. Most laptops now will not have this; it's an old-time port.

serial ports

They transfer one bit at a time. Not used much now with the coming of USB.

1     Data Carrier Detect     DCD
2     Received Data     RxData
3     Transmitted Data     TxData
4     Data Terminal Ready     DTR
5     Signal Ground     Gnd
6     Data Set Ready     DSR
7     Request To Send     RTS
8     Clear To Send     CTS
9     Ring Indicator     RI


Some points on RAM and Video cards

We all know that the CPU can't work directly from the hard drive; data has to be present in RAM to be accessed. So we have different RAM speeds depending on the type of RAM.

If you have an old motherboard around, you will notice it has an AGP slot. It's an old standard for video cards, now no longer in use. New ones are PCI Express.

Also, VGA ports can't support great resolution, so some PCs will have DVI ports, and the latest ones might have DisplayPorts instead of DVI. Look for the DisplayPort symbol below.

You can also use video cards. They have inbuilt RAM of up to 1 GB or 2 GB and their own GPU (graphics processing unit).




A good topic to study to understand how communication takes place

















Sunday, 4 August 2013

Cloud Computing Basics

Hi

going to write a good article on cloud computing in detail


http://www.youtube.com/watch?v=QYzJl0Zrc4M

Cloud computing is separating your application from the hardware and operating system. That is the reason we say the application is on the cloud.

So the simple question in anyone's mind is: how was the application stored earlier? Even earlier it was stored on a server machine (any computer which acts as a server). So suppose you are the owner of an IT company, you have software installed on your machine and the client accesses it from a web browser. Will they say the application is on the cloud? No.

So why not? The application is still dependent on your hardware and operating system. Your server has Windows Server edition and some hardware (RAM, hard drive), so if your Windows crashes then the application is gone, and if your hard drive crashes the application is gone too. So it's not on the cloud.

With the coming of virtual machines, cloud computing became possible. Software like Xen makes virtualization possible. So you have one server (any PC), and on that instead of Windows you have Xen. There is one more PC which acts as your Xen master machine, which knows the status of your server.

So suppose your server fails: the master will migrate the application to another server (e.g. ESXi on the servers and vSphere as the management software on another machine). Basically, when you install an operating system on virtualization software, it is like a file sitting on that software, so the software can directly copy that file. That is, you can migrate an OS directly, like a music file.

If you want to install Ubuntu on your machine, then install VMware Player (virtualization software) first and in it you can have an Ubuntu virtual machine.

Before we had virtualization, suppose we wanted to migrate an application from one server to another: we needed to take a backup of the application, install Windows on another server, install the application on that Windows and then migrate the application data, which would take at least a day.

Now you have hypervisors, and applications like vSphere which act as master to the machines on which hypervisors are installed. If you need details about vSphere, read below.

vSphere HA provides high availability for virtual machines by pooling the virtual machines and the hosts they reside on into a
cluster. Hosts in the cluster are monitored and in the event of a failure, the virtual machines on a failed host are restarted on
alternate hosts.

When you create a vSphere HA cluster, a single host is automatically elected as the master host. The master host communicates with
vCenter Server and monitors the state of all protected virtual machines and of the slave hosts. Different types of host failures are
possible, and the master host must detect and appropriately deal with the failure. The master host must distinguish between a failed
host and one that is in a network partition or that has become network isolated. The master host uses datastore heartbeating to
determine the type of failure. To check in depth how clusters work, check the link at the end of this article.



Types of virtualization


Client installed means we have an operating system installed and on top of that we are installing virtualization software (e.g. VMware Fusion).

Hypervisor


ESXi - hypervisor from VMware. You install ESXi directly; no need to install Windows. It acts like an operating system.

So we install the hypervisor on the server (no OS installed before this step). For this you need management software; vSphere is a management software. ESXi is called a bare-metal hypervisor as it does not require any OS to run on. It has its own kernel. Some background here: the kernel is the program that controls your hardware like RAM and hard drive. The early Unix kernel was written by some scientists at Bell Labs; its ideas were used by Linux, which came much later, and I guess to some extent by Windows and by ESXi too.

Wiki links

http://en.wikipedia.org/wiki/VMware_ESX



Why hypervisors are powerful


Consider we have 3 physical servers and we install ESXi on each of those 3 servers. Suppose a power supply fails on one server: the OS with its application is brought up on the next server for fault tolerance. This is amazing, as there is no need to maintain redundancy like in RAID.
Virtualization allows you to move an entire OS from one server to another like copy-paste.


So what are Amazon and Microsoft advertising for?
Instead of buying server hardware, now you can buy an instance (virtual machine).


Going into more detail like speed and bandwidth for these VMs:
In the case of Amazon edge servers there is one central server and there are 9 servers located throughout the US, which makes it easier for people to watch content fast. Since they charge per usage, that is, by how much users are using your site, in this case you need to pay for the transfer from the central server to the edge server and then from the edge server to the user's computer. So if you have users across the country accessing your site, you need to pay for transfers to each edge server. Apart from this, consider a video: depending on the speed of the user there are different versions of the video that the central server transfers to the edge server. Suppose a user starts with 10 Mbps speed and then the speed reduces to 500 kbps because of net issues; then the first time the central server will transfer a 1 GB file and the other time it will transfer a 500 MB file, so you are charged twice.

Virtual Iron, a virtualization software, can turn physical servers on and off as needed.

Things to consider

With cloud servers a lot of data goes in and out, so you need to worry about the speed of the internet. The problem with hosted Exchange and other things in the cloud is the amount of time it takes to put the stuff on the internet.


public v/s private clouds

Public clouds

Microsoft Azure ..

elastic cloud computing

edge server for amazon

Private clouds

Clouds hosted inside your company are private clouds


Now cloud computing is more than just virtualization


Some technologies before clouds


Initially there were mainframes: there were dumb terminals (computers which did not have much processing power) which were connected to mainframes (big machines with processing power), so everything you do is actually done by the mainframe server. These were called terminal services.

Citrix MetaFrame allows you to run applications from Windows terminal servers, for example using Citrix to run DataStage.

Terminal services are now called Remote Desktop Services.

Then came the client-server model: software was installed on the client (a PC which had computing power), which then connected to the server. But it had its drawbacks, like updates to the client software taking time.

Then came the 3-tier architecture, like the bank login that you access from home.


How virtual host cluster works

http://www2.isupportyou.net/2010/07/what-is-clustering-how-it-works.html

www.youtube.com/watch?v=-NY_N_oyW4Y

Search for VMware vSphere Platform Overview videos

http://www.vmware.com/products/datacenter-virtualization/vcloud-suite/how-it-works.html

Apart from these videos there are a number of vSphere resources out there. It's a big topic, like how replication takes place. My idea in this article was to give you a brief idea about virtualization and get the high-level view of it.









Saturday, 3 August 2013

Unstructured data and its Challenges

Hi

Covering some basics of unstructured data and its challenges

Databases -Relational and Non relational

Hi Guys,

We have worked on relational databases (Oracle, SQL Server, MySQL, Teradata). With Hadoop coming into the picture, I started analysing non-relational databases. Non-relational databases are used for storing unstructured data, like the files on your Windows file system (C drive, D drive and all).

We can store email in our database tables, but then our database tables will serve as file systems.

I will write more on this topic as i get time.

Basics about Servers

Hi Guys,

Since most of us don't make it to server rooms daily, here are some basics about servers. We do use them daily, but we don't know many details.

I will format this article later. But do go through the YouTube videos by Eli. They are 60 minutes long but very, very good.

Basics of servers

http://www.youtube.com/watch?v=CDxaRfwzFrs

Server Hardware Good Video

http://www.youtube.com/watch?v=QYzJl0Zrc4M


Basically any PC can act as a server. But here we are discussing enterprise-level servers, which are more robust and are built not to be easily stopped. Consider this: even if a power supply burns out, if hard disks fail, if RAM fails, if fans fail, it has redundancy built in for all of these scenarios.

When you buy Windows Server edition software, it is meant to run services like Exchange Server (the mail server behind MS Outlook) and VoIP server software. It has a different set of options than your Windows desktop software.

Consider a Linux server. It will just give you a terminal screen (black screen) and you need to navigate using only commands. Don't confuse it with Ubuntu desktop, a Linux distribution which has a GUI. Linux servers are very, very stable and will go on for years without even requiring a reboot.

It won't have issues of memory leaks like Windows Server, which would require a reboot.

Some general Hardware of servers(Not absolute necessary)


Servers have ECC RAM (not usually necessary). It detects and corrects memory errors while the server is running.

xeon processor

redundant power supply

RAID (redundant array of independent disks). It is like multiple hard disks connected together, which are hot swappable, that is, they can be removed while the server is running. Check the video below.

SAN -- storage area network. It is like RAID but we are using separate physical boxes on the network. Data is redundant, that is, stored in multiple locations.

RAID

http://www.youtube.com/watch?v=X1x9EMd5ywY

What is eSATA? It is a type of external SATA (generally our hard disks are SATA). I have an article with basics of some hardware. Check the August articles.

Some casing


1 u server cases--Check out the images at end of articles.Just to give idea how things look

4 u server cases---Check out the images at end of articles.Just to give idea how things look

ATX case--- Desktop PC case

PCI slots ??? for RAID cards

ITX cases

-------------------------------------------

http://www.youtube.com/watch?v=Kiftbm1L_eQ

virtualization of infrastructure

------------------------------------

Read the article on cloud computing to learn more about virtualization

Software like Xen makes virtualization possible. So you have one server (any PC), and on that instead of Windows you have Xen. There is one more PC which acts as your Xen master machine, which knows the status of your server.

So suppose your server fails: the master will migrate the application to another server (e.g. ESXi on the servers and vSphere as the management software on another machine). Basically, when you install an operating system on virtualization software, it is like a file sitting on that software, so the software can directly copy that file. That is, you can migrate an OS directly, like a music file.

-------------------------------------

How DB,Webserver,Active directory servers are managed

The idea of separating out instances under different operating systems is to make sure that each has as few open ports as possible. Generally your web server, Active Directory and DB server will not sit on a single machine (virtual machine), which makes it more secure if one of the virtual machines is hacked. The DB server can communicate only on a single port (the DB port) and only with the application server, so you can't send an HTTP request to the DB server.


How the firewall is set up for a database server

We have Apache on one machine and MySQL on another; both are virtual machines. We have set up the machine with the MySQL database in such a manner that it communicates only on port 1512, say (the DB port), and not on 80. So security can be easily controlled, as an outside hacker cannot get to this server through another port.

-----------------------------------------
How server failure is managed

Then we have a number of physical servers connected, in case of failure. With server management software like vSphere, it will automatically bring up a server which is in hibernate mode whenever load is there, suppose for Active Directory.

There are two types of RAID: hardware and software.

In hardware RAID there is a RAID controller, so to the operating system the disks appear as if they are a single disk.

Software RAID can be configured with Linux and Windows Server.


RAID isn’t just a single way of combining disks. There are multiple RAID levels that provide different levels of performance and redundancy. All RAID levels have one thing in common: they combine multiple physical disks into a single logical disk that is presented to the operating system.
  • RAID 0: Unlike other RAID levels, RAID 0 provides no redundancy. However, RAID 0 allows you to increase performance using multiple disks. When you use RAID 0, data your computer writes to a hard disk is split across two (or more) hard drives evenly. For example, if your computer writes a 100MB file, 50MB will be written to one hard drive and 50MB will be written to the other hard drive. When the computer needs to read the file back, it can read 50MB from one hard drive and 50MB from the other hard drive at the same time — this will be faster than reading 100MB from a single hard drive. However, if any of the hard drives in the RAID array dies, you’ll lose your data. When you use RAID 0, your multiple disks appear to be a larger and faster hard disk — but they’re much more fragile.
  • RAID 1: In RAID 1, two disks are configured to mirror each other. When your computer writes 100MB of data to its disks, it will write the same 100MB to both hard disks. Each disk contains a complete copy of the data. This ensures that, if one of the disks ever fails, you will always have a complete, up-to-date copy of your data.
  • RAID 2, 3, and 4: These RAID levels are little-used and often considered obsolete.
  • RAID 5: To use RAID 5, you will need at least three disks. RAID 5 uses striping to divide data across all hard drives, with additional parity data divided across all disks. If one of the hard drives dies, you won’t lose any of your data. RAID 5 offers data redundancy with less storage cost than RAID 1 — for example, if you had four 1TB hard drives, you could create two separate RAID 1 arrays (1TB each for a total of 2TB storage space) or a single RAID 5 array with 3TB of storage space.
  • RAID 6: RAID 6 is similar to RAID 5, but adds an additional parity block, writing two parity blocks for each stripe of data across the disks. You lose storage capacity, but RAID 6 provides additional protection from data loss. For example, if two hard drives die in a RAID 5 configuration, you’ll lose your data. If two hard drives die in a RAID 6 configuration, you’ll still have all your data.
  • RAID 10: Also known as RAID 1+0, RAID 10 divides data between primary disks and mirrors this data to secondary disks. In this way, it attempts to provide the advantages of RAID 0 (dividing data across multiple disks for a performance increase) with the advantages of RAID 1 (redundancy).
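
The capacity trade-offs in the list above can be checked with a little arithmetic. The sketch below is a simplification: the function name usable_capacity is made up for this example, every disk is counted at the size of the smallest one, and the minimum disk counts are the usual ones rather than anything a particular controller enforces.

```python
from typing import Sequence


def usable_capacity(level: str, disk_sizes_tb: Sequence[float]) -> float:
    """Rough usable capacity (in TB) of one array for common RAID levels."""
    n = len(disk_sizes_tb)
    smallest = min(disk_sizes_tb)
    if level == "0" and n >= 2:
        return n * smallest            # striping only, no redundancy
    if level == "1" and n >= 2:
        return smallest                # every disk holds a full copy
    if level == "5" and n >= 3:
        return (n - 1) * smallest      # one disk's worth of parity
    if level == "6" and n >= 4:
        return (n - 2) * smallest      # two disks' worth of parity
    if level == "10" and n >= 4 and n % 2 == 0:
        return (n // 2) * smallest     # half the disks are mirrors
    raise ValueError("unsupported level or too few disks")


four_disks = [1.0, 1.0, 1.0, 1.0]
print(usable_capacity("5", four_disks))           # 3.0
print(2 * usable_capacity("1", four_disks[:2]))   # 2.0 (two separate mirrors)
print(usable_capacity("6", four_disks))           # 2.0
```

This reproduces the example from the list: four 1TB drives give 3TB as a single RAID 5 array, but only 2TB as two RAID 1 pairs.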

http://www.scottklarr.com/topic/23/how-raid-5-really-works/



RAID 5 in detail

RAID 5 provides fault tolerance in addition to performance advantages, allowing data to be safeguarded while only sacrificing the equivalent of one drive's space. RAID 5 requires at least three hard drives of the same size. The total storage space available with a RAID 5 array is equal to { (number of drives - 1) * size of smallest drive }. So if you use three 120 GB hard drives, you will have 240 GB of actual usable space. If you use five 120 GB hard drives, you would have 480 GB of usable space. The more drives you use, the more efficient your storage space becomes without losing any redundancy.

Your data can survive a complete failure of one hard drive; however, if two drives fail at the same time, ALL data will be lost. It is very important to have an extra drive on hand so that if a drive fails, you can replace it immediately for data rebuild. The RAID 5 array can actually still be used with one drive completely missing or not working, but performance is degraded as the data must be rebuilt on the fly. However, if you do not have an extra drive to plug in right away when one fails, it would be wise to keep the computer and all drives powered off until you can replace the failed drive. You may think, "oh it will only be a couple days before the new drive arrives," but ask yourself this: Is not having access to the data on these drives for only a couple days worse than taking the risk of losing it all forever if another drive happens to fail? Probably not.

Striping & Parity
Data is "striped" across the hard drives, with a dedicated parity block for each stripe. A, B, C, and D represent data "stripes." Each stripe segment per drive can vary in size; I believe anywhere from 4 KB to 256 KB per stripe is normal, and it can be set during setup to adjust performance. The blocks with a subscript P are the parity blocks, which are a representation of the sum of all other blocks in that stripe (explained in more detail below). The parity is responsible for the data fault tolerance and is also the reason why you lose the amount of space equivalent to one drive. Taking notice of figure 1, let's say that the second drive fails. When a new hard drive is put in its place, the RAID controller rebuilds the data automatically. The data in segments A1 and A3 is combined with the AP parity block, which allows the data for A2 to be rebuilt. This takes place on each stripe until the entire drive is up to speed, so to speak. Parity blocks are determined by using a logical comparison called XOR (Exclusive OR) on binary blocks of data, which is explained further down.
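
To make the striping description concrete, here is a small sketch that prints where data and parity blocks land on each drive. It assumes one common rotation, where the parity block moves one drive to the left on each stripe; real controllers may rotate parity differently, and the function name raid5_layout is made up for this example.

```python
def raid5_layout(num_drives: int, num_stripes: int) -> list:
    """Return one row per stripe, with a label per drive (e.g. 'A1' or 'Ap')."""
    rows = []
    for stripe in range(num_stripes):
        letter = chr(ord("A") + stripe)  # stripe A, B, C, ... (up to 26 stripes)
        # Assumed rotation: parity starts on the last drive and moves left.
        parity_drive = (num_drives - 1 - stripe) % num_drives
        row = []
        data_index = 1
        for drive in range(num_drives):
            if drive == parity_drive:
                row.append(letter + "p")          # parity block of this stripe
            else:
                row.append(f"{letter}{data_index}")
                data_index += 1
        rows.append(row)
    return rows


for row in raid5_layout(num_drives=4, num_stripes=4):
    print("  ".join(row))
```

With four drives each column of the output is one drive, and every drive ends up holding exactly one parity block per four stripes, so no single drive is "the parity disk".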

Performance
RAID 5 offers accelerated read performance because the data stream is accessed from multiple drives at the same time. Referring to figure 1, let's say that stripe A is a single file. Normally, on a single drive, the whole file would be streamed from that one hard drive bit by bit, so that drive's maximum read speed becomes the bottleneck. With RAID 5, that file can be accessed in 1/3 of the time because it is read from 3 drives at once: block 1 has the first third of the file, block 2 has the second third, and block 3 has the last part. In a perfect situation this triples your read speed, with even more performance potential in RAID 5 arrays containing additional hard drives!

The downside is an increased overhead when writing to the drives, caused by the parity calculation. Every single bit written to the drives must be compared and processed to create a parity block. If your intended use involves a lot of data writing (such as video recording or a high-traffic server), RAID 5 would not be the ideal choice.

XOR Comparison
Data is stored and processed at the very lowest levels in the form of binary, which is of course 0s and 1s. There are methods of comparing binary bits called operators. The one that does the magic of parity creation is called XOR, or Exclusive OR. If you have experience in lower level programming or electronics, you probably already know what an XOR is.

Basically, an XOR comparison will take two binary bits, compare them, and output a result of 0 or 1. It will return a 1 ONLY IF the two inputs are different. If both bits are 0, the output is 0; if both bits are 1, the output is 0; if one bit is 0 and the other bit is 1, the output is 1.

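Parity is distributed across the drives so that there is no single disk that handles parity. (Otherwise the question would be: what happens if the parity disk fails?)

XOR is exclusive OR. Truth table (two input bits, then the output):

inputs  output
00      0
01      1
10      1
11      0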

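So even if a drive fails we can rebuild the data. With three data drives, the parity is computed as:

(Drive 1) XOR (Drive 2) = (0100) XOR (0101) = (0001)
(Result) XOR (Drive 3) = (0001) XOR (0010) = (0011)

The result (0011) is the parity, stored on Drive 4.

Recovering data

If Drive 1 fails, its contents are rebuilt from the remaining drives:

(Drive 2) XOR (Drive 3) = (0101) XOR (0010) = (0111)
(Result) XOR (Drive 4) = (0111) XOR (0011) = (0100)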
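
The same arithmetic can be reproduced in a few lines of Python. This is a sketch with one byte per drive, using the bit patterns from the example above; the helper name xor_blocks is made up here, and a real RAID controller works on much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR together equal-length byte blocks, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# The drive contents from the worked example.
drive1 = bytes([0b0100])
drive2 = bytes([0b0101])
drive3 = bytes([0b0010])

# Parity is the XOR of all data blocks in the stripe.
parity = xor_blocks([drive1, drive2, drive3])

# If drive 1 is lost, XOR the surviving blocks with the parity to rebuild it.
rebuilt = xor_blocks([drive2, drive3, parity])

print(format(parity[0], "04b"))   # 0011
print(format(rebuilt[0], "04b"))  # 0100
```

The rebuild works for any single missing drive, because XOR-ing a value in twice cancels it out and leaves only the missing block.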





Thursday, 1 August 2013

File systems

Hi

Wanted to cover file systems: NTFS, ext3, ext4 and XFS, how they work and their details.

A file can be a video or audio file, or it can be formatted text.

Most operating systems have folders (directories), and some have drive letters.

How data is written to the storage media is what the operating system manages.

Examples of file systems: ext2, ext3, NTFS, FAT, HPFS

The role of the OS is to define the API for file access and the structure (directories, names).

The OS also provides drivers for physical access.

A file is a name for a group of data. This group of data also has attributes such as owner, permissions and dates/times.
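
These attributes can be inspected from a program. The sketch below creates a throwaway temporary file and reads its attributes through Python's standard os.stat call; the exact values depend on your system (for example, the owner id is always 0 on Windows).

```python
import os
import stat
import tempfile
import time

# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)  # ask the OS for the file's attributes
print("size (bytes):", info.st_size)
print("permissions: ", stat.filemode(info.st_mode))
print("owner uid:   ", info.st_uid)
print("modified:    ", time.ctime(info.st_mtime))

os.remove(path)  # clean up the temporary file
```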

The structure of a file can be explicit or implicit. For example, .exe is an executable file and .txt is a text file.

A file can have an access type, such as direct or sequential.

A file system also deals with performance: caching, organization and memory management.

A file system also manages directories, for example in hierarchical form.


Some file systems can span multiple physical drives. For example, Linux lets you mount a drive so that the directory structure remains the same.

The structure is not always hierarchical. On mainframes, for example, data is separated using dotted names like abc.kkk.aaa.kklkl

reference paths ..

How data is saved on disk

The OS always knows where the root directory is, because it is always saved at a fixed position. So the OS can decide where to start from.



How space is allocated: usually as a fixed allocation of blocks. There is a directory that contains the list of blocks for a file.

The OS access method masks the use of non-contiguous blocks. Storing a file in non-contiguous blocks is called fragmentation.


This is how a directory entry is laid out in MS-DOS (field and size in bytes):

file name 8, extension 3, attributes 1, reserved 10, time 2, date 2, first block 2, size 4
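
As a sketch of how such an entry could be read, the code below unpacks one 32-byte record with Python's struct module. It assumes the standard FAT layout, in which the first-block field is 2 bytes wide and numbers are little-endian, so the fields add up to 32 bytes. The sample entry and the function name parse_entry are made up for the example.

```python
import struct

# name 8, ext 3, attributes 1, reserved 10, time 2, date 2, first block 2, size 4
ENTRY_FORMAT = "<8s3sB10sHHHI"


def parse_entry(raw: bytes) -> dict:
    """Split one 32-byte directory entry into its fields."""
    name, ext, attr, _reserved, time_, date, first_block, size = struct.unpack(
        ENTRY_FORMAT, raw
    )
    return {
        "name": name.decode("ascii").rstrip(),  # names are padded with spaces
        "ext": ext.decode("ascii").rstrip(),
        "attributes": attr,
        "time": time_,
        "date": date,
        "first_block": first_block,
        "size": size,
    }


# A made-up entry for README.TXT starting at block 2 with a size of 1234 bytes.
sample = struct.pack(ENTRY_FORMAT, b"README  ", b"TXT", 0x20, bytes(10), 0, 0, 2, 1234)
entry = parse_entry(sample)
print(struct.calcsize(ENTRY_FORMAT))  # 32
print(entry["name"], entry["ext"], entry["first_block"], entry["size"])
```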

FAT (file allocation table): the directory entry holds the location of the file's first block, and the file allocation table records, for each block, which block of the file comes next.
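
A minimal sketch of how a file's blocks are found through a file allocation table: the directory entry gives the first block, and each table entry gives the next block of the same file. The table here is a plain Python dict with made-up block numbers, and END_OF_CHAIN is a marker chosen for this sketch (a real FAT uses reserved values for the end of a chain).

```python
END_OF_CHAIN = -1  # marker chosen for this sketch only


def file_blocks(fat: dict, first_block: int) -> list:
    """Follow the chain in the table and return the file's blocks in order."""
    blocks = []
    current = first_block
    while current != END_OF_CHAIN:
        if current in blocks:
            raise ValueError("corrupt table: chain loops back on itself")
        blocks.append(current)
        current = fat[current]  # the table says which block comes next
    return blocks


# A fragmented file: its blocks are scattered, but the chain keeps them in order.
fat = {4: 7, 7: 2, 2: 10, 10: END_OF_CHAIN}
print(file_blocks(fat, first_block=4))  # [4, 7, 2, 10]
```

The caller only asks for "the file", and the chain hides the fact that the blocks are not contiguous on disk.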


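In Windows 98 the same structure was extended as FAT-32, because of larger disks. The directory entry became (field and size in bytes):

base name 8, extension 3, attributes 1, NT 1, sec 1, creation date/time 4, last access 2, upper 16 bits of starting block 2, last write 4, lower 16 bits of starting block 2, file size 4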


In Unix, a directory entry is an i-node number (2 bytes) followed by the file name (14 bytes).
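
As a sketch, a 16-byte entry of this kind (2-byte i-node number, 14-byte name) can be packed and unpacked with Python's struct module. Little-endian byte order and null padding of short names are assumptions of this example, and the i-node number, file name and function name parse_unix_entry are made up.

```python
import struct

UNIX_ENTRY = "<H14s"  # 2-byte i-node number + 14-byte file name = 16 bytes


def parse_unix_entry(raw: bytes) -> tuple:
    """Return (i-node number, file name) from one 16-byte directory entry."""
    inode, name = struct.unpack(UNIX_ENTRY, raw)
    return inode, name.rstrip(b"\x00").decode("ascii")  # drop the null padding


raw = struct.pack(UNIX_ENTRY, 42, b"notes.txt")  # short names are null-padded
print(len(raw))               # 16
print(parse_unix_entry(raw))  # (42, 'notes.txt')
```

Unlike the MS-DOS entry, nothing else about the file is stored in the directory here; the entry only carries the name and the i-node number.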