Monday, January 19, 2009

Are distributed databases going to replace RDBMS?

Richard Jones of Last.fm just posted an overview with a great deal of engineering insight:

http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/



Swaroop CH of Yahoo wrote an overview of distributed DBs:

http://www.swaroopch.com/notes/Distributed_Storage_Systems

The End of an Architectural Era (It’s Time for a Complete Rewrite)

http://www.vldb.org/conf/2007/papers/industrial/p1150-stonebraker.pdf



HBase performance

http://www.mail-archive.com/hadoop-user@lucene.apache.org/msg02540.html

Thursday, January 15, 2009

Thursday, January 8, 2009

Mount a new disk on CentOS Linux

  1. Find the device name of the new hard disk with "fdisk -l"; it should be something like /dev/sdb. To identify the new drive, also run "mount" and look for the drive that is listed by "fdisk -l" but is not mounted.
    mount; fdisk -l;
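  2. Create a partition on the new drive (the sample code assumes the disk is /dev/sdb). fdisk works on the whole disk, not on a partition, so pass it /dev/sdb. The piped answers create one primary partition that spans the disk and then write the partition table.
    echo -ne "n\np\n1\n\n\nw\n" | fdisk /dev/sdb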
  3. Create a filesystem on the new partition; we use ext3.
    mkfs.ext3 /dev/sdb1
  4. Create the directory where the partition will be mounted ("-p" also creates /mount if it does not exist yet).
    mkdir -p /mount/data2
  5. Edit /etc/fstab and add a record for the new drive at the end of the file. This makes the server mount the drive automatically after a reboot. Mount options (like noatime and nodiratime) can be added as a comma-separated list after "defaults": "defaults,noatime,nodiratime"
    echo "/dev/sdb1  /mount/data2  ext3  defaults 0 0" >> /etc/fstab
Linux records when each file was last changed and modified, as well as when it was last accessed. Recording the last access time has a cost. The noatime mount option, which can be added to the line for any file system in /etc/fstab, turns it off: on a file system mounted with noatime, reads no longer update the atime stored for the file. This matters because the system no longer has to write to the file system for files that are simply being read. Since writes can be somewhat expensive, this can result in measurable performance gains.

nodiratime does the same thing, but only for directories. I know the beginner's guide says to use both mount options, but from what I've heard and read, noatime implies nodiratime. nodiratime is the subset that disables access-time updates for directories while leaving them on for files; noatime disables them for everything, files and directories alike. To use these options, write the record from step 5 like this instead (do not append both lines, or the drive ends up in /etc/fstab twice):
    echo "/dev/sdb1  /mount/data2  ext3  rw,noatime,nodiratime 0 0" >> /etc/fstab
  6. Mount the drive. "mount -a" just mounts everything according to /etc/fstab.
    mount -a
  7. Restart the server to make sure it starts ok with the new drive mounted.
    shutdown -r now
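
One thing to watch with the "echo ... >> /etc/fstab" approach: running it a second time leaves two records for the same mount point. Below is a minimal sketch of a guard, assuming a POSIX shell and awk. The name add_fstab_entry is made up for illustration, and the optional fifth argument exists only so the function can be tried on a scratch file before pointing it at the real /etc/fstab.

```shell
#!/bin/sh
# Sketch only: append an fstab record unless the mount point is already listed.
# add_fstab_entry is a hypothetical helper, not a standard tool.
# Usage: add_fstab_entry DEVICE MOUNTPOINT FSTYPE OPTIONS [FSTAB_FILE]
add_fstab_entry() {
    device=$1
    mountpoint=$2
    fstype=$3
    options=$4
    fstab=${5:-/etc/fstab}   # defaults to the real file when no fifth argument is given

    # Field 2 of every non-comment fstab record is the mount point.
    if awk -v mp="$mountpoint" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"; then
        echo "entry for $mountpoint already present in $fstab" >&2
        return 0
    fi

    # Same layout as the echo lines in the steps above; dump and pass stay 0.
    printf '%s  %s  %s  %s 0 0\n' "$device" "$mountpoint" "$fstype" "$options" >> "$fstab"
}
```

For example, "add_fstab_entry /dev/sdb1 /mount/data2 ext3 defaults,noatime" followed by "mount -a" gives the same result as steps 5 and 6, and calling it again changes nothing. The check matches on the mount point only, so it will not notice the same device being listed under a different directory.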

An Introduction to ZooKeeper from Yahoo DN