Friday, April 17, 2020

Docker + HELM + Kubernetes

Thursday, February 22, 2018

Kafka benchmarking



bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic=kafka_benchmark --num-records=10000 --throughput=10 --record-size=200 --producer-props bootstrap.servers=localhost:9092 buffer.memory=67108864 batch.size=6

bin/kafka-consumer-perf-test.sh --zookeeper localhost:2181 --messages 50000000 --topic kafka_benchmark --threads 1

bin/kafka-consumer-perf-test.sh --messages 100000000 --topic network_perf --threads 3 --broker-list dl6-l2-kafka-01:9092,dl6-l2-kafka-02:9092,dl6-l2-kafka-03:9092 --show-detailed-stats

https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
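
The consumer commands above use the older ZooKeeper / broker-list syntax. On newer Kafka releases the perf tools talk to the brokers directly; a minimal sketch, assuming a single local broker on localhost:9092 and the same kafka_benchmark topic:

bin/kafka-producer-perf-test.sh --topic kafka_benchmark --num-records 10000 --record-size 200 --throughput 10 --producer-props bootstrap.servers=localhost:9092

bin/kafka-consumer-perf-test.sh --bootstrap-server localhost:9092 --topic kafka_benchmark --messages 10000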

Sunday, December 24, 2017

GIT statistics

Show the number of developers (unique commit authors)

git shortlog | grep -E '^[^ ]' | sort -u | wc -l
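
To get the number of commits per developer instead of just the developer count, git shortlog can summarize it directly (sorted by commit count):

git shortlog -s -n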

Show number of commits for a period

git log --pretty=oneline --after=12/31/2016 | wc -l
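
To bound the period on both ends, combine --after and --before (the dates below are placeholders); git rev-list --count gives the same number without the extra pipe:

git log --pretty=oneline --after=2017-01-01 --before=2017-12-31 | wc -l

git rev-list --count --after=2017-01-01 --before=2017-12-31 HEAD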

Thursday, December 18, 2014

Useful links

Friday, January 25, 2013

Numbers Everyone Should Know


From Google Pro Tips: Numbers Everyone Should Know

To evaluate design alternatives you first need a good sense of how long typical operations will take. Dr. Dean gives this list:
  • L1 cache reference 0.5 ns
  • Branch mispredict 5 ns
  • L2 cache reference 7 ns
  • Mutex lock/unlock 100 ns
  • Main memory reference 100 ns
  • Compress 1K bytes with Zippy 10,000 ns
  • Send 2K bytes over 1 Gbps network 20,000 ns
  • Read 1 MB sequentially from memory 250,000 ns
  • Round trip within same datacenter 500,000 ns
  • Disk seek 10,000,000 ns
  • Read 1 MB sequentially from network 10,000,000 ns
  • Read 1 MB sequentially from disk 30,000,000 ns
  • Send packet CA->Netherlands->CA 150,000,000 ns 
Some things to notice:
  • Notice the magnitude differences in the performance of different options.
  • Datacenters are far away so it takes a long time to send anything between them.
  • Memory is fast and disks are slow.
  • Using a cheap compression algorithm can save a lot of network bandwidth (a factor of 2).
  • Writes are 40 times more expensive than reads.
  • Global shared data is expensive. This is a fundamental limitation of distributed systems. Lock contention on heavily written shared objects kills performance as transactions become serialized and slow.
  • Architect for scaling writes.
  • Optimize for low write contention.
  • Optimize wide. Make writes as parallel as you can.

Example: Generate Image Results Page Of 30 Thumbnails

This is the example given in the video. Two design alternatives are used as design thought experiments.

Design 1 - Serial 

  • Read images serially. Do a disk seek. Read a 256K image and then go on to the next image.
  • Performance: 30 seeks * 10 ms/seek + 30 * 256K / 30 MB/s = 300 ms + 256 ms ≈ 560 ms

Design 2 - Parallel 

  • Issue reads in parallel.
  • Performance: 10 ms/seek + 256K / 30 MB/s = 10 ms + 8.5 ms ≈ 18 ms
  • There will be variance in the disk reads, so a more realistic estimate is 30-60 ms.

Tuesday, April 17, 2012

Instagram - the architecture now worth $1B

  • An Amazon shop. They use many of Amazon's services. With only 3 engineers they don’t have the time to look into self-hosting.
  • 100+ EC2 instances total for various purposes.
  • Ubuntu Linux 11.04 (“Natty Narwhal”). Solid; other Ubuntu versions froze on them.
  • Amazon’s Elastic Load Balancer routes requests and 3 nginx instances sit behind the ELB.
  • SSL terminates at the ELB, which lessens the CPU load on nginx.
  • Amazon’s Route53 for the DNS.
  • 25+ Django application servers on High-CPU Extra-Large machines.
  • Traffic is CPU-bound rather than memory-bound, so High-CPU Extra-Large machines are a good balance of memory and CPU.
  • Gunicorn as their WSGI server. Apache harder to configure and more CPU intensive.
  • Fabric is used to execute commands in parallel on all machines. A deploy takes only seconds.
  • PostgreSQL (users, photo metadata, tags, etc) runs on 12 Quadruple Extra-Large memory instances.
  • Twelve PostgreSQL replicas run in a different availability zone.
  • PostgreSQL instances run in a master-replica setup using Streaming Replication. EBS is used for snapshotting, to take frequent backups.
  • EBS is deployed in a software RAID configuration. Uses mdadm to get decent IO.
  • All of their working set is stored in memory. EBS doesn’t support enough disk seeks per second.
  • Vmtouch (portable file system cache diagnostics) is used to manage what data is in memory, especially when failing over from one machine to another, where there is no active memory profile already.
  • XFS as the file system. It is used to get consistent snapshots by freezing and unfreezing the RAID arrays while snapshotting (see the sketch after this list).
  • Pgbouncer is used to pool connections to PostgreSQL.
  • Several terabytes of photos are stored on Amazon S3.
  • Amazon CloudFront as the CDN.
  • Redis powers their main feed, activity feed, sessions system, and other services.
  • Redis runs on several Quadruple Extra-Large Memory instances. Occasionally shard across instances.
  • Redis runs in a master-replica setup. Replicas constantly save to disk. EBS snapshots back up the DB dumps. Dumping the DB on the master was too taxing.
  • Apache Solr powers the geo-search API. Like the simple JSON interface.
  • 6 memcached instances for caching. Connect using pylibmc & libmemcached. Amazon ElastiCache isn't any cheaper.
  • Gearman is used to asynchronously share photos to Twitter, Facebook, etc.; notify real-time subscribers when a new photo is posted; and fan out feeds.
  • 200 Python workers consume tasks off the Gearman task queue.
  • Pyapns (Apple Push Notification Service) handles over a billion push notifications. Rock solid.
  • Munin to graph metrics across the system and alert on problems. They write many custom plugins using Python-Munin to graph signups per minute, photos posted per second, etc.
  • Pingdom for external monitoring of the service.
  • PagerDuty for handling notifications and incidents.
  • Sentry for Python error reporting.
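
The XFS freeze trick mentioned in the list above (consistent EBS snapshots of the software RAID) can be sketched roughly as follows; the mount point /mnt/pgdata is a hypothetical placeholder and the snapshot step itself happens via the AWS tooling:

xfs_freeze -f /mnt/pgdata   # flush and suspend writes so the on-disk state is consistent
# take EBS snapshots of each volume in the RAID array here (AWS console or CLI)
xfs_freeze -u /mnt/pgdata   # resume writes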

RAMFS vs TMPFS on Linux


Using ramfs or tmpfs you can allocate part of the physical memory to be used as a partition. You can mount this partition and start reading and writing files just like a hard disk partition. Since you’ll be reading and writing to RAM, it will be faster.

When a vital process becomes drastically slow because of disk writes, you can choose either ramfs or tmpfs file systems for writing files to the RAM.


Both tmpfs and ramfs mounts give you fast reads and writes to and from primary memory. When you test this on a small file, you may not see a huge difference. You’ll notice the difference only when you write a large amount of data to a file along with some other processing overhead, such as network I/O.

1. How to mount Tmpfs

# mkdir -p /mnt/tmp
# mount -t tmpfs -o size=20m tmpfs /mnt/tmp

The last line in the following df -k output shows the /mnt/tmp tmpfs file system mounted above.

# df -k
Filesystem      1K-blocks  Used     Available Use%  Mounted on
/dev/sda2       32705400   5002488  26041576  17%   /
/dev/sda1       194442     18567    165836    11%   /boot
tmpfs           517320     0        517320    0%    /dev/shm
tmpfs           20480      0        20480     0%    /mnt/tmp

2. How to mount Ramfs

# mkdir -p /mnt/ram
# mount -t ramfs -o size=20m ramfs /mnt/ram

The last line in the following mount output shows the /mnt/ram ramfs file system mounted above.

# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
tmpfs on /mnt/tmp type tmpfs (rw,size=20m)
ramfs on /mnt/ram type ramfs (rw,size=20m)

You can mount ramfs and tmpfs during boot time by adding an entry to the /etc/fstab.
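
For example, an /etc/fstab entry matching the 20 MB tmpfs mount above could look like this (values are illustrative):

tmpfs  /mnt/tmp  tmpfs  size=20m  0  0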

3. Ramfs vs Tmpfs

Primarily, both ramfs and tmpfs do the same thing, with a few minor differences.

  • Ramfs will grow dynamically. So, you need to control the process that writes the data to make sure ramfs doesn’t go above the available RAM in the system. Say you have 2 GB of RAM and create a 1 GB ramfs mounted as /tmp/ram. When the total size of /tmp/ram crosses 1 GB you can still write data to it; the system will not stop you. However, once it goes above the total RAM size of 2 GB, the system may hang, as there is no room left in RAM to keep the data.
  • Tmpfs will not grow dynamically. It will not allow you to write more than the size you specified while mounting it, so you don’t need to worry about controlling the writing process; once the limit is reached, writes fail with errors similar to “No space left on device” (see the dd example after this list).
  • Tmpfs uses swap.
  • Ramfs does not use swap.
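
A quick way to see the tmpfs limit in action is to write past it with dd; with the 20 MB /mnt/tmp mount from above, the write stops with a “No space left on device” error once the cap is reached:

dd if=/dev/zero of=/mnt/tmp/bigfile bs=1M count=30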

4. Disadvantages of Ramfs and Tmpfs

Since both ramfs and tmpfs write to system RAM, the data is lost once the system reboots or crashes. So you should write a process that periodically copies the data from ramfs/tmpfs to disk. You can also copy the data to disk while the system is shutting down, but that will not help in the event of a system crash.
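
A minimal sketch of such a periodic copy, assuming the /mnt/tmp mount above and a hypothetical backup directory, is a cron entry that rsyncs the mount to disk every few minutes:

*/5 * * * * rsync -a /mnt/tmp/ /var/backup/tmpfs-copy/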

Table: Comparison of ramfs and tmpfs
Experimentation                            Tmpfs                Ramfs
Fill maximum space and continue writing    Will display error   Will continue writing
Fixed size                                 Yes                  No
Uses swap                                  Yes                  No
Volatile storage                           Yes                  Yes

If you want your process to write faster, tmpfs is the better choice, with precautions taken against data loss on a system crash.