Thursday, February 22, 2018
Kafka benchmarking
bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic=kafka_benchmark --num-records=10000 --throughput=10 --record-size=200 --producer-props bootstrap.servers=localhost:9092 buffer.memory=67108864 batch.size=6
bin/kafka-consumer-perf-test.sh --zookeeper localhost:2181 --messages 50000000 --topic kafka_benchmark --threads 1
bin/kafka-consumer-perf-test.sh --messages 100000000 --topic network_perf --threads 3 --broker-list dl6-l2-kafka-01:9092,dl6-l2-kafka-02:9092,dl6-l2-kafka-03:9092 --show-detailed-stats
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
Sunday, December 24, 2017
GIT statistics
Show number of commits by developer
git shortlog -sn
Show number of developers
git shortlog | grep -E '^[^ ]' | sort -u | wc -l
Show number of commits for a period
git log --pretty=oneline --after=2016-12-31 | wc -l
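The two can be combined into one command; a hedged sketch counting commits per developer in a period (the date is an example, and the explicit `HEAD` keeps `git shortlog` from reading stdin when run in a script):

```shell
# Commits per developer since a given date, most active first
git shortlog -sn --since=2016-12-31 HEAD
```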
Thursday, December 18, 2014
Useful links
Awesome Go
https://github.com/fleveque/awesome-awesomes#go
Awesome Hadoop
https://github.com/youngwookim/awesome-hadoop
Awesome Machine Learning
https://github.com/josephmisiti/awesome-machine-learning
Awesome Node.js
https://github.com/sindresorhus/awesome-nodejs
Awesome awesomes
https://github.com/fleveque/awesome-awesomes
Friday, January 25, 2013
Numbers Everyone Should Know
From Google Pro Tips: Numbers Everyone Should Know
- L1 cache reference 0.5 ns
- Branch mispredict 5 ns
- L2 cache reference 7 ns
- Mutex lock/unlock 100 ns
- Main memory reference 100 ns
- Compress 1K bytes with Zippy 10,000 ns
- Send 2K bytes over 1 Gbps network 20,000 ns
- Read 1 MB sequentially from memory 250,000 ns
- Round trip within same datacenter 500,000 ns
- Disk seek 10,000,000 ns
- Read 1 MB sequentially from network 10,000,000 ns
- Read 1 MB sequentially from disk 30,000,000 ns
- Send packet CA->Netherlands->CA 150,000,000 ns
- Notice the magnitude differences in the performance of different options.
- Datacenters are far away so it takes a long time to send anything between them.
- Memory is fast and disks are slow.
- A cheap compression algorithm can save a lot of network bandwidth (by a factor of 2).
- Writes are 40 times more expensive than reads.
- Global shared data is expensive. This is a fundamental limitation of distributed systems. The lock contention in shared heavily written objects kills performance as transactions become serialized and slow.
- Architect for scaling writes.
- Optimize for low write contention.
- Optimize wide. Make writes as parallel as you can.
Example: Generate Image Results Page Of 30 Thumbnails
Design 1 - Serial
- Read images serially. Do a disk seek. Read a 256K image and then go on to the next image.
- Performance: 30 seeks * 10 ms/seek + 30 * 256K / 30 MB/s = 550 ms
Design 2 - Parallel
- Issue reads in parallel.
- Performance: 10 ms/seek + 256K read / 30 MB/s = 18ms
- There will be variance from the disk reads, so the more likely time is 30-60ms
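As a sanity check, the arithmetic behind both designs, using the constants from the list above (10 ms/seek, 30 MB/s sequential read, 256 KB per image):

```shell
# Back-of-envelope check of the serial and parallel thumbnail-page designs
awk 'BEGIN {
    read_ms  = 256 / 1024 / 30 * 1000    # one 256 KB image at 30 MB/s, in ms
    serial   = 30 * 10 + 30 * read_ms    # 30 seeks plus 30 sequential reads
    parallel = 10 + read_ms              # seeks and reads overlap across disks
    printf "serial: %.0f ms, parallel: %.0f ms\n", serial, parallel
}'
```

The parallel estimate is dominated by the single seek; the read itself contributes under 10 ms.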
Tuesday, April 17, 2012
Instagram - the architecture now worth $1B
- Runs on Amazon. They use many of Amazon's services; with only 3 engineers they don't have time to look at self-hosting.
- 100+ EC2 instances total for various purposes.
- Ubuntu Linux 11.04 (“Natty Narwhal”). Solid, other Ubuntu versions froze on them.
- Amazon’s Elastic Load Balancer routes requests and 3 nginx instances sit behind the ELB.
- SSL terminates at the ELB, which lessens the CPU load on nginx.
- Amazon’s Route53 for the DNS.
- 25+ Django application servers on High-CPU Extra-Large machines.
- Traffic is CPU-bound rather than memory-bound, so High-CPU Extra-Large machines are a good balance of memory and CPU.
- Gunicorn as their WSGI server. Apache was harder to configure and more CPU-intensive.
- Fabric is used to execute commands in parallel on all machines. A deploy takes only seconds.
- PostgreSQL (users, photo metadata, tags, etc) runs on 12 Quadruple Extra-Large memory instances.
- Twelve PostgreSQL replicas run in a different availability zone.
- PostgreSQL instances run in a master-replica setup using Streaming Replication. EBS is used for snapshotting, to take frequent backups.
- EBS is deployed in a software RAID configuration. Uses mdadm to get decent IO.
- All of their working set is stored in memory. EBS doesn’t support enough disk seeks per second.
- Vmtouch (portable file system cache diagnostics) is used to manage what data is in memory, especially when failing over from one machine to another, where there is no active memory profile already.
- XFS as the file system. Used to get consistent snapshots by freezing and unfreezing the RAID arrays when snapshotting.
- PgBouncer is used to pool connections to PostgreSQL.
- Several terabytes of photos are stored on Amazon S3.
- Amazon CloudFront as the CDN.
- Redis powers their main feed, activity feed, sessions system, and other services.
- Redis runs on several Quadruple Extra-Large Memory instances. They occasionally shard across instances.
- Redis runs in a master-replica setup. Replicas constantly save to disk, and EBS snapshots back up the DB dumps. Dumping the DB on the master was too taxing.
- Apache Solr powers the geo-search API. They like the simple JSON interface.
- 6 memcached instances for caching. They connect using pylibmc & libmemcached. Amazon ElastiCache isn't any cheaper.
- Gearman is used to: asynchronously share photos to Twitter, Facebook, etc.; notify real-time subscribers of a newly posted photo; and fan out feeds.
- 200 Python workers consume tasks off the Gearman task queue.
- Pyapns (Apple Push Notification Service) handles over a billion push notifications. Rock solid.
- Munin to graph metrics across the system and alert on problems. They write many custom plugins using Python-Munin to graph signups per minute, photos posted per second, etc.
- Pingdom for external monitoring of the service.
- PagerDuty for handling notifications and incidents.
- Sentry for Python error reporting.
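Connection pooling with PgBouncer (mentioned above) is typically configured along these lines. Every name and value below is an illustrative assumption, not Instagram's actual setup:

```ini
[databases]
; one logical database routed to the local PostgreSQL
maindb = host=127.0.0.1 port=5432 dbname=maindb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction   ; server connections are reused per transaction
default_pool_size = 20
max_client_conn = 500
```

Applications then point at port 6432 instead of PostgreSQL's 5432, so hundreds of Django workers share a small pool of real server connections.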
RAMFS vs TMPFS on Linux
Using ramfs or tmpfs you can allocate part of the physical memory to be used as a partition. You can mount this partition and start writing and reading files like a hard disk partition. Since you’ll be reading and writing to the RAM, it will be faster.
When a vital process becomes drastically slow because of disk writes, you can choose either ramfs or tmpfs file systems for writing files to the RAM.
Both tmpfs and ramfs mounts give you fast reading and writing of files from and to primary memory. When you test this on a small file, you may not see a huge difference. You’ll notice the difference only when you write a large amount of data to a file, with some other processing overhead such as network I/O.
1. How to mount Tmpfs
# mkdir -p /mnt/tmp
# mount -t tmpfs -o size=20m tmpfs /mnt/tmp
The last line in the following df -k shows the above mounted /mnt/tmp tmpfs file system.
# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda2             32705400   5002488  26041576  17% /
/dev/sda1               194442     18567    165836  11% /boot
tmpfs                   517320         0    517320   0% /dev/shm
tmpfs                    20480         0     20480   0% /mnt/tmp
2. How to mount Ramfs
# mkdir -p /mnt/ram
# mount -t ramfs -o size=20m ramfs /mnt/ram
The last line in the following mount command shows the above mounted /mnt/ram ramfs file system.
# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
tmpfs on /mnt/tmp type tmpfs (rw,size=20m)
ramfs on /mnt/ram type ramfs (rw,size=20m)
You can mount ramfs and tmpfs during boot time by adding an entry to the /etc/fstab.
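For example, entries matching the mounts above might look like this (the mount points and sizes are the ones used in the examples; adjust to taste):

```
tmpfs   /mnt/tmp   tmpfs   size=20m   0   0
ramfs   /mnt/ram   ramfs   size=20m   0   0
```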
3. Ramfs vs Tmpfs
Both ramfs and tmpfs do essentially the same thing, with a few minor differences.
- Ramfs will grow dynamically. So, you need to control the process that writes the data to make sure ramfs doesn’t go above the available RAM size in the system. Say you have 2 GB of RAM and created a 1 GB ramfs mounted as /mnt/ram. When the total size of /mnt/ram crosses 1 GB, you can still write data to it; the system will not stop you from writing more than 1 GB. However, when it goes above the total RAM size of 2 GB, the system may hang, as there is no place left in RAM to keep the data.
- Tmpfs will not grow dynamically. It will not allow you to write more than the size you’ve specified while mounting; instead you get errors like “No space left on device”. So, you don’t need to worry about controlling the process that writes the data to keep tmpfs under the specified limit.
- Tmpfs uses swap.
- Ramfs does not use swap.
4. Disadvantages of Ramfs and Tmpfs
Since both ramfs and tmpfs live in system RAM, their contents are lost once the system reboots or crashes. So, you should have a process that copies the data from ramfs/tmpfs to disk at periodic intervals. You can also flush the data from ramfs/tmpfs to disk while the system is shutting down, but this will not help you in the event of a system crash.
| Experimentation | Tmpfs | Ramfs |
|---|---|---|
| Fill maximum space and continue writing | Will display error | Will continue writing |
| Fixed size | Yes | No |
| Uses swap | Yes | No |
| Volatile storage | Yes | Yes |
If you want your process to write faster, tmpfs is the better choice, with precautions taken against data loss on a system crash.