Performance


wrk - a Modern HTTP benchmarking tool

posted May 12, 2018, 8:48 AM by Chris G   [ updated May 12, 2018, 8:49 AM ]


wrk is a modern HTTP benchmarking tool capable of generating significant load when run on a single multi-core CPU. It combines a multithreaded design with scalable event notification systems such as epoll and kqueue.
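A typical run might look like the sketch below. The target URL and the thread, connection, and duration values are placeholders, not recommendations; `-t`, `-c`, `-d`, and `--latency` are wrk's documented options. `wrk_cmd` and `RUN_WRK` are names invented for this example, and the script only prints the command unless `RUN_WRK=1` is set.

```shell
#!/bin/sh
# Hypothetical target; point this at a server you are allowed to load test.
TARGET_URL=${TARGET_URL:-http://127.0.0.1:8080/index.html}

# Build a wrk command line:
#   -t threads, -c open connections, -d test duration,
#   --latency prints the latency distribution.
wrk_cmd() {
    echo "wrk -t$1 -c$2 -d$3 --latency $4"
}

cmd=$(wrk_cmd 4 100 30s "$TARGET_URL")
echo "$cmd"

# Dry run by default; set RUN_WRK=1 to actually generate load.
if [ "${RUN_WRK:-0}" = "1" ]; then
    $cmd
fi
```

A common starting point is to keep the thread count at or below the number of cores on the load generator and raise the connection count until latency starts to climb.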

Scalable Load Testing Tools

posted Aug 29, 2017, 9:16 AM by Chris G   [ updated Aug 29, 2017, 9:17 AM ]



Locust - An open source load testing tool.

Define user behaviour with Python code, and swarm your system with millions of simultaneous users.

http://locust.io/







Boomer is a better load generator for locust, written in golang. It can spawn thousands of goroutines to run your code concurrently.

It will listen and report to the locust master automatically, and your test results will be displayed on the master's web UI.

https://github.com/myzhan/boomer


Troubleshooting UDP packets being dropped

posted Jan 23, 2017, 2:22 PM by Chris G   [ updated Jan 23, 2017, 2:23 PM ]

Test to find fragmented UDP packets (larger than MTU) being dropped between hosts in different subnets:

host1:~# iperf -s -p 7870 -u
------------------------------------------------------------
Server listening on UDP port 7870
Receiving 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.1 port 7870 connected with 192.168.1.1 port 45467
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 3] 0.0-10.1 sec 494 KBytes 402 Kbits/sec 0.021 ms 0/ 344 (0%)
[ 4] local 192.168.1.1 port 7870 connected with 192.168.8.8 port 43130
[ 4] 0.0-10.1 sec 494 KBytes 402 Kbits/sec 0.023 ms 0/ 344 (0%)
[ 3] local 192.168.1.1 port 7870 connected with 192.168.8.8 port 47294
[ 3] 0.0-10.1 sec 494 KBytes 402 Kbits/sec 0.048 ms 0/ 344 (0%)

They ran a test from a "good" host by sending UDP datagrams larger than the MTU:

host2:~# iperf -c host1 -u -p 7870 -l 3832
------------------------------------------------------------
Client connecting to sd6pstg-35b5, UDP port 7870
Sending 3832 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.8.8 port 47294 connected with 192.168.1.1 port 7870
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 1.26 MBytes 1.05 Mbits/sec
[ 3] Sent 344 datagrams
[ 3] Server Report:
[ 3] 0.0-10.1 sec 494 KBytes 402 Kbits/sec 0.048 ms 0/ 344 (0%)

They ran the same test on a "bad" host:

host3:~# iperf -c host1 -u -p 7870 -l 3832
------------------------------------------------------------
Client connecting to sd6pstg-35b5, UDP port 7870
Sending 3832 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.9.9 port 38406 connected with 192.168.1.1 port 7870
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 1.26 MBytes 1.05 Mbits/sec
[ 3] Sent 344 datagrams
[ 3] WARNING: did not receive ack of last datagram after 10 tries.
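
The 3832-byte datagrams in these tests cannot fit in a single packet on a typical Ethernet path, so each one is split into IP fragments. Losing any one fragment loses the whole datagram. As a rough sketch, the fragment count can be estimated as follows, assuming IPv4 with a 20-byte header (no options) and an 8-byte UDP header. The 1500-byte MTU is an assumption, not a value taken from the hosts above, and `fragment_count` is a helper name invented here.

```shell
#!/bin/sh
# Estimate how many IPv4 fragments one UDP datagram needs.
# Assumes a 20-byte IPv4 header (no options) and an 8-byte UDP header.
fragment_count() {
    payload=$1                              # UDP payload in bytes (iperf -l value)
    mtu=$2                                  # MTU in bytes
    ip_payload=$((payload + 8))             # the UDP header is part of the IP payload
    per_frag=$(( (mtu - 20) / 8 * 8 ))      # fragment data is a multiple of 8 bytes
    echo $(( (ip_payload + per_frag - 1) / per_frag ))   # ceiling division
}

fragment_count 3832 1500   # prints 3
fragment_count 1470 1500   # prints 1 (iperf's default datagram size fits unfragmented)
```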
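
List the UDP-related kernel parameters:

sysctl -a | grep udp

Lower the interface MTU to 800 bytes and flush the routing cache:

sudo ifconfig eth0 mtu 800

ip route flush cache

Send pings larger than the MTU with the Don't Fragment bit cleared (-M dont), so that they can be fragmented:

ping -s 1520 -M dont -i 0.8 192.168.1.1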
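
The ping above allows fragmentation. A complementary probe sets the Don't Fragment bit (`-M do`) and sends the largest payload that should fit in one packet, so a hop with a smaller MTU has to reject it instead of fragmenting. A minimal sketch of the size arithmetic, assuming IPv4 with a 20-byte header and the 8-byte ICMP header (`ping_payload_for_mtu` is a name invented for this example):

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one IPv4 packet for a given MTU.
# Assumes a 20-byte IPv4 header (no options) plus the 8-byte ICMP header.
ping_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

ping_payload_for_mtu 1500   # prints 1472
ping_payload_for_mtu 1464   # prints 1436

# Example probe (not run here): with DF set, the packet must fit the path MTU.
#   ping -c 3 -M do -s "$(ping_payload_for_mtu 1500)" 192.168.1.1
```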

Capture UDP traffic to and from 192.168.1.1:

tcpdump -vv -i eth0 host 192.168.1.1 and udp -n




http://superuser.com/questions/66802/does-linux-have-an-equivalent-of-windows-pmtu-blackhole-router-discovery


To turn on tcp_mtu_probing:

# echo 2 > /proc/sys/net/ipv4/tcp_mtu_probing

Possible values:

0: disabled
1: enabled when a black hole is detected
2: always enabled
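
Note that this setting controls path MTU probing for TCP only; it does not change how fragmented UDP datagrams are handled. To check what a host is currently using, a small helper along these lines may be useful. It is a sketch only: `describe_mtu_probing` is a name made up here, and the descriptions simply mirror the list above.

```shell
#!/bin/sh
# Translate a tcp_mtu_probing value into the meanings listed above.
describe_mtu_probing() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "enabled when black hole detected" ;;
        2) echo "always enabled" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}

# Report the live setting when the proc file is readable (Linux only).
probe_file=/proc/sys/net/ipv4/tcp_mtu_probing
if [ -r "$probe_file" ]; then
    describe_mtu_probing "$(cat "$probe_file")" || true
fi

# To change the value at runtime (as root), not run here:
#   sysctl -w net.ipv4.tcp_mtu_probing=1
```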



root@sd6pstg-69c4:~# tracepath -n 192.168.1.1
 1?: [LOCALHOST]                                         pmtu 1464
 1:  169.254.154.42                                        1.429ms
 1:  169.254.154.42                                        1.058ms
 2:  169.254.32.144                                        2.134ms
 3:  169.254.32.19                                         1.034ms
 4:  no reply





tcpdump -nnvvXSs 1514 -i eth0


  • nn = don't resolve host names or port names
  • vv = verbosity level (can be v, vv, or vvv)
  • X = payload; shows packet contents in both ASCII and hex. If you need the Ethernet header, use XX instead of just X
  • S = print absolute sequence numbers
  • s = set the snaplen (in this case 1514) so that the whole packet is captured



host3:~# cat /proc/sys/net/ipv4/ip_default_ttl

64


sudo sysctl net.ipv4.ip_default_ttl=129



Open Source Web Performance Dashboard

posted May 8, 2015, 1:20 PM by Chris G

speedgun.io and PhantomJS 2 can be used to measure website performance.

Here is another solution, based on Docker containers: http://dashboard.sitespeed.io/
 

Making HTTPS Fast(er) with nginx

posted Oct 22, 2014, 11:25 AM by Chris G   [ updated Oct 22, 2014, 11:27 AM ]

Nginx has all the right HTTPS performance knobs and features... but the defaults can be optimized to deliver a much better out-of-the-box experience. In fact, this applies to just about every server out there.



Talk about the use of Redis at Twitter

posted Sep 19, 2014, 5:55 PM by Chris G   [ updated Sep 19, 2014, 6:25 PM ]

 
Interesting behind-the-scenes look at how Twitter uses Redis: 10K+ instances, 100TB+ of memory, and ~40M QPS! All of the timelines are stored in Redis... and that's a lot of memory!


YouTube Video




The talk also covers the use of twemproxy (aka nutcracker), a fast and lightweight proxy for the memcached and Redis protocols. It was built primarily to reduce the connection count on the backend caching servers.


https://github.com/twitter/twemproxy

The Linux Performance Monitoring Commands

posted Sep 4, 2014, 10:22 PM by Chris G

Perhaps the most comprehensive overview of the Linux Performance Monitoring Commands:


Google's PageSpeed Module

posted Sep 4, 2014, 3:21 PM by Chris G


PageSpeed Module - Optimizing For Bandwidth



PageSpeed's default RewriteLevel of CoreFilters is designed to reduce latency, incurring a small risk of page breakage. A related goal, bandwidth reduction, can be achieved with close to zero risk of web-site breakage:

Apache:
ModPagespeedRewriteLevel OptimizeForBandwidth
Nginx:
pagespeed RewriteLevel OptimizeForBandwidth;

This option is suitable for use in a root configuration at a hosting service, CDN, or multi-site setup, in conjunction with InheritVhostConfig. In this mode, PageSpeed does not alter HTML at all. It compresses and transcodes images in place, and minifies JavaScript and CSS. By avoiding changes to URL syntax and to HTML, the potential problem of user-written JavaScript encountering unexpected DOM elements is eliminated. There is still a latency benefit due to the reduced size of resources, as well as substantial bandwidth reduction.



Minify HTML, CSS and JS

posted Sep 3, 2014, 11:51 PM by Chris G   [ updated Dec 18, 2018, 4:05 PM ]

Code minification, if done right, can boost the performance of most websites. 

Here are a few useful tools:





Penthouse is a tool generating critical path css for your web pages

posted Sep 3, 2014, 7:34 PM by Chris G   [ updated Sep 11, 2017, 5:28 PM ]

penthouse

Critical Path CSS Generator


About

Penthouse is a tool generating critical path css for your web pages and web apps in order to speed up page rendering. Supply the tool with your site's full CSS, and the page you want to create the critical CSS for, and it will return all the CSS needed to render the above the fold content of the page. Read more about critical path css here.

The process is automatic and the generated CSS is production ready as is. If you run into problems, however, check out the Problems section further down on this page.



