Tuning: something very important that makes a difference
Posted by gondim | Posted in Tips, FreeBSD, Security | Posted on 13-07-2012
Tags:tuning
Every operating system has its own tuning, and it can really make a difference when you put a server into production. Out of the box, systems ship with default values chosen to work on any hardware, but when you need the server to deliver the best performance for a particular application, that is when you start racking your brain.
The problem is that many of these parameters are obscure or poorly documented on the Internet, and the people who could tell us more about them are the developers who created them. In FreeBSD, tuning is done in three places:
- Kernel
- loader.conf
- sysctl.conf
The kernel that gets installed is always GENERIC, a kernel that is sufficient (minimal) for the system to boot on any hardware with the lowest chance of hardware-related errors. The result is a kernel that carries more than we need and, at the same time, lacks things we do need. One example is the firewall, which is not included in the GENERIC kernel.
All device drivers that you do not use should be removed from your new kernel, but never edit the GENERIC file itself. Copy GENERIC to a different name and edit that new file instead.
If you installed FreeBSD i386, your GENERIC kernel configuration is in /sys/i386/conf/; on amd64 it is in /sys/amd64/conf/.
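The copy step can be sketched as a small shell helper. This is only an illustration: new_kernconf and the name MYKERNEL are hypothetical, not something FreeBSD ships. The helper refuses to overwrite an existing file and rewrites the ident line so the new kernel reports its own name.

```sh
#!/bin/sh
# Sketch: derive a custom kernel config from GENERIC without touching GENERIC.
# new_kernconf CONF_DIR NAME   (helper and NAME are hypothetical examples)
new_kernconf() {
    conf_dir=$1
    name=$2
    if [ ! -f "$conf_dir/GENERIC" ]; then
        echo "GENERIC not found in $conf_dir" >&2
        return 1
    fi
    if [ -e "$conf_dir/$name" ]; then
        echo "$conf_dir/$name already exists, not overwriting" >&2
        return 1
    fi
    # Copy GENERIC line by line, replacing only the ident line.
    sed "s/^ident[[:space:]].*/ident $name/" "$conf_dir/GENERIC" > "$conf_dir/$name"
}

# Typical use on amd64, as root, followed by the usual build and install:
#   new_kernconf /sys/amd64/conf MYKERNEL
#   cd /usr/src && make buildkernel KERNCONF=MYKERNEL && make installkernel KERNCONF=MYKERNEL
```

After that, all edits go into the new file and GENERIC stays available as a known-good fallback.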
NOTES is the file that lists all the options, devices and parameters we can use in our kernel. There are two of them, though: the NOTES for your architecture and the generic NOTES that applies to any architecture. We can build a complete NOTES as follows, assuming a 64-bit architecture (amd64):
# cd /sys/amd64/conf
# cat /sys/conf/NOTES NOTES > NOTES-FULL
Now we have a NOTES file with every option we can use, which serves as a reference.
One example of kernel tuning is the HZ parameter: on a server with heavy traffic, such as a router, it should be set to around 3000 on 64-bit SMP machines.
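As a rough sketch of how that could be applied to a custom kernel config, the helper below adds or replaces the active HZ line and leaves commented-out lines alone. The name set_hz is hypothetical, and 3000 is simply the value suggested above, not a universal recommendation.

```sh
#!/bin/sh
# Sketch: set "options HZ=VALUE" in a custom kernel config file.
# set_hz KERNCONF VALUE   (hypothetical helper, not part of FreeBSD)
set_hz() {
    conf=$1
    hz=$2
    tmp="$conf.tmp.$$"
    # Drop any existing active HZ line; commented lines (#options HZ=...) are kept.
    grep -v '^options[[:space:]]\{1,\}HZ=' "$conf" > "$tmp"
    printf 'options\t\tHZ=%s\n' "$hz" >> "$tmp"
    mv "$tmp" "$conf"
}

# Example (as root, on your own copy of GENERIC, never on GENERIC itself):
#   set_hz /sys/amd64/conf/MYKERNEL 3000
# After rebuilding and rebooting, the effective value can be checked with:
#   sysctl kern.clockrate
```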
So our tuning involves trimming the kernel and adding the things we will use. Below are some KERNCONF examples. At the end of the article I will put the link where I collected these settings:
=========================================================================
# Just some of them, see also
# cat /sys/{i386,amd64,}/conf/NOTES
# This one useful only on i386
#options KVA_PAGES=512
# You can play with HZ in environments with high interrupt rate (default is 1000)
# 100 is for my notebook to prolong its battery life
#options HZ=100
# Eliminate datacopy on socket read-write
# To take advantage of zero copy sockets you should have an MTU >= 4k
# This req. is only for receiving data.
# Read more in man zero_copy_sockets
# Also this epic thread on kernel trap:
# http://kerneltrap.org/node/6506
# Here Linus says that “anybody that does it that way (FreeBSD) is totally incompetent”
#options ZERO_COPY_SOCKETS
# Support TCP-MD5 signatures (RFC 2385). Requires IPSEC
options TCP_SIGNATURE
# A stack overflow was found in the KAME IPsec stack:
# See http://secunia.com/advisories/43995/
# For quick workaround you can use `ipfw add deny proto ipcomp`
options IPSEC
# These can be loaded as modules. They are described in the loader.conf section
#options ACCEPT_FILTER_DATA
#options ACCEPT_FILTER_HTTP
# Adding ipfw, also can be loaded as modules
options IPFIREWALL
# On 8.1+ you can disable verbose to see blocked packets on ipfw0 interface.
# Also there is no point in compiling verbose into the kernel, because
# now there is net.inet.ip.fw.verbose tunable.
#options IPFIREWALL_VERBOSE
#options IPFIREWALL_VERBOSE_LIMIT=10
options IPFIREWALL_FORWARD
# Adding kernel NAT
options IPFIREWALL_NAT
options LIBALIAS
# Traffic shaping
options DUMMYNET
# Divert, i.e. for userspace NAT
options IPDIVERT
# This is for OpenBSD’s pf firewall
device pf
device pflog
# pf’s QoS – ALTQ
options ALTQ
options ALTQ_CBQ # Class Based Queuing (CBQ)
options ALTQ_RED # Random Early Detection (RED)
options ALTQ_RIO # RED In/Out
options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC)
options ALTQ_PRIQ # Priority Queuing (PRIQ)
options ALTQ_NOPCC # Required for SMP build
# Pretty console
# Manual can be found here http://forums.freebsd.org/showthread.php?t=6134
#options VESA
#options SC_PIXEL_MODE
# Disable reboot on Ctrl Alt Del
#options SC_DISABLE_REBOOT
# Change normal|kernel messages color
options SC_NORM_ATTR=(FG_GREEN|BG_BLACK)
options SC_KERNEL_CONS_ATTR=(FG_YELLOW|BG_BLACK)
# More scroll space
options SC_HISTORY_SIZE=8192
# Adding hardware crypto device
device crypto
device cryptodev
# Useful network interfaces
device vlan
device tap #Virtual Ethernet driver
device gre #IP over IP tunneling
device if_bridge #Bridge interface
device pfsync #synchronization interface for PF
device carp #Common Address Redundancy Protocol
device enc #IPsec interface
device lagg #Link aggregation interface
device stf #IPv4-IPv6 port
# Also for my notebook, but may be used with Opteron
device amdtemp
# Same for Intel processors
device coretemp
# man 4 cpuctl
device cpuctl # CPU control pseudo-device
# Support for ECMP. More than one route for destination
# Works even with default route so one can use it as LB for two ISP
# For now code is unstable and panics (panic: rtfree 2) on route deletions.
#options RADIX_MPATH
# Multicast routing
#options MROUTING
#options PIM
# Debug & DTrace
options KDB # Kernel debugger related code
options KDB_TRACE # Print a stack trace for a panic
options KDTRACE_FRAME # amd64-only(?)
options KDTRACE_HOOKS # all architectures – enable general DTrace hooks
#options DDB
#options DDB_CTF # all architectures – kernel ELF linker loads CTF data
# Adaptive spinning in lockmgr (8.x+)
# See http://www.mail-archive.com/svn-src-all@freebsd.org/msg10782.html
options ADAPTIVE_LOCKMGRS
# UTF-8 in console (8.x+)
#options TEKEN_UTF8
# FreeBSD 8.1+
# Deadlock resolver thread
# For additional information see http://www.mail-archive.com/svn-src-all@freebsd.org/msg18124.html
# (FYI: “resolution” is panic so use with caution)
#options DEADLKRES
# Increase maximum size of Raw I/O and sendfile(2) readahead
#options MAXPHYS=(1024*1024)
#options MAXBSIZE=(1024*1024)
# For scheduler debug enable following option.
# Debug will be available via `kern.sched.stats` sysctl
# For more information see http://svnweb.freebsd.org/base/head/sys/conf/NOTES?view=markup
#options SCHED_STATS
# A framework for very efficient packet I/O from userspace, capable of
# line rate at 10G (FreeBSD10+)
# See http://svnweb.freebsd.org/base?view=revision&revision=227614
#device netmap
=========================================================================
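Because so many lines in the example above are commented out, it helps to see what a config file actually enables before building. A minimal sketch follows; active_kernconf is a hypothetical helper name. It strips comments and prints only the options and device lines that remain.

```sh
#!/bin/sh
# Sketch: list the options/devices a kernel config really enables.
# active_kernconf FILE   (hypothetical helper)
active_kernconf() {
    # Remove everything from '#' to end of line, then keep options/device entries.
    sed -e 's/#.*$//' "$1" |
        awk '$1 == "options" || $1 == "device" { print $1, $2 }'
}

# Example:
#   active_kernconf /sys/amd64/conf/MYKERNEL | sort
```

Comparing this output for GENERIC and for your own file is a quick way to confirm what was removed and what was added.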
Now let's look at some examples for /boot/loader.conf:
=========================================================================
# Accept filters for data, http and DNS requests
# Useful when your software uses select() instead of kevent/kqueue or when you are under DDoS
# Note: DNS accf available on 8.0+
#accf_data_load="YES"
#accf_http_load="YES"
#accf_dns_load="YES"
# Async IO system calls
aio_load="YES"
# Linux specific devices in /dev
# As of 8.1 it only provides /dev/full
#lindev_load="YES"
# Adds NCQ support in FreeBSD
# WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+
# 8.0+ only
#ahci_load="YES"
#siis_load="YES"
# FreeBSD 9+
# New Congestion Control for FreeBSD
# http://caia.swin.edu.au/urp/newtcp/tools/cc_chd-readme-0.1.txt
# http://www.ietf.org/proceedings/78/slides/iccrg-5.pdf
# Initial merge commit message http://www.mail-archive.com/svn-src-all@freebsd.org/msg31410.html
#cc_chd_load="YES"
# Increase kernel memory size to 3G.
#
# Use ONLY if you have KVA_PAGES in kernel configuration, and you have more than 3G RAM
# Otherwise panic will happen on next reboot!
#
# It’s required for high buffer sizes: kern.ipc.nmbjumbop, kern.ipc.nmbclusters, etc
# Useful on highload stateful firewalls, proxies or ZFS fileservers
# (FreeBSD 7.2+ amd64 users: Check that current value is lower!)
#vm.kmem_size="3G"
# If your server has lots of swap (>4Gb) you should increase following value
# according to http://lists.freebsd.org/pipermail/freebsd-hackers/2009-October/029616.html
# Otherwise you’ll be getting errors
# “kernel: swap zone exhausted, increase kern.maxswzone”
#kern.maxswzone="256M"
# Older versions of FreeBSD can’t tune maxfiles on the fly
#kern.maxfiles="200000"
# Useful for databases
# Sets maximum data size to 1G
# (FreeBSD 7.2+ amd64 users: Check that current value is lower!)
#kern.maxdsiz="1G"
# Maximum buffer size(vfs.maxbufspace)
# You can check current one via vfs.bufspace
# Should be lowered/upped depending on server’s load-type
# Usually decreased to preserve kmem
# (default is 10% of mem)
#kern.maxbcache="512M"
# Sendfile buffers
# For i386 only
#kern.ipc.nsfbufs=10240
# FreeBSD 9+
# HPET “legacy route” support. It should allow HPET to work per-CPU
# See http://www.mail-archive.com/svn-src-head@freebsd.org/msg03603.html
#hint.atrtc.0.clock=0
#hint.attimer.0.clock=0
#hint.hpet.0.legacy_route=1
# syncache Hash table tuning
net.inet.tcp.syncache.hashsize=32768
net.inet.tcp.syncache.bucketlimit=32
net.inet.tcp.syncache.cachelimit=1048576
# Increased hostcache
# Later host cache can be viewed via net.inet.tcp.hostcache.list hidden sysctl
# Very useful for its RTT and RTTVAR values
# Must be power of two
net.inet.tcp.hostcache.hashsize=65536
# hashsize * bucketlimit (which is 30 by default)
# It allocates 255Mb (1966080*136) of RAM
net.inet.tcp.hostcache.cachelimit=1966080
# TCP control-block Hash table tuning
net.inet.tcp.tcbhashsize=4096
# Disable ipfw deny all
# Should be uncommented when there is a chance that
# kernel and ipfw binary may be out-of sync on next reboot
#net.inet.ip.fw.default_to_accept=1
#
# SIFTR (Statistical Information For TCP Research) is a kernel module that
# logs a range of statistics on active TCP connections to a log file.
# See prerelease notes http://groups.google.com/group/mailing.freebsd.current/browse_thread/thread/b4c18be6cdce76e4
# and man 4 siftr
#siftr_load="YES"
# Enable superpages, for 7.2+ only
# See: http://lists.freebsd.org/pipermail/freebsd-hackers/2009-November/030094.html
vm.pmap.pg_ps_enabled=1
# Useful if you are using an Intel Gigabit NIC
#hw.em.rxd=4096
#hw.em.txd=4096
#hw.em.rx_process_limit="-1"
# Also if you have a lot of interrupts on the NIC, play with the following parameters
# NOTE: You should set them for every NIC
#dev.em.0.rx_int_delay=250
#dev.em.0.tx_int_delay=250
#dev.em.0.rx_abs_int_delay=250
#dev.em.0.tx_abs_int_delay=250
# There is also multithreaded version of em/igb drivers that can be found here:
# http://people.yandex-team.ru/~wawa/
#
# for additional em monitoring and statistics use
# sysctl dev.em.0.stats=1 ; dmesg
# sysctl dev.em.0.debug=1 ; dmesg
# Also after r209242 (-CURRENT) there is a separate sysctl for each stat variable;
# Same tunings for igb
#hw.igb.rxd=4096
#hw.igb.txd=4096
#hw.igb.rx_process_limit=100
# Some useful netisr tunables. See sysctl net.isr
#net.isr.maxthreads=4
#net.isr.defaultqlimit=10240
#net.isr.maxqlimit=10240
# Bind netisr threads to CPUs
#net.isr.bindthreads=1
#
# FreeBSD 9.x+
# Increase interface send queue length
# See commit message http://svn.freebsd.org/viewvc/base?view=revision&revision=207554
#net.link.ifqmaxlen=1024
# Nicer boot logo =)
loader_logo="beastie"
=========================================================================
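The host cache comments above carry a bit of arithmetic: the hash size must be a power of two, cachelimit is hashsize times the bucket limit (30 by default), and each entry costs 136 bytes according to that comment. Here is a small sketch that reproduces the numbers for any hash size. The function names are hypothetical, and the 30 and 136 constants are taken from the comments above rather than verified against a particular FreeBSD release.

```sh
#!/bin/sh
# Sketch: derive hostcache settings and their approximate memory cost.

# Power-of-two check: keep halving while the number is even.
is_power_of_two() {
    n=$1
    [ "$n" -ge 1 ] || return 1
    while [ $((n % 2)) -eq 0 ]; do
        n=$((n / 2))
    done
    [ "$n" -eq 1 ]
}

# hostcache_plan HASHSIZE [BUCKETLIMIT]
# Assumes 136 bytes per cache entry, as quoted in the loader.conf comment.
hostcache_plan() {
    hashsize=$1
    bucketlimit=${2:-30}
    is_power_of_two "$hashsize" || {
        echo "hashsize must be a power of two" >&2
        return 1
    }
    cachelimit=$((hashsize * bucketlimit))
    mb=$((cachelimit * 136 / 1024 / 1024))
    echo "net.inet.tcp.hostcache.hashsize=$hashsize"
    echo "net.inet.tcp.hostcache.cachelimit=$cachelimit"
    echo "# approx. ${mb} MB of RAM"
}

# Example: hostcache_plan 65536
```

With 65536 as input this gives the same cachelimit of 1966080 and roughly 255 MB used in the example above.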
And last but not least, an /etc/sysctl.conf:
=========================================================================
# No zero mapping feature
# May break wine
# (There are also reports about broken samba3)
#security.bsd.map_at_zero=0
# If you have really busy webserver with apache13 you may run out of processes
#kern.maxproc=10000
# Same for servers with apache2 / Pound
#kern.threads.max_threads_per_proc=4096
# Max. backlog size
kern.ipc.somaxconn=4096
# Shared memory // 7.2+ can use shared memory > 2Gb
kern.ipc.shmmax=2147483648
# Sockets
kern.ipc.maxsockets=204800
# Mbuf 2k clusters (on amd64 7.2+ 25600 is default)
# Note that defaults for other variables depend on this variable, for example `tcpreass`
# Note for FreeBSD-7 and older: For such high value vm.kmem_size must be increased to 3G
kern.ipc.nmbclusters=262144
# Jumbo pagesize(_SC_PAGESIZE) clusters
# Used as general packet storage for jumbo frames on some network cards
# Can be monitored via `netstat -m`
#kern.ipc.nmbjumbop=262144
# Jumbo 9k/16k clusters
# If you are using them
#kern.ipc.nmbjumbo9=65536
#kern.ipc.nmbjumbo16=32768
# For lower latency you can decrease scheduler’s maximum time slice
# default: stathz/10 (~ 13)
#kern.sched.slice=1
# Increase max command-line length shown in `ps` (e.g. for Tomcat/Java)
# Default is PAGE_SIZE / 16 or 256 on x86
# This keeps commands from being shown as [executable] in `ps`
# For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749
kern.ps_arg_cache_limit=4096
# Every socket is a file, so increase them
kern.maxfiles=204800
kern.maxfilesperproc=200000
kern.maxvnodes=200000
# On some systems HPET is almost 2 times faster than default ACPI-fast
# Useful on systems with lots of clock_gettime / gettimeofday calls
# See http://old.nabble.com/ACPI-fast-default-timecounter,-but-HPET-83–faster-td23248172.html
# After revision 222222 HPET became default: http://svnweb.freebsd.org/base?view=revision&revision=222222
kern.timecounter.hardware=HPET
# Small receive space, only usable on http-server, on file server this
# should be increased to 65535 or even more
#net.inet.tcp.recvspace=8192
# This is useful on Fat-Long-Pipes
#kern.ipc.maxsockbuf=10485760
#net.inet.tcp.recvbuf_max=10485760
#net.inet.tcp.recvbuf_inc=65535
# Small send space is useful for http servers that serve small files
# Autotuned since 7.x
net.inet.tcp.sendspace=16384
# This is useful on Fat-Long-Pipes
#net.inet.tcp.sendbuf_max=10485760
#net.inet.tcp.sendbuf_inc=65535
# Turn off receive autotuning
# You can play with it.
#net.inet.tcp.recvbuf_auto=0
#net.inet.tcp.sendbuf_auto=0
# This should be enabled if you are going to use big spaces (>64k)
# Also timestamp field is useful when using syncookies
net.inet.tcp.rfc1323=1
# Turn this off on high-speed, lossless connections (LAN 1Gbit+)
net.inet.tcp.delayed_ack=0
# This feature is useful if you are serving data over modems, Gigabit Ethernet,
# or even high speed WAN links (or any other link with a high bandwidth delay product),
# especially if you are also using window scaling or have configured a large send window.
# Automatically disables on small RTT ( http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_subr.c?#rev1.237 )
# This sysctl was removed in 10-CURRENT:
# See: http://www.mail-archive.com/svn-src-head@freebsd.org/msg06178.html
#net.inet.tcp.inflight.enable=0
# TCP slowstart algorithm tunings
# Here we are assuming VERY uncongested network
# Only takes effect if net.inet.tcp.rfc3390 is set to 0
# Otherwise the formula is taken from http://tools.ietf.org/html/rfc3390
#net.inet.tcp.slowstart_flightsize=10
#net.inet.tcp.local_slowstart_flightsize=100
# Disable randomizing of ports to avoid false RST
# Before usage check SA here www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf
# (it also says that port randomization auto-disables at some connection rates, but I have not checked that)
#net.inet.ip.portrange.randomized=0
# Increase portrange
# For outgoing connections only. Good for seed-boxes and ftp servers.
net.inet.ip.portrange.first=1024
net.inet.ip.portrange.last=65535
#
# Stops route cache degradation during a high-bandwidth flood
# http://www.freebsd.org/doc/en/books/handbook/securing-freebsd.html
#net.inet.ip.rtexpire=2
net.inet.ip.rtminexpire=2
net.inet.ip.rtmaxcache=1024
# Security
net.inet.ip.redirect=0
net.inet.ip.sourceroute=0
net.inet.ip.accept_sourceroute=0
net.inet.icmp.maskrepl=0
net.inet.icmp.log_redirect=0
net.inet.icmp.drop_redirect=1
net.inet.tcp.drop_synfin=1
#
# There is also good example of sysctl.conf with comments:
# http://www.thern.org/projects/sysctl.conf
#
# icmp may NOT rst, helpful for those pesky spoofed
# icmp/udp floods that end up taking up your outgoing
# bandwidth/ifqueue due to all that outgoing RST traffic.
#
#net.inet.tcp.icmp_may_rst=0
# Security
net.inet.udp.blackhole=1
net.inet.tcp.blackhole=2
# IPv6 Security
# For more info see http://www.fosslc.org/drupal/content/security-implications-ipv6
# Disable Node info replies
# To see this vulnerability in action run `ping6 -a sglAac ::1` or `ping6 -w ::1` on unprotected node
net.inet6.icmp6.nodeinfo=0
# Turn on IPv6 privacy extensions
# For more info see proposal http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2008-06/msg00103.html
net.inet6.ip6.use_tempaddr=1
net.inet6.ip6.prefer_tempaddr=1
# Disable ICMP redirect
net.inet6.icmp6.rediraccept=0
# Disable acceptation of RA and auto linklocal generation if you don’t use them
#net.inet6.ip6.accept_rtadv=0
#net.inet6.ip6.auto_linklocal=0
# Increases default TTL, sometimes useful
# Default is 64
net.inet.ip.ttl=128
# Lessen max segment life to conserve resources
# ACK waiting time in milliseconds
# (default: 30000. RFC from 1979 recommends 120000)
net.inet.tcp.msl=5000
# Max number of timewait sockets
net.inet.tcp.maxtcptw=200000
# Don’t use tw on local connections
# As of 15 Apr 2009, Igor Sysoev says that nolocaltimewait has a buggy implementation.
# So keep it disabled for now until it gets fixed
#net.inet.tcp.nolocaltimewait=1
# FIN_WAIT_2 state fast recycle
net.inet.tcp.fast_finwait2_recycle=1
# Time before tcp keepalive probe is sent
# default is 2 hours (7200000)
#net.inet.tcp.keepidle=60000
# Should be increased until net.inet.ip.intr_queue_drops is zero
net.inet.ip.intr_queue_maxlen=4096
# Protocol decoding in interrupt thread.
# If you have a NIC that automatically sets flow_id then it is better not to use direct_force, and to use the advantages of multithreaded netisr(9)
# If you have Yandex drivers you are better off with net.isr.direct_force=1 and net.inet.tcp.read_locking=0, otherwise you may run into some TCP related problems
# If you have an old NIC that does not set flow_ids you may need to patch ip_input to manually set FLOW_ID via nh_m2flow
# FreeBSD 8+
#net.isr.direct=1
#net.isr.direct_force=1
# In FreeBSD 9+ it was renamed to
#net.isr.dispatch=direct
# This is for routers only
#net.inet.ip.forwarding=1
#net.inet.ip.fastforwarding=1
# This speeds up dummynet when the channel isn't saturated
net.inet.ip.dummynet.io_fast=1
# Increase dummynet(4) hash
#net.inet.ip.dummynet.hash_size=2048
#net.inet.ip.dummynet.max_chain_len
# Should be increased when you have A LOT of files on server
# (Increase until vfs.ufs.dirhash_mem becomes lower)
vfs.ufs.dirhash_maxmem=67108864
# Note from commit http://svn.freebsd.org/base/head@211031 :
# For systems with RAID volumes and/or virtualization environments, where
# read performance is very important, increasing this sysctl tunable to 32
# or even more will demonstratively yield additional performance benefits.
vfs.read_max=32
# Explicit Congestion Notification (see http://en.wikipedia.org/wiki/Explicit_Congestion_Notification)
net.inet.tcp.ecn.enable=1
# Flowtable – flow caching mechanism
# Useful for routers
#net.inet.flowtable.enable=1
#net.inet.flowtable.nmbflows=65535
# IPFW dynamic rules and timeouts tuning
# Increase dyn_buckets till net.inet.ip.fw.curr_dyn_buckets is lower
net.inet.ip.fw.dyn_buckets=65536
net.inet.ip.fw.dyn_max=65536
net.inet.ip.fw.dyn_ack_lifetime=120
net.inet.ip.fw.dyn_syn_lifetime=10
net.inet.ip.fw.dyn_fin_lifetime=2
net.inet.ip.fw.dyn_short_lifetime=10
# Make packets pass firewall only once when using dummynet
# i.e. packets going thru pipe are passing out from firewall with accept
#net.inet.ip.fw.one_pass=1
# shm_use_phys Wires all shared pages, making them unswappable
# Use this to lessen Virtual Memory Manager’s work when using Shared Mem.
# Useful for databases
#kern.ipc.shm_use_phys=1
# ZFS
# Enable prefetch. Useful for sequential load types, e.g. a fileserver.
# FreeBSD sets vfs.zfs.prefetch_disable to 1 on any i386 systems and
# on any amd64 systems with less than 4GB of available memory
# For additional info check this nabble thread http://old.nabble.com/Samba-read-speed-performance-tuning-td27964534.html
#vfs.zfs.prefetch_disable=0
# On highload servers you may notice following message in dmesg:
# “Approaching the limit on PV entries, consider increasing either the
# vm.pmap.shpgperproc or the vm.pmap.pv_entry_max tunable”
vm.pmap.shpgperproc=2048
=========================================================================
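Values in /etc/sysctl.conf are applied at boot, but most of them can also be tried at runtime with sysctl name=value before you commit to them. As a cautious sketch, the helper below (sysctl_conf_cmds is a hypothetical name) only prints the commands for the active lines of a file, so you can review them or run them one at a time instead of applying everything blindly.

```sh
#!/bin/sh
# Sketch: dry run for a sysctl.conf-style file.
# Prints one "sysctl name=value" command per active (uncommented) line.
# sysctl_conf_cmds FILE   (hypothetical helper)
sysctl_conf_cmds() {
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' "$1" |
        awk -F= 'NF == 2 && $1 != "" && $2 != "" { print "sysctl " $1 "=" $2 }'
}

# Review first:
#   sysctl_conf_cmds /etc/sysctl.conf
# Then, as root, run the commands you want individually, for example:
#   sysctl kern.ipc.somaxconn=4096
```

Testing one variable at a time makes it much easier to tell which change helped and which one hurt.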
As you can see, there are many variables that, when changed correctly, can make our server more secure, faster, more stable and, above all, better suited to our needs. Tuning a server matters because it gives us full control over how the server responds under load. Some people prefer a more automatic system, while others prefer to shape it to their needs. I put more faith in systems that can be shaped, as FreeBSD can, because we have the power to improve exactly where improvement is needed.
There is, of course, always a downside: if the sysadmin does not know how to tune correctly and the system does not meet their needs, they will quickly blame the system, saying it cannot handle the load or is not fit for the project.
This article was put together with data from this article, which I found very interesting and extremely useful to study and test.