Another Great Year Is Almost Over

We are pretty much done with 2011. There are just a few days left. It was a good year. We got to 6000+ servers running CloudLinux (a 4x increase since 2010), our revenues are more than 1,000% up, our team doubled in size, and we gained hundreds and hundreds of new customers.
We achieved quite a few of the goals we had set:
  • Signed up cPanel and Parallels as distributors making sure that their software is well integrated with CloudLinux
  • Released CloudLinux 6.x
  • Implemented memory limits
  • Drastically improved stability and speed of our repositories by making them fully redundant and geographically distributed
  • Released plugins for cPanel, Plesk, DirectAdmin, ISPmanager and InterWorx

I am very thankful to our customers and partners for making it possible. It is a pleasure working with you, and we appreciate your support. We strive to make sure that your systems are stable and secure – and we will continue working toward this goal. Next year we will:
  • Provide the ability to use the CloudLinux 6 kernel on CloudLinux 5 servers
  • Release production versions of CageFS and make it the best way to secure a shared hosting server
  • Release a stable version of MySQL governor
  • Improve logging to let you pinpoint the actual issue, giving your customers more visibility into what went wrong
  • Introduce IO limits, CPU weights, physical memory limits, and a limit on the number of processes
  • Provide better integration with control panels, including email notifications and different limits per plan
  • Provide a centralized interface to see resource usage data across all your servers

And hopefully, some other features that you will demand from us as we move forward.
Please accept my warmest greetings. I hope you have a terrific new year.

Igor Seletskiy

New Kernel for CloudLinux 6.2: 2.6.32-231.21.1.lve0.9.18

New kernel 2.6.32-231.21.1.lve0.9.18 has been moved to production. It includes the following bug fixes:

To update:
Code
# yum update kernel

Apache forensics: handling the logs using command-line kung fu

Knowing your site's audience is of paramount importance for every webmaster. There are lots of tools to aid in gaining that precious information, from simple ones like AWStats to amazingly pretty Web 2.0 creatures like Google Analytics. But sometimes all you need is a UNIX shell and some knowledge of tools like awk, grep and sed.

In most installations Apache uses the so-called "Combined" log format, which is good enough to contain most of the needed info. On most Linux distributions Apache log files are stored in /var/log/httpd/, and the one we are interested in is called access_log. The "Combined" log format is defined in the following way in the main Apache configuration file, httpd.conf:
Code
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
Looks a bit scary, but in fact it's not that awful. If you check the custom log configuration in the Apache documentation you will find all the gory details (there are also some docs on access_log itself). What's important now is that every log line consists of 9 fields. To get an idea of how it looks, let's take a peek at the actual file:
Code
10.20.30.40 - - [10/Dec/2011:02:10:05 +0300] "GET /logo.png HTTP/1.1" 200 35236 "http://some-wiki.com/Main_Page" "Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20100101 Firefox/8.0"
We are interested in a few fields only. The first item (10.20.30.40) is the IP address of the client fetching the page. The item in the square brackets is a timestamp. After the GET word there's the name of the file the user wants. The number 200 is the HTTP return code, which means 'OK' in this case (404 means 'Document not found', etc.). The URL in quotes is the so-called referer, i.e. where the user (the user's browser, actually) comes from; it can be either your own site or some external site like www.google.com. Finally, the last field is the User-Agent, i.e. the browser identification string, which happens to be Firefox 8 (in a couple of years it's gonna be Firefox 18, or even Firefox 24 -- well, time flies).


So, we have a huge file filled with lines like the one above. What can we get from it (besides eye floaters)? All sorts of things you never knew you could dig out of a common log! Let's start crunching! First of all, let's assign a shell variable called LOG a value pointing to your log file, so we don't have to type its name a good hundred times:
Code
 $ LOG=/var/log/httpd/access_log
Please note that the '$' at the beginning of the line is a shell prompt. It marks input lines, i.e. you need to type in everything that goes after the '$'; the '$' itself is printed by the shell. If there's no '$' sign at the beginning of a line, it is output. Now, how big is the file, how many lines (let's call them records) do we have?
Code
 $ wc -l $LOG
5217775
Pretty big, 5 million lines! And since when has it been maintained?
Code
$ head -1 $LOG | awk '{print $4, $5}'
[16/Nov/2009:09:06:28 +0300]
Here "head -1" means "give us the first line of a file" and the awk statement means "print us the fourth and the fifth fields of the line", which contains timestamps. Using similar awk statements you can get the list of IP addresses of all users who tried to access your website. I'm sure you do not want to view all the five millions of addresses flooding the screen (remember the terminal from the Matrix movie? Oh, good old days!).


I guess you would like to know only the unique IP addresses (or even the top ten of them; it's nice to know your adoring fans!):
Code
 $ awk '{print $1}' $LOG | sort | uniq
We had to sort the list of addresses coming from awk first, then use the uniq utility to omit repeated lines. Still, the list is way too long (for those curious readers who want to know exactly how long, just add "| wc -l" to the end of the above command; the number is still huge, somewhere around 1 million lines). We can sort it once again to view only the top 10 customers (their IP addresses, actually):
Code
 $ awk '{print $1}' $LOG | sort | uniq -c | sort -nr | head
Here we add the -c (count) option to the uniq command so it also outputs the number of repeated lines. Then the second sort command ("sort -nr") does a numerical (rather than alphabetical) sort in reverse order (bigger numbers first). This gives us a list of IPs and their frequencies. Finally, the head command limits the output to the first 10 lines (the default value is 10; if you want a top 25, simply use "head -25"). You can use the same kind of awk query to get the top 10 most popular files (pages) of your web site. The file name is the seventh field, so:
Code
 $ awk '{print $7}' $LOG | sort | uniq -c | sort -nr | head
Often the most popular file is /favicon.ico, the small web icon that you usually see in a browser tab :D The next thing we want is to limit the statistics to the last month only (who cares what was going on on your web site in 1812!). grep helps a lot! By adding a grep command we only allow lines that contain the string "/Nov/2011:":
Code
 $ grep '/Nov/2011:' $LOG | awk '{print $7}' | sort | uniq -c | sort -nr | head



Sometimes it's important to know the HTTP return codes. For most sites they are 200 (OK) and 404 (not found). There are other return codes as well, and it helps to know them when investigating web server problems:
Code
 $ awk '{print $9}' $LOG | sort | uniq -c | sort -nr | head
Other fairly popular codes are the 30x codes: redirections, "Document moved", etc. If everything is OK with your site, 200 is at the top of the list.

Let's check what our "most wanted" missing files are (those that many visitors tried to find and failed to):
Code
 $ awk '($9 == "404") {print $7}' $LOG | sort | uniq -c | sort -nr | head
You can squeeze many other interesting tidbits out of your logs if you know a bit of that awk-grep-sort-uniq kung fu. Actually, all the above stuff was pretty basic, just to show you how easy and simple it is. Much more sophisticated queries can be performed; one more sketch follows below. Finally, feel free to share your own Apache log parsing recipes in the comments ;)
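As one example of a fancier query (a sketch of mine, not from the original post, assuming the standard combined format shown above, where the response size lands in field $10), this sums the bytes served per day:
Code
 $ awk '{ split($4, t, ":"); day = substr(t[1], 2); bytes[day] += $10 } END { for (d in bytes) print d, bytes[d] }' $LOG
awk conveniently treats the "-" that Apache logs for empty responses as 0, so the per-day totals stay correct.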

Why is USB enabled on your server?

It was a regular Monday morning – busy as usual. Then one client popped up with a very interesting problem. His server had become unresponsive lately and had a very high load, and he was wondering why CloudLinux wasn't stopping the issue.

I quickly logged into the server (and the unresponsiveness became obvious right away) and ran top. I start with top pretty much every time someone has an "overload" issue with a server. Running top was the right thing to do this time as well, as it gave me an idea where to look next. The si value was at 70% -- something was wrong, really wrong.

si stands for the percentage of CPU used to handle software interrupts. On most servers you would rarely see si using more than 2 to 4% of CPU. A software interrupt is an asynchronous signal that needs to be handled by some code. They are normal and happen all the time. For example, software interrupts happen on each timer tick, or when the network card receives a packet of data and software needs to process that data.
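A quick, generic way to capture that CPU summary without the interactive screen (my addition, not part of the original troubleshooting) is to run top in batch mode:
# top -b -n 1 | head -5
The Cpu(s) line in that output includes the si percentage.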

My next step was to see which software interrupts were the most frequent on this system and might be causing the issue.
# cat /proc/interrupts
CPU0 CPU1
0: 1566845520 60143 IO-APIC-edge timer
1: 1 2 IO-APIC-edge i8042
8: 0 1 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
12: 1 3 IO-APIC-edge i8042
50: 226 0 PCI-MSI hda_intel
169: 0 0 IO-APIC-level uhci_hcd:usb5
209: 0 0 IO-APIC-level uhci_hcd:usb4
217: 0 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
225: 13475 111221010 IO-APIC-level uhci_hcd:usb3, ata_piix
233: 62 327366496 PCI-MSI eth0
NMI: 493362 580033
LOC: 1566919097 1566920751
RES: 27519611 15339092
ERR: 0
MIS: 0


Now, the first column is the IRQ (interrupt request) number; CPU0 and CPU1 show the number of times the interrupt was handled by each CPU. The next column is the type of the interrupt – not important in this case – and the last column lists the modules that are listening for the interrupt.

I knew that the timer could be ignored; it increments on each clock tick.
LOC stands for the local timer – it can be ignored as well.
RES stands for rescheduling interrupts – and it looked fine.

The other two IRQ numbers that were very active were 225 and 233.
225 was used by uhci_hcd:usb3 and ata_piix, while 233 was used by eth0.
This is a web server, so I expected lots of network traffic, and high IRQ activity for eth0 was normal.
IRQ 225 didn't look as good. uhci_hcd is used for USB, and ata_piix is your standard ATA hard disk. The disk activity (based on iotop and iostat output) wasn't that high, but the counter was increasing very fast. Could it be an interrupt storm caused by some conflict between the two devices?
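By the way, a simple generic trick (not from the original session) to see which counters are climbing fastest is to snapshot /proc/interrupts twice, a second apart, and compare:
# cat /proc/interrupts > /tmp/irq.1; sleep 1; cat /proc/interrupts > /tmp/irq.2
# diff /tmp/irq.1 /tmp/irq.2
Whatever lines changed between the snapshots are the IRQs that fired during that second.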

Well, USB is not needed on a web server, so it was easy to test.
# rmmod uhci_hcd ohci_hcd ehci_hcd
unloaded the USB-related modules, and now:
# cat /proc/interrupts
CPU0 CPU1
0: 1567154926 60143 IO-APIC-edge timer
1: 1 2 IO-APIC-edge i8042
8: 0 1 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
12: 1 3 IO-APIC-edge i8042
50: 226 0 PCI-MSI hda_intel
225: 13475 111239531 IO-APIC-level ata_piix
233: 62 327404778 PCI-MSI eth0
NMI: 493550 580292
LOC: 1567228505 1567230167
RES: 27526034 15340183
ERR: 0
MIS: 0


USB was no more. The system became responsive, si dropped to 3%, and the load average dropped as well. It was a conflict between USB and ATA. I added the USB modules to the blacklist so they wouldn't be loaded after a reboot. That was done by adding the following lines to /etc/modprobe.d/blacklist.conf:
# disable usb
blacklist uhci_hcd
blacklist ohci_hcd
blacklist ehci_hcd
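After the next reboot you can confirm the blacklisting worked with a quick check (my addition, not from the original post):
# lsmod | grep -E 'uhci_hcd|ohci_hcd|ehci_hcd'
No output means none of the USB host-controller modules were loaded.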


While this situation is highly unusual, and probably a sign of faulty hardware or BIOS, it raised an interesting question: should USB be disabled on the server? For most web servers USB is not in use* anyway. Is there harm in having those modules loaded? First of all, they take up a little memory. Also, on some motherboards (like in this example) they might share the same IRQ number with another device, and that is bad. It means that each time such an interrupt happens, both interrupt handlers wake up and try to decide which one will handle it. That wastes CPU cycles. It might not be as bad as in this case (as this one was caused by some hardware issue), but it is still a waste.

It makes a lot of sense to disable any hardware not in use – it might give some extra breathing space for the server.


* If you use KVM, it might use USB for console access; disabling USB is not recommended in this case.

Why CloudLinux is super stable, or standing on the shoulders of giants

One of the qualities of CloudLinux that we target and care about is stability. We surely understand that our customers' profits greatly depend on the quality of service (QoS) they can provide. Users expect their web sites to be up and running no matter what. Every single minute of site unavailability increases the site owner's frustration, and that eventually converts into bad reviews and customers running off while screaming loudly. Yes, you should worry now, because they run away from you to that other hosting service provider, and they are taking their money with them!



While downtime is probably inevitable (even in extreme cases, such as on a spaceship, where it could cost billions of dollars, or in an on-line trading system, where it could easily cost much more than that), it should happen as rarely as possible. There are many factors to that, some of which are out of CloudLinux's control, such as data center reliability. That includes not only obvious things such as uninterruptible power supplies and well-connected networking, but also things such as good air conditioning. The author once saw a server room disaster caused by a broken A/C unit that caught fire, transforming its plastic parts into gross amounts of black smoke and ash and blowing the resulting products right into the room, thanks to a huge and fast fan which, ironically, was still working perfectly. An experience not easily forgotten, I must say!

So what can CloudLinux do to help increase that famous number of nines metric? Two simple things:
  1. Protect those web sites from each other, by wisely distributing available hardware resources between them.
  2. Be stable and secure, immune to attacks and exploits, and do not crash.
The first item looks pretty complicated, and it is actually the very core of CloudLinux technology. It deserves a few blog entries, or maybe even a book (not a thin one at all!). What I'd like to talk about now is the second item (stability, security and no exploits).



In theory, one can write correct and bug-free software. In practice, it's just as impossible as flying (wake up, Neo. The Matrix has you). Software stability is the result of an endless battle between developers fixing bugs and the same developers adding more bugs. Well, they like to refer to those as "features" rather than "bugs", but it is a truth universally acknowledged that features and bugs come bundled together. That is why every respectable software development cycle has a certain phase called "feature freeze", during which only fixes are added, not new features.

Sometimes this phase runs in parallel with development; that is, some developers continue to add more stuff, while others cherry-pick bug fixes from that stream. This is exactly how the -stable branches work in the mainline Linux kernel: after releasing a certain kernel version (say 3.1) the developers keep on bashing away at the next one (3.2), while people like Greg Kroah-Hartman collect bug fixes and periodically release stable kernels like 3.1.1, 3.1.2 and so on.
Then Linux vendors do the same thing, branching their kernels off a specific mainline kernel version and adding more and more bug and security fixes. One of the vendors that is particularly good at doing that is Red Hat. With their Enterprise Linux kernels, they usually take a kernel and then marinate it for at least six months, doing testing and fixing. The result? A kernel that is much more stable than the mainline one.

Thanks to the open source model, CloudLinux stands on the shoulders of Red Hat. What we do is take RHEL6 (Red Hat Enterprise Linux, version 6) kernels and put our stuff on top of those kernels. This is a way to improve stability and security. What's more, it lets us concentrate on our real job: providing a good platform for shared hosting, leaving the complex job of maintaining a stable and secure kernel to the excellent kernel team at Red Hat. Improve your servers' stability. Go CloudLinux!

New Kernel for CloudLinux 6.x: 2.6.32-231.17.1.lve0.9.16 moved to production


New kernel 2.6.32-231.17.1.lve0.9.16 has been moved to production. It includes the following bug fixes:

  • Rolled back aacraid driver for some newer adaptec raid cards
  • Rebase to 131.17.1 rh6 update (security, bug fixes, enhancements; RHSA-2011-1350)
  • Beancounters with leaked resources are held for investigation
  • Reworked per CT odirect and fsync control
  • Reworked cpustats to use ktime
  • Fixed kthreads exit crash
  • Allow for ioprio set in CT
  • Make oom dmesg message more verbose
  • Fixes in dcache acct, ioacct, ve, nfsd, cpt
  • Cleanups in venet, veth, nfs
  • Stability fixes

To install:
Code
# yum install kernel-2.6.32-231.17.1.lve0.9.16 



New beta version of memmonitor 0.5-1

The new version will log all processes currently in memory when there is not enough memory. This should provide better information on what is causing the server to overload.
To install:
Code
# yum install memmonitor --enablerepo=cloudlinux-updates-testing
To update:
Code
# yum update memmonitor --enablerepo=cloudlinux-updates-testing

More info on memmonitor: http://www.cloudlinux.com/docs/memmonitor.php

Help Us Promote CloudLinux

We could use your help. We’d like you to talk with your server provider: tell them that you are using CloudLinux and that it would be great if they offered CloudLinux OS as a pre-installed option. There are lots of advantages for them, and for you as well. We know that the majority of CloudLinux users are strong advocates for our operating system, so we thought, who better to talk with them than you! We have over 100 data center partners offering CloudLinux pre-installed, but yours may not be one of them. Your data center may not even know that you are using CloudLinux, or of any of the stability benefits it offers. Please tell them.
Thanks, in advance, for your help.

Beta: LVE Manager for ISPmanager

A beta version of LVE Manager for ISPmanager is available. It provides a web UI to manage, monitor and adjust LVE settings for the ISPmanager control panel.
More info can be found here: http://www.cloudlinux.com/docs/ispmanager_lvemanager.php



New Beta Kernel for CloudLinux 6.x: 2.6.32-231.17.1.lve0.9.15

New kernel 2.6.32-231.17.1.lve0.9.15 is available from our cloudlinux-updates-testing repository. It includes the following bug fixes:
  • Rolled back aacraid driver for some newer adaptec raid cards
  • Rebase on 131.17.1 rh6 update (security, bug fixes, enhancements; RHSA-2011-1350)
  • Beancounters with leaked resources are held for investigation
  • Reworked per CT odirect and fsync control
  • Reworked cpustats to use ktime
  • Fixed kthreads exit crash
  • Allow for ioprio set in CT
  • Make oom dmesg message more verbose
  • Fixes in dcache acct, ioacct, ve, nfsd, cpt
  • Cleanups in venet, veth, nfs
  • Fix in ext4 that caused some apps to hang
To install:
Code
# yum install kernel-2.6.32-231.17.1.lve0.9.15 --enablerepo=cloudlinux-updates-testing


If you have a PAE or Enterprise kernel -- use the corresponding prefix, like kernel-PAE or kernel-ent, instead of kernel



    New Kernel for CloudLinux 5.x: 2.6.18-374.3.1.el5.lve0.8.44

    New kernel 2.6.18-374.3.1.el5.lve0.8.44 is available. It includes the following bug fixes and new features:
    • Rebased to upstream kernel 028stab094.3 including security and bug fixes RHSA-2011:1212
    • Kernel panic fix for lve_list_next race condition
    • Rebase to upstream 28stab093.2 kernel (security and bug fixes RHSA-2011:1065)
    • Ability to change NCPU on the fly
    • Ability to destroy LVE on the fly (via API)
    • Two modes to calculate load averages with ability to switch between them
    • Fix for OOM/hanged task issue
    • IO Priorities
    Code
    # yum install kernel

    If you have a PAE, Xen or Enterprise kernel -- use the corresponding prefix, like kernel-PAE, kernel-xen or kernel-ent, instead of kernel

    To change NCPU on the fly, use:
    Code
    # lvectl set LVE_ID --ncpu N --force
    --force causes the NCPU change to take effect on the fly. Please don't use that option with kernels prior to lve0.8.42, as it can crash the system.

    Load Averages
    CloudLinux has a modified way to calculate load averages, as processes can wait on CPU because of LVE limits and not because of a lack of CPU resources. Previously, our LA algorithm ignored uninterruptible processes. You can now switch to another LA algorithm that accounts for uninterruptible processes by running:

    Code
    sysctl -w kernel.full_loadavg=1


    Switching to that mode will cause higher load averages during intervals of high IO activity. This should be useful on cPanel servers that have high IO wait without a high load average.

    You can always switch back by running:
    Code
    sysctl -w kernel.full_loadavg=0
    IO Priorities

    While we are still planning to release IO limits by the end of this year, this release introduces IO priorities. Each LVE has an IO priority of 100 by default (the highest possible).
    You can lower that priority, causing a particular LVE to be de-prioritized IO-wise. This means that if there is a lack of resources on the server, that LVE will get fewer IO operations than other LVEs.
    IO priorities work only with the CFQ IO scheduler.
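    Since IO priorities only take effect under CFQ, it may be worth checking which scheduler a disk is actually using; a quick check (sda here is just an example device name):
    Code
    # cat /sys/block/sda/queue/scheduler
    The active scheduler is the one shown in square brackets.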



    Beta: New Kernel for CloudLinux 5.x: 2.6.18-374.3.1.el5.lve0.8.44

    New kernel 2.6.18-374.3.1.el5.lve0.8.44 is available from our cloudlinux-updates-testing repository. It includes everything from the lve0.8.43 kernel as well as:
    • Rebased to upstream kernel 028stab094.3 including security and bug fixes RHSA-2011:1212
    Code
    # yum install kernel-2.6.18-374.3.1.el5.lve0.8.44 --enablerepo=cloudlinux-updates-testing
    If you have a PAE, Xen or Enterprise kernel -- use the corresponding prefix, like kernel-PAE, kernel-xen or kernel-ent, instead of kernel

    Beta: InterWorx LVE Manager plugin 0.1-2

    The LVE Manager plugin for InterWorx is available. It fixes the issue where resellers were able to see LVE Manager.
    To update:
    Code
    # yum update interworx-lvemanager --enablerepo=cloudlinux-updates-testing
    
    Special thanks to the InterWorx team for providing the patch.


    New kernel for CloudLinux 6.1: 2.6.32-231.12.1.lve0.9.13

    New kernel 2.6.32-231.12.1.lve0.9.13 for CloudLinux 6.1 is available. Changes since the previous version:
    To update, please use:
    # yum update kernel

    New Beta Kernel for CloudLinux 5.x: 2.6.18-374.el5.lve0.8.43

    New kernel 2.6.18-374.el5.lve0.8.43 is available from our cloudlinux-updates-testing repository. It includes everything from the lve0.8.42 kernel, as well as a fix for a bug that has been present since the lve0.8.42 kernel:
    • Fix for lve_list_next race condition
    # yum install kernel-2.6.18-374.el5.lve0.8.43 --enablerepo=cloudlinux-updates-testing
    If you have a PAE, Xen or Enterprise kernel -- use the corresponding prefix, like kernel-PAE, kernel-xen or kernel-ent, instead of kernel




    New Beta Kernel for CloudLinux 5.x: 2.6.18-374.el5.lve0.8.42

    New kernel 2.6.18-374.el5.lve0.8.42 is available from our cloudlinux-updates-testing repository. The kernel brings changes from the upstream kernel, as well as a number of new features and improvements, including the ability to switch the number of cores per LVE without a reboot, and IO priorities. IO limits will be implemented in future versions.
    • Rebase to upstream 28stab093.2 kernel (security and bug fixes RHSA-2011:1065)
    • Ability to change NCPU on the fly
    • Ability to destroy LVE on the fly (via API)
    • Two modes to calculate load averages with ability to switch between them
    • Fix for OOM/hanged task issue
    • IO Priorities
    • fs.proc_super_gid was added to specify a group of users that can see all processes (see the sketch after the install note below)
    # yum install kernel-2.6.18-374.el5.lve0.8.42 --enablerepo=cloudlinux-updates-testing
    If you have a PAE, Xen or Enterprise kernel -- use the corresponding prefix, like kernel-PAE, kernel-xen or kernel-ent, instead of kernel
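    A hypothetical usage sketch for the new sysctl (the gid value 1000 is only an illustration; use the group you actually want):
    Code
    # sysctl -w fs.proc_super_gid=1000
    Add the same setting to /etc/sysctl.conf if you want it to survive a reboot.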

    To change NCPU on the fly, use:
    # lvectl set LVE_ID --ncpu N --force
    --force causes the NCPU change to take effect on the fly. Please don't use that option with kernels prior to lve0.8.42, as it can crash the system.

    Load Averages
    CloudLinux has a modified way to calculate load averages, as processes can wait on CPU because of LVE limits and not because of a lack of CPU resources. Previously, our LA algorithm ignored uninterruptible processes. You can now switch to another LA algorithm that accounts for uninterruptible processes by running:
    sysctl -w kernel.full_loadavg=1
    Switching to that mode will cause higher load averages during intervals of high IO activity. This should be useful on cPanel servers that have high IO wait without a high load average. You can always switch back by running:
    sysctl -w kernel.full_loadavg=0

    IO Priorities
    While we are still planning to release IO limits by the end of this year, this release introduces IO priorities. Each LVE has an IO priority of 100 by default (the highest possible). You can lower that priority, causing a particular LVE to be de-prioritized IO-wise. This means that if there is a lack of resources on the server, that LVE will get fewer IO operations than other LVEs. IO priorities work only with the CFQ IO scheduler.

    CloudLinux OS 6.1 Released

    I am happy to announce the release of CloudLinux OS 6.1. The new version comes with a 2.6.32 kernel and updated Apache & PHP packages, and brings us in line with the EL 6 distribution. All features available in CloudLinux OS 5.x should be available in CloudLinux OS 6.1.
    New ISO images with CloudLinux OS 6.1 are available, and the conversion scripts are compatible with CentOS 6.0 and RHEL 6.1. You can find the images and conversion scripts here:

    http://www.cloudlinux.com/downloads/


    lve-stats 0.6-7 fixes recent CPU 100% utilization issue

    We recently introduced an issue where CPU usage would show up as 100% in lve-stats output & resource usage plugins, even though the account wasn't using that much. The issue only affected the reported usage; no accounts were wrongly restricted.

    lve-stats 0.6-7 was released to fix the issue. To update, please run:


    Code
    # yum clean all
    # yum update lve-stats



    New kernel for CloudLinux 6.1: 2.6.32-231.6.1.lve0.9.8

    New kernel 2.6.32-231.6.1.lve0.9.8 is available. The kernel fixes the issue with low default shared memory limits inside LVE that was causing problems for eAccelerator.

    To update:
    Code
    # yum update kernel

    CageFS 2.0 second beta

    We have made several improvements and bug fixes for CageFS:
    • exim.cfg: removed /var/log/exim_mainlog, /var/log/exim_paniclog, /var/log/exim_rejectlog
    • cpanel.cfg: added templates for "suspended account" page
    • cpanel.cfg: added symlink /usr/lib/php.ini (fixing the bug where no php.ini was loaded)
    • cpanel.cfg: added /usr/local/safe-bin
    • added users nobody, mysql
    • php.cfg: added /usr/local/bin/php (CLI version)
    • cagefsctl: added detection of mount point /var/run/pgsql
    • added pgsql-client.cfg

    To update cPanel servers:
    # yum update cpanel-cagefs cagefs --enablerepo=cloudlinux-updates-testing
    # cagefsctl --update

    To update servers running RPM based control panels:
    # yum update cagefs --enablerepo=cloudlinux-updates-testing
    # cagefsctl --update

    The updated version is 2.0-20.

    CloudLinux 6.1 Beta Released

    We are happy to announce CloudLinux 6.1 Beta. This release brings us in line with RHEL 6.x and CentOS 6.x. The standard conversion scripts (centos2cl or cpanel2cl) were upgraded to handle conversion of RHEL/CentOS 6.x servers, and ISO images for CloudLinux 6.1 are available for download at http://www.cloudlinux.com/downloads

    CloudLinux 6.1 comes with a new kernel based on 2.6.32. The new kernel is more efficient than the one available in CloudLinux 5.x. The new version will no longer create migration threads and, in most cases, should outperform CloudLinux 5.x.

    There are a few known issues with CloudLinux 6.1:
    • lve-stats doesn't work - should be fixed within days
    • CageFS and MySQL governor are not yet available on CloudLinux 6.1 - we plan to introduce compatible versions in a few weeks
    Everything else should be fully functional. We are waiting for our upstream kernel (OpenVZ) to be moved to 'stable' from 'beta', and we should announce a stable release of CloudLinux 6.1 soon after. We expect this within the next 30 days.

    Please try CloudLinux 6.x in your environments. A license is required, so please feel free to register another account/request another trial license if needed.

    CageFS 2.0 public beta released

    I am happy to announce the public beta of CageFS 2.0 (previously known as SecureLVE). CageFS is compatible with cPanel, as well as the majority of RPM-based control panels. DirectAdmin support is coming soon.

    CageFS is a virtualized file system and a set of tools to contain each user in its own 'cage'. Each customer will have their own fully functional CageFS, with all the system files, tools, etc.

    The benefits of CageFS are:
    • Only safe binaries are available to the user
    • The user will not see any other users, and has no way to detect the presence of other users & their usernames on the server
    • The user will not be able to see server configuration files, such as Apache config files
    • At the same time, the user's environment is fully functional, and the user should not feel restricted in any way. No adjustments to the user's scripts are needed.
    CageFS will limit any script execution done via:
    • Apache (suexec, suPHP, mod_fcgid, mod_fastcgi)
    • LiteSpeed Web Server
    • Cron Jobs
    • SSH
    • Any other PAM enabled service (requires additional configuration)

    Note: mod_php is not supported; MPM ITK requires a custom patch.

    Compared to SecureLVE, CageFS has the following improvements:
    • No changes to /etc/passwd file, no longer requires custom shell
    • Support for any PAM enabled service
    • Enable All/Disable All modes with white listing
    • Single binary to control all CageFS operations
    • cPanel support
    • Faster & better skeleton update procedures
    • Prefixes used in /var/cagefs to scale better in environments with a large number of customers
    • Namespaces for better security
    • Improved skeleton configuration via multiple config files
    • Automatic mount point file generation
    • Numerous other bug fixes and performance improvements

    CageFS documentation and installation procedures can be found here: http://www.cloudlinux.com/docs/cagefs

    If you are using SecureLVE, please, contact our support and we will help you to upgrade to CageFS.

    The confusion of CloudLinux memory limits

    The memory limits in CloudLinux are confusing at best. First of all, they count virtual memory allocated by processes, instead of physical memory. And virtual memory use can be much higher, as Linux is very efficient at using the same physical memory for multiple processes. We plan to add physical memory limits in the future – yet this is not the only issue with memory limits.

    No matter whether we limit physical or virtual memory, there will always be some guesswork in detecting whether a script error was due to the memory limit, or due to permissions, configuration errors or errors in the script itself. Such errors are the primary reason we ship CloudLinux with memory limits disabled by default. Memory limits are useful and can often save a server from overloading, swapping & going down. Yet they can also introduce errors that most sysadmins don't connect to memory limits right away.

    When software (such as the PHP interpreter or the mod_fcgid daemon) tries to allocate memory from the system, LVE can prevent that from happening. It does it the same way the OS would when there is not enough memory. Most applications, when they try to allocate memory and fail, will fail as well. It looks pretty much as if the application failed due to a bug or some other error. The distinction is very small, and usually comes as part of a cryptic error message and a strange exit code. When it comes to a website, such errors usually show up as error 500 – which means that the script used to serve the request failed due to some error. In this case it usually means that the PHP interpreter failed (the same way it would fail on a bad PHP script). Basically – PHP or some other component fails, for whatever reason, and error 500 is served. Not much for CL to do here.

    Sometimes it gets even worse. Recently we got a customer who complained about mail() not working in a PHP script. It was working before, but it stopped working after CloudLinux was installed. We knew that CloudLinux 'never' does something like that, and were totally baffled. It was a verifiable error. Running a PHP script that tried to send email would come back with:
    Quote
    Warning: mail() [function.mail]: Could not execute mail delivery program '/usr/sbin/sendmail -t -i'

    Switching back to the CentOS kernel solved the problem (it disables LVE). It took us some time to stumble upon the fact that it might be memory limits. Once we did, it took a minute to verify it. There was enough memory to run the PHP interpreter, but not enough for sendmail to run on top of it. Hence sendmail would fail, and PHP would deliver that message. Increasing the memory limit removed the issue. There is an easy way to figure out if an issue relates to memory limits. All you need to do is run:
    Code
    # lveinfo --by-fault=mem --display-username


    If you see the user whose script failed in the list, it means that some script for that user hit the memory limit within the past 10 minutes. Run the script again, re-check lveinfo (note, it takes a minute for it to update) – and you know for sure. The same information can be taken from /proc/lve/list.
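    If you'd rather not keep re-running that command by hand, here is a tiny sketch (my own addition) that polls the same lveinfo query once a minute so you can catch the fault right after re-running the script:
    Code
    # while true; do lveinfo --by-fault=mem --display-username; sleep 60; done
    Press Ctrl+C to stop it once the user shows up in the output.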

    Of course this is not enough, and we plan to do more. We want to create a sophisticated notification system, so that both the admin & the user are notified when memory limits are reached. Additionally, we are researching the possibility of detecting at run time, at the web server level, when one of the processes used to serve the request hit the memory limit – and whether we can intervene & serve our own error message in such cases. We are still researching it – and if it is possible, it would be a nice way to take out the confusion.

    Beta: MySQL Governor v0.5-7

    A new beta version of MySQL governor is out. This version does additional performance optimization when retrieving the user list.

    To update:
    Code
    # yum update db-governor db-governor-mysql --enablerepo=cloudlinux-updates-testing

    New Kernel: 2.6.18-338.19.1.el5.lve0.8.36

    New kernel 2.6.18-338.19.1.el5.lve0.8.36 is available now. The kernel brings changes from the upstream kernel (OpenVZ 028stab092.2), as well as improved load average reporting. The kernel has the following bug fixes and enhancements:
    • Rebase on 238.19.1 rhel5.6 update (security and bug fixes RHSA-2011:0927-1 )
    • DRBD is compiled in xen x86_64 kernels
    • OpenVZ kernel related bugfixes (more)
    • Improved load averages
    In previous kernel versions, load averages wouldn't include processes in the D state (uninterruptible state). Those are processes that are waiting on IO. This kernel fixes the issue. You can switch back to the old behavior by running:

    sysctl -w kernel.full_loadavg=0


    or by adding it to /etc/sysctl.conf and running sysctl -p
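    For example, a minimal way to make the old behavior persistent (a sketch; adjust to taste):
    Code
    # echo "kernel.full_loadavg = 0" >> /etc/sysctl.conf
    # sysctl -p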


    To install new kernel:
    Code
    # yum install kernel-2.6.18-338.19.1.el5.lve0.8.36


    If you have a PAE, Xen or Enterprise kernel -- use the corresponding prefix, like kernel-PAE, kernel-xen or kernel-ent, instead of kernel
