Arnold Daniels

Apr 182009
 

Associated websites often share user information, so a visitor only has to register once and can use that username and password for all sites. A good example for this is Google. You can use you google account for GMail, Blogger, iGoogle, google code, etc. This is nice, but it would be even nicer if logging in for GMail would mean I’m also logged in for the other websites. For that you need to implement single sign-on (SSO).

There are many single sign-on applications and protocols. Most of these are fairly complex. Applications often come with full user management solutions. This makes them difficult to integrate. Most solutions also don’t work well with AJAX, because redirection is used to let the visitor log in at the SSO server.

I’ve written a simple single sign-on solution (400 lines of code), which works by linking sessions. This solutions works for normal websites as well as AJAX sites.
Continue reading »

Jan 092009
 

Simple replacement
String replacement is often used as a way to apply templating. You might replace “%a:test” with “~~test~~” using the regexp: %a:(\w+), replacing it with “~~$1~~”.

Trying to escape
The only problem now, is that I can’t use “%a:” any more within my string. This could be solved by allowing escaping using the backslash. In the regexp we can use a negative lookbehind to see if the character before the % isn’t a backslash: (?<!\\)%a:(\w+).

Escaping the escaping
Now we’re close, however now it’s not possible to use “\%:a” anywhere. We need to be able to escape the backslash as well. We could state the problem as needing to match %a if there isn’t an uneven number of backslashes in front of it. Checking for an uneven number in a negative lookbehind isn’t possible unfortunately, so we need to get the backslashes into the match. We can say: match 0 or more pairs of backslashes, followed by “%a:”, if there is no backslash in front of it. This results in the regexp:
(?<!\\)((?:\\{2})*+)%a:(\w+), replacing it for “$1~~$2~~”.

To finish up
To only thing is that \% and \\ will still be displayed as that. This can simply be solved with a str_replace.

Oct 312008
 

Data stored in a database is often also useful for statistical purposes. If you own a web-shop you want to be able to create a report about turnover. You can get statistical information by using GROUP BY, eg.

SELECT DATE_FORMAT(invoice.date, '%M') AS `month`, COUNT(*) AS `invoice_count`, SUM(`invoice`.`amount`) AS `turnover`
FROM `invoice`
WHERE `date` BETWEEN '2008-01-01' AND '2008-12-31'
GROUP BY MONTH(`invoice`.`date`)
month     invoice_count  turnover
January   84             9532.26
February  141            20857.61
March     91             10922.71
April     112            15044.48
May       101            9676.60
June      137            12860.88
July      281            34291.20
August    191            26377.66
September 103            16324.78
October   99             12873.23

If you are selling a wide variety of products, you might like to see the turnover for each product category. You could do this with a simple GROUP BY as:

SELECT DATE_FORMAT(`invoice`.`date`, '%M') AS `month`, `category`.`description` AS `category`, COUNT(*) AS `product_count`, SUM(`invoice_product`.`amount`) AS `turnover`
FROM `invoice` INNER JOIN `invoice_product` ON `invoice`.`id` = `invoice_product`.`invoice_id` LEFT JOIN `product` ON `invoice_product`.`product_id` = `product`.`id` LEFT JOIN `category` ON `product`.`category_id` = `category`.`id`
WHERE `date` BETWEEN '2008-01-01' AND '2008-12-31'
GROUP BY MONTH(`invoice`.`date`), `category`.`id`
month     category   product_count  turnover
January   Hardware   62             4821.31
January   Software   51             4419.41
January   Cables     12             291.54
February  Hardware   71             8408.93
February  Software   101            11726.36
February  Cables     17             312.32
February  Other      2              410.00
March     Hardware   21             2371.58
March     Software   81             8238.81
March     Cables     13             312.32
...

This would give you each category in a different row, ordered by month. Though this contains all the information the format is far from nice. Instead you would like to have 1 row per month with each category as a column as the information about the invoices as well.

SELECT DATE_FORMAT(`invoice`.`date`, '%M') AS `month`, COUNT(DISTINCT `invoice`.`id`) AS `product_count`, COUNT(*) AS `invoice_count`, SUM(`invoice_product`.`amount`) AS `turnover`,
  SUM(`product`.`category_id`=1) AS `hardware_count`, SUM(IF(`product`.`category_id`=1, `invoice_product`.`amount`, 0)) AS `hardware_turnover`,
  SUM(`product`.`category_id`=2) AS `software_count`, SUM(IF(`product`.`category_id`=2, `invoice_product`.`amount`, 0)) AS `software_turnover`,
  SUM(`product`.`category_id`=3) AS `cables_count`, SUM(IF(`product`.`category_id`=3, `invoice_product`.`amount`, 0)) AS `cables_turnover`,
  SUM(`product`.`category_id`=4) AS `other_count`, SUM(IF(`product`.`category_id`=4, `invoice_product`.`amount`, 0)) AS `other_turnover`
FROM `invoice` INNER JOIN `invoice_product` ON `invoice`.`id` = `invoice_product`.`invoice_id` LEFT JOIN `product` ON `invoice_product`.`product_id` = `product`.`id` LEFT JOIN `category` ON `product`.`category_id` = `category`.`id`
WHERE `date` BETWEEN '2008-01-01' AND '2008-12-31'
GROUP BY MONTH(`invoice`.`date`), `category`.`id`
month     invoice_count  turnover    hardware_count  hardware_turnover  software_count  software_turnover  cables_count  cables_turnover  other_count  other_turnover
January   84             9532.26     62              4821.31            51              4419.41            12            291.54           0            0
February  141            20857.61    71              8408.93            101             11726.36           17            312.32           2            410.00
March     91             10922.71    21              2371.58            81              8238.81            13            312.32           0            0
...

The big downside of this method is that you need to modify the query if a category is added. This can be solved though by dynamically creating the query in a PHP script.

If you want to do advanced statistical you should have a look at OLAP cubes. Pentaho is an open-source reporting app which supports MySQL. http://mondrian.pentaho.org/

All data is fictional. The SQL queries are untested.

Jul 292008
 

I was reading this month’s php|architect. It has a nice article about EAV modeling. I had seen this db structure in other project, but didn’t know that it was called EAV. For those who don’t read php|architect, EAV describes a method of saving a large set of attributes, only some of which apply to an individual entity. Normally you would create a table, with a row for each entity and save each attribute in a column. With EAV you save each attribute as a row.

This makes selecting the data quite tedious. If you can life with some constraints, there is an easier way to do this.
Continue reading »

Jul 112008
 

Today I started with learning erlang. I’ll keep a log on my blog with the progress. On the first I’ve read the first 2 chapters of http://www.erlang.org/course. This language doesn’t look like anything else I’ve ever programmed. So don’t expect so much of a Erlang for PHP programmers kind of tutorial, since the whole thought is different. The first chapter was history which I pretty much skipped. The second chapter is ‘Sequential Programming’. Since all I’ve ever done is sequential programming it should be fairly understandable.
Continue reading »

Jun 302008
 

Just a short one today, because I’m really busy (un)fortunately.

You can take out and check each argument at a time:

1
2
3
4
  $search = isset($_GET['search']) ? $_GET['search'] : null;
  $page = isset($_GET['page']) ? $_GET['page'] : 1;
  $limit = isset($_GET['limit']) ? $_GET['limit'] : 15;
  // etc
  $search = isset($_GET['search']) ? $_GET['search'] : null;
  $page = isset($_GET['page']) ? $_GET['page'] : 1;
  $limit = isset($_GET['limit']) ? $_GET['limit'] : 15;
  // etc

But to do it in one go, just do:

1
  $args = $_GET + array('search'=>null, 'page'=>1, 'limit'=>15);
  $args = $_GET + array('search'=>null, 'page'=>1, 'limit'=>15);

Note that in some cases it is important to filter your input, that is not done here.

May 312008
 

How to get a file extension in PHP:

1
$ext = pathinfo($file_name, PATHINFO_EXTENSION);
$ext = pathinfo($file_name, PATHINFO_EXTENSION);

How to get a file extension in Perl:

1
my $ext = ($file_name =~ m/([^.]+)$/)[0];
my $ext = ($file_name =~ m/([^.]+)$/)[0];

How to get a file extension in Ruby:

1
ext = File.extname(file_name)
ext = File.extname(file_name)

How to get a file extension in Bash:

ext=${file_name##*.}
name=${file_name%.*}

How to get a file extension in Python (thanks to Jensen):

import os
ext = os.path.splitext(file_name)[1]


Got more? Please post a comment.

Apr 212008
 

Send in by Ruben Daniels

Lustre is one of the most popular upcoming open source cluster file systems out there. When you want to run Xen’s from a SAN using Lustre you need to support both in the Linux kernel. Both XEN and Lustre are near mature products. This means there is support for it. But it’s quite difficult to find the right source and to combine it with the right kernel source of each. It took me a week of trial and erroring until I found a combination that worked. Since Google wasn’t much help I wrote this article so it might help you. This installation is Ubuntu Gutsy specific. You can start out with a basic Gutsy installation. Hardy is getting Lustre support, but at the time of this writing the package doesnt match the default kernel of Hardy.

First we apt-get some utilities and Xen packages

apt-get install quilt libc6-xen libxen3.1 linux-image-xen linux-restricted-modules-xen linux-xen python-xen-3.1 xen-docs-3.1 xen-hypervisor-3.1 xen-image-2.6.19-4-server xen-ioemu-3.1 xen-tools xen-utils-3.1

Now we get the kernel from kernel.org

cd /usr/src
wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.22.tar.gz
tar -zxvf linux-2.6.22.tar.gz

The best matching (and stable) lustre source I’ve found is in the Hardy Lustre source package. So we get it and extra it manually:

wget http://ubuntu2.cica.es/ubuntu/ubuntu/pool/universe/l/lustre/lustre-source_1.6.4.2-1_all.deb
dpkg -x lustre-source_1.6.4.2-1_all.deb lustre-1.6.4.2
bunzip2 lustre-1.6.4.2/usr/src/lustre.tar.bz2
tar -xvf lustre-1.6.4.2/usr/src/lustre.tar
mv lustre-1.6.4.2/usr/src/modules/lustre/* lustre-1.6.4.2
rm -R lustre-1.6.4.2/usr -R

The official XEN kernel version is 2.6.18. This is way too old for newer machines. So we’ll get the Gutsy kernel source which contains XEN patches for our 2.6.22 kernel and apply them:

apt-get source linux-source-2.6.22
cd /usr/src/linux-source-2.6.22-2.6.22
cd debian/binary-custom.d/xen/patchset
patch -p1 /usr/src/linux-2.6.22 < 001*
patch -p1 /usr/src/linux-2.6.22 < 002*
patch -p1 /usr/src/linux-2.6.22 < 003*

Now we apply the lustre patches:

cd /usr/src/linux-2.6.22
ln -s /usr/src/lustre-1.6.4.2/lustre/kernel_patches/series/2.6.22-vanilla.series series
ln -s /usr/src/lustre-1.6.4.2/lustre/kernel_patches/patches patches
quilt -av push

Make the Kernel:

make distclean
cp debian/binary-custom.d/xen/patchset/config.i386 .
make menuconfig
make
make install modules_install

If you use kernel modules you need to run the following command to create the ram fs.

mkinitramfs -o /boot/initrd.img-2.6.22 2.6.22

Add the kernel to grub’s menu.lst. You could do this manually, but let’s just use the ubuntu script.

update-grub

Reboot the machine. After reboot build Lustre:

cd /usr/src/lustre-1.6.4.2
./configure --with-linux=/usr/src/linux-2.6.22
make
make install

Reboot the machine again. You should now have a working lustre/xen kernel.