Software Engineer's Notes: 2013

25 July 2013

Fixing PHP start-up error: «Unable to load dynamic library»

Suppose you have a PHP extension compiled as the shared library ext.so, and it depends on the sockets extension (compiled as sockets.so). PHP must therefore load sockets.so before ext.so. Sometimes it doesn't, and then running any PHP script produces a startup error similar to the following (here the dependent extension is event.so):

/usr/lib64/php/modules/event.so: undefined symbol: php_sockets_le_socket in Unknown on line 0

PHP (5.5.0 was the latest stable version at the time of writing) loads extensions in the order in which it reads them from php.ini, top to bottom. So the order of the extension= lines in php.ini determines the order in which the shared libraries are loaded. Most GNU/Linux distributions build PHP with the --with-config-file-scan-dir= configure option, which specifies a directory where PHP looks for additional configuration files. Usually this directory contains one file per extension, and these files are loaded in alphanumeric order. Our ext.ini is therefore read before sockets.ini, so ext.so is loaded before sockets.so, which is exactly what we don't want!

People in the PHP Group have promised to work on this issue. As a temporary fix, rename the ini files in the config file scan directory so that they sort in dependency order. For instance, rename ext.ini to z-ext.ini.
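A minimal shell sketch of the rename follows. It uses a throwaway directory in place of the real scan directory, whose path varies by distribution (`php --ini` prints it). The file names ext.ini and sockets.ini are the ones assumed in this post, and `ls` under the C locale is used only to illustrate the sort order.

```shell
#!/bin/sh
# Sketch only: a temporary directory stands in for the real scan directory.
export LC_ALL=C   # fixed collation so the listing order is predictable
scan_dir=$(mktemp -d)
echo 'extension=ext.so'     > "$scan_dir/ext.ini"
echo 'extension=sockets.so' > "$scan_dir/sockets.ini"

# Before: ext.ini sorts (and therefore loads) ahead of sockets.ini
order_before=$(ls "$scan_dir" | paste -s -d ' ' -)

# The workaround: give the dependent extension a name that sorts last
mv "$scan_dir/ext.ini" "$scan_dir/z-ext.ini"

# After: sockets.ini comes first
order_after=$(ls "$scan_dir" | paste -s -d ' ' -)

echo "before: $order_before"
echo "after:  $order_after"
rm -r "$scan_dir"
```

On a real system the only step that matters is the `mv`, run as root inside the scan directory.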

01 June 2013

How to factory reset Samsung Galaxy Note 2

I spent a day scratching my head trying to reset my phone to factory settings. The common instructions didn't help, so I'd like to share a very simple way to reset a Samsung Galaxy Note 2. It should work for other similar handsets, too.

The problem

A Samsung Galaxy Note 2 GT-N7100 handset is locked. Neither a SIM card nor Internet access is available. There is no way to reach the menu or settings. The only things on offer are emergency calls and a prompt to sign in to a Google account.

So the only way to factory reset is to use the button combinations mentioned in all the usual instructions, like "hold the power and volume up buttons until you see a menu with a 'factory reset' option". But on this phone the boot-up menu has only "self-test" options, with no hint of a factory reset.

The fix

If you have no 'factory reset' option in the boot-up menu, you are fully aware that the reset will remove all user data (contacts, custom apps, settings etc.), and the only thing you want is a working phone, then do the following:

Download and install the fastboot utility. The simplest way is to download the Zip file from this forum thread.

Move to the directory with fastboot and run:

$ ./fastboot -w reboot
< waiting for device >

Hold down the MENU button.
Hold down the POWER button for 10 to 20 seconds until you see the Samsung logo with the red text "fastboot mode". Once the phone has entered fastboot mode, release the buttons.

Hold down the VOLUME UP key while connecting the phone to the computer with a USB cable. The fastboot utility should now detect the phone, reset it, and reboot it:

$ ./fastboot -w reboot
< waiting for device >

erasing 'userdata'...
OKAY [  1.610s]
erasing 'cache'...
OKAY [  0.605s]
rebooting...

finished. total time: 2.219s

I did it on Gentoo x86_64. I'm not sure whether my udev rules helped. To be on the safe side, I'd put the following into the /etc/udev/rules.d/51-android.rules file:

SUBSYSTEM=="usb", ATTR{idVendor}=="1782", MODE="0666", GROUP="plugdev" 
SUBSYSTEM=="usb", ATTR{idVendor}=="22b8", MODE="0666", GROUP="plugdev"
SUBSYSTEM=="usb", ATTR{idVendor}=="0bb4", MODE="0666", GROUP="plugdev"
SUBSYSTEM=="usb", ATTR{idVendor}=="04e8", MODE="0666", GROUP="plugdev"
SUBSYSTEM=="usb", ATTR{idVendor}=="18d1", MODE="0666", GROUP="plugdev"

Don't forget to restart the udev daemon after creating/modifying the rules.
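As a rough sketch, the rules above can be generated into a scratch file and then installed by hand. The vendor IDs are the ones listed in this post. The scratch file and the commented-out install steps are illustrative: `udevadm control --reload-rules` and `udevadm trigger` are the usual way to make a running udev pick up new rules, but your distribution may prefer restarting the service instead.

```shell
#!/bin/sh
# Sketch: write the rules shown above into a scratch file for review.
vendor_ids="1782 22b8 0bb4 04e8 18d1"   # IDs taken from the rules above
rules_file=$(mktemp)

for id in $vendor_ids; do
    printf 'SUBSYSTEM=="usb", ATTR{idVendor}=="%s", MODE="0666", GROUP="plugdev"\n' "$id"
done > "$rules_file"

cat "$rules_file"

# As root, install the file and make udev re-read its rules:
#   cp "$rules_file" /etc/udev/rules.d/51-android.rules
#   udevadm control --reload-rules
#   udevadm trigger
```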

Done! :-) I hope this post saves someone's time.

28 May 2013

Example of XML parsing in Perl

The following is a one-shot Perl script I used to convert someone's XML file to CSV. As a Perl beginner, it took me a while to figure out a fast and simple way to parse XML in Perl, so it might be useful for someone else.

#!/usr/bin/perl

use strict; use warnings;
use XML::Twig;
use Text::CSV_XS;

my @headers = (
    'Picture',
    'StateID',
    'ItemTypeID',
    'Name',
    'Descript',
    'Article',
    'Price',
    'Qty',
    'ItemAvailabilityID',
    'SerialNumber',
    'Weight',
);

my %twig_handlers = (
    'Item' => \&handle_item,
);
foreach (@headers) {
    $twig_handlers{$_} = \&handle_tag;
}

my $twig = XML::Twig->new(twig_handlers => \%twig_handlers);
my $csv = Text::CSV_XS->new({sep_char => ';'});
my @columns;
my $have_picture = 0;

print join(';', @headers), "\n";
$twig->parsefile(shift @ARGV);

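# Called for every column tag once its closing tag is parsed; $_ is the element.
sub handle_tag {
    if ($_->gi eq 'Picture') {
        $have_picture = 1;
        push @columns, $_->att('url');
    } else {
        # No <Picture> was seen for this item: keep the first column empty
        if ($_->gi eq 'StateID' && !$have_picture) {
            push @columns, "";
        }
        push @columns, $_->trimmed_text;
    }
}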

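# Handlers are called when elements are closed, so by the time </Item> is
# reached all of its child tags have already been collected in @columns.
sub handle_item {
    $csv->print(\*STDOUT, \@columns);
    print "\n";

    # reset state for the next item
    @columns = ();
    $have_picture = 0;
}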

The XML was similar to the following:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Items SYSTEM "http://domain.com/xml/MerchantItems.dtd">
<Items>
  <Item>
    <Images>
      <Picture url="http://domain.com/775/7757d248aeb782e916a9a51670171ec1.jpeg"/>
    </Images>
    <StateID>1735000</StateID>
    <ItemTypeID>29490209</ItemTypeID>
    <Name><![CDATA[Lamp 2541/8]]></Name>
    <Descript><![CDATA[ ... HTML code ... ]]></Descript>
    <Article>2541/8</Article>
    <Price>7267.50</Price>
    <Qty>100</Qty>
    <ItemAvailabilityID>3719000</ItemAvailabilityID>
    <SerialNumber>92645</SerialNumber>
    <Weight>0</Weight>
  </Item>
  <Item>
    <!-- ... -->
  </Item>
</Items>

26 May 2013

Fixing Sphinx wordforms

The Sphinx wordforms feature has a drawback: the stemmer skips the "destination" words (the right-hand side of each mapping). In this post I'll show a quick and dirty way to fix a large wordforms file.

Let's assume we have a wordforms file with the following contents:

noisy > noise
noisyyy > noise

Any document containing the word noisy will be found by searching for noisyyy, and vice versa. The bad news is that the word noise is excluded from stemming: if we search for noise, we'll find only those documents that contain exactly "noise".

Sphinx 2.1.1-beta introduced the indextool --morph INDEXNAME option, which applies the index's morphology to the text given on standard input, e.g.:

$ echo 'confidence
> presence' | ~/bin/indextool --morph s3_all
Sphinx 2.1.1-beta (rel21-r3701)
Copyright (c) 2001-2013, Andrew Aksyonoff
Copyright (c) 2008-2013, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file './sphinx.conf'...
dumping stemmed results...
confid presenc

Yes, the output could be friendlier (that is, easier to parse), and indextool doesn't even support a --quiet option at the time of writing. Now we are going to replace the destination words with their stemmed variants. Here is a simple one-shot Perl script for this.

#!/usr/bin/perl
# File: stem-wordforms.pl
use strict;
use warnings;
use Fcntl; # sysopen
use File::Temp ':POSIX'; # tmpnam (POSIX::tmpnam was removed in Perl 5.26)
use IO::Seekable; # seek

sub usage () {
    print "Usage: $0 <filename> [<out_filename>]
<filename>         Path to SphinxSearch wordforms file
<out_filename>   Path to output filename 
";
}

if ($#ARGV < 0) {
    usage;
    exit;
}
my $in_filename  = $ARGV[0];
my $out_filename = $#ARGV > 0 ? $ARGV[1] : $in_filename .".out";
my $tmp_filename = tmpnam();
my $line;
my $n_in_lines = 0;
my $n_out_lines = 0;

sysopen FH_IN, $in_filename, O_RDONLY
    or die("Failed to open `$in_filename`");
sysopen FH_RES, $out_filename, O_CREAT | O_TRUNC | O_WRONLY, 0640
    or die("Failed to open `$out_filename`");
# Open pipe for indextool. We'll write destination wordforms here line by line.
# By means of `tr` command we place a word per line.
open my $fh_indextool, "|-",
"~/bin/indextool -c sphinx.conf --morph s3_all | tr ' ' '\n' > $tmp_filename"
    or (print "Failed\n" and die());

print ">> Stemming to `$tmp_filename`...";
while (<FH_IN>) {
    ++$n_in_lines;

    # Get destination word
    s/^[^\>]+\> // ;
    # Patch: for some reason Sphinx 2.1.1-beta strips `ё` character even if it
    # is declared in charset_table
    s/ё/e/g;
    $line = $_;

    print $fh_indextool $line;
}
close $fh_indextool;
print " OK\n";

print ">> Joining temporary results with the source wordforms... ";
my $got_results = 0;
my $line2;
sysopen(FH_TMP, $tmp_filename, O_RDWR, 0640)
    or die("Failed to open `$tmp_filename`");
seek FH_IN, 0, SEEK_SET;
while (<FH_TMP>) {
    $line = $_;

    if ($got_results == 0 && m/^results\.\.\./) {
        print "Got results... ";
        $got_results = 1;
        next;
    }
    $got_results or next;

    last if !($line2 = <FH_IN>);
    $line2 =~ s/[^\>]+$//;
    print FH_RES $line2 , " ", $line;
    ++$n_out_lines;
}
unlink $tmp_filename or print "Failed to unlink $tmp_filename\n";

$n_in_lines != $n_out_lines and die("* Failed! Line counts of `$in_filename` and `$out_filename` don't match!\n");

print "OK\nDone\n";

close FH_IN;
close FH_RES;
close FH_TMP;

It's not a very nice script, but hopefully it will help someone. You may want to adapt it to your configuration (the indextool path, config file and index name are hard-coded). Usage:

$ ./stem-wordforms.pl wordforms_myspell_ru_RU_UTF-8.txt out
>> Stemming to `/tmp/file7AawtG`... OK
>> Joining temporary results with the source wordforms... Got results... OK
Done
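The core idea of the script (split each line at ">", stem the right-hand side, glue the halves back together) can be sketched in a few lines of shell. This is only an illustration: stem_words is a hypothetical stand-in for `indextool --morph INDEXNAME` that merely strips a trailing "e", and the sample file is the two-line example from the beginning of this post.

```shell
#!/bin/sh
# Sketch of the split/stem/join idea. stem_words is a HYPOTHETICAL stand-in
# for the real stemmer; it only strips a trailing "e" from each line.
stem_words() { sed 's/e$//'; }

work_dir=$(mktemp -d)
printf 'noisy > noise\nnoisyyy > noise\n' > "$work_dir/wordforms.txt"

# Left-hand sides, kept up to and including the ">"
sed 's/>.*$/>/' "$work_dir/wordforms.txt" > "$work_dir/sources"

# Right-hand sides (destination words), one per line, run through the stemmer
sed 's/^[^>]*> *//' "$work_dir/wordforms.txt" | stem_words > "$work_dir/stems"

# Glue both halves back together line by line
result=$(paste -d ' ' "$work_dir/sources" "$work_dir/stems")
echo "$result"
rm -r "$work_dir"
```

With the stand-in stemmer both lines end up pointing at "nois". With the real indextool you would still have to skip its banner lines, as the Perl script does.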