wiki:QemuGiftServer

Installing GNU Gift on QemuOemrServer

Assumptions

We're going to assume:

  • you're using a virtual machine built to the specifications of the FAI configuration maintained by user juri_ at gitorious.org

Preparing Source

Following the directions on Savannah, use CVS to download the newest version:

mkdir -p /home/gift/gift/src/orig/
cd /home/gift/gift/src/orig/
cvs -z3 -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/gift co gift
mv gift gnuift-0.1.14

Create an original tarball, for building the Debian package later:

tar -cvzf gnuift_0.1.14.orig.tar.gz gnuift-0.1.14/
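The packaging steps later on extract this tarball and cd into gnuift-0.1.14, so it is worth confirming that the tarball unpacks into exactly that one directory. A minimal sketch, assuming a helper named tarball_top_level (a made-up name, not part of any package):

```shell
#!/bin/sh
# Hypothetical helper: print the distinct top-level entries of the
# gzipped tarball given as $1.
tarball_top_level() {
    # list every member, keep only the part before the first slash,
    # then collapse duplicates
    tar -tzf "$1" | sed 's=/.*==' | sort -u
}

# Usage (run in /home/gift/gift/src/orig/):
#   tarball_top_level gnuift_0.1.14.orig.tar.gz
```

If the helper prints anything other than the single name gnuift-0.1.14, recreate the tarball before going on.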

Building from Source

Generate files required to build:

cd /home/gift/gift/src/orig/gnuift-0.1.14
./bootstrap-cvs.sh

Set up the build system:

./configure

Build Gift:

make

Building a Debian Package

Now that you have built the tree, let's get the tools to build a Debian package out of our source tree.

mkdir -p /home/gift/gift/src/debian_package/
cd /home/gift/gift/src/debian_package/
git clone git://gitorious.org/gnu-gift-debian-package/gnu-gift-debian-package.git debian
tar -cvzf gnuift_0.1.14.debian.tar.gz debian/

Copy in the tarball we created earlier:

cp /home/gift/gift/src/orig/*.tar.gz /home/gift/gift/src/debian_package/

Extract Gift, then add the debian control directory:

tar -xzf gnuift_0.1.14.orig.tar.gz
cd gnuift-0.1.14
tar -xzf ../gnuift_0.1.14.debian.tar.gz

Now, build the package:

debuild

Installing Your New Debian Packages

Since image indexing takes twice(!!) as long with the version packaged in Debian as with the packages we just built, make sure to install the packages you just created.

sudo dpkg -i /home/gift/gift/src/debian_package/*.deb

Obtaining Sample Image Data

I grabbed my sample image data from The Open Clip Art Project.

OpenClipArt 0.19

Specifically, I downloaded their 0.19 release, then ran the following commands to move all the .png files into one flat directory for Gift to index:

cd /home/gift/
wget http://www.openclipart.org/downloads/0.19/openclipart-0.19.tar.bz2
tar -xjf openclipart-0.19.tar.bz2
mkdir -p gift/openclipart-0.19/
find openclipart-0.19 -name '*.png' -exec cp -a {} gift/openclipart-0.19/ \;

OpenClipArt 2.0

Download the 2.0 release, then move all the .png files into one flat directory for Gift to index:

cd /home/gift/
wget http://www.openclipart.org/downloads/2.0/openclipart-2.0-full.tar.bz2
tar -xjf openclipart-2.0-full.tar.bz2
mkdir -p gift/openclipart-2.0/
find openclipart-2.0-full -name '*.png' -exec cp -a {} gift/openclipart-2.0/ \;
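Copying a whole tree into one flat directory silently overwrites files that share a name. The following is a minimal sketch for spotting such collisions before copying, for either release; the helper name list_png_name_collisions is made up for illustration and is not part of Gift or Open Clip Art:

```shell
#!/bin/sh
# Hypothetical helper: print every .png file name that occurs more than
# once anywhere under the directory tree given as $1.
list_png_name_collisions() {
    # strip the directory part of each path, then report repeated names
    find "$1" -type f -name '*.png' | sed 's=.*/==' | sort | uniq -d
}

# Usage (run in /home/gift/):
#   list_png_name_collisions openclipart-0.19
```

Any name it prints exists in more than one subdirectory, so only one of those files will survive in the flat directory.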

Indexing your Collection

To index your collection, first copy in a default configuration file, then run gift-add-collection with the full path of the images you want to import.

The gift packaged in lenny and the current CVS version both fail to create a config file. Copying the default file into place before running gift-add-collection.pl works around the problem.

cp /usr/share/libmrml1/gift-config.mrml /home/gift/
gift-add-collection.pl /home/gift/gift/openclipart-*/

Now wait a long, long time.

Speeding this up

Downloading and installing the new version from CVS results in this step taking half(!) as long.

In addition, you can create the thumbnails using a separate script, at the same time. gift-add-collection.pl will skip generating a thumbnail if you create them for it either ahead of time, or while it is running.

NOTE: this script is maintained in CVS, and the version in this web page may become out-of-date.

gift-generate-thumbnails.sh

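#!/bin/sh

# first argument: path of directory containing images to thumbnail

# dirname/basename strips any trailing slash from the argument
target=`dirname "$1"`/`basename "$1"`
thumbnail_dir=${target}_thumbnails

echo "converting images in $target, placing them in $thumbnail_dir."

# make sure the thumbnail directory exists before writing into it
mkdir -p "$thumbnail_dir"

# list the files directly inside $target, names only, one per line
find "$target" -maxdepth 1 -type f | sed "s=\(.*\)/==" | while IFS= read -r each; do
    # replace the last dot with _thumbnail_ (foo.png becomes foo_thumbnail_png)
    convname=`echo "$each" | sed "s/\(.*\)[.]/\1_thumbnail_/"`
    # skip images that already have a thumbnail
    if [ ! -f "$thumbnail_dir/${convname}.jpg" ]; then
        echo "converting $each"
        convert -geometry 128x128 -quality 100 "${target}/${each}" "$thumbnail_dir/${convname}.jpg"
    fi
done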
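The naming rule the script applies to each image can be tried in isolation. This sketch wraps the same sed expression in a function; the name thumbnail_name is an assumption made for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the thumbnail base name the script above
# derives from an image file name (the script then appends ".jpg").
thumbnail_name() {
    # the greedy \(.*\) makes the expression replace only the LAST dot
    echo "$1" | sed "s/\(.*\)[.]/\1_thumbnail_/"
}

thumbnail_name foo.png      # prints foo_thumbnail_png
thumbnail_name a.b.png      # prints a.b_thumbnail_png
```

So foo.png in openclipart-0.19/ ends up as openclipart-0.19_thumbnails/foo_thumbnail_png.jpg. To generate thumbnails while indexing is running, the script can be started in a second terminal with the same image directory as its argument.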

Fixing Problems

Failure to add Collection ID

PROGRESS: 99%


Copying /home/gift/gift-config.mrml to /home/gift/gift-config.mrml-old

XML::DOM::Attr=ARRAY(0x9e06b78)
Can't locate object method "getAttribute" via package "XML::DOM::Attr" at /usr/bin/gift-add-collection.pl line 855, <LOCALELIST> line 274.
----> collection-id c-59-50-8-18-9-111-2-290-0  <----

This happened after a 'successful' run where the mergesort had gone sideways. I had removed the gift data directory and forgotten to restore gift-config.mrml to the default.

To fix it: since the run had already written this file out with the wrong collection ID, it is merely necessary to find the collection ID in the file and replace it with the one shown in your error message (not the one shown on this page!).
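The replacement can be done with sed. This is only a sketch under assumptions: the helper name replace_collection_id is made up, the ID is assumed to appear verbatim in the file, and IDs are assumed to consist only of letters, digits and hyphens (as in the error message above), so they are safe to use inside a sed expression.

```shell
#!/bin/sh
# Hypothetical helper: replace_collection_id CONFIG OLD_ID NEW_ID
# Keeps a copy of the original as CONFIG.bak, then rewrites CONFIG with
# every occurrence of OLD_ID replaced by NEW_ID.
replace_collection_id() {
    cp "$1" "$1.bak" &&
    sed "s/$2/$3/g" "$1.bak" > "$1"
}

# Usage, with placeholders for the two IDs:
#   grep -n 'collection-id' /home/gift/gift-config.mrml
#   replace_collection_id /home/gift/gift-config.mrml OLD_ID NEW_ID
```

The grep line shows which ID is currently in the file; NEW_ID is the one printed between the arrows in your own error message.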

MergeSort Blowup

...finished
before mergesort
Starting quicksort: 1048576 elements per page.
Sorting files /home/gift/gift-indexing-data/images//gift-auxiliary-1
to            /home/gift/gift-indexing-data/images//gift-auxiliary-2
NOW ALLOCATING A PAGE1048576
HIERFIRSTLEVELQUICK226868124;0
................gift-generate-inverted-file: ../../libGIFTAcInvertedFile/include/merge_sort_streams.h:282: void first_level_quicksort(int, const char*, const char*) [with T = CIFBuilderTriplet]: Assertion `lTemporary' failed.

PROGRESS: 99%

This was the result of running out of disk space during the mergesort. I removed everything, freed up some disk space, and started over.
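Before starting over, it helps to see how much space is actually free where the indexing data is written. A minimal sketch, assuming a helper named free_kb (a made-up name); how much space a run needs is not known here, so no threshold is checked:

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in kilobytes, on the
# filesystem that holds the directory given as $1.
free_kb() {
    # df -P prints one header line and one line per filesystem;
    # the fourth column of the data line is the available space
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Usage:
#   free_kb /home/gift/
```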

Out of Memory

PROGRESS: 99%

Copying /home/gift/gift-config.mrml to /home/gift/gift-config.mrml-old

Ran out of memory for input buffer at /usr/lib/perl5/XML/Parser/Expat.pm line 469, <LOCALELIST> line 270.

If you run out of memory, create a 256 MB swap file and enable it:

dd if=/dev/zero of=/home/gift/swapfile bs=4k count=65536
chmod 600 /home/gift/swapfile
sudo mkswap /home/gift/swapfile
sudo swapon /home/gift/swapfile
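The size of the swap file is the dd block size multiplied by the block count. The arithmetic, sketched as a small function (swap_size_mib is a made-up name) in case you want a different size:

```shell
#!/bin/sh
# Hypothetical helper: swap_size_mib BLOCK_SIZE_IN_BYTES BLOCK_COUNT
# Prints the resulting file size in megabytes (units of 1024*1024 bytes).
swap_size_mib() {
    echo $(( $1 * $2 / 1024 / 1024 ))
}

swap_size_mib 4096 65536    # bs=4k count=65536, prints 256
```

After swapon, free -m should show the extra swap space.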

Now skip down to the section on re-starting your run.

Infinity when re-starting a run

STARTING mit MERGESIZE1
MERGESORT MergeSize 12
endmerge
after mergesort. The last file I used was /home/gift/gift-indexing-data/clipart//gift-auxiliary-2
Opening sorted stream for reading. State (should be '1'): 0xbff84bec
[inFeatureID:4/0;inPosition:16/0==0]20
Writing Chunk for Feature ID 0. The Offset is 0x0=0
The collection frequency is: inf
gift-generate-inverted-file: CInvertedFileChunk.cc:117: bool CInvertedFileChunk::writeBinary(std::ostream&, TID, size_t) const: Assertion `!"collection frequency out of range"' failed.


PROGRESS: 99%


Copying /home/gift/gift-config.mrml to /home/gift/gift-config.mrml-old

Ran out of memory for input buffer at /usr/lib/perl5/XML/Parser/Expat.pm line 469, <LOCALELIST> line 274.

This happened when I re-ran gift-add-collection.pl after a run that had failed because there was no config file.

Because the re-run did not regenerate any .fts files, the script fails at the next step.

Fix the root cause of the failure, then follow the directions on re-starting a broken run:

cp /usr/share/libmrml1/gift-config.mrml /home/gift/

Re-starting A Broken Run

Assuming you have fixed the root cause of whatever failure you encountered...

Back up the completed mapping between URIs and feature files.

cp /home/gift/gift-indexing-data/openclipart-*/url2fts.xml /home/gift

Next, clean out generated files.

rm /home/gift/gift-indexing-data/openclipart-*/InvertedFile*
rm /home/gift/gift-indexing-data/openclipart-*/gift-aux*
rm /home/gift/gift-indexing-data/openclipart-*/00*

Re-run gift-add-collection.pl, then put the backed-up url2fts.xml file back in place.

gift-add-collection.pl /home/gift/gift/openclipart-*/
cp /home/gift/url2fts.xml /home/gift/gift-indexing-data/openclipart-*/

Now, get a count of the images in your image repository, and insert the count into /home/gift/gift-config.mrml.

cd /home/gift/
sed 's/images="[0-9]*/images="'`find gift/openclipart-*/ -type f | wc -l`'/' gift-config.mrml > gift-config.new
cp gift-config.new gift-config.mrml
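To see what that substitution does without touching the real file, the same sed expression can be wrapped in a function that only prints its result. The name set_image_count is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: set_image_count CONFIG COUNT
# Prints CONFIG with the number that follows images=" replaced by COUNT.
# CONFIG itself is not modified.
set_image_count() {
    # the pattern stops before the closing quote, so the quote is kept
    sed 's/images="[0-9]*/images="'"$2"'/' "$1"
}

# Usage (run in /home/gift/):
#   set_image_count gift-config.mrml 1234 | grep 'images="'
```

Note that the expression replaces the first match on every line, so check the printed lines before copying the new file over the old one.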

Finally, have gift-add-collection update the collection ID in the configuration file.

gift-add-collection.pl -fix-config /home/gift/gift/openclipart-*/

Setting up a Frontend

RainbowSock (derived from the historic monosock) is available via gitorious. To grab a copy:

mkdir -p /home/gift/rainbowsock
cd /home/gift/rainbowsock
git clone git://gitorious.org/rainbowsock/rainbowsock.git rainbowsock

Now, create a tarball of the distribution, so that we can extract it into the webroot at /var/www/:

cd rainbowsock
git archive -o ../rainbowsock-master.tar master

Extract the front-end into the webroot:

cd /var/www/
sudo tar -xf /home/gift/rainbowsock/rainbowsock-master.tar

Remove the default "it works!" page.

sudo rm /var/www/index.html

Add an alias for /home/gift/gift, to serve the images and their thumbnails to the public through Apache.

Add the following to /etc/apache2/sites-available/default, between the VirtualHost tags:

    Alias /gift/ "/home/gift/gift/"
    <Directory /home/gift/gift/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride None
        Order allow,deny
        allow from all
    </Directory>

Finally, edit /var/www/include/config.php, and change the two 'file' URIs to match the directory names of your image path, and thumbnail path, respectively.

Improving the Frontend

See GiftNewFrontend

Using GIFT for sorting images

Import Batch of Images

We need a web interface for importing a collection of images that stores image metadata (relationships of one image to another, whether an image is part of the image database or a 'stock' image, tags, etc.) in an SQL database.

Create a default set that is all images in the imported collection, named 'collectionname-origin'. This set should be read-only.

Sort Images into Sets

Using something similar to the current 'search' page, decide the form type by making a search that only returns forms of the type you are looking for. To do this, run a search, and create a set from that search's results plus a floor value for how similar a result must be to be included. Create a derived anti-set to pull out any false positives, and you're done sorting.
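The set logic can be sketched with plain text files. Everything here is an assumption made for illustration: the helper name build_set, a results file with one "score image" pair per line, and an anti-set file with one image name per line. None of this exists in Gift or RainbowSock yet.

```shell
#!/bin/sh
# Hypothetical helper: build_set RESULTS FLOOR ANTISET
# RESULTS: lines of "score image"; ANTISET: one image name per line.
# Prints the images scoring at least FLOOR that are not in ANTISET.
build_set() {
    awk -v floor="$2" -v antifile="$3" '
        # first file: remember every image listed in the anti-set
        FILENAME == antifile { anti[$0] = 1; next }
        # second file: keep results at or above the floor, minus the anti-set
        $1 + 0 >= floor + 0 && !($2 in anti) { print $2 }
    ' "$3" "$1"
}

# Usage, with made-up file names:
#   build_set search-results.txt 0.5 false-positives.txt
```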

Cutting Forms Up

Create a new collection out of the result set. Instead of using convert to resize the entire image in the import script, crop it to a known region. (Look at CSS for defining regions?)
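ImageMagick's convert can do the cropping with its -crop option, which takes a geometry of the form WIDTHxHEIGHT+X+Y. A minimal sketch; the helper names region_geometry and crop_region and all numbers in the usage line are placeholders, since the regions of the real forms are not defined yet:

```shell
#!/bin/sh
# Hypothetical helper: region_geometry WIDTH HEIGHT X Y
# Prints an ImageMagick geometry string for the region.
region_geometry() {
    echo "${1}x${2}+${3}+${4}"
}

# Hypothetical helper: crop_region INPUT OUTPUT WIDTH HEIGHT X Y
# Cuts one region out of INPUT and writes it to OUTPUT.
# +repage resets the canvas so OUTPUT is only as large as the region.
crop_region() {
    convert "$1" -crop "$(region_geometry "$3" "$4" "$5" "$6")" +repage "$2"
}

# Usage, with placeholder file names and region:
#   crop_region form.png form_signature.png 400 80 50 900
```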

This will cut out the field segments we want. You may want to import them into a new collection, alongside some 'known' field images (known signatures, printed client IDs, etc.).

Getting Results

Compare the imported/generated images with the cut fields. Forms with matching fields should score in the high percentages. Use this correlation to file forms by field match.

Importing into OpenEMR

Sort forms based on type AND field match. Perform an action with each form based on its form type.