<?xml version="1.0" encoding="utf-8"?>
<!-- generator="FeedCreator 1.7.2-ppt DokuWiki" -->
<?xml-stylesheet href="http://geek.kyloo.net/software/lib/exe/css.php?s=feed" type="text/css"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="http://geek.kyloo.net/software/feed.php">
        <title>Qin Gao's Software</title>
        <description></description>
        <link>http://geek.kyloo.net/software/</link>
        <image rdf:resource="http://geek.kyloo.net/software/lib/images/favicon.ico" />
        <dc:date>2010-05-11T19:13:50-06:00</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="http://geek.kyloo.net/software/doku.php/start?rev=1273527469&amp;do=diff"/>
                <rdf:li rdf:resource="http://geek.kyloo.net/software/doku.php/mgiza:forcealignment?rev=1273527369&amp;do=diff"/>
                <rdf:li rdf:resource="http://geek.kyloo.net/software/doku.php/chaski:release_note_chaski?rev=1268062280&amp;do=diff"/>
                <rdf:li rdf:resource="http://geek.kyloo.net/software/doku.php/chaski:download?rev=1268062169&amp;do=diff"/>
                <rdf:li rdf:resource="http://geek.kyloo.net/software/doku.php/indepth:mkcls?rev=1264367660&amp;do=diff"/>
                <rdf:li rdf:resource="http://geek.kyloo.net/software/doku.php/mgiza:overview?rev=1264252543&amp;do=diff"/>
                <rdf:li rdf:resource="http://geek.kyloo.net/software/doku.php/mgiza:release_note_mgiza?rev=1263272199&amp;do=diff"/>
            </rdf:Seq>
        </items>
    </channel>
    <image rdf:about="http://geek.kyloo.net/software/lib/images/favicon.ico">
        <title>Qin Gao's Software</title>
        <link>http://geek.kyloo.net/software/</link>
        <url>http://geek.kyloo.net/software/lib/images/favicon.ico</url>
    </image>
    <item rdf:about="http://geek.kyloo.net/software/doku.php/start?rev=1273527469&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2010-05-10T15:37:49-06:00</dc:date>
        <dc:creator>Qin Gao</dc:creator>
        <title>Welcome</title>
        <link>http://geek.kyloo.net/software/doku.php/start?rev=1273527469&amp;do=diff</link>
        <description>Welcome to Qin Gao's software page. I hope you find something useful here.

News

 2010/05/10  Updated instructions for force alignment, thanks to Arek.

 2010/03/08  Bug fix for Chaski Download

 2010/01/23  Releases of Chaski and MGIZA will be hosted on SourceForge</description>
    </item>
    <item rdf:about="http://geek.kyloo.net/software/doku.php/mgiza:forcealignment?rev=1273527369&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2010-05-10T15:36:09-06:00</dc:date>
        <dc:creator>Qin Gao</dc:creator>
        <title>Force Alignment Using MGIZA++</title>
        <link>http://geek.kyloo.net/software/doku.php/mgiza:forcealignment?rev=1273527369&amp;do=diff</link>
        <description>This is a mini-HOWTO on force aligning unseen test data using MGIZA++ with existing models. To force align unseen test data, you need the following:


	*  A set of models trained by MGIZA++. If you are training Model 1/2/3, you may use the models output by GIZA++; however, if you are force aligning with IBM Model 4 or HMM, GIZA++ does not output models that MGIZA++ can load correctly.
	*  The vocabulary from the previous training. The vocabulary files are generated by the plain2cooc executable, and in Mose…</description>
    </item>
    <item rdf:about="http://geek.kyloo.net/software/doku.php/chaski:release_note_chaski?rev=1268062280&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2010-03-08T08:31:20-06:00</dc:date>
        <dc:creator>Qin Gao</dc:creator>
        <title>Chaski Release Notes</title>
        <link>http://geek.kyloo.net/software/doku.php/chaski:release_note_chaski?rev=1268062280&amp;do=diff</link>
        <description>0.2.5


Maintenance release; fixes bugs in detecting the existence of files on HDFS.

Download on SourceForge

0.2.4


Maintenance release; fixes bugs that may affect interaction with LoonyBin.

Download on SourceForge

0.2.3


Maintenance release, fixed bugs:</description>
    </item>
    <item rdf:about="http://geek.kyloo.net/software/doku.php/chaski:download?rev=1268062169&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2010-03-08T08:29:29-06:00</dc:date>
        <dc:creator>Qin Gao</dc:creator>
        <title>Download Chaski</title>
        <link>http://geek.kyloo.net/software/doku.php/chaski:download?rev=1268062169&amp;do=diff</link>
        <description>Dependencies


Chaski must be run on Hadoop; however, it can also run on a local machine with a pseudo-cluster Hadoop installation. The most recent version of Chaski uses the Hadoop 0.20.1 API and DOES NOT support any earlier version of Hadoop.

The Chaski package usually contains all of its dependency jars, and after building, the package should run directly on Hadoop; i.e., you do not need to add third-party jars to the CLASSPATH. All jar files not included in the Hadoop distribution will be unjar-ed and re-…</description>
    </item>
    <item rdf:about="http://geek.kyloo.net/software/doku.php/indepth:mkcls?rev=1264367660&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2010-01-24T14:14:20-06:00</dc:date>
        <dc:creator>Qin Gao</dc:creator>
        <title>Make word classes</title>
        <link>http://geek.kyloo.net/software/doku.php/indepth:mkcls?rev=1264367660&amp;do=diff</link>
        <description>1. Basic formulas


Basic formula of the n-gram language model:


Two-sided class-based language model:



Where c(w) is the class of word w.

Alternatively, we can use a one-sided class-based LM:
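
The formulas themselves are not reproduced in this excerpt. As a rough sketch only, assuming the usual bigram formulation and writing c(w) for the class of word w (both the bigram order and the notation are assumptions, not taken from the original page), the three models named above could be written as:

```latex
% Assumed notation: w_i is the i-th word, c(w) is the class of word w,
% n is the n-gram order, N is the sentence length.

% n-gram language model
P(w_1^N) = \prod_{i=1}^{N} P(w_i \mid w_{i-n+1}^{i-1})

% Two-sided class-based model (bigram case, assumed)
P(w_i \mid w_{i-1}) \approx P(w_i \mid c(w_i)) \, P(c(w_i) \mid c(w_{i-1}))

% One-sided class-based model (bigram case, assumed)
P(w_i \mid w_{i-1}) \approx P(w_i \mid c(w_i)) \, P(c(w_i) \mid w_{i-1})
```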



The difference is the second term, which depends on the word instead of the word class.</description>
    </item>
    <item rdf:about="http://geek.kyloo.net/software/doku.php/mgiza:overview?rev=1264252543&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2010-01-23T06:15:43-06:00</dc:date>
        <dc:creator>Qin Gao</dc:creator>
        <title>MGIZA</title>
        <link>http://geek.kyloo.net/software/doku.php/mgiza:overview?rev=1264252543&amp;do=diff</link>
        <description>MGIZA++ is a multi-threaded word alignment tool based on GIZA++. It extends GIZA++ in multiple ways:


 Multi-threading 

MGIZA++ makes efficient use of multi-core platforms. On a quad-core machine, it typically achieves a three-fold speedup over single-threaded GIZA++.</description>
    </item>
    <item rdf:about="http://geek.kyloo.net/software/doku.php/mgiza:release_note_mgiza?rev=1263272199&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2010-01-11T21:56:39-06:00</dc:date>
        <dc:creator>Qin Gao</dc:creator>
        <title>Release Notes</title>
        <link>http://geek.kyloo.net/software/doku.php/mgiza:release_note_mgiza?rev=1263272199&amp;do=diff</link>
        <description>0.6.3


 Memory optimization

Filters the vocabulary and word classes and eliminates duplicates. MGIZA++ can now train on 34M sentence pairs while keeping memory usage below 2 GB.

 Bug fix 

When a log file was specified, Model 3/4/5 training would occasionally hit a race condition and crash. The unnecessary logging has been removed, because the same messages are already printed on screen.</description>
    </item>
</rdf:RDF>
