
Index documentation

 

EuroWN Indexes

The file make-indexes.sh runs a series of shell scripts and Python scripts that build the indexes needed for quick and efficient browsing of a EuroWordNet file.

[Fri Nov 16 13:31:45 EET 2012]

soi-file

kb<version>-<encoding>-<subversion>.soi

Each line of the file has the format <synset_number>:<offset_in_file>

<synset_number> is the unique synset number: the value between the @ signs on a line such as the following in *.norm or *.txt:

0 @123@ WORD_MEANING

<offset_in_file> is the byte offset of that synset in the *.norm file, as reported by grep -b.

1:21
2:2355
3:3341
4:5284
5:6671
6:16100
7:18831
8:23410
9:25308
10:26687
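
As an illustration of how such an index might be used, the following minimal Python sketch loads soi lines into a dictionary and seeks straight to a synset. It assumes that the offset points at the start of the synset's "0 @n@ WORD_MEANING" line (which is what grep -b would report) and that the data is UTF-8; the function names load_soi and read_line_at and the in-memory sample data are hypothetical and not part of make-indexes.sh.

```python
import io


def load_soi(lines):
    """Parse '<synset_number>:<offset_in_file>' lines into a dict."""
    offsets = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        number, offset = line.split(":")
        offsets[int(number)] = int(offset)
    return offsets


def read_line_at(norm_file, offsets, synset_number):
    """Seek to the recorded byte offset and return the line found there."""
    norm_file.seek(offsets[synset_number])
    # Assumption: the file is UTF-8; the real encoding is part of the file name.
    return norm_file.readline().decode("utf-8").rstrip("\n")


# Hypothetical two-synset stand-in for a *.norm file, kept in memory.
data = b'0 @1@ WORD_MEANING\n  1 PART_OF_SPEECH "v"\n0 @2@ WORD_MEANING\n'
# Build the soi lines from the sample itself so the offsets are exact.
soi_lines = ["1:%d" % data.index(b"0 @1@"), "2:%d" % data.index(b"0 @2@")]

offsets = load_soi(soi_lines)
found = read_line_at(io.BytesIO(data), offsets, 2)
print(found)
```

The file must be opened in binary mode for this to work on a real *.norm file, because the offsets count bytes rather than characters.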

rlx-file

kb<version>-<encoding>-<subversion>.rlx

Each line of the file has the format <literal>:<synset_number>

This is the raw literal index file. <literal> is the literal as it appears in *.norm or *.txt on a line such as:

2 LITERAL "rahumeelne"

Neither field is unique on its own, but each line as a whole should be unique.

This file is used to generate the literal index file (lix-file).

korraldama:1
korda seadma:1
korraldamine:2
küsima:3
paluma:3
nõutama:3
küsimine:4
palumine:4
nõutamine:4
mõjutama:5
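
The step from the raw literal index to the literal index (lix-file) amounts to grouping synset numbers by literal. The sketch below shows one way this could be done in Python; it is not taken from make-indexes.sh, the function name rlx_to_lix is hypothetical, and the output order (order of first appearance) is an assumption.

```python
def rlx_to_lix(rlx_lines):
    """Group '<literal>:<synset_number>' lines into '<literal>:<n1> <n2> ...' lines."""
    grouped = {}
    for line in rlx_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last colon so the literal is kept whole.
        literal, number = line.rsplit(":", 1)
        numbers = grouped.setdefault(literal, [])
        if number not in numbers:  # ignore accidental duplicate lines
            numbers.append(number)
    return ["%s:%s" % (literal, " ".join(numbers)) for literal, numbers in grouped.items()]


sample = [
    "korraldama:1",
    "korda seadma:1",
    "jaamahoone:16394",
    "jaamahoone:39829",
]
for lix_line in rlx_to_lix(sample):
    print(lix_line)
```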

lix-file

kb<version>-<encoding>-<subversion>.lix

File is in the following format: <literal>:<synset_number_list>

where <synset_number_list> is a space-separated list of synset numbers.

Each literal should appear only once.

kaabellevifirma:39345
triikimisruum:32795
tuletõkkesein:40136
vusistamine:26905
võistlusdistants:48588
jäljend:10857
raskerelv:9291
jaamahoone:16394 39829
mahlakuu:7749
fabritseerima:18632

tix-file

kb<version>-<encoding>-<subversion>.tix

File is in the following format: <synset_number>:<part_of_speech>: <literal>:<sense_number>

where <part_of_speech> is the part of speech, as on the following line in *.norm or *.txt:

 1 PART_OF_SPEECH "v"

and <sense_number> is the sense number, as on the following line in *.norm or *.txt:

    3 SENSE 7

The tuple <part_of_speech>: <literal>:<sense_number> should be unique.

1:v: korraldama:7
1:v: korda seadma:3
2:n: korraldamine:3
3:v: küsima:2
3:v: paluma:2
3:v: nõutama:2
4:n: küsimine:2
4:n: palumine:1
4:n: nõutamine:1
5:v: mõjutama:1
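
A tix line can be split into its four fields as in the Python sketch below, which also checks the uniqueness rule for the (part of speech, literal, sense number) tuple. The names TixEntry, parse_tix_line and find_duplicate_senses are hypothetical, and stripping the space that follows the part-of-speech field is an assumption based on the sample lines.

```python
from typing import NamedTuple


class TixEntry(NamedTuple):
    synset_number: int
    part_of_speech: str
    literal: str
    sense_number: int


def parse_tix_line(line):
    """Parse '<synset_number>:<part_of_speech>: <literal>:<sense_number>'."""
    synset, pos, rest = line.rstrip("\n").split(":", 2)
    literal, sense = rest.rsplit(":", 1)
    # The sample lines carry a space after the second colon; drop it.
    return TixEntry(int(synset), pos, literal.strip(), int(sense))


def find_duplicate_senses(entries):
    """Return (pos, literal, sense) tuples that occur more than once."""
    seen, duplicates = set(), []
    for entry in entries:
        key = (entry.part_of_speech, entry.literal, entry.sense_number)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


entries = [parse_tix_line(l) for l in ["1:v: korraldama:7", "1:v: korda seadma:3"]]
print(entries[1])
print(find_duplicate_senses(entries))
```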

rix-file

kb<version>-<encoding>-<subversion>.rix

File is in the following format: <synset_number>:<part_of_speech>:<semantic_relation>:<target_part_of_speech>:<target_literal>:<target_sense_number>

where <semantic_relation> is the semantic relation, as on the following line in *.norm or *.txt:

  2 RELATION "near_synonym"

and <target_part_of_speech> is the part of speech of the target concept, as on the following lines in *.norm or *.txt:

    3 TARGET_CONCEPT
       4 PART_OF_SPEECH "v"

and <target_literal> and <target_sense_number> come from the following lines:

      4 LITERAL "seadma"
         5 SENSE 2

Each line of the file should be unique.

1:v:near_synonym:v:seadma:2
1:v:has_hyperonym:v:parandama:2
1:v:has_hyponym:v:süstematiseerima:1
1:v:has_hyponym:v:arveldama:1
1:v:has_xpos_hyponym:n:korraldumine:1
1:v:has_xpos_hyponym:n:süstematiseerimine:1
1:v:involved_agent:n:korraldaja:3
2:n:has_hyperonym:n:tegutsemine:2
2:n:has_xpos_hyperonym:v:tegutsema:3
3:v:has_hyperonym:v:andma:1
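
To show how the six fields fit together, here is a small Python sketch that parses rix lines and groups the relations by source synset. The names RixEntry, parse_rix_line and relations_by_synset are hypothetical; this is one possible reader for the format, not the code used by make-indexes.sh.

```python
from collections import defaultdict
from typing import NamedTuple


class RixEntry(NamedTuple):
    synset_number: int
    part_of_speech: str
    relation: str
    target_part_of_speech: str
    target_literal: str
    target_sense_number: int


def parse_rix_line(line):
    """Parse one six-field rix line into a RixEntry."""
    synset, pos, relation, target_pos, rest = line.rstrip("\n").split(":", 4)
    # The sense number is the last field; everything before it is the literal.
    target_literal, target_sense = rest.rsplit(":", 1)
    return RixEntry(int(synset), pos, relation, target_pos,
                    target_literal, int(target_sense))


def relations_by_synset(lines):
    """Map each synset number to the list of its outgoing relations."""
    grouped = defaultdict(list)
    for line in lines:
        if line.strip():
            entry = parse_rix_line(line)
            grouped[entry.synset_number].append(entry)
    return grouped


sample = [
    "1:v:near_synonym:v:seadma:2",
    "1:v:has_hyponym:v:süstematiseerima:1",
    "1:v:has_hyponym:v:arveldama:1",
    "2:n:has_hyperonym:n:tegutsemine:2",
]
grouped = relations_by_synset(sample)
hyponyms = [e.target_literal for e in grouped[1] if e.relation == "has_hyponym"]
print(hyponyms)
```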

iix-file

kb<version>-<encoding>-<subversion>.iix

File is in the following format:

<synset_number>:<part_of_speech>:<eq_semantic_relation>:<target_part_of_speech>:<target_wordnet_offset>

where <eq_semantic_relation> is the equivalence relation, as on the following line in *.norm or *.txt:

   2 EQ_RELATION "eq_synonym"

and <target_part_of_speech> and <target_wordnet_offset> come from the following lines:

     3 TARGET_ILI
       4 PART_OF_SPEECH "v"
       4 WORDNET_OFFSET 416049

1:v:eq_synonym:v:416049
2:n:eq_has_holonym:n:55898
3:v:eq_synonym:v:422854
4:n:eq_near_synonym:n:4638292
5:v:eq_synonym:v:432532
6:n:eq_has_hyperonym:n:88700
7:v:eq_synonym:v:451248
8:n:eq_has_hyperonym:v:452960
9:v:eq_near_synonym:v:452960
10:v:eq_synonym:v:467082
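
One use for this index is a reverse lookup from an ILI target to the synsets that link to it. The Python sketch below builds such a lookup; the names parse_iix_line and synsets_by_target are hypothetical, and keying the lookup on the pair (target part of speech, WordNet offset) is an assumption about how a target is best identified.

```python
from collections import defaultdict


def parse_iix_line(line):
    """Parse one five-field iix line into a tuple with numeric fields as int."""
    synset, pos, eq_relation, target_pos, target_offset = line.strip().split(":")
    return int(synset), pos, eq_relation, target_pos, int(target_offset)


def synsets_by_target(lines):
    """Map (target_part_of_speech, target_wordnet_offset) to synset numbers."""
    index = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        synset, _pos, _relation, target_pos, target_offset = parse_iix_line(line)
        index[(target_pos, target_offset)].append(synset)
    return index


sample = [
    "7:v:eq_synonym:v:451248",
    "8:n:eq_has_hyperonym:v:452960",
    "9:v:eq_near_synonym:v:452960",
]
index = synsets_by_target(sample)
print(index[("v", 452960)])
```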

iax-file

kb<version>-<encoding>-<subversion>.iax

File is in the following format:

<synset_number>:<part_of_speech>:<eq_semantic_relation>:<target_part_of_speech>:<target_add_on_id>

where <target_add_on_id> is the add-on ID, as on the following line in *.norm or *.txt:

      4 ADD_ON_ID 5101

1:v:eq_generalization:v:5101
1:v:eq_generalization:v:6298
3:v:eq_generalization:v:2040
5:v:eq_generalization:n:1800
5:v:eq_generalization:v:2054
5:v:eq_generalization:v:5697
5:v:eq_generalization:v:5978
7:v:eq_generalization:v:5863
8:n:eq_generalization:n:1506
8:n:eq_generalization:v:5403

Date: 2013-02-27T19:51+0200

Author: Neeme Kahusk

Org version 7.9.3e with Emacs version 23
