[go: up one dir, main page]

File: timbl.1

package info (click to toggle)
timbl 6.10-3
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 3,088 kB
  • sloc: cpp: 17,211; ansic: 425; sh: 70; makefile: 63
file content (442 lines) | stat: -rw-r--r-- 7,039 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
.TH timbl 1 "2017 November 9"

.SH NAME
timbl \- Tilburg Memory Based Learner
.SH SYNOPSIS
timbl [options]

timbl \-f data\-file \-t test\(hyfile

.SH DESCRIPTION
TiMBL is an open source software package implementing several memory\(hybased learning algorithms, among which IB1\(hyIG, an implementation of k\(hynearest neighbor classification with feature weighting suitable for symbolic feature spaces, and IGTree, a decision\(hytree approximation of IB1\(hyIG. All implemented algorithms have in common that they store some representation of the training set explicitly in memory. During testing, new cases are classified by extrapolation from the most similar stored cases.

.SH OPTIONS
.B \-a
<n>
or
.B \-a
<string>
.RS
determines the classification algorithm.

Possible values are:

.B 0
or
.B IB
 the IB1 (k\(hyNN) algorithm (default)

.B 1
or
.B IGTREE
 a decision\(hytree\(hybased approximation of IB1

.B 2
or
.B TRIBL
 a hybrid of IB1 and IGTREE

.B 3
or
.B IB2
 an incremental editing version of IB1

.B 4
or
.B TRIBL2
 a non\(hyparameteric version of TRIBL
.RE

.B \-b
n
.RS
number of lines used for bootstrapping (IB2 only)
.RE

.B \-B
n
.RS
number of bins used for discretization of numeric feature values (Default B=20)
.RE

.BR \-\-Beam =<n>
.RS
limit +v db output to n highest\(hyvote classes
.RE

.BR \-\-clones =<n>
.RS
number f threads to use for parallel testing
.RE

.B \-c
n
.RS
clipping frequency for prestoring MVDM matrices
.RE

.B +D
.RS
store distributions on all nodes (necessary for
using +v db with IGTree, but wastes memory otherwise)
.RE

.B \-\-Diversify
.RS
rescale weight (see docs)
.RE

.B \-d
val
.RS
weigh neighbors as function of their distance:
 Z      : equal weights to all (default)
 ID     : Inverse Distance
 IL     : Inverse Linear
 ED:a   : Exponential Decay with factor a (no whitespace!)
 ED:a:b : Exponential Decay with factor a and b (no whitespace!)
.RE

.B \-e
n
.RS
estimate time until n patterns tested
.RE

.B \-f
file
.RS
read from data file 'file' OR use filenames from 'file' for cross validation test
.RE

.B \-F
format
.RS
assume the specified input format
(Compact, C4.5, ARFF, Columns, Binary, Sparse )
.RE

.B \-G
normalization

.RS
normalize distributions (+v db option only)

Supported normalizations are:

.B Probability
or
.B 0

normalize between 0 and 1

.BR addFactor :<f>
or
.BR 1 :<f>

add f to all possible targets, then normalize between 0 and 1  (default f=1.0).

.B logProbability
or
.B 2

Add 1 to the target Weight, take the 10Log and then normalize between 0 and 1

.RE

.B +H
or
.B \-H
.RS
write hashed trees (default +H)
.RE

.B \-i
file
.RS
read the InstanceBase from 'file' (skips phase 1 & 2 )
.RE

.B \-I
file
.RS
dump the InstanceBase in 'file'
.RE

.B \-k
n
.RS
search 'n' nearest neighbors (default n = 1)
.RE

.B \-L
n
.RS
set value frequency threshold to back off from MVDM to Overlap at level n
.RE

.B \-l
n
.RS
fixed feature value length (Compact format only)
.RE

.B \-m
string
.RS
use feature metrics as specified in 'string':
 The format is : GlobalMetric:MetricRange:MetricRange
           e.g.: mO:N3:I2,5\-7

 C: cosine distance. (Global only. numeric features implied)
 D: dot product. (Global only. numeric features implied)
 DC: Dice coefficient
 O: weighted overlap (default)
 E: Euclidian distance
 L: Levenshtein distance
 M: modified value difference
 J: Jeffrey divergence
 S: Jensen\(hyShannon divergence
 N: numeric values
 I: Ignore named  values
.RE

.BR \-\-matrixin =file
.RS
read ValueDifference Matrices from file 'file'
.RE

.BR \-\-matrixout =file
.RS
store ValueDifference Matrices in 'file'
.RE

.B \-n
file
.RS
create a C4.5\-style names file 'file'
.RE

.B \-M
n
.RS
size of MaxBests Array
.RE

.B \-N
n
.RS
number of features (default 2500)
.RE

.B \-o
s
.RS
use s as output filename
.RE

.BR \-\-occurrences =<value>
.RS
The input file contains occurrence counts (at the last position)
value can be one of:
.B train
,
.B test
or
.B both
.RE

.B \-O
path
.RS
save output using 'path'
.RE

.B \-p
n
.RS
show progress every n lines (default p = 100,000)
.RE

.B \-P
path
.RS
read data using 'path'
.RE

.B \-q
n
.RS
set TRIBL threshold at level n
.RE

.B \-R
n
.RS
solve ties at random with seed n
.RE

.B \-s
.RS
use the exemplar weights from the input file
.RE

.B \-s0
.RS
ignore the exemplar weights from the input file
.RE

.B \-T
n
.RS
use feature n as the class label. (default: the last feature)
.RE

.B \-t
file
.RS
test using 'file'
.RE

.B \-t
leave_one_out
.RS
test with the leave\(hyone\(hyout testing regimen (IB1 only).
you may add \-\-sloppy to speed up leave\(hyone\(hyout testing (but see docs)
.RE

.B \-t
cross_validate
.RS
perform cross\(hyvalidation test (IB1 only)
.RE

.B \-t
@file
.RS
test using files and options described in 'file'
Supported options: d e F k m o p q R t u v w x % \-
.RE

.B \-\-Treeorder =value
n
.RS
ordering of the Tree:
 DO: none
 GRO: using GainRatio
 IGO: using InformationGain
 1/V: using 1/# of Values
 G/V: using GainRatio/# of Valuess
 I/V: using InfoGain/# of Valuess
 X2O: using X\(hysquare
 X/V: using X\(hysquare/# of Values
 SVO: using Shared Variance
 S/V: using Shared Variance/# of Values
 GxE: using GainRatio * SplitInfo
 IxE: using InformationGain * SplitInfo
 1/S: using 1/SplitInfo
.RE

.B \-u
file
.RS
read value\(hyclass probabilities from 'file'
.RE

.B \-U
file
.RS
save value\(hyclass probabilities in 'file'
.RE

.B \-V
.RS
Show VERSION
.RE

.B +v
level or
.B \-v
level
.RS
set or unset verbosity level, where level is:

 s:  work silently
 o:  show all options set
 b:  show node/branch count and branching factor
 f:  show calculated feature weights (default)
 p:  show value difference matrices
 e:  show exact matches
 as: show advanced statistics (memory consuming)
 cm: show confusion matrix (implies +vas)
 cs: show per\(hyclass statistics (implies +vas)
 cf: add confidence to output file (needs \-G)
 di: add distance to output file
 db: add distribution of best matched to output file
 md: add matching depth to output file.
 k:  add a summary for all k neigbors to output file (sets \-x)
 n:  add nearest neigbors to output file (sets \-x)

  You may combine levels using '+' e.g. +v p+db or \-v o+di
.RE

.B \-w
n
.RS
weighting
 0 or nw: no weighting
 1 or gr: weigh using gain ratio (default)
 2 or ig: weigh using information gain
 3 or x2: weigh using the chi\(hysquare statistic
 4 or sv: weigh using the shared variance statistic
 5 or sd: weigh using standard deviation. (all features must be numeric)
.RE

.B \-w
file
.RS
read weights from 'file'
.RE

.B \-w
file:n
.RS
read weight n from 'file'
.RE

.B \-W
file
.RS
calculate and save all weights in 'file'
.RE

.B +%
or
.B \-%
.RS
do or don't save test result (%) to file
.RE

.B +x
or
.B \-x
.RS
do or don't use the exact match shortcut
   (IB1 and IB2 only, default is \-x)
.RE

.BR \-X " file"
.RS
dump the InstanceBase as XML in 'file'
.RE

.SH BUGS
possibly

.SH AUTHORS
Ko van der Sloot Timbl@uvt.nl

Antal van den Bosch Timbl@uvt.nl

.SH SEE ALSO
.BR timblserver (1)