From: Mathieu B. <mbl...@ru...> - 2007-06-29 17:47:53

Hi,

I have good news! Using tomoe_query_set_max_n_strokes gives very good
results. With a patch like the following, candidates get displayed very
fast!

--- module/recognizer/tomoe-recognizer-simple-logic.c  (revision 1543)
+++ module/recognizer/tomoe-recognizer-simple-logic.c  (working copy)
@@ -90,6 +90,17 @@
     query = tomoe_query_new ();
     tomoe_query_set_min_n_strokes (query, input_stroke_num);
+
+    /* Statistics show that characters with less than 6 strokes
+       represent less than 10% of characters and characters with
+       between 7 and 13 strokes represent more than 60% of characters */
+    if (input_stroke_num <= 6) {
+        tomoe_query_set_max_n_strokes (query, input_stroke_num + 5);
+    }
+    else if (input_stroke_num <= 13) {
+        tomoe_query_set_max_n_strokes (query, input_stroke_num + 3);
+    }
+
     target_chars = tomoe_dict_search (dict, query);
     g_object_unref (query);
     if (!target_chars) return NULL;

I think we can add this by default even for platforms which don't have
performance issues, because IMHO comparing, for example, a one-stroke input
with a character of more than, say, 10 strokes doesn't make sense!

Here are some statistics I have made using handwriting-ja.xml
(handwriting-zh_CN.xml gives similar results):

N_strokes  N_characters  Cumulative  Percent
        1            26          26   0.40 %
        2            56          82   0.87 %
        3            77         159   1.19 %
        4           128         287   1.98 %
        5           158         445   2.45 %
        6           201         646   3.11 %
        7           318         964   4.92 %
        8           440        1404   6.81 %
        9           476        1880   7.37 %
       10           550        2430   8.51 %
       11           577        3007   8.93 %
       12           570        3577   8.82 %
       13           534        4111   8.27 %
       14           434        4545   6.72 %
       15           428        4973   6.63 %
       16           367        5340   5.68 %
       17           297        5637   4.60 %
       18           207        5844   3.20 %
       19           166        6010   2.57 %
       20           138        6148   2.14 %
       21           107        6255   1.66 %
       22            68        6323   1.05 %
       23            50        6373   0.77 %
       24            37        6410   0.57 %
       25            19        6429   0.29 %
       26            11        6440   0.17 %
       27             9        6449   0.14 %
       28             6        6455   0.09 %
       29             2        6457   0.03 %
       30             3        6460   0.05 %

Cheers,
Mathieu
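
A minimal sketch of how the same stroke-count window could be applied from
calling code. It assumes only the functions that already appear in the patch
above (tomoe_query_new, tomoe_query_set_min_n_strokes,
tomoe_query_set_max_n_strokes, tomoe_dict_search, g_object_unref); the wrapper
function, the GList return type, and the header path are illustrative guesses,
not part of TOMOE's documented API.

    #include <glib-object.h>
    #include <tomoe/tomoe.h>   /* assumed header; adjust to the real install path */

    /* Hypothetical helper: search `dict` for characters whose stroke count
     * lies in a window around the number of strokes drawn so far. */
    static GList *
    search_by_stroke_count (TomoeDict *dict, gint input_stroke_num)
    {
        TomoeQuery *query = tomoe_query_new ();
        GList *candidates;

        tomoe_query_set_min_n_strokes (query, input_stroke_num);

        /* Same heuristic as the patch: cap the maximum stroke count so a
         * one-stroke input is never compared against, say, a 20-stroke
         * character. */
        if (input_stroke_num <= 6)
            tomoe_query_set_max_n_strokes (query, input_stroke_num + 5);
        else if (input_stroke_num <= 13)
            tomoe_query_set_max_n_strokes (query, input_stroke_num + 3);

        candidates = tomoe_dict_search (dict, query);
        g_object_unref (query);

        return candidates;   /* NULL when nothing matches */
    }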
From: Hu Z. <zh...@re...> - 2007-06-29 05:21:33

Very nice. Congratulations!

On Fri, 2007-06-29 at 12:03 +0900, Takuro Ashie wrote:
> Hi.
>
> I have released tomoe-0.6.0, tomoe-gtk-0.6.0, scim-tomoe-0.6.0
> and uim-tomoe-gtk-0.6.0.
>
> Tomoe is a handwriting recognition engine:
>
>   http://tomoe.sourceforge.net/
>
> Changes from 0.5.x:
>
> * Simplified Chinese dictionary.
>   (Thanks Red Hat engineers!)
>
> * Enhanced Japanese dictionary (supports JIS X 0208 level 2).
>
> * Choose the default dictionary automatically according to the current locale.
>   However, currently no dictionary will be enabled with most locales except ja
>   and zh_CN, and on-demand language switching is not implemented yet.
>   Please use the tomoe applications with the ja or zh_CN locale like this:
>
>   $ LANG=zh_CN uim-tomoe-gtk
>   $ LANG=ja scim-tomoe
>   ...
>
> * Rename the package name of libtomoe-gtk to tomoe-gtk.
>
> * Add tomoe_gtk_init() and tomoe_gtk_quit().
>   Although tomoe_window_new() calls tomoe_gtk_init() internally for
>   compatibility reasons, it is recommended to call tomoe_gtk_init() manually
>   in your code.
>
> * Add a --with-gucharmap option to tomoe-gtk.
>
> * Python binding.
>
> * Some minor fixes.
>
> Download:
>
> * http://sourceforge.net/project/showfiles.php?group_id=193138
>
> In addition, you can find a stroke-editor for the tomoe handwriting
> dictionary in our Subversion repository:
>
> * http://tomoe.svn.sourceforge.net/viewvc/tomoe/
>
> This is also a Red Hat engineer's work. Thanks a lot.
>
> Regards,
> --
> Takuro Ashie <as...@ho...>
From: Takuro A. <as...@ho...> - 2007-06-29 03:03:46

Hi.

I have released tomoe-0.6.0, tomoe-gtk-0.6.0, scim-tomoe-0.6.0
and uim-tomoe-gtk-0.6.0.

Tomoe is a handwriting recognition engine:

  http://tomoe.sourceforge.net/

Changes from 0.5.x:

* Simplified Chinese dictionary.
  (Thanks Red Hat engineers!)

* Enhanced Japanese dictionary (supports JIS X 0208 level 2).

* Choose the default dictionary automatically according to the current locale.
  However, currently no dictionary will be enabled with most locales except ja
  and zh_CN, and on-demand language switching is not implemented yet.
  Please use the tomoe applications with the ja or zh_CN locale like this:

  $ LANG=zh_CN uim-tomoe-gtk
  $ LANG=ja scim-tomoe
  ...

* Rename the package name of libtomoe-gtk to tomoe-gtk.

* Add tomoe_gtk_init() and tomoe_gtk_quit().
  Although tomoe_window_new() calls tomoe_gtk_init() internally for
  compatibility reasons, it is recommended to call tomoe_gtk_init() manually
  in your code.

* Add a --with-gucharmap option to tomoe-gtk.

* Python binding.

* Some minor fixes.

Download:

* http://sourceforge.net/project/showfiles.php?group_id=193138

In addition, you can find a stroke-editor for the tomoe handwriting
dictionary in our Subversion repository:

* http://tomoe.svn.sourceforge.net/viewvc/tomoe/

This is also a Red Hat engineer's work. Thanks a lot.

Regards,
--
Takuro Ashie <as...@ho...>
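
The tomoe_gtk_init()/tomoe_gtk_quit() recommendation above can be illustrated
with a short sketch. Only the names tomoe_gtk_init, tomoe_gtk_quit and
tomoe_window_new come from the announcement; the header path, the exact
signatures and return types, and the surrounding GTK+ boilerplate are
assumptions, so treat this as an outline rather than documented tomoe-gtk API.

    #include <gtk/gtk.h>
    #include <tomoe-gtk/tomoe-gtk.h>   /* assumed header name */

    int
    main (int argc, char *argv[])
    {
        GtkWidget *window;

        gtk_init (&argc, &argv);
        tomoe_gtk_init ();              /* recommended explicit initialization */

        /* tomoe_window_new() would call tomoe_gtk_init() itself for
         * compatibility, but calling it explicitly (above) is preferred;
         * the GtkWidget return type here is an assumption. */
        window = tomoe_window_new ();
        gtk_widget_show_all (window);

        gtk_main ();

        tomoe_gtk_quit ();              /* release tomoe-gtk resources on exit */
        return 0;
    }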
From: Mathieu B. <mbl...@ru...> - 2007-06-20 12:05:15

> I do not understand the HMM *like* model which is referred to by Hu. What
> is it? As far as I read some thesis about HMM for handwriting recognition
> of Japanese characters, it seems faster than TOMOE's current logic.

Even though we can improve accuracy and maybe performance with HMM, it will
take some time. I would like to get a working version of tomoe on Maemo as
soon as possible...

Mathieu
From: Hiroyuki I. <poi...@ik...> - 2007-06-20 11:43:18

On Wed, 2007-06-20 at 13:24 +0200, Mathieu Blondel wrote:
> On Wed, June 20, 2007 13:08, Hiroyuki Ikezoe wrote:
> > But I can not understand yet.
> > You mention the performance of recognizer, don't you?
>
> I mean that even though we use HMM, even though we can train the model,
> data still need to be stored in a file. Therefore, it does not solve the
> problem of long dictionary loading.

Well, the data which is used for the trainer and the data which is needed
for the recognizer are not the same. The recognizer loads only the latter.

> For that, I think my idea of binary
> file can provide significant improvements. I'll experiment with this idea
> as soon as possible...

I do not disagree with a binary format. I recommend you do whatever is best
for TOMOE. If HMM comes, it will have a binary format file.

> Furthermore, as Hu Zheng pointed out, it is not sure whether HMM will
> improve performances or not, it may even worsen them...

I do not understand the HMM *like* model which is referred to by Hu. What
is it? As far as I read some thesis about HMM for handwriting recognition
of Japanese characters, it seems faster than TOMOE's current logic.
From: Mathieu B. <mbl...@ru...> - 2007-06-20 11:24:31

On Wed, June 20, 2007 13:08, Hiroyuki Ikezoe wrote:
> But I can not understand yet.
> You mention the performance of recognizer, don't you?

I mean that even though we use HMM, even though we can train the model,
data still need to be stored in a file. Therefore, it does not solve the
problem of long dictionary loading. For that, I think my idea of binary
file can provide significant improvements. I'll experiment with this idea
as soon as possible...

Furthermore, as Hu Zheng pointed out, it is not sure whether HMM will
improve performances or not, it may even worsen them...

Mathieu
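
The binary-file idea is not spelled out further on the list, but a rough
sketch of what such a cache could look like follows. Every name in it (the
packed point layout, the record format) is hypothetical and only illustrates
the general approach: dump the parsed dictionary to disk once, then read it
back with plain fread instead of re-parsing the XML on every start-up.

    #include <stdio.h>
    #include <stdint.h>

    /* Hypothetical on-disk record; the real TOMOE types differ. */
    typedef struct {
        uint16_t x, y;
    } PackedPoint;

    typedef struct {
        uint32_t unicode;      /* code point of the character         */
        uint16_t n_strokes;    /* number of strokes                   */
        uint16_t n_points;     /* total points across all strokes;
                                  followed by n_points PackedPoint
                                  records in the file                 */
    } PackedChar;

    /* Dump one character record; called once per dictionary entry after
     * the XML has been parsed for the first time.  Loading is the
     * symmetric pair of fread calls. */
    static int
    dump_char (FILE *out, const PackedChar *ch, const PackedPoint *pts)
    {
        if (fwrite (ch, sizeof *ch, 1, out) != 1)
            return -1;
        if (fwrite (pts, sizeof *pts, ch->n_points, out) != ch->n_points)
            return -1;
        return 0;
    }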
From: Hiroyuki I. <poi...@ik...> - 2007-06-20 11:10:11

On Tue, 2007-06-19 at 19:29 +0200, Mathieu Blondel wrote:
> Hiroyuki Ikezoe wrote:
> >> When you train the model, the data still need to be stored somewhere
> >> afterhand...
> >
> > I am sorry I can not understand what you mean.
> > What is the meaning of "afterhand"?
>
> This word does not really exist. This is a neologism based on beforehand
> :) So afterhand simply means after...

But I can not understand yet.
You mention the performance of recognizer, don't you?
From: Mathieu B. <mbl...@ru...> - 2007-06-20 07:35:21

On Wed, June 20, 2007 09:22, Hu Zheng wrote:
> Two or three month?

Cool

> It should mainly improve the recognition accuracy. The performance may
> even be slower as it is more complex, but we can optimize it then.
>

That's what I thought. We definitely need to improve the performance of
the current recognizer.

Mathieu
From: Hu Z. <zh...@re...> - 2007-06-20 07:21:26

Two or three month?
It should mainly improve the recognition accuracy. The performance may
even be slower as it is more complex, but we can optimize it then.

On Wed, 2007-06-20 at 08:50 +0200, Mathieu Blondel wrote:
> On Wed, June 20, 2007 03:30, Hu Zheng wrote:
> > One of our RedHat engineer is doing this work. He will introduce a new
> > recognition algorithm to tomoe, which has learn ability and HMM like. It
> > is still in the early prototype stage. We will notify you after we get a
> > working beta version :)
>
> When do you think the beta version will be ready ? And do you think it
> will improve the performances or only the recognition accuracy ?
>
> Mathieu
From: Mathieu B. <mbl...@ru...> - 2007-06-20 06:50:35

On Wed, June 20, 2007 03:30, Hu Zheng wrote:
> One of our RedHat engineer is doing this work. He will introduce a new
> recognition algorithm to tomoe, which has learn ability and HMM like. It
> is still in the early prototype stage. We will notify you after we get a
> working beta version :)

When do you think the beta version will be ready ? And do you think it
will improve the performances or only the recognition accuracy ?

Mathieu
From: Hu Z. <zh...@re...> - 2007-06-20 03:08:46

Here is a good thesis that you can read:

http://reciteword.cosoft.org.cn/redhat/thesis.pdf
From: Hu Z. <zh...@re...> - 2007-06-20 01:30:02

One of our RedHat engineer is doing this work. He will introduce a new
recognition algorithm to tomoe, which has learn ability and HMM like. It
is still in the early prototype stage. We will notify you after we get a
working beta version :)

On Tue, 2007-06-19 at 20:08 +0900, Hiroyuki Ikezoe wrote:
> On Tue, 2007-06-19 at 08:15 +0200, Mathieu Blondel wrote:
> > > He said he wants to also improve loading performance because the first
> > > stroke takes about 4 times longer than each following stroke, and it's
> > > also hard to be patient for most people.
> >
> > Well yes, I know that we don't parse the XML file every time... But
> > still, XML is not the best backend solution to load the dictionary into
> > memory...
>
> The best way you can do is implement HMM.
> It will be low memory consumption and high performance. :-) Of course it
> depends on the model.
From: Hu Z. <zh...@re...> - 2007-06-20 01:24:49

Sure! Nice work :)
I think you can maintain the new version of stroke-editor from now on.

On Tue, 2007-06-19 at 19:56 +0900, Hiroyuki Ikezoe wrote:
> On Tue, 2007-06-19 at 10:15 +0800, Hu Zheng wrote:
> > I think it is better to keep the two branches in svn, the old
> > independent version and the tomoe version(as trunk). We may still have
> > some small changes to the old version, and it can still be useful.
> > Yes, it was my fault, I thought there should have no need for more than
> > one branch ago, but now it comes, so I gained this experience :)
> > Can you do a "svn mv" to create the "trunk branches tags" structure? And
> > add the new codes as trunk.
>
> I've committed my patch now. By the way, can I change the ChangeLog format
> to GNU style?
From: Mathieu B. <mbl...@ru...> - 2007-06-19 17:29:16

Hiroyuki Ikezoe wrote:
>> When you train the model, the data still need to be stored somewhere
>> afterhand...
>>
> I am sorry I can not understand what you mean.
> What is the meaning of "afterhand"?
>

This word does not really exist. This is a neologism based on beforehand
:) So afterhand simply means after...
From: Hiroyuki I. <poi...@ik...> - 2007-06-19 12:12:55

On Tue, 2007-06-19 at 13:57 +0200, Mathieu Blondel wrote:
> > With HMM, learning data, which is used by recognizer, does not need raw
> > stroke data. The size will be smaller than the current TOMOE's stroke
> > data even though it depends on the model.
>
> When you train the model, the data still need to be stored somewhere
> afterhand...

I am sorry I can not understand what you mean.
What is the meaning of "afterhand"?
From: Mathieu B. <mbl...@ru...> - 2007-06-19 11:57:49

> With HMM, learning data, which is used by recognizer, does not need raw
> stroke data. The size will be smaller than the current TOMOE's stroke
> data even though it depends on the model.

When you train the model, the data still need to be stored somewhere
afterhand...

Mathieu
From: Hiroyuki I. <poi...@ik...> - 2007-06-19 11:40:15

On Tue, 2007-06-19 at 13:23 +0200, Mathieu Blondel wrote:
> On Tue, June 19, 2007 13:08, Hiroyuki Ikezoe wrote:
> > The best way you can do is implement HMM.
> > It will be low memory consumption and high performance. :-) Of course it
> > depends on the model.
>
> Can you elaborate ? I think HMM (Hidden Markov Model) could improve
> recognition accuracy, but I am not sure it would improve performances...

With HMM, learning data, which is used by recognizer, does not need raw
stroke data. The size will be smaller than the current TOMOE's stroke
data even though it depends on the model.
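
The size argument above can be made concrete with a rough sketch. None of the
types below exist in TOMOE; the structure of an HMM-style model is heavily
simplified and would depend entirely on the chosen model, so this only
illustrates why trained parameters can be much smaller than raw stroke data.

    #include <stddef.h>

    /* Raw template data: every reference sample keeps every recorded point,
     * so the size grows with the number of writing samples collected. */
    typedef struct {
        double x, y;
    } StrokePoint;

    typedef struct {
        size_t       n_points;
        StrokePoint *points;
    } RawStrokeTemplate;

    /* HMM-style model data: only the estimated parameters are kept, with a
     * fixed size per character; the raw training strokes can be discarded
     * once the parameters have been fitted. */
    #define N_STATES 8   /* illustrative; depends on the model */

    typedef struct {
        double transition[N_STATES][N_STATES];  /* state transition probs  */
        double mean[N_STATES][2];               /* per-state feature means */
        double variance[N_STATES][2];           /* per-state feature vars  */
    } CharacterHmm;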
From: Mathieu B. <mbl...@ru...> - 2007-06-19 11:23:30

On Tue, June 19, 2007 13:08, Hiroyuki Ikezoe wrote:
> The best way you can do is implement HMM.
> It will be low memory consumption and high performance. :-) Of course it
> depends on the model.

Can you elaborate ? I think HMM (Hidden Markov Model) could improve
recognition accuracy, but I am not sure it would improve performances...

Anyway, as I said, the very first stroke takes about 15 sec. Knowing that
the following strokes take 4 sec, this means that dictionary loading takes
about 11 sec.

Mathieu
From: Hiroyuki I. <poi...@ik...> - 2007-06-19 11:10:02

On Tue, 2007-06-19 at 08:15 +0200, Mathieu Blondel wrote:
> > He said he wants to also improve loading performance because the first
> > stroke takes about 4 times longer than each following stroke, and it's
> > also hard to be patient for most people.
>
> Well yes, I know that we don't parse the XML file every time... But
> still, XML is not the best backend solution to load the dictionary into
> memory...

The best way you can do is implement HMM.
It will be low memory consumption and high performance. :-) Of course it
depends on the model.
From: Hiroyuki I. <poi...@ik...> - 2007-06-19 10:58:41

On Tue, 2007-06-19 at 10:15 +0800, Hu Zheng wrote:
> I think it is better to keep the two branches in svn, the old
> independent version and the tomoe version(as trunk). We may still have
> some small changes to the old version, and it can still be useful.
> Yes, it was my fault, I thought there should have no need for more than
> one branch ago, but now it comes, so I gained this experience :)
> Can you do a "svn mv" to create the "trunk branches tags" structure? And
> add the new codes as trunk.

I've committed my patch now. By the way, can I change the ChangeLog format
to GNU style?
From: Hiroyuki I. <poi...@ik...> - 2007-06-19 10:29:27

On Tue, 2007-06-19 at 10:15 +0800, Hu Zheng wrote:
> Use python to develop stroke-editor is much faster, and I would like to
> learn a new programming language :)
>
> I think it is better to keep the two branches in svn, the old
> independent version and the tomoe version(as trunk). We may still have
> some small changes to the old version, and it can still be useful.
> Yes, it was my fault, I thought there should have no need for more than
> one branch ago, but now it comes, so I gained this experience :)
> Can you do a "svn mv" to create the "trunk branches tags" structure? And
> add the new codes as trunk.

Done. The URL changed from

  https://tomoe.svn.sourceforge.net/svnroot/tomoe/tools/stroke-editor

to

  https://tomoe.svn.sourceforge.net/svnroot/tomoe/stroke-editor
From: Mathieu B. <mbl...@ru...> - 2007-06-19 06:12:06

Hi,

> He seems to understand this issue correctly.
>
> He said he wants to also improve loading performance because the first
> stroke takes about 4 times longer than each following stroke, and it's
> also hard to be patient for most people.
>

Well yes, I know that we don't parse the XML file every time... But
still, XML is not the best backend solution to load the dictionary into
memory...

Mathieu
From: Hu Z. <zh...@re...> - 2007-06-19 02:14:30

Use python to develop stroke-editor is much faster, and I would like to
learn a new programming language :)

I think it is better to keep the two branches in svn, the old
independent version and the tomoe version(as trunk). We may still have
some small changes to the old version, and it can still be useful.
Yes, it was my fault, I thought there should have no need for more than
one branch ago, but now it comes, so I gained this experience :)
Can you do a "svn mv" to create the "trunk branches tags" structure? And
add the new codes as trunk.

On Mon, 2007-06-18 at 19:40 +0900, Hiroyuki Ikezoe wrote:
> Hello,
>
> On Mon, 2007-06-18 at 10:28 +0800, Hu Zheng wrote:
> > The old implementation has the advantage of don't depend on tomoe, but
> > use tomoe python binding should be the way. I think your version can be
> > 1.5 or 2.0 some thing like.
> > Well, stroke-editor was my first python project, so it is a little c
> > style :)
>
> Yo-ho-ho! I had never written python code before TOMOE's python tests
> too! Python is not easy to use for me. If I were you, I wrote
> stroke-editor in C. :-)
>
> > I think you can create a branch in svn and add your patch into it
> > directly.
>
> Are you willing to go on developing stroke-editor without TOMOE's python
> binding? I think you do not have the will yet since you did not replay
> [tomoe-devel 39].
>
> If you do not have the will as ever, the current stroke-editor code
> should be in branches, shouldn't it?
>
> Thank you,
From: Takuro A. <as...@ho...> - 2007-06-19 01:04:39

On Tue, 19 Jun 2007 07:19:22 +0900
Hiroyuki Ikezoe <poi...@ik...> wrote:

> > tomoe_recognizer_simple_get_candidates) but we will need to improve the
> > dictionary backend too because actually the very first stroke takes
> > about 15 seconds. Then the following strokes take about 4 seconds each.
> > An XML file is quite long to parse... I think we can reach better
> > performances with a binary file.
>
> You misunderstand. TomoeRecognizer does not parse XML while searching.

He seems to understand this issue correctly.

He said he wants to also improve loading performance because the first
stroke takes about 4 times longer than each following stroke, and it's
also hard to be patient for most people.

Regards,
--
Takuro Ashie <as...@ho...>
From: Hiroyuki I. <poi...@ik...> - 2007-06-18 22:19:38

Hello,

On Mon, 2007-06-18 at 22:30 +0200, Mathieu Blondel wrote:
> I am going to study the recognizer first (get_candidates,
> dist_tomoe_points, tomoe_char_compare,
> tomoe_recognizer_simple_get_candidates) but we will need to improve the
> dictionary backend too because actually the very first stroke takes
> about 15 seconds. Then the following strokes take about 4 seconds each.
> An XML file is quite long to parse... I think we can reach better
> performances with a binary file.

You misunderstand. TomoeRecognizer does not parse XML while searching.
The XML file is loaded and parsed on initialization of TomoeRecognizer.
See constructor() in tomoe-recognizer-simple.c.

Thank you,
--
Hiroyuki Ikezoe <poi...@ik...>