Title: Automatic Extraction of False Friends from a Bilingual Parallel Corpus
1???????? ??????????? ??. ???????
???????? ???????? ?? ?????????? ?
??????????? ??????? "????????????? ??????????"
??????????? ????????? ?? ??????? ???????? ??
????????? ????????? ??????
?????????? ?? ?????????? ?? ????????????? ?
?????? ?????? ??????
?????? ??????????? 01.01.12 ???????????
?????, 12.04.2010 ?.
????????? ??????? ?????? ?????
?????? ??????????? ??. ?. ?. II ??. ?-? ?????
?????????
2??????? ? ??????? ????????
- ???????
- ?????? ????????? ? ?????? ????????
- ????. ???? ? ???. ????
- ??????? ????????
- ?????? ????????? ? ???????? ????????
- ????. ????? ? ???. ????? (??????)
(?????????)
3???? ?? ????????????
- ???????????? ?? ????????? ?? ????????? ??
???????????? ?????????? ??????? - ???????????? ?? ????????? ?? ????????? ?? ???????
???????? ?? ????????? ????????? ??????
(?????????)
4?????? ?? ????????????
- ???????????? ?? ????????? ?? ????????? ??
??????????? ??????? ????? ???? ?? ????????? ?
????? ???? - ???????????? ?? ????????? ?? ????????? ??
?????????? ?????????? ??????? - ???????????? ?? ????????? ?? ????????? ??
???????????? ?????????? ??????? - ???????????? ?? ????????? ?? ??????????? ?????
??????? ? ??????? ???????? - ???????????? ?? ????????? ?? ????????? ?? ???????
???????? ?? ????????? ????????? ??????
(?????????)
5????? ?? ??????????? ??????
- ??????? ?? ????????? ??????? ? ????????? ????????
- ????? ?? ???????????? ????????? ??
- ????????? ?? ??????????? ???????
- ????????? ?? ?????????? ???????
- ????????? ?? ??????? ????????
- ????? ?????????? ????????? ?? ????? ???????? ??
?? ???????? ?? ????????? ? ????? ????
(????? 1)
6????????? MMEDR ??????????? ??????? ?????
????????? ? ?????
- ???????????
- ?????????????? ?? ??????? ????
- ???????????? ?????????? ??????? ?? ?????????????
?? n-????? - ????. ?????? ? ?????, ?????? ? ????
- ????????????? ?????????? ?? ????????? (normalized
edit distance) - ? ????? ??? ?????? ?? ?????
(????? 2)
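The slide above names normalized edit distance as one ingredient of MMEDR. As a rough illustration only, the sketch below computes plain Levenshtein distance and divides it by the length of the longer word. Both the normalization choice and the function names `levenshtein` and `normalized_edit_distance` are assumptions made for this example; any language-specific adjustments that MMEDR applies are not reproduced here.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insert, delete, substitute, each with cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length (assumed normalization).

    Returns a value in [0, 1]; 0 means the words are identical.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest
```

With this normalization, short and long word pairs become comparable: one differing letter in a four-letter pair scores 0.25, while one differing letter in a ten-letter pair scores 0.1.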
7????????? SemSim ?????????? ??????? ???? ???????
? ???
- ?????????? ??? ???????? ?? ?????? ????
- ??????, ????? ????? ????????? ???? ?? ????? ?
??????????, ??????????? ? ??? - ??????? ?? ?? ?????????? ?????, ??????? ???
??????? ? Google - ??????????? ?? ?????????? ???????? ??????????
?????? - ????????? ?? ????, ???? ????????, ????????? ??
Google - ?????????? ??????? ??????? ????? ????????? ??
????? ???? (???? ???????????)
(????? 3)
8????????? ?? ?????????? ??????? ???? ??? ??????
(1)
- ????????? ?? ??????? ? Google ?? "?????"
??????? ?? ????????? ????? ?????? ??
???????? ???? ? ??????? ?? ???-????????? ??????
?? ??????? ?? ????????????? ?? ???????? -
?????????? ?? ?????? ????????? ?????. ???? ?
?????, ?????? ??????? ?????? ... ??????? ??
????????? ????? ????? Advise ??????????
????????? ?? ????????? ????? ????? ??????? ? ????
(??)??????? ??????????. ?? ???? ?? ???????? ??
?????????? ???? ?????? ?????? ?? ?????? ?????
... ????? ????? - 8-?? ???? - ???? ??????? ?????
????? ??? ???????? ?????????. ??????????? ??????
(45-80 ??.). ????? ????? ? ?????? ???? ...
???????? ????????? ????? ? ??????. ??? - ???????
???????? ... ...
(????? 3)
9????????? ?? ?????????? ??????? ???? ??? ??????
(2)
- ?????????? ???????? ?????????? ???????
????? ?????
????? 422
??????? 262
????? 202
????????? 167
??????? 94
????? 84
??????? 72
??? 56
?????? 37
... ...
????? ?????
????? 461
?????? 386
?????? 345
?????? 205
??????? 183
???????? 176
?????????? 188
??????? 98
????? 12
... ...
(????? 3)
10????????? CrossSim ?????-??????? ??????????
???????
- ????????? ?? ???????????? ???????? ??????????
??????? Vbg ? Vru ?? ??? - ?????????? ?? ?????? ?? Vbg ?? ????? ????? ??
????? ???????? Vbg ? Vru - ???????? ?????? (????????)
- ???????????? ????? ?????? ????
- ???????????
(????? 3)
11?????????? ?? ??????????? MMEDR, SemSim ? CrossSim
- ?????????????? ????? ??????? ? ??????? ????????
CrossSim - ??????? 96,17
- ????????? ?? ???????? SemSim
- ??????? 67,09
- ????????? ???????????? ?? ???? MMEDR ? CrossSim
- ?????????? ?? 6
(????? 4)
12????????? FFExtract ??????? ???????? ??
????????? ??????
- ?????????????? ?? ?????????
- ????????? MMEDR
- ??????????? ????? ??????? ? ??????? ????????
- ????????????? ?????? ?? ?????? ?? ???? ????????
?? ?????? ? ????????? ????????? - ?????????? ?????? CrossSim
- ?????????? ?????? ??????? 77,64
(????? 5)
13??????? ?? ????????????
- ????????? SemSim ? CrossSim ?? ????????? ??
?????????? ? ???????????? ?????????? ??????? ????
??? - ????????? MMEDR ?? ????????? ????????? ??
??????????? ??????? - ????????? FFExtract ?? ????????? ?? ???????
???????? ?? ????????? ?????? - ????????? ?? ??????????? ????? ??????? ? ???????
????????, ?? ????????? ?? ????????, ??
???????????? ?? ???? - TECFF Toolkit for Extraction of Cognates and
False Friends
(???????? ???????)
14?????????? ?? ????????????
- Nakov P., Nakov S., Paskaleva E. "Improved Word
Alignments Using the Web as a Corpus",
Proceedings of International Conference "Recent
Advances in Natural Language Processing" (RANLP
2007), pages 400-405, Borovets, Bulgaria, 2007
- Nakov S., Nakov P., Paskaleva E. "Cognate or
False Friend? Ask the Web!", Proceedings of the
1st International Workshop on Acquisition and
Management of Multilingual Lexicons, held in
conjunction with RANLP 2007, pages 55-62,
Borovets, Bulgaria, 2007
- Nakov S. "Automatic Acquisition of Synonyms Using
the Web as a Corpus", Proceedings of the 3rd
Annual South-East European Doctoral Student
Conference (DSC 2008), Volume 2, pages 216-229,
Thessaloniki, Greece, 2008
- Nakov S. "Measuring Cross-Lingual Semantic
Similarity by Searching in Google", Proceedings
of the 5th International Conference "The
Language - A Phenomenon without Frontiers", ISBN
978-954-9685-43-5, pages 238-242, Varna,
Bulgaria, 2008
- Nakov S. "Automatic Identification of False
Friends in Parallel Corpora - Statistical and
Semantic Approach", Serdica Journal of Computing,
issue 3, pages 133-158, 2009
- Nakov S., Nakov P., Paskaleva E. "Unsupervised
Extraction of False Friends from Parallel
Bi-Texts Using the Web as a Corpus", Proceedings
of International Conference "Recent Advances in
Natural Language Processing" (RANLP 2009), pages
292-298, Borovets, Bulgaria, 2009
- Nakov S., Paskaleva E., Nakov P. "A
Knowledge-Rich Approach to Measuring the
Similarity between Bulgarian and Russian Words",
Workshop on Multilingual Resources, Technologies
and Evaluation for Central and Eastern European
Languages, held in conjunction with RANLP 2009,
Borovets, Bulgaria, 2009
(???????? ???????)
15????????
??????????? ????????? ?? ??????? ???????? ??
????????? ????????? ??????