Beginner's first steps with Similis
Copyright © ProZ.com, 1999-2013. All rights reserved.
I am an engineer (electronics designer) turned translator. Although not a programmer, I am interested in tools that help speed up the repetitive aspects of translating (more on that later). Similis is my first contact with translation memory (TM) programs, so please forgive the erring ways and naiveté typical of a novice.
Interests and potential conflicts
I haven't received any payment for writing this essay, although Lingua et Machina (LM), the France-based publisher of Similis, granted me an extended temporary license to test it and offered a discount when this article first appeared. So I don't feel muzzled at all.
Go to www.similis.fr or www.lingua-et-machina.com (English and French) for first impressions, animated demos, hints, an FAQ, a beginner's guide (English and French) and a user's guide (French only), all at the very top of the Support page. The beginner's guide, sufficient to get started, is an 11-page PDF file in a strange square format without margins. The demo, release 2.6.5b (in fact the full program with a time-limited license for 5000 processed words), weighs 160 MB.
Similis support (Support) is first rate, quick and open to suggestions. Let's hope it will remain the same even after the number of their customers grows.
Because the zipped file contains another zipped folder, it's best to place it in an empty folder and use the right-click "Unzip here" command.
Installation and conflict with antivirus software
Similis Freelance-2.6.5b-Setup.exe attempts to install itself in the C:\Program Files\Lingua et Machina\Similis folder, but the process stalled on my computer. Preventing Bit Defender Antivirus Plus v10 from starting at boot did not help. With the active involvement of Support, a workaround was found: simply install in a C:\Similis folder.
In any case, Bit Defender dislikes two files (kill.exe and pskill.exe), which it considers infected with Prockill.I and Prockill.C respectively (I found on Google that McAfee similarly dislikes the kill.exe found in other products). The full installation takes 1.6 GB because it includes many dictionaries for linguistic analysis.
Similis works with file pairs in Dutch, English, French, German, Italian, Portuguese and Spanish, extracting a TM from existing translations, but as far as I know it does not build a memory while you translate a document from scratch. So you choose which translated files to use as a basis for forthcoming jobs. I chose my first German/French pair of files and the job to be processed more or less haphazardly, aligned the resulting table and applied it to the job for "pre-translation".
Pre-translation happens in a specially formatted Word window split in three: the source file on top, the target at the bottom and the suggested matching expression in the middle. It's "my" Word, customized with all my preferences, macros etc., plus one Similis menu heading replacing the Window item. You step through the target (the source moves in sync) and accept the suggestion (color-coded if fuzzy) by clicking in the middle window or with a shortcut directly in the target. You can also automate the pre-translation by accepting the 100% matches first and then the fuzzy ones. My 19" screen is a bit crowded, but there are no tags to nag you.
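To make the difference between 100% and fuzzy matches concrete, here is a minimal Python sketch. It is not how Similis works internally (Similis relies on linguistic analysis, which this ignores); it only compares character strings with the standard difflib module. The function name best_match, the 0.75 threshold and the sample sentences are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def best_match(segment, memory, threshold=0.75):
    """Return (source, target, score) for the closest TM entry, or None.

    `memory` is a list of (source, target) pairs. A score of 1.0 is a
    100% match; anything between the threshold and 1.0 counts as fuzzy.
    The threshold is an arbitrary illustrative value.
    """
    best = None
    for source, target in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


# Hypothetical German/French memory with two aligned segments.
memory = [
    ("Drücken Sie die rote Taste.", "Appuyez sur la touche rouge."),
    ("Batterie ersetzen.", "Remplacez la pile."),
]

exact = best_match("Drücken Sie die rote Taste.", memory)   # 100% match
fuzzy = best_match("Drücken Sie die grüne Taste.", memory)  # fuzzy match
miss = best_match("Kontaktieren Sie Ihren Händler.", memory)  # no usable match
```

A fuzzy hit returns the old translation ("touche rouge") for a sentence that now says "grüne", which is exactly why such suggestions are flagged and still need a human eye.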
Because my first file pair was chosen haphazardly, the result was poor, as could be expected.
For my second choice I went to the other extreme and took a file pair (issue A of a user manual for a wireless remote control unit) as TM source and the slightly revised issue B as the file to be translated. The TM before manual alignment was impressive and the result after pre-translation really overwhelming, but not conclusive because my test was, in a sense, overly easy. Word's tool for comparing/merging two files (efficient from the 2003 version onwards) would help greatly with this kind of task.
Pulling out all stops
Up to now, I could experiment relying mainly on the beginner's guide. But I definitely needed the Similis user's guide (68 pages, French only) to move to the third step. I hope English-only readers will nevertheless feel tempted to experiment for themselves!
I extracted TMs from two more user guides, one marketing flyer and two application notes for various products and merged the resulting memories, also adding a terminology list I had built from the same customer's assignments (using Hermetic Systems' Word Frequency Counter Advanced software, more on that later), but with little visible benefit, I think. The resulting aligned TM counts some 4400 segments. 10'000 to 20'000 segments is considered a reasonable size.
I applied the TM to another, fairly similar user guide as target. After automatic pre-translation, about two thirds of the target's paragraphs were cleanly translated (my guesstimate). This sounds impressive, but I have yet to compare the resulting file carefully with the original, and this can be tedious and error-prone as well.
Overview and pricing
I found this test interesting and rewarding. Basic manipulations are rather simple but not entirely intuitive. For instance, I'm not sure I understand the difference between a TM and a glossary and their respective benefits, and I would not pretend to be aware of the full potential of such a program. More (how much more?) practice is needed.
The software is fairly fast on my 885 MHz Pentium 4 PC with 1.5 GB RAM under Windows XP. Support advises choosing a PC with only one CPU or one core, because Similis cannot handle dual-core processors. As Similis draws heavily on databases, a fast disk with a big cache is also useful. Laptops often have very poor disks.
Similis Freelance with HTML and XML sells for 750 € plus an optional yearly fee of 195 € for support and upgrades. It supports an unlimited number of projects and TMs. There is a "First Comer" license limited to 200'000 words, without HTML and XML, for 295 €. My tests consumed almost 40'000 words, so I would quickly bump into the limit. I haven't yet bought the program (see my Questions below).
My own "cheap" TM-less solution is based on Replsoft.com Useful File Utilities (sic!), which boasts a customizable Batch Replacer for Word at US$ 40 (for office use), and on Hermetic.ch "Word Frequency Counter Advanced" software for US$ 53, both with lifetime upgrades free of charge. The combination amounts to a simple, dumb, automatic word replacer, without any linguistic intelligence or regard for the context, but it saves time. Don't forget Word's tool for comparing two files, either.
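The principle of such a frequency counter plus batch replacer fits in a few lines of Python. This is only a sketch of the idea, not a description of how the Replsoft or Hermetic products work; the function names, the glossary and the sample sentence are invented for illustration, and it handles plain text rather than Word files.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs, ignoring case."""
    return Counter(re.findall(r"\w+", text.lower()))


def batch_replace(text, glossary):
    """Replace every whole-word glossary term in a single pass.

    Longer terms are tried first so that a multi-word term wins over a
    shorter term it contains. There is no grammar and no context: this
    is deliberately the 'dumb' replacer described above.
    """
    if not glossary:
        return text
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda match: glossary[match.group(0)], text)


# Hypothetical German source text and German/French glossary.
text = "Die Fernbedienung braucht eine Batterie. Batterie einlegen."
glossary = {"Fernbedienung": "télécommande", "Batterie": "pile"}

counts = word_frequencies(text)
draft = batch_replace(text, glossary)
```

The frequency count shows which terms are worth putting in the glossary ("batterie" occurs twice here), and the replacer then drops the French terms into the German sentence, leaving articles, word order and capitalization for the translator to fix.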
My wish list
I'm no mouse lover but am fond of keyboard shortcuts (preferably user-customizable) wherever possible. So I submitted a few suggestions to Support (not yet implemented at the time of editing this paper):
- Exclude headers and footers - and maybe numbers also - when extracting TM (to avoid clogging the alignment table with page and page numbers)
- Move in the alignment table with Page Up/Down (presently arrow keys or scrolling)
- Undo one operation at a time (presently all operations since "validate the alignment" was last clicked), with CTRL+Z as shortcut and CTRL+Y to restore
- Add an F4 shortcut for "repeat the last operation"
- Define CTRL+ shortcuts for the various alignment functions (align, merge, delete, insert a line, presently via mouse right-clicks)
- Keep two merged cells in the alignment table pre-selected after the merge, because it's likely they will be aligned with a neighboring element, cells being selected via CTRL+Click
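The first wish in the list above (keeping page numbers and bare numbers out of the alignment table) could look roughly like this in Python. It is purely a sketch of the requested behaviour, not an existing Similis feature; the pattern, the function name drop_noise and the sample pairs are assumptions.

```python
import re

# Matches segments that are empty, only digits and separators, or a
# "Page"/"Seite" label optionally followed by numbers (e.g. "Seite 3 / 10").
NOISE = re.compile(r"\s*(?:(?:page|seite)\s*)?[\d\s./-]*", re.IGNORECASE)


def drop_noise(pairs):
    """Remove aligned pairs whose source AND target are page-number noise."""
    return [
        (source, target)
        for source, target in pairs
        if not (NOISE.fullmatch(source) and NOISE.fullmatch(target))
    ]


# Hypothetical rows of a German/French alignment table.
pairs = [
    ("Seite 3", "Page 3"),
    ("12", "12"),
    ("Batterie 9 V einlegen.", "Insérez une pile 9 V."),
]

cleaned = drop_noise(pairs)
```

A pair is dropped only when both sides are noise, so a sentence that merely contains a number (the 9 V battery above) stays in the table.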
Similis claims export and import compatibility with Trados TMs. Many internal Similis files can be exported to .tmx, .csv or .tsv (tab-separated values) for external processing in Trados or Excel and then reimported. I wonder whether a TM, after being squeezed through Similis, can really be reused in Trados without stray characters or other malfunctions.
I have written this essay to stimulate the TM community's interest in this relatively new program, but the first edition of this paper elicited no comment or argument. It would also be nice to find comparative reviews of Similis and competing tools.
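One way to check an exported TM outside both programs is to read the .tmx file with a short script, look for the Unicode replacement character (a typical trace of a botched encoding conversion) and write the segments out as tab-separated values for Excel. The Python sketch below uses the general TMX layout (tu, tuv and seg elements); the sample data, the function names and the exact language codes are assumptions, and a real Similis or Trados export may use codes such as "de-DE" and contain inline formatting elements.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hand-written TMX sample, not an actual Similis export.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="de" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="de"><seg>Batterie einlegen.</seg></tuv>
      <tuv xml:lang="fr"><seg>Insérez la pile.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_to_rows(tmx_text, source_lang, target_lang):
    """Return (source, target) pairs for the two requested languages."""
    rows = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline elements.
                segments[tuv.get(XML_LANG, "").lower()] = "".join(seg.itertext())
        if source_lang in segments and target_lang in segments:
            rows.append((segments[source_lang], segments[target_lang]))
    return rows


def damaged(rows):
    """Return the pairs containing U+FFFD, the Unicode replacement character."""
    return [pair for pair in rows if any("\ufffd" in text for text in pair)]


def rows_to_tsv(rows):
    """Serialize the pairs as tab-separated values, one pair per line."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = tmx_to_rows(SAMPLE_TMX, "de", "fr")
tsv = rows_to_tsv(rows)
```

Running the same check on a file before and after a round trip through another tool would show whether accented characters survive, although it cannot detect every kind of damage.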