{"id":65990,"date":"2018-09-02T12:57:32","date_gmt":"2018-09-02T10:57:32","guid":{"rendered":"http:\/\/dandavats.tumblr.com\/post\/177652743266"},"modified":"2018-09-02T12:58:36","modified_gmt":"2018-09-02T10:58:36","slug":"the-brand-new-bbt-dictionary-appan-aid-to-bbt-production-the","status":"publish","type":"post","link":"https:\/\/www.dandavats.com\/?p=65990","title":{"rendered":"The Brand New BBT Dictionary App! An aid to BBT production"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-65989\" src=\"http:\/\/www.dandavats.com\/wp-content\/uploads72\/tumblr_pefd3w4CJL1sbj0vuo1_500.png\" width=\"500\" height=\"687\" srcset=\"https:\/\/www.dandavats.com\/wp-content\/uploads\/tumblr_pefd3w4CJL1sbj0vuo1_500.png 500w, https:\/\/www.dandavats.com\/wp-content\/uploads\/tumblr_pefd3w4CJL1sbj0vuo1_500-136x187.png 136w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/p>\n<p>The Brand New BBT Dictionary App!<br \/>\nAn aid to BBT production: The BBT produces books in many languages. Necessity is the mother of invention, and necessity has led us to venture into machine learning, natural language processing, and a fusion of science, computational linguistics, and Srila Prabhupada\u2019s books. Years ago, we had fairly stable editorial teams in most of our languages. Nowadays devotees are more mobile in their services, and we\u2019ve had to figure out ways to help ever-new editorial teams use our layout software for hyphenating and finalizing their files \u2013 work best done by native speakers.<\/p>\n<p>We\u2019re working hard at developing our text repository \u2013 better known as a multilingual parallel text corpus. Combining the text repository with machine learning, natural language processing, dictionaries, databases, and other tools is giving us fuller control over the huge amount of text we handle each year. 
To deal with the immediate problem mentioned above and to improve all our production processes, Hare Krishna Dasa designed a dictionary utility that creates a list of unique words from a set of BBText files. It then pre-hyphenates those words using OpenOffice hyphenation dictionaries. The hyphenated words are then used by the BBT\u2019s proprietary software, PP\/HJ, to create hyphenated text files that can go straight to proofreading. We\u2019re also building pre-hyphenated Sanskrit dictionaries to make the end result cleaner for the editorial teams.<\/p>\n<p>We\u2019ve been gradually building the multilingual parallel parsable text corpus mentioned above. A corpus is more than a collection of book files; it includes tagged text files, dictionaries pulled from every word in a text, and dictionaries that recognize the roots and stems of words. Such dictionaries will make translation easier and allow us to develop basic synonym lists to aid automatic translation, parallel viewing, spellchecking, and all sorts of useful grammatical analysis. We\u2019ll also parse proper nouns, allowing us to, say, connect dictionary or glossary entries to those words in ebooks and apps. We\u2019ll also make Sanskrit terms and their equivalents parsable.<\/p>\n<p>All this will allow side-by-side comparison of a book in multiple languages or of different editions in a single language at paragraph or even sentence level. In our book apps, you\u2019ll be able to swipe from Russian to English to German and find yourself on the same paragraph in each language. For those on editorial teams who are asked to do fidelity checks, this type of parallel analysis between languages can be quite useful. So although our internal app development might seem more pedantic than practical, we\u2019re excited about it! 
Our dictionary app has a number of practical applications, not the least of which is that it will make BBT source texts accessible for translation, verification, quality control, progress control, version control, audio overlay, and exporting to a variety of formats.<\/p>\n<p>Our most recent iteration of the dictionary app works with these languages: Afrikaans, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English (British and American), Estonian, French, Galician, German, Greek, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Norwegian, Polish, Portuguese (Brazilian and European), Romanian, Russian, Sanskrit, Serbian (Cyrillic and Latin), Slovak, Slovenian, Spanish, Swedish, Telugu, Ukrainian, and Zulu.<\/p>\n<p>Planned for the next version: Adding machine-learning features so that the app will recognize a word\u2019s language, including whether a word has been imported from another language, is pure Sanskrit, or is Sanskrit declined according to the rules of a particular language. The app will then spellcheck and pre-hyphenate accordingly.<\/p>\n","protected":false},"excerpt":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/78.media.tumblr.com\/104356321176e5396cb595074b6e7c88\/tumblr_pefd3w4CJL1sbj0vuo1_500.png\"><\/p>\n<p>The Brand New BBT Dictionary App!<br \/>\nAn aid to BBT production: The BBT produces books in many languages. Necessity is the mother of invention, and necessity has led us to venture into machine learning, natural language processing, and a fusion of science, computational linguistics, and Srila Prabhupada&rsquo;s books. Years ago, we had fairly stable editorial teams in most of our languages. 
Nowadays devotees are more mobile in their services, and we&rsquo;ve had to figure out ways to help ever-new editorial teams use our layout software for hyphenating and finalizing their files &ndash; work best done by native speakers.<\/p>\n","protected":false},"author":10650,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[118],"tags":[152],"class_list":["post-65990","post","type-post","status-publish","format-standard","hentry","category-recent-media","tag-nectar"],"_links":{"self":[{"href":"https:\/\/www.dandavats.com\/index.php?rest_route=\/wp\/v2\/posts\/65990","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dandavats.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dandavats.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dandavats.com\/index.php?rest_route=\/wp\/v2\/users\/10650"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dandavats.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=65990"}],"version-history":[{"count":2,"href":"https:\/\/www.dandavats.com\/index.php?rest_route=\/wp\/v2\/posts\/65990\/revisions"}],"predecessor-version":[{"id":65992,"href":"https:\/\/www.dandavats.com\/index.php?rest_route=\/wp\/v2\/posts\/65990\/revisions\/65992"}],"wp:attachment":[{"href":"https:\/\/www.dandavats.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=65990"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dandavats.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=65990"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dandavats.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=65990"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}