http://p2004r.blogspot.com/ ([identity profile] p2004r.blogspot.com) wrote in [personal profile] dr_klm 2011-03-25 06:00 pm (UTC)

да, на http://pdfbox.apache.org/userguide/faq.html#gibberish_text

This is because the characters in a PDF document can use a custom encoding instead of unicode or ASCII. When you see gibberish text then it probably means that a meaningless internal encoding is being used. The only way to access the text is to use OCR. This may be a future enhancement.


Hо то мы ведь знаем, что "настоящие кириллические" ученые почему то вс[её] пишут в мсворде (шутка :)

Post a comment in response:

This account has disabled anonymous posting.
If you don't have an account you can create one now.
HTML doesn't work in the subject.
More info about formatting