The example function in YandexTranslate without translation #101

Open
AlexK-1 opened this issue Apr 24, 2024 · 7 comments

Comments

@AlexK-1

AlexK-1 commented Apr 24, 2024

To get example sentences for an English word, I use the example function of the YandexTranslate class. This function takes the parameters destination_language and source_language, and when they are equal it raises ParameterValueError: Parameter source_language cannot be equal to the destination_language parameter. However, the Yandex API documentation says that a language pair with the same language on both sides is allowed.

To work around this error, I have to translate the word first and then pass the translation to the example function, translating back into the source language. This gives incorrect results.

According to its documentation, the Yandex API can search for example sentences without translating the word, but the library raises an error.

Is it possible to search for Yandex example sentences through your library without translating the word?

The code I'm using now:

from translatepy.translators import YandexTranslate

translator = YandexTranslate()

word = "regale"

translated_word = translator.translate(word, "ru", "en")  # forced translation of a word from English
examples = translator.example(translated_word.result, "en", "ru")  # none of the sentences contain the word 'regale'

print(examples.result)

The code that I would like to use, but which returns an error:

from translatepy.translators import YandexTranslate

translator = YandexTranslate()

word = "regale"

examples = translator.example(word, "en", "en")  # ParameterValueError: Parameter source_language cannot be equal to the destination_language parameter

print(examples.result)
@ZhymabekRoman
Contributor

ZhymabekRoman commented Apr 24, 2024

Don't mix up two different things: dictionary (словарь) and example (примеры). That documentation is about the dictionary. But that's not the real problem: we use custom, reverse-engineered API endpoints, whereas the endpoint in the documentation requires API keys for authorisation.

@ZhymabekRoman
Contributor

Can you get the same behavior as you need (source language to source language) in Yandex Translate web application or Android application? If not, we can't help you.

@AlexK-1
Author

AlexK-1 commented Apr 24, 2024

Don't mix up two different things: dictionary (словарь) and example (примеры). That documentation is about the dictionary. But that's not the real problem: we use custom, reverse-engineered API endpoints, whereas the endpoint in the documentation requires API keys for authorisation.

I'm sorry that I got something mixed up. But in the example function of the YandexTranslate class there was a link to https://dictionary.yandex.net where there was a link to the documentation that I referred to in the first post.

Can you get the same behavior as you need (source language to source language) in Yandex Translate web application or Android application? If not, we can't help you.

Unfortunately, I couldn't reproduce it. I will try to find a solution to my problem outside of your library. Maybe you can give me some advice?

@ZhymabekRoman
Contributor

But in the example function of the YandexTranslate class there was a link to dictionary.yandex.net where there was a link to the documentation that I referred to in the first post.

The closest thing to the functionality described in the documentation is the dictionary function in translatepy.

Unfortunately, I couldn't reproduce it. I will try to find a solution to my problem outside of your library. Maybe you can give me some advice?

I made some changes to translatepy to get the Yandex dictionary function working, and I get these responses for specific words:
hello:

{'head': {},
 'en': {'syn': [{'text': 'hello',
    'pos': {'code': 'nn', 'text': 'n'},
    'ts': 'həˈləʊ',
    'tr': [{'text': 'hi',
      'pos': {'code': 'nn', 'text': 'n'},
      'fr': 1,
      'syn': [{'text': 'hallo', 'pos': {'code': 'nn', 'text': 'n'}, 'fr': 1},
       {'text': 'salut', 'pos': {'code': 'nn', 'text': 'n'}, 'fr': 1}]},
     {'text': 'good day',
      'pos': {'code': 'nn', 'text': 'n'},
      'fr': 1,
      'syn': [{'text': 'good afternoon',
        'pos': {'code': 'nn', 'text': 'n'},
        'fr': 1}]},
     {'text': 'greetings',
      'pos': {'code': 'nn', 'text': 'n'},
      'fr': 1,
      'syn': [{'text': 'hullo', 'pos': {'code': 'nn', 'text': 'n'}, 'fr': 1},
       {'text': 'hiya', 'pos': {'code': 'nn', 'text': 'n'}, 'fr': 1},
       {'text': 'hey', 'pos': {'code': 'nn', 'text': 'n'}, 'fr': 1},
       {'text': 'howdy', 'pos': {'code': 'nn', 'text': 'n'}, 'fr': 1}]},
     {'text': 'greet', 'pos': {'code': 'vrb', 'text': 'v'}, 'fr': 1},
     {'text': 'yo', 'pos': {'code': 'inv', 'text': 'invar'}, 'fr': 1}]}]}}

regale:

{'head': {},
 'en': {'syn': [{'text': 'regale',
    'pos': {'code': 'vrb', 'text': 'v'},
    'ts': 'rɪˈgeɪl',
    'tr': [{'text': 'treat',
      'pos': {'code': 'vrb', 'text': 'v'},
      'fr': 1,
      'syn': [{'text': 'entertain',
        'pos': {'code': 'vrb', 'text': 'v'},
        'fr': 1},
       {'text': 'feed', 'pos': {'code': 'vrb', 'text': 'v'}, 'fr': 1},
       {'text': 'divert', 'pos': {'code': 'vrb', 'text': 'v'}, 'fr': 1}]},
     {'text': 'feast', 'pos': {'code': 'nn', 'text': 'n'}, 'fr': 1}]}]}}
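
For anyone who wants to work with a response of this shape, here is a minimal sketch of flattening it into a list of words. The helper name `collect_terms` and the trimmed `sample` dict are illustrative assumptions; only the key layout (`syn`, `tr`, `text`, `pos`) is taken from the output above, and this is not translatepy's own parsing code.

```python
from typing import Any, Dict, List, Tuple


def collect_terms(response: Dict[str, Any], lang: str = "en") -> List[Tuple[str, str]]:
    """Return (text, part-of-speech) pairs for every 'tr' entry and its nested 'syn' entries."""
    terms: List[Tuple[str, str]] = []
    for entry in response.get(lang, {}).get("syn", []):
        for tr in entry.get("tr", []):
            terms.append((tr["text"], tr["pos"]["text"]))
            # Each 'tr' entry may carry its own list of synonyms
            for syn in tr.get("syn", []):
                terms.append((syn["text"], syn["pos"]["text"]))
    return terms


# Trimmed copy of the 'regale' response shown above (hypothetical variable name)
sample = {
    "head": {},
    "en": {"syn": [{
        "text": "regale",
        "pos": {"code": "vrb", "text": "v"},
        "tr": [
            {"text": "treat", "pos": {"code": "vrb", "text": "v"}, "fr": 1,
             "syn": [{"text": "entertain", "pos": {"code": "vrb", "text": "v"}, "fr": 1}]},
            {"text": "feast", "pos": {"code": "nn", "text": "n"}, "fr": 1},
        ],
    }]},
}

print(collect_terms(sample))
```

The `.get(...)` calls with defaults are there so that a response without the requested language key yields an empty list instead of raising.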

Is this something you were looking for?

@AlexK-1
Author

AlexK-1 commented Apr 24, 2024

Is this something you were looking for?

No. I need to get examples of sentences with a certain word without having to translate it once again, as shown in the first code example in the first post. Maybe I said something wrong because I used a translator to write messages.


I also noticed that the Yandex API returns each example in the original language together with its translation. Both would be useful to me, but the example function returns only the translated one. Can I get both sentences (original and translated) for an example in the translatepy library, or do I need to write my own function?

An example of a Yandex API response from the browser console:
image

@Animenosekai
Owner

I think the definition of the example function was ambiguous in the current version, but it should be well-defined in the next branch:

@typing.overload
def example(self: C, text: str, source_lang: typing.Union[str, Language] = "auto", *args, **kwargs) -> typing.List[models.ExampleResult[C]]:
    """
    Returns use cases for the given `text`

    Parameters
    ---------
    text: str
        The text to get the example for
    source_lang: str | Language
        The language `text` is in. If "auto", the translator will try to infer the language from `text`

    Returns
    -------
    list[ExampleResult]
        The examples
    """

@dataclasses.dataclass(kw_only=True, slots=True, repr=False)
class ExampleResult(Result[Translator]):
    """
    Holds an example sentence where the given word is used.
    """
    __extra_attributes__ = ("positions",)

    example: str
    """The example"""
    reference: typing.Optional[str] = None
    """Where the example comes from (i.e a book or a the person who said it if it's a quote)"""

    @property
    def position(self) -> typing.Optional[int]:
        """
        The first position of the word in the example
        """
        try:
            return self.positions[0]
        except IndexError:
            return None

    @property
    def positions(self):
        """
        The positions of the word in the example

        Returns
        -------
        list[int]
            A list of positions of the word in the example
        """
        searching = False
        current_letter = 0
        searching_length = len(self.source)
        lower_source = str(self.source).lower()
        positions = []
        for index, letter in enumerate(str(self.example).lower(), start=1):
            same_letter = letter == lower_source[current_letter]
            if same_letter and current_letter == 0 and not searching:
                searching = True
                current_letter += 1  # search for the next letter in `self.source`
            elif same_letter and searching:
                current_letter += 1  # search for the next letter in `self.source`
            else:
                # resetting everything
                current_letter = 0
                searching = False
            if current_letter >= searching_length:
                positions.append(index - searching_length)
                searching = False
                current_letter = 0
        return positions

    def __pretty__(self, cli: bool = False) -> str:
        source_length = len(self.source)
        if cli:
            result = self.example
            for pos in self.positions:
                result = "{before}{bold}{source}{normal}{after}".format(
                    before=result[:pos],
                    bold="\033[1m",
                    source=result[pos:source_length],
                    normal="\033[0m",
                    after=result[pos + source_length:]
                )
        else:
            result = self.example
        indicators = "" if not cli else "\033[90m"
        current = 0
        for pos in self.positions:
            indicators += " " * (pos - current)  # adding the offset until the next tildes
            indicators += "~" * source_length  # adding the tildes under the source word
            current = (pos + source_length)  # end of the current word
        if cli:
            indicators += "\033[0m"
        return """\
{blue}Example{normal}
{blue}-------{normal}
{result}
{indicators}\
""".format(blue="\033[94m" if cli else "",
           normal="\033[0m" if cli else "",
           result=result,
           indicators=indicators)


EXAMPLE_TEST = ExampleResult(
    service=None,
    source="hello",
    source_lang=Language("English"),
    example="Hello everyone, how are you ?"
)
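
To illustrate what the `positions` property and the tilde line in `__pretty__` are meant to produce, here is a standalone sketch built on the standard `re` module. It is an alternative way to find the word, not the code from the next branch: the function names `find_positions` and `underline` are hypothetical, and `re.IGNORECASE` matching may differ in edge cases from the letter-by-letter scan above.

```python
import re
from typing import List


def find_positions(source: str, example: str) -> List[int]:
    """Return the 0-based start index of every case-insensitive, non-overlapping match of `source`."""
    if not source:
        return []
    # re.escape keeps characters such as "." or "?" in `source` literal
    pattern = re.compile(re.escape(source), re.IGNORECASE)
    return [match.start() for match in pattern.finditer(example)]


def underline(source: str, example: str) -> str:
    """Return `example` followed by a line with tildes under each match of `source`."""
    indicators = ""
    current = 0
    for pos in find_positions(source, example):
        indicators += " " * (pos - current)  # pad up to the start of the match
        indicators += "~" * len(source)      # tildes under the matched word
        current = pos + len(source)          # end of the current match
    return example + "\n" + indicators


print(find_positions("hello", "Hello everyone, how are you ?"))
print(underline("hello", "Hello everyone, how are you ?"))
```

Because `finditer` yields non-overlapping matches in order, the padding `pos - current` is never negative.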

This makes me think that RichDictionaryResult should be able to optionally hold examples too.

I just checked the next branch, and it seems that no translator returns examples there; the feature might not be reimplemented yet.

As for the current stable version, those functions are hit or miss. They exist because some translators, such as Yandex and DeepL, supported some kind of “example” feature, but its behavior seemed to differ depending on the website used.

Note

For example, the use of a destination_language doesn't feel right in this context, which is corrected in the new branch.

It might also be worth digging a bit deeper into the current web implementations of the example feature on the supported websites.

@AlexK-1
Author

AlexK-1 commented Apr 25, 2024

As a result, I wrote my own simple function to get a list of sample sentences using your exceptions. I'll use it for now.

import requests
from translatepy.translators.yandex import YandexTranslateException
from translatepy.translators import BaseTranslator


def get_examples(text: str, source_language: str, destination_language: str) -> list:
    """Return example sentence pairs ({"src": ..., "dst": ...}) for `text`."""
    # Reuse translatepy's language-pair validation (called unbound, so `self` is None)
    BaseTranslator._validate_language_pair(None, source_language, destination_language)
    # Passing the query through `params` lets requests URL-encode `text`
    response = requests.get(
        "https://dictionary.yandex.net/dicservice.json/queryCorpus",
        params={
            "ui": "en",
            "src": text,
            "lang": f"{source_language}-{destination_language}",
            "flags": 7,
            "srv": "android",
            "v": 2,
            "maxlen": 200,
        },
    )
    if response.status_code >= 400:
        raise YandexTranslateException(response.status_code, response.json()["message"])
    result = []
    for example in response.json()["result"]["examples"]:
        # Strip the "<" and ">" characters from both sentences
        result.append({
            "src": example["src"].replace("<", "").replace(">", ""),
            "dst": example["dst"].replace("<", "").replace(">", "")
        })
    return result
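
The parsing half of the function above can be tried without any network access. The sketch below assumes the payload layout implied by that function (`result` → `examples` → `src`/`dst`); the names `strip_markers` and `parse_examples` are hypothetical, and `sample_payload` is invented for illustration rather than copied from a real API response.

```python
from typing import Any, Dict, List


def strip_markers(sentence: str) -> str:
    """Remove the '<' and '>' characters from a sentence."""
    return sentence.replace("<", "").replace(">", "")


def parse_examples(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Turn a decoded JSON payload into a list of {'src', 'dst'} sentence pairs."""
    examples = payload.get("result", {}).get("examples", [])
    return [
        {"src": strip_markers(item["src"]), "dst": strip_markers(item["dst"])}
        for item in examples
    ]


# Hypothetical payload, invented for illustration; not a real API response
sample_payload = {
    "result": {
        "examples": [
            {"src": "They <regale> us with stories.", "dst": "Они <угощают> нас историями."},
        ]
    }
}

print(parse_examples(sample_payload))
```

Keeping the parsing separate from the HTTP call makes it possible to check the original and translated sentences against a saved response before hitting the endpoint.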
