API Reference

HTTP API

Python API wrapper for the LanguageTool REST API.

Simple usage:

>>> from pylanguagetool import api
>>> api.check(
...     'This is a example',
...     api_url='https://languagetool.org/api/v2/',
...     lang='en-US',
... )
{'software': {'name': 'LanguageTool', 'version': '4.6-SNAPSHOT', 'buildDate': '2019-05-15 19:25', 'apiVersion': 1, 'premium': False, 'premiumHint': 'You might be missing errors only the Premium version can find. Contact us at support<at>languagetoolplus.com.', 'status': ''}, 'warnings': {'incompleteResults': False}, 'language': {'name': 'English (US)', 'code': 'en-US', 'detectedLanguage': {'name': 'English (US)', 'code': 'en-US', 'confidence': 0.561}}, 'matches': [{'message': 'Use "an" instead of \'a\' if the following word starts with a vowel sound, e.g. \'an article\', \'an hour\'', 'shortMessage': 'Wrong article', 'replacements': [{'value': 'an'}], 'offset': 8, 'length': 1, 'context': {'text': 'This is a example', 'offset': 8, 'length': 1}, 'sentence': 'This is a example', 'type': {'typeName': 'Other'}, 'rule': {'id': 'EN_A_VS_AN', 'description': "Use of 'a' vs. 'an'", 'issueType': 'misspelling', 'category': {'id': 'MISC', 'name': 'Miscellaneous'}}, 'ignoreForIncompleteSentence': False, 'contextForSureMatch': 1}]}
pylanguagetool.api.check(input_text, api_url, lang, mother_tongue=None, preferred_variants=None, enabled_rules=None, disabled_rules=None, enabled_categories=None, disabled_categories=None, enabled_only=False, verbose=False, pwl=None, **kwargs)

Check given text and return API response as a dictionary.

Parameters
  • input_text (str) – Plain text that will be checked for spelling mistakes.

  • api_url (str) – API base url, e.g. https://languagetool.org/api/v2/

  • lang

    Language of the given text as RFC 3066 language code. For example en-GB or de-AT. auto is a valid value too and will cause the language to be detected automatically.

  • mother_tongue

    Native language of the author as RFC 3066 language code.

  • preferred_variants (str) – Comma-separated list of preferred language variants. The language detector used with lang=auto can detect, for example, English, but it cannot decide whether British or American English is used. This parameter specifies the preferred variants, such as en-GB and de-AT. Only available with lang=auto.

  • enabled_rules (str) – Comma-separated list of IDs of rules to be enabled.

  • disabled_rules (str) – Comma-separated list of IDs of rules to be disabled.

  • enabled_categories (str) – Comma-separated list of IDs of categories to be enabled.

  • disabled_categories (str) – Comma-separated list of IDs of categories to be disabled.

  • enabled_only (bool) – If True, only the rules and categories whose IDs are specified with enabled_rules or enabled_categories are enabled. Defaults to False.

  • verbose (bool) – If True, a more verbose output will be printed. Defaults to False.

  • pwl (List[str]) – Personal word list. A custom dictionary of words that should not be reported as spelling errors.
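
The rule and category parameters above take comma-separated strings, while pwl takes a list. The following is a minimal sketch of assembling those arguments from Python lists. The helper name build_check_kwargs is hypothetical and not part of pylanguagetool, and the word passed in pwl is only illustrative; the rule ID EN_A_VS_AN is taken from the sample response in this reference.

```python
def build_check_kwargs(disabled_rules=(), enabled_categories=(), pwl=()):
    """Hypothetical helper: build keyword arguments for api.check()."""
    kwargs = {}
    if disabled_rules:
        # Rule IDs are passed as one comma-separated string.
        kwargs["disabled_rules"] = ",".join(disabled_rules)
    if enabled_categories:
        # Category IDs use the same comma-separated format.
        kwargs["enabled_categories"] = ",".join(enabled_categories)
    if pwl:
        # The personal word list is passed as a list of strings.
        kwargs["pwl"] = list(pwl)
    return kwargs


kwargs = build_check_kwargs(disabled_rules=["EN_A_VS_AN"], pwl=["pylanguagetool"])
# The result could then be forwarded to the documented function, e.g.:
# api.check(text, api_url="https://languagetool.org/api/v2/", lang="en-US", **kwargs)
```

Keeping the rule IDs in lists until the last moment makes it easier to add or remove entries without editing a long string by hand.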

Returns

A dictionary representation of the JSON API response. The most notable key is matches, which contains a list of all spelling mistakes that have been found.

E.g.:

{'language': {'code': 'en-US',
              'detectedLanguage': {'code': 'en-US',
                                   'confidence': 0.561,
                                   'name': 'English (US)'},
              'name': 'English (US)'},
 'matches': [{'context': {'length': 1,
                          'offset': 8,
                          'text': 'This is a example'},
              'contextForSureMatch': 1,
              'ignoreForIncompleteSentence': False,
              'length': 1,
              'message': 'Use "an" instead of \'a\' if the following word '
                         "starts with a vowel sound, e.g. 'an article', 'an "
                         "hour'",
              'offset': 8,
              'replacements': [{'value': 'an'}],
              'rule': {'category': {'id': 'MISC', 'name': 'Miscellaneous'},
                       'description': "Use of 'a' vs. 'an'",
                       'id': 'EN_A_VS_AN',
                       'issueType': 'misspelling'},
              'sentence': 'This is a example',
              'shortMessage': 'Wrong article',
              'type': {'typeName': 'Other'}}],
 'software': {'apiVersion': 1,
              'buildDate': '2019-05-15 19:25',
              'name': 'LanguageTool',
              'premium': False,
              'premiumHint': 'You might be missing errors only the Premium '
                             'version can find. Contact us at '
                             'support<at>languagetoolplus.com.',
              'status': '',
              'version': '4.6-SNAPSHOT'},
 'warnings': {'incompleteResults': False}}

Return type

dict
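
Each entry in matches carries an offset, a length and a list of replacements, which is enough to apply suggestions to the original text. The sketch below shows one way to do that. The function apply_first_suggestions is hypothetical and not part of pylanguagetool, and it assumes that the reported offsets line up with Python string indices, which holds for the sample text used here. The response is a trimmed-down copy of the example above, so no network call is made.

```python
def apply_first_suggestions(text, response):
    """Replace each match with its first suggested replacement, if any."""
    # Work from the end of the text so earlier offsets stay valid.
    ordered = sorted(response["matches"], key=lambda m: m["offset"], reverse=True)
    for match in ordered:
        replacements = match.get("replacements") or []
        if not replacements:
            continue  # nothing to suggest for this match
        start = match["offset"]
        end = start + match["length"]
        text = text[:start] + replacements[0]["value"] + text[end:]
    return text


# Trimmed-down copy of the sample response documented above.
sample_response = {
    "matches": [
        {
            "shortMessage": "Wrong article",
            "offset": 8,
            "length": 1,
            "replacements": [{"value": "an"}],
        }
    ]
}

corrected = apply_first_suggestions("This is a example", sample_response)
print(corrected)
```

Blindly taking the first suggestion is rarely what an interactive tool wants; a real caller would more likely show the message and let the user choose.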

pylanguagetool.api.get_languages(api_url)

Return supported languages as a list of dictionaries.

Parameters

api_url (str) – API base url.

Returns

Supported languages as a list of dictionaries.

Each dictionary contains three keys, name, code and longCode:

{
    "name":"English (GB)",
    "code":"en",
    "longCode":"en-GB"
}

Return type

List[dict]
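
Because the result is a plain list of dictionaries, it can be filtered with ordinary list operations. The sketch below collects every long code that belongs to a given short code. The helper variants_of is hypothetical, and the list is hand-written sample data in the documented shape rather than real server output.

```python
def variants_of(languages, code):
    """Return the longCode of every entry whose short code matches."""
    return [lang["longCode"] for lang in languages if lang["code"] == code]


# Hand-written sample in the shape returned by get_languages().
languages = [
    {"name": "English (GB)", "code": "en", "longCode": "en-GB"},
    {"name": "English (US)", "code": "en", "longCode": "en-US"},
    {"name": "German (Austria)", "code": "de", "longCode": "de-AT"},
]

english_variants = variants_of(languages, "en")
print(english_variants)
```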

CLI

A Python library and CLI tool for the LanguageTool JSON API.

pylanguagetool.get_clipboard()

Return text stored in the operating system’s clipboard.

Returns

Text stored in the operating system’s clipboard.

Return type

str

pylanguagetool.get_input_text(config)

Return text from stdin, clipboard or file.

Returns

A tuple consisting of the text and an optional file extension. If the text does not come from a file, the extension part of the tuple will be None.

Return type

Tuple[str, Optional[str]]

Converters

Support spell checking of various file formats by converting them to plain text.

pylanguagetool.converters.convert(source, texttype)

Convert files of various types to plain text.

Parameters
  • texttype (str) – file extension of the input file

  • source (str) – content of the input file

Returns

plaintext output

Return type

str

pylanguagetool.converters.html2text(html)

Convert HTML to plain text by parsing it with BeautifulSoup and removing code.

Parameters

html (str) – HTML string

Returns

plaintext

Return type

str
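
To illustrate the idea of stripping code while keeping prose, here is a rough sketch built on the standard library's html.parser. It is not the library's implementation, which uses BeautifulSoup, and it assumes that "removing code" means dropping the contents of code and pre elements. The names _TextExtractor and html_to_text_sketch are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside <code> or <pre>."""

    SKIPPED_TAGS = {"code", "pre"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text_sketch(html):
    """Return the text content of an HTML string without code blocks."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


text = html_to_text_sketch("<p>Run <code>pip install x</code> first.</p>")
print(text)
```

Dropping code before checking avoids a flood of false positives from identifiers and shell commands, which a spell checker would otherwise flag.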

pylanguagetool.converters.ipynb2markdown(ipynb)

Extract the Markdown cells from an IPython notebook.

Parameters

ipynb (str) – Content of an IPython notebook JSON file

Returns

Markdown

Return type

str
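
A notebook file is JSON with a top-level cells list, where each cell has a cell_type and a source that is either a string or a list of lines. The sketch below extracts the Markdown cells using only the json module. It illustrates the approach under those assumptions and is not the library's own code; the function name markdown_cells and the separator between cells are choices made for this example.

```python
import json


def markdown_cells(ipynb):
    """Return the source of all Markdown cells, separated by blank lines."""
    notebook = json.loads(ipynb)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # skip code and raw cells
        source = cell.get("source", "")
        # The notebook format allows "source" to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks)


# Small hand-written notebook used only for this example.
notebook_json = json.dumps({
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n", "Some text."]},
        {"cell_type": "code", "metadata": {}, "source": ["print('hi')"],
         "outputs": [], "execution_count": None},
        {"cell_type": "markdown", "metadata": {}, "source": "More text."},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

result = markdown_cells(notebook_json)
print(result)
```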

pylanguagetool.converters.markdown2html(markdown)

Convert Markdown to HTML via markdown2.

Parameters

markdown (str) – Markdown text

Returns

HTML

Return type

str

pylanguagetool.converters.rst2html(rst)

Convert reStructuredText to HTML with docutils.

Parameters

rst (str) – reStructuredText

Returns

HTML

Return type

str

pylanguagetool.converters.transifexjson2txt(jsondata)

Extract translations from a Transifex JSON file.

Parameters

jsondata (str) – Transifex export file

Returns

Plaintext translations

Return type

str

pylanguagetool.converters.xliff2txt(source)

Extract translations from an XLIFF file.

Parameters

source (str) – XLIFF XML string

Returns

Plaintext translations

Return type

str
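
As an illustration of what extracting translations from XLIFF can look like, the sketch below collects the text of every target element with xml.etree.ElementTree. It assumes XLIFF 1.2 with its standard namespace and assumes that "translations" means the target elements; the library's actual behaviour may differ. The function name xliff_targets and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by XLIFF 1.2 documents (assumed version for this sketch).
XLIFF_NS = "{urn:oasis:names:tc:xliff:document:1.2}"


def xliff_targets(source):
    """Return the text of all <target> elements, one per line."""
    root = ET.fromstring(source)
    texts = []
    for target in root.iter(XLIFF_NS + "target"):
        # itertext() also picks up text nested in inline markup.
        text = "".join(target.itertext()).strip()
        if text:
            texts.append(text)
    return "\n".join(texts)


sample = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="de" datatype="plaintext" original="messages">
    <body>
      <trans-unit id="1"><source>Hello</source><target>Hallo</target></trans-unit>
      <trans-unit id="2"><source>World</source><target>Welt</target></trans-unit>
    </body>
  </file>
</xliff>"""

translations = xliff_targets(sample)
print(translations)
```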