API Reference

HTTP API

Python API wrapper for the LanguageTool REST API.

Simple usage:

>>> from pylanguagetool import api
>>> api.check(
...     'This is a example',
...     api_url='https://languagetool.org/api/v2/',
...     lang='en-US',
... )
{'software': {'name': 'LanguageTool', 'version': '4.6-SNAPSHOT', 'buildDate': '2019-05-15 19:25', 'apiVersion': 1, 'premium': False, 'premiumHint': 'You might be missing errors only the Premium version can find. Contact us at support<at>languagetoolplus.com.', 'status': ''}, 'warnings': {'incompleteResults': False}, 'language': {'name': 'English (US)', 'code': 'en-US', 'detectedLanguage': {'name': 'English (US)', 'code': 'en-US', 'confidence': 0.561}}, 'matches': [{'message': 'Use "an" instead of \'a\' if the following word starts with a vowel sound, e.g. \'an article\', \'an hour\'', 'shortMessage': 'Wrong article', 'replacements': [{'value': 'an'}], 'offset': 8, 'length': 1, 'context': {'text': 'This is a example', 'offset': 8, 'length': 1}, 'sentence': 'This is a example', 'type': {'typeName': 'Other'}, 'rule': {'id': 'EN_A_VS_AN', 'description': "Use of 'a' vs. 'an'", 'issueType': 'misspelling', 'category': {'id': 'MISC', 'name': 'Miscellaneous'}}, 'ignoreForIncompleteSentence': False, 'contextForSureMatch': 1}]}
pylanguagetool.api.check(input_text, api_url, lang, mother_tongue=None, preferred_variants=None, enabled_rules=None, disabled_rules=None, enabled_categories=None, disabled_categories=None, enabled_only=False, picky=False, verbose=False, pwl=None, **kwargs)

Check given text and return API response as a dictionary.

Parameters:
  • input_text (str) – Plain text that will be checked for spelling mistakes.

  • api_url (str) – API base URL, e.g. https://languagetool.org/api/v2/

  • lang – Language of the given text as an RFC 3066 language code, for example en-GB or de-AT. auto is also a valid value and causes the language to be detected automatically.

  • mother_tongue – Native language of the author as an RFC 3066 language code.

  • preferred_variants (str) – Comma-separated list of preferred language variants. The language detector used with lang=auto can detect, for example, English, but it cannot decide whether British or American English is used. This parameter specifies the preferred variants, such as en-GB and de-AT. Only available with lang=auto.

  • enabled_rules (str) – Comma-separated list of IDs of rules to be enabled.

  • disabled_rules (str) – Comma-separated list of IDs of rules to be disabled.

  • enabled_categories (str) – Comma-separated list of IDs of categories to be enabled.

  • disabled_categories (str) – Comma-separated list of IDs of categories to be disabled.

  • enabled_only (bool) – If True, only the rules and categories whose IDs are specified with enabled_rules or enabled_categories are enabled. Defaults to False.

  • picky (bool) – If True, additional rules are activated. Defaults to False.

  • verbose (bool) – If True, a more verbose output will be printed. Defaults to False.

  • pwl (List[str]) – Personal word list. A custom dictionary of words that should not be reported as spelling errors.
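
Because the rule and category parameters take comma-separated strings, it can be convenient to assemble them from Python lists. The following is a minimal sketch under that assumption; build_rule_options is a hypothetical helper and not part of pylanguagetool.

```python
def build_rule_options(enabled_rules=(), disabled_rules=(), enabled_only=False):
    """Hypothetical helper: turn lists of rule IDs into keyword arguments.

    The returned keys mirror the parameter names documented above.
    """
    options = {"enabled_only": enabled_only}
    if enabled_rules:
        # The parameters expect a single comma-separated string.
        options["enabled_rules"] = ",".join(enabled_rules)
    if disabled_rules:
        options["disabled_rules"] = ",".join(disabled_rules)
    return options


options = build_rule_options(enabled_rules=["EN_A_VS_AN"], enabled_only=True)
print(options)
```

The resulting dictionary could then be unpacked into a call such as api.check(text, api_url=url, lang='en-US', **options), where text and url are placeholders for your own values.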

Returns:

A dictionary representation of the JSON API response. The most notable key is matches, which contains a list of all spelling mistakes that have been found.

E.g.:

{
    "language": {
        "code": "en-US",
        "detectedLanguage": {
            "code": "en-US",
            "confidence": 0.561,
            "name": "English (US)",
        },
        "name": "English (US)",
    },
    "matches": [
        {
            "context": {"length": 1, "offset": 8, "text": "This is a example"},
            "contextForSureMatch": 1,
            "ignoreForIncompleteSentence": False,
            "length": 1,
            "message": "Use \"an\" instead of 'a' if the following word "
            "starts with a vowel sound, e.g. 'an article', 'an "
            "hour'",
            "offset": 8,
            "replacements": [{"value": "an"}],
            "rule": {
                "category": {"id": "MISC", "name": "Miscellaneous"},
                "description": "Use of 'a' vs. 'an'",
                "id": "EN_A_VS_AN",
                "issueType": "misspelling",
            },
            "sentence": "This is a example",
            "shortMessage": "Wrong article",
            "type": {"typeName": "Other"},
        }
    ],
    "software": {
        "apiVersion": 1,
        "buildDate": "2019-05-15 19:25",
        "name": "LanguageTool",
        "premium": False,
        "premiumHint": "You might be missing errors only the Premium "
        "version can find. Contact us at "
        "support<at>languagetoolplus.com.",
        "status": "",
        "version": "4.6-SNAPSHOT",
    },
    "warnings": {"incompleteResults": False},
}

Return type:

dict
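
Each entry in matches carries an offset, a length and a list of replacements, which is enough to apply suggestions to the checked text. The sketch below illustrates one way to do that; apply_first_suggestions is a hypothetical helper, sample_response is a trimmed copy of the response shown above, and the code assumes the reported offsets can be used directly as Python string indices, which holds for this example.

```python
def apply_first_suggestions(text, response):
    """Return text with the first suggested replacement applied per match."""
    corrected = text
    # Apply matches from the end of the text backwards so that the
    # offsets of matches not yet processed remain valid.
    matches = sorted(response["matches"], key=lambda m: m["offset"], reverse=True)
    for match in matches:
        replacements = match.get("replacements") or []
        if not replacements:
            continue  # nothing suggested for this match
        start = match["offset"]
        end = start + match["length"]
        corrected = corrected[:start] + replacements[0]["value"] + corrected[end:]
    return corrected


# Trimmed-down copy of the example response above (no network access needed).
sample_response = {
    "matches": [
        {
            "offset": 8,
            "length": 1,
            "shortMessage": "Wrong article",
            "replacements": [{"value": "an"}],
            "rule": {"id": "EN_A_VS_AN"},
        }
    ]
}

print(apply_first_suggestions("This is a example", sample_response))
```

Blindly taking the first replacement is a simplification; an interactive tool would normally let the user choose among the suggestions.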

pylanguagetool.api.get_languages(api_url)

Return supported languages as a list of dictionaries.

Parameters:

api_url (str) – API base URL.

Returns:

Supported languages as a list of dictionaries.

Each dictionary contains three keys, name, code and longCode:

{
    "name":"English (GB)",
    "code":"en",
    "longCode":"en-GB"
}

Return type:

List[dict]
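
A list in this shape can be searched with plain Python. The sketch below looks up an entry by its longCode; find_language is a hypothetical helper, and sample_languages is made-up sample data in the documented format rather than real server output.

```python
def find_language(languages, long_code):
    """Return the first entry whose longCode equals long_code, or None."""
    for language in languages:
        if language.get("longCode") == long_code:
            return language
    return None


# Sample data in the documented shape (not an actual API response).
sample_languages = [
    {"name": "English (GB)", "code": "en", "longCode": "en-GB"},
    {"name": "German (Austria)", "code": "de", "longCode": "de-AT"},
]

print(find_language(sample_languages, "de-AT"))
```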

CLI

Converters

Support spell checking of various file formats by converting them to plain text.

pylanguagetool.converters.convert(source, texttype)

Convert files of various types to plaintext.

Parameters:
  • source (str) – Content of the input file.

  • texttype (str) – File extension of the input file.

Returns:

Plaintext output

Return type:

str

pylanguagetool.converters.html2text(html)

Convert HTML to plaintext by parsing it with BeautifulSoup and removing code.

Parameters:

html (str) – HTML string

Returns:

Plaintext

Return type:

str
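
To illustrate the general idea of stripping markup and code before spell checking, here is a minimal sketch that uses the standard library's html.parser instead of BeautifulSoup. It is not the library's implementation: html_to_text_sketch is a hypothetical name, and the set of skipped tags (code, pre, script, style) is an assumption about what "code" should cover.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, ignoring anything inside the skipped tags."""

    SKIPPED_TAGS = {"code", "pre", "script", "style"}  # assumed set

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Chunks are joined with spaces and whitespace is collapsed, so
        # inline markup inside a single word would split that word.
        return " ".join(" ".join(self._chunks).split())


def html_to_text_sketch(html):
    """Hypothetical stand-in that returns the visible, non-code text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


print(html_to_text_sketch("<p>Run <code>pip install x</code> first.</p>"))
```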

pylanguagetool.converters.ipynb2markdown(ipynb)

Extract Markdown cells from an IPython notebook.

Parameters:

ipynb (str) – IPython notebook JSON file

Returns:

Markdown

Return type:

str
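
A notebook file is a JSON document whose cells list records a cell_type and a source for each cell. The sketch below shows how Markdown cells could be pulled out with the standard library alone; notebook_markdown_sketch is a hypothetical name, and joining cells with a blank line is an assumption, not necessarily what the library does.

```python
import json


def notebook_markdown_sketch(ipynb):
    """Return the concatenated source of all Markdown cells."""
    notebook = json.loads(ipynb)
    parts = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # skip code and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        parts.append(source)
    # Assumption: separate cells with one blank line.
    return "\n\n".join(parts)


# Small made-up notebook for demonstration.
sample_notebook = json.dumps(
    {
        "nbformat": 4,
        "nbformat_minor": 5,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n", "Some text"]},
            {
                "cell_type": "code",
                "metadata": {},
                "source": ["print(1)"],
                "outputs": [],
                "execution_count": None,
            },
        ],
    }
)

print(notebook_markdown_sketch(sample_notebook))
```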

pylanguagetool.converters.markdown2html(markdown)

Convert Markdown to HTML via markdown2.

Parameters:

markdown (str) – Markdown text

Returns:

HTML

Return type:

str

pylanguagetool.converters.rst2html(rst)

Convert reStructuredText to HTML with docutils.

Parameters:

rst (str) – reStructuredText

Returns:

HTML

Return type:

str

pylanguagetool.converters.transifexjson2txt(jsondata)

Extract translations from a Transifex JSON file.

Parameters:

jsondata (str) – Transifex export file

Returns:

Plaintext translations

Return type:

str

pylanguagetool.converters.xliff2txt(source)

Extract translations from an XLIFF file.

Parameters:

source (str) – XLIFF XML string

Returns:

Plaintext translations

Return type:

str
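
As an illustration of what extracting translations from XLIFF can look like, the sketch below collects the text of every target element using xml.etree.ElementTree. It assumes XLIFF 1.2 with its standard namespace and one translation per output line; xliff_targets_sketch is a hypothetical name and may differ from how the library handles the format.

```python
import xml.etree.ElementTree as ET

# Assumption: the input uses the XLIFF 1.2 namespace.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def xliff_targets_sketch(source):
    """Return the text of all non-empty <target> elements, one per line."""
    root = ET.fromstring(source)
    texts = []
    for target in root.iterfind(".//x:target", XLIFF_NS):
        # itertext() also picks up text nested in inline child elements.
        text = "".join(target.itertext()).strip()
        if text:
            texts.append(text)
    return "\n".join(texts)


# Small made-up XLIFF document for demonstration.
sample_xliff = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="de" datatype="plaintext" original="demo">
    <body>
      <trans-unit id="1"><source>Hello</source><target>Hallo</target></trans-unit>
      <trans-unit id="2"><source>Goodbye</source><target>Auf Wiedersehen</target></trans-unit>
    </body>
  </file>
</xliff>"""

print(xliff_targets_sketch(sample_xliff))
```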