API Reference

HTTP API

Python wrapper for the LanguageTool REST API.

Simple usage:

>>> from pylanguagetool import api
>>> api.check(
...     'This is a example',
...     api_url='https://languagetool.org/api/v2/',
...     lang='en-US',
... )
{'software': {'name': 'LanguageTool', 'version': '4.6-SNAPSHOT', 'buildDate': '2019-05-15 19:25', 'apiVersion': 1, 'premium': False, 'premiumHint': 'You might be missing errors only the Premium version can find. Contact us at support<at>languagetoolplus.com.', 'status': ''}, 'warnings': {'incompleteResults': False}, 'language': {'name': 'English (US)', 'code': 'en-US', 'detectedLanguage': {'name': 'English (US)', 'code': 'en-US', 'confidence': 0.561}}, 'matches': [{'message': 'Use "an" instead of \'a\' if the following word starts with a vowel sound, e.g. \'an article\', \'an hour\'', 'shortMessage': 'Wrong article', 'replacements': [{'value': 'an'}], 'offset': 8, 'length': 1, 'context': {'text': 'This is a example', 'offset': 8, 'length': 1}, 'sentence': 'This is a example', 'type': {'typeName': 'Other'}, 'rule': {'id': 'EN_A_VS_AN', 'description': "Use of 'a' vs. 'an'", 'issueType': 'misspelling', 'category': {'id': 'MISC', 'name': 'Miscellaneous'}}, 'ignoreForIncompleteSentence': False, 'contextForSureMatch': 1}]}

pylanguagetool.api.check(input_text, api_url, lang, mother_tongue=None, preferred_variants=None, enabled_rules=None, disabled_rules=None, enabled_categories=None, disabled_categories=None, enabled_only=False, picky=False, verbose=False, pwl=None, **kwargs)

Check given text and return API response as a dictionary.

Parameters:

input_text (str) – Plain text that will be checked for spelling mistakes.
api_url (str) – API base URL, e.g. https://languagetool.org/api/v2/
lang – Language of the given text as an RFC 3066 language code, for example en-GB or de-AT. auto is also a valid value and causes the language to be detected automatically.
mother_tongue – Native language of the author as an RFC 3066 language code.
preferred_variants (str) – Comma-separated list of preferred language variants. The language detector used with lang=auto can detect, for example, English, but it cannot decide whether British English or American English is used. Use this parameter to specify the preferred variants, such as en-GB and de-AT. Only available with lang=auto.
enabled_rules (str) – Comma-separated list of IDs of rules to be enabled.
disabled_rules (str) – Comma-separated list of IDs of rules to be disabled.
enabled_categories (str) – Comma-separated list of IDs of categories to be enabled.
disabled_categories (str) – Comma-separated list of IDs of categories to be disabled.
enabled_only (bool) – If True, only the rules and categories whose IDs are given in enabled_rules or enabled_categories are enabled. Defaults to False.
picky (bool) – If True, additional rules are activated. Defaults to False.
verbose (bool) – If True, more verbose output is printed. Defaults to False.
pwl (List[str]) – Personal word list. A custom dictionary of words that should not be reported as spelling errors.
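
The rule and category parameters above are documented as comma-separated strings, while pwl is a list. A minimal sketch of preparing those values and passing them to check might look like the following. join_ids and check_with_rules are hypothetical helper names, not part of pylanguagetool, and the rule IDs in the usage note are only examples. The import sits inside the function so that the helpers can be defined without the package installed.

```python
def join_ids(ids):
    """Join rule or category IDs into a comma-separated string."""
    return ",".join(i.strip() for i in ids if i.strip())


def check_with_rules(text, disabled_rules=(), pwl=None):
    """Hypothetical wrapper around api.check(); performs an HTTP request."""
    from pylanguagetool import api  # imported lazily; requires pylanguagetool

    return api.check(
        text,
        api_url="https://languagetool.org/api/v2/",
        lang="en-US",
        # Pass None rather than an empty string when no IDs are given.
        disabled_rules=join_ids(disabled_rules) or None,
        pwl=pwl,  # list of words to exclude from spelling errors
    )
```

For instance, join_ids(["EN_A_VS_AN", "WHITESPACE_RULE"]) produces the string "EN_A_VS_AN,WHITESPACE_RULE", which is the form the documented parameters take.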

Returns:

A dictionary representation of the JSON API response. The most notable key is matches, which contains a list of all spelling mistakes that have been found.

E.g.:

{
    "language": {
        "code": "en-US",
        "detectedLanguage": {
            "code": "en-US",
            "confidence": 0.561,
            "name": "English (US)",
        },
        "name": "English (US)",
    },
    "matches": [
        {
            "context": {"length": 1, "offset": 8, "text": "This is a example"},
            "contextForSureMatch": 1,
            "ignoreForIncompleteSentence": False,
            "length": 1,
            "message": "Use \"an\" instead of 'a' if the following word "
            "starts with a vowel sound, e.g. 'an article', 'an "
            "hour'",
            "offset": 8,
            "replacements": [{"value": "an"}],
            "rule": {
                "category": {"id": "MISC", "name": "Miscellaneous"},
                "description": "Use of 'a' vs. 'an'",
                "id": "EN_A_VS_AN",
                "issueType": "misspelling",
            },
            "sentence": "This is a example",
            "shortMessage": "Wrong article",
            "type": {"typeName": "Other"},
        }
    ],
    "software": {
        "apiVersion": 1,
        "buildDate": "2019-05-15 19:25",
        "name": "LanguageTool",
        "premium": False,
        "premiumHint": "You might be missing errors only the Premium "
        "version can find. Contact us at "
        "support<at>languagetoolplus.com.",
        "status": "",
        "version": "4.6-SNAPSHOT",
    },
    "warnings": {"incompleteResults": False},
}

Return type:

dict
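
Each entry in matches carries an offset, a length and a list of replacements, so a response can be turned into corrected text. The following is a sketch under assumptions, not part of pylanguagetool: apply_first_suggestions is a hypothetical helper, it blindly takes the first suggestion, and it assumes the offsets index Python string characters, which may not hold for text containing characters such as emoji.

```python
def apply_first_suggestions(text, response):
    """Return text with the first suggested replacement applied per match."""
    # Work from the end of the text so that earlier offsets stay valid.
    ordered = sorted(response["matches"], key=lambda m: m["offset"], reverse=True)
    for match in ordered:
        replacements = match.get("replacements") or []
        if not replacements:
            continue  # nothing to apply for this match
        start = match["offset"]
        end = start + match["length"]
        text = text[:start] + replacements[0]["value"] + text[end:]
    return text


# Trimmed-down response in the shape of the example above.
sample_response = {
    "matches": [
        {"offset": 8, "length": 1, "replacements": [{"value": "an"}]},
    ]
}
corrected = apply_first_suggestions("This is a example", sample_response)
```

With the sample response, corrected is "This is an example". In practice a caller would usually show message and the suggestions to the user instead of applying them automatically.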

pylanguagetool.api.get_languages(api_url)

Return supported languages as a list of dictionaries.

Parameters:

api_url (str) – API base URL.

Returns:

Supported languages as a list of dictionaries.

Each dictionary contains three keys, name, code and longCode:

{
    "name": "English (GB)",
    "code": "en",
    "longCode": "en-GB"
}

Return type:

List[dict]
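
Because the result is a plain list of dictionaries, it is easy to index by language code. A small sketch, using illustrative sample data in place of a live call to get_languages; index_by_long_code is a hypothetical helper name:

```python
def index_by_long_code(languages):
    """Map each longCode (e.g. 'en-GB') to its human-readable name."""
    return {entry["longCode"]: entry["name"] for entry in languages}


# Illustrative data in the documented shape, not real API output.
sample_languages = [
    {"name": "English (GB)", "code": "en", "longCode": "en-GB"},
    {"name": "German (Austria)", "code": "de", "longCode": "de-AT"},
]
names = index_by_long_code(sample_languages)
```

A lookup such as names.get("en-GB") can then be used to validate a lang value before calling check.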

CLI

Converters

Support spellchecking various file formats by converting them to plain text.

pylanguagetool.converters.convert(source, texttype)

Convert files of various types to plaintext.

Parameters:

source (str) – Content of the input file.
texttype (str) – File extension of the input file.

Returns:

Plaintext output.

Return type:

str
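
Since texttype is described as the file extension, a caller typically derives it from the file name. The sketch below is an assumption-laden illustration: extension_of and convert_file are hypothetical helpers, and whether convert expects the extension without the leading dot (as produced here) is not stated in this reference.

```python
import os


def extension_of(path):
    """Return the lowercased file extension without the leading dot."""
    return os.path.splitext(path)[1].lstrip(".").lower()


def convert_file(path):
    """Hypothetical helper: read a file and pass its content to convert()."""
    from pylanguagetool import converters  # requires pylanguagetool

    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    # Assumption: convert() takes the extension without a leading dot.
    return converters.convert(source, extension_of(path))
```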

pylanguagetool.converters.html2text(html)

Convert HTML to plaintext by parsing it with BeautifulSoup and removing code.

Parameters: html (str) – HTML string
Returns: Plaintext
Return type: str
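
To illustrate the idea of dropping code before spellchecking, here is a standard-library sketch that uses html.parser instead of BeautifulSoup. It is not the library's implementation: html_to_text is a hypothetical name, the set of skipped tags is an assumption, and whitespace is collapsed for readability.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything nested inside the listed tags."""

    # Assumption: these are the elements treated as "code".
    SKIPPED = {"code", "pre", "script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML string with code-like elements removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by removed elements.
    return " ".join("".join(parser.parts).split())


text = html_to_text("<p>Run <code>pip install x</code> first.</p>")
```

Here text is "Run first.": the inline code is gone, so the checker does not flag command names or identifiers as spelling mistakes.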

pylanguagetool.converters.ipynb2markdown(ipynb)

Extract Markdown cells from an IPython notebook.

Parameters: ipynb (str) – IPython notebook JSON file
Returns: Markdown
Return type: str
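
A notebook file is JSON with a top-level cells list, where each cell has a cell_type and a source that is either a string or a list of lines. The following sketch shows how Markdown cells can be pulled out with the standard library alone. markdown_cells is a hypothetical name, and joining cells with a blank line is an assumption rather than the library's documented behaviour.

```python
import json


def markdown_cells(ipynb):
    """Return the Markdown cells of a notebook JSON string, joined by blank lines."""
    notebook = json.loads(ipynb)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # skip code and raw cells
        source = cell.get("source", "")
        # The notebook format allows a string or a list of lines here.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks)


sample_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "Some text."]},
        {"cell_type": "code", "source": ["print(1)"]},
        {"cell_type": "markdown", "source": "More text."},
    ]
})
result = markdown_cells(sample_notebook)
```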

pylanguagetool.converters.markdown2html(markdown)

Convert Markdown to HTML via markdown2.

Parameters: markdown (str) – Markdown text
Returns: HTML
Return type: str

pylanguagetool.converters.rst2html(rst)

Convert reStructuredText to HTML with docutils.

Parameters: rst (str) – reStructuredText
Returns: HTML
Return type: str

pylanguagetool.converters.transifexjson2txt(jsondata)

Extract translations from a Transifex JSON file.

Parameters: jsondata (str) – Transifex export file
Returns: Plaintext translations
Return type: str

pylanguagetool.converters.xliff2txt(source)

Extract translations from an XLIFF file.

Parameters: source (str) – XLIFF XML string
Returns: Plaintext translations
Return type: str
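
As an illustration of what extracting translations from XLIFF involves, the sketch below collects the text of every target element using the standard library. It assumes XLIFF 1.2 with its standard namespace and one translation per output line. xliff_targets is a hypothetical name, and the library's own function may differ in both respects.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the XLIFF 1.2 namespace.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def xliff_targets(source):
    """Return the text of every <target> element, one per line."""
    root = ET.fromstring(source)
    lines = []
    for target in root.iterfind(".//x:trans-unit/x:target", XLIFF_NS):
        # itertext() also picks up text nested inside inline markup.
        text = "".join(target.itertext()).strip()
        if text:
            lines.append(text)
    return "\n".join(lines)


sample_xliff = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="de" datatype="plaintext" original="demo">
    <body>
      <trans-unit id="1">
        <source>Hello</source>
        <target>Hallo</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""
translations = xliff_targets(sample_xliff)
```

For the sample document, translations is "Hallo", which could then be passed to check with lang set to the target language.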