[Solved] TypeError: extract() got an unexpected keyword argument ‘json_output’

Hello everyone. I was trying to use trafilatura in Python and ran into the following error: TypeError: extract() got an unexpected keyword argument ‘json_output’. In this article I explain a possible way to work around it.

Without wasting your time, let’s start this article and solve this error.

How Does the TypeError: extract() got an unexpected keyword argument ‘json_output’ Error Occur?

I am trying to use trafilatura, but I get the following error.

Traceback (most recent call last):
  File "f:\Python Script\Python\2021\s.py", line 9, in <module>
    date_extraction_params={'extensive_search': True, 'original_date': True})
TypeError: extract() got an unexpected keyword argument 'json_output'

Here is the code that I am trying to run.

import json

import trafilatura

url = 'my_url'
downloaded_url = trafilatura.fetch_url(url)

# This is the call that raises the TypeError: extract() rejects json_output
a = trafilatura.extract(downloaded_url, json_output=True, with_metadata=True, include_comments=False,
                        date_extraction_params={'extensive_search': True, 'original_date': True})
if a:
    json_output = json.loads(a)
    print(json_output['text'])
else:
    print("nothing")
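
An "unexpected keyword argument" TypeError means the function that is actually installed does not define a parameter with that name. As a minimal sketch, the check below uses only the standard library to test whether a callable accepts a given keyword and to look up which version of a package is installed. The helper names accepts_keyword and installed_version, and the stand-in function extract_stub, are hypothetical and exist only for illustration; they are not part of trafilatura.

```python
import inspect
from importlib import metadata


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical stand-in for an extract() that has no json_output parameter.
def extract_stub(filecontent, include_comments=True):
    return filecontent


print(accepts_keyword(extract_stub, "include_comments"))
print(accepts_keyword(extract_stub, "json_output"))
print(installed_version("trafilatura"))
```

With trafilatura installed, calling accepts_keyword(trafilatura.extract, "json_output") in the same way would show whether the installed version takes that argument before you decide how to proceed.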

How to Solve the TypeError: extract() got an unexpected keyword argument ‘json_output’ Error?

Solution 1: Don’t use trafilatura

I suggest not using trafilatura and using BeautifulSoup (BS4) instead. You can do something like this.

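    import numpy as np
    import requests
    from bs4 import BeautifulSoup
    from requests.exceptions import MissingSchema

    def beautifulsoup_extract_text_fallback(html):
        # Assumed helper (not defined in the original): return the page's visible text
        return BeautifulSoup(html, "html.parser").get_text(separator=" ", strip=True)

    # extract_text is a hypothetical wrapper name; the original snippet omitted the function header
    def extract_text(url):
        try:
            resp = requests.get(url)
            # We will only extract the text from successful requests:
            if resp.status_code == 200:
                return beautifulsoup_extract_text_fallback(resp.content)
            # Any other status code is treated as a failed request:
            return np.nan
        # Handling for any URLs that don't have the correct protocol
        except MissingSchema:
            return np.nan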
I hope this is useful for you.

Summary

That is all about this issue. I hope this solution helped you. Share your thoughts and questions in the comments, and let me know whether it worked for you.
