How to Convert a Beautiful Soup Tag to JSON

I have an element of type bs4.element.Tag, obtained by web scraping. I usually do: json.loads(soup.find('script', type='application/ld+json').text), but on this page the data only appears inside a plain <script></script> tag, so I had to call scripts = soup.find_all('script') and go through the results until I found the one that interests me: script = scripts[18].

The variable in question is script. My problem is that I want to access its attributes, for example script['goodsInfo']. Since it is a bs4.element.Tag, I tried script.attrs, which returns {}. Then I tried to convert it to JSON with json.loads(str(script)), which raises the exception: 'JSONDecodeError: Expecting value: line 1 column 1 (char 0)'

This is my code:

import json

import requests
from bs4 import BeautifulSoup

url_aux = ''

response = requests.get(url_aux)
soup = BeautifulSoup(response.content, "html.parser")

# The script of interest was found by position, by trial and error
scripts = soup.find_all('script')
script = scripts[18]

json.loads(str(script))
#output: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

type(script)
#output: bs4.element.Tag
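
The two results above can be reproduced on a small inline page. The following is only a sketch: the HTML, the pageData variable and the product data are hypothetical stand-ins for the real page, which is not shown in the question. It suggests why script.attrs is empty (attrs holds the HTML attributes of the tag, and a bare <script> has none) and why json.loads fails (str(script) begins with "<script>", not with a JSON value). It also selects the script by its content rather than by index, which may be less fragile than scripts[18].

```python
import json

from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page source
html = """
<html><body>
<script>var other = 1;</script>
<script>
var pageData = {
    goodsInfo: {"detail": {"comment": {"comment_num": 17}}},
    other: 1
};
</script>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Pick the script by what it contains instead of by its position
script = next(
    s for s in soup.find_all("script")
    if s.string and "goodsInfo" in s.string
)

# attrs only holds HTML attributes such as type or src
print(script.attrs)

# The string form of the tag is HTML wrapping JavaScript, not JSON
error = None
try:
    json.loads(str(script))
except json.JSONDecodeError as exc:
    error = str(exc)
print("JSONDecodeError:", error)
```

The text inside the tag is available as script.string, but it is JavaScript source, so the JSON part still has to be cut out of it before json.loads can parse it.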


This Post Has One Comment

  1. No Fault

    You can use the json module to extract the data, but first you need to locate the right part of the page source. The re module works for that.

    For example:

    import re
    import json
    import requests

    url = ''

    # Capture the object that follows "goodsInfo:" in the raw page source
    txt = re.findall(r'goodsInfo\s*:\s*({.*})', requests.get(url).text)[0]

    data = json.loads(txt)

    # print(json.dumps(data, indent=4))  # <-- uncomment to see all data

    print('Num of comments:', data['detail']['comment']['comment_num'])

    Mock-neck Brush Stroke Print Bodycon Dress
    Num of comments: 17
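
One caveat with the regex in the comment: ({.*}) is greedy, so it captures everything up to the last closing brace on the line. If anything else follows the goodsInfo object on that line, json.loads will likely fail with an "Extra data" error. A possible alternative, sketched here under assumptions, is to let json.JSONDecoder().raw_decode stop at the end of the first complete object. The page_text sample and the extract_json_after helper are hypothetical and not part of the original answer.

```python
import json
import re

# Hypothetical page source; note the second object on the goodsInfo line
page_text = """
<script>
var pageData = {
    goodsInfo: {"detail": {"comment": {"comment_num": 17}}}, extra: {"a": 1},
    other: 1
};
</script>
"""


def extract_json_after(key, text):
    """Decode the JSON object that directly follows `key:` in text."""
    # Match "key :" and stop right before the opening brace
    match = re.search(re.escape(key) + r"\s*:\s*(?=\{)", text)
    if match is None:
        raise ValueError(f"{key!r} not found")
    # raw_decode parses one JSON value starting at the given index
    # and ignores whatever comes after it
    obj, _end = json.JSONDecoder().raw_decode(text, match.end())
    return obj


data = extract_json_after("goodsInfo", page_text)
print("Num of comments:", data["detail"]["comment"]["comment_num"])
```

This only works when the embedded object is valid JSON (double-quoted keys and strings). If the page uses a looser JavaScript object literal, a different parser would be needed.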
