I have a huge string that contains many sets, each separated by a comma (,). Each set has key-value pairs in it, and each pair is separated by &.
Here is a small example:
tag=43&id=8787&type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium,type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium&tag=172&id=8978,tag=41&type=video/webm;+codecs="vp8.0,+vorbis"&id=1738&quality=medium
This string has the following sets (3 sets, each separated by ,):
tag=43&id=8787&type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium
type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium&tag=172&id=8978
tag=41&type=video/webm;+codecs="vp8.0,+vorbis"&id=1738&quality=medium
I want to write a regular expression to split the original string into sets of key-value pairs. I tried this:
sets = huge_string.split(',')
But it does not work, because there is a comma inside one of the key-value pairs as well:
type=video/webm;+codecs="vp8.0,+vorbis" # <--- causing problem!
Here the , inside the codecs value is causing the problem.
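To make the failure concrete, here is a minimal sketch that runs the naive split on the example string from above. The variable names follow the snippet in the question; nothing else is assumed.

```python
# The example string from the question, broken across lines for readability.
huge_string = (
    'tag=43&id=8787&type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium,'
    'type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium&tag=172&id=8978,'
    'tag=41&type=video/webm;+codecs="vp8.0,+vorbis"&id=1738&quality=medium'
)

# Naive split: every comma is treated as a set separator,
# including the one inside each codecs value.
sets = huge_string.split(',')

print(len(sets))  # prints 6, not the expected 3
```

Each of the 3 sets is cut in two at the comma inside its codecs value, so the split yields 6 pieces.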
How do I write a regular expression to accomplish this task? I'm using Python 3.3.1.
Note that I don't know how many pairs there are, or in which order they appear.
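This is how to parse the response from the YouTube API:
import urllib.parse

# content is a str that stores the content of the link
query = urllib.parse.parse_qs(content)
# Split the raw value on commas before decoding the individual sets
fullurls = query['url_encoded_fmt_stream_map'][0].split(',')
# Decode each set into a dict of key -> list of values
data = [urllib.parse.parse_qs(i) for i in fullurls]
print(data)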
The output is a list of dicts that stores the information for each of the links. Of course, the code above is only a demonstration of the concept. The assumptions should be cut down, and checks should be added in actual code.
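As one possible shape for those checks, here is a minimal sketch. The function name parse_stream_map and its key parameter are hypothetical, and returning an empty list when the key is missing is an assumption about what the caller wants.

```python
import urllib.parse


def parse_stream_map(content, key="url_encoded_fmt_stream_map"):
    """Return one dict per stream, or an empty list if the key is absent.

    parse_stream_map is a hypothetical helper, not part of any library.
    """
    query = urllib.parse.parse_qs(content)
    values = query.get(key)
    if not values:
        # Key missing or empty: nothing to decode.
        return []
    # Split on the separating commas first, then decode each token.
    # Empty tokens (for example from a trailing comma) are skipped.
    return [urllib.parse.parse_qs(token)
            for token in values[0].split(",") if token]
```

A caller can then test for an empty result instead of catching a KeyError or IndexError.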
The YouTube API returns a response of MIME type application/x-www-form-urlencoded, so you need to use urllib.parse.parse_qs to decode it.
The url_encoded_fmt_stream_map key contains, as its value, a comma-separated list of URL-encoded strings, so you need to split it along the commas (,) and then decode each of the tokens with urllib.parse.parse_qs. There is no need to worry about the commas in the codecs description: since they are URL encoded at that point, they do not interfere with the splitting.
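The two levels of encoding can be checked with a small round trip. This is only a sketch: the streams list is made-up sample data modeled on the question, and building content with urllib.parse.urlencode is an assumption about how such a response is put together, not a description of what YouTube does.

```python
import urllib.parse

# Hypothetical sample data, modeled on the sets in the question.
streams = [
    {"tag": "43", "id": "8787",
     "type": 'video/webm; codecs="vp8.0, vorbis"', "quality": "medium"},
    {"tag": "172", "id": "8978",
     "type": 'video/webm; codecs="vp8.0, vorbis"', "quality": "medium"},
]

# Inner level: each set is URL encoded on its own,
# so the comma inside the codecs value becomes %2C.
stream_map = ",".join(urllib.parse.urlencode(s) for s in streams)

# Outer level: the whole list is URL encoded again as one value.
content = urllib.parse.urlencode({"url_encoded_fmt_stream_map": stream_map})

# Decoding reverses the steps: outer decode, split on commas, inner decode.
query = urllib.parse.parse_qs(content)
tokens = query["url_encoded_fmt_stream_map"][0].split(",")
data = [urllib.parse.parse_qs(t) for t in tokens]

print(len(tokens))
print(data[0]["type"])
```

After the outer decode, the only literal commas left are the ones between sets, which is why a plain split(',') is safe at that stage and no regular expression is needed.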