i've huge string contains many sets, each separated ,. each set has key-value pairs in it, each pair separated &.
here small example,
tag=43&id=8787&type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium,type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium&tag=172&id=8978,tag=41&type=video/webm;+codecs="vp8.0,+vorbis"&id=1738&quality=medium this string has following sets (3 sets, each separated ,):
tag=43&id=8787&type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium type=video/webm;+codecs="vp8.0,+vorbis"&quality=medium&tag=172&id=8978 tag=41&type=video/webm;+codecs="vp8.0,+vorbis"&id=1738&quality=medium i want write regular expression split original strings sets of key-value pairs. tried this,
sets = huge_string.split(',') but not work, there comma inside one key-value pair also:
type=video/webm;+codecs="vp8.0,+vorbis" # <--- causing problem! here , causing problem.
how write regular expression accomplish task? i'm using python 3.3.1.
now don't know how many pairs there, , in order.
this how parse response youtube api:
# content str stores content of link query = urllib.parse.parse_qs(content) fullurls = query['url_encoded_fmt_stream_map'][0].split(',') data = [urllib.parse.parse_qs(i) in fullurls] print(data) this output list of dict stores information of each of links. of course, code above demonstration of concept. assumptions should cut down , checks should added in actual code.
the youtube api returns response of mime type application/x-www-form-urlencoded, need use urllib.parse.parse_qs decode it.
the url_encoded_fmt_stream_map key contains value comma-separated list of url encoded strings, need split along commas , , decode each of tokens urllib.parse.parse_qs. there no worry commas in codecs description, since url encoded, not interfere splitting.
Comments
Post a Comment