Regex to help split up list into two-tuples
Given a list of actors, each followed by their character name in brackets, separated by either a semicolon (;) or a comma (,):

    Shelley Winters [Ruby]; Millicent Martin [Siddie]; Julia Foster [Gilda]; Jane Asher [Annie]; Shirley Ann Field [Carla]; Vivien Merchant [Lily]; Eleanor Bron [Woman Doctor], Denholm Elliott [Mr. Smith; abortionist]; Alfie Bass [Harry]

How would I parse this into a list of two-tuples of the form [(actor, character), ...]?

    --> [('Shelley Winters', 'Ruby'), ('Millicent Martin', 'Siddie'), ('Denholm Elliott', 'Mr. Smith; abortionist')]

I originally had:

    actors = [item.strip().rstrip(']') for item in re.split('\[|,|;',data['actors'])]
    data['actors'] = [(actors[i], actors[i + 1]) for i in range(0, len(actors), 2)]

But this doesn't quite work, as it also splits on separators that appear inside the brackets.
Best answer
You can go with something like:

    >>> re.findall(r'(\w[\w\s\.]+?)\s*\[([\w\s;\.,]+)\][,;\s$]*', s)
    [('Shelley Winters', 'Ruby'), ('Millicent Martin', 'Siddie'), ('Julia Foster', 'Gilda'), ('Jane Asher', 'Annie'), ('Shirley Ann Field', 'Carla'), ('Vivien Merchant', 'Lily'), ('Eleanor Bron', 'Woman Doctor'), ('Denholm Elliott', 'Mr. Smith; abortionist'), ('Alfie Bass', 'Harry')]

One can also simplify some things with .*?:

    re.findall(r'(\w.*?)\s*\[(.*?)\][,;\s$]*', s)
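For anyone who wants to run this end to end, here is a minimal self-contained sketch of the simplified pattern. The names `PAIR_RE` and `parse_actors` are made up for illustration and are not part of the answer above. The sketch also drops the trailing `[,;\s$]*`: inside a character class `$` is a literal dollar sign rather than an end-of-string anchor, and `findall` skips over the separators anyway because the next match has to start on a word character. It assumes role names never contain a `]`.

```python
import re

s = ("Shelley Winters [Ruby]; Millicent Martin [Siddie]; Julia Foster [Gilda]; "
     "Jane Asher [Annie]; Shirley Ann Field [Carla]; Vivien Merchant [Lily]; "
     "Eleanor Bron [Woman Doctor], Denholm Elliott [Mr. Smith; abortionist]; "
     "Alfie Bass [Harry]")

# Group 1: actor name, starting at a word character and extending lazily
#          up to the optional whitespace before the opening bracket.
# Group 2: role, everything up to the first closing bracket
#          (assumes roles never contain ']').
PAIR_RE = re.compile(r'(\w.*?)\s*\[(.*?)\]')


def parse_actors(text):
    """Return a list of (actor, character) tuples found in text."""
    return PAIR_RE.findall(text)


pairs = parse_actors(s)
for actor, character in pairs:
    print(f"{actor!r} plays {character!r}")
```

Because the role is captured as a single bracketed group, the semicolon in "Mr. Smith; abortionist" stays inside the role instead of being treated as a separator, which is where the `re.split` approach went wrong.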