2 years ago
Replied to the topic created by David Culbreth » In Python, how can I split dictionaries apart by a key while parsing data?

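It is easier to break this down into the following steps:

  1. Group the messages into conversations.
  2. Sort each conversation by its dialog counter.
  3. Take the last message from each conversation.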
from typing import Dict, List
import pprint

data: List[Dict[str, str]] = [
    {'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '1', 'agent_text': 'Hey welcome to Hatch realty, before we start our conversation, can I start by asking your name?', 'customer_input': '', 'customer_name': 'Brandon', 'customer_intention': 'sell', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '2', 'agent_text': 'Great, thanks David. Is there something I can help you with, or what brought you to our site?', 'customer_input': 'My name is David', 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '3', 'agent_text': 'Of course, when would you like to move?', 'customer_input': 'I want to buy a house', 'customer_name': 'Brandon', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '4', 'agent_text': 'Do you have a specific property address?', 'customer_input': 'This summer', 'customer_name': 'Brandon', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '5', 'agent_text': 'Can I get your email address and phone number so I can have someone reach out to you?', 'customer_input': "No I don't.", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '1', 'agent_text': 'Hey welcome to Hatch realty, before we start our conversation, can I start by asking your name?', 'customer_input': '', 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '2', 'agent_text': 'Great, thanks Brandon. Is there something I can help you with, or what brought you to our site?', 'customer_input': 'My name is Brandon', 'customer_name': 'Brandon', 'customer_intention': '', 'customer_property': ''}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '3', 'agent_text': 'Of course, Do you need to be out by a certain date or is your timeframe open?', 'customer_input': 'I want to sell a house', 'customer_name': 'Brandon', 'customer_intention': 'sell', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '4', 'agent_text': 'Do you have a specific property address?', 'customer_input': "It's open", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '5', 'agent_text': 'Can I get your email address and phone number so I can have someone reach out to you?', 'customer_input': "No I don't", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
]

conversations:Dict[str, List[Dict[str, str]]] = dict() # typing indicator just for IDE nice-to-haves
# group conversations by conversation id
for message in data:
    cid = message.get('conversation_id')
    if not cid:
        continue
    if cid not in conversations:
        conversations[cid] = list()
    conversations[cid].append(message)

# sort each conversation by message dialog counter
# (compared as a number, so that '10' sorts after '9')
conversations = {
    cid: sorted(conversation, key=lambda message: int(message['dialog_counter']))
    for cid, conversation in conversations.items()
}

# get the last message in each conversation
last_messages = [conversation[-1] for conversation in conversations.values()]

pprint.pprint(last_messages)

I forgot to include the output:

[{'agent_text': 'Can I get your email address and phone number so I can have '
                'someone reach out to you?',
  'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6',
  'customer_input': "No I don't.",
  'customer_intention': 'buy',
  'customer_name': 'David',
  'customer_property': 'house',
  'dialog_counter': '5'},
 {'agent_text': 'Can I get your email address and phone number so I can have '
                'someone reach out to you?',
  'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee',
  'customer_input': "No I don't",
  'customer_intention': 'buy',
  'customer_name': 'David',
  'customer_property': 'house',
  'dialog_counter': '5'}]
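
If only the final message of each conversation is needed, the grouping and sorting steps above can be folded into a single pass. What follows is a minimal sketch rather than part of the original answer: the helper name `last_message_per_conversation` and the three-record sample are hypothetical, and it assumes `dialog_counter` always holds a decimal integer string.

```python
from typing import Dict, Iterable, List


def last_message_per_conversation(
    messages: Iterable[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Return the message with the highest dialog_counter for each conversation_id."""
    latest: Dict[str, Dict[str, str]] = {}
    for message in messages:
        cid = message.get('conversation_id')
        if not cid:
            continue  # skip records without a conversation id
        current = latest.get(cid)
        # compare counters numerically so that '10' ranks above '9'
        if current is None or int(message['dialog_counter']) > int(current['dialog_counter']):
            latest[cid] = message
    return list(latest.values())


# hypothetical sample data, much smaller than the list in the answer
sample = [
    {'conversation_id': 'a', 'dialog_counter': '9', 'customer_input': 'first'},
    {'conversation_id': 'a', 'dialog_counter': '10', 'customer_input': 'second'},
    {'conversation_id': 'b', 'dialog_counter': '1', 'customer_input': 'only'},
]
result = last_message_per_conversation(sample)
print(result)
```

This keeps one message per conversation in memory instead of every message, at the cost of no longer having the full sorted conversations available afterwards.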
5 years ago
Replied to the topic created by David Culbreth » Splitting lines in a .txt file in Python

File objects have a built-in method, readlines(). There is a tutorialspoint article about it, and it is also covered in the Python docs.

You can use it like this:

with open('path/to/my/file') as myFile:
    for line in myFile.readlines():
        print(line)

Straight from the docs themselves, at least for Python 3.5 through 3.7:

If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().
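
To make the quoted sentence concrete, here is a small sketch comparing the two spellings. The temporary file and its contents are made up for illustration and do not come from the original question.

```python
import os
import tempfile

# write a small throwaway file so the example is self-contained
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('alpha beta\ngamma delta\n')
    path = tmp.name

# readlines() returns a list of lines, each keeping its trailing newline
with open(path) as f:
    via_readlines = f.readlines()

# list(f) consumes the file's line iterator and builds the same list
with open(path) as f:
    via_list = list(f)

# iterating the file object directly lets you strip the newline per line
with open(path) as f:
    stripped = [line.rstrip('\n') for line in f]

os.remove(path)  # clean up the throwaway file

print(via_readlines == via_list)
print(stripped)
```

For large files, iterating over the file object directly, as in the last loop, avoids holding every line in memory at once.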