In Python, how do I separate dictionaries by a key while also parsing the data?

Brandon Jacobson • 3 years ago • 272 views

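I have the following information stored in a dictionary called conversation_counter. Because of instructions from my project manager I can't share the whole code, but this is what I get when I iterate over it.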

{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '1', 'agent_text': 'Hey welcome to Hatch realty, before we start our conversation, can I start by asking your name?', 'customer_input': '', 'customer_name': 'Brandon', 'customer_intention': 'sell', 'customer_property': 'house'}
{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '2', 'agent_text': 'Great, thanks David. Is there something I can help you with, or what brought you to our site?', 'customer_input': 'My name is David', 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '3', 'agent_text': 'Of course, when would you like to move?', 'customer_input': 'I want to buy a house', 'customer_name': 'Brandon', 'customer_intention': 'buy', 'customer_property': 'house'}
{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '4', 'agent_text': 'Do you have a specific property address?', 'customer_input': 'This summer', 'customer_name': 'Brandon', 'customer_intention': 'buy', 'customer_property': 'house'}
{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '5', 'agent_text': 'Can I get your email address and phone number so I can have someone reach out to you?', 'customer_input': "No I don't.", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '1', 'agent_text': 'Hey welcome to Hatch realty, before we start our conversation, can I start by asking your name?', 'customer_input': '', 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '2', 'agent_text': 'Great, thanks Brandon. Is there something I can help you with, or what brought you to our site?', 'customer_input': 'My name is Brandon', 'customer_name': 'Brandon', 'customer_intention': '', 'customer_property': ''}
{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '3', 'agent_text': 'Of course, Do you need to be out by a certain date or is your timeframe open?', 'customer_input': 'I want to sell a house', 'customer_name': 'Brandon', 'customer_intention': 'sell', 'customer_property': 'house'}
{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '4', 'agent_text': 'Do you have a specific property address?', 'customer_input': "It's open", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '5', 'agent_text': 'Can I get your email address and phone number so I can have someone reach out to you?', 'customer_input': "No I don't", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}

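As you can see, there are only two distinct conversation IDs (the first value). A conversation does not always have 5 entries (see dialog_counter), so how can I either 1) find the highest dialog counter for each conversation ID, or 2) create a separate dictionary for each conversation ID?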

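This is what I have so far, but it only ends up with one meaningful dictionary: I'm having trouble updating the dictionary, so it overwrites the first one.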

data = {
    "conversation_id": "",
    "customer_name": "",
    "customer_intention": "",
    "customer_property": ""
}
for convo in conversation_counter:
    # a single shared dict is updated on every pass, so the values of a
    # later conversation overwrite those of an earlier one
    data["conversation_id"] = convo['conversation_id']
    # only overwrite a field when the current entry has a non-empty value
    if convo['customer_name'] != "":
        data["customer_name"] = convo['customer_name']
    if convo['customer_intention'] != "":
        data["customer_intention"] = convo['customer_intention']
    if convo['customer_property'] != "":
        data["customer_property"] = convo['customer_property']

print(data)

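The output is exactly what I want, but it only gives me one entry. I think the simplest approach would be to find the highest dialog counter, but I don't know how to do that inside a for loop with separate entries.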

{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'customer_name': 'Brandon', 'customer_intention': 'sell', 'customer_property': 'house'}
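
A minimal sketch of the second option (one dictionary per conversation ID), assuming the entries are dicts shaped like the ones above. The helper name summarize, the FIELDS tuple, and the short sample list are illustrative only and are not part of the original code. Each summary is stored under its conversation_id, so a new conversation no longer overwrites the previous one.

```python
from typing import Dict, Iterable

# Fields carried over into each per-conversation summary (illustrative choice).
FIELDS = ("customer_name", "customer_intention", "customer_property")


def summarize(messages: Iterable[Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Build one summary dict per conversation_id (hypothetical helper)."""
    summaries: Dict[str, Dict[str, str]] = {}
    # Process entries in dialog order so later non-empty values win.
    ordered = sorted(messages, key=lambda m: int(m["dialog_counter"]))
    for message in ordered:
        cid = message["conversation_id"]
        # Create the summary for this conversation the first time it is seen.
        summary = summaries.setdefault(
            cid, {"conversation_id": cid, **{field: "" for field in FIELDS}}
        )
        for field in FIELDS:
            if message.get(field):  # skip missing or empty values
                summary[field] = message[field]
    return summaries


# Shortened, made-up sample data in the same shape as the entries above.
sample = [
    {"conversation_id": "a", "dialog_counter": "1", "customer_name": "",
     "customer_intention": "", "customer_property": ""},
    {"conversation_id": "a", "dialog_counter": "2", "customer_name": "Brandon",
     "customer_intention": "", "customer_property": ""},
    {"conversation_id": "b", "dialog_counter": "1", "customer_name": "David",
     "customer_intention": "buy", "customer_property": "house"},
    {"conversation_id": "a", "dialog_counter": "3", "customer_name": "",
     "customer_intention": "sell", "customer_property": "house"},
]

result = summarize(sample)
for summary in result.values():
    print(summary)
```

Keying the outer dictionary by conversation_id is the design choice that matters here; list(result.values()) gives one entry per conversation if a plain list is preferred.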
Article URL: http://www.python88.com/topic/129646
 
David Culbreth • 3 years ago

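It is easier to break this down into the following steps:

  1. Aggregate the messages into conversations
  2. Sort each conversation
  3. Get the last message from each conversation.
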
from typing import Dict, List
import pprint

data: List[Dict[str, str]] = [
    {'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '1', 'agent_text': 'Hey welcome to Hatch realty, before we start our conversation, can I start by asking your name?', 'customer_input': '', 'customer_name': 'Brandon', 'customer_intention': 'sell', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '2', 'agent_text': 'Great, thanks David. Is there something I can help you with, or what brought you to our site?', 'customer_input': 'My name is David', 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '3', 'agent_text': 'Of course, when would you like to move?', 'customer_input': 'I want to buy a house', 'customer_name': 'Brandon', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '4', 'agent_text': 'Do you have a specific property address?', 'customer_input': 'This summer', 'customer_name': 'Brandon', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6', 'dialog_counter': '5', 'agent_text': 'Can I get your email address and phone number so I can have someone reach out to you?', 'customer_input': "No I don't.", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '1', 'agent_text': 'Hey welcome to Hatch realty, before we start our conversation, can I start by asking your name?', 'customer_input': '', 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '2', 'agent_text': 'Great, thanks Brandon. Is there something I can help you with, or what brought you to our site?', 'customer_input': 'My name is Brandon', 'customer_name': 'Brandon', 'customer_intention': '', 'customer_property': ''}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '3', 'agent_text': 'Of course, Do you need to be out by a certain date or is your timeframe open?', 'customer_input': 'I want to sell a house', 'customer_name': 'Brandon', 'customer_intention': 'sell', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '4', 'agent_text': 'Do you have a specific property address?', 'customer_input': "It's open", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
   ,{'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee', 'dialog_counter': '5', 'agent_text': 'Can I get your email address and phone number so I can have someone reach out to you?', 'customer_input': "No I don't", 'customer_name': 'David', 'customer_intention': 'buy', 'customer_property': 'house'}
]

conversations:Dict[str, List[Dict[str, str]]] = dict() # typing indicator just for IDE nice-to-haves
# group conversations by conversation id
for message in data:
    cid = message.get('conversation_id')
    if not cid:
        continue
    if cid not in conversations:
        conversations[cid] = list()
    conversations[cid].append(message)

# sort each conversation by message dialog counter
# (convert to int so that '10' sorts after '9' instead of before '2')
conversations = {
    cid: sorted(conversation, key=lambda message: int(message.get('dialog_counter', 0)))
    for cid, conversation in conversations.items()
}

# get the last message in each conversation
last_messages = [conversation[-1] for conversation in conversations.values()]

pprint.pprint(last_messages)

I forgot the output:

[{'agent_text': 'Can I get your email address and phone number so I can have '
                'someone reach out to you?',
  'conversation_id': '4850dd66-05b9-43e9-b546-e4976c9c29b6',
  'customer_input': "No I don't.",
  'customer_intention': 'buy',
  'customer_name': 'David',
  'customer_property': 'house',
  'dialog_counter': '5'},
 {'agent_text': 'Can I get your email address and phone number so I can have '
                'someone reach out to you?',
  'conversation_id': 'dbec6faa-16cb-416a-8653-ffc36174ecee',
  'customer_input': "No I don't",
  'customer_intention': 'buy',
  'customer_name': 'David',
  'customer_property': 'house',
  'dialog_counter': '5'}]
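
If only the highest dialog counter per conversation is needed, grouping and sorting can be skipped. The following is a sketch of a single-pass alternative, not the approach used in the answer above. The function name last_message_per_conversation and the demo list are hypothetical. The counters are stored as strings, so the sketch converts them with int() before comparing; as strings, '10' would compare lower than '9'.

```python
from typing import Dict, Iterable


def last_message_per_conversation(
    messages: Iterable[Dict[str, str]],
) -> Dict[str, Dict[str, str]]:
    """Keep, for each conversation_id, the entry with the highest dialog_counter."""
    latest: Dict[str, Dict[str, str]] = {}
    for message in messages:
        cid = message.get("conversation_id")
        if not cid:
            continue  # ignore entries without a conversation id
        current = latest.get(cid)
        # Compare counters numerically, not as strings.
        if current is None or int(message["dialog_counter"]) > int(current["dialog_counter"]):
            latest[cid] = message
    return latest


# Made-up demo data; only the two keys used by the function are included.
demo = [
    {"conversation_id": "a", "dialog_counter": "2"},
    {"conversation_id": "a", "dialog_counter": "10"},
    {"conversation_id": "a", "dialog_counter": "9"},
    {"conversation_id": "b", "dialog_counter": "1"},
]

latest = last_message_per_conversation(demo)
highest = {cid: int(message["dialog_counter"]) for cid, message in latest.items()}
print(highest)  # {'a': 10, 'b': 1}
```

This does not require the entries to arrive in order, and it keeps only one entry per conversation in memory.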