Py学习  »  Python

如何在python中将多个dictionary对象合并到一个列表中?

Jacob Berger • 3 年前 • 1606 次点击  

我已经试了好几个小时来实现这个目标。我试图从一个文件中读取数据,将所有数据组织到一个字典中,然后将所有字典合并到一个列表中。然而,我已经尽可能地将所有数据放入字典,但每当我尝试在每次迭代中将它们附加到=列表时,它要么完全中断,要么只添加文件的最后一行`def

readParkFile(fileName="national_parks.csv"):
    f = open(fileName, "r")
    parkList = []
    parkDict = {}
    header = f.readline().split(",")
    keys = list(header)

    for line in f:
        tmp = list(line.split(","))
        quote = tmp[7:]
        fullQuote = ','.join(quote)
        del tmp[7:]
        tmp.append(fullQuote)
        for value in tmp:
            parkDict[keys[tmp.index(value)]] = value
        parkList.append(parkDict)
        print(parkDict)

readParkFile()`

代码、名称、州、英亩、纬度、经度、日期、描述 阿卡德,阿卡迪亚国家公园,ME,47390,44.35,-68.211919-02-26,“阿卡迪亚覆盖了大部分荒山岛和其他沿海岛屿,以美国大西洋海岸最高的山峰、花岗岩山峰、海岸线、林地和湖泊为特色。这里有淡水、河口、森林和潮间带栖息地。” 美国犹他州拱门国家公园拱门,76519,38.68,-109.571971-11-12,“该遗址拥有2000多个天然砂岩拱门,公园中最受欢迎的拱门包括精致拱门、景观拱门和双拱门。数百万年的侵蚀在沙漠气候中形成了这些结构,干旱的土地上有维持生命的生物土壤结皮和坑洞,这些坑洞是天然的集水盆地。其他地质构造包括:去石尖塔、鳍和平衡石。" 巴德尔,巴德兰兹国家公园,SD,242756,43.75,-102.51978-11-10,“巴德兰兹是一个由山头、尖顶、尖顶和混合草地草原组成的集合。白河巴德兰兹包含已知的始新世晚期和渐新世哺乳动物化石的最大集合。野生动物包括野牛、大角羊、黑脚雪貂和土拨鼠。” 德克萨斯州大弯国家公园比伯,801163,29.25,-103.251944-06-12,“这个公园以美墨边境格兰德河的突出弯曲而命名,它包括奇瓦瓦沙漠的一大片偏远地区。它的主要景点是干旱的奇索斯山脉和沿河峡谷中的乡村娱乐。边界内还存在各种各样的白垩纪和第三纪化石以及美洲原住民的文物。” 科罗拉多州甘尼森国家公园黑峡谷BLCA,32950,38.57,-107.721999-10-21,“该公园保护着甘尼森河的四分之一,甘尼森河从黑暗的前寒武纪岩石上切割出陡峭的峡谷壁。峡谷以北美最陡峭的悬崖和最古老的岩石为特色,是河流漂流和攀岩的热门地点。深而窄的峡谷由片麻岩和片岩组成,在阴影中呈黑色。” 布尔卡,布莱斯峡谷国家公园,犹他州,35835,37.57,-112.181928-02-25,“布莱斯峡谷是波恩萨格特高原上的一个地质竞技场,有数百个因侵蚀而形成的高大多彩的砂岩石林。该地区最初由美洲原住民定居,后来由摩门教先驱定居。” CANY,Canyonlands National Park,UT,337598,38.2,-109.931964-09-12,“在科罗拉多河、格林河及其支流的共同努力下,这片景观被侵蚀成了峡谷、山丘和台地的迷宫,将公园分为三个区。公园还包括岩石尖顶和拱门,以及古代普韦布洛人的文物。” CARE,美国犹他州国会山礁国家公园,241904,38.2,-111.171971-12-18,“公园的水袋褶皱是一条100英里(160公里)的单斜,展示了地球的各种地质层。其他自然特征包括巨石、悬崖和形状类似美国国会山的砂岩穹顶。” 洞穴,卡尔斯巴德洞穴国家公园,新墨西哥州,46766,32.17,-104.441930-05-14,“卡尔斯巴德洞穴有117个洞穴,其中最长的超过120英里(190公里)长。大房间将近4000英尺(1200米)长,洞穴里有超过40万只墨西哥无尾蝙蝠和其他16个物种。地面上是奇瓦瓦沙漠和响尾蛇泉。” 加利福尼亚州海峡群岛国家公园奇斯,249561,34.01,-119.421980-03-05,“八个海峡岛屿中有五个受到保护,公园一半的面积在水下。这些岛屿有一个独特的地中海生态系统,最初由Chumash人定居。它们是2000多种陆地动植物的家园,包括岛狐在内的145种特有动植物。渡轮服务提供从大陆到这些岛屿的交通。” CONG,Congaree国家公园,SC,26546,33.78,-80.782003-11-10,“在Congaree河上,这个公园是北美仅存的老生长漫滩森林的最大部分。一些树木是美国东部最高的。一条名为木板路环路的高架人行道引导游客穿过沼泽。” CRLA,火山口湖国家公园,或183224,42.94,-122.11902-05-22,“火山口湖位于7700年前崩塌的一座名为马扎马山的古老火山的破火山口内。该湖是美国最深的,以其鲜艳的蓝色和清澈的海水而闻名。巫师岛和幽灵船是火山口内较新的火山构造。由于该湖没有入口或出口,只能靠降水补充。” 库瓦,库亚霍加河谷国家公园,俄亥俄州,32950,41.24,-81.552000-10-11,“库亚霍加河沿岸的这个公园有瀑布、山丘、小径,以及关于早期农村生活的展览。俄亥俄州和伊利运河拖径沿着俄亥俄州和伊利运河,骡子在那里拖曳运河船只。该公园有许多历史悠久的住宅、桥梁和建筑,还提供了一次风景优美的火车旅行。”

预期产出:

[{“code”:, “name”: …},{“code”:, “name”: …}, {“code”:, “name”: …}]
Python社区是高质量的Python/Django开发社区
本文地址:http://www.python88.com/topic/133554
 
1606 次点击  
文章 [ 3 ]  |  最新文章 3 年前
constantstranger
Reply   •   1 楼
constantstranger    3 年前

( 笔记 :你可能想看看用熊猫来做这个。请参阅此答案的下半部分以获取示例。)

直接回答你的问题: 如果您想要实际的类型化值(int表示英亩,float表示纬度/经度),下面是一种使用正则表达式的方法:

import re
def readParkFile(fileName="national_parks.csv"):
    parkList = []
    f = open(fileName, "r")
    keys = f.readline().strip("\n").split(",")
    for line in f:
        v = re.search('(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*)', line).groups()
        Code, Name, State, Acres, Latitude, Longitude, Date, Description = v[0], v[1], v[2], int(v[3]), float(v[4]), float(v[5]), v[6], v[7][1:-1]
        parkDict = dict(zip(keys, [Code, Name, State, Acres, Latitude, Longitude, Date, Description]))
        parkList.append(parkDict)
    return parkList

parkList = readParkFile()

在循环内部, re.search() 在所有逗号分隔的组中使用非贪婪限定符,但最后一个组除外( Description ),然后将数字字段转换为数字,并从中删除周围的引号 描述 。然后将这些类型化的值与 keys 使用 zip() 变成了一个 dict 然后再将其附加到结果中, parkList .循环csv文件中的所有行后,函数返回 公园名单 .

结果dict列表的第一个dictionary元素如下所示:

{'Code': 'ACAD', 'Name': 'Acadia National Park', 'State': 'ME', 'Acres': 47390, 'Latitude': 44.35, 'Longitude': -68.21, 'Date': '1919-02-26', 'Description': 'Covering most of Mount Desert Island and other coastal islands, Acadia features the tallest mountain on the Atlantic coast of the United States, granite peaks, ocean shoreline, woodlands, and lakes. There are freshwater, estuary, forest, and intertidal habitats.'}

另一种使用熊猫的方法是: 在熊猫中,你可以这样做:

import pandas as pd
df = pd.read_csv("national_parks.csv")
print(df)
print(df.dtypes)
parkList = df.to_dict('records')
print(parkList[0])

它将给出以下输出:

    Code                                        Name State   Acres  Latitude  Longitude        Date                                        Description
0   ACAD                        Acadia National Park    ME   47390     44.35     -68.21  1919-02-26  Covering most of Mount Desert Island and other...
1   ARCH                        Arches National Park    UT   76519     38.68    -109.57  1971-11-12  This site features more than 2,000 natural san...
2   BADL                      Badlands National Park    SD  242756     43.75    -102.50  1978-11-10  The Badlands are a collection of buttes, pinna...
3   BIBE                      Big Bend National Park    TX  801163     29.25    -103.25  1944-06-12  Named for the prominent bend in the Rio Grande...
4   BISC                      Biscayne National Park    FL  172924     25.65     -80.08  1980-06-28  The central part of Biscayne Bay, this mostly ...
5   BLCA  Black Canyon of the Gunnison National Park    CO   32950     38.57    -107.72  1999-10-21  The park protects a quarter of the Gunnison Ri...
6   BRCA                  Bryce Canyon National Park    UT   35835     37.57    -112.18  1928-02-25  Bryce Canyon is a geological amphitheater on t...
7   CANY                   Canyonlands National Park    UT  337598     38.20    -109.93  1964-09-12  This landscape was eroded into a maze of canyo...
8   CARE                  Capitol Reef National Park    UT  241904     38.20    -111.17  1971-12-18  The park's Waterpocket Fold is a 100-mile (160...
9   CAVE              Carlsbad Caverns National Park    NM   46766     32.17    -104.44  1930-05-14  Carlsbad Caverns has 117 caves, the longest of...
10  CHIS               Channel Islands National Park    CA  249561     34.01    -119.42  1980-03-05  Five of the eight Channel Islands are protecte...
11  CONG                      Congaree National Park    SC   26546     33.78     -80.78  2003-11-10  On the Congaree River, this park is the larges...
12  CRLA                   Crater Lake National Park    OR  183224     42.94    -122.10  1902-05-22  Crater Lake lies in the caldera of an ancient ...
13  CUVA               Cuyahoga Valley National Park    OH   32950     41.24     -81.55  2000-10-11  This park along the Cuyahoga River has waterfa...
Code            object
Name            object
State           object
Acres            int64
Latitude       float64
Longitude      float64
Date            object
Description     object
dtype: object
{'Code': 'ACAD', 'Name': 'Acadia National Park', 'State': 'ME', 'Acres': 47390, 'Latitude': 44.35, 'Longitude': -68.21, 'Date': '1919-02-26', 'Description': 'Covering most of Mount Desert Island and other coastal islands, Acadia features the tallest mountain on the Atlantic coast of the United States, granite peaks, ocean shoreline, woodlands, and lakes. There are freshwater, estuary, forest, and intertidal habitats.'}

正如你所见,只需一个电话 read_csv() ,pandas解析csv文件,计算出每列的数据类型,并将所有这些组合成一个DataFrame对象。然后,您可以通过调用 to_dict() 带有参数的DataFrame对象的 'records' .

Mohamed Yasser
Reply   •   2 楼
Mohamed Yasser    3 年前

你提供的信息并没有真正的帮助。在提供输入时,尽量使复制粘贴变得容易。我们不能复制并粘贴到文件中。csv文件。

不管怎样,我做了一个小小的决定。csv,我认为这段代码提供了你想要的:

finalList = []
with open('smth.csv','r') as file:          
    keys = file.readline().split(',') 
    line = file.readline().split(',')
    while line: # Exits when readline returns an empty line 
        finalList.append({keys[0]: line[0],
                            keys[1]: line[1],
                            keys[2]: line[2]  
                            })
        line = file.readline()

你也可以通过使用 file.readlines() 一下子就回来了。

Shay Nehmad
Reply   •   3 楼
Shay Nehmad    3 年前

我复制了您的输入,在CSV中的每一行之前添加了换行符(参见 here ),及 将字典的初始化移动到循环中 .运行以下代码:

from pprint import pprint

def readParkFile(fileName="temp.csv"):
    f = open(fileName, "r")
    parkList = []
    header = f.readline().split(",")
    keys = list(header)

    for line in f:
        parkDict = {}  # This is the issue! Moving the initialization of this dictionary into the loop.
        tmp = list(line.split(","))
        quote = tmp[7:]
        fullQuote = ','.join(quote)
        del tmp[7:]
        tmp.append(fullQuote)
        for value in tmp:
            parkDict[keys[tmp.index(value)]] = value
        # pprint(parkDict)
        parkList.append(parkDict)
    # pprint(parkList)

if __name__ == '__main__':
    readParkFile()

您将得到以下输出:

[{'Acres': '47390',
  'Code': 'ACAD',
  'Date': '1919-02-26',
  'Description \n': '"Covering most of Mount Desert Island and other coastal '
                    'islands, Acadia features the tallest mountain on the '
                    'Atlantic coast of the United States, granite peaks, ocean '
                    'shoreline, woodlands, and lakes. There are freshwater, '
                    'estuary, forest, and intertidal habitats." \n',
  'Latitude': '44.35',
  'Longitude': '-68.21',
  'Name': 'Acadia National Park',
  'State': 'ME'},
 {'Acres': '76519',
  'Code': 'ARCH',
  'Date': '1971-11-12',
  'Description \n': '"This site features more than 2\\,000 natural sandstone '
                    'arches, with some of the most popular arches in the park '
                    'being Delicate Arch, Landscape Arch and Double Arch. '
                    'Millions of years of erosion have created these '
                    'structures in a desert climate where the arid ground has '
                    'life-sustaining biological soil crusts and potholes that '
                    'serve as natural water-collecting basins. Other geologic '
                    'formations include stone pinnacles, fins, and balancing '
                    'rocks." \n',
  'Latitude': '38.68',
  'Longitude': '-109.57',
  'Name': 'Arches National Park',
  'State': 'UT'},
 {'Acres': '242756',
  'Code': 'BADL',
  'Date': '1978-11-10',
  'Description \n': '"The Badlands are a collection of buttes, pinnacles, '
                    'spires, and mixed-grass prairies. The White River '
                    'Badlands contain the largest assemblage of known late '
                    'Eocene and Oligocene mammal fossils. The wildlife '
                    'includes bison, bighorn sheep, black-footed ferrets, and '
                    'prairie dogs."\n',
  'Latitude': '43.75',
  'Longitude': '-102.5',
  'Name': 'Badlands National Park',
  'State': 'SD'},
 {'Acres': '801163',
  'Code': 'BIBE',
  'Date': '1944-06-12',
  'Description \n': '"Named for the prominent bend in the Rio Grande along the '
                    'U.S.-Mexico border, this park encompasses a large and '
                    'remote part of the Chihuahuan Desert. Its main attraction '
                    'is backcountry recreation in the arid Chisos Mountains '
                    'and in canyons along the river. A wide variety of '
                    'Cretaceous and Tertiary fossils as well as cultural '
                    'artifacts of Native Americans also exist within its '
                    'borders." \n',
  'Latitude': '29.25',
  'Longitude': '-103.25',
  'Name': 'Big Bend National Park',
  'State': 'TX'},
 {'Acres': '172924',
  'Code': 'BISC',
  'Date': '1980-06-28',
  'Description \n': '"The central part of Biscayne Bay, this mostly underwater '
                    'park at the north end of the Florida Keys has four '
                    'interrelated marine ecosystems: mangrove forest, the Bay, '
                    'the Keys, and coral reefs. Threatened animals include the '
                    'West Indian manatee, American crocodile, various sea '
                    'turtles, and peregrine falcon." \n',
  'Latitude': '25.65',
  'Longitude': '-80.08',
  'Name': 'Biscayne National Park',
  'State': 'FL'},
 {'Acres': '32950',
  'Code': 'BLCA',
  'Date': '1999-10-21',
  'Description \n': '"The park protects a quarter of the Gunnison River, which '
                    'slices sheer canyon walls from dark Precambrian-era rock. '
                    'The canyon features some of the steepest cliffs and '
                    'oldest rock in North America, and is a popular site for '
                    'river rafting and rock climbing. The deep, narrow canyon '
                    'is composed of gneiss and schist, which appears black '
                    'when in shadow." \n',
  'Latitude': '38.57',
  'Longitude': '-107.72',
  'Name': 'Black Canyon of the Gunnison National Park',
  'State': 'CO'},
 {'Acres': '35835',
  'Code': 'BRCA',
  'Date': '1928-02-25',
  'Description \n': '"Bryce Canyon is a geological amphitheater on the '
                    'Paunsaugunt Plateau with hundreds of tall, multicolored '
                    'sandstone hoodoos formed by erosion. The region was '
                    'originally settled by Native Americans and later by '
                    'Mormon pioneers." \n',
  'Latitude': '37.57',
  'Longitude': '-112.18',
  'Name': 'Bryce Canyon National Park',
  'State': 'UT'},
 {'Acres': '337598',
  'Code': 'CANY',
  'Date': '1964-09-12',
  'Description \n': '"This landscape was eroded into a maze of canyons, '
                    'buttes, and mesas by the combined efforts of the Colorado '
                    'River, Green River, and their tributaries, which divide '
                    'the park into three districts. The park also contains '
                    'rock pinnacles and arches, as well as artifacts from '
                    'Ancient Pueblo peoples." \n',
  'Latitude': '38.2',
  'Longitude': '-109.93',
  'Name': 'Canyonlands National Park',
  'State': 'UT'},
 {'Acres': '241904',
  'Code': 'CARE',
  'Date': '1971-12-18',
  'Description \n': '"The park\'s Waterpocket Fold is a 100-mile (160 km) '
                    "monocline that exhibits the earth's diverse geologic "
                    'layers. Other natural features include monoliths, cliffs, '
                    'and sandstone domes shaped like the United States '
                    'Capitol." \n',
  'Latitude': '38.2',
  'Longitude': '-111.17',
  'Name': 'Capitol Reef National Park',
  'State': 'UT'},
 {'Acres': '46766',
  'Code': 'CAVE',
  'Date': '1930-05-14',
  'Description \n': '"Carlsbad Caverns has 117 caves, the longest of which is '
                    'over 120 miles (190 km) long. The Big Room is almost '
                    '4,000 feet (1,200 m) long, and the caves are home to over '
                    '400,000 Mexican free-tailed bats and sixteen other '
                    'species. Above ground are the Chihuahuan Desert and '
                    'Rattlesnake Springs." \n',
  'Latitude': '32.17',
  'Longitude': '-104.44',
  'Name': 'Carlsbad Caverns National Park',
  'State': 'NM'},
 {'Acres': '249561',
  'Code': 'CHIS',
  'Date': '1980-03-05',
  'Description \n': '"Five of the eight Channel Islands are protected, with '
                    "half of the park's area underwater. The islands have a "
                    'unique Mediterranean ecosystem originally settled by the '
                    'Chumash people. They are home to over 2,000 species of '
                    'land plants and animals, 145 endemic to them, including '
                    'the island fox. Ferry services offer transportation to '
                    'the islands from the mainland." \n',
  'Latitude': '34.01',
  'Longitude': '-119.42',
  'Name': 'Channel Islands National Park',
  'State': 'CA'},
 {'Acres': '26546',
  'Code': 'CONG',
  'Date': '2003-11-10',
  'Description \n': '"On the Congaree River, this park is the largest portion '
                    'of old-growth floodplain forest left in North America. '
                    'Some of the trees are the tallest in the eastern United '
                    'States. An elevated walkway called the Boardwalk Loop '
                    'guides visitors through the swamp." \n',
  'Latitude': '33.78',
  'Longitude': '-80.78',
  'Name': 'Congaree National Park',
  'State': 'SC'},
 {'Acres': '183224',
  'Code': 'CRLA',
  'Date': '1902-05-22',
  'Description \n': '"Crater Lake lies in the caldera of an ancient volcano '
                    'called Mount Mazama that collapsed 7,700 years ago. The '
                    'lake is the deepest in the United States and is noted for '
                    'its vivid blue color and water clarity. Wizard Island and '
                    'the Phantom Ship are more recent volcanic formations '
                    'within the caldera. As the lake has no inlets or outlets, '
                    'it is replenished only by precipitation." \n',
  'Latitude': '42.94',
  'Longitude': '-122.1',
  'Name': 'Crater Lake National Park',
  'State': 'OR'},
 {'Acres': '32950',
  'Code': 'CUVA',
  'Date': '2000-10-11',
  'Description \n': '"This park along the Cuyahoga River has waterfalls, '
                    'hills, trails, and exhibits on early rural living. The '
                    'Ohio and Erie Canal Towpath Trail follows the Ohio and '
                    'Erie Canal, where mules towed canal boats. The park has '
                    'numerous historic homes, bridges, and structures, and '
                    'also offers a scenic train ride."\n',
  'Latitude': '41.24',
  'Longitude': '-81.55',
  'Name': 'Cuyahoga Valley National Park',
  'State': 'OH'}]