Learning Python Like a Computer Scientist (7)

Lists + Dictionaries

Python Study Notes - Learning Python Like a Computer Scientist (6)

Lists

Lists differ from strings in that lists are mutable, while strings are immutable.

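t.append() adds a new element to the end of the list

t.extend() adds all the elements of the list passed as its argument to the list t

t.sort() sorts the elements in place

Note that none of these methods returns a value (they modify the list in place and return None).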
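A short sketch of why the missing return value matters. The list contents are made up for illustration. Assigning the result of one of these methods back to a variable is an easy mistake, because the result is None:

```python
t = ['b', 'a']
t.append('c')            # t is now ['b', 'a', 'c']
t.extend(['e', 'd'])     # t is now ['b', 'a', 'c', 'e', 'd']
result = t.sort()        # sorts t in place and returns None
print(t)                 # ['a', 'b', 'c', 'd', 'e']
print(result)            # None
```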

total += a → total = total + a

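t.pop() removes the element at the given index (the last element if no index is given) and returns the removed value

del t[] removes the element at the given index; del is a statement, so there is no return value

t.remove() removes the first element equal to the given value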
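The three ways of deleting can be compared side by side. This is only a small sketch with a made-up list:

```python
t = ['a', 'b', 'c', 'd', 'e']
popped = t.pop(1)    # removes and returns the element at index 1
del t[0]             # removes the element at index 0, no value is produced
t.remove('d')        # removes the first element equal to 'd'
print(popped)        # b
print(t)             # ['c', 'e']
```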

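list(s) returns a list of the characters in the string s

s.split(delimiter) splits the string at each delimiter and returns the pieces as a list (with no argument it splits on whitespace, giving the individual words). The inverse operation is delimiter.join(t)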
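A minimal round trip through split and join, using an example string chosen only for illustration:

```python
s = 'spam-spam-spam'
delimiter = '-'
t = s.split(delimiter)        # ['spam', 'spam', 'spam']
joined = delimiter.join(t)    # 'spam-spam-spam', the original string again
letters = list('spam')        # ['s', 'p', 'a', 'm']
print(t, joined, letters)
```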

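Note the difference between is and ==: the former tests whether two names refer to the same object (identical), the latter whether two objects have the same value (equivalent).

Exercises: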
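A quick sketch of the distinction, with two separately built lists and one alias (the variable names are just for illustration):

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a separate list with the same contents
c = a           # an alias: c refers to the same object as a
print(a == b)   # True
print(a is b)   # False
print(a is c)   # True
```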

10-1

Python
def nested_sum(t):
    # add up the elements of every nested list
    total = 0
    for nested in t:
        total += sum(nested)
    return total
print(nested_sum([[1,2],[3,4]]))

10-2

Python
def cumsum(t):
    # keep a running total instead of re-summing every prefix
    total = 0
    res = []
    for x in t:
        total += x
        res.append(total)
    return res
print(cumsum([1,2,3,4,5]))

10-3

Python
def middle(t):
    # return a new list without the first and last elements; t is unchanged
    return t[1:-1]
print(middle([1,2,3,4,5]))

10-4

Python
def chop(t):
    t.pop(0)
    t.pop(len(t)-1)
t=[1,2,3,4,5]
chop(t)
print(t)

10-5

Python
def is_sorted(t):
    # sorted() returns a new list, so the caller's list is not modified.
    # (A plain t2 = t would only make t2 an alias of t, so sorting t
    # in place would change t2 as well.)
    return t == sorted(t)

print(is_sorted([1,2,3,2,4,5]))

10-6

Python
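def is_anagram(str1, str2):
    # two strings are anagrams if they contain the same letters,
    # so compare their sorted letters rather than the reversed string
    return sorted(str1) == sorted(str2)

print(is_anagram('1234','3321'))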

10-7

Python
def has_duplicates(t):
    t2 = t + []
    t2.sort()
    for i in range(len(t2)-1):
        if t2[i] == t2[i+1]:
            return True
    return False
print(has_duplicates([2,3,4,6,2]))

10-9

Python
from timeit import timeit
fin = open('words.txt')
l = []
def read1():
    for line in fin:
        l.append(line)
def read2():
    for line in fin:
        l = l + [line]
print(timeit('read1()','from __main__ import read1'), timeit('read2()','from __main__ import read2'))

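Testing turned up something interesting: when the program is executed fewer than about a million times, using + directly is faster, and the fewer the executions, the bigger the advantage of +. Once the number of executions goes up, though, the advantage of append gradually shows.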

0.004870300000000001 6.199999999999956e-05
number = 100
0.0053992 5.7000000000001494e-05
number = 1000
0.008607299999999998 0.0005729000000000012
number = 10000
0.060827599999999996 0.054653999999999994
number = 100000
0.49591250000000003 0.4961783999999999
number = 1000000
4.8199121 4.7742455999999995
number = 10000000
47.8154918 49.5350966

I don't know the reason yet. I'll come back and look into it later.
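One possible explanation, offered as a guess rather than a confirmed diagnosis: fin is a single file object, so the first call to read1 consumes it, and every later call (including every call to read2) loops over an exhausted file and does almost nothing. Both timings would then measure little more than function-call overhead. It would also hide the fact that l = l + [line] makes l a local variable inside read2, which would raise UnboundLocalError if the loop body ever ran. The sketch below rebuilds the list from scratch on each call so that both functions do the same amount of real work. The synthetic lines list stands in for words.txt and is an assumption, as are the function names.

```python
from timeit import timeit

# Synthetic stand-in for the lines of words.txt (an assumption, so the
# sketch runs without the file).
lines = ['word%d\n' % i for i in range(2000)]

def read_append():
    result = []
    for line in lines:
        result.append(line)       # adds one element in place
    return result

def read_concat():
    result = []
    for line in lines:
        result = result + [line]  # builds a new list on every iteration
    return result

time_append = timeit(read_append, number=20)
time_concat = timeit(read_concat, number=20)
print(time_append, time_concat)
```

Because each call starts from an empty local list, the measured time no longer depends on what earlier calls left behind.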

10-10

Being lazy and not reinventing the wheel here; see the bisect documentation.

Python
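import bisect
t = []
with open('words.txt') as fin:
    for line in fin:
        t.append(line.strip())   # drop the trailing newline
t.sort()                         # sort once, after the list is built
def index(a, x):
    # bisect_left returns the insertion point; x is present only if
    # the element already at that position equals x
    i = bisect.bisect_left(a, x)
    if i < len(a) and a[i] == x:
        return i
    raise ValueError
print(index(t,'chammy'))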

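Dictionaries

A dictionary is a mapping. It contains a collection of indices (keys) and a collection of values. The association between a key and a value is called a key-value pair, or sometimes an item.

Use the in operator to check whether something is a key in the dictionary.

To look at the values, use the values method, which returns a collection of the values.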
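A small sketch of the difference between testing keys and testing values. The dictionary eng2sp is made up for illustration:

```python
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
print('one' in eng2sp)             # True: 'one' is a key
print('uno' in eng2sp)             # False: 'uno' is a value, not a key
print('uno' in eng2sp.values())    # True
```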

Exercise from section 11.2:

Python
def histogram(s):
    d = dict()
    for c in s:
        d[c] = d.get(c, 0) + 1
    return d

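VS ran into a problem, but I didn't keep the logs. Lesson learned: keep the logs whenever something goes wrong!

Use the built-in function sorted to traverse all the keys in order: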

Python
h = histogram('brontosaurus')   # example input
for key in sorted(h):
    print(key, h[key])          # index with [], a dictionary is not callable

The raise statement raises an exception.
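As an illustration, here is a minimal reverse lookup that raises when the value is not found. The function name, the sample dictionary, and the choice of LookupError are assumptions made for this sketch:

```python
def reverse_lookup(d, v):
    """Return the first key in d that maps to the value v."""
    for k in d:
        if d[k] == v:
            return k
    # no key maps to v, so signal the failure with an exception
    raise LookupError('value does not appear in the dictionary')

h = {'a': 1, 'p': 2}
print(reverse_lookup(h, 2))    # p
try:
    reverse_lookup(h, 3)
except LookupError as err:
    print('caught:', err)
```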

Exercises

11-1

Python
def alphabet():
    dictionary = dict()
    with open('words.txt') as fin:   # the file is closed automatically
        for key in fin:
            dictionary[key.strip()] = 'unknown'
    return dictionary
print('all' in alphabet())

Although it is obviously faster, let's still measure it with timeit.

Python
from timeit import timeit
import bisect
t = []
with open('words.txt') as fin:
    for line in fin:
        t.append(line.strip())
t.sort()   # sort once, after the list is built
def index(a, x):
    # x is present only if the element at the insertion point equals x
    i = bisect.bisect_left(a, x)
    if i < len(a) and a[i] == x:
        return i
    raise ValueError

def alphabet():
    dictionary = dict()
    with open('words.txt') as fin:
        for key in fin:
            dictionary[key.strip()] = 'unknown'
    return dictionary

def read1():
    # rebuilds the whole dictionary on every call
    'chammy' in alphabet()

def read2():
    # searches the list that was built once at import time
    index(t,'chammy')

print(timeit('read1()','from __main__ import read1',number=1), timeit('read2()','from __main__ import read2',number=1))

The result was not what I expected. Judging from the earlier experience, the problem is probably with number.

Python
from timeit import timeit
import bisect

def index(x):
    t = []
    with open('words.txt') as fin:
        for line in fin:
            t.append(line.strip())
            # sorting inside the loop repeats the sort for every word;
            # kept here because this is the version that was timed
            t.sort()
    # x is present only if the element at the insertion point equals x
    i = bisect.bisect_left(t, x)
    if i < len(t) and t[i] == x:
        return i
    raise ValueError

def alphabet():
    dictionary = dict()
    with open('words.txt') as fin:
        for key in fin:
            dictionary[key.strip()] = 'unknown'
    return dictionary

def read1():
    'chammy' in alphabet()

def read2():
    index('chammy')

print(timeit('read1()','from __main__ import read1',number=1), timeit('read2()','from __main__ import read2',number=1))

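After a small change, both functions read the file themselves. It looks like building the list takes more of the time. Note that t.sort() runs on every loop iteration here, which likely costs far more than append itself.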

11-2

From the documentation:

setdefault(key[, default])

If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
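The three cases described in the documentation can be tried directly. The dictionary d below is a made-up example:

```python
d = {'a': 1}
print(d.setdefault('a', 99))    # 1: the key exists, so nothing changes
print(d.setdefault('b', 2))     # 2: the key was missing, so it is inserted
print(d.setdefault('c'))        # None: default defaults to None
print(d)                        # {'a': 1, 'b': 2, 'c': None}
```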

Python
def invert_dict(d):
    inverse = dict()
    for key in d:
        val = d[key]
        # several keys may share a value, so collect them in a list
        inverse.setdefault(val, []).append(key)
    return inverse
d = {'one':'uno','two':'dos'}
print(invert_dict(d))

11-3

Python
known = dict()
def ack(m, n):
    key = (m,n)
    if key in known:
        return known[key]
    elif m == 0:
        res = n+1
        known[key] = res
        return res
    elif m > 0 and n == 0:
        res = ack(m-1, 1)
        known[key] = res
        return res
    elif m > 0 and n > 0:
        res = ack(m-1, ack(m, n-1))
        known[key] = res
        return res
    else:
       return('No result')
print(ack(3, 4))

Since a tuple works as the key, why would I look for any other approach?

Trying larger inputs:

RecursionError: maximum recursion depth exceeded in comparison

11-4

Python
def has_duplicates(t):
    d = dict()
    for x in t:
        if x in d:      # x was already seen
            return True
        d[x] = True
    return False
print(has_duplicates([2,3,4,6]))

11-5

Python
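def rotate_word(original_str, key):
    '''Rotate each letter of a word (assumes lowercase a-z).
    original_str: the original string
    key: number of places to rotate'''
    after_str = ''
    for a in original_str:
        # shift by key and wrap around the alphabet
        n = (ord(a) - ord('a') + key) % 26 + ord('a')
        after_str = after_str + chr(n)
    return after_str

def make_word_dict():
    '''Turn the word list into a dictionary with the words as keys'''
    d = dict()
    with open('words.txt') as fin:
        for line in fin:
            d[line.strip()] = None
    return d

def rotated_pairs(word, word_dict):
    '''Print every rotation of a single word that is also a key of the dictionary
    word: string
    word_dict: dictionary with the words as keys'''
    for i in range(1, 14):
        rotated = rotate_word(word, i)
        if rotated in word_dict:
            print(word, i, rotated)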
            
word_dict = make_word_dict()
for word in word_dict:
    rotated_pairs(word, word_dict)

11-6

Python
from __future__ import print_function, division


def read_dictionary(filename='069s'):
    """Reads from a file and builds a dictionary that maps from
    each word to a string that describes its primary pronunciation.

    Secondary pronunciations are added to the dictionary with
    a number, in parentheses, at the end of the key, so the
    key for the second pronunciation of "abdominal" is "abdominal(2)".

    filename: string
    returns: map from string to pronunciation
    """
    d = dict()
    fin = open(filename)
    for line in fin:

        # skip over the comments
        if line[0] == '#': continue

        t = line.split()
        word = t[0].lower()
        pron = ' '.join(t[1:])
        d[word] = pron

    return d

def homophone(word, word_list):
    '''Find words in the pronunciation dictionary that satisfy a specific condition
    word: string
    word_list: Python dictionary of pronunciations'''
    word2 = word[1:]
    word3 = word[0:1]+word[2:]
    if word2 in word_list and word3 in word_list:
        if word_list[word] == word_list[word2] ==word_list[word3]:
            print(word)

d = read_dictionary()
for word in d:
    homophone(word, d)

Article link: http://pheustal.com/2019/12-10/thinkpython7
This work is licensed under CC-BY-SA.