# Sets and Dictionaries

## Created by C. Woodford

#### We know how to create and use lists quite efficiently at this point, but lists can't do everything and are by no means the only way to store and use information. A set is "an unordered collection with no duplicate elements". One major difference between sets and lists is how they're formed: you can either call the "set()" function or write the elements inside curly braces "{}". Note that empty braces create an empty dictionary, not an empty set, so use "set()" when you need an empty set.

In [2]:
# Let's try some simple operations with simple sets before moving on to dictionaries.

basket1 = {'apple','orange','mango','papaya','grape'}
basket2 = {'papaya','strawberry','kiwi','grapefruit'}

#Membership Testing
#'orange' in basket1  

'orange' in basket2

False

In [7]:
#Differences in the sets, what do these print? Uncomment one line at a time to find out.

#basket1-basket2 #Elements of basket 1 that are not in basket 2

#basket2-basket1 #Elements of basket 2 that are not in basket 1

#basket1|basket2 #Union of basket 1 and basket 2

#basket1&basket2 #Common elements (intersection)

basket1^basket2 #Elements in exactly one of the baskets (i.e. union minus intersection)

{'apple', 'grape', 'grapefruit', 'kiwi', 'mango', 'orange', 'strawberry'}
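
#### Each set operator above also has a named method. The sketch below repeats the same four operations with methods instead of symbols; the variable names (only_in_1, either, both, not_shared) are made up for illustration.

```python
# Same baskets as in the cells above
basket1 = {'apple', 'orange', 'mango', 'papaya', 'grape'}
basket2 = {'papaya', 'strawberry', 'kiwi', 'grapefruit'}

# Each method returns a new set and leaves the originals unchanged
only_in_1 = basket1.difference(basket2)              # same as basket1 - basket2
either = basket1.union(basket2)                      # same as basket1 | basket2
both = basket1.intersection(basket2)                 # same as basket1 & basket2
not_shared = basket1.symmetric_difference(basket2)   # same as basket1 ^ basket2

# sorted() turns a set into a list with a predictable order for printing
print(sorted(both))
print(sorted(not_shared))
```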

In [8]:
#You can also define a set using the set function

a = set('abracadabra') #What does this set look like? What can you do with it?

In [9]:
a

{'a', 'b', 'c', 'd', 'r'}
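
#### One possible answer to "what can you do with it?": a set is a quick way to drop duplicates, and it can be changed after it is created. This is only a sketch; the names letters and unique_letters are made up for illustration.

```python
letters = list('abracadabra')    # a list keeps every character, duplicates included
unique_letters = set(letters)    # a set keeps one copy of each character

print(len(letters))              # number of characters in the word
print(len(unique_letters))       # number of distinct characters

a = set('abracadabra')
a.add('z')          # add a single element
a.add('a')          # adding an element that is already present changes nothing
a.discard('r')      # remove an element; no error if it is missing
print(sorted(a))    # sets have no order, so sort before printing
```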

#### Now that we understand sets a little better, let's move on to dictionaries. Dictionaries are "associative arrays" that can be indexed by any hashable (immutable) type, such as strings, numbers, or tuples, but not by mutable types such as lists (why do you think that is?). A dictionary resembles a set whose elements are keys, each with a value attached: it is a collection of key:value pairs with no duplicate keys. Since Python 3.7, a dictionary also remembers the order in which its keys were inserted.
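
#### A small sketch of which kinds of keys a dictionary accepts, and of what "no duplicate keys" means in practice. The dictionary name grid and its contents are made up for illustration.

```python
grid = {}
grid[(0, 1)] = 'north'        # a tuple is immutable, so it works as a key
grid[42] = 'answer'           # so does an integer
grid['name'] = 'string key'   # and a string

try:
    grid[[0, 1]] = 'oops'     # a list is mutable, so Python refuses it as a key
except TypeError as err:
    print("TypeError:", err)

# Assigning to an existing key overwrites its value instead of adding a duplicate
grid[42] = 'new answer'
print(len(grid))
```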

In [15]:
tel = {'jack':4098, 'sape':4139} #dictionary using strings as keys and integers as values

#Add more items simply with
tel['guido'] = 4190 #the new pair goes at the end: dictionaries keep insertion order (Python 3.7+), they are not sorted by key

In [21]:

tel['john'] = 1000

In [22]:
#How might we verify if an element is in the dictionary tel?
'jack' in tel

True

In [23]:
#You can print the keys or the values of tel using the keys() and values() methods:

print("tel.keys = "+repr(tel.keys()))
print("tel.values = "+repr(tel.values()))

#What do you notice about the representations of these?

tel.keys = dict_keys(['jack', 'sape', 'guido', 'john'])
tel.values = dict_values([4098, 4139, 4190, 1000])
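
#### The dict_keys and dict_values objects printed above are "views" rather than lists or tuples. The sketch below shows two consequences: a view follows later changes to the dictionary, and it has to be converted to a list before it can be indexed. The entry 'mary' is a made-up example.

```python
tel = {'jack': 4098, 'sape': 4139, 'guido': 4190, 'john': 1000}

keys_view = tel.keys()         # a view object tied to tel
tel['mary'] = 2222             # hypothetical extra entry, added after the view was made
print(keys_view)               # the view already includes 'mary'

key_list = list(tel.keys())    # take a snapshot as a real list when you need indexing
print(key_list[0])

for name, number in tel.items():   # items() yields (key, value) pairs
    print(name, number)
```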


In [27]:
#What if you want to sort the dictionary? We can use "sorted" on either the keys or values, depending on what you want.
#Try it out!
l = [1,26,12,3,75]
l1 = ['afg','as','le1','geq']

print(sorted(l)) #Sorted for a regular numerical list
print(sorted(l1)) #Sorted for a regular string list

#Now try it for the dictionary tel - what do you notice?
sorted(tel.values())

[1, 3, 12, 26, 75]
['afg', 'as', 'geq', 'le1']


[1000, 4098, 4139, 4190]
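
#### Sorting the values alone loses track of which key each value belonged to. One way to keep the pairs together, sketched here, is to sort tel.items() and tell sorted() which part of each pair to compare. The names names, by_number and sorted_tel are made up for illustration.

```python
tel = {'jack': 4098, 'sape': 4139, 'guido': 4190, 'john': 1000}

names = sorted(tel)   # sorting a dictionary directly sorts its keys

# Sort the (key, value) pairs by the value, i.e. element 1 of each pair
by_number = sorted(tel.items(), key=lambda pair: pair[1])

# Rebuild a dictionary from the sorted pairs; this relies on
# dictionaries keeping insertion order (Python 3.7+)
sorted_tel = dict(by_number)

print(names)
print(by_number)
print(sorted_tel)
```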

In [36]:
#You can also create a dictionary using "dict()", just like we could for sets

tel1 = dict([('dan',3464),('jim',5752),('sam',5730)])
tel2 = dict(dan=3464,jim=5752,sam=5730) #Only works when the keys are strings that are valid Python identifiers

#Are these two dictionaries different from one another? How could you compare them to see?
print(repr(tel1==tel2))

#Try adding a key:value pair to tel1
tel1['richard'] = 4567
print(tel1)

#Try removing a key:value pair from tel2
del tel2['sam']
print(tel2)

True
{'dan': 3464, 'jim': 5752, 'sam': 5730, 'richard': 4567}
{'dan': 3464, 'jim': 5752}
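
#### Indexing or deleting with a key that is not in the dictionary raises a KeyError. The sketch below shows a few built-in dictionary methods that avoid this; the keys 'zoe' and 'amy' are made-up examples.

```python
tel1 = dict(dan=3464, jim=5752, sam=5730)

print(tel1.get('sam'))       # value for an existing key
print(tel1.get('zoe'))       # None for a missing key, no KeyError
print(tel1.get('zoe', 0))    # or a default value of your choice

removed = tel1.pop('sam')          # removes the pair and returns its value
missing = tel1.pop('zoe', None)    # the default avoids a KeyError for a missing key

# update() changes existing keys and adds new ones in a single call
tel1.update({'jim': 1111, 'amy': 2222})
print(tel1)
```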


In [46]:
#We can use dictionaries in recursive functions to store results and avoid repeating calculations (memoization).
#In class, we looked at factorials, the Fibonacci sequence, the greatest common divisor,
#the sum of the first n natural numbers, and the Towers of Hanoi as recursive functions. Here is how you
#could use dictionaries in a few of these functions:

#Fibonacci sequence
memo_fib = {0:0, 1:1}  #We know the first 2 terms, so can put them in there
def fib_d(n):
    if n not in memo_fib:
        memo_fib[n] = fib_d(n-1)+fib_d(n-2)
    print("Hey!")  #So you can see how many times the function completes - is it the same for every call?
    return memo_fib[n]

def fib(n):
    if n==0:
        return 0
    elif n==1:
        return 1
    else:
        return fib(n-1) + fib(n-2)
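
#### To compare the two approaches without reading through printed lines, one option is to count the calls. The sketch below uses its own copies of the two functions so that it can be run on its own; calls, fib_counted, fib_memo_counted and memo are made-up names, not part of the cell above.

```python
calls = {'plain': 0, 'memo': 0}   # call counters, stored in a dictionary as well

def fib_counted(n):
    """Plain recursion: recomputes the same terms over and over."""
    calls['plain'] += 1
    if n < 2:
        return n
    return fib_counted(n - 1) + fib_counted(n - 2)

memo = {0: 0, 1: 1}               # the first two terms are known in advance

def fib_memo_counted(n):
    """Memoized recursion: each term is computed once, then looked up."""
    calls['memo'] += 1
    if n not in memo:
        memo[n] = fib_memo_counted(n - 1) + fib_memo_counted(n - 2)
    return memo[n]

print(fib_counted(20), fib_memo_counted(20))   # both give the same term
print(calls)                                   # compare the number of calls
```

#### The plain version makes thousands of calls for n = 20, while the memoized version makes a few dozen, and a second call to fib_memo_counted(20) would need only one.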

In [58]:
#Greatest common divisor

def gcd(x,y):
    if x<0 or y<0:
        print("input positive integers")
        return
    if y == 0:
        return x 
    else:
        return gcd( y, x%y)

memo_gcd = dict()  #empty dictionary, as we don't know what the first elements could be
                    #Note that you need to redefine memo_gcd before each gcd_d function call! - Why?
def gcd_d(x,y):
    if y not in memo_gcd:
        memo_gcd[y] = gcd(y,x%y)
    return memo_gcd[y]

In [63]:
memo_gcd = dict() #Why do I need this additional line for the function call?
gcd_d(58,48)

2
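
#### One possible answer to the "why?" above: gcd_d stores its result under y alone, so a later call with the same y but a different x would get back a stale answer unless the dictionary is emptied first. A sketch of an alternative is to use the pair (x, y) as the key, since tuples can be dictionary keys. The names memo_gcd_pairs and gcd_pairs are made up for illustration, and the function assumes non-negative integers.

```python
memo_gcd_pairs = {}   # keys are (x, y) tuples, values are the gcd of that pair

def gcd_pairs(x, y):
    """Recursive Euclidean algorithm with a memo keyed on both arguments."""
    if y == 0:
        return x
    if (x, y) not in memo_gcd_pairs:
        memo_gcd_pairs[(x, y)] = gcd_pairs(y, x % y)
    return memo_gcd_pairs[(x, y)]

print(gcd_pairs(58, 48))    # 2
print(gcd_pairs(100, 48))   # 4, even though y is the same and the memo was not reset
```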