Functions¶

Functions are an important component of any programming language. A function is a block of code that runs as a unit whenever it is called.

First, functions lead to better reusability. Remember the built-in functions we used (e.g., the int() function, which turns a value into an integer)? Do you know the details of how the function works? Did you have to copy its source code into your notebook? No, you simply called it.
Second, functions allow you to logically divide your project into a set of sub-tasks, which makes the project more efficient to build and easier to manage.

The syntax of a function:

#Here is the definition of a function named func
def func(a, b): # func is a function name which you will use for reference. a and b are parameters
    statement1
    statement2
    return result

# here is how we call a function
x = func(3,4) # 3 and 4 are two (positional) arguments which we pass to the function func; they correspond to the two parameters by position.
y = func(a=3, b=4) # this time we explicitly assign the two arguments to the two parameters by name.
print(x)
print(y)

Glossary:
Parameter Vs. Argument: Parameters are variables used by a function definition to receive passed-in values when the function is called. Arguments are values that are passed into a function when the function is invoked/called. Oftentimes these two terms are used interchangeably.
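
To make the distinction concrete, here is a minimal sketch. The function name rectangle_area is hypothetical and used only for illustration: width and height are its parameters, and the values passed in each call are the arguments.

```python
# rectangle_area is a hypothetical function used only for illustration.
def rectangle_area(width, height):  # width and height are parameters
    return width * height

# 3 and 4 are positional arguments, matched to the parameters by position.
area_positional = rectangle_area(3, 4)
# Keyword arguments are matched by name, so their order does not matter.
area_keyword = rectangle_area(height=4, width=3)
print(area_positional, area_keyword)
```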

Functions with no returns¶

#let's define a function that takes 2 arguments and does an addition operation
def f(a,b):
    c = a+b
    print(c)

# Now let's call this function to calculate 3+4
f(3,4)

7

f(a=3, b=4)

7

f(b=4, a=3)

7

Functions with returns¶

def with_return(a,b):
    return (a, b)

x = with_return(3,4)

print(x,type(x))

(3, 4) <class 'tuple'>
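
The difference between printing and returning becomes visible when the result of a call is stored in a variable. The sketch below uses two hypothetical functions, add_and_print and add_and_return, which are not part of the examples above.

```python
def add_and_print(a, b):   # hypothetical: prints the sum but returns nothing
    print(a + b)

def add_and_return(a, b):  # hypothetical: returns the sum without printing
    return a + b

printed = add_and_print(3, 4)    # the sum is printed, but printed holds None
returned = add_and_return(3, 4)  # nothing is printed, returned holds the sum
print(printed, returned)
```

A function without a return statement still returns a value, namely None, so its result cannot be reused in later calculations.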

Parameters can have default values.

def add_or_sub(a, b, mode = 'add'): # mode has a default value.
    if mode == 'add':
        print(a+b) # if mode is 'add', do addition
    else:
        print(a-b) # for any other value of mode, do subtraction

add_or_sub(4,3)

7

add_or_sub(4,3,'add')

7

add_or_sub(4,3,'else')

1
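
A parameter with a default value can also be set by name, which often reads more clearly than a third positional argument. The following sketch uses add_or_sub_value, a hypothetical variant of add_or_sub that returns the result instead of printing it.

```python
# add_or_sub_value is a hypothetical variant that returns instead of printing.
def add_or_sub_value(a, b, mode='add'):
    if mode == 'add':
        return a + b
    return a - b  # any mode other than 'add' subtracts

total = add_or_sub_value(4, 3)                   # mode falls back to 'add'
difference = add_or_sub_value(4, 3, mode='sub')  # mode is passed by keyword
print(total, difference)
```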

Anonymous Function (Pythonic)¶

In Python, a function does not need to have a name. Such anonymous functions shorten your code and are heavily used in Pandas. To create an anonymous function, we write a lambda expression.

Syntax:

varName = lambda argument_list: expression

Note that a lambda can take more than one argument but can contain only one expression. That expression is evaluated and its value is returned.

plus2 = lambda x: x+2 # basically we assign a function to a variable
plus2(3)

5

The anonymous function above is equivalent to the code below, written in the traditional way.

def plus2(x):
    return x+2
plus2(3)

5

We will re-visit the lambda expression when we talk about Pandas.
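
Two further uses of lambda are sketched here: one that takes two arguments, and one passed directly to another function without ever being given a name. The variable names are chosen only for illustration.

```python
# A lambda with two arguments and a single expression.
multiply = lambda x, y: x * y
product = multiply(3, 4)

# A lambda passed directly as the key argument of the built-in sorted().
names = ['Tom', 'Jerry', 'Al']
by_length = sorted(names, key=lambda name: len(name))
print(product, by_length)
```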

Built-in Functions¶

Python has a number of ready-to-use functions that are always available without importing a package. They are called built-in functions; examples are the int(), str(), and bool() functions we used. A full list can be found in the official Python documentation.

#Return the absolute value of a number
abs(-4.2)

4.2

#Create a dictionary object
x = dict()
print(type(x))

<class 'dict'>

# create a set object
x = set([3,3,4])
print(x)

{3, 4}

#Returns an enumerate object
list_x = ['Tom',"Jerry","Kevin"]# list_x is a list
enumerate_x = enumerate(list_x) # enumerate_x is turned into an enumerate object

for x in list_x:
    print(x)

Tom
Jerry
Kevin

for x in enumerate_x:
    print(x)

(0, 'Tom')
(1, 'Jerry')
(2, 'Kevin')

#get the length of an object (such as string, tuple, list, dictionary,set)
a = [2,3,4]
print(len(a))

3

#get a sum of all the elements in a sequence
a = [3,4,5]
print(sum(a))

12

#get the type of an object
print(type('time'))

<class 'str'>

#returns the largest item in an iterable object or among two or more arguments
max(2,3,4)

4

max([2,3,4])

4

#returns the smallest item
min(2,3,4)

2

min([2,3,4])

2

The map() function, a very powerful built-in function, is used when you want to apply a function (often a lambda function) to every element of an iterable. The syntax is:

map_object = map(function, iterable)

#assuming we have a list object, [1,2,3,4,5]
#my goal is to apply an arithmetic operation, x*10+1, to each element of this list
#The expected result is [11,21,31,41,51]

a = [1,2,3,4,5]
result = map(lambda x: x*10+1, a)
for item in result: 
    print(item)

11
21
31
41
51
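
Two details of map() are easy to miss, so here is a short sketch of them. A map object is an iterator that can be consumed only once, and map() accepts more than one iterable when the function takes more than one argument. The variable names are illustrative.

```python
a = [1, 2, 3]
b = [10, 20, 30]

mapped = map(lambda x: x * 2, a)
doubled = list(mapped)   # consumes the iterator and builds a list
leftover = list(mapped)  # the iterator is now exhausted, so this is empty

# With two iterables, the function receives one element from each.
pair_sums = list(map(lambda x, y: x + y, a, b))
print(doubled, leftover, pair_sums)
```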

Exercise: Functions¶

Exercise: Functions

Solution:Functions

Scope of Variables¶

A concept that is closely connected to functions is the scope of variables. Scope refers to the visibility of variables, in other words, which parts of your code can see or use which variables. Every variable has an associated scope. There are two kinds of scope: global scope and local scope.

Variables in the global scope can be seen and used anywhere in your program.
Variables in a local scope can only be seen and used inside the local scope where they are created. That's exactly why we need a return statement to pass out the results that are created inside a function.

a = 5 # a is a global variable
def func():
    print(a) #Let's see if a is accessible from inside a function.
func() #let's call this function

5

def func():
    thislocal = 6 #thislocal is a local variable that can only be seen inside its own function.
    print(thislocal)
func()
print(thislocal)

6

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-e010cab26fbc> in <module>
      3     print(thislocal)
      4 func()
----> 5 print(thislocal)

NameError: name 'thislocal' is not defined

As you can tell, Python raised a NameError on print(thislocal), saying that thislocal is not defined.

global_a = 5
def func():
    global_a = 4
    print(global_a) #Guess which global_a is used here. Do the two names refer to the same value?
func()
print(global_a)

4
5

What if we want to change a global value from within a function? Use the global keyword.

a = 5
def func():
    global a # From this point on, the name a inside this function refers to the global variable.
    a = 4
    print(a)
func()
print(a)

4
4

Rules:

A local variable can have the same name as a global variable.
When a local variable has the same name as a global variable, the local variable shadows the global one inside the function.
Use the global keyword if you want to assign to a global variable from within a local scope. (Reading a global variable does not require it.)
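
These rules can be checked with a small sketch. The names total, shadow_total, and increased_total are hypothetical. The second function also shows a common alternative to the global keyword: read the global value and return a new one.

```python
total = 10  # a global variable

def shadow_total():
    total = 99  # creates a new local variable; the global one is untouched
    return total

def increased_total(amount):
    return total + amount  # only reads the global, so no global keyword is needed

local_value = shadow_total()
new_total = increased_total(5)
print(local_value, total, new_total)
```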

Exercise: Scope¶

Exercise: Scope

Solution: Scope

Operations on List¶

Get the length¶

a = [1,2,3]
b = [10,20,30]

#get the length of a list
len(a)

3

Indexing and slicing¶

In Python, indexing starts from 0, which may be different from other languages (e.g., R and Matlab).

a = [1,2,3]
print(a[0])

1

What if the list is too long and I want to print out the last element? Use the index -1.

a = [1,2,3,4,6,7]
a[-1]

7

b = a[0:2] # starts at index 0 and ends at index 2 (excluded)
print(b)

[1, 2]

#or the start of a slice can be absent
a = [2,3,4,5]
a[:2] #equals a[0:2]

[2, 3]

#what if the start and the end of a slice are absent? 
a = [2,3,4,5]
a[:] # you got the whole list

[2, 3, 4, 5]

What if I want the last 3 elements?

a = [1,2,3,4,5,6,7]
a[-3:]

[5, 6, 7]

a = [1,2,3,4,5,6,7]
a[-3:-1] # the element on the position -1 is excluded. This is how slicing works in Python

[5, 6]

#finds the index of a certain element
a.index(3)

2
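
Slices also accept an optional third value, the step, which all the slices above leave at its default of 1. The following sketch shows a few step values and confirms that a slice is a new list, independent of the original. Variable names are illustrative.

```python
a = [1, 2, 3, 4, 5, 6, 7]

every_other = a[::2]     # start to end, taking every second element
middle = a[1:6:2]        # indexes 1, 3 and 5
reversed_copy = a[::-1]  # a negative step walks backwards

copy_of_a = a[:]         # a slice is a new list object
copy_of_a[0] = 100       # changing the copy does not change a
print(every_other, middle, reversed_copy, a[0])
```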

Since a list object is mutable, let's learn how to change certain elements by using indexing and slicing techniques.

Let's use indexing to change the element at a certain position.

a = ['a','b','c','d']
a[1] = 2
a

['a', 2, 'c', 'd']

Now let's use slicing to change a run of consecutive elements in a list. Note that if the replacement has the same size as the slice, the new elements simply take the positions of the old ones. Otherwise, the list shrinks or expands.

a = [2,3,4,5]
a[0:2] = ['t','x']
a

['t', 'x', 4, 5]

#list is shrunk
a = [2,3,4,5]
a[0:2] = ['t']
a

['t', 4, 5]

# list is expanded
a = [2,3,4,5]
a[0:2] = [20,30,40,50,60]
a

[20, 30, 40, 50, 60, 4, 5]

Can we delete certain elements from a list? Yes, we can! Use the del keyword.

a = [2,3,4,5,6]
del a[0]
a

[3, 4, 5, 6]

a = [2,3,4,5]
del a[:2]
a

[4, 5]

Concatenation¶

The plus sign can be used to combine two lists. The result is a new list in which the elements of the second list follow those of the first.

a = [2,3,4]
b = [5,6,7,8]
a + b

[2, 3, 4, 5, 6, 7, 8]

b + a

[5, 6, 7, 8, 2, 3, 4]
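
One point worth checking is that the plus sign does not modify either operand. A minimal sketch, with illustrative variable names:

```python
a = [2, 3, 4]
b = [5, 6, 7, 8]

combined = a + b  # a brand-new list is created
print(combined)
print(a, b)       # both original lists are unchanged
```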

Repetition¶

a = ['Tom',"Jerry"] 
b = a * 4
b

['Tom', 'Jerry', 'Tom', 'Jerry', 'Tom', 'Jerry', 'Tom', 'Jerry']

Methods for lists¶

Wait! What are methods? Methods are similar to functions, except that a method is associated with a class and its instances (objects). Remember how to call a built-in function? len(A)
When we call a method, it is preceded by a class or an object, e.g., A.method()

# append
a = [2,3,4] # this is a list object
a.append(5) # we are calling a list method named append
a

[2, 3, 4, 5]

# extend
a = [2,3,4,5]
b = [10,20,30]
a.extend(b)
a

[2, 3, 4, 5, 10, 20, 30]

a.append(b)
a

[2, 3, 4, 5, 10, 20, 30, [10, 20, 30]]

help(list.sort)

Help on method_descriptor:

sort(self, /, *, key=None, reverse=False)
    Stable sort *IN PLACE*.

#sort
a = [23,68,1,10]
a.sort()
a

[1, 10, 23, 68]

# sort method Vs. sorted built-in function
a = [23,68,1,10]
x = sorted(a)

x

[1, 10, 23, 68]

a # a is not changed, because the built-in function sorted() returns a new list instead of sorting in place

[23, 68, 1, 10]
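
The help text for list.sort shown earlier lists two keyword parameters, key and reverse. The sketch below shows both; the built-in sorted() accepts the same two parameters. The example data is made up for illustration.

```python
numbers = [23, 68, 1, 10]
words = ['pear', 'fig', 'banana']

descending = sorted(numbers, reverse=True)  # largest first, numbers unchanged
by_length = sorted(words, key=len)          # compare items by their length

return_value = words.sort(key=len, reverse=True)  # sorts words in place
print(descending, by_length, words, return_value)
```

Because the sort method works in place, it returns None. Writing words = words.sort() is therefore a common mistake.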

#count
a = [2,3,4,5, 4]
a.count(4)

2

List comprehension (Pythonic)¶

List comprehension provides a concise way to generate a list object. Note that a list comprehension should only be used when you actually want a list as the result.
[function_expression for x in S]
[function_expression for x in S if condition_expression]

Do you still remember the example we used when introducing the map function?

a = [1,2,3,4]
result = list(map(lambda x: x*10+1,a))
result

Let's implement the same computation using a list comprehension.

a = [1,2,3,4]
result = [x*10+1 for x in a]
result

[11, 21, 31, 41]

We can also put an if-clause inside a list comprehension to filter which elements are used.

#This time, we only apply the arithmetic operation to elements that are larger than 2, and keep the result. 
a = [1,2,3,4]
result = [x*10+1 for x in a if x>2]
result

[31, 41]
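
For comparison, the filtered comprehension can be written as an ordinary for loop. This sketch builds the same list both ways; result_loop and result_comprehension are illustrative names.

```python
a = [1, 2, 3, 4]

# The explicit loop: start with an empty list and append matching results.
result_loop = []
for x in a:
    if x > 2:
        result_loop.append(x * 10 + 1)

# The equivalent list comprehension in a single line.
result_comprehension = [x * 10 + 1 for x in a if x > 2]
print(result_loop, result_comprehension)
```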

Operations on Tuple¶

Unlike list objects, tuple objects are immutable, which means that once a tuple is initialized, it cannot be changed. In other words, it is read-only.

a = (2,3,4)#define a tuple object.

The operations mentioned in the list section that do not modify the object can also be used on tuple objects. The only difference is that it is illegal to change a tuple object. See the example below, in which I try to change an element of a tuple.

a[0] = 6

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-80-3c1153c8f9a9> in <module>
----> 1 a[0] = 6

TypeError: 'tuple' object does not support item assignment
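
Although a tuple cannot be changed, it can be read, unpacked, and used to build a new tuple. The sketch below also catches the TypeError so the program keeps running; the variable names are illustrative.

```python
a = (2, 3, 4)

first, second, third = a  # unpacking reads the elements into variables
b = (6,) + a[1:]          # builds a NEW tuple; a itself is unchanged

try:
    a[0] = 6              # item assignment is not allowed on a tuple
except TypeError as error:
    message = str(error)
print(a, b, first, message)
```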

Operations on Dictionary¶

#ways to create a dict
#Way-1, turn a list of lists into a dictionary by using the dict() built-in function. The first item of each inner list is used as the key and the second as the value.
phonelist = [['Kevin','2456'],['Mike','3456'],["Sue",'5567']]
dict_phones = dict(phonelist)
print(dict_phones)

{'Kevin': '2456', 'Mike': '3456', 'Sue': '5567'}

# way-2, uses curly bracket to generate it
dict_phone_list = {'Kevin': '2456', 'Mike': '3456', 'Sue': '5567'}
dict_phone_list

{'Kevin': '2456', 'Mike': '3456', 'Sue': '5567'}

Get the length of a dict object¶

dict_phone_list = {'Kevin': '2456', 'Mike': '3456', 'Sue': '5567'}
len(dict_phone_list)

3

Indexing¶

Elements of a dictionary are not accessed by position. (Since Python 3.7 a dictionary does preserve insertion order, but it still has no positional index.) Instead, we use keys to get/set values.

dict_1 = {'apple':4, 'grapefruit':10}
dict_1['apple']

4

What if we use a non-existent key? It will raise a KeyError, which terminates your program unless the error is handled. So, do indexing with caution.

dict_1 = {'apple':4, 'grapefruit':10}
dict_1['banana']
print(dict_1['apple'])

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-99-556835b7d3bb> in <module>
      1 dict_1 = {'apple':4, 'grapefruit':10}
----> 2 dict_1['banana']
      3 print(dict_1['apple'])

KeyError: 'banana'

Is there a way to detect whether a key is inside a dictionary object?

dict_1 = {'apple':4, 'grapefruit':10}
'apple' in dict_1.keys()

True

'apple' in dict_1

True
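
Besides testing membership first, the dictionary method get() looks up a key without raising a KeyError. A minimal sketch, with illustrative variable names:

```python
dict_1 = {'apple': 4, 'grapefruit': 10}

apples = dict_1.get('apple')               # the key exists, so its value is returned
bananas = dict_1.get('banana')             # missing key: None instead of a KeyError
bananas_default = dict_1.get('banana', 0)  # missing key: the supplied default
print(apples, bananas, bananas_default)
```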

help(list)

Help on class list in module builtins:

class list(object)
 |  list(iterable=(), /)
 |  
 |  Built-in mutable sequence.
 |  
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __lt__(self, value, /)
 |      Return self<value.
 |  
 |  __mul__(self, value, /)
 |      Return self*value.
 |  
 |  __ne__(self, value, /)
 |      Return self!=value.
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __reversed__(self, /)
 |      Return a reverse iterator over the list.
 |  
 |  __rmul__(self, value, /)
 |      Return value*self.
 |  
 |  __setitem__(self, key, value, /)
 |      Set self[key] to value.
 |  
 |  __sizeof__(self, /)
 |      Return the size of the list in memory, in bytes.
 |  
 |  append(self, object, /)
 |      Append object to the end of the list.
 |  
 |  clear(self, /)
 |      Remove all items from list.
 |  
 |  copy(self, /)
 |      Return a shallow copy of the list.
 |  
 |  count(self, value, /)
 |      Return number of occurrences of value.
 |  
 |  extend(self, iterable, /)
 |      Extend list by appending elements from the iterable.
 |  
 |  index(self, value, start=0, stop=9223372036854775807, /)
 |      Return first index of value.
 |      
 |      Raises ValueError if the value is not present.
 |  
 |  insert(self, index, object, /)
 |      Insert object before index.
 |  
 |  pop(self, index=-1, /)
 |      Remove and return item at index (default last).
 |      
 |      Raises IndexError if list is empty or index is out of range.
 |  
 |  remove(self, value, /)
 |      Remove first occurrence of value.
 |      
 |      Raises ValueError if the value is not present.
 |  
 |  reverse(self, /)
 |      Reverse *IN PLACE*.
 |  
 |  sort(self, /, *, key=None, reverse=False)
 |      Stable sort *IN PLACE*.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __hash__ = None

Get all the keys & values & items in a dict object¶

dict_phone_list.keys() # is this a method or function?

dict_keys(['Kevin', 'Mike', 'Sue'])

dict_phone_list.values()

dict_values(['2456', '3456', '5567'])

dict_phone_list.items()

dict_items([('Kevin', '2456'), ('Mike', '3456'), ('Sue', '5567')])
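
The object returned by items() is most often used in a for loop, where each (key, value) pair is unpacked into two variables. A short sketch using the same phone list; the variable lines is introduced only to collect the output.

```python
dict_phone_list = {'Kevin': '2456', 'Mike': '3456', 'Sue': '5567'}

lines = []
for name, phone in dict_phone_list.items():  # each item is a (key, value) tuple
    lines.append(f'{name}: {phone}')
print(lines)
```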

Remove all the elements of a dict object¶

dict_phone_list.clear()
len(dict_phone_list)

0

Update a dict object using another dict object¶

dict_1 = {'apple':2, 'orange':4, 'banana':10}
dict_2 = {'orange':12, 'grapefruit':1}
dict_1.update(dict_2)
dict_1

{'apple': 2, 'orange': 12, 'banana': 10, 'grapefruit': 1}

Operations on Set¶

Set objects are unordered collections of unique elements. Ordered set objects are available through third-party packages.
Like other collections, sets support x in set, len(set), and for x in set. Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.
All operations on sets can be found in the official Python documentation.

#way-1 to create a set object, using built-in function, set()
a = [3,4,5,4,5]
b = set(a)
b

{3, 4, 5}

#way-2 to create a set object, using curly brackets
a = {3,4,5,6,4}
a

{3, 4, 5, 6}

Get the length of a set object¶

a = {3,4,5,6,4} # 5 elements are listed at initialization, but the duplicate 4 is stored only once 
len(a)

4

membership operations¶

a = {3,4,5,6,4}
7 in a

False

a.__contains__({3,4})

False

help(set)

Union two sets¶

a = {2,3,4}
b = {3,4,5}
x = a | b 
x

{2, 3, 4, 5}

Get the common elements occurred in both sets¶

a = {2,3,4}
b = {3,4,5}
x = a & b 
x

{3, 4}

Get elements in either set but not in both¶

a = {2,3,4}
b = {3,4,5}
x = a ^ b
x

{2, 5}

Get the elements in a but not in b¶

a = {2,3,4}
b = {3,4,5}
x = a -b
x

{2}
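
Each of the four operators has an equivalent method. The sketch below compares them on the same two sets. Unlike the operators, the methods accept any iterable as their argument, not only another set.

```python
a = {2, 3, 4}
b = {3, 4, 5}

same_union = a.union(b) == (a | b)
same_intersection = a.intersection(b) == (a & b)
same_symmetric = a.symmetric_difference(b) == (a ^ b)
same_difference = a.difference(b) == (a - b)

union_with_list = a.union([5, 6])  # a list works as the argument of the method
print(same_union, same_intersection, same_symmetric, same_difference)
print(union_with_list)
```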

Add & discard an element from a set object¶

a = {3,4,5,6,7}
a.add(8)
a

{3, 4, 5, 6, 7, 8}

a.discard(8)
a

{3, 4, 5, 6, 7}
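
Sets also have a remove method, which differs from discard only when the element is missing. A minimal sketch; the flag remove_failed is introduced just to record what happened.

```python
a = {3, 4, 5}

a.discard(8)      # 8 is not in the set: discard silently does nothing

remove_failed = False
try:
    a.remove(8)   # remove raises a KeyError for a missing element
except KeyError:
    remove_failed = True
print(a, remove_failed)
```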

Clear all elements¶

a = {3,4,5,6,7,8}
a.clear()
a

set()

help(set)

Help on class set in module builtins:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |  
 |  Build an unordered collection of unique elements.
 |  
 |  Methods defined here:
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iand__(self, value, /)
 |      Return self&=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __ior__(self, value, /)
 |      Return self|=value.
 |  
 |  __isub__(self, value, /)
 |      Return self-=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __ixor__(self, value, /)
 |      Return self^=value.
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __lt__(self, value, /)
 |      Return self<value.
 |  
 |  __ne__(self, value, /)
 |      Return self!=value.
 |  
 |  __or__(self, value, /)
 |      Return self|value.
 |  
 |  __rand__(self, value, /)
 |      Return value&self.
 |  
 |  __reduce__(...)
 |      Return state information for pickling.
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __ror__(self, value, /)
 |      Return value|self.
 |  
 |  __rsub__(self, value, /)
 |      Return value-self.
 |  
 |  __rxor__(self, value, /)
 |      Return value^self.
 |  
 |  __sizeof__(...)
 |      S.__sizeof__() -> size of S in memory, in bytes
 |  
 |  __sub__(self, value, /)
 |      Return self-value.
 |  
 |  __xor__(self, value, /)
 |      Return self^value.
 |  
 |  add(...)
 |      Add an element to a set.
 |      
 |      This has no effect if the element is already present.
 |  
 |  clear(...)
 |      Remove all elements from this set.
 |  
 |  copy(...)
 |      Return a shallow copy of a set.
 |  
 |  difference(...)
 |      Return the difference of two or more sets as a new set.
 |      
 |      (i.e. all elements that are in this set but not the others.)
 |  
 |  difference_update(...)
 |      Remove all elements of another set from this set.
 |  
 |  discard(...)
 |      Remove an element from a set if it is a member.
 |      
 |      If the element is not a member, do nothing.
 |  
 |  intersection(...)
 |      Return the intersection of two sets as a new set.
 |      
 |      (i.e. all elements that are in both sets.)
 |  
 |  intersection_update(...)
 |      Update a set with the intersection of itself and another.
 |  
 |  isdisjoint(...)
 |      Return True if two sets have a null intersection.
 |  
 |  issubset(...)
 |      Report whether another set contains this set.
 |  
 |  issuperset(...)
 |      Report whether this set contains another set.
 |  
 |  pop(...)
 |      Remove and return an arbitrary set element.
 |      Raises KeyError if the set is empty.
 |  
 |  remove(...)
 |      Remove an element from a set; it must be a member.
 |      
 |      If the element is not a member, raise a KeyError.
 |  
 |  symmetric_difference(...)
 |      Return the symmetric difference of two sets as a new set.
 |      
 |      (i.e. all elements that are in exactly one of the sets.)
 |  
 |  symmetric_difference_update(...)
 |      Update a set with the symmetric difference of itself and another.
 |  
 |  union(...)
 |      Return the union of sets as a new set.
 |      
 |      (i.e. all elements that are in either set.)
 |  
 |  update(...)
 |      Update a set with the union of itself and others.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __hash__ = None

Modules and Packages¶

Imagine you are building a versatile file reader which you want to sell. To be impressive, this tool must be able to deal with a variety of file types, e.g., txt files, csv files, Excel files, Stata files, and so on.

Okay. Now think of the file reader we intend to develop as a package. The next question is which file types this package should be able to deal with. It's like dividing your big project into a bunch of separate, smaller sub-tasks. This kind of thinking is exactly the practice of modular programming, and each sub-task can be termed a module.

Each module should have a unique name, which in most cases is the name of a Python file. This also leads to another programming concept, the namespace. Let's put off explaining this concept for now.

A module is nothing special: it is simply a Python file where your code resides. That code typically defines a number of variables and functions, just like the code we've written so far.

It is highly likely that two modules/files contain functions/variables that coincidentally have the same name (e.g., different modules are usually developed by different people in a team). Here another problem arises: when people use our package, how can they know which function is from which module? The solution is to combine the function/variable name with the module name to get a unique identifier for each thing in our code.
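
The standard library itself contains such a name clash, which makes for a convenient illustration. Both the math module and the cmath module define a function called sqrt, and prefixing the call with the module name makes clear which one is meant.

```python
import math
import cmath

real_root = math.sqrt(4)       # the sqrt from math works on real numbers
complex_root = cmath.sqrt(-4)  # the sqrt from cmath works on complex numbers
print(real_root, complex_root)
```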

On many occasions, the terms package and module are used interchangeably. For example, you might have heard people say "I have a module that contains a bunch of sub-modules" or "I have a package that contains many modules".

How to import modules?¶

Still remember how to install packages through Anaconda Navigator or conda commands? Now let's learn how to bring modules into our script.

Import module_name¶

This is the simplest form for importing a module.

import math
math

<module 'math' (built-in)>

Then how could we use functions inside a module? Hint: namespace.fun_name

a = 4
math.sqrt(a)

2.0

From module_name import name(s)¶

What if we only need a few names from a module, or want to get rid of the namespace prefix?

from math import sqrt # now sqrt is added to the current namespace. Yes, your working file also has a namespace

sqrt(4)

2.0

Wonderful, isn't it? You can import several names at once, e.g., from math import sqrt, log2, log10
or you could use from module_name import *, which imports all the public variables/functions into your current namespace. (DANGEROUS!!!!)
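
The danger is that an imported name silently replaces any existing name that is spelled the same. The sketch below imports sqrt explicitly from two standard-library modules to make the effect visible; with import * the same replacement happens for every clashing name, without any of them appearing in your code.

```python
from math import sqrt
first = sqrt(4)         # this is math.sqrt and returns a float

from cmath import sqrt  # the name sqrt now refers to cmath.sqrt instead
second = sqrt(4)        # the same call now returns a complex number
print(first, second)
```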

Then what is the namespace of the file I am working on?

__name__ #double underscores before and after

'__main__'

From module_name import name as alt_name¶

Why do we need to give it an alternative name, or alias? Because we want to give it a unique name in case our namespace already has a function with the same name as the imported function.

from math import sqrt as my_sqrt

my_sqrt(4)

2.0

from math import sqrt as my_sqrt, log2 as my_log2
a = my_sqrt(4)
b = my_log2(4)
print(a,b)

2.0 2.0
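
An alias can be given to a whole module as well, which keeps the namespace prefix but makes it shorter. In this sketch the alias m is an arbitrary choice.

```python
import math as m  # m is an arbitrary alias for the math module

root = m.sqrt(4)
log_value = m.log2(4)
print(root, log_value)
```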