COP 3223H meeting -*- Outline -*-

* Loop Statements in Python

Reference for the material on while loops: chapters 9 and 10 of
Edward Cohen's "Programming in the 1990s" (Springer-Verlag, 1990).

Loops are a key way to process multiple data items
(or to generate multiple data items)

** while loops

------------------------------------------
         WHILE LOOPS IN PYTHON

Syntax:

<stmt> ::= while <expression> : <suite>
         | ...

Example

Semantics:

------------------------------------------
    ... num = int(input("positive integer? "))
        i = 0
        while i <= num:
            print(i)
            i = i + 1

        test the condition:
          if it is True, then execute the body (the <suite>), then start over
          if it is not true, then finish

        while P: S
        -->
        if P:
            S
            while P: S
        else:
            pass

------------------------------------------
           SYNTAX IN C

<stmt> ::= while (<expression>) <stmt>
         | { <stmt-list> }
         | ...

Example:

    int i = 0;
    while (i <= num) {
        printf("%d\n", i);
        i = i+1;
    }
------------------------------------------

*** writing while loops

In order to give a systematic explanation of how to write while loops,
it helps to think about the predicates that characterize states
in a while loop

------------------------------------------
     REASONING ABOUT WHILE LOOPS

What is true after successfully executing?

    assert I
    while P:
        S
        assert I

------------------------------------------
    ... not P is true (otherwise the loop would not have finished)
        also I is still true, provided each execution of S preserves it.
        so not P and I are true at the end:
            assert not P and I

        I is called a (loop) invariant

        analogy: backing a car out of a parking space:
           invariant: you haven't hit anything
           guard: you're not out of the parking spot
           initialization: (start the car in reverse, holding down the brake)
           body: move outwards (slowly), looking around

**** method 1: deleting a conjunct

In this method, we look at the desired postcondition, find 2 conjuncts,
and make one the invariant and the negation of the other the guard

-----------------------------------------
       EXAMPLE LOOP: INT DIVISION

Compute the quotient (quot) and remainder (rem)
of x divided by y, without using // and %

Precondition: x >= 0 and y > 0

Desired postcondition:
    0 <= rem and rem < y and quot*y + rem == x

Heuristic:

------------------------------------------
    ... 0. split the postcondition into two conjuncts,
        1. make the invariant one of them and
           make the loop's guard the negation of the other
           (choose to:
             a. make the invariant the one that involves the most variables, and
             b. make the guard the negation of the one that is hardest
                to make true initially)
        2. develop an initialization that makes the invariant true
        3. develop a body that makes progress towards falsifying the guard
           while preserving the invariant

------------------------------------------
          CODING THE EXAMPLE

def int_division(x, y):
    """Requires: x >= 0 and y > 0
       Ensures: result is a pair (quot, rem) such that
          0 <= rem and rem < y and quot*y+rem == x."""
    assert x >= 0 and y > 0

    assert 0 <= rem and rem < y and quot*y+rem == x
    return (quot, rem)
------------------------------------------
    ... start as above, with the goals

    Q: How can we achieve that last assertion?
       use a loop
    Q: Why?
       Because it's not clear how to achieve this all at once
       (with an expression, since we can't use // and %)

    So we split the postcondition into
       an invariant: 0 <= rem and quot*y+rem == x
       and a guard:  not (rem < y)

    Why this split?
       Because the conjunct we negate to get the guard, rem < y,
       is hard to achieve initially (we don't know y),
       and the rest involves both variables rem and quot.
       (If that doesn't work, we may have to reconsider.)

    so that makes the outline look like:

def int_division(x, y):
    """Requires: x >= 0 and y > 0
       Ensures: result is a pair (quot, rem) such that
          0 <= rem and rem < y and quot*y+rem == x."""
    assert x >= 0 and y > 0
    # initialization

    # the loop invariant follows
    assert 0 <= rem and quot*y+rem == x
    while rem >= y:
        # decrease rem while maintaining the invariant

        assert 0 <= rem and quot*y+rem == x
    # now the negation of the guard and the invariant hold
    assert 0 <= rem and rem < y and quot*y+rem == x
    return (quot, rem)

    This leaves us with two blocks of code to fill in...

------------------------------------------
           INITIALIZATION

    assert x >= 0 and y > 0

    assert 0 <= rem and quot*y+rem == x
------------------------------------------

    Q: How can we assign to quot and rem to make the assertion true?
       Starting with the complex one, we can note that
       if quot == 0 then we are left with rem == x
       and we know x >= 0, so that will make rem >= 0.
       So we initialize with:
           quot = 0
           rem = x

------------------------------------------
             LOOP BODY

    assert 0 <= rem and quot*y+rem == x
    while rem >= y:
        # decrease rem while maintaining the invariant

        assert 0 <= rem and quot*y+rem == x
------------------------------------------

    Q: Why should we decrease rem?
       to get us closer to finishing the loop
    Q: Why do we need to maintain the loop invariant?
       So that the overall postcondition will hold at the end
    Q: What can we change about quot and rem that will help?
       we can subtract y from rem, since rem >= y,
       and that will still keep 0 <= rem,
       and we can keep the rest of the invariant true
       if we increase quot by 1.
       So...
           rem = rem - y
           quot = quot + 1

    Show the completed function and test it.
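
    The pieces developed above can be assembled as follows. This is a
    minimal sketch, not necessarily the version shown in class: the
    assertions are kept as executable checks of the precondition, the
    invariant, and the postcondition, and the small test loop at the
    bottom (which compares against // and % only for checking) is an
    addition for illustration.

```python
def int_division(x, y):
    """Requires: x >= 0 and y > 0
       Ensures: result is a pair (quot, rem) such that
          0 <= rem and rem < y and quot*y+rem == x."""
    assert x >= 0 and y > 0
    # initialization: makes the invariant true
    quot = 0
    rem = x
    # the loop invariant
    assert 0 <= rem and quot * y + rem == x
    while rem >= y:
        # decrease rem while maintaining the invariant
        rem = rem - y
        quot = quot + 1
        assert 0 <= rem and quot * y + rem == x
    # now the negation of the guard and the invariant hold
    assert 0 <= rem and rem < y and quot * y + rem == x
    return (quot, rem)


if __name__ == "__main__":
    # testing only: // and % are used here as an oracle, not in the function
    for x in range(0, 20):
        for y in range(1, 6):
            assert int_division(x, y) == (x // y, x % y)
```

    Since rem decreases by y > 0 on each iteration and the invariant
    keeps 0 <= rem, the guard rem >= y must eventually become false,
    so the loop terminates.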

**** method 2: introduce new (local) variables

This example also involves introducing a new variable

------------------------------------------
        LINEAR SEARCH EXAMPLE

Write a function, findname(schools, name),
that, given a (Python) list of strings, schools,
returns the index, i, of the first element
in schools such that schools[i] == name,
or if no such i exists, then return -1.

type: (list(string), string) -> int

Ensures:

------------------------------------------
    ... informally
           either schools[result] == name
           or result == -1 and for all i:: schools[i] != name

    Q: How do we say that the result is the first index?
       use a quantifier, like:
         (for all i : 0 <= i < len(schools) :
                      i < result ==> schools[i] != name)

    Q: How do we make that disjunction into a conjunction?
       We can use a conjunction of implications
         (0 <= result < len(schools) ==> schools[result] == name)
         and (result >= len(schools) ==> result == -1)

    So we get something like:
         result < len(schools)
         and (for all i : 0 <= i < result : schools[i] != name)
         and (0 <= result ==> schools[result] == name)
         and (result == -1 ==>
              (for all i : 0 <= i < len(schools) : schools[i] != name))

    Which can be translated into Python using implies()
    and a helper allIntsIn() (see the file allIntsIn.py) as follows:

         result < len(schools)
         and allIntsIn(range(0,result), (lambda i: schools[i] != name))
         and implies(0 <= result, schools[result] == name)
         and implies(result == -1,
                     allIntsIn(range(0,len(schools)),
                               (lambda i: schools[i] != name)))

------------------------------------------
       GOAL-DIRECTED DEVELOPMENT

Overall outline will be:

def findname(schools, name):
    """Ensures: result < len(schools)
        and allIntsIn(range(0,result),
                      (lambda i: implies(i < result, schools[i] != name)))
        and implies(0 <= result, schools[result] == name)
        and implies(result == -1,
                    allIntsIn(range(0,len(schools)),
                              (lambda i: schools[i] != name)))"""
    if ___________________:
        return -1
    else:
        return ______
------------------------------------------

    Q: What has to be true when
       returning -1?
          allIntsIn(range(0,len(schools)), (lambda i: schools[i] != name))
       so we can add that as an assertion...
    Q: What has to be true when returning some other number?
          schools[result] == name
       where result is the number returned

    We could use a variable named result (or similar, like res)
    but traditionally this index variable is called i
    The idea is to progressively define this variable's value

    Another way to see this is that we introduce a new variable
    in order to get a conjunct in the postcondition

------------------------------------------
          NAMING THE INDEX i

def findname(schools, name):
    """Ensures: result < len(schools)
        and allIntsIn(range(0,result),
                      (lambda i: implies(i < result, schools[i] != name)))
        and implies(0 <= result, schools[result] == name)
        and implies(result == -1,
                    allIntsIn(range(0,len(schools)),
                              (lambda i: schools[i] != name)))"""
    i = _______

    if i == len(schools):
        assert i == len(schools)
        assert allIntsIn(range(0, i), (lambda i: schools[i] != name))
        return -1
    else:
        assert i < len(schools) and schools[i] == name
        assert allIntsIn(range(0, i), (lambda i: schools[i] != name))
        return i
------------------------------------------

    Q: What must be true before the if for that to work?
          assert (i >= len(schools) or schools[i] == name) \
                 and allIntsIn(range(0, i), (lambda i: schools[i] != name)) \
                 and i <= len(schools)
       write that above the if
       (the parentheses around the disjunction are needed,
        since "and" binds more tightly than "or" in Python)

    Q: How can we achieve that condition?
       use a loop, since we must look at all the elements in schools

    So...
    Q: Which part should be the invariant?
          allIntsIn(range(0, i), (lambda i: schools[i] != name))
          and i <= len(schools)
       Why? because it's more complex
            and because it involves all the variables
    Q: Which conjunct should be the guard?
          not (i >= len(schools) or schools[i] == name)
       What is that if we use De Morgan's law?
          i < len(schools) and schools[i] != name

    So we arrive at:

------------------------------------------
def findname(schools, name):
    """Ensures: result < len(schools)
        and allIntsIn(range(0,result),
                      (lambda i: implies(i < result, schools[i] != name)))
        and implies(0 <= result, schools[result] == name)
        and implies(result == -1,
                    allIntsIn(range(0,len(schools)),
                              (lambda i: schools[i] != name)))"""
    i = __________
    assert allIntsIn(range(0,i), (lambda i: schools[i] != name)) \
           and i <= len(schools)
    while i < len(schools) and schools[i] != name:
        # change i towards len(schools) while preserving invariant

        assert allIntsIn(range(0,i), (lambda i: schools[i] != name)) \
               and i <= len(schools)
    assert (i >= len(schools) or schools[i] == name) \
           and allIntsIn(range(0,i), (lambda i: schools[i] != name)) \
           and i <= len(schools)
    if i >= len(schools):
        assert i == len(schools)
        assert allIntsIn(range(0,i), (lambda i: schools[i] != name))
        return -1
    else:
        assert allIntsIn(range(0, i), (lambda i: schools[i] != name))
        assert schools[i] == name
        return i
------------------------------------------

    Q: What can we assign to i that makes the invariant true?
       0, as it makes the range empty, so the allIntsIn is trivially true,
       and 0 <= len(schools), since a length is at least 0.

    Q: How can we change i towards len(schools) when the guard is true,
       while preserving the invariant?
       i = i+1 works, since the guard tells us that
       i < len(schools) and schools[i] != name

** break statements

------------------------------------------
           BREAK STATEMENT

Syntax:

<stmt> ::= break
         | ...

Static semantics:
  A break statement must occur inside a while or for loop

Example:

def multPyList(lst):
    """type: list(number) -> number
       Ensures: result is the product of the numbers in lst."""
    prod = 1
    i = 0
    while i < len(lst):
        prod = prod * lst[i]
        if lst[i] == 0:
            break
        else:
            i = i + 1
    return prod

Semantics:

------------------------------------------
    (technically, a break also can't occur inside a function or class
     definition that is nested within the loop)

    ...
        finish (jump out of) the closest surrounding while (or for) loop
        enclosing the break statement

    Q: C also has a break statement; what would its syntax look like?
       The same, but with a semicolon at the end.

------------------------------------------
    REASONING ABOUT LOOPS WITH BREAK

What is true after executing a break?

    assert I
    while P:
        S1
        if P2:
            assert P2
            S2
            assert Q
            break
        S3
        assert I
    assert

------------------------------------------
    ... break doesn't change the state.
        We can reach the end of the loop either by P becoming false
        or by P2 being true (and then executing S2 and the break)
        so it's:
            assert not P or Q

    Q: What if P is the constant True?
       Then what is true is Q (since not True or Q is Q)
    Q: What if S2 is not present?
       Then it's not P or P2 (with no S2, Q is just what P2 implies)

    See the annotated code in multPyList.py

** for loops in Python

------------------------------------------
              FOR LOOPS

Syntax:

<stmt> ::= for <identifier> in <expression> : <suite>
         | ...

Example:

Semantics:

------------------------------------------
    ... num = int(input("positive integer? "))
        for i in range(0,num):
            print(i)

    ... 1. evaluate the expression (to obtain a sequence)
        2. assign each element of the sequence, in order, to the identifier
           and with that binding, execute the suite

        for X in R: S
        -->
        _iter = R.__iter__()  # assuming _iter is a fresh variable name!
        while True:
            try:
                X = _iter.__next__()
            except StopIteration:
                break
            S   # outside the try, so a StopIteration raised by S
                # is not mistaken for the end of the sequence

    Notes:
      - This semantics relies on the ability of an assignment of X
        in a nested statement (like the try) to persist,
        since Python variables are scoped to the enclosing function,
        not to a nested statement
      - If R's value is empty, then X is never assigned,
        as the exception interrupts the assignment
      - At the end, X has the value last assigned to it
        (due to the scoping described above)

    When R denotes a sequence, then R's value is finite,
    and so the loop cannot be infinite!
    (However, technically, for loops can work on iterators,
     which can be infinite, so when R is an iterator,
     the for loop can be infinite.)
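
    The translation of a for loop into a while loop can be tried out
    directly. The following is a small sketch under stated assumptions:
    the function names forLoopSum, desugaredSum, and lastBound are
    hypothetical (they are not from the course files), and the built-in
    functions iter() and next() are used in place of calling __iter__()
    and __next__() by hand. The body of the loop is kept outside the
    try, so that only running out of elements ends the loop.

```python
def forLoopSum(seq):
    """Sum the elements of seq with an ordinary for loop."""
    total = 0
    for x in seq:
        total += x
    return total


def desugaredSum(seq):
    """The same computation, with the for loop written out
       as a while loop over an explicit iterator."""
    total = 0
    _iter = iter(seq)            # same as seq.__iter__()
    while True:
        try:
            x = next(_iter)      # same as _iter.__next__()
        except StopIteration:
            break                # no elements left: leave the loop
        total += x               # the body of the original for loop
    return total


def lastBound(seq):
    """Return the value the loop variable has after the loop.
       If seq is empty, x was never assigned, so the return
       raises UnboundLocalError."""
    for x in seq:
        pass
    return x
```

    For example, desugaredSum([1, 2, 3]) and forLoopSum([1, 2, 3])
    both return 6, and lastBound([4, 5, 6]) returns 6, which
    illustrates the note that the loop variable keeps the value
    last assigned to it.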

------------------------------------------
           SYNTAX IN C

<stmt> ::= for (<init>; <expression>; <expression>) <stmt>

Example:

    int num;
    num = myinput();
    for (int i = 0; i < num; i++) {
        printf("%d\n", i);
    }
------------------------------------------

*** reasoning about for loops

based on Shaw, Wulf, London's paper
"Abstraction and verification in Alphard:
defining and specifying iteration and generators"
(CACM, Vol 20, Num 8, Aug. 1977)

------------------------------------------
   REASONING ABOUT FOR LOOPS OVER LISTS

Let R be a sequence expression (of type list)
      which has precondition R_pre
Let _r, _already, and _rest be fresh identifiers
Let I be a function of type: list -> bool

    assert R_pre
    _r = R
    _already = []
    _rest = _r
    assert I(_already)
    for X in _r:
        assert _r == _already + [X] + _rest[1:]
        S
        _already = _already + [X]
        _rest = _rest[1:]
        assert I(_already)
    assert I(_r) and _already == _r
------------------------------------------

    this reasoning schema can also be generalized to other sequences
    (and iterators), but we don't have convenient ways to express []
    for more general sequences

    We would only use the _r, _already, and _rest variables
    when checking the code (they are "ghost" variables).

------------------------------------------
              EXAMPLE

Function product(lst) of type:
    list(number) -> number
that returns the product of all the elements of lst.

------------------------------------------
    ...
def product(lst):
    """type: list(number) -> number
       Ensures: result is the product of all the numbers in lst."""

    # result is the product of all the numbers in lst
    return _______

    Q: What should we return?
       introduce a variable (an accumulator), prod

def product(lst):
    """type: list(number) -> number
       Ensures: result is the product of all the numbers in lst."""
    prod = ________
    _________
    # result is the product of all the numbers in lst
    return prod

    Q: What should that be initialized to?
       1, as that is what should be returned when lst is empty

    Q: How can we make the last condition true?
       use a loop

def product(lst):
    """type: list(number) -> number
       Ensures: result is the product of all the numbers in lst."""
    prod = 1
    # already = []
    # prod is the product of the numbers in already
    for elem in lst:
        prod *= elem
        # already = already + [elem]
        # prod is the product of the numbers in already
    # prod is the product of all the numbers in lst and lst == already
    return prod

    See product.py for an annotated version.

** continue statements

------------------------------------------
         CONTINUE STATEMENT

Syntax:

<stmt> ::= continue
         | ...

Static semantics:
  A continue statement must occur inside a while or for loop,
  but not inside a finally clause inside that loop
  (Python 3.8 and later lift the restriction on finally clauses)

Example:

Semantics:

------------------------------------------
    (technically, a continue also can't occur inside a function or class
     definition that is nested within the loop)

    ... skip the rest of the body of the closest surrounding
        while (or for) loop enclosing the continue statement,
        and start that loop's next iteration

    ...
def sumReciprocals(lst):
    """type: list(number) -> number
       Ensures: result is the sum of the reciprocals
                of the non-zero elements of lst."""
    tot = 0
    for elem in lst:
        if elem == 0:
            continue
        tot += 1/elem
    return tot

    Q: What would you guess the continue statement's syntax is in C?
       Same as in Python, but with a semicolon at the end

------------------------------------------
  REASONING ABOUT LOOPS WITH CONTINUE

What is true after executing a continue?

    _r = R
    _already = []
    _rest = _r
    assert I(_already)
    for X in _r:
        assert _r == _already + [X] + _rest[1:]
        S1
        if P2:
            assert P2
            S2
            _already = _already + [X]
            _rest = _rest[1:]
            assert I(_already)
            continue
        S3
        _already = _already + [X]
        _rest = _rest[1:]
        assert I(_already)
    assert I(_already) and _already == _r
------------------------------------------
    ... continue doesn't change the state.
        it just starts the next iteration,
        skipping over the rest of the body of the loop
        (so, unlike with break, every element of _r is still processed)

    Q: So, what has to be true when the code continues?
       The loop invariant for the element being processed

    See the annotated code in sumReciprocals.py
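
    The schema for continue can be checked on sumReciprocals by making
    the ghost variables real. This is a sketch for illustration only;
    it may differ from the annotations in sumReciprocals.py. The names
    sumReciprocalsChecked and inv are hypothetical, and inv takes the
    running total as a second argument (an adaptation, since the schema
    above writes the invariant as a function of the list alone).

```python
def sumReciprocalsChecked(lst):
    """type: list(number) -> number
       Ensures: result is the sum of the reciprocals
                of the non-zero elements of lst."""

    def inv(already, tot):
        # invariant: tot is the sum of the reciprocals of the
        # non-zero elements processed so far (up to rounding)
        expected = sum(1 / e for e in already if e != 0)
        return abs(tot - expected) < 1e-9

    tot = 0
    # ghost variables, used only for checking
    _r = list(lst)
    _already = []
    _rest = _r
    assert inv(_already, tot)
    for elem in _r:
        assert _r == _already + [elem] + _rest[1:]
        if elem == 0:
            # the invariant must be re-established before continuing
            _already = _already + [elem]
            _rest = _rest[1:]
            assert inv(_already, tot)
            continue
        tot += 1 / elem
        _already = _already + [elem]
        _rest = _rest[1:]
        assert inv(_already, tot)
    # every element was processed, even those skipped by continue
    assert inv(_already, tot) and _already == _r
    return tot
```

    Note that the ghost updates appear twice, once before the continue
    and once at the end of the body, because both paths lead to the
    next iteration and so both must re-establish the invariant.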