In [14]:
%%html
<style>
.example1 {
 height: 50px;	
 overflow: hidden;
 position: relative;
}
.example1 h3 {
 font-size: 3em;
 color: red;
 position: absolute;
 width: 100%;
 height: 100%;
 margin: 0;
 line-height: 50px;
 text-align: center;
 /* Starting position */
 -moz-transform:translateX(100%);
 -webkit-transform:translateX(100%);	
 transform:translateX(100%);
 /* Apply animation to this element */	
 -moz-animation: example1 15s linear infinite;
 -webkit-animation: example1 15s linear infinite;
 animation: example1 15s linear infinite;
}
/* Move it (define the animation) */
@-moz-keyframes example1 {
 0%   { -moz-transform: translateX(100%); }
 100% { -moz-transform: translateX(-100%); }
}
@-webkit-keyframes example1 {
 0%   { -webkit-transform: translateX(100%); }
 100% { -webkit-transform: translateX(-100%); }
}
@keyframes example1 {
 0%   { 
 -moz-transform: translateX(100%); /* Firefox bug fix */
 -webkit-transform: translateX(100%); /* Firefox bug fix */
 transform: translateX(100%); 		
 }
 100% { 
 -moz-transform: translateX(-100%); /* Firefox bug fix */
 -webkit-transform: translateX(-100%); /* Firefox bug fix */
 transform: translateX(-100%); 
 }
}
</style>

Introduction to Computing for Engineers and Computer Scientists

Chapters 6, 7: Files, Exceptions, Lists

Questions, Discussion

From Class

Piazza

Introduction to Exceptions (Continued)

Overview

From Punch and Enbody, chpt. 6.

  • Most modern languages provide methods to deal with ‘exceptional’ situations
  • Exception handling lets the programmer keep the program from stopping (dying) on the user without warning or explanation. The programmer codes the program to:
    • Retry the failed operation.
    • Terminate gracefully and with an understandable explanation instead of a cryptic system error.
  • "Again, this is not about fundamental CS, but about doing a better job as a programmer."
    • I strongly disagree.
    • This is fundamental. Software and programs are everywhere and critical.
    • Robust, reliable programs can be the difference between life and death.
    • This is what makes us engineers.
Weinberg's Second Law
  • What constitutes an exception?
    • Programs run into many common ones:
      • User enters invalid data.
      • Variable is wrong type.
      • for loop range index outside a list.
      • File not found on open.
      • ... ...
    • Developers can define their own exceptions to handle failures or errors in their application logic or input data.
  • Errors and exceptions have specific names. You have seen many when your programs fail to execute. You have posted many of the errors on Piazza.
Error Names (Punch and Enbody)
  • Programmers can extend with application defined exceptions.
  • Fully understanding the hierarchy concept and programmer defined extensions requires understanding python classes (Punch and Enbody, chapters 11,12). We will cover in future lectures.
(Partial) Exception Hierarchy
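
The hierarchy idea can be previewed with built-in exceptions alone, before classes are covered. The following is a minimal sketch: an except suite that names a parent type also catches its children, which is why `except Exception` behaves as a catch-all. The helper name `classify` is made up for this illustration.

```python
# Built-in exceptions form a tree; a child "is a" kind of its parent.
print(issubclass(FileNotFoundError, OSError))          # True
print(issubclass(ZeroDivisionError, ArithmeticError))  # True
print(issubclass(ValueError, Exception))               # True


def classify(action):
    """Run action() and report which except suite handled the failure."""
    try:
        action()
        return "no exception"
    except ArithmeticError:          # parent of ZeroDivisionError
        return "arithmetic problem"
    except OSError:                  # parent of FileNotFoundError
        return "operating system problem"
    except Exception:                # parent of almost everything else
        return "something else"


print(classify(lambda: 1 / 0))                            # arithmetic problem
print(classify(lambda: open("/no/such/file/here.txt")))   # operating system problem
print(classify(lambda: int("b")))                         # something else
```

Python checks the except suites from top to bottom, so the more specific types should come before the more general ones.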

Exceptions and Exception Handling

  • Basic idea:
    • Bracket and keep watch on a particular section (block, suite) of code.
    • If we get an exception, raise/throw that exception (let it be known)
    • Look for a catcher that can handle that kind of exception
    • If found, handle it, otherwise let Python handle it (which usually halts the program)
  • For example,
    • We have assumed that the input we receive is correct (from a file, from the user).
    • This is almost never true. There is always the chance that the input could be wrong.
    • Our programs should be able to handle this.
  • "Writing Secure Code", by Howard and LeBlanc, ISBN 978-0735617223.
    • “All input is evil until proven otherwise.”
    • Many security holes in programs are based on assumptions programmers make about input.
    • Secure programs protect themselves from evil input.

< DFF-Non-Textbook-Digression >

  • For example,
    • "Fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program."
    • "Fuzzing is used mostly as an automated technique to expose vulnerabilities in security-critical programs that might be exploited with malicious intent."
    • Both "good guys" and "bad guys" use fuzzing.
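
A toy illustration of the idea, not a real fuzzing tool: feed random strings to a small parser and record any exception other than the one the parser documents. The names `parse_radius` and `fuzz` are hypothetical and exist only for this sketch.

```python
import random
import string


def parse_radius(text):
    """Convert text to a non-negative integer radius, or raise ValueError."""
    value = int(text)            # raises ValueError for non-integer text
    if value < 0:
        raise ValueError("radius must not be negative")
    return value


def fuzz(target, runs=1000, seed=0):
    """Call target with random printable strings; collect unexpected failures."""
    rng = random.Random(seed)    # fixed seed so a failure can be reproduced
    surprises = []
    for _ in range(runs):
        length = rng.randint(0, 8)
        text = "".join(rng.choice(string.printable) for _ in range(length))
        try:
            target(text)
        except ValueError:
            pass                 # the documented way to reject bad input
        except Exception as e:   # anything else is a bug worth recording
            surprises.append((text, e))
    return surprises


print("Unexpected failures:", len(fuzz(parse_radius)))
```

Real fuzzers are far smarter about choosing inputs, but the loop has the same shape: generate input, run the program, and watch for failures nobody planned for.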

< /DFF-Non-Textbook-Digression >

In [1]:
%%HTML
<!-- HTML -->	
<div style="height:85px; color:red" class="example1">
<h3>Rule 7: All input is evil until proven otherwise.</h3>
</div>

Rule 7: All input is evil until proven otherwise.

Implementing (Handling) Exceptions

Exceptions Form 1
  • try suite
    • The try suite contains code that we want to monitor for errors during its execution.
    • If an error occurs anywhere in that try suite, Python looks for a handler that can deal with the error.
    • If no special handler exists, Python handles it, meaning the program halts with an error message, as we have seen so many times.
  • except suite
    • An except suite (perhaps multiple except suites) is associated with a try suite.
    • Each except suite names the type of exception it is monitoring for.
    • If the error that occurs in the try suite matches the type of exception, then that except suite is activated.
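
A minimal sketch of this form, with one try suite and two except suites. The function name `describe` is invented for the illustration.

```python
def describe(text):
    """Return a message describing the result of computing 100 / int(text)."""
    try:
        number = int(text)                                   # may raise ValueError
        return "100 / " + text + " = " + str(100 / number)   # may raise ZeroDivisionError
    except ValueError:
        return "'" + text + "' is not an integer."
    except ZeroDivisionError:
        return "Cannot divide by zero."


print(describe("4"))     # 100 / 4 = 25.0
print(describe("four"))  # 'four' is not an integer.
print(describe("0"))     # Cannot divide by zero.
```

Only the except suite whose type matches the error is activated; the other one is skipped.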

Example 1: Extending Our First Program

In [4]:
# This program allows a user to input the radius on a circle.
# We want to teach the formula to young children. So, we only
# allow the radius to be an integer.

# Almost every program you write will use "programs" others have written.
# Your successful programs will become programs that others use.
# Any non-trivial program requires a team. The team members assemble
# the solution from individual subcomponents they build.
# The subcomponents and reusable parts are called modules.

import math     # We just imported our first module.

# Programs, like mathematical functions, are only useful if they
# operate on many user provided inputs. To start, we will get the input from
# the "command line."

done = False
error_count = 0
self_destruct = False

while not done:
    try:
        # If the self-destruct sequence has been initiated, trigger an exception
        # that does not have its own except clause. NOTE: This is a joke.
        # No one ever programs this way. The code below raises ZeroDivisionError.
        if self_destruct:
            x = 1 / 0
            
        # Print a prompt asking for the radius.
        # Set a variable to the input value.
        radius_str = input('Enter the radius of a circle: ')

        # We are going to do 'math' on the input. So, we should
        # convert it to an integer.
        radius_int = int(radius_str)

        # The circumference is 2 times pi times the radius.
        # The area is pi * r squared.
        circumference = 2 * math.pi * radius_int
        area = math.pi * (radius_int ** 2)
        
        # Python conventions do not like lines that are too long.
        # \ means that we will continue the command on the next line.
        print ("The circumference is:",circumference,  \
              ", and the area is:",area)
        done = True
        
    # Handle the case where converting to an integer failed. This is an example of
    # "Duck Typing." I try to treat the value as an int, and catch an exception
    # if it fails.
    except ValueError as e1:
        error_count = error_count + 1
        if (error_count == 1):
            print("\nInvalid type. You need to enter an integer.")  
        elif error_count == 2:
            print("\nInvalid type. What part of integer did you not understand?")
        elif error_count == 3:
            print("\nSeriously? I cannot work like this. I am a professional.")
            print("Next time you do this I divide by 0!")
        else:
            print("\n Self-destruct sequence started.")
            self_destruct = True
            
    # Again, this is a joke. No one programs this way.
    except Exception as e:
        print("\nException = ", e)
        print("Self-destruct complete.")
        break
        
    finally:
        print('\n Finally always called.')
Enter the radius of a circle: b

Invalid type. You need to enter an integer.

 Finally always called.
Enter the radius of a circle: c

Invalid type. What part of integer did you not understand?

 Finally always called.
Enter the radius of a circle: d

Seriously? I cannot work like this. I am a professional.
Next time you do this I divide by 0!

 Finally always called.
Enter the radius of a circle: e

 Self-destruct sequence started.

 Finally always called.

Exception =  division by zero
Self-destruct complete.

 Finally always called.
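
The run above shows the finally suite executing on every pass, even on the pass that ends with `break`. A small sketch that records the order in which the suites run. The `else` suite is not used in Example 1 and is included here as an extra; the function name `trace` is hypothetical.

```python
def trace(text):
    """Return the list of suites that ran while converting text to an int."""
    steps = []
    for attempt in range(1):         # a loop so that 'break' is legal below
        try:
            steps.append("try")
            int(text)
        except ValueError:
            steps.append("except")
            break                    # leave the loop ... but finally still runs
        else:
            steps.append("else")     # runs only when the try suite succeeded
        finally:
            steps.append("finally")  # runs on every path out of the try statement
    return steps


print(trace("7"))   # ['try', 'else', 'finally']
print(trace("b"))   # ['try', 'except', 'finally']
```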

Example 2: Generic Input Function

In [5]:
# Validates integer input. Reusable in many places in a program or in other programs.
# prompt is the message for soliciting user input.
# lbound is the lower bound.
# ubound is the upper bound.
# patience is the number of retries allowed after the first attempt.
def safe_get_int(prompt, lbound, ubound, patience):
    done = False
    result = None
    temp = None
    tries = 0

    # Loop until successful input or have exhausted all the tries.
    while (not done) and (tries <= patience):
        # The try ... except implements toleration for non-integer inputs.
        try:
            tries += 1
            temp = input(prompt + ":")
            
            # This is the statement that would trigger the exception if not an int.
            temp = int(temp)
            
            # The value is an int -> No exception thrown. Is the value in the range.
            if (temp < lbound) or (temp > ubound):
                print("Valid range is " + str(lbound) + " to " + str(ubound) + " Try again.")
            else:
                done = True
                result = temp
                
        except ValueError as ve:
            # Not an integer. Will try again.
            # Will print a specific error message.
            print("Input must be an integer.")

        except Exception as e:
            # Not sure what happened but will try again.
            print("Got an unexpected exception. Trying again.")


    # Did the function fail in getting a valid input?
    # Will re-raise value error. Typically, we would raise a programmer defined exception.
    if not done:
        # Not my finest error message.
        print("Prepare to die fool!")
        raise ValueError("Die fool!")

    return result


# Prompts for a string input. The inputs are:
# - prompt message
# - A list of valid inputs.
# - The number of bad inputs to tolerate.
def safe_get_string(prompt, valid_values, patience):
    done = False
    result = None
    temp = None
    tries = 0

    # Loop until valid input or too many failed attempts.
    while (not done) and (tries <= patience):
        try:
            tries += 1
            temp = input(prompt + ":")
            if temp not in valid_values:
                print("Valid values are ", valid_values)
            else:
                done = True
                result = temp

        # Should narrow this exception to something more specific
        except Exception as e:
            print("Got exception. Trying again.")

    # Raise input value failure.
    if not done:
        print("Prepare to die fool!")
        raise ValueError("Die you string entering fool!")

    return result
In [6]:
x = safe_get_int("Please enter an integer. I am not going to tell you the range", 1, 252, 3)
print("Input was = ", x)
Please enter an integer. I am not going to tell you the range:-9
Valid range is 1 to 252 Try again.
Please enter an integer. I am not going to tell you the range:c
Input must be an integer.
Please enter an integer. I am not going to tell you the range:-3
Valid range is 1 to 252 Try again.
Please enter an integer. I am not going to tell you the range:b
Input must be an integer.
Prepare to die fool!
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-df42fdc56809> in <module>()
----> 1 x = safe_get_int("Please enter an integer. I am not going to tell you the range", 1, 252, 3)
      2 print("Input was = ", x)

<ipython-input-5-4f4149bbb784> in safe_get_int(prompt, lbound, ubound, patience)
     42         # Not my finest error message.
     43         print("Prepare to die fool!")
---> 44         raise ValueError("Die fool!")
     45 
     46     return result

ValueError: Die fool!
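
The comment in the function notes that a programmer-defined exception would be the more typical choice. A hedged sketch of that variation follows. `TooManyBadInputs` and `get_int` are hypothetical names, and the `read` parameter is an addition that lets the function be exercised without a person at the keyboard.

```python
class TooManyBadInputs(Exception):
    """Hypothetical programmer-defined exception: the user ran out of tries."""


def get_int(read, prompt, lbound, ubound, max_tries):
    """Ask for an integer in [lbound, ubound] at most max_tries times.

    read is any function that takes a prompt and returns a string; pass the
    built-in input for real use, or a stand-in when testing.
    """
    for _ in range(max_tries):
        text = read(prompt + ":")
        try:
            value = int(text)
        except ValueError:
            print("Input must be an integer.")
            continue
        if lbound <= value <= ubound:
            return value
        print("Valid range is", lbound, "to", ubound)
    raise TooManyBadInputs("No valid input after " + str(max_tries) + " tries")


# Simulate a user who types "c", then "-9", then "42".
answers = iter(["c", "-9", "42"])
print(get_int(lambda prompt: next(answers), "Enter an integer", 1, 252, 3))  # 42
```

A caller can now write `except TooManyBadInputs:` and be sure it is reacting to this specific failure rather than to some unrelated ValueError.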

What Does This Have to Do With Files?

  • File not found is a very common error.
  • Bad data inside the file is also a common problem.
In [ ]:
done = False

while not done:
    try:
        print("Whatever you do, do not choose file L5_collation.jpeg.")
        fn = input("Please enter a file name?")
        print("You entered ...", fn)
        f = open(fn,"r")
        print("\nReading file.")
        for s in f:
            print("A line = ",s)
        print("Done")
        done = True
    except FileNotFoundError:
        print("File not found. I will be patient.")
    except UnicodeDecodeError as ue:
        print("Seriously? I mean really?")
        print("Is not a text file dude. Error = ", ue)
    except Exception as e:
        print("Something else happened. e= ",e)
        break
    
Whatever you do, do not choose file L5_collation.jpeg.
Please enter a file name?L5_collation.jpeg
You entered ... L5_collation.jpeg

Reading file.
Seriously? I mean really?
Is not a text file dude. Error =  'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
Whatever you do, do not choose file L5_collation.jpeg.
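
The same two failures can be reproduced without hunting for a JPEG. This sketch writes two small temporary files and reads them back; the helper name `read_text_lines` and the file names are made up. It also uses a `with` statement, which the cell above does not, so that the file is closed even when reading fails.

```python
import os
import tempfile


def read_text_lines(path):
    """Return the lines of a text file, or None if it cannot be read as text."""
    try:
        # 'with' closes the file even if reading raises an exception.
        with open(path, "r", encoding="utf-8") as f:
            return [line.rstrip("\n") for line in f]
    except FileNotFoundError:
        print("File not found:", path)
    except UnicodeDecodeError as ue:
        print("Not a UTF-8 text file:", ue.reason)
    return None


# Build two small files to exercise the handlers.
folder = tempfile.mkdtemp()
good = os.path.join(folder, "good.txt")
bad = os.path.join(folder, "bad.jpeg")
with open(good, "w", encoding="utf-8") as f:
    f.write("line one\nline two\n")
with open(bad, "wb") as f:
    f.write(b"\xff\xd8\xff\xe0")     # the first bytes of a typical JPEG file

print(read_text_lines(good))                                 # ['line one', 'line two']
print(read_text_lines(bad))                                  # None
print(read_text_lines(os.path.join(folder, "missing.txt")))  # None
```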

Reminder: Rules so Far, Plus a Rule about Don

Reminder: Rules so far

Think before you program!

A program is a human-readable essay on problem solving that also happens to execute on a computer.

The best way to improve your programming and problem solving skills is to practice!

A foolish consistency is the hobgoblin of little minds

Test your code, often and thoroughly

If it was hard to write, it is probably hard to read. Add a comment.

All input is evil, unless proven otherwise.

Don gets bored easily.

HW3: A Simple Web Application that Pulls Concepts Together

Overview

  • Section 6.8 - GUI to COUNT POKER HANDS pulls some of the concepts together. Specifically,
    • Reading a text file.
    • Some use of exceptions.
    • Very simple string operations.
    • Simple functions.
  • The sample application also is a web (GUI) application. So, you cannot blame me for "Not following the book."
  • The book's sample program "counts (scores) a poker hand." You will sound lame if you talk about this program on an internship interview.
  • Instead, you will be able to say,

"I wrote a simple AngularJS, Model-View-Controller web UI that invoked a Python/Flask based server implementing a REST API. The REST API implemented a set of simple algorithms based on Levenshtein distance applied to strings to suggest possible spelling corrections for misspelled words. The Python/Flask service recorded common misspellings to augment the LD heuristics. The professor was going to have us implement a simple learning algorithm based on a Multilayer Perceptron (MLP) Neural Network to heuristically learn a person's common errors and corrections. We revolted and stuffed him in a dumpster until common sense prevailed."

Modified Example from Textbook

  • Modified example from Punch and Enbody, section 6.8
  • Note: The code below will not execute in a Jupyter notebook if debug is set to True. The program needs to be "the main program" in this case.

Demo

Demo goes here.

Code

In [1]:
# Copyright 2017, 2013, 2011 Pearson Education, Inc., W.F. Punch & R.J.Enbody
# Modified by Donald F. Ferguson, Columbia University, 2018


# Import some frameworks that help us implement a web application.
from flask import Flask, render_template_string, request
from wtforms import Form, validators, StringField as TextField  # TextField was removed in WTForms 3.0
import string


##############################################################################################################
# These are the two functions you will write.
# You will implement in a separate Python file and access via an import statement.
# The code here is just a placeholder.


# 1. Check a dictionary to determine if word is correctly spelled.
# 2. If not, call a set of functions that generate "near by, correctly spelled words."
# 3. Return the 5 "best suggested corrections."
def check_word(word):
    # Your code and called functions go here.
    return "floccinaucinihilipilification, sesquipedalianism?"


# The user selected a correction, or entered a new correct spelling.
# We will record the correct spelling and score as a possible common correction for user.
def update_corrections(original_word, corrected_word):
    # Your code goes here.
    print("Correction for " + original_word + " is " + corrected_word)

# End of where your code will go.
##############################################################################################################

# Include and initialize the Flask framework.
app = Flask(__name__)


# html page is the view. Putting templates directly in the application is a massive anti-pattern.
# Also, most programmers and applications do not use static HTML templates like this one.
# I will give you the HTML pages to "serve" in your application.
#
page = '''
<html>
   <head>
      <title>HW3 -- The Spelling Correction Suggester!</title>
      <script>
        function myFunction() {
            var x = document.getElementById("told_you_so");
            if (x.style.display === "none") {
                x.style.display = "block";
            }
            else {
                x.style.display = "none";
            }
        }
        </script>
   </head>

   <body>
      <h1>HW3 -- The Spelling Police</h1>
      <h2>Our Motto is, "To correct and serve!"</h2>

      <form method=post action="">
         So, you think you can spell?
      <br>
      Enter a word.
         {{ template_form.text_field }}
      <br>
        {% if result != None %}
        <br>
           Did you possibly mean? {{ result }}
        <br>
        {% endif %}
      <br>
        <input type=submit value=Check>
      </form>
       <button onclick="myFunction()">What does this button do?</button>
       <div id="told_you_so" style="display:none;">
        <p>
        <span style="color:red;font-size: 32px;">
        Told you web apps are in the textbook.
        </span></p>
        </div>
   </body>
</html>
'''


# InputForm and below is our controller
# form with a single TextField.
# This is part of the framework and you do not need to worry about it.
class InputForm(Form):
    text_field = TextField(validators=[validators.InputRequired()])


# This is the core of the web application server and implementing the page delivery and REST API.
@app.route('/', methods=['GET', 'POST'])
def index():
    spell_result = None
    form = InputForm(request.form)
    if request.method == 'POST' and form.validate():
        input_val = form.text_field.data
        spell_result = check_word(input_val)
    return render_template_string(page, template_form=form, result = spell_result)


if __name__ == '__main__':
    app.run(debug=True)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-1-341b1c88266f> in <module>()
    106 
    107 if __name__ == '__main__':
--> 108     app.run(debug=True)

~/anaconda3/lib/python3.6/site-packages/flask/app.py in run(self, host, port, debug, **options)
    839         options.setdefault('use_debugger', self.debug)
    840         try:
--> 841             run_simple(host, port, self, **options)
    842         finally:
    843             # reset the first request information if the development server

~/anaconda3/lib/python3.6/site-packages/werkzeug/serving.py in run_simple(hostname, port, application, use_reloader, use_debugger, use_evalex, extra_files, reloader_interval, reloader_type, threaded, processes, request_handler, static_files, passthrough_errors, ssl_context)
    718             s = socket.socket(address_family, socket.SOCK_STREAM)
    719             s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
--> 720             s.bind((hostname, port))
    721             if hasattr(s, 'set_inheritable'):
    722                 s.set_inheritable(True)

OSError: [Errno 48] Address already in use

Explanation

Simple Web Application
  • In our scenario, there will be two applications running (in processes) on a single laptop
    • Browser (Chrome) in demo.
    • Python program running Flask and listening for requests.
  • All executing programs, e.g. browser, Word, PyCharm, Jupyter Notebook server, ... execute in an operating system process.
OS Processes
  • Request processing flow
    1. Browser sends a GET HTTP message to the computer at address 127.0.0.1
    2. Operating system routes the message to application listening on port 5000.
    3. Application responds with a message containing an HTML text document.
    4. The browser interprets the text and renders the document in graphical form in browser window.
Web Request
  • For HW2:
    • I will give you all of the HTML and the Flask program.
    • Your program will simply receive requests routed to the functions.
    • You implement the two functions and return a result.
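
The request processing flow above can be tried in a single script. This is a sketch only: it uses Python's standard-library `http.server` and `http.client` instead of Flask and a browser so that it runs without extra packages, and the handler name `HelloHandler` is made up.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<html><body><h1>Hello from the server</h1></body></html>"
        self.send_response(200)                       # status line
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                        # step 3: the HTML document

    def log_message(self, format, *args):
        pass                                          # keep the demo output quiet


# Port 0 asks the operating system for any free port, which avoids
# "Address already in use". Flask's development server defaults to port 5000.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Steps 1 and 2: send a GET to 127.0.0.1; the OS routes it by port number.
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
reply = conn.getresponse()
status = reply.status
html = reply.read().decode("utf-8")
conn.close()

server.shutdown()
server.server_close()
print(status, html)
```

Step 4, rendering the HTML, is the browser's job; here the script simply prints the text it received.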

Homework 2: Spelling Correction

  • You will implement two functions, which will use Python libraries and other functions you develop for implementation.
# 1. Check a dictionary to determine if word is correctly spelled.
# 2. If not, call a set of functions that generate "near by, correctly spelled words."
# 3. Return the 5 "best suggested corrections."
def check_word(word):
    # Your code and called functions go here.
    return "floccinaucinihilipilification, sesquipedalianism?"


# The user selected a correction, or entered a new correct spelling.
# We will record the correct spelling and score as a possible common correction for user.
def update_corrections(original_word, corrected_word):
    # Your code goes here.
    print("Correction for " + original_word + " is " + corrected_word)

Application functions

  • Internal function: Read a CSV file that is a dictionary of correctly spelled words.
    • A line is of the form "word,frequency,user_created,selected".
    • The fields mean
      • word is a correctly spelled word.
      • frequency is the relative frequency of the word in English and is a float between 0.0 and 1.0.
      • user_created is the string "true" if the user entered this word as a correction and is missing otherwise.
      • selected is the number of times the user has selected the word over all executions of your program.
    • For example:
      • "the,0.8,,4" $\rightarrow$ "the" has frequency 0.8, was not created by the user, and has been selected as a correction 4 times.
      • "tho,,TRUE,1" $\rightarrow$ the user entered "tho" as a correction and has selected it once.
Dictionary Example
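
One possible way to turn a single line of that file into a list entry, offered as a sketch rather than the required solution. The function name `parse_dictionary_line` is hypothetical, and treating a missing frequency or count as zero is an assumption.

```python
def parse_dictionary_line(line):
    """Turn 'word,frequency,user_created,selected' into a four-element list."""
    fields = line.strip().split(",")
    if len(fields) != 4:
        raise ValueError("Expected 4 fields, got " + str(len(fields)) + ": " + line)
    word, frequency, user_created, selected = fields
    return [
        word,
        float(frequency) if frequency else 0.0,   # missing frequency -> 0.0 (assumption)
        user_created.lower() == "true",           # accepts "true" or "TRUE"
        int(selected) if selected else 0,         # missing count -> 0 (assumption)
    ]


print(parse_dictionary_line("the,0.8,,4"))    # ['the', 0.8, False, 4]
print(parse_dictionary_line("tho,,TRUE,1"))   # ['tho', 0.0, True, 1]
```

Raising ValueError on a malformed line follows Rule 7: the file is input too, and it is evil until proven otherwise.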
  • Function check_word(w):
    1. If the dictionary is not loaded, calls the internal function to load.
    2. Checks the list holding the correctly spelled words.
    3. A single entry is of the form [word, frequency, user_created, selected]
    4. The "dictionary" is a list in which each entry is a list like the one above.
    5. If the word is in the dictionary, returns True.
    6. If the word is not in the dictionary, the function generates a set of possible corrections:
      1. Letters transposed.
      2. Letter missing.
      3. Space missing (word needs to be split).
      4. Letter needs to be changed.
      5. Letter needs to be removed.
    7. Checks the list of possible corrections in the dictionary.
      • If a word is found, adds to a list of suggestions.
      • If a word is not found, adds to a list of failed suggestions.
    8. If the list of suggestions has fewer than 10 suggestions
      • Repeat the steps above for every incorrect word in failed suggestions. This generates a list of words that require two corrections.
      • Add any correctly spelled words to the suggestion list.
    9. The returned result of the function is a list containing:
      1. Any user created word.
      2. The top 10 most likely suggested words. The rule is:
        • If there are 5 or more words with one correction, return the ones with the 5 highest frequencies.
        • If there are fewer than 10, return the most likely words (using frequency) with two corrections.
        • If the total is still fewer than 10, just return the suggestions.
  • Function update_corrections(old_word, new_word):
    • If new_word is in the dictionary, increment the number of times selected.
    • If new_word is NOT in the dictionary, add it as a user-created word.
    • Update the dictionary file.
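
Step 6 of check_word, generating the possible corrections, is mostly string slicing. The sketch below is one way to do it, under assumptions: the names `one_edit_candidates` and `suggestions` are invented, only lowercase ASCII letters are tried, and the dictionary is simplified to a set of words instead of the list of lists described above.

```python
import string


def one_edit_candidates(word):
    """Return the set of strings that are one simple edit away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]

    transposed = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    removed = [a + b[1:] for a, b in splits if b]                  # extra letter in word
    changed = [a + c + b[1:] for a, b in splits if b for c in letters]
    added = [a + c + b for a, b in splits for c in letters]        # letter was missing
    split_in_two = [a + " " + b for a, b in splits if a and b]     # space was missing

    return set(transposed + removed + changed + added + split_in_two) - {word}


def suggestions(word, known_words):
    """Split the candidates into (found in dictionary, not found)."""
    candidates = one_edit_candidates(word)
    found = sorted(c for c in candidates if c in known_words)
    failed = sorted(c for c in candidates if c not in known_words)
    return found, failed


known = {"the", "then", "ten", "tech", "cat"}
found, failed = suggestions("teh", known)
print(found)          # ['tech', 'ten', 'the']
print(len(failed))    # number of candidates that are not real words
```

"then" is not suggested because it is two corrections away from "teh"; feeding the failed candidates back through `one_edit_candidates` would reach it.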

Data Structure, Lists, Tuples

Data Structures

Lists

Mutability

Some Examples

Sorting, Iterating, Joining

In [20]:
# Define a test string.
str1="quickbrownfox"

# A concise approach for expanding a string.
l1 = [a for a in str1]
print("The characters in str1 are: ", l1, "\n")

# Sort the elements in the list.
l1.sort()
print("Sorted list is ", l1, "\n")

# Convert back to a string.
str1 = "".join(l1)
print("Converted back to a string = ", str1)
The characters in str1 are:  ['q', 'u', 'i', 'c', 'k', 'b', 'r', 'o', 'w', 'n', 'f', 'o', 'x'] 

Sorted list is  ['b', 'c', 'f', 'i', 'k', 'n', 'o', 'o', 'q', 'r', 'u', 'w', 'x'] 

Converted back to a string =  bcfiknooqruwx
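
The built-in `sorted` function gives the same result in one step and leaves the original untouched, whereas `list.sort()` changes the list in place. A short sketch; the anagram check `is_anagram` is an added illustration.

```python
def is_anagram(first, second):
    """Two words are anagrams when their sorted letters are identical."""
    return sorted(first.lower()) == sorted(second.lower())


word = "quickbrownfox"
print("".join(sorted(word)))           # bcfiknooqruwx
print(word)                            # quickbrownfox (unchanged)
print(is_anagram("listen", "silent"))  # True
print(is_anagram("cat", "mouse"))      # False
```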

Delimited Strings

In [36]:
line="21-Feb-2018;13:10;E1006;415 SIPA"
elements = line.split(";")
print("The individual elements are: ", elements)
The individual elements are:  ['21-Feb-2018', '13:10', 'E1006', '415 SIPA']

Mutation

In [37]:
print("Elements still = ", elements, "\n")
elements[2] = "E1006 - Introduction to Computing"
print("Elements is now = ", elements, "\n")
elements.append("Donald F. Ferguson")
print("Elements is now = ", elements, "\n")
d = elements.pop(-1)
print("I popped ", d, "making elements = ", elements, "\n")
elements = [d] + elements
print("I added the thing I popped and elements. Now = ", elements, "\n")
print("Note that I had to take the single element and make a list for addition.")
Elements still =  ['21-Feb-2018', '13:10', 'E1006', '415 SIPA'] 

Elements is now =  ['21-Feb-2018', '13:10', 'E1006 - Introduction to Computing', '415 SIPA'] 

Elements is now =  ['21-Feb-2018', '13:10', 'E1006 - Introduction to Computing', '415 SIPA', 'Donald F. Ferguson'] 

I popped  Donald F. Ferguson making elements =  ['21-Feb-2018', '13:10', 'E1006 - Introduction to Computing', '415 SIPA'] 

I added the thing I popped and elements. Now =  ['Donald F. Ferguson', '21-Feb-2018', '13:10', 'E1006 - Introduction to Computing', '415 SIPA'] 

Note that I had to take the single element and make a list for addition.

Manipulating a List

In [38]:
d2 = elements.pop(1)
print("I removed element at 1 = ", d2, " leaving list = ", elements)
I removed element at 1 =  21-Feb-2018  leaving list =  ['Donald F. Ferguson', '13:10', 'E1006 - Introduction to Computing', '415 SIPA']
In [45]:
x = [0, 1, 2, "mouse", 4, 5, 6, "cat", 8, 9]
print("I have picked on elements enough. New list = ", x, "\n")
del x[2]
print("Element at position 2 has gone to the void. x = ", x, "\n")
x.remove("mouse")
print("I removed the element with value 'mouse. x = ", x, "\n")

try:
    print("Is there another 'mouse?' If so, remove it.")
    x.remove("mouse")
except ValueError as ve:
    print("Got a value error. There were no mice.")
    
print("After attempting mouse extraction, x = ", x, "\n")

print("Where is the cat?")
print("The cat is at ", x.index('cat'), "\n")
print("Transfiguring 'cat' to 'mouse.")
x[x.index('cat')]='mouse'
print("x = ", x, "\n")

print("cats follow mice.")
x.insert(x.index('mouse')+1, 'cat')
print("Thus, x = ", x)
I have picked on elements enough. New list =  [0, 1, 2, 'mouse', 4, 5, 6, 'cat', 8, 9] 

Element at position 2 has gone to the void. x =  [0, 1, 'mouse', 4, 5, 6, 'cat', 8, 9] 

I removed the element with value 'mouse. x =  [0, 1, 4, 5, 6, 'cat', 8, 9] 

Is there another 'mouse?' If so, remove it.
Got a value error. There were no mice.
After attempting mouse extraction, x =  [0, 1, 4, 5, 6, 'cat', 8, 9] 

Where is the cat?
The cat is at  5 

Transfiguring 'cat' to 'mouse.
x =  [0, 1, 4, 5, 6, 'mouse', 8, 9] 

cats follow mice.
Thus, x =  [0, 1, 4, 5, 6, 'mouse', 'cat', 8, 9]

How do you learn all of this and understand how to use it?

The same way you get to Carnegie Hall. Practice.

Example: Words and Unique Words

In [8]:
# Copyright 2017, 2013, 2011 Pearson Education, Inc., W.F. Punch & R.J.Enbody
# Gettysburg address analysis
# count words, unique words

def make_word_list(a_file):
    """Create a list of words from the file."""
    word_list = []      # list of speech words: initialized to be empty

    for line_str in a_file:           # read file line by line
        line_list = line_str.split()  # split each line into a list of words
        for word in line_list:        # get words one at a time from list
            word = word.lower()       # make words lower case
            word = word.strip('.,')   # strip off commas and periods            
            if word != "--":          # if the word is not "--"
                word_list.append(word)   # add the word to the speech list
    return word_list

def make_unique(word_list):
    """Create a list of unique words."""
    unique_list = []  # list of unique words: initialized to be empty

    for word in word_list:        # get words one at a time from speech
        if word not in unique_list: # if word is not already in unique list,
            unique_list.append(word)# add word to unique list

    return unique_list


################################
                
with open("gettysburg.txt", "r") as gba_file:   # closes the file when done
    speech_list = make_word_list(gba_file)
# print the speech and its length
print("Speech words = ", speech_list)
print("Speech Length: ", len(speech_list))
unique_list = make_unique(speech_list)
# print the unique words and their count
print("\n\n")
print(unique_list)
print("Unique Length: ", len(unique_list))
Speech words =  ['four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated', 'to', 'the', 'proposition', 'that', 'all', 'men', 'are', 'created', 'equal', 'now', 'we', 'are', 'engaged', 'in', 'a', 'great', 'civil', 'war', 'testing', 'whether', 'that', 'nation', 'or', 'any', 'nation', 'so', 'conceived', 'and', 'so', 'dedicated', 'can', 'long', 'endure', 'we', 'are', 'met', 'on', 'a', 'great', 'battlefield', 'of', 'that', 'war', 'we', 'have', 'come', 'to', 'dedicate', 'a', 'portion', 'of', 'that', 'field', 'as', 'a', 'final', 'resting', 'place', 'for', 'those', 'who', 'here', 'gave', 'their', 'lives', 'that', 'that', 'nation', 'might', 'live', 'it', 'is', 'altogether', 'fitting', 'and', 'proper', 'that', 'we', 'should', 'do', 'this', 'but', 'in', 'a', 'larger', 'sense', 'we', 'can', 'not', 'dedicate', 'we', 'can', 'not', 'consecrate', 'we', 'can', 'not', 'hallow', 'this', 'ground', 'the', 'brave', 'men', 'living', 'and', 'dead', 'who', 'struggled', 'here', 'have', 'consecrated', 'it', 'far', 'above', 'our', 'poor', 'power', 'to', 'add', 'or', 'detract', 'the', 'world', 'will', 'little', 'note', 'nor', 'long', 'remember', 'what', 'we', 'say', 'here', 'but', 'it', 'can', 'never', 'forget', 'what', 'they', 'did', 'here', 'it', 'is', 'for', 'us', 'the', 'living', 'rather', 'to', 'be', 'dedicated', 'here', 'to', 'the', 'unfinished', 'work', 'which', 'they', 'who', 'fought', 'here', 'have', 'thus', 'far', 'so', 'nobly', 'advanced', 'it', 'is', 'rather', 'for', 'us', 'to', 'be', 'here', 'dedicated', 'to', 'the', 'great', 'task', 'remaining', 'before', 'us', 'that', 'from', 'these', 'honored', 'dead', 'we', 'take', 'increased', 'devotion', 'to', 'that', 'cause', 'for', 'which', 'they', 'gave', 'the', 'last', 'full', 'measure', 'of', 'devotion', 'that', 'we', 'here', 'highly', 'resolve', 'that', 'these', 'dead', 'shall', 'not', 'have', 'died', 'in', 
'vain', 'that', 'this', 'nation', 'under', 'god', 'shall', 'have', 'a', 'new', 'birth', 'of', 'freedom', 'and', 'that', 'government', 'of', 'the', 'people', 'by', 'the', 'people', 'for', 'the', 'people', 'shall', 'not', 'perish', 'from', 'the', 'earth']
Speech Length:  271



['four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'dedicated', 'to', 'the', 'proposition', 'that', 'all', 'men', 'are', 'created', 'equal', 'now', 'we', 'engaged', 'great', 'civil', 'war', 'testing', 'whether', 'or', 'any', 'so', 'can', 'long', 'endure', 'met', 'battlefield', 'of', 'have', 'come', 'dedicate', 'portion', 'field', 'as', 'final', 'resting', 'place', 'for', 'those', 'who', 'here', 'gave', 'their', 'lives', 'might', 'live', 'it', 'is', 'altogether', 'fitting', 'proper', 'should', 'do', 'but', 'larger', 'sense', 'not', 'consecrate', 'hallow', 'ground', 'brave', 'living', 'dead', 'struggled', 'consecrated', 'far', 'above', 'poor', 'power', 'add', 'detract', 'world', 'will', 'little', 'note', 'nor', 'remember', 'what', 'say', 'never', 'forget', 'they', 'did', 'us', 'rather', 'be', 'unfinished', 'work', 'which', 'fought', 'thus', 'nobly', 'advanced', 'task', 'remaining', 'before', 'from', 'these', 'honored', 'take', 'increased', 'devotion', 'cause', 'last', 'full', 'measure', 'highly', 'resolve', 'shall', 'died', 'vain', 'under', 'god', 'birth', 'freedom', 'government', 'people', 'by', 'perish', 'earth']
Unique Length:  138
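
`make_unique` searches the whole unique list for every word, which gets slow for long texts. As an aside, and using a dictionary, which these notes have not introduced yet, the same first-seen order can be obtained in one line. The name `make_unique_fast` is hypothetical.

```python
def make_unique_fast(word_list):
    """Return the unique words in first-seen order using a dictionary."""
    # dict keys are unique and, since Python 3.7, keep insertion order.
    return list(dict.fromkeys(word_list))


words = ["we", "can", "not", "dedicate", "we", "can", "not", "consecrate"]
print(make_unique_fast(words))   # ['we', 'can', 'not', 'dedicate', 'consecrate']
print(len(set(words)))           # 5
```

A `set` also removes duplicates but does not keep the order in which the words appeared.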

More on Mutability

In [9]:
a_list = [1,2,3]
b_list = [4,5,6]
a_list.append(b_list)
c_list = a_list
print("C_list = ", c_list)
C_list =  [1, 2, 3, [4, 5, 6]]
In [10]:
b_list[2] = "foo"
print("C_list = ", c_list)
C_list =  [1, 2, 3, [4, 5, 'foo']]
In [11]:
import copy
c_list = copy.deepcopy(a_list)
print("C_list = ", c_list)
C_list =  [1, 2, 3, [4, 5, 'foo']]
In [12]:
a_list[0]='Cat'
a_list[3][0]="Begin"
In [13]:
print("B_list = ", b_list)
B_list =  ['Begin', 5, 'foo']
In [ ]:
print("C_list = ", c_list)
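
The cells above mix three different situations: a second name for the same list, a copy that still shares its inner list, and a fully independent copy. A compact sketch that puts them side by side; the variable names are chosen for this illustration.

```python
import copy

inner = [4, 5, 6]
original = [1, 2, 3, inner]

alias = original                  # a second name for the same list object
shallow = original[:]             # new outer list, but the inner list is shared
deep = copy.deepcopy(original)    # new outer list and new inner list

original[0] = "Cat"
original[3][0] = "Begin"

print(alias)     # ['Cat', 2, 3, ['Begin', 5, 6]]
print(shallow)   # [1, 2, 3, ['Begin', 5, 6]]
print(deep)      # [1, 2, 3, [4, 5, 6]]
print(alias is original, shallow[3] is inner, deep[3] is inner)  # True True False
```

Only the deep copy is unaffected by both changes, because it owns its own outer list and its own inner list.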

Tuples

Data Structures

Lists and Python, Comprehensions