Introduction

Today we begin a more in-depth discussion of strings. We’ve already done quite a bit with strings, but Python has many built-in functions and methods that give us powerful tools for working with strings and text.

To see a list of these functions, we need to go to the Python documentation (https://docs.python.org). In the Quick Search box, type “strings”. In the list of search results, click on the one named “string - Common string operations”. This link takes us to the documentation for the library called string; the first section lists several useful string constants. These constants are variables in the string library that hold commonly used sets of characters:

>>> import string
>>> string.ascii_lowercase	# we don't need parentheses because this is a constant
'abcdefghijklmnopqrstuvwxyz'
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> "a" in string.ascii_lowercase       # this use of `in` is pretty handy
True
>>> "0" not in string.punctuation
True
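
Because these constants are ordinary strings, we can combine them with `in` to test what kind of character we are looking at. The following is a minimal sketch, not something from the string library: classify() is a hypothetical helper invented for illustration, and it assumes its argument is a single character.

```python
import string

def classify(character):
    """Describe a single character (hypothetical helper for illustration)."""
    if character in string.ascii_lowercase:
        return "lowercase letter"
    elif character in string.ascii_uppercase:
        return "uppercase letter"
    elif character in string.digits:
        return "digit"
    elif character in string.punctuation:
        return "punctuation"
    else:
        return "something else"

print(classify("q"))   # lowercase letter
print(classify("7"))   # digit
print(classify("!"))   # punctuation
print(classify(" "))   # something else (a space is not punctuation)
```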

Vowel Count Problem

Write a function that takes one string parameter and counts how many vowels occur in the string.

# vowel_count.py

def num_vowels(string):
  vowels = 0
  for i in range(len(string)):	# range matches the indexes for the string
    if string[i] in "aeiouAEIOU":	# if a specific character in the string is a vowel
      vowels = vowels + 1
  return vowels
      
def num_vowels(string):		# 2nd approach (this definition replaces the one above)
  vowels = 0

  for character in string:		# iterate through string one character at a time
    if character in "aeiouAEIOU":
      vowels = vowels + 1

  return vowels  

print(num_vowels("Petra Academy"))

As helpful as string constants are to us, the most powerful built-in string tools in Python are the string methods. Click on the String Methods link at the top of the string documentation page.

These specialized functions are called dot methods because they are used by appending them (i.e., adding them to the end) to an actual string or string variable using a dot (“.”). We have already used two of these string methods in Project 2: ljust() and rjust().
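
As a quick reminder of what dot-method syntax looks like, here is a minimal sketch using ljust() and rjust(). The variable names and the width of 10 are illustrative choices, not taken from Project 2.

```python
name = "Petra"

left = name.ljust(10)    # pad on the right so the text sits at the left
right = name.rjust(10)   # pad on the left so the text sits at the right

print(left + "|")             # Petra     |
print(right + "|")            #      Petra|
print(len(left), len(right))  # 10 10
```

Notice that the method comes after the dot and the string it works on comes before it.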

Let’s take a look at the find() method:

>>> school = "Petra Academy"
>>> school.find("e")
1				# this is the index of the first occurrence of "e" in "Petra Academy"
>>> school[1]
'e'			# we can select a single character from the string by using its index
>>> school.find("P")
0				# index numbers start at 0
>>> school[0]
'P'
>>> school[-1]
'y'			# we can also use negative index numbers to count backward through the string
>>> school.find("z")
-1			# the find() method returns -1 if the substring is not in the string
>>> "A" in school
True
>>> "e" in "Petra"
True
>>> "z" in "aeiou"
False
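
One thing to watch for: -1 is also a valid index (the last character), so using the result of find() as an index without checking it first can silently pick the wrong character. The sketch below shows one way to guard against that; describe_position() is a hypothetical helper written only for this example.

```python
school = "Petra Academy"

def describe_position(substring, text):
    """Report where substring first appears in text (hypothetical helper)."""
    index = text.find(substring)
    if index == -1:            # find() signals "not found" with -1
        return substring + " is not in " + text
    return substring + " first appears at index " + str(index)

print(describe_position("Academy", school))  # Academy first appears at index 6
print(describe_position("z", school))        # z is not in Petra Academy
```

find() also works with substrings longer than one character; it returns the index where the substring starts.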

String Slices

What if we want to refer to a subset of characters in a string, like “Petra” in “Petra Academy”? Using just indexes, we would have to write school[0] + school[1] + school[2] + school[3] + school[4], which is tedious and error-prone. Thankfully, Python has a way to refer to a run of consecutive characters, known as a slice. Instead of a single index, a slice uses a starting index, a colon, and an ending index (the ending index is not included in the slice, just like the stop value in the range() function). So, to refer to the substring “Petra” in the string “Petra Academy”, we can write school[0:5]; this slice starts at index 0 and goes up to, but does not include, index 5.

It can be tedious to write the 0 every time we want a slice that begins with the first character; it’s even more tedious to work out the ending index when we want a slice that runs to the end of the string. Python allows us to omit the starting index or the ending index in such situations:

>>> school = "Petra Academy"
>>> school[:5]	# slice from the start of the string up to (but not including) index 5
'Petra'
>>> school[6:]	# slice from index 6 to the end of the string
'Academy'

Notice that slices are similar to the start and stop parameters in the range() function. And like the range() function, slices can also take a step parameter: school[0:5:2] returns 'Pta'. We will dig into slices a lot more in a little while.
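
To make the index arithmetic concrete, here is a short sketch that collects several slices of the same string. In “Petra Academy” the space sits at index 5 and the “A” at index 6. The negative starting index in the fourth line is an extra illustration that follows the same counting-backward rule as negative single indexes.

```python
school = "Petra Academy"

print(school[0:5])     # Petra
print(school[:5])      # Petra   (starting index omitted)
print(school[6:])      # Academy (ending index omitted)
print(school[-7:])     # Academy (start 7 characters from the end)
print(school[0:5:2])   # Pta     (indexes 0, 2, and 4)

# Splitting a string at any index and gluing the halves back together
# gives the original string, because the ending index is excluded.
print(school[:5] + school[5:] == school)   # True
```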

The Chop Function

Use your knowledge of strings and slices to explain what the following function does.

def chop(some_string):
    for i in range(len(some_string)):
        print(some_string[:i] + some_string[i+1:])

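This function iterates through the string one index at a time and, on each pass, prints the string with the character at that index “chopped” out: some_string[:i] is everything before index i, and some_string[i+1:] is everything after it. Calling the chop() function with the string “petra” prints the following: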

etra
ptra
pera
peta
petr
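
To see the two slices at work, we can pull them apart into separate variables. This is only a sketch: remove_at() is a hypothetical helper, not part of the exercise, and it returns the shortened string instead of printing it.

```python
def remove_at(some_string, i):
    """Return some_string without the character at index i (hypothetical helper)."""
    left = some_string[:i]        # everything before index i
    right = some_string[i + 1:]   # everything after index i
    return left + right

word = "petra"
for i in range(len(word)):
    # show the index, the character being removed, and what is left
    print(i, word[i], remove_at(word, i))
```

Each line of output shows the index, the character removed, and the remaining four characters.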

Exercise 8

Write a function that removes all occurrences of a given letter from a string.

def extract(letter, message):
    new_string = ""
    for i in range(len(message)):
        if message[i] != letter:
            new_string = new_string + message[i]

    return new_string
      
      
def extract(letter, string):	# 2nd approach (this definition replaces the one above)
  new_string = ""
  for character in string:
    if character != letter:
      new_string = new_string + character

  return new_string

print(extract("e", "Petra Academy"))
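
For comparison, the built-in replace() string method can do the same job in one step by replacing every occurrence of the letter with the empty string. The function name extract_with_replace() is a hypothetical one chosen for this sketch; writing the loop yourself is still the point of the exercise.

```python
def extract_with_replace(letter, message):
    """Remove every occurrence of letter by replacing it with an empty string."""
    return message.replace(letter, "")

print(extract_with_replace("e", "Petra Academy"))  # Ptra Acadmy
print(extract_with_replace("z", "Petra Academy"))  # Petra Academy (nothing to remove)
```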

Exercise 6

Write a function that reverses its string argument.

def mirror_loop(string):
  mirror_string = ""

  for character in string:
    mirror_string = character + mirror_string

  return mirror_string

def mirror(string):
  return string[::-1]	# a step of -1 walks through the string backward

print(mirror("Petra Academy"))
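
The loop version works because each new character is placed in front of what has been built so far. A small trace, sketched here with the made-up word "abc", shows the string growing backward and confirms that the slice gives the same result.

```python
word = "abc"
mirror_string = ""

for character in word:
    mirror_string = character + mirror_string   # new character goes in front
    print(character, mirror_string)
# a a
# b ba
# c cba

print(word[::-1] == mirror_string)   # True
```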

HW:

Read 9.1-9.7 in HTCS